DQ Series: [Data] Methodology is applied Ideology. For many, ensuring data quality is a reactive exercise: the most heavily used datasets get reasonable data quality controls, while peripheral datasets are left to one side. These peripheral datasets can get caught in a vicious cycle, overlooked by users because their quality lags that of the larger datasets, and neglected further because they are overlooked.

Our Data Ideology is that consistency of quality, delivery and usability is the key to empowering end data users to truly unlock the value in all their data. This ideology is rooted in the belief that beyond a simplistic understanding of “good” or “bad”, data quality is a nuanced issue and often inextricably tied to the data’s intended use.

To capture this nuance, our Data Quality methodology comprises three main components, which provide a foundation for users to create their own Data Quality Landscape:

  1. Core Data Quality checks (objective correctness and completeness checking)
  2. Data Expectation checks (is this data in line with reasonable and historical expectations of the data?)
  3. Data Quality Behavioural checks (does this data look right for me and my needs?)

Core Data Quality Checks: "The data, the whole data, and nothing but the data."

They say the only thing worse than no data is bad data – or, as it is more commonly put in the AI/ML sphere, “junk in, junk out”. INQDATA integrates fully with Market Data Providers to manage consistent and ongoing data delivery, while monitoring the full data transfer process to capture and instantly remediate any data problems or delivery issues. This component of our system is responsible for the objective data quality checks: completeness, correctness and timeliness of data.
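
As a minimal illustration of what objective checks of this kind can look like (the field names, thresholds and record shape below are assumptions for the sketch, not INQDATA's production configuration):

```python
from datetime import datetime, timedelta, timezone

# Illustrative record schema; real vendor feeds differ.
EXPECTED_FIELDS = {"symbol", "price", "size", "timestamp"}
MAX_STALENESS = timedelta(minutes=5)  # assumed timeliness threshold

def core_quality_check(record, now):
    """Return a list of objective quality issues for one record."""
    issues = []
    # Completeness: every expected field must be present and non-null.
    missing = [f for f in EXPECTED_FIELDS if record.get(f) is None]
    if missing:
        issues.append(f"missing fields: {sorted(missing)}")
    # Correctness: prices must be positive numbers.
    price = record.get("price")
    if price is not None and price <= 0:
        issues.append(f"non-positive price: {price}")
    # Timeliness: data older than the threshold is flagged as stale.
    ts = record.get("timestamp")
    if ts is not None and now - ts > MAX_STALENESS:
        issues.append("stale record")
    return issues
```

Run per record as data lands, a check like this separates objectively broken data from data that is merely unusual, which the next two components address.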

Data Expectation Checks: “Is this a datum I see before me?”

The most impactful data quality issues for end users, like missing or incomplete data, often go unnoticed without proactive monitoring. Rather than relying on the market data vendor to catch these, we actively monitor the continuous data onboarding to verify that what we receive is within expectations for each instrument, time period (intra-day and inter-day time bucketing), region etc. This active monitoring allows us to identify data gaps and unexpected changes as they occur.
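
To make the idea concrete, here is a toy sketch of an expectation check on received record counts; the n-sigma band and the sample counts are illustrative assumptions, not our actual expectation model:

```python
import statistics

def within_expectations(historical_counts, observed_count, n_sigma=3.0):
    """Check whether an observed record count for one instrument/time
    bucket sits inside an n-sigma band around its historical mean.
    (Illustrative only: real expectation models can be far richer.)"""
    mean = statistics.fmean(historical_counts)
    sd = statistics.pstdev(historical_counts)
    if sd == 0:
        # No historical variation: expect an exact match.
        return observed_count == mean
    return abs(observed_count - mean) <= n_sigma * sd
```

A check like this, run per instrument and per intra-day bucket, surfaces volume gaps or unexpected jumps as soon as a delivery deviates from its own history.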

Data Quality Behavioural Checks: “One man's [data] trash is another man's [data] treasure.”

An inescapable problem with data quality is that if you ask two people a question, you'll get three different answers. Not everyone agrees on what good quality data is, and often the data points one desk would exclude are exactly the data points another team would want to investigate. At INQDATA we believe each data user is entitled to an independent view of what they want from their data, so we give users the ability to cultivate their own data quality landscape.

We do this by providing users with a series of Data Quality Behavioural Checks which identify particular behaviours within their datasets, such as outliers in their price series or trades with sizes greater than X standard deviations from the daily average. These configurable kdb+ powered data behavioural checks form the basis of a robust labelling infrastructure within INQDATA, which is fully integrated with our data access APIs to allow users to return just the model-ready data of interest to them. Users can combine and exclude labelled data to define their unique view of quality data for each specific use case.
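
A sketch of the size-outlier example in Python (the checks themselves are kdb+ powered, as noted above; the record shape, threshold and helper names here are illustrative assumptions):

```python
import statistics

def label_size_outliers(trades, n_sigma=3.0):
    """Attach a label to trades whose size deviates from the daily
    average by more than n_sigma standard deviations (illustrative)."""
    sizes = [t["size"] for t in trades]
    mean = statistics.fmean(sizes)
    sd = statistics.pstdev(sizes)
    labelled = []
    for t in trades:
        labels = set()
        if sd > 0 and abs(t["size"] - mean) > n_sigma * sd:
            labels.add("size_outlier")
        labelled.append({**t, "labels": labels})
    return labelled

def query(labelled, exclude=frozenset()):
    """Return only trades whose labels avoid the exclusion set,
    mimicking label-aware filtering at the data access layer."""
    return [t for t in labelled if not (t["labels"] & exclude)]
```

Because the labels travel with the data rather than overwriting it, one desk can exclude `size_outlier` trades from a pricing model while another queries exactly those trades for investigation.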

A cornerstone of our ideology is to meet clients where their needs lie, building success through a new, user-owned data quality landscape. INQDATA are excited to announce further additions to our user-configurable Data Behaviour checks in the next post of the series… so watch this space!

If INQDATA sounds like something you or your team are interested in hearing more about, contact us at info@inqdata.com.

Alternatively if you just want to chat Data Quality feel free to contact me directly at rebecca.kelly@inqdata.com.
