According to John Mason, Group Head of Pricing at LSEG: “For every $1 spent on financial market data, a further $8 was being spent by financial services organizations on processing, storing and transforming the data before it could be analyzed.”

When it comes to data – particularly data used for AI and ML modelling – we should all be aware of the 80/20 rule: data users spend approximately 80% of their time on the pre-processing required to cleanse their data before putting it to use. The additional complexities of ingestion, storage and delivery compound this effect for larger datasets. Identifying data issues and working with data at scale can raise the barrier to usability so high that only the most experienced and expert users derive any benefit.

INQDATA simplifies data processing and cleansing for our users through data pipeline management (ingest, storage and serving) and the provision of model-ready data, using a three-step approach to Data Quality:

  1. Core Data Quality Checks capture clear data quality issues such as nulls, duplicates and objectively incorrect data;
  2. Data Expectation Checks benchmark the data received against expected values to identify missing data and outliers; and
  3. User-Defined Behavioural Checks allow users to create their own data landscape – their view of “quality” data for their unique sensitivities.

This third step, which empowers users to create personalized data views and curate their own data quality standards, is the foundation of our user-centric data experience.
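To make the three layers concrete, here is a minimal pandas sketch of how they might be applied to a tick dataset. The column names, thresholds and the block-trade label are illustrative assumptions, not INQDATA's actual checks or implementation.

```python
import pandas as pd

# Hypothetical tick data; column names, values and thresholds are illustrative only.
ticks = pd.DataFrame({
    "timestamp": pd.to_datetime([
        "2024-03-01 09:30:00", "2024-03-01 09:30:00",
        "2024-03-01 09:30:01", "2024-03-01 09:30:02",
    ]),
    "symbol": ["ABC", "ABC", "ABC", "ABC"],
    "price": [100.10, 100.10, None, 100.12],
    "size": [200, 200, 150, 90_000],
})

# 1. Core data quality checks: nulls, duplicates and objectively incorrect values.
core_issues = (
    ticks["price"].isna()
    | ticks.duplicated()
    | (ticks["size"] <= 0)
)

# 2. Data expectation checks: benchmark against expected values, here an
#    assumed price band for the symbol; anything outside it is flagged.
expected_low, expected_high = 95.0, 105.0
expectation_issues = ~ticks["price"].between(expected_low, expected_high)

# 3. User-defined behavioural check: label, rather than drop, points matching a
#    pattern this user cares about, e.g. trades at or above an assumed block size.
ticks["label_block_trade"] = ticks["size"] >= 10_000

# Core and expectation failures are removed as quality issues; behavioural labels
# travel with the data so each user decides how to treat them downstream.
model_ready = ticks[~(core_issues | expectation_issues)]
print(model_ready)
```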

*INQDATA's multi-layer method for producing model-ready data*

Pre-processed data from internal teams, though beneficial, can sometimes miss the mark for specific user needs. Consider quantitative analysts calculating a moving VWAP for their trading algorithms – they might prefer to exclude outliers that would distort the results. Conversely, the compliance team might be particularly interested in those very outliers for deeper investigation. This highlights the need for flexible data access that empowers users to tailor the data to their unique requirements.
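As a rough illustration, the sketch below computes a simple moving VWAP over some invented trades twice: once over every point (the compliance view) and once with an outlier excluded (the quant view). The data and the `is_outlier` flag are assumptions for the purpose of the example.

```python
import pandas as pd

# Hypothetical trades; in practice the outlier flag would come from a behaviour check.
trades = pd.DataFrame({
    "price": [100.00, 100.02, 100.01, 130.00, 100.03],
    "size":  [500,    300,    400,    50,     600],
    "is_outlier": [False, False, False, True, False],
})

def moving_vwap(df: pd.DataFrame, window: int = 3) -> pd.Series:
    """Rolling VWAP: sum(price * size) / sum(size) over a trade-count window."""
    notional = (df["price"] * df["size"]).rolling(window).sum()
    volume = df["size"].rolling(window).sum()
    return notional / volume

vwap_all = moving_vwap(trades)                               # compliance view: keep everything
vwap_filtered = moving_vwap(trades[~trades["is_outlier"]])   # quant view: outlier excluded
print(vwap_all.iloc[-1], vwap_filtered.iloc[-1])             # the single outlier visibly skews the first result
```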

*Many users want to exclude “valid” data points from their analysis, e.g. VWAP calculations*

INQDATA’s library of configurable behaviour checks empowers users to identify and label data points matching specific activity patterns. These labels integrate seamlessly with our APIs, allowing users to include, exclude or simply label these points within their retrieved data. This personalized approach lets users craft their own data landscape without affecting others, ensuring everyone gets the view they need.
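Conceptually, the include/exclude/label treatments could look something like the helper below. The function, parameter and column names are placeholders rather than INQDATA's actual API; the point is only to show the three ways a labelled point can be handled at retrieval time.

```python
import pandas as pd

def apply_label_mode(df: pd.DataFrame, label: str, mode: str) -> pd.DataFrame:
    """Illustrative treatment of one behaviour label on retrieved data.

    mode="include": return all rows, ignoring the label
    mode="exclude": drop rows carrying the label
    mode="label":   return all rows with the label column kept for downstream filtering
    """
    if mode == "include":
        return df.drop(columns=[label])
    if mode == "exclude":
        return df[~df[label]].drop(columns=[label])
    if mode == "label":
        return df
    raise ValueError(f"unknown mode: {mode!r}")

# e.g. quant view:       apply_label_mode(trades, "label_block_trade", mode="exclude")
#      compliance view:  apply_label_mode(trades, "label_block_trade", mode="label")
```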

Using INQDATA, users can leverage our configurable behaviour checks to label specific data points based on their needs. Consider the VWAP example above: quantitative analysts might activate the “Intra-day deviations from average” behaviour check, which identifies trade sizes exceeding a certain threshold (e.g. 5 standard deviations) from the daily average.
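A rough sketch of what that logic might amount to is shown below, with the 5-standard-deviation threshold applied per trading day. The function name, columns and grouping are illustrative assumptions, not the check's actual implementation.

```python
import pandas as pd

def label_intraday_size_deviation(trades: pd.DataFrame, n_std: float = 5.0) -> pd.Series:
    """Flag trades whose size is more than n_std standard deviations
    above that day's average trade size (illustrative logic only)."""
    day = trades["timestamp"].dt.date
    daily_mean = trades.groupby(day)["size"].transform("mean")
    daily_std = trades.groupby(day)["size"].transform("std")
    return trades["size"] > daily_mean + n_std * daily_std

# Usage: attach the label, then include or exclude it at query time.
# trades["label_intraday_size_deviation"] = label_intraday_size_deviation(trades)
```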

*User behaviour labelling allows the exclusion of data points that match a particular behaviour*

Similarly, they might have identified outliers using an “Intra-day tick-by-tick movement” behaviour check, or one of many others. This library of checks empowers users to tailor the data landscape to their unique requirements, with seamless integration into aggregations and filtering within the API.
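For illustration, such a tick-by-tick movement check could be sketched along these lines; again, the names and the 1% threshold are assumptions rather than INQDATA's actual logic.

```python
import pandas as pd

def label_tick_by_tick_movement(trades: pd.DataFrame, max_move: float = 0.01) -> pd.Series:
    """Flag trades whose price moved more than max_move (fractional, e.g. 1%)
    relative to the previous tick for the same symbol; thresholds are illustrative."""
    prev_price = trades.groupby("symbol")["price"].shift(1)
    move = (trades["price"] - prev_price).abs() / prev_price
    return move > max_move
```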

*User labels are fully integrated into data-return APIs, e.g. time-based aggregations*


Interested in learning more about INQDATA or think it could be a fit for your needs? Reach out today.  

