Data collection tools and infrastructure

Automatic Data Collection

Automatic collection and enrichment of large arrays of financial and non-financial company data

By automatically generating data arrays from various open sources, the software helps improve machine learning models. The arrays are also analyzed for abnormal values and outliers.

System requirements include the ability to divide the collected array into homogeneous groups using the AverageKNN filtering method, generate additional synthetic data to restore the balance of covariates using SMOTE (Synthetic Minority Oversampling Technique), and detect abnormal data samples using the method proposed by Tomek.
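As a rough illustration of two of these steps, the sketch below shows the core idea of SMOTE (interpolating between a minority sample and one of its nearest minority neighbours) and of Tomek links (pairs of mutual nearest neighbours that carry different labels). It is not the product's implementation: the function names, the toy 2-D data, and the parameter values are hypothetical, and it assumes that "the method proposed by Tomek" refers to Tomek links. The AverageKNN grouping step is left out.

```python
import math
import random


def nearest(idx, points, k):
    """Indices of the k nearest neighbours of points[idx], excluding itself."""
    others = [j for j in range(len(points)) if j != idx]
    others.sort(key=lambda j: math.dist(points[idx], points[j]))
    return others[:k]


def smote(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points on segments between a minority sample
    and one of its k nearest minority neighbours (the core SMOTE step)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        j = rng.choice(nearest(i, minority, k))
        t = rng.random()  # position along the segment, in [0, 1)
        synthetic.append(
            tuple(a + t * (b - a) for a, b in zip(minority[i], minority[j]))
        )
    return synthetic


def tomek_links(points, labels):
    """Pairs (i, j) that are each other's nearest neighbour but differ in label."""
    nn = [nearest(i, points, 1)[0] for i in range(len(points))]
    return [
        (i, j)
        for i, j in enumerate(nn)
        if i < j and nn[j] == i and labels[i] != labels[j]
    ]


# Hypothetical toy data: five majority samples (label 0), three minority (label 1).
majority = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (0.3, 0.3), (1.0, 1.0)]
minority = [(1.1, 1.0), (2.0, 2.0), (2.1, 2.2)]
points = majority + minority
labels = [0] * len(majority) + [1] * len(minority)

# Two synthetic minority samples bring the classes to 5 vs 5.
synthetic = smote(minority, n_new=len(majority) - len(minority))

# Samples 4 and 5 sit next to each other across the class boundary.
links = tomek_links(points, labels)
print(links)  # [(4, 5)]
```

In practice, a maintained library such as imbalanced-learn would usually be preferred over hand-written versions; the sketch only makes the mechanics visible. Points that take part in a Tomek link are typically treated as borderline or noisy samples and reviewed or removed.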

Key product highlights

Data flows adapt to fetch the segment you need most, and data enrichment helps fill in missing values.
