Feature Factory is a fundamental shift in how enterprise data science teams develop curated data and accumulate data know-how as reusable assets. Feature spaces and the ability to discover features through a data-centric, programmatic approach lead to better collaboration and efficiency, higher model quality, and greater reusability, reproducibility, scalability, and transparency. Break down silos and capitalize on the wealth of information at your disposal.
Connect to multiple data sources, data lakes, or data warehouses and ingest the data as Spark DataFrames in Python.
Specify your target variable and the source tables, as DataFrames, that you will use to build features. Define your search criteria and run dotData Feature Factory from your favorite Python IDE or notebook.
Explore and evaluate discovered features interactively from Python.
Iterate on feature discovery experiments to derive higher-quality and higher-order features. Explore, optimize, and tune features interactively. Choose which features to extract for further analysis, modeling, or reporting from within Python.
Populate feature stores and continuously update features in production applications.
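To make the workflow above concrete, here is a minimal conceptual sketch of what "discovering features" from a target table and a source table can mean. It is not dotData's API or algorithm: the table contents, the `AGGREGATIONS` set, and the `discover_features` helper are all hypothetical, plain Python lists stand in for Spark DataFrames, and candidates are ranked by a simple absolute Pearson correlation with the target.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical toy data standing in for Spark DataFrames.
# Target table: one row per customer, with the target variable "churned".
target = [
    {"customer_id": 1, "churned": 0},
    {"customer_id": 2, "churned": 0},
    {"customer_id": 3, "churned": 1},
    {"customer_id": 4, "churned": 1},
]
# Source table: many transaction rows per customer.
transactions = [
    {"customer_id": 1, "amount": 120.0},
    {"customer_id": 1, "amount": 80.0},
    {"customer_id": 1, "amount": 100.0},
    {"customer_id": 2, "amount": 90.0},
    {"customer_id": 2, "amount": 110.0},
    {"customer_id": 3, "amount": 20.0},
    {"customer_id": 4, "amount": 10.0},
    {"customer_id": 4, "amount": 30.0},
]

# Illustrative search space: candidate aggregations over the source rows.
AGGREGATIONS = {
    "count": len,
    "sum": sum,
    "mean": lambda xs: sum(xs) / len(xs),
    "max": max,
}


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def discover_features(target_rows, source_rows, key, value, label):
    """Build one candidate feature per aggregation and rank the candidates
    by absolute correlation with the target (a stand-in scoring rule)."""
    grouped = defaultdict(list)
    for row in source_rows:
        grouped[row[key]].append(row[value])

    labels = [row[label] for row in target_rows]
    scored = []
    for name, aggregate in AGGREGATIONS.items():
        values = []
        for row in target_rows:
            source_values = grouped.get(row[key], [])
            # Entities with no source rows get 0.0 for every aggregation.
            values.append(aggregate(source_values) if source_values else 0.0)
        scored.append((f"{name}({value})", abs(pearson(values, labels))))

    # Highest-scoring candidate first.
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored


ranked = discover_features(target, transactions, "customer_id", "amount", "churned")
for feature_name, score in ranked:
    print(f"{feature_name}: {score:.3f}")
```

On this toy data the mean transaction amount ranks first, because it separates churned from retained customers exactly. A real feature discovery run would search a far larger space of joins, filters, and aggregations across many source tables and would evaluate candidates with more robust criteria than a single correlation.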
Install dotData Feature Factory in your AWS EMR instance to accelerate feature discovery for your data science team.
Quickly deploy dotData Feature Factory with a pip install, even on your own laptop.
When SMBC, one of the world’s largest banks, wanted to get the maximum value from their feature engineering investment, they turned to dotData. Download the case study and read how they went from 2,000 features a year to over 2,000,000.
Take our five-minute self-assessment to see if your data and organization could benefit from dotData’s Feature Factory revolution.