Your Enterprise-Grade Feature Factory

Discover, evaluate and manage features for Machine Learning and Advanced Analytics.

Discover features with dotData

Why dotData Feature Factory?

dotData Py is an enterprise-grade feature discovery platform that helps data science and data engineering teams iterate on feature engineering faster and build production-quality feature pipelines automatically.

“Its exceptional data management and feature engineering capabilities make it especially suitable for the most challenging use cases…feature engineering is powerful and scalable, even across tens of tables with billions of rows.”

Key Features

dotData Feature Discovery

Feature Discovery & Engineering

Search millions of hypotheses

Search millions of feature hypotheses from enterprise-scale relational data across tens of tables, thousands of columns, and billions of rows.

Find the right features for your models

Leverage supervised learning techniques to address feature over-fitting, collinearity, stability, drift, and more.

Feature discovery and engineering

Explore different featurization techniques such as categorical encoding, numeric aggregation, temporal recency and periodicity, geo-location grid, text topic encoding, etc.
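To make these techniques concrete, here is a minimal sketch of three of them (numeric aggregation, temporal recency, and a frequency-style categorical encoding) over a small transaction table. It uses only the Python standard library and is not dotData's API; the table layout, the `featurize` function, and the feature names are all assumptions made for illustration.

```python
from collections import Counter, defaultdict
from datetime import date

# Hypothetical transaction rows: (customer_id, purchase_date, category, amount)
transactions = [
    ("c1", date(2024, 1, 5), "grocery", 40.0),
    ("c1", date(2024, 2, 20), "travel", 310.0),
    ("c1", date(2024, 3, 1), "grocery", 25.0),
    ("c2", date(2024, 2, 11), "travel", 120.0),
]


def featurize(rows, as_of):
    """Build one feature dict per customer from rows dated before `as_of`."""
    grouped = defaultdict(list)
    for customer_id, day, category, amount in rows:
        if day < as_of:
            grouped[customer_id].append((day, category, amount))

    features = {}
    for customer_id, events in grouped.items():
        amounts = [amount for _, _, amount in events]
        categories = Counter(category for _, category, _ in events)
        last_day = max(day for day, _, _ in events)
        features[customer_id] = {
            # Numeric aggregation over the customer's transactions
            "amount_sum": sum(amounts),
            "amount_mean": sum(amounts) / len(amounts),
            # Temporal recency: days between the last purchase and `as_of`
            "days_since_last_purchase": (as_of - last_day).days,
            # Categorical encoding: share of transactions in one category
            "share_grocery": categories["grocery"] / len(events),
        }
    return features


customer_features = featurize(transactions, as_of=date(2024, 3, 10))
```

Each entry in `customer_features` is one candidate row for a feature table. An automated search would generate and score many such combinations of column, aggregation, and time window rather than the handful written by hand here.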

Understand feature quality with dotData

Feature Quality & Selection

Automated data prep and cleansing

Built-in data and feature cleansing, such as string value canonicalization, duplicate record removal, missing value imputation, outlier elimination, etc.
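The four cleansing steps named above can be sketched in a few lines of standard-library Python. This is an illustrative assumption, not dotData's implementation: the record layout, the `cleanse` function, the median imputation, and the "more than 10 times the median" outlier rule are hypothetical choices picked to keep the example short.

```python
from statistics import median

# Hypothetical raw records: (city, income); None marks a missing value
raw = [
    (" New York ", 52000.0),
    ("new york", 52000.0),      # duplicate once the string is canonicalized
    ("Boston", None),           # missing income
    ("BOSTON", 61000.0),
    ("Chicago", 58000.0),
    ("Chicago", 9_900_000.0),   # implausible outlier
]


def canonicalize(text):
    """Collapse whitespace and lowercase so equal values compare equal."""
    return " ".join(text.split()).lower()


def cleanse(records, max_ratio_to_median=10.0):
    # 1. String value canonicalization
    rows = [(canonicalize(city), income) for city, income in records]
    # 2. Duplicate record removal (dict keys keep first-seen order)
    rows = list(dict.fromkeys(rows))
    # 3. Missing value imputation with the median of observed incomes
    observed = [income for _, income in rows if income is not None]
    typical = median(observed)
    rows = [(city, typical if income is None else income) for city, income in rows]
    # 4. Outlier elimination with a simple ratio-to-median rule (an assumption)
    return [(city, income) for city, income in rows
            if income <= max_ratio_to_median * typical]


cleaned = cleanse(raw)
```

The order of the steps matters: canonicalizing before deduplicating is what lets `" New York "` and `"new york"` collapse into one record.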

Optimize features for your models

Apply streaming feature selection and optimization techniques to evaluate millions of features and select the most impactful ones in hours.

Temporal features that work

Prevent temporal data leakage and guarantee point-in-time correctness with advanced temporal relation search.
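Point-in-time correctness means that a feature for a given prediction date is computed only from records that were available before that date. The sketch below shows the idea with plain Python; the `labels` and `events` tables and the `point_in_time_sum` function are hypothetical, and treating same-day events as unavailable is an assumption that a real pipeline would set according to when the data actually lands.

```python
from datetime import date

# Hypothetical label table: (customer_id, prediction_date)
labels = [
    ("c1", date(2024, 3, 1)),
    ("c1", date(2024, 6, 1)),
]

# Hypothetical event table: (customer_id, event_date, amount)
events = [
    ("c1", date(2024, 1, 15), 100.0),
    ("c1", date(2024, 3, 1), 50.0),   # same day as the first prediction: excluded
    ("c1", date(2024, 4, 20), 70.0),
]


def point_in_time_sum(label_rows, event_rows):
    """For each label row, sum only the events strictly before its date."""
    result = []
    for customer_id, cutoff in label_rows:
        total = sum(
            amount
            for event_customer, event_date, amount in event_rows
            if event_customer == customer_id and event_date < cutoff
        )
        result.append((customer_id, cutoff, total))
    return result


training_rows = point_in_time_sum(labels, events)
```

Joining on `customer_id` alone would give both label rows the same total of 220.0 and leak the March and April events into the March 1 prediction. With the cutoff applied, the first row sees only the January event.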

Gain Feature Transparency

Transparency & Insights

Score feature importance

Derive supervised feature importance (such as feature-wise AUC, permutation importance, and sample-wise SHAP) as well as feature statistics as feature metadata.
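As one example of these scores, feature-wise AUC treats a single feature as if it were the model score and measures how well it alone separates positives from negatives. The sketch below computes it by pairwise comparison in plain Python; the `feature_auc` function, the toy labels, and the feature values are assumptions for illustration and do not reflect how dotData computes the metric.

```python
def feature_auc(values, labels):
    """AUC of one feature used directly as a score; ties count as half a win."""
    positives = [v for v, y in zip(values, labels) if y == 1]
    negatives = [v for v, y in zip(values, labels) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Hypothetical binary target and two candidate features
labels = [0, 0, 1, 1, 0, 1]
features = {
    "days_since_last_purchase": [30, 25, 3, 5, 4, 2],
    "amount_mean": [10, 20, 20, 40, 30, 10],
}

auc_by_feature = {name: feature_auc(vals, labels) for name, vals in features.items()}

# An AUC far below 0.5 is as informative as one far above it (the relationship
# is simply inverted), so rank features by their distance from 0.5.
ranked = sorted(auc_by_feature, key=lambda name: abs(auc_by_feature[name] - 0.5),
                reverse=True)
```

The pairwise loop is quadratic in the number of rows, which is fine for a sketch but would be replaced by a rank-based computation on real data.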

Feature explanations for full transparency

Produce natural-language explanations that put each feature in context, and feature blueprints that visualize its data lineage.

Feature queries built-in

Generate feature queries to reveal every single step of the feature generation processes with 100% transparency.

Generate Production-Quality ML Code

Feature Pipeline & Query

Production-ready pipelines

Build “production-ready” data and feature pipelines that include complete steps from data cleansing through multi-table aggregation to feature transformation.

Feed your Feature Store

Use DataFrames as the standard input and output format, so pipelines can connect to any type of data storage or feature store.

Fine-tune your queries

Customize feature pipelines and queries and tune them to the requirements of your production environment.

What Challenges Does Feature Factory Solve?

How SMBC Accelerated Their Feature Development Process 48X

When SMBC, one of the world’s largest banks, wanted to accelerate its AI/ML development process, it turned to dotData’s Feature Factory platform. Download the case study to see how it made feature development 48 times faster.

Key Features of dotData Feature Factory