
dotData Announces dotData Feature Factory, Signaling a Paradigm Shift in Enterprise Data Solutions

May 2, 2023

Empower all enterprise data solutions with powerful, data-centric data and feature discovery, and capitalize on reusable feature discovery assets.

SAN MATEO, California, May 02, 2023 – dotData, a pioneer and leading provider of platforms for feature discovery, announced the public availability of dotData Feature Factory. The newly released platform provides advanced functionality that empowers data scientists with a data-centric approach to feature engineering powered by reusable feature discovery assets that have never been available until now. dotData Feature Factory enables a paradigm shift in enterprise data solutions and will replace dotData Py, a Python-based data science automation engine first introduced in 2018.

“This new product provides our heart and core as an independent product,” said Ryohei Fujimaki, Ph.D., founder and CEO of dotData. “In past years, we have kept validating that Feature Discovery is the biggest pain in enterprise data solutions. The vision of the new dotData Feature Factory is to enhance all data solutions for enterprise organizations,” continued Mr. Fujimaki.

Jump-Start Feature Discovery

By its very nature, feature discovery is a slow, laborious process that requires deep data and domain knowledge. Enterprise data is vast, often including hundreds of tables, thousands of columns, and billions of rows. The combination of domain knowledge requirements and the mass of data makes getting started challenging for most organizations. dotData Feature Factory automatically identifies and suggests feature spaces from enterprise data, including relational, transactional, and temporal data, allowing you to kick-start feature discovery and identify key signals from day one.

Data-Centric, Programmatic Feature Engineering 

Feature engineering has traditionally been a highly manual, artisanal, and iterative process. Data scientists and domain experts often have many ideas but are constrained in their ability to explore new and innovative discovery paths because of a lack of time and resources. Feature Factory lets users programmatically define feature spaces and auto-generate 100X broader feature hypotheses using a data-centric approach to feature engineering that augments your existing data and feature knowledge.

Build Reusable Feature Discovery Assets

Feature engineering is more than just writing a simple SQL query. Complex data operations and transformations, from ETL through data cleansing and feature transformation, take a lot of time and are frequently iterated. The ad-hoc nature of feature engineering means that when features are identified for specific use cases, the process used to create them is often discarded or lost in a sea of old notebooks – even if the same process might have been useful in new use cases. dotData Feature Factory halts the process of re-inventing the wheel with an Analytic Database and Feature Descriptor that records every data and feature transformation step, giving data scientists the ability to capture data transformation know-how and build reusable feature discovery assets for themselves and their teams.

From Data Science Notebooks to Production-Ready Feature Pipelines in Seconds, Not Days

Feature discovery is often done by individual data scientists right in their Jupyter Notebook. Notebooks quickly become an unwieldy jumble of code, are typically poorly managed and organized, and most lack standardization. This makes it incredibly challenging for data engineers and ML engineers to convert data science notebooks into production-ready features. dotData Feature Factory makes it easy for data scientists to build transparent, readable, maintainable, and easily scalable features that cover edge cases when processing features with new data, accelerating and simplifying the process of moving from experiments to production. 

A Paradigm Shift for Enterprise Data 

dotData Feature Factory provides a fundamental shift in how enterprise organizations develop curated data and accumulate domain and data “know-how” as reusable assets. By generating feature spaces and discovering features in a data-centric and programmatic manner, Feature Factory offers enhanced collaboration, increased efficiency, improved model quality, reusability, reproducibility, scalability, and transparency. This innovative approach breaks down silos, enabling organizations to capitalize on the wealth of information available while improving the effectiveness of their downstream data solutions, including ML and AI.

About dotData

dotData’s pioneering automated feature discovery and engineering platform solves the hardest challenge of AI/ML projects. Our Feature Factory technology discovers hidden gems for empowering your business as transparent, explainable features by connecting the dots within large-scale data sets in hours, without human bias. It enables data scientists to explore 100X more features, including those you’ve yet to imagine, and augments AI/ML projects in an agile manner to deliver business value faster. In an era of rapid change, AI-discovered insights can be a game changer for business growth and innovation across industries. The power of dotData’s platform and ability to provide game-changing insights is why Fortune 500 organizations across the globe use dotData.

For more information, visit, and join the conversation on Twitter and LinkedIn

*Microsoft, Azure, and related product names are trademarks or registered trademarks of Microsoft Corporation in the United States and/or other countries. 


Name: Walter Paliska
Phone: 415-460-7844


