dotData Announces dotData Feature Factory 1.1 with GenAI-Powered Assistance

May 7, 2024

dotData Feature Factory, a groundbreaking platform for feature discovery, boosts data quality, introduces interactive feature customization, enhances AutoML, and adds GenAI-assisted feature discovery in its version 1.1 release.

SAN MATEO, California, May 7, 2024 – dotData, a pioneer and leading provider of feature discovery platforms, today announced the general availability of version 1.1 of its Feature Factory platform. Version 1.1 introduces significant new capabilities, including enhanced data quality assessment, support for user-defined features and interactive feature selection, preview support for Generative AI (GenAI)-assisted feature discovery, integration with PyCaret AutoML, and more.

“Version 1.1 of dotData Feature Factory reaffirms our commitment to innovation in feature discovery,” said Ryohei Fujimaki, dotData’s CEO and Founder. “With version 1.1, we are introducing an innovative mechanism to fuse data knowledge (AI-driven features) and domain knowledge (user-defined features) by leveraging generative AI. This approach brings higher-order data insights to our users.”

Smart Data Quality Assessment and Corrective Actions

Feature Factory Version 1.1 introduces significant enhancements to its smart data quality assessment capabilities. These improvements help users ensure the validity and reliability of their data and configurations. The updated function conducts a comprehensive series of assessments, including analysis of target distribution, data ranges across multiple tables, enrichment rates, target leakage, duplicated rows, and string canonicalization, among others. Upon detecting any data quality issues, Feature Factory automatically provides corrective actions and data transformations, complete with executable code snippets. This functionality enables users to swiftly identify and rectify potential data quality issues.
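To make these assessments concrete, here is a minimal sketch of three of the checks named above (target distribution, duplicated rows, and string canonicalization) in plain Python. It is not Feature Factory's API: the function name `assess_quality`, the report keys, and the sample rows are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def assess_quality(rows, target):
    """Run a few simple data quality checks on a non-empty list of row dicts.

    Hypothetical helper for illustration; not part of Feature Factory.
    """
    report = {}

    # Target distribution: share of each target value, to reveal imbalance.
    counts = Counter(row[target] for row in rows)
    report["target_distribution"] = {k: v / len(rows) for k, v in counts.items()}

    # Duplicated rows: count rows that exactly repeat an earlier row.
    seen = set()
    duplicates = 0
    for row in rows:
        key = tuple(sorted(row.items()))
        if key in seen:
            duplicates += 1
        seen.add(key)
    report["duplicated_rows"] = duplicates

    # String canonicalization: find string values that become identical
    # after trimming whitespace and lowercasing (e.g. "Acme" vs "acme ").
    inconsistent = {}
    for col in rows[0].keys():
        variants = {}
        for row in rows:
            value = row[col]
            if isinstance(value, str):
                variants.setdefault(value.strip().lower(), set()).add(value)
        clashes = {k: sorted(v) for k, v in variants.items() if len(v) > 1}
        if clashes:
            inconsistent[col] = clashes
    report["string_canonicalization"] = inconsistent
    return report


# Made-up sample data.
rows = [
    {"customer": "Acme", "spend": 120, "churned": 0},
    {"customer": "acme ", "spend": 80, "churned": 0},
    {"customer": "Globex", "spend": 40, "churned": 1},
    {"customer": "Globex", "spend": 40, "churned": 1},
]
report = assess_quality(rows, target="churned")
print(report)
```

On this sample, the report flags one exact duplicate row and one customer name written in two spellings, which is the kind of finding a corrective transformation would then address.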

Enhanced Support for AutoML with PyCaret

Machine learning is a key application of the features that Feature Factory discovers, and Version 1.1 improves support for AutoML through integration with PyCaret, a Python-based AutoML library. The integration lets users quickly evaluate discovered features and use AutoML to optimize their models. Although the workflow from feature discovery to AutoML is integrated, users can still fine-tune and control PyCaret settings based on their own expertise. The result is a flexible and powerful toolset for model development.

Introducing User-defined Features

Feature Factory Version 1.1 unveils two versatile and innovative methods for users to incorporate their feature ideas into the platform. The first method is the user-defined feature primitive, which utilizes a straightforward and intuitive feature description language provided by Feature Factory. This allows users to easily define and integrate their own features into the existing feature space. The second method is the SQL transformer. This tool enables users to apply any SQL transformation to their data and incorporate these transformations directly into their Analytic Database. With these new capabilities, users can seamlessly combine automated programmatic feature discovery with their own domain knowledge-based feature engineering practices, enhancing both the flexibility and power of their analytical endeavors.
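As a rough illustration of the SQL transformer idea, the sketch below applies a SQL aggregation to a table and collects the result as per-customer features. It uses Python's built-in `sqlite3` module as a stand-in; the table, column names, and query are hypothetical, and the way Feature Factory registers such a transformation is not shown here.

```python
import sqlite3

# In-memory database standing in for the user's source data (made-up rows).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transactions (customer_id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?)",
    [(1, 20.0), (1, 35.0), (2, 10.0), (2, 5.0), (2, 45.0)],
)

# Hypothetical domain-knowledge feature: average and maximum spend per customer.
FEATURE_SQL = """
    SELECT customer_id,
           AVG(amount) AS avg_amount,
           MAX(amount) AS max_amount
    FROM transactions
    GROUP BY customer_id
    ORDER BY customer_id
"""
features = conn.execute(FEATURE_SQL).fetchall()
conn.close()

for customer_id, avg_amount, max_amount in features:
    print(customer_id, avg_amount, max_amount)
```

Each output row is one customer with two derived columns, which could then sit alongside automatically discovered features in the same feature space.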

Preview Support of GenAI-Assisted Feature Discovery

Feature Factory Version 1.1 introduces a preview of functionality for creating custom features using natural language. Users describe the desired feature in English, and a GenAI assistant automatically generates the necessary code. If the description is unclear or additional data knowledge is required, the assistant prompts users for further information. This interactive dialogue enables users to define features without needing to code. The resulting features are automatically converted into either user-defined primitives or SQL transformers, both new additions in Version 1.1, and are then immediately available in the user's feature space.

Interactive Feature Selection

Feature Factory Version 1.1 enhances the feature engineering process with a powerful new capability: interactive feature selection. This tool allows users to selectively evaluate features through an intuitive interface provided by Feature Factory. The interactive selector widget not only ranks candidate features but also highlights those with significant statistical promise. Users can assess and compare features based on their predictive power and interpretability. This functionality empowers users to tailor and optimize the feature pipeline generated by dotData, leveraging both domain expertise and analytical intuition.
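The ranking and highlighting step can be sketched as follows. The release does not say which statistic the selector widget uses, so this example assumes absolute Pearson correlation with the target as a simple proxy for predictive power; the function names, the 0.5 threshold, and the sample values are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # A constant column carries no linear signal.
    return cov / sqrt(var_x * var_y)


def rank_features(features, target, threshold=0.5):
    """Return (name, score, highlighted) tuples, strongest feature first."""
    scored = [(name, abs(pearson(values, target))) for name, values in features.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return [(name, score, score >= threshold) for name, score in scored]


# Made-up candidate features and target values.
features = {
    "avg_amount": [10, 20, 30, 40, 50],
    "days_since_signup": [5, 3, 4, 1, 2],
    "constant_flag": [1, 1, 1, 1, 1],
}
target = [1, 2, 3, 4, 5]
ranking = rank_features(features, target)

for name, score, highlighted in ranking:
    print(f"{name}: {score:.2f}{' *' if highlighted else ''}")
```

A user would still review the ranked list by hand, since a high score says nothing about interpretability or about whether a feature leaks the target.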

Sign up for a personalized walkthrough of Feature Factory 1.1 today

About dotData

dotData’s pioneering automated feature discovery and engineering platform addresses the most complex challenge of AI/ML projects. Our Feature Factory technology uncovers hidden gems, providing transparent, explainable features by connecting the dots within large-scale data sets in hours, eliminating human bias. It allows data scientists to explore up to 100X more features, including those yet to be imagined, and augments AI/ML projects in an agile manner, delivering business value faster. In an era of rapid change, insights discovered through AI can be a game-changer for business growth and innovation across industries. The power of dotData’s platform to provide game-changing insights is why Global Fortune 500 organizations trust dotData.

For more information, visit the dotData website and join the conversation on X/Twitter and LinkedIn.

Walter Paliska

