dotDataPy is a rich and scalable Python library that enables advanced users to access dotData’s data science automation functionality – including AI-powered feature engineering and automated machine learning – with just a few lines of Python code.
AutoML & Data Science Automation Library for Python
Simple Integration, Powerful Features
dotDataPy integrates easily with Jupyter notebooks and other Python development environments, so users can take full advantage of the Python ecosystem: rich visualization libraries such as Matplotlib and Plotly, state-of-the-art machine learning and deep learning tools such as scikit-learn, Spark MLlib, PyTorch, and TensorFlow, and flexible DataFrames such as pandas and PySpark.
The AI-powered features can be easily plugged into customized ML algorithms, enabling advanced users to further refine their models and extract new business insights.
All the Power of dotData
dotDataPy provides all the power of our award-winning enterprise platform in a Python library designed to fit the workflow and environment preferred by most data scientists. You still benefit from the same intelligent feature engineering automation and machine learning automation technology, in a package that fits the workflow you love.
The AutoML algorithm explores not only machine learning algorithms, but also feature preprocessing methods such as missing value imputation, outlier filtering, and standardization. In conjunction with AI-powered feature engineering, dotDataPy automates and streamlines the end-to-end data science process in a single environment as a one-stop shop.
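To make the idea of searching over preprocessing methods concrete, here is a minimal sketch in plain Python. It does not use or reproduce the dotDataPy API, which is not shown here. The names `SEARCH_SPACE`, `candidate_configs`, and `preprocess` are hypothetical, and the sketch only enumerates and applies candidate preprocessing configurations to one numeric column. Scoring each candidate with a model, as an AutoML system would, is left out.

```python
import itertools
import statistics

def impute(values, strategy):
    """Replace None entries with the mean or median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = statistics.mean(observed) if strategy == "mean" else statistics.median(observed)
    return [fill if v is None else v for v in values]

def filter_outliers(values, z_max):
    """Drop values more than z_max standard deviations from the mean (None disables)."""
    if z_max is None:
        return list(values)
    mu, sigma = statistics.mean(values), statistics.pstdev(values)
    if sigma == 0:
        return list(values)
    return [v for v in values if abs(v - mu) / sigma <= z_max]

def standardize(values, enabled):
    """Rescale to zero mean and unit variance when enabled."""
    if not enabled:
        return list(values)
    mu, sigma = statistics.mean(values), statistics.pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

# Hypothetical search space: every combination is one candidate configuration.
SEARCH_SPACE = {
    "impute": ["mean", "median"],
    "z_max": [None, 2.0],
    "standardize": [False, True],
}

def candidate_configs(space):
    """Yield one dict per combination of preprocessing options."""
    keys = list(space)
    for combo in itertools.product(*(space[k] for k in keys)):
        yield dict(zip(keys, combo))

def preprocess(values, config):
    """Apply imputation, then outlier filtering, then standardization."""
    out = impute(values, config["impute"])
    out = filter_outliers(out, config["z_max"])
    return standardize(out, config["standardize"])

# Illustrative column with one missing value and one extreme value.
column = [1.0, 2.0, None, 3.0, 2.0, 100.0, 1.0, 3.0]
configs = list(candidate_configs(SEARCH_SPACE))
for config in configs:
    print(config, len(preprocess(column, config)))
```

In a full search, each configuration would be paired with a candidate learning algorithm and evaluated on held-out data, and the best-scoring combination would be kept.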
High Accuracy, Greater Transparency
dotDataPy’s automated machine learning conducts hundreds of trials to fine-tune state-of-the-art machine learning algorithms (including proprietary ones) for the best accuracy under various optimization criteria. The fully automated process frees up the time and resources of data scientists and gives them the freedom to produce high-quality machine learning models, thus enabling teams to execute more data science projects than ever before.
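As a rough illustration of tuning by repeated trials against a chosen optimization criterion, the sketch below runs a simple random search in plain Python. It is a stand-in technique, not dotDataPy's actual search strategy or API, neither of which is described here. The tiny k-nearest-neighbors model, the synthetic data, and the names `CRITERIA` and `random_search` are all assumptions made for the example.

```python
import random

def knn_predict(train, query, k):
    """Majority vote among the k nearest training points (one numeric feature)."""
    nearest = sorted(train, key=lambda row: abs(row[0] - query))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if votes * 2 > k else 0

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def recall(y_true, y_pred):
    """Fraction of true positives that were predicted positive."""
    positives = [p for t, p in zip(y_true, y_pred) if t == 1]
    return sum(p == 1 for p in positives) / len(positives) if positives else 0.0

# Hypothetical registry of optimization criteria.
CRITERIA = {"accuracy": accuracy, "recall": recall}

def random_search(train, valid, criterion, n_trials, seed=0):
    """Sample k at random for n_trials and keep the best validation score."""
    rng = random.Random(seed)
    score_fn = CRITERIA[criterion]
    y_true = [label for _, label in valid]
    best = {"k": None, "score": -1.0}
    for _ in range(n_trials):
        k = rng.choice(range(1, 16, 2))  # odd k avoids tied votes
        y_pred = [knn_predict(train, x, k) for x, _ in valid]
        score = score_fn(y_true, y_pred)
        if score > best["score"]:
            best = {"k": k, "score": score}
    return best

def make_rows(n, rng):
    """Synthetic two-class data: class 1 is shifted by 2.0 along the feature."""
    rows = []
    for _ in range(n):
        label = rng.randint(0, 1)
        rows.append((rng.gauss(2.0 * label, 1.0), label))
    return rows

data_rng = random.Random(42)
train, valid = make_rows(200, data_rng), make_rows(100, data_rng)
best = random_search(train, valid, "accuracy", n_trials=20)
print(best)
```

Passing `"recall"` instead of `"accuracy"` changes which configuration wins, which is the sense in which the same search can serve different optimization criteria.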