
How to Evaluate and Select the Right AutoML Platform

If you are in the market for automated machine learning (AutoML) tools, there are plenty of choices. Forrester Research recently published a report highlighting nine Automation-Focussed Machine Learning Solutions and named dotData a leader. The report underscores the importance of Feature Engineering and Explainability as key differentiating factors for leaders in the AutoML space. But if you are new to machine learning, or are part of a BI and analytics team with a mandate to incorporate predictive analytics, how do you decide which AutoML tool is right for you? What factors should you consider as you make your decision?

The end-user & skill set

Any data science project starts with identifying business use cases and requirements. The process also depends heavily on the resources available to the business and on the skill set of the primary intended users. To make the best possible choice, organizations should start their evaluation by asking some fundamental questions:

  1. Who will be the primary intended users of the AutoML platform: the data science team or the BI team?
  2. What is the skill level and data science expertise of the primary users?
  3. Is the primary programming environment of the intended users Python?

The motivation for using an AutoML platform may be completely different depending on the user persona. If the intended users are data scientists and the primary environment is Python or R, you need a platform that offers extensive customization. Advanced analytical developers and data scientists may want to use an AutoML platform to generate new features but prefer to tweak models manually. BI and analytics teams, on the other hand, may be struggling with long lead times to prepare data, need help with algorithm selection, and want a tool that automates the data science workflow.

The data science workflow


How much of this process do you need to automate?

Top factors

Here is a quick rundown of major attributes to think through while evaluating an AutoML platform:

  1. Data Ingestion and Preparation:
    How much manipulation of data must be performed before it is ready for ingestion by the AutoML platform? Can you upload data to the AutoML platform without having to write additional SQL code?
  2. Feature Engineering Automation:
    How much manual work is involved in Feature Engineering? Can the system automatically explore all available database entity relationships and discover and evaluate features based on available columns and relationships?
  3. Machine Learning:
    Does the system support widely used ML libraries and frameworks such as scikit-learn, XGBoost, LightGBM, TensorFlow and PyTorch? Can users run an automated hyperparameter search for the ML algorithms?
  4. Production & Operationalization:
    How easy is it to deploy ML models in a production environment? Can you monitor models, discover data drift, and quickly retrain models if production data changes over time?
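
The data drift question in item 4 can be made concrete with a small check. The sketch below is one possible approach, not a feature of any particular AutoML platform: it computes the Population Stability Index (PSI) of a single numeric feature between a training-time sample and a production sample. The function name, the bin count, and the 0.25 threshold (a commonly quoted rule of thumb) are assumptions made for illustration.

```python
import math
from typing import List, Sequence


def population_stability_index(
    baseline: Sequence[float],
    current: Sequence[float],
    bins: int = 10,
    eps: float = 1e-6,
) -> float:
    """PSI of one numeric feature: training-time sample vs. production sample."""
    lo, hi = min(baseline), max(baseline)
    if hi == lo:
        raise ValueError("baseline feature is constant; PSI is undefined")
    width = (hi - lo) / bins  # equal-width bins derived from the baseline only

    def bin_shares(values: Sequence[float]) -> List[float]:
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            # Clamp values outside the baseline range into the edge bins.
            counts[max(0, min(idx, bins - 1))] += 1
        # eps keeps empty bins from producing log(0) or a division by zero.
        return [max(c / len(values), eps) for c in counts]

    expected = bin_shares(baseline)
    actual = bin_shares(current)
    return sum((a - e) * math.log(a / e) for e, a in zip(expected, actual))


# Synthetic data for illustration only.
baseline = [i / 100 for i in range(1000)]   # feature values seen at training time
shifted = [v + 3.0 for v in baseline]       # the same feature after a shift in production

psi_same = population_stability_index(baseline, baseline)
psi_shifted = population_stability_index(baseline, shifted)

# Assumed rule of thumb: a PSI above 0.25 signals a shift worth a retraining review.
needs_retraining = psi_shifted > 0.25
```

When evaluating a platform, it is worth asking whether a check of this kind is built in and runs automatically, or whether your team would have to script and schedule it.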

Platform Accessibility, Ease of Use, and Deployment Flexibility:
Can all steps of the data science process be executed seamlessly within a single platform without the need for moving between systems and applications?

Last but not least, is it easy for non-data scientists to understand the workflow of the application, the concepts, and the steps necessary to proceed?
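
One way to turn the questions above into a side-by-side comparison is a weighted scorecard. The following is a minimal sketch under stated assumptions: the criterion names mirror the five areas discussed above, while the weights, the 1-to-5 ratings, and the platform names are hypothetical placeholders that you would replace with your own priorities and findings.

```python
from typing import Dict

# Hypothetical weights for the five evaluation areas; they sum to 1.0.
WEIGHTS: Dict[str, float] = {
    "data_preparation": 0.20,
    "feature_engineering": 0.30,
    "machine_learning": 0.20,
    "operationalization": 0.20,
    "ease_of_use": 0.10,
}


def weighted_score(ratings: Dict[str, int], weights: Dict[str, float] = WEIGHTS) -> float:
    """Combine per-criterion ratings (1 = poor, 5 = excellent) into one score."""
    missing = set(weights) - set(ratings)
    if missing:
        raise ValueError(f"missing ratings for: {sorted(missing)}")
    return sum(weights[name] * ratings[name] for name in weights)


# Made-up ratings for two placeholder platforms, for illustration only.
candidates: Dict[str, Dict[str, int]] = {
    "platform_a": {
        "data_preparation": 3,
        "feature_engineering": 5,
        "machine_learning": 4,
        "operationalization": 4,
        "ease_of_use": 3,
    },
    "platform_b": {
        "data_preparation": 4,
        "feature_engineering": 2,
        "machine_learning": 5,
        "operationalization": 3,
        "ease_of_use": 5,
    },
}

scores = {name: weighted_score(r) for name, r in candidates.items()}
ranking = sorted(scores, key=scores.get, reverse=True)
```

A BI and analytics team might put more weight on ease of use and data preparation, while a data science team working in Python might weight machine learning and customization more heavily. The same ratings can therefore produce a different ranking for each user persona.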

To learn more about Automation-Focussed Machine Learning Solutions, the Forrester Wave report is a great resource. For guidance on the top factors to consider when selecting an AutoML platform, check out our latest AutoML Evaluation Guide.
 

Sachin Andhare

Sachin is an enterprise product marketing leader with global experience in advanced analytics, digital transformation, and the IoT. He serves as Head of Product Marketing at dotData, evangelizing predictive analytics applications. Sachin has a diverse background across a variety of industries spanning software, hardware and service products including several startups as well as Fortune 500 companies.
