Thought Leadership

Take Advanced Analytics into Overdrive with AutoML 2.0

July 7, 2020

The term “Advanced Analytics” was coined by Gartner, which defines it as the “…autonomous or semi-autonomous examination of data or content using sophisticated techniques and tools, typically beyond those of traditional business intelligence (BI), to discover deeper insights, make predictions, or generate recommendations.” Advanced analytics, by definition, requires advanced techniques like data mining, machine learning, pattern matching, and other sophisticated manipulation of data to gain deeper insights. The most widely used category of advanced analytics is predictive analytics. Predictive analytics itself is not new, but it has traditionally been the exclusive domain of data scientists and highly skilled statisticians because of the complex mathematical models required to build effective predictive dashboards. While many organizations could benefit from predictive analytics, only a few are able to create and deploy dashboards powered by predictive algorithms, due to the high cost of hiring and retaining the necessary talent.

Why Advanced (Predictive) Analytics?

The benefits of advanced and predictive analytics are relatively intuitive, given the typical use cases where predictive modeling helps. Customer churn, for example, is one of the most widely used and beneficial use cases for predictive analytics. Predicting which customers are likely to churn lets a business target those customers with upgrades and promotional offers, lowering churn rates. Similarly, predicting the likelihood of default on loans or outstanding payables can deliver huge savings by limiting exposure to long-term collections. In marketing, predicting a campaign’s likely performance can have a massive impact on return on investment and help marketing teams focus their efforts. Even as early as 2011, research firm The Aberdeen Group found that businesses using predictive analytics could identify the right target audience and make precise offerings to them at twice the rate of companies that were not. The benefits of being able to predict business outcomes are tangible and of high value. The challenge, historically, has been that developing predictive analytics systems is difficult, time-consuming, and expensive.

The Traditional Workflow of Building Predictive Dashboards

When most people think of predictive analytics, the first thought that comes to mind is “expensive.” For most organizations, the challenge of predictive analytics is the cost of building effective models that are delivered in business-friendly dashboards usable by line-of-business users. Achieving a well-formed predictive workflow is challenging because of the steps involved in going from “data” to “predictive models.” Fundamentally, there are five steps involved in moving from merely having “data” to using it to predict business outcomes:

  • Data Collection & Consolidation – Anyone who works in a large enterprise organization knows that enterprises love data. The problem, however, is that data lives in silos – separate systems for sales, marketing, operations, accounting etc. – sometimes sharing data – sometimes not. The first challenge of moving from data to predictions is that you need to take all that data and consolidate it into a unified analytics platform that can provide all relevant data for you to use and analyze.
  • Data Prep and Data Cleansing – Another major challenge in preparing data for predictive analytics is often referred to as normalization. Data normalization has two distinct phases. First, data must be unified and normalized across systems; this is typically a highly manual process involving tasks like ensuring that field values are consistent across systems and that duplicate data is removed from the final data warehouse. The second phase is preparing data for AI modeling, which is also often manual and is designed to minimize errors in AI model development caused by problems with the data.
  • Business Hypothesis Ideation, Testing & Validation – Once data is ready for analysis, the predictive analytics process requires what are commonly referred to as “features.” Features are ways of using data to describe a potentially useful outcome. Features are, in turn, used in machine learning and AI models to derive predictions. Feature generation is often time-consuming and manual, and requires input from subject matter experts as well as data scientists and engineers to create, test, and evaluate the usefulness of individual features.
  • ML Model Development and Testing – With features built, the next step is to evaluate those features against multiple ML algorithms to determine which models provide the best results. Again, this can be a very iterative and time-consuming process. In recent years, software tools known as “AutoML” have made the process of evaluating and testing ML models more automated.
  • Deploying Models into Production Environments – Once a model is built and tested, the next phase is to deploy it into the production environment. For BI-based predictive applications, this might be a Power BI or Tableau dashboard that provides some form of predictive scoring based on user input (filters, drop-downs, etc.), allowing users to perform “what if” analysis on the business problems being predicted.
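To make the middle steps concrete, here is a minimal, hypothetical sketch of steps two and three: normalizing raw customer data and hand-building features for a churn model. The table, column names, and thresholds are illustrative assumptions, not taken from any particular platform.

```python
# Hypothetical sketch of data cleansing (step 2) and feature generation
# (step 3) for a churn model. All column names and values are illustrative.
import pandas as pd

# Step 2: consolidate and normalize -- unify inconsistent field values
# and drop duplicate customer records from the merged data.
raw = pd.DataFrame({
    "customer_id": [1, 1, 2, 3],
    "region": ["us-east", "US-East", "us-west", "eu"],
    "monthly_spend": [100.0, 100.0, 250.0, 80.0],
    "support_tickets": [2, 2, 0, 5],
})
clean = (
    raw.assign(region=raw["region"].str.lower())  # make field values consistent
       .drop_duplicates(subset="customer_id")     # remove duplicate records
)

# Step 3: feature generation -- each feature encodes a business hypothesis,
# e.g. "customers with many tickets relative to spend are likely to churn".
features = clean.assign(
    tickets_per_dollar=clean["support_tickets"] / clean["monthly_spend"],
    high_touch=clean["support_tickets"] > 3,
)
print(features[["customer_id", "tickets_per_dollar", "high_touch"]])
```

In a real project each of these steps spans many systems and many candidate features; the point is that both are rule-heavy, manual work, which is exactly what AutoML 2.0 platforms aim to automate.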

AutoML 2.0: Automating Your Workflow

The five steps outlined above are a fairly simplified version of what actually happens. Several of the steps require manual work from multiple experts spanning several disciplines. While AutoML 1.0 tools have enabled the rapid development of machine learning models, they have still relied on prepared data. New platforms, however, are becoming available that can automate nearly the entire process – from AI-based data prep all the way to model deployment – allowing BI teams, for the first time, to develop, test, and deploy predictive models without having to hire expensive data scientists. These AutoML 2.0 platforms are ideally suited for mid-sized organizations and smaller enterprises that can benefit from predictive analytics but may not have the data science skills or staff to execute traditional workflows.
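As a rough illustration of what AutoML tooling automates in the model development step, the sketch below tries several candidate algorithms on the same features and keeps the one with the best cross-validated score. The candidate list and the synthetic dataset are illustrative assumptions; real AutoML platforms search far larger spaces of algorithms and hyperparameters.

```python
# Minimal sketch of automated model comparison: fit several candidate
# algorithms and keep the best by mean cross-validated accuracy.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for a prepared feature table and target label.
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

candidates = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "decision_tree": DecisionTreeClassifier(random_state=0),
    "random_forest": RandomForestClassifier(n_estimators=50, random_state=0),
}

# Score each candidate with 5-fold cross-validation and pick the winner.
scores = {
    name: cross_val_score(model, X, y, cv=5).mean()
    for name, model in candidates.items()
}
best_name = max(scores, key=scores.get)
print(f"best model: {best_name} (cv accuracy {scores[best_name]:.3f})")
```

AutoML 2.0 platforms extend this loop backward into automated data prep and feature generation and forward into deployment, which is what removes the remaining manual stages from the workflow.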

Next Steps

So how do you get started? As a first step, it’s important to understand the core differences between AutoML and AutoML 2.0 platforms. Investing in the wrong type of product could create a nightmare scenario where additional staff and new software packages are required before you can create value from your advanced analytics infrastructure. Organizations serious about advanced analytics must leverage data science automation to gain greater agility and faster, more accurate decision-making. The emergence of AutoML platforms allows enterprises to be more nimble by letting them tap into current teams and resources without having to recruit additional talent. AutoML 2.0 platforms empower BI developers and business analytics professionals to leverage AI/ML and add predictive analytics to their BI stack quickly and easily. These platforms not only generate features automatically, eliminating the most complex and time-consuming part of the workflow, but also select the best algorithm for the application. By providing automated data preprocessing, model generation, and deployment with a transparent workflow, AutoML 2.0 is bringing AI to the masses and accelerating data science adoption.

Walter Paliska

Walter brings 25+ years of experience in enterprise marketing to dotData. Walter oversees the Marketing organization and is responsible for product marketing and demand generation for dotData. Walter’s background includes experience with both software and hardware companies, and he has worked in seven different startups, including three successful bootstrap startups.
