Thought Leadership

The 10 Commandments of AI & ML (P2)

December 10, 2020

Concerned about accelerating AI workflows, addressing multiple use cases, and scaling ML initiatives? Here are the rules to help you succeed.

In this two-part blog series, we review the ten rules that will ensure success with your first AI/ML project, paving the way for many more. In the first part, we discussed aligning business objectives with use cases, getting a head start on data preparation, and the mechanics of feature engineering. We also talked about understanding AutoML tools’ capabilities and ensuring the right modeling approach while balancing model accuracy and interpretability.

This second part discusses why visibility into the ML process and results is critical, along with the importance of data science education, real-time analytics, infrastructure compatibility, and ML operationalization. Here are the five best practices:

  • Ensure the ML project has visibility. ML initiatives fail when they operate with a silo mentality, like a secret science experiment that no one needs to know about until it is perfected. In the absence of shared vision and goals, the business and technical teams do not share information, communication breaks down, and collaboration derails. Even if the PoC fails, it is essential to ensure access to the ML process, deliverables, and results. The whole point of democratizing AI is to make the right data available to SMEs and business experts in the organization whenever they need it. When all stakeholders and teams across the organization collaborate, see the results, and exchange meaningful information, your ML project will succeed. Only then will you create an analytical framework, a blueprint with a repeatable process, and scale ML to other use cases.
  • Take full advantage of data science education and sign up for additional learning opportunities to pick up AI and ML fundamentals from your AutoML vendor. Many businesses need help defining AI use cases and struggle with data readiness. The BI and analytics leaders at these companies have prioritized predictive analytics use cases, but they either don’t have the budget to hire more data scientists or simply don’t have in-house data science skills. A structured program that bundles training, use-case co-development, and support allows these companies to leverage existing resources, namely in-house BI teams, and add more value with AI + BI in record time. Your team will need a lot of hand-holding during the first few months of its initial ML initiative, so encourage everyone to take foundational ML courses. A reliable training program enables business people to try ML and instills confidence among SMEs, increasing ML adoption.
  • Ask stakeholders and functional domain experts if they have a use case that requires real-time data processing. Real-time analytics enables low-latency applications where sub-second (say, tens of milliseconds) performance is required. However, many vendors cannot process data that quickly and offer only batch processing, such as hourly or nightly batches. Stream processing is ideal for applications where real-time prediction services are needed. A few everyday use cases are Instant Credit Approval, Fraud Detection, Automated Underwriting, Dynamic Pricing, and Industrial IoT. A containerized AI model with minimal CPU and memory, deployable at the edge, may be necessary to enable real-time applications such as predictive maintenance or real-time quality monitoring in smart manufacturing (a minimal scoring-service sketch follows this list).
  • Evaluate infrastructure needs based on your enterprise data architecture. What will be the orchestration backbone (e.g., Amazon SageMaker)? Which tool will schedule, monitor, and manage data workflows and track jobs (e.g., Apache Airflow)? The infrastructure must be flexible and integrate easily with data pipelines and storage. If your strategy is to infuse AI into business intelligence (BI) applications, evaluate whether the ML platform provides a seamless BI-with-AI experience through out-of-the-box connectivity with third-party data and BI platforms such as Teradata, MS SQL / Azure databases, Tableau, or Power BI. Public cloud services are inexpensive, but you need to plan carefully around your use cases and suitable deployment scenarios: on-premises, cloud, or hybrid. The ML platform should be flexible enough to work with self-service data preparation tools, act as the core analytics engine, and connect easily with popular BI tools for visualization (an example workflow-scheduling sketch follows this list).
  • Don’t ignore ML operationalization. Operationalization means that the model runs in a production environment connected to business applications and makes predictions using live data. It can take anywhere from 8 to 90 days to deploy a single model in production. Irrespective of the AI and ML platform used, it should provide endpoints to run and control the developed pipeline and integrate easily with other business systems using standard APIs. Operationalizing ML should include model monitoring for drift, capturing performance degradation, and updating models as necessary (a simple drift-check sketch follows this list). Automation makes enterprise-level, end-to-end data science operationalization possible with minimum effort and enables enterprise data science and software/IT teams to operationalize complex ML projects. Every enterprise AI project should start with a plan to deploy models in production to capture the value and realize AI’s potential.
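
To make the real-time point above more concrete, here is a minimal sketch of a lightweight prediction service of the kind that could be packaged into a small container and deployed at the edge. It is illustrative only and not specific to any particular platform; the model artifact (model.pkl), the payload format, and the port are hypothetical placeholders.

```python
# Minimal sketch of a real-time scoring service, assuming an sklearn-style
# model saved as "model.pkl" (hypothetical artifact produced upstream).
import pickle

from flask import Flask, jsonify, request

app = Flask(__name__)

# Load the model once at startup so each request only pays for inference,
# keeping per-request latency in the sub-second range.
with open("model.pkl", "rb") as f:
    model = pickle.load(f)

@app.route("/predict", methods=["POST"])
def predict():
    # Expect a JSON payload such as {"features": [3.2, 0.7, 12.0]}
    payload = request.get_json(force=True)
    features = [payload["features"]]
    score = model.predict(features)[0]
    return jsonify({"prediction": float(score)})

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080)
```

The design point is simply that real-time use cases require the model to be resident in memory and reachable over a low-latency endpoint, rather than scored in hourly or nightly batches.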
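On the infrastructure question of scheduling and monitoring data workflows, the sketch below shows what a nightly data-preparation and scoring job might look like as an Apache Airflow DAG (Airflow 2.x assumed). The DAG ID, task names, and callables are hypothetical placeholders; an orchestration backbone such as SageMaker would be configured differently.

```python
# Minimal sketch of an Airflow DAG that runs data preparation and batch
# scoring once a day. The task bodies are placeholders for real pipeline steps.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def prepare_data():
    print("Pull source tables and build the feature table")  # placeholder

def score_batch():
    print("Apply the trained model to the prepared features")  # placeholder

with DAG(
    dag_id="nightly_scoring",
    start_date=datetime(2020, 12, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    prep = PythonOperator(task_id="prepare_data", python_callable=prepare_data)
    score = PythonOperator(task_id="score_batch", python_callable=score_batch)
    # Scoring runs only after data preparation succeeds.
    prep >> score
```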
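Finally, on model monitoring: one common, platform-agnostic way to quantify drift is the Population Stability Index (PSI), which compares a feature's training-time distribution with its distribution in live data. The sketch below is illustrative; the bin count and the 0.2 alert threshold are conventional rules of thumb, not requirements of any particular tool.

```python
# Minimal sketch of a drift check using the Population Stability Index (PSI).
import numpy as np

def population_stability_index(expected, actual, bins=10):
    """Compare two 1-D samples; a larger PSI means a larger distribution shift."""
    # Bin edges come from the training (expected) data so both samples are
    # bucketed consistently.
    edges = np.histogram_bin_edges(expected, bins=bins)
    expected_pct = np.histogram(expected, bins=edges)[0] / len(expected)
    actual_pct = np.histogram(actual, bins=edges)[0] / len(actual)
    # Avoid log(0) / division by zero for empty buckets.
    expected_pct = np.clip(expected_pct, 1e-6, None)
    actual_pct = np.clip(actual_pct, 1e-6, None)
    return float(np.sum((actual_pct - expected_pct) * np.log(actual_pct / expected_pct)))

# Example with synthetic data: the live sample has shifted mean and variance.
training_sample = np.random.normal(0.0, 1.0, 5000)
live_sample = np.random.normal(0.3, 1.2, 5000)
psi = population_stability_index(training_sample, live_sample)
status = "investigate drift" if psi > 0.2 else "stable"
print(f"PSI = {psi:.3f}; {status}")
```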

Finally, have realistic expectations for time to market and ROI on ML initiatives. The right technology and tools can make a difference. Full-cycle data science automation can automate the entire AI and ML workflow, perform the heavy lifting (from data preparation and feature engineering to model building), and accelerate your ML projects from months to days.

If you would like to learn more about ML platforms and what features to look for in an AutoML tool, check out our evaluation guide:
https://dotdata.com/ai-faststart-evaluation/

Sachin Andhare

Sachin is an enterprise product marketing leader with global experience in advanced analytics, digital transformation, and the IoT. He serves as Head of Product Marketing at dotData, evangelizing predictive analytics applications. Sachin has a diverse background across a variety of industries spanning software, hardware and service products including several startups as well as Fortune 500 companies.
