Updated for 2022

According to a recent Gartner blog about analytics and BI solutions, only 20% of analytical insights will deliver business outcomes through 2022. Another article by VentureBeat AI reported that 87% of data science projects never make it into production. A global survey by Dimensional Research found that 78% of respondents' AI/ML projects stall at some stage before deployment. Even in 2022, as many as 68% of data scientists admit to abandoning 40% to 80% of their data science projects. These results indicate an exceptionally high failure rate across analytics, data science, and machine learning projects. There are many reasons why so many projects fail to meet their business objectives. In this blog, we look at the top practical challenges that enterprise AI projects face and how you can mitigate them:

Start with business problems you need to solve

While AI is an incredibly powerful technology, it is…
In the first part of this blog, Basic Concepts and Techniques of AI Model Transparency, we reviewed a few common techniques for AI model transparency, such as linear coefficients, local linear approximation, and permutation importance. In particular, permutation importance is applicable to any black-box model and any accuracy/error function, and it is more robust against high-dimensional data (because it handles features one at a time rather than all at once). One drawback of permutation importance is its high computation cost: the evaluation process has to be repeated (the number of features) × (the number of random shuffles) × (the number of models) times. To reduce the computation time, a common approach is downsampling, which works well when the positive and negative classes are balanced. However, when the classes are imbalanced, such naive downsampling makes permutation importance extremely unreliable.

Permutation Importance Under Class Imbalance

Let us first see…
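The permutation importance procedure and its evaluation cost, as described in this excerpt, can be sketched roughly as follows. This is a minimal pure-Python illustration, not the blog's own implementation: the function names, the toy model, and the synthetic data are all hypothetical, and the counter only tracks evaluations for a single model.

```python
import random

def permutation_importance(predict, X, y, score, n_repeats=5, seed=0):
    """Mean drop in score when each feature column is shuffled on its own."""
    rng = random.Random(seed)
    baseline = score(y, [predict(row) for row in X])
    n_features = len(X[0])
    importances = []
    n_evaluations = 0  # grows as n_features * n_repeats for one model
    for j in range(n_features):
        drops = []
        for _ in range(n_repeats):
            # Shuffle only column j; every other column stays untouched.
            column = [row[j] for row in X]
            rng.shuffle(column)
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            permuted = score(y, [predict(row) for row in X_perm])
            drops.append(baseline - permuted)
            n_evaluations += 1
        importances.append(sum(drops) / n_repeats)
    return importances, n_evaluations

def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Hypothetical black-box model: it uses feature 0 and ignores feature 1.
def toy_model(row):
    return int(row[0] > 0.5)

# Synthetic data (assumption): 200 rows, 2 uniform random features.
data_rng = random.Random(42)
X = [[data_rng.random(), data_rng.random()] for _ in range(200)]
y = [toy_model(row) for row in X]

importances, n_evaluations = permutation_importance(toy_model, X, y, accuracy)
```

With 2 features and 5 shuffles, the model is re-evaluated 10 times after the baseline. Repeating this for several models multiplies the count again, which is the cost that motivates downsampling.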
Transparency of AI/ML models is a topic as old as AI/ML itself. However, transparency has become increasingly important due to the proliferation of enterprise AI applications, critical breakthroughs in black-box ML modeling (e.g. Deep Learning), and greater concerns about the growing use of personal data in AI models. The word “transparency” is often used in different contexts, but generally refers to issues like: Interpretability and Explainability; Reproducibility and Traceability; Ethics, Trust and Fairness. This blog focuses on the most “basic” level of transparency: how to explain the impact of input variables (a.k.a. features) on the final prediction. There are many techniques to evaluate the impact of input features. Below are some common techniques and their advantages and disadvantages.

Linear coefficients for Linear Models

The simplest (but important) technique is linear coefficients (weights) of features. Fig.1 illustrates the idea of linear coefficients based on a simple two-dimensional example (x1 and x2 are two features).…
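To make the idea of linear coefficients concrete, here is a small sketch with two features. It assumes the usual linear form y = w1*x1 + w2*x2 + b. The weights and input values are hypothetical and are not taken from Fig.1.

```python
# Hypothetical two-feature linear model: y = w1*x1 + w2*x2 + b
W1, W2, B = 2.0, -0.5, 1.0

def linear_model(x1, x2):
    return W1 * x1 + W2 * x2 + B

base = linear_model(3.0, 4.0)

# Raising one feature by 1 while holding the other fixed changes the
# prediction by exactly that feature's coefficient.
effect_x1 = linear_model(4.0, 4.0) - base
effect_x2 = linear_model(3.0, 5.0) - base
```

Here `effect_x1` equals `W1` and `effect_x2` equals `W2`, so the sign and size of each coefficient can be read directly as the impact of that feature, provided the features are on comparable scales.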