dotData | AutoML 2.0 Solutions for Enterprise https://dotdata.com Data Science Automation and Machine Learning Platform | dotData Tue, 07 Jul 2020 21:08:00 +0000

dotData Launches dotData Stream – Containerized AI Model for Real-Time Prediction https://dotdata.com/dotdata-launches-dotdata-stream-containerized-ai-model-for-real-time-prediction/?utm_source=rss&utm_medium=rss&utm_campaign=dotdata-launches-dotdata-stream-containerized-ai-model-for-real-time-prediction Tue, 07 Jul 2020 21:08:00 +0000 https://dotdata.com/?p=8026 Highly Scalable and Effective AI/ML Container, Easily Deployable Either in the Cloud for ML Orchestration or at the Edge for...

The post dotData Launches dotData Stream – Containerized AI Model for Real-Time Prediction appeared first on dotData | AutoML 2.0 Solutions for Enterprise.

Highly Scalable and Effective AI/ML Container, Easily Deployable Either in the Cloud for ML Orchestration or at the Edge for Intelligent IoT

SAN MATEO, Calif., July 7, 2020 /PRNewswire/ – dotData, a leader in full-cycle data science automation and operationalization for the enterprise, today launched dotData Stream, a new containerized AI/ML model that enables real-time predictive capabilities for dotData users. dotData Stream was developed to meet the growing market demand for real-time prediction in use cases such as fraud detection, automated underwriting, dynamic pricing, industrial IoT, and more.

dotData Stream performs real-time predictions using AI/ML models developed on the dotData Platform, including feature transformations such as one-hot encoding, missing value imputation, data normalization, and outlier filtering. It is highly scalable and effective: a single prediction can be performed in as little as tens of milliseconds, or even faster for microbatch predictions. Deployment is as simple as launching a Docker container with AI/ML models that are downloaded from the dotData Platform with just one click. An endpoint for real-time predictions becomes immediately available. In addition, dotData Stream can run in cloud MLOps platforms for enterprise AI/ML orchestration or on edge servers for intelligent IoT applications.
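As a rough illustration of the feature transformations named above, here is a minimal Python sketch that applies them to a single incoming record. It is not dotData Stream's implementation, which the release does not describe; the field names (`amount`, `channel`), the training statistics, and the `transform` function are all hypothetical.

```python
import math

# Hypothetical statistics that would have been learned at training time.
TRAIN_STATS = {
    "amount": {"mean": 50.0, "std": 20.0, "lo": 0.0, "hi": 200.0},
}
CHANNELS = ["web", "store", "phone"]  # assumed known categories


def transform(record):
    """Turn one raw record into a numeric feature vector, or None if filtered."""
    stats = TRAIN_STATS["amount"]
    amount = record.get("amount")

    # Missing value imputation: fall back to the training mean.
    if amount is None or (isinstance(amount, float) and math.isnan(amount)):
        amount = stats["mean"]

    # Outlier filter: drop records outside the range seen in training.
    if not stats["lo"] <= amount <= stats["hi"]:
        return None

    # Data normalization: z-score using the training mean and std.
    z = (amount - stats["mean"]) / stats["std"]

    # One-hot encoding: one indicator per known category.
    one_hot = [1.0 if record.get("channel") == c else 0.0 for c in CHANNELS]

    return [z] + one_hot


features = transform({"amount": 90.0, "channel": "store"})
imputed = transform({"amount": None, "channel": "web"})
filtered = transform({"amount": 5000.0, "channel": "phone"})
```

In a real-time setting, the same statistics used during training have to be reused at prediction time, which is why the sketch keeps them in a fixed lookup rather than recomputing them per request.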

JFE Steel, one of the world’s leading integrated steel producers, recently implemented dotData to support the deployment of intelligent IoT in their manufacturing plants.

“After testing several leading autoML platforms, we chose dotData as we were impressed with dotData’s autoML 2.0 full-cycle automation of ML processes, including automated feature engineering on our manufacturing data,” said Mr. Kazuro Tsuda, Staff General Manager, Data Science Project Dept., JFE Steel Corporation. “JFE Steel has a vision to deploy various AI models to implement Cyber-Physical Systems in our steel manufacturing plants. dotData Stream will be a key component to realize our vision and JFE Steel is looking forward to expanding its partnership with the dotData team.”

“We are seeing an increasing demand for real-time prediction capability, which has become an essential necessity for many enterprise companies. dotData Stream allows our customers to leverage AI/ML capability in a real-time environment,” said Ryohei Fujimaki, Ph.D., founder and CEO of dotData. “We are honored and excited about our partnership with JFE Steel. Their intelligent IoT application is the perfect use case to demonstrate the ability of dotData Stream, and we are fully committed to supporting their vision to adopt AI/ML in smart manufacturing and achieve the full potential of Industry 4.0.”

dotData provides AutoML 2.0 solutions that help accelerate the process of developing AI and Machine Learning (AI/ML) models for use in advanced predictive analytics BI dashboards and applications. dotData makes it easy for BI developers and data engineers to develop AI/ML models in just days by automating the full life-cycle of the data science process, from business raw data through feature engineering to implementation of ML in production utilizing its proprietary AI technologies. dotData’s AI-powered feature engineering automatically applies data transformation, cleansing, normalization, aggregation, and combination, and transforms hundreds of tables with complex relationships and billions of rows into a single feature table, automating the most manual data science projects that are fundamental to developing predictive analytics solutions.

dotData democratizes data science by enabling BI developers and data engineers to make enterprise data science scalable and sustainable. dotData automates up to 100 percent of the AI/ML development workflow, enabling users to connect directly to their enterprise data sources to discover and evaluate millions of features from complex table structures and huge data sets with minimal user input.  dotData is also designed to operationalize AI/ML models by producing both feature and ML scoring pipelines in production, which IT teams can then immediately integrate with business workflows. This can further automate the time-consuming and arduous process of maintaining the deployed pipeline to ensure repeatability as data changes over time. With the dotData GUI, AI/ML development becomes a five-minute operation, requiring neither significant data science experience nor SQL/Python/R coding.

For more information or a demo of dotData’s AI-powered full-cycle data science automation platform, please visit dotData.com.

About JFE Steel Corporation
JFE Steel is a steelmaker engaged in the total steel-making process, taking iron ore raw material and turning it into final products. Boasting one of the world’s greatest capacities for steel production, JFE Steel satisfies customers by producing steel under a corporate philosophy of “contributing to society with the world’s most innovative technology.” The company also contributes to environmental protection by developing reduced-impact ironmaking processes and high-performance steel materials.

Official web site: https://www.jfe-steel.co.jp/en/company/about.html

About dotData
dotData pioneered AutoML 2.0 to help business intelligence professionals add AI/ML models to their BI stacks and predictive analytics applications quickly and easily. Fortune 500 organizations around the world use dotData to accelerate their ML and AI development to drive higher business value. dotData’s automated data science platform accelerates ROI and lowers the total cost of model development by automating the entire data science process that is at the heart of AI/ML. dotData ingests raw business data and uses an AI-based engine to automatically discover meaningful patterns and build ML-ready feature tables from relational, transactional, temporal, geo-locational, and text data.

dotData has been recognized as a leader by Forrester in the 2019 New Wave for AutoML platforms. dotData has also been recognized as the “best machine learning platform” for 2019 by the AI breakthrough awards, was named an “emerging vendor to watch” by CRN in the big data space and was named to CB Insights’ Top 100 AI Startups in 2020. For more information, visit www.dotdata.com, and join the conversation on Twitter and LinkedIn.

Take Advanced Analytics into Overdrive with AutoML 2.0 https://dotdata.com/take-advanced-analytics-into-overdrive-with-automl-2-0/?utm_source=rss&utm_medium=rss&utm_campaign=take-advanced-analytics-into-overdrive-with-automl-2-0 Tue, 07 Jul 2020 20:37:18 +0000 https://dotdata.com/?p=8020 The term “Advanced Analytics” was coined by the Gartner Group and is defined as the “…autonomous or semi-autonomous examination of...

The post Take Advanced Analytics into Overdrive with AutoML 2.0 appeared first on dotData | AutoML 2.0 Solutions for Enterprise.

The term “Advanced Analytics” was coined by the Gartner Group and is defined as the “…autonomous or semi-autonomous examination of data or content using sophisticated techniques and tools, typically beyond those of traditional business intelligence (BI), to discover deeper insights, make predictions, or generate recommendations.” Advanced analytics, by definition, requires the use of advanced techniques like data mining, machine learning, pattern matching, and other sophisticated manipulation of data in an effort to gain greater insights. The most broadly used category of advanced analytics is also known as predictive analytics. Predictive analytics itself is not new, but has traditionally been the exclusive domain of data scientists and highly skilled statisticians due to the extremely complex mathematical models required to effectively build predictive dashboards. While many organizations can benefit from predictive analytics, only a few are able to create and deploy dashboards powered by predictive algorithms, due to the high cost of hiring and retaining talent.

Why Advanced (Predictive) Analytics?

The benefits of advanced analytics and predictive analytics are relatively intuitive, given the typical use cases where predictive modeling can be beneficial. Customer churn, for example, is one of the most common and beneficial use cases for predictive analytics. Predicting which customers are likely to churn lets a business target those customers with upgrades and promotional offers, lowering churn rates. Similarly, predicting the likelihood of default on loans or outstanding payables can provide huge savings to organizations by limiting exposure to long-term collections. In marketing, predicting a campaign’s likely performance can have a massive impact on return on investment and can help marketing teams better focus their efforts. Even as early as 2011, research firm The Aberdeen Group found that businesses using predictive analytics could identify the right target audience and make precise offerings to them at twice the rate of companies that were not using predictive analytics. The benefits of being able to predict business outcomes are tangible and of high value. The challenge, historically, has been that developing predictive analytics systems is difficult, time-consuming and expensive.

The Traditional Workflow of Building Predictive Dashboards

When most people think of predictive analytics, the first thought that comes to mind is “expensive.” For most organizations, the challenge of predictive analytics is the cost of building effective models and delivering them in business-friendly dashboards that line-of-business users can work with. Achieving a well-formed predictive workflow is challenging because of the steps involved in going from “data” to “predictive models.” Fundamentally, there are five steps involved in moving from just having “data” to using it to predict business outcomes:

  1. Data Collection & Consolidation – Anyone who works in a large enterprise organization knows that enterprises love data. The problem, however, is that data lives in silos – separate systems for sales, marketing, operations, accounting etc. – sometimes sharing data – sometimes not. The first challenge of moving from data to predictions is that you need to take all that data and consolidate it into a unified analytics platform that can provide all relevant data for you to use and analyze.
  2. Data Prep and Data Cleansing – Another major challenge in preparing data for predictive analytics is often referred to as normalization. Data normalization has two distinct phases – first, data must be unified and normalized across systems – this is typically a highly manual process that involves performing actions like ensuring that field values are consistent across systems and that duplicate data is removed from the final data warehouse. The second phase is preparing data for AI modeling – also often manual and designed to minimize errors in AI model development due to problems with the data.
  3. Business Hypothesis Ideating, Testing & Validation – Once data is ready for analysis, the predictive analytics process requires what are commonly referred to as “features.” Features are nothing more than ways of using data to describe a potential useful outcome. Features will, in turn, be used in machine learning and AI models to derive predictions. Feature generation is often very time-consuming and manual and requires input from subject matter experts as well as data scientists and engineers in order to create, test and evaluate the usefulness of individual features.
  4. ML Model Development and Testing – With features built, the next step in the process is to run them through multiple ML algorithms to see which ML models provide better results. Again, this can be a very iterative and time-consuming process. In recent years, software tools known as “AutoML” have made the process of evaluating and testing ML models more automated.
  5. Deploying Models into Production Environments – Once a model is built and has been tested, the next phase is to deploy the model into the ultimate production environment. For BI-based predictive applications, this might be a PowerBI dashboard or a Tableau dashboard that provides some form of predictive scoring based on user input (filters, drop-downs, etc.), allowing users to perform “what if” analysis on the business problems being predicted.
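
To make the first two steps more concrete, here is a minimal Python sketch of consolidating records from two siloed systems and making field values consistent before analysis. The source tables, the `COUNTRY_ALIASES` mapping, and the `clean` and `consolidate` helpers are hypothetical and only illustrate the idea; real pipelines typically run against databases rather than in-memory lists.

```python
# Hypothetical exports from two siloed systems.
sales = [
    {"email": "Ann@Example.com ", "country": "USA", "orders": 3},
    {"email": "bob@example.com", "country": "U.S.", "orders": 1},
]
marketing = [
    {"email": "ann@example.com", "country": "United States", "clicks": 12},
    {"email": "cy@example.com", "country": "JP", "clicks": 4},
]

# Assumed mapping that makes field values consistent across systems.
COUNTRY_ALIASES = {"usa": "US", "u.s.": "US", "united states": "US", "jp": "JP"}


def clean(row):
    """Normalize the join key and the country field of one row."""
    out = dict(row)
    out["email"] = row["email"].strip().lower()
    key = row["country"].strip().lower()
    out["country"] = COUNTRY_ALIASES.get(key, row["country"])
    return out


def consolidate(*sources):
    """Merge rows from all sources into one record per customer email."""
    merged = {}
    for source in sources:
        for row in map(clean, source):
            merged.setdefault(row["email"], {}).update(row)
    return merged


customers = consolidate(sales, marketing)
```

Here the two rows for Ann collapse into a single record once the email is normalized, which is the kind of duplicate removal step 2 describes.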

AutoML 2.0: Automating Your Workflow

The five steps outlined above are a fairly simplified version of what tends to happen in practice. Several steps involve manual work that requires multiple experts and multiple types of work. While AutoML 1.0 tools have allowed for the rapid development of machine learning models, they have still relied on prepared data. New platforms, however, are becoming available that can automate nearly the entire process, from AI-based data prep all the way to model deployment, allowing BI teams for the first time to develop, test and deploy predictive models without having to hire expensive data scientists. These AutoML 2.0 platforms are ideally suited for mid-sized organizations and smaller enterprises that can benefit from predictive analytics, but may not have the data science skills or staff to execute on traditional workflows.

Next Steps

So how do you get started? As a first step, it’s important to understand the core differences between AutoML and AutoML 2.0 platforms. Investing in the wrong type of product could create a nightmare scenario where additional staff and new software packages are required before you can create value from your advanced analytics infrastructure. Organizations serious about advanced analytics must leverage data science automation to gain greater agility and faster, more accurate decision-making. The emergence of AutoML platforms allows enterprises to be more nimble by allowing them to tap into current teams and resources without having to recruit additional talent. AutoML 2.0 platforms empower BI developers and business analytics professionals to leverage AI/ML and add predictive analytics to their BI stack quickly and easily. AutoML 2.0 platforms not only generate features automatically, eliminating the most complex and time-consuming part of the workflow, but also select the best algorithm depending upon the application. By providing automated data preprocessing, model generation, and deployment with a transparent workflow, AutoML 2.0 is bringing AI to the masses, thereby accelerating data science adoption.

The Evolution, Misconceptions, and Reality of AutoML https://dotdata.com/the-evolution-misconceptions-and-reality-of-automl/?utm_source=rss&utm_medium=rss&utm_campaign=the-evolution-misconceptions-and-reality-of-automl Mon, 06 Jul 2020 17:00:24 +0000 https://dotdata.com/?p=8006 With every new technology, especially in the early days, comes a share of misconceptions, fallacy, and ambiguity. That’s why our...

The post The Evolution, Misconceptions, and Reality of AutoML appeared first on dotData | AutoML 2.0 Solutions for Enterprise.

With every new technology, especially in the early days, comes a share of misconceptions, fallacy, and ambiguity. That’s why our CEO Ryohei Fujimaki shares the top five myths and reality of AutoML with RTInsights: https://bit.ly/3gsV6qU

dotData Partners with Teradata, a Leading Cloud Data & Analytics Company https://dotdata.com/dotdata-partners-with-teradata-a-leading-cloud-data-analytics-company/?utm_source=rss&utm_medium=rss&utm_campaign=dotdata-partners-with-teradata-a-leading-cloud-data-analytics-company Tue, 30 Jun 2020 13:15:49 +0000 https://dotdata.com/?p=7900 By making Teradata Vantage available, dotData offers enhanced seamless integration between enterprise data management and AutoML 2.0 SAN MATEO, Calif.,...

The post dotData Partners with Teradata, a Leading Cloud Data & Analytics Company appeared first on dotData | AutoML 2.0 Solutions for Enterprise.

By making Teradata Vantage available, dotData offers enhanced seamless integration between enterprise data management and AutoML 2.0

SAN MATEO, Calif., June 30, 2020 – dotData, a leader in full-cycle data science automation and operationalization for the enterprise, announced today that it has partnered with Teradata, a leading cloud data & analytics company. The integration leverages the robust enterprise data management and analytic capabilities of Teradata’s Vantage platform and dotData’s autoML 2.0 platform to create a powerful end-to-end data science solution, from data collection and preparation through feature engineering to machine learning operationalization. The collaboration will streamline and simplify the movement of data between Teradata and dotData to help the companies’ joint customers derive more value from their AI and machine learning initiatives.  

Vantage is Teradata’s flagship platform that was designed to simplify analytic ecosystems by unifying analytics, data lakes and data warehouses. This single platform ensures that insights are based on 100 percent of the data. Regardless of the required scale or the location of the data, organizations receive a unified, integrated view of the business.

dotData provides AutoML 2.0 solutions that help accelerate the process of developing AI and Machine Learning models for use in advanced predictive analytics BI dashboards and applications. dotData makes it easy for BI developers and data engineers to develop AI/ML models in just days by automating the full life-cycle of the data science process, from business raw data through feature engineering to implementation of ML in production utilizing its proprietary AI technologies. dotData’s AI-powered feature engineering automatically applies data transformation, cleansing, normalization, aggregation, and combination, and transforms hundreds of tables with complex relationships and billions of rows into a single feature table, automating the most manual data science projects that are fundamental to developing predictive analytics solutions.

Mitsui Sumitomo Insurance Co., Ltd., a member of MS&AD Insurance Group Holdings, Inc. (‘MS&AD Insurance Group’) leverages the combination of Teradata and dotData in its data science program to optimize customer value and utilization of its products and services. 

“We aim to build trust and loyalty among our customers, and using advanced digital technology to better meet our customers’ needs is one way in which we excel above our competition,” said Mr. Shinichiro Funabiki, Senior Executive Officer, CDO of MS&AD Insurance Group. “Deploying the combined solution of Teradata and dotData gives us a powerful and streamlined data science platform that enables our data science team to maximize the value of our data science investments, and bring better value to our customers.” 

“The integration of the Teradata and dotData platforms enables our joint customers to leverage the benefits of Teradata’s proven, enterprise-class data solutions with dotData’s autoML 2.0 platform to derive even greater value from their AI and ML models,” said Ryohei Fujimaki, Ph.D., CEO and founder of dotData. “We are proud to collaborate with Teradata to help MS&AD transform the insurance business with our data science automation.”

dotData democratizes data science by enabling BI developers and data engineers to make enterprise data science scalable and sustainable. dotData automates up to 100 percent of the AI/ML development workflow, enabling users to connect directly to their enterprise data sources to discover and evaluate millions of features from complex table structures and huge data sets with minimal user input. dotData is also designed to operationalize AI/ML models by producing both feature and ML scoring pipelines in production, which IT teams can then immediately integrate with business workflows. This can further automate the time-consuming and arduous process of maintaining the deployed pipeline to ensure repeatability as data changes over time. With the dotData GUI, AI/ML development becomes a five-minute operation, requiring neither significant data science experience nor SQL/Python/R coding.

For more information or a demo of dotData’s AI-powered full-cycle data science automation platform, please visit dotData.com.

 

About dotData

dotData pioneered AutoML 2.0 to help business intelligence professionals add AI/ML models to their BI stacks and predictive analytics applications quickly and easily. Fortune 500 organizations around the world use dotData to accelerate their ML and AI development to drive higher business value. dotData’s automated data science platform accelerates ROI and lowers the total cost of model development by automating the entire data science process that is at the heart of AI/ML. dotData ingests raw business data and uses an AI-based engine to automatically discover meaningful patterns and build ML-ready feature tables from relational, transactional, temporal, geo-locational, and text data.

dotData has been recognized as a leader by Forrester in the 2019 New Wave for AutoML platforms. dotData has also been recognized as the “best machine learning platform” for 2019 by the AI breakthrough awards, was named an “emerging vendor to watch” by CRN in the big data space and was named to CB Insights’ Top 100 AI Startups in 2020. For more information, visit www.dotdata.com, and join the conversation on Twitter and LinkedIn.

The dotData logo is a registered trademark of dotData, Inc. and/or its affiliates in the U.S. and worldwide. 

About Teradata

Teradata transforms how businesses work and people live through the power of data. Teradata leverages all of the data, all of the time, so you can analyze anything, deploy anywhere, and deliver analytics that matter most to your business. And we do it on-premises, in the cloud, or anywhere in between. We call this pervasive data intelligence, powered by the cloud. It’s the answer to the complexity, cost and inadequacy of today’s approach to analytics. Get the answer at teradata.com. 

The Teradata logo is a trademark, and Teradata is a registered trademark of Teradata Corporation and/or its affiliates in the U.S. and worldwide.

 

MEDIA CONTACT:

Jennifer Moritz
Zer0 to 5ive for dotData
jmoritz@0to5.com
917-748-4006

What You Should Know about Investing in AI During Economic Downturn https://dotdata.com/what-you-should-know-about-investing-in-ai-during-economic-downturn/?utm_source=rss&utm_medium=rss&utm_campaign=what-you-should-know-about-investing-in-ai-during-economic-downturn Mon, 29 Jun 2020 16:07:14 +0000 https://dotdata.com/?p=7896 Our CEO, Ryohei Fujimaki, PhD discusses investing in #AI during the economic downturn for @thenextweb https://bit.ly/2YHcF0e #DataScience #ArtificialIntelligence

The post What You Should Know about Investing in AI During Economic Downturn appeared first on dotData | AutoML 2.0 Solutions for Enterprise.

Five Practical Challenges in Enterprise AI / ML https://dotdata.com/five-practical-challenges-in-enterprise-ai-ml/?utm_source=rss&utm_medium=rss&utm_campaign=five-practical-challenges-in-enterprise-ai-ml Tue, 23 Jun 2020 13:10:12 +0000 https://dotdata.com/?p=7886 According to a recent Gartner blog about analytics and BI solutions, only 20% of analytical insights will deliver business outcomes...

The post Five Practical Challenges in Enterprise AI / ML appeared first on dotData | AutoML 2.0 Solutions for Enterprise.

According to a recent Gartner blog about analytics and BI solutions, only 20% of analytical insights will deliver business outcomes through 2022. Another article by VentureBeat AI reported that 87% of data science projects never make it into production. And a global survey by Dimensional Research concluded that 78% of AI/ML projects stall at some stage before deployment. These results indicate an exceptionally high failure rate across analytics, data science, and machine learning projects. There are many reasons why so many projects fail to meet their business objectives. In this post, we look at the top practical challenges that enterprise AI projects face and how you can mitigate them:

  1. Identifying business problems and appropriate use cases
    While AI is an incredibly powerful technology, it is not a panacea for every business problem. Building AI because everyone is doing it and throwing any problem at it without concrete objectives is a path to failure. AI is great at sifting through massive amounts of data, discovering patterns, and finding hidden insights that otherwise are not obvious. To get started, prioritize hard-to-solve, complex business problems that have clear objectives. Assemble a cross-functional team of technical and functional experts, and ensure buy-in from domain experts. Finally, define success criteria and measure success with relevant key metrics.
  2. Access to high-quality data
    AI and Machine learning tools rely on data to train underlying algorithms. Access to clean, meaningful data that is representative of the problem at hand is critical for the success of AI initiatives. But, enterprise data tends to be biased, noisy, outdated, unstructured, and full of errors. Many companies lack data infrastructure or do not have enough volume or quality data. Others use antiquated error-prone manual methods for data preparation resulting in inaccurate data and ultimately wrong business decisions. A typical enterprise data architecture should include master data preparation tools designed for data cleansing, formatting, and standardization before storing the data in data lakes and data marts. Data quality, data management, and governance issues are of paramount importance given the high reliance on good quality data and if overlooked, can derail any AI and ML project.
  3. Data pipeline complexity
    Data is spread across disparate databases in different formats, and you need to blend and consolidate data from disconnected systems. The challenge is how to extract, clean, and reformat the data to make it ready for predictive analytics. This processed data requires further manipulation that is specific to AI/ML pipelines, including additional table joins and further data prep and cleansing. The process requires data engineers to write SQL code and perform manual joins to complete the remaining tasks. This complex process of data ingestion, storage, cleansing, and transformation takes time and is a major bottleneck in scaling data science operations. Automated machine learning tools such as AutoML 2.0 platforms eliminate the complexity of the data pipeline by automating a full-cycle data science workflow. Through automation, these platforms transform the raw data into the inputs of machine learning (a process known as feature engineering) and produce predictions by combining hundreds of features or more.
  4. Balancing model accuracy and interpretability
    There is a trade-off between prediction accuracy and model interpretability and data scientists have to do the balancing act by selecting the appropriate modeling approach. Generally speaking, higher accuracy means complex models that are hard to interpret. Easy interpretation means using simpler models but that comes by sacrificing a little bit of accuracy. Traditional data science projects tend to adopt what is known as black-box approaches that generate minimal actionable insights resulting in a lack of accountability in the decision-making process. The solution to the transparency paradox is a new approach that involves using white-box models. White-box modeling implies generating transparent features and models that empower your AI team to execute complex projects with confidence and certainty. White-box models (WBMs) provide clear explanations of how they behave, how they produce predictions, and what variables influenced the model. WBMs are preferred in many enterprise data science use cases because of their transparent ‘inner-working’ modeling process and easily interpretable behavior. Explainability is very important in enterprise data science projects. By giving insight about how the prediction models work and the reasoning behind predictions, organizations can build trust and increase transparency. AutoML 2.0 platforms automate the trade-off between accuracy and interpretability and give users the choice to select the right approach based on the use case.
  5. Model operationalization and deployment
    ML delivers value only once the final model leaves the data scientist’s Jupyter notebook and is deployed in production. Operationalization means that the model is running in a production environment (not a sandbox environment), connected to business applications, and making predictions using live data. This last-mile deployment has traditionally been a slow, manual, and prolonged process, which can render the models and their insights obsolete. It can take anywhere from 8 to 90 days to deploy a single model in production. Irrespective of the AI and ML platform used, it should provide endpoints to run and control the developed pipeline and integrate easily with other business systems using standard APIs. There are several approaches to moving models into production. You need to think through batch vs. real-time prediction and take into account whether a real-time prediction service is feasible in terms of cost, infrastructure, and complexity. Deployment also includes monitoring model performance, detecting performance degradation, and updating models as necessary. Automation makes enterprise-level, end-to-end data science operationalization possible with minimum effort and maximum impact, enabling enterprise data science and software/IT teams to operationalize complex data science projects. Every enterprise data science project should start with a plan to deploy models in production to capture value and realize AI’s potential.
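
The monitoring part of deployment mentioned in item 5 can be sketched in a few lines of Python. This is only an illustrative sketch under assumed parameters: the `AccuracyMonitor` class, the baseline accuracy, the window size, and the tolerance are all hypothetical, and production systems usually track more than a single accuracy figure.

```python
from collections import deque


class AccuracyMonitor:
    """Track accuracy over a sliding window and flag degradation."""

    def __init__(self, baseline, window=100, tolerance=0.05):
        self.baseline = baseline      # accuracy measured at deployment time
        self.tolerance = tolerance    # allowed drop before flagging
        self.outcomes = deque(maxlen=window)  # keeps only recent results

    def record(self, predicted, actual):
        """Store whether one live prediction matched the observed outcome."""
        self.outcomes.append(predicted == actual)

    def accuracy(self):
        """Return windowed accuracy, or None if nothing has been recorded."""
        if not self.outcomes:
            return None
        return sum(self.outcomes) / len(self.outcomes)

    def needs_retraining(self):
        """True when windowed accuracy falls below baseline minus tolerance."""
        acc = self.accuracy()
        return acc is not None and acc < self.baseline - self.tolerance


monitor = AccuracyMonitor(baseline=0.90, window=10)
for predicted, actual in [(1, 1)] * 7 + [(1, 0)] * 3:
    monitor.record(predicted, actual)
```

With seven correct and three incorrect predictions in the window, the windowed accuracy is well below the assumed baseline, so `monitor.needs_retraining()` signals that the model should be updated.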

Demystifying Feature Engineering for Machine Learning https://dotdata.com/demystifying-feature-engineering-for-machine-learning/?utm_source=rss&utm_medium=rss&utm_campaign=demystifying-feature-engineering-for-machine-learning Thu, 04 Jun 2020 17:26:40 +0000 https://dotdata.com/?p=7100 What is Feature Engineering Let’s say you are addressing a complex business problem such as predicting customer churn or forecasting...

The post Demystifying Feature Engineering for Machine Learning appeared first on dotData | AutoML 2.0 Solutions for Enterprise.

]]>
What is Feature Engineering

Let’s say you are addressing a complex business problem such as predicting customer churn or forecasting product demand using applied machine learning. Assuming a team is in place and the business case identified, where do you start? The first step is to collect the relevant data to train the machine learning (ML) algorithms. This is usually followed by the selection of the appropriate algorithm or ensemble of algorithms. Choosing the right algorithm depends on the business goals (Accuracy vs Interpretability), category of the problem (Regression or Classification), nature of data (Categorical or Numerical), desired outcome, and constraints (computational resources, training time, latency). Irrespective of the choice of algorithm, whether it is logistic regression, decision tree, boosting, or neural networks, there is a fundamental requirement: high-quality input data that captures relevant business hypotheses and historical patterns. Producing that input is known as Feature Engineering (FE). Often the algorithms get all the limelight, and many people believe that algorithms are the secret weapons in the AI battle. But it is FE that performs the magic behind machine learning.

FE is the process of applying domain knowledge to extract analytical representations from raw data, making it ready for machine learning. It involves the application of business knowledge, mathematics, and statistics to transform data into a format that can be directly consumed by machine learning models. It starts from many tables spread across disparate databases that are then joined, aggregated, and combined into a single flat table using statistical transformations and/or relational operations.
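To make the "join, aggregate, and combine into a single flat table" idea concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The table names (`customers`, `transactions`), the columns, and the function `build_feature_table` are invented for this example; real enterprise schemas involve far more tables and transformations.

```python
import sqlite3


def build_feature_table(conn):
    """Join customers with their transactions and aggregate to one row per customer."""
    query = """
        SELECT c.customer_id,
               c.plan,
               COUNT(t.amount)              AS txn_count,
               COALESCE(SUM(t.amount), 0.0) AS total_spend,
               COALESCE(AVG(t.amount), 0.0) AS avg_spend
        FROM customers AS c
        LEFT JOIN transactions AS t ON t.customer_id = c.customer_id
        GROUP BY c.customer_id, c.plan
        ORDER BY c.customer_id
    """
    return conn.execute(query).fetchall()


# Hypothetical sample data: three customers, one of whom has no transactions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, plan TEXT);
    CREATE TABLE transactions (customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'basic'), (2, 'premium'), (3, 'basic');
    INSERT INTO transactions VALUES (1, 20.0), (1, 40.0), (2, 100.0);
""")
features = build_feature_table(conn)
```

Each row of `features` is one customer with a transaction count, total spend, and average spend. The `LEFT JOIN` keeps customers with no transactions, and `COALESCE` turns their missing aggregates into zeros, a small instance of the imputation decisions that FE involves.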

[Figure: Enterprise data to ML-ready data using AI-powered Feature Engineering]

Practical FE is far more complicated than simple transformation exercises such as One-Hot Encoding. To implement FE, you need to write hundreds or even thousands of SQL-like queries, performing a lot of data manipulation, as well as a multitude of statistical transformations.

The Significance of Feature Engineering

ML is driven by algorithms and the algorithms are dependent on data. If you know the historical data, you can detect the pattern. Once you uncover a pattern, you can build a hypothesis. Based on the hypothesis, you can predict the likely outcome such as which customers are likely to churn in a given time period. FE is all about finding the optimal combination of hypotheses.

FE is critical because if you provide the wrong hypotheses as input, ML cannot make accurate predictions. The quality of the hypotheses provided is vital to the success of an ML model, and feature quality matters for both accuracy and interpretability. FE is the most iterative, time-consuming, and resource-intensive part of the process, and it involves interdisciplinary expertise. It requires technical knowledge but, more importantly, domain knowledge. The data science team builds features by working with domain experts, testing hypotheses, building and evaluating ML models, and repeating the process until the results become acceptable to the business.
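One simple way to picture "testing hypotheses" is to score each candidate feature against the outcome and rank the results. The sketch below uses absolute Pearson correlation with a churn label as the score. This is only one of many possible relevance measures, and the feature names and data are made up for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def rank_features(features, target):
    """Return (name, score) pairs sorted from most to least correlated with target."""
    scores = {name: abs(pearson(values, target)) for name, values in features.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical data: 1 = customer churned, 0 = customer stayed.
churned = [1, 1, 0, 0, 1, 0]
candidate_features = {
    "support_calls_last_30d": [5, 4, 1, 0, 6, 1],
    "account_age_months": [12, 12, 12, 12, 12, 12],  # constant, carries no signal
    "avg_monthly_spend": [30, 80, 45, 60, 50, 70],
}
ranking = rank_features(candidate_features, churned)
```

In this toy data, `support_calls_last_30d` ranks first and the constant `account_age_months` ranks last with a score of 0.0. A score like this is only a first filter; the iterative loop described above still requires building and evaluating models with domain experts.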

Feature Engineering Automation

FE automation has vast potential to change the traditional data science process. It lowers skill barriers further than ML automation alone, eliminates hundreds or even thousands of manually crafted SQL queries, and speeds up data science projects even when the team lacks full domain knowledge. It also augments our data insights and delivers “unknown unknowns” based on the ability to explore millions of feature hypotheses in just hours.
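To give a sense of why the space of feature hypotheses grows so quickly, the sketch below enumerates candidate features as combinations of a column, an aggregation, and a time window. It illustrates only the combinatorics and makes no claim about how any particular FE automation engine works; the column names and the naming scheme are assumptions.

```python
from itertools import product


def enumerate_feature_hypotheses(columns, aggregations, windows_days):
    """Yield one candidate feature name per (aggregation, column, window) combination."""
    for agg, col, window in product(aggregations, columns, windows_days):
        yield f"{agg}_{col}_last_{window}d"


# Hypothetical inputs: 3 columns x 4 aggregations x 3 time windows.
columns = ["amount", "session_length", "support_calls"]
aggregations = ["sum", "mean", "max", "count"]
windows_days = [7, 30, 90]
hypotheses = list(enumerate_feature_hypotheses(columns, aggregations, windows_days))
```

Even this tiny setup yields 36 candidates, such as `sum_amount_last_7d`. With hundreds of columns across many joined tables, plus filters and further transformations, the count quickly outgrows what a team can write and test by hand.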

These days, automated machine learning (AutoML) is attracting a lot of attention. AutoML tackles two critical challenges that organizations struggle with: the sheer length of AI and ML projects, which usually take months to complete, and the severe shortage of qualified talent to handle them. While current AutoML products have undoubtedly made significant inroads in accelerating the AI and machine learning process, they fail to address the most significant step: preparing the input for machine learning from raw business data, in other words, feature engineering.

To create a genuine shift in how modern organizations leverage AI and machine learning, the full cycle of data science development must involve automation. If the problems at the heart of data science automation are due to a lack of data scientists, business users’ poor understanding of ML, and difficulties in migrating to production environments, then these are the challenges that AutoML must also resolve.

AutoML 2.0, which automates data and feature engineering, combines FE automation and ML automation into a single, one-stop pipeline. With AutoML 2.0, the full cycle from raw data through data and feature engineering to ML model development takes days, not months, and a team can deliver 10x more projects.

Summary

Contrary to popular belief, algorithms are not the most distinguishing features of applied machine learning. FE influences the performance and accuracy of ML models more than anything else. It helps reveal the hidden patterns in the data and increases the predictive power of machine learning. In order for ML algorithms to work properly, you need to provide the right input data that algorithms can understand. Oftentimes this involves complex mathematical transformations on raw data. FE provides that input data into a single aggregated format optimized for ML. It is the secret sauce that enables AI/ML to do the magic. Whether it is preventing fraud in financial services, anomaly detection in manufacturing, or predicting customer churn for insurance companies, feature engineering is the most decisive factor for AI/ML success or failure.

]]>
dotData to Present on AutoML 2.0 at Deep Learning World https://dotdata.com/dotdata-to-present-on-automl-2-0-at-deep-learning-world/?utm_source=rss&utm_medium=rss&utm_campaign=dotdata-to-present-on-automl-2-0-at-deep-learning-world Tue, 02 Jun 2020 13:10:57 +0000 https://dotdata.com/?p=7088 To watch the presentation: https://hopin.to/events/machine-learning-week-2020 Code: MLWUS20 Click on “Main Stage” CEO to Discuss How AutoML 2.0 Can Empower BI...

The post dotData to Present on AutoML 2.0 at Deep Learning World appeared first on dotData | AutoML 2.0 Solutions for Enterprise.

]]>
To watch the presentation:
  1. https://hopin.to/events/machine-learning-week-2020
  2. Code: MLWUS20
  3. Click on “Main Stage”

CEO to Discuss How AutoML 2.0 Can Empower BI Professionals to Implement AI, Machine Learning, and Predictive Analytics

San Mateo, June 1, 2020 – dotData, a leader in AutoML 2.0 software to help accelerate AI/ML development and operationalization for the enterprise, today announced that its founder and CEO, Ryohei Fujimaki, Ph.D., will present at the Deep Learning World conference, taking place virtually from May 31 – June 4, 2020. 

Mr. Fujimaki will present a session, From BI to BI+AI: Adding Predictive Analytics to Your BI Stack in Days, Not Months, on Tuesday, June 2 at 9:00 a.m. The session will discuss how the latest advancements in AutoML 2.0 can help business intelligence professionals rapidly and easily implement data science and machine learning, from source data through feature engineering to implementation of ML in production. 

dotData provides solutions that dramatically improve the productivity of data science projects, which traditionally require extensive manual effort from valuable and skilled resources, by automating the data science process utilizing its proprietary artificial intelligence technologies.

“dotData’s AutoML 2.0 platform makes data science a point-and-click operation, empowering business intelligence teams with a powerful tool to leverage their data for business insights. This frees businesses to focus on the results of their AI and machine learning applications, rather than the implementation,” said Mr. Fujimaki.

dotData’s AI-powered feature engineering automatically applies data transformation, cleansing, normalization, aggregation, and combination, and transforms hundreds of tables with complex relationships and billions of rows into a single feature table, automating the most manual data science projects.

dotData democratizes data science by enabling existing resources to perform data science tasks, making enterprise data science scalable and sustainable. dotData is also designed to operationalize data science by producing both feature and ML scoring pipelines in production, which IT teams can then immediately integrate with business workflow. This can further automate the time-consuming and arduous process of maintaining the deployed pipeline to ensure repeatability as data changes over time. With the dotData GUI, the data science task becomes a five-minute operation, requiring neither significant data science experience nor SQL/Python/R coding. For more information or a demo of dotData’s AI-powered AutoML 2.0 platform, please visit dotData.com.

 

About dotData

dotData pioneered AutoML 2.0 to help business intelligence professionals add AI/ML models to their BI stacks and predictive analytics applications quickly and easily. Fortune 500 organizations around the world use dotData to accelerate their ML and AI development to drive higher business value. dotData’s automated data science platform accelerates ROI and lowers the total cost of model development by automating the entire data science process that is at the heart of AI/ML. dotData ingests raw business data and uses an AI-based engine to automatically discover meaningful patterns and build ML-ready feature tables from relational, transactional, temporal, geo-locational, and text data. 

dotData has been recognized as a leader by Forrester in the 2019 New Wave for AutoML platforms. dotData has also been recognized as the “best machine learning platform” for 2019 by the AI breakthrough awards, was named an “emerging vendor to watch” by CRN in the big data space and was named to CB Insights’ Top 100 AI Startups in 2020. For more information, visit www.dotdata.com, and join the conversation on Twitter and LinkedIn.

 

MEDIA CONTACT:
Jennifer Moritz                   
Zer0 to 5ive for dotData
jmoritz@0to5.com
917-748-4006

]]>
Enterprises should not neglect AI digital transformation https://dotdata.com/enterprises-should-not-neglect-ai-digital-transformation/?utm_source=rss&utm_medium=rss&utm_campaign=enterprises-should-not-neglect-ai-digital-transformation Mon, 01 Jun 2020 20:32:45 +0000 https://dotdata.com/?p=7093 What impact will #COVID19 have on #enterprise investments in #ArtificialIntelligence? Our CEO Ryohei Fujimaki recently shared his thoughts with @TechTarget’s...

The post Enterprises should not neglect AI digital transformation appeared first on dotData | AutoML 2.0 Solutions for Enterprise.

]]>
What impact will #COVID19 have on #enterprise investments in #ArtificialIntelligence? Our CEO Ryohei Fujimaki recently shared his thoughts with @TechTarget’s @markrlabbe: https://bit.ly/3gDkKKk #AI #DataScience #DigitalTransformation

]]>
AutoML 2.0: Making AI in manufacturing simple https://dotdata.com/automl-2-0-making-ai-in-manufacturing-simple/?utm_source=rss&utm_medium=rss&utm_campaign=automl-2-0-making-ai-in-manufacturing-simple Tue, 26 May 2020 20:09:10 +0000 https://dotdata.com/?p=7074 Published @SME_MFG: https://bit.ly/2TEOkFr — With #AutoML 2.0, firms can leverage the wealth of data at a manufacturer’s disposal, to create...

The post AutoML 2.0: Making AI in manufacturing simple appeared first on dotData | AutoML 2.0 Solutions for Enterprise.

]]>
Published @SME_MFG: https://bit.ly/2TEOkFr “With #AutoML 2.0, firms can leverage the wealth of data at a manufacturer’s disposal to create ML/AI algorithms in a matter of days.” Our CEO Ryohei Fujimaki shared his insights with SME for this article. #manufacturing #MachineLearning #DataScience #ArtificialIntelligence #ML and AI

]]>