Video: AI4: Accelerate AI With Automated Feature Engineering

AI Development Helped by dotData

Please join dotData CEO Ryohei Fujimaki, Ph.D., as he discusses AI automation, feature engineering, and the role they play in helping organizations accelerate their path to developing AI models in record time.

Video Transcript: AI-4: Accelerate AI With Automated Feature Engineering

Good morning, and good afternoon everyone. This is Ryohei Fujimaki, CEO and founder of dotData. Thank you very much for coming to our keynote session. Today, I’m going to explain how AI automation’s new technology is going to enable you to deliver your AI project in days, not the months that a traditionally manual process requires, and how AI automation helps you to accelerate your digital transformation. 

Before my presentation, let me briefly introduce who we are. We are an AI automation company, and dotData was established as a spin-off of NEC Corporation in 2018. Today we are at Series A, with a total of $43 million in funding as of 2019. dotData was named a leader in the automated machine learning market by Forrester Research, and Forrester said dotData is AutoML’s best-kept secret. We have been selected by many large and mid-sized enterprises, including Fortune 500 companies, and we have received a lot of world-class recognition from the market and from our partners.

Let me start my presentation with this quote: “Hundreds of AI models are making intelligent decisions using dotData every day, resulting in a 2.5 times conversion rate.” This is a quote from the General Manager of Data Innovation at one of the world’s top five property and casualty insurers, and this client is a very early adopter of our AI automation. Today, enterprises must accelerate their digital transformation and digital strategy more than ever before. Under the current economic situation, physical business becomes much harder. Regardless of the size of the company, enterprises must become more digital and bring more data and AI intelligence into their business. Imagine how many AI models you will need over the next three years to really accelerate your digital transformation.

What would happen if hundreds or even thousands of models were helping you make intelligent decisions behind your business systems? Let me talk a bit more about the story of this insurance client. They planned a big AI roadmap for an intelligent sales enablement system. In this system, they needed a lot of AI models to identify the right policy for the right customer, explain why the policy is best suited for the customer, and generate sales talking scripts to make AI insights more actionable for their sales team. When we met with them in 2018, they had no data science team and no data scientists. However, in just nine months after they started to use dotData, they delivered hundreds of production-quality models with our AI automation. Today, the system is running every day, and the policy conversion rate increased by 250%.

The key message here is that we need such scale and speed in AI development to produce truly impactful business value and accelerate digital transformation. But what are the challenges and the pains in making this vision happen? Let me explain. This is our view of the traditional AI development process. Enterprise AI is not just about machine learning. In fact, by the time we start machine learning, 80% of the work has already been done. So what is that 80% of the work before machine learning?

Once you define your business use cases, you first have to collect data from enterprise data stores, which were designed for business, not for AI or machine learning. So you have to load the data from different business systems, cleanse it, understand the schemas and entity relations, re-architect the data structure, find useful data patterns, implement many data manipulations, etc., to prepare the input for machine learning. Very, very importantly, this process requires deep data knowledge and domain knowledge. That’s why this process has always been very complex, very manual, and very time-consuming.

Once the input is ready, we do machine learning. This is more like playing with many mathematical and statistical algorithms to find the most accurate model. Overall, the traditional process takes months to complete and requires highly skilled data scientists. But remember, we need hundreds or even thousands of AI models, and this traditional process doesn’t scale. dotData changes this process in a completely different manner. Our AI automation is going to take care of data wrangling and manipulations without requiring data and domain knowledge. Our AI discovers useful patterns and hypotheses and, of course, dotData supports machine learning automation, which carefully tunes state-of-the-art machine learning models and includes feature cleansing, such as handling missing values, outliers, and data normalization.
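To make the feature cleansing steps mentioned here concrete, the following is a minimal sketch in plain Python of what handling missing values, outliers, and normalization can look like for a single numeric column. It is not dotData's implementation. The function name `cleanse`, the median imputation, the z-score clipping rule, and the `clip_z` parameter are all assumptions chosen for illustration.

```python
from statistics import mean, median, pstdev


def cleanse(values, clip_z=3.0):
    """Impute missing values, clip outliers, and standardize one numeric column.

    Illustrative only: median imputation and z-score clipping are assumed
    choices, not a description of any particular product.
    """
    # 1. Missing values: replace None with the median of the observed values.
    observed = [v for v in values if v is not None]
    fill = median(observed)
    filled = [fill if v is None else v for v in values]

    # 2. Outliers: clip anything further than clip_z standard deviations
    #    from the mean back to that boundary.
    mu, sigma = mean(filled), pstdev(filled)
    if sigma == 0:
        return [0.0 for _ in filled]  # constant column carries no signal
    low, high = mu - clip_z * sigma, mu + clip_z * sigma
    clipped = [min(max(v, low), high) for v in filled]

    # 3. Normalization: rescale the clipped column to zero mean, unit variance.
    mu_c, sigma_c = mean(clipped), pstdev(clipped)
    return [(v - mu_c) / sigma_c for v in clipped]


# Hypothetical column with one missing value and one extreme value.
monthly_spend = [10.0, 12.0, None, 11.0, 13.0, 500.0]
cleaned = cleanse(monthly_spend, clip_z=1.5)
```

The order matters in this sketch: imputing first means the clipping bounds are computed on a complete column, and standardizing last means the final scale reflects the clipped values rather than the raw outlier.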

With dotData, going from raw data to delivered AI models takes only days, not months. So why can dotData automate this very complex process? What is the difference between dotData and standard automated machine learning? The right-hand side is what machine learning or AutoML needs: a single, flat, aggregated table called the feature table. Every single column of this feature table corresponds to a certain hypothesis. For example, for customer churn, one column may be the time since the customer last used the service. Another column may be the total spending on the service during the past three months. Yet another column may be the average payment duration after the invoice is issued. But of course, there are no such columns or hypotheses in raw business data. Raw data are stored for business purposes, and different pieces of information are stored in different tables, like service usage logs, payment history, billing history, customer demographics, etc.
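As a rough illustration of the three churn hypotheses just described, the sketch below hand-builds a small feature table from two raw tables using only the Python standard library. This is the manual work that the talk says is being automated, not dotData's algorithm. The table layouts, column names, sample values, and the 90-day approximation of "three months" are all assumptions.

```python
from collections import defaultdict
from datetime import date

# Hypothetical raw business tables; every name and value is illustrative.
usage_log = [
    {"customer_id": 1, "used_on": date(2020, 5, 20)},
    {"customer_id": 1, "used_on": date(2020, 6, 25)},
    {"customer_id": 2, "used_on": date(2020, 3, 2)},
]
payments = [
    {"customer_id": 1, "invoiced_on": date(2020, 4, 15),
     "paid_on": date(2020, 4, 25), "amount": 40.0},
    {"customer_id": 1, "invoiced_on": date(2020, 6, 1),
     "paid_on": date(2020, 6, 21), "amount": 60.0},
    {"customer_id": 2, "invoiced_on": date(2020, 1, 1),
     "paid_on": date(2020, 1, 31), "amount": 25.0},
]


def build_feature_table(usage_log, payments, as_of, window_days=90):
    """Aggregate raw tables into one flat row of features per customer."""
    # Hypothesis 1: time since the customer last used the service.
    last_use = {}
    for row in usage_log:
        cid, used_on = row["customer_id"], row["used_on"]
        if used_on <= as_of:
            last_use[cid] = max(last_use.get(cid, used_on), used_on)

    # Hypothesis 2: total spending in the recent window.
    # Hypothesis 3: average number of days between invoice and payment.
    recent_spend = defaultdict(float)
    delays = defaultdict(list)
    for row in payments:
        cid = row["customer_id"]
        if 0 <= (as_of - row["invoiced_on"]).days <= window_days:
            recent_spend[cid] += row["amount"]
        delays[cid].append((row["paid_on"] - row["invoiced_on"]).days)

    customers = sorted(set(last_use) | set(delays))
    return [
        {
            "customer_id": cid,
            "days_since_last_use":
                (as_of - last_use[cid]).days if cid in last_use else None,
            "recent_spend": recent_spend.get(cid, 0.0),
            "avg_payment_delay_days":
                sum(delays[cid]) / len(delays[cid]) if cid in delays else None,
        }
        for cid in customers
    ]


feature_table = build_feature_table(usage_log, payments, as_of=date(2020, 6, 30))
```

Each key in the resulting rows is one column of the feature table, and each column encodes one hypothesis. Even in this toy version, every new hypothesis needs its own join, filter, and aggregation logic, which is why the step grows expensive as the number of tables and ideas increases.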

They have very complex relations and are much larger, often billions of records, because they are not yet aggregated. So the very lengthy process before machine learning is about transforming the raw business data on the left-hand side into the feature table on the right-hand side, based on your data knowledge and domain knowledge. Here’s a quote: “Feature engineering is typically where most of the effort in a machine learning project goes. Intuition, creativity, and black art are as important as the technical stuff.” This is from Pedro Domingos, a famous professor at the University of Washington. Our key innovation is really an intelligent AI algorithm that automates this feature engineering process and allows the machine to come up with hypotheses and patterns. So dotData is going to automate the full cycle of AI development.

What is it going to deliver to your business? First, dotData allows you to deliver AI projects 10 times faster. It used to take months to complete the traditional manual process, but now, with dotData, you can build a production-ready AI model within days. Additionally, dotData makes your AI development process more agile and more robust. The traditional process requires so much upfront effort for wrangling the data before you see any outcomes.

We have seen many client projects that failed, rejected by the business team after months and months of effort to build AI models. But with dotData, you can get the first AI models within hours, adjust them very quickly by reflecting feedback from your business team, and deliver business results within days. The second value that dotData delivers is that it enables BI and analytics teams to deliver AI. Most enterprises do not have enough data scientists, so instead you can leverage your existing BI and analytics team to scale your enterprise AI initiatives 10 times faster, with 10 times more people. dotData delivers speed and scale to the enterprise.

So what do our platform and offerings look like? We have two different platforms depending on your team. The first one is dotData Enterprise, which is designed for BI and analytics teams to perform full-cycle AI automation with zero coding. All technical complexities are taken care of by the platform. Most of our dotData Enterprise users had no AI experience before dotData but quickly ramped up to deliver production-ready AI models. If you and your team are more experienced data scientists, dotData Py allows you to leverage our AI automation while still maintaining the flexibility to customize your own process. It’s all interfaced through Python and data frames, so you can take advantage of your own Python libraries, or the powerful Python ecosystem, integrated with our AI automation. Regardless of the interface, both platforms allow you to fully leverage our full-cycle automation.

From this slide, let me introduce a couple of quotes from our customer success stories. The first quote is: “dotData built a sales forecasting model that was 70% more accurate than our benchmark, all in just a few hours with zero coding and no data scientist.” This is from Bhuvana, the CIO at Convergint Technologies. Convergint Technologies is a leading integrator and service provider of building security systems. Due to COVID, their physical business has been heavily impacted, and it is critical for them to accelerate their digital transformation. I want to emphasize here that the users at Convergint have mostly IT backgrounds and had no AI experience before. But as this quote tells us, the team has already started to produce very strong AI models.

Now, they are expanding their use of dotData to create a new revenue stream based on AI-based intelligence services. The next quote is from Scott Grossman, managing partner and CIO of a US hedge fund: “With dotData, we were able to run exponentially more trading models in the same amount of time. We discovered better trading patterns and improved our fund performance almost immediately.” For a hedge fund, the most important part of dotData is its extremely quick turnaround, which allows them to try different ideas with very little manual effort. Literally, Scott and the team are testing different AI trading models every day using dotData and have kept improving their fund performance by discovering a lot of important trading patterns from data.

Both Bhuvana and Scott are using dotData Enterprise, so let me briefly share how simply and easily a BI and analytics team can build an AI model using dotData Enterprise. The first step is to just drag and drop your data set, or simply connect to your database, and load the data into dotData Enterprise.

This looks like nothing fancy, nothing special. But behind it, a lot of things are actually performed automatically, like data cleansing, schema inference, entity relation detection, statistical profiling, etc., to make the data ready for AI development. If you were to do it manually, it would be a very cumbersome process. Now, step two: this is where our magic happens. You first tell the platform what you want to predict, and then specify the tables and data that you want to use. 10 tables, 20 tables, you can use as many as you want. And in principle, that is the only information that you must provide. There is no visual canvas on which to draw a very complex data manipulation flow. There are no complex toggles or parameters to tune.

All this complex stuff is handled by dotData Enterprise. You just need to tell the machine your problem and your data, and submit the job. After computation, you can see your features and machine learning models. Usually, there are many interesting features discovered by our platform, which give you much deeper data insights. An end-to-end endpoint for predictions is automatically generated, so you can immediately integrate the pipeline into your business system. Or you can export a lot of outcomes into CSV or Tableau format and make your BI dashboard more predictive.
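The two integration paths mentioned here, calling a prediction endpoint and exporting results as CSV, can be sketched from the client side as follows. The talk does not describe the actual interface, so everything specific in this sketch is an assumption: the URL, the JSON payload shape with a `records` key, and the function names are hypothetical and do not describe dotData's API.

```python
import csv
import io
import json
from urllib.request import Request

# Hypothetical endpoint address; the real URL and schema are not given in the talk.
ENDPOINT_URL = "http://localhost:8080/predict"


def build_prediction_request(records, url=ENDPOINT_URL):
    """Build (but do not send) a JSON POST request carrying raw input records."""
    body = json.dumps({"records": records}).encode("utf-8")
    return Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predictions_to_csv(records, scores, id_field="customer_id"):
    """Pair each record id with its score and return CSV text for a BI tool."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow([id_field, "score"])
    for record, score in zip(records, scores):
        writer.writerow([record[id_field], score])
    return buffer.getvalue()


# Illustrative input records and made-up scores standing in for a response.
records = [{"customer_id": 1}, {"customer_id": 2}]
request = build_prediction_request(records)
csv_text = predictions_to_csv(records, scores=[0.82, 0.1])
```

To actually send the request, one would pass it to `urllib.request.urlopen` and parse the JSON response. That step is left out here because it needs a running service and a known response format.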

So the last quote is from JFE Steel: “dotData is a key component to deploy various AI models to implement cyber-physical systems in our steel manufacturing plants.” This is Kazuhiro Clozaril, General Manager of Data Science at JFE Steel. JFE Steel is a global steel manufacturing company, and they are using dotData for plant failure detection, steel core prediction, and so on. This is an interesting use case of dotData AI automation in IoT. JFE Steel must build many AI models at the same time. Additionally, they have to deploy those models on servers at each manufacturing plant. With a traditional process, it is again another painful process to re-implement a model, create an endpoint, test it, and maintain it. But with dotData, it’s going to be a very, very simple process. For the IoT scenario, JFE Steel utilizes dotData Stream, which allows you to instantly package your AI models into a portable container.

Let me explain how it works. First, you build your model using dotData Enterprise or dotData Py. Then you export the dotData model package with one click. Now you can launch dotData Stream with a single Docker command. This is all you need: just build the model in the platform, download the package, and launch the container. It then immediately exposes a real-time prediction endpoint with millisecond latency. It is production quality, and all necessary dependencies are packaged inside the container, so no additional work is required. It is deployable on edge computers or in the cloud, making your AI model deployment far simpler and easier than ever before.

So this is the end of my presentation. Let me summarize the key takeaways. Today, accelerating digital transformation has become more important than ever for enterprises. To implement intelligence in your digital strategy, you need scale and speed in your AI development. And dotData is the first and only platform that truly automates AI development from raw data through data and feature engineering and machine learning. We accelerate enterprise AI and we democratize enterprise AI. So please visit our website and contact us for a free proof of concept and trial. Thank you very much for your attention.