
Reflections from ODSC East 2021

By Sachin Andhare

This was the second year in a row that the premier data science conference went virtual due to the Covid-19 pandemic. Overall, the experience was much better this year, with a breadth of research topics and industry coverage ranging from machine learning for time series data and Transformers in natural language processing (NLP) to deep neural networks for visual quality inspection in manufacturing.

Demystifying Feature Engineering for Machine Learning

By Sachin Andhare

What is Feature Engineering? Feature engineering (FE) is the process of applying domain knowledge to extract analytical representations from raw data, making it ready for machine learning. It involves applying business knowledge, mathematics, and statistics to transform data into a format that machine learning models can consume directly. It starts from many tables spread across disparate databases, which are joined, aggregated, and combined into a single flat table using statistical transformations and/or relational operations. Let’s say you are addressing a complex business problem, such as predicting customer churn or forecasting product demand, using applied machine learning. Assuming a team is in place and the business case has been identified, where do you start? The first step is to collect the relevant data to train the machine learning (ML) algorithms. This is usually followed by the selection of the appropriate algorithm or ensemble of algorithms. Choosing the right algorithm depends…
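To make the "join, aggregate, and combine into a single flat table" step concrete, here is a minimal sketch in Python with pandas. The tables, column names, and the `build_feature_table` helper are all hypothetical, invented for illustration of a churn-style problem; a real project would pull these tables from its own databases and choose features based on domain knowledge.

```python
import pandas as pd

# Hypothetical raw tables (assumed names and columns, for illustration only).
customers = pd.DataFrame({
    "customer_id": [1, 2, 3],
    "signup_date": pd.to_datetime(["2020-01-15", "2020-06-01", "2021-02-10"]),
})
orders = pd.DataFrame({
    "customer_id": [1, 1, 2],
    "amount": [20.0, 35.0, 50.0],
    "order_date": pd.to_datetime(["2021-01-05", "2021-03-20", "2021-02-14"]),
})


def build_feature_table(customers, orders, as_of):
    """Join and aggregate the raw tables into one flat row per customer."""
    # Aggregate the one-to-many orders table down to one row per customer.
    order_stats = (
        orders.groupby("customer_id")
        .agg(
            order_count=("amount", "count"),
            total_spend=("amount", "sum"),
            last_order=("order_date", "max"),
        )
        .reset_index()
    )
    # A left join keeps customers who have never placed an order.
    flat = customers.merge(order_stats, on="customer_id", how="left")
    flat["order_count"] = flat["order_count"].fillna(0).astype(int)
    flat["total_spend"] = flat["total_spend"].fillna(0.0)
    # Turn raw dates into numeric features relative to a reference date.
    flat["tenure_days"] = (as_of - flat["signup_date"]).dt.days
    flat["days_since_last_order"] = (as_of - flat["last_order"]).dt.days
    return flat.drop(columns=["signup_date", "last_order"])


features = build_feature_table(customers, orders, pd.Timestamp("2021-04-01"))
print(features)
```

Each row of `features` describes one customer with numeric columns only, which is the kind of flat table most ML libraries expect. Customers without orders keep a missing value in `days_since_last_order`; how to fill it is a modeling decision this sketch leaves open.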