AutoML: All you need to know [Updated 2024]

Over the last decade, the amount of generated data has grown exponentially, and traditional analysis methods struggle to keep up with it. As machine learning has gained momentum, more and more industries have adopted machine learning algorithms to solve real-world problems and achieve better results.

What is AutoML (Automated Machine Learning)?

Automated machine learning (AutoML) is a fast-growing technology that aims to make data science more productive and accessible to everyone. It helps automate the stages of a data science workflow, including data preparation, feature engineering, model selection, and hyperparameter optimization.

Today, many tools offer Automated machine learning (AutoML): they train on raw data and produce a deployed model with just a few clicks.

Who can use Automated machine learning?

Applying machine learning to real-world problems takes time and requires data scientists with machine learning knowledge. Traditionally, machine learning pipelines are built manually by experts, and in many cases it is difficult or even impossible for novices to develop models that are ready for deployment. This is where Automated machine learning (AutoML) becomes very handy.

If a company has a limited budget or lacks machine learning expertise, Automated machine learning can be a good solution. However, this does not imply that machine learning engineers or data scientists will be less in demand in the future. On the contrary, this technology opens new opportunities for them, as they get to automate machine learning pipelines and save time. Interpreting and visualizing the results will still require a lot of machine learning expertise.

AutoML vs Manual ML

Automated Machine Learning (AutoML) and Manual Machine Learning (Manual ML) represent two different approaches to developing machine learning models. Let's take a look at their main differences:

1. Automation level

AutoML: It uses automated processes to handle various stages of the machine learning pipeline, from data preprocessing and feature engineering to model selection, hyperparameter tuning, and model evaluation. AutoML platforms and tools aim to simplify the model-building process for users with varying levels of expertise.

Manual ML: With Manual ML, engineers and data scientists need to manually perform each step of the machine learning pipeline, including feature selection, model creation, and hyperparameter tuning. This approach requires a deep understanding of machine learning algorithms, their strengths and weaknesses, and the specific problem to be solved.

2. Required expertise

AutoML: Designed to lower the barrier to entry for machine learning, AutoML enables users with limited experience in the field to develop effective models. It can also benefit experienced practitioners, as it allows them to explore different models and settings quickly.

Manual ML: Manual ML requires a solid understanding of machine learning concepts, techniques, and algorithms. This means that data scientists and engineers must have expertise in the subject area so they can select appropriate algorithms, tune hyperparameters, and build the best possible model.

3. Customization and control

AutoML: Compared to manual ML, AutoML platforms typically offer fewer control and customization options, as the main focus is on automation and ease of use. Advanced users may find the level of customization insufficient for certain complex tasks.

Manual ML: With manual ML, practitioners have full control over all aspects of the modeling process, enabling them to build highly customized models tailored to their specific requirements. This can result in better performance but it also requires a higher level of expertise.

4. Time and resources

AutoML: AutoML can save both time and resources by automating the tedious and time-consuming aspects of the machine learning pipeline. This allows users to develop models quickly, which can be especially helpful in fast-paced environments or when working with large datasets.

Manual ML: Manual ML can be more time-consuming and resource-intensive, as practitioners must manually perform each step of the process. However, if a skilled practitioner tailors the model to the specific problem at hand, this approach can sometimes result in better performance.

Types of Automated machine learning (AutoML)

The goal of Automated machine learning research is to develop state-of-the-art algorithms for each automated stage of the machine learning pipeline. The main focus areas are feature engineering, neural architecture search, hyperparameter optimization, and meta learning. Each topic deserves separate study, but here is a brief introduction to each one.

Feature engineering

In Automated machine learning, feature engineering refers to the automated process of creating, selecting, and transforming features (independent variables or input data) to improve the performance of machine learning models. Automated machine learning platforms and tools aim to simplify this crucial aspect of the machine learning pipeline, which traditionally requires domain expertise and manual intervention. In the context of AutoML, feature engineering may involve:

Feature extraction: Automatically identifying and extracting relevant features from raw data, such as deriving new variables from existing ones or aggregating data to create meaningful features. For example, extracting day of the week, month, or holiday indicators from date-time data.

Feature transformation: Automatically applying transformations to features to make them more suitable for modeling. This can include normalization, scaling, one-hot encoding for categorical variables, or applying mathematical functions like log, square root, or polynomial transformations.

Feature selection: Automatically identifying and selecting the most important features that contribute to the model's predictive performance. This can be done through various methods such as filter-based techniques (e.g., correlation, mutual information), wrapper methods (e.g., recursive feature elimination), or embedded methods (e.g., Lasso regularization).

Dimensionality reduction: Automatically applying dimensionality reduction techniques like Principal Component Analysis (PCA) or t-Distributed Stochastic Neighbor Embedding (t-SNE) to reduce the number of features while preserving the most significant information.
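The first three steps above can be illustrated with a minimal sketch that uses only the Python standard library. The records, the field names (timestamp, channel, amount, returned), and the 0.5 correlation threshold are all hypothetical and chosen only for illustration; an actual AutoML tool would apply far more elaborate logic.

```python
import math
from datetime import datetime

# Hypothetical raw records; the field names are invented for this illustration.
rows = [
    {"timestamp": "2024-01-06 10:30:00", "channel": "web", "amount": 120.0, "returned": 0},
    {"timestamp": "2024-01-08 09:15:00", "channel": "store", "amount": 15.0, "returned": 1},
    {"timestamp": "2024-01-13 18:45:00", "channel": "web", "amount": 300.0, "returned": 0},
    {"timestamp": "2024-01-17 12:00:00", "channel": "app", "amount": 22.0, "returned": 1},
]


def extract_date_features(timestamp):
    """Feature extraction: derive calendar features from a date-time string."""
    dt = datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S")
    return {
        "day_of_week": dt.weekday(),  # Monday = 0 ... Sunday = 6
        "month": dt.month,
        "is_weekend": int(dt.weekday() >= 5),
    }


def one_hot(value, prefix, categories):
    """Feature transformation: one-hot encode a categorical value."""
    return {f"{prefix}_{c}": int(value == c) for c in categories}


def pearson(xs, ys):
    """Pearson correlation, used here as a filter-based selection score."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / math.sqrt(var_x * var_y)


def build_features(rows):
    """Turn raw records into a table of numeric features."""
    categories = sorted({r["channel"] for r in rows})
    table = []
    for r in rows:
        feats = extract_date_features(r["timestamp"])
        feats.update(one_hot(r["channel"], "channel", categories))
        feats["log_amount"] = math.log1p(r["amount"])  # log transformation
        table.append(feats)
    return table


def select_features(table, target, threshold=0.5):
    """Feature selection: keep columns whose |correlation| with the target
    is at least `threshold` (an arbitrary cut-off for this sketch)."""
    selected = []
    for name in table[0]:
        column = [row[name] for row in table]
        if abs(pearson(column, target)) >= threshold:
            selected.append(name)
    return selected


features = build_features(rows)
target = [r["returned"] for r in rows]
selected = select_features(features, target)
print(selected)
```

In this toy data every record falls in January, so the constant month column is filtered out, while is_weekend lines up with the target and is kept.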

Neural Architecture Search (NAS)

The three main components of the Neural Architecture Search are search space, architecture optimization method, and model evaluation method.

Search space: It defines the set of possible neural network architectures. The most commonly used ones are:

1. Entire-structured: Stacks a predefined number of layers (convolution, max pooling, linear, etc.), with optional skip connections, to create a neural network. This is a very straightforward way to define a search space, and the resulting networks can generalize well. However, searching it is resource-intensive, and it offers no transferability.

2. Cell-based: Stacks predefined cells, rather than individual layers as in the entire-structured approach, to create a neural network. The idea originates from the well-known and widely used ResNet family, which stacks repeated blocks to build deeper networks. Two types of cells are used: normal and reduction. A normal cell preserves the dimensionality of its input, whereas a reduction cell decreases the width and height of the feature map (and typically increases the number of channels). The goal of this approach is to find good normal and reduction cells, so that deeper neural networks can be constructed by stacking more of these cells.

3. Hierarchical: It assembles simple cells from base operations (1x1 conv, 3x3 conv, max pooling, etc.) and uses those cells to construct blocks for the neural network architecture. The number of blocks is a hyperparameter that needs to be tuned or set manually.
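To make the idea of a search space more concrete, here is a minimal sketch of sampling one architecture from an entire-structured space. The operation names, the 50% skip probability, and the dictionary-based encoding are assumptions made for this example; real NAS frameworks use their own encodings.

```python
import random

# Hypothetical candidate operations; a real search space would map these
# names to actual layer implementations.
OPERATIONS = ["conv3x3", "conv5x5", "max_pool", "linear"]


def sample_architecture(num_layers, rng):
    """Sample one architecture from an entire-structured (chain) search space.

    Each layer picks one operation and, optionally, one earlier layer to
    receive a skip connection from.
    """
    layers = []
    for index in range(num_layers):
        skip_from = None
        # Layer index - 1 is already the regular input, so a skip needs a
        # source at least two layers back.
        if index >= 2 and rng.random() < 0.5:
            skip_from = rng.randrange(0, index - 1)
        layers.append({"op": rng.choice(OPERATIONS), "skip_from": skip_from})
    return layers


def search_space_size(num_layers):
    """Count the operation choices only (skips ignored).

    The count grows exponentially with depth, which is why searching this
    kind of space is resource-intensive.
    """
    return len(OPERATIONS) ** num_layers


rng = random.Random(0)
architecture = sample_architecture(num_layers=6, rng=rng)
for position, layer in enumerate(architecture):
    print(position, layer)
print("operation-only combinations:", search_space_size(6))
```

Even with four operations and six layers, the operation choices alone give 4^6 = 4096 candidates before skip connections are counted.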

Architecture Optimization Method: After defining the search space, you need a way to find the models in it that perform best.

1. Gradient descent is the most commonly used method; it minimizes the loss function by following its gradient. DARTS (Differentiable Architecture Search) is one of the algorithms that use gradient descent. It searches for the best neural network architecture in a predefined search space by minimizing a loss function that depends on both the neural network parameters and the weights assigned to the predefined candidate operations.

2. The evolutionary algorithm generates an initial population of candidates and calculates the fitness of each one. After that, it follows four steps: selection (choose the best candidates), crossover (generate offspring from the selected candidates), mutation (randomly change some values in the offspring), and update (replace the population with the new generation). This process repeats until a stopping condition is reached.
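The four evolutionary steps can be sketched in a few lines of Python. This is a toy illustration, not an actual NAS implementation: the per-layer filter choices, the hidden target inside fitness, and all default parameter values are hypothetical, and the fitness function stands in for the expensive step of training and validating each candidate.

```python
import random

CHOICES = [16, 32, 64, 128]  # hypothetical per-layer filter counts
NUM_LAYERS = 5


def fitness(candidate):
    """Stand-in for validation accuracy.

    A real NAS run would train the candidate network here; this toy score
    simply rewards candidates that are close to a hidden target.
    """
    target = [32, 64, 64, 128, 128]
    return -sum(abs(a - b) for a, b in zip(candidate, target))


def crossover(parent_a, parent_b, rng):
    """Single-point crossover: prefix from one parent, suffix from the other."""
    point = rng.randrange(1, NUM_LAYERS)
    return parent_a[:point] + parent_b[point:]


def mutate(candidate, rng, rate=0.2):
    """Resample each gene with probability `rate`."""
    return [rng.choice(CHOICES) if rng.random() < rate else gene
            for gene in candidate]


def evolve(generations=30, population_size=20, num_parents=5, seed=0):
    rng = random.Random(seed)
    population = [[rng.choice(CHOICES) for _ in range(NUM_LAYERS)]
                  for _ in range(population_size)]
    initial_best = max(fitness(c) for c in population)
    for _ in range(generations):
        # Selection: keep the fittest candidates as parents.
        parents = sorted(population, key=fitness, reverse=True)[:num_parents]
        # Crossover and mutation: build offspring from random parent pairs.
        offspring = []
        while len(offspring) < population_size - num_parents:
            parent_a, parent_b = rng.sample(parents, 2)
            offspring.append(mutate(crossover(parent_a, parent_b, rng), rng))
        # Update: parents survive unchanged, so the best score never drops.
        population = parents + offspring
        # Stopping condition: a perfect score (or the generation budget).
        if max(fitness(c) for c in population) == 0:
            break
    return max(population, key=fitness), initial_best


best, initial_best = evolve()
print("best candidate:", best, "fitness:", fitness(best))
```

Keeping the parents in the next generation is one design choice among several; it guarantees that the best fitness found so far is never lost between generations.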

Model evaluation: The simplest way to evaluate candidate models is to train each one fully and compare the results, but this is very time-consuming. Two common ways to speed it up are:

1. Early stopping: This is one of the earliest methods used in classical ML to avoid overfitting. In AutoML, it is used to stop pipelines whose training is going poorly.

2. Low fidelity: Model training time depends on the size of the model and of the data set. It is possible to reduce the number of filters in the model architecture, or even lower the resolution of the data set, to train models faster.
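As a rough sketch of early stopping, the function below consumes a sequence of per-epoch validation losses and stops once the loss has failed to improve for a given number of epochs. The loss curves and the patience value are made up for illustration; in practice the losses would come from the training loop itself.

```python
def train_with_early_stopping(validation_losses, patience=2, min_delta=0.0):
    """Stop once the loss has not improved for `patience` consecutive epochs.

    `validation_losses` stands in for the losses a real training loop would
    produce one epoch at a time. Returns (epochs_run, best_loss).
    """
    best_loss = float("inf")
    epochs_without_improvement = 0
    epochs_run = 0
    for loss in validation_losses:
        epochs_run += 1
        if loss < best_loss - min_delta:
            best_loss = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break  # give up on this pipeline and free the resources
    return epochs_run, best_loss


# Hypothetical loss curves for two candidate pipelines.
promising = [0.90, 0.70, 0.55, 0.45, 0.40, 0.38]
stagnating = [0.90, 0.88, 0.89, 0.91, 0.92, 0.95]

print(train_with_early_stopping(promising))
print(train_with_early_stopping(stagnating))
```

The promising pipeline uses all six epochs, while the stagnating one is cut off after the fourth epoch, two epochs after its best loss of 0.88.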

Hyperparameter optimization

Grid Search: It divides the hyperparameter space into an evenly spaced grid, evaluates every grid point, and returns the best set of parameters. The method is computationally expensive because the number of grid points increases exponentially with the number of parameters.

Random Search: This option samples random points in the hyperparameter space and returns the set of hyperparameters with the best performance.
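The two strategies can be compared with a small sketch. Everything here is an assumption made for illustration: the two hyperparameters, their ranges, and the validation_score function, which stands in for training and validating a model and is constructed to peak at a learning rate of 0.01 and a depth of 6.

```python
import itertools
import math
import random

# Hypothetical search space with two hyperparameters.
GRID = {
    "learning_rate": [0.001, 0.01, 0.1],
    "max_depth": [2, 4, 6, 8],
}


def validation_score(params):
    """Stand-in for training and validating a model (higher is better).

    This toy objective peaks at learning_rate = 0.01 and max_depth = 6.
    """
    lr_penalty = abs(math.log10(params["learning_rate"]) + 2)
    depth_penalty = abs(params["max_depth"] - 6) / 4
    return -lr_penalty - depth_penalty


def grid_search(grid):
    """Evaluate every combination of the listed values."""
    names = list(grid)
    candidates = [dict(zip(names, values))
                  for values in itertools.product(*grid.values())]
    return max(candidates, key=validation_score), len(candidates)


def random_search(num_trials, seed=0):
    """Evaluate `num_trials` points sampled at random from the same ranges."""
    rng = random.Random(seed)
    candidates = [
        {
            # Sample the learning rate on a log scale between 0.001 and 0.1.
            "learning_rate": 10 ** rng.uniform(-3, -1),
            "max_depth": rng.randint(2, 8),
        }
        for _ in range(num_trials)
    ]
    return max(candidates, key=validation_score)


best_grid, num_evaluated = grid_search(GRID)
best_random = random_search(num_trials=12)
print("grid:", best_grid, "after", num_evaluated, "evaluations")
print("random:", best_random)
```

With three and four values per parameter, the grid already needs 3 x 4 = 12 evaluations, and each additional hyperparameter multiplies that count again. Random search keeps the budget fixed at whatever number of trials is chosen and is not restricted to the grid values.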

Transfer Learning

Transfer learning is a machine learning method that reuses the knowledge of one trained model in another. SuperAnnotate's Neural Network provides several well-known pre-trained models that users can fine-tune on a new dataset, making the annotation process much smoother and faster.

Automated Machine Learning Startups and Enterprises

Enterprise AutoML Solutions

Here's a brief overview of each tool and some of their key differences:

Google AutoML: As a part of the Google Cloud AI Platform, Google AutoML provides a suite of machine learning products that enables developers with limited expertise to train high-quality models. In particular, it offers AutoML Vision for image classification, AutoML Tables for structured data, AutoML Natural Language for text processing, and AutoML Translation for language translation. Google Cloud also supports custom model building with TensorFlow or PyTorch and is known for its user-friendly interface and seamless integration with other Google Cloud services.

AWS SageMaker, AutoGluon, and Lambda are all part of the AutoML tooling from AWS. AWS SageMaker provides machine learning training and deployment tools. One key feature of SageMaker Autopilot is that it can train models on data with missing values. AutoGluon provides AutoML pipelines with a few lines of code, and AWS Lambda is a serverless compute service that runs code and manages the underlying resources.

Azure AutoML is a part of Microsoft Azure that simplifies working with machine learning pipelines. The user chooses one of the following tasks: classification, forecasting, regression, computer vision, or NLP. After the user selects the metrics and hyperparameters, Azure AutoML creates multiple ML pipelines with different algorithms for machine learning model training. It also uses different feature engineering techniques to produce the highest-quality models.

IBM AutoML is part of IBM's Watson Studio, a comprehensive suite of tools and services for data science and machine learning. IBM AutoML provides automated solutions for building, training, and deploying machine learning models. Some of its key features include data preparation, automated feature engineering, model selection, and hyperparameter optimization. The IBM automated machine learning platform is best suited for classical ML applications that work with structured tabular data.

Startup AutoML platforms

Databricks AutoML uses open-source tools such as scikit-learn, xgboost, ARIMA, etc. to provide tools for data preparation, model training and evaluation, and deployment. Databricks handles data imbalance during data preparation and generates sets of hyperparameters for model training on cluster nodes. The resulting models can solve classification, regression, and forecasting tasks. At the end, the platform provides the results for the trained models and a Jupyter Notebook to access them.

Dataiku AutoML provides tools for accelerating each step of model training. It also offers a choice of model training styles: quick prototypes for fast results, interpretable models (decision trees and linear models), and high performance with deep hyperparameter optimization.

H2O AutoML is an open-source library for machine learning. The number of required input parameters is kept as small as possible so that anyone can use the H2O interface to train models with different algorithms. However, complex tasks such as fine-tuning neural network hyperparameters will require some prior data science knowledge.

DataRobot provides AutoML tools for both experts and non-experts. It is possible to build machine learning models automatically with a few clicks or to write code for more complex solutions. The AutoML tool generates models and visualizes results while suggesting which models perform well.

Applications of Automated Machine Learning (AutoML)

By now, we have established that the AutoML tools mentioned above provide solutions for various real-world problems and let businesses spend their time on growth rather than on ML model training. Let's explore some of the most common applications of AutoML:

Agriculture: Improves the product quality testing process.
Customer Support: Improves the efficiency of customer support through sentiment analysis in chatbots.
Cybersecurity: Monitors data flows to prevent cyber threats and to detect malware or spam.
Entertainment: It can be used for content selection or as a recommendation engine.
Image Recognition: AutoML can help train models for object detection or facial recognition.
Marketing: AutoML has a wide range of uses, from recommendation engines to predictive analytics and improving engagement rates. It is used in social media for behavioral marketing campaigns.
Retail: Reduces inventory carrying costs to increase profits.
Risk Assessment: Helps financial companies with fraud detection and is also used for risk assessment and management.

Final thoughts

The automated machine learning process democratizes machine learning and gives people without extensive programming knowledge the opportunity to generate models with AutoML and select the best one for their business problems. However, training custom machine learning models still requires experts with knowledge of deep learning algorithms.

Armen Gyurjinyan

Independent Machine Learning Engineer
