Deep Learning in Scientific Research: A Comprehensive Overview with COVID-19 Forecasting as an Example

Introduction

Deep learning, a branch of artificial intelligence (AI), is rapidly transforming many fields of scientific research. Loosely modeled on the brain's networks of neurons and synapses, deep learning models can analyze complex data and identify intricate patterns that would be difficult or impossible for humans to discern. This article surveys applications of deep learning in scientific research, examining its use in COVID-19 forecasting as a detailed example.

Deep Learning for Disease Research: A Case Study

Researchers at Washington State University (WSU) have developed an AI-based deep learning program that can dramatically speed up disease-related research. The development, detailed in Scientific Reports, showcases the potential of deep learning to accelerate scientific discoveries.

High Accuracy in Tissue Analysis

According to Michael Skinner, a WSU biologist and co-corresponding author on the paper, the AI model demonstrated remarkable accuracy in analyzing tissues. "This AI-based deep learning program was very, very accurate at looking at these tissues," Skinner noted.

Development of the AI Model

Computer scientists Colin Greeley and Lawrence Holder trained the AI model using images from past epigenetic studies conducted by Skinner’s laboratory. These studies involved molecular-level signs of disease in kidney, testes, ovarian, and prostate tissues from rats and mice.

Epigenetics and Deep Learning

Skinner’s research focuses on epigenetics: changes to molecular processes that influence gene behavior without altering the DNA sequence itself. For large studies, this kind of analysis could take a year or more. Deep learning offers a significant advantage by drastically reducing the time such analyses require.


Handling High-Resolution Images

The research team designed the WSU deep learning model to handle extremely high-resolution, gigapixel images, containing billions of pixels. This capability allows for detailed and comprehensive tissue analysis.

Potential for Improving Human Health Research

The authors also point to the model’s potential for improving research and diagnosis in humans, particularly for cancer and other gene-related diseases. The network designed by the WSU team is considered state-of-the-art, according to Holder.

Deep Learning for COVID-19 Forecasting: A Detailed Examination

The COVID-19 pandemic has spurred scientists to apply machine learning methods to combat the crisis. While there is a significant amount of research in this area, a comprehensive survey specifically examining deep learning methods for COVID-19 forecasting was lacking. This article addresses that gap by reviewing and analyzing current studies that use deep learning for COVID-19 forecasting.

Methodology and Scope

The review considered all published papers and preprints, discoverable through Google Scholar, from April 1, 2020, to February 20, 2022, that describe deep learning approaches to forecasting COVID-19. The search identified 152 studies, of which 53 passed the initial quality screening and were included in the survey.

Model-Based Taxonomy

A model-based taxonomy was proposed to categorize the literature, describing each model and highlighting its performance. The deficiencies of the existing approaches are identified, and necessary improvements for future research are elucidated.


Applications of Machine Learning to COVID-19

Applications of machine learning to COVID-19 have attracted enormous interest from the research community. The current research trends in the field can be divided into four major categories:

  1. Image and symptom-based diagnosis
  2. Forecasting the number of cases
  3. Intelligent contact tracing
  4. AI-aided drug discovery

A large amount of research is devoted to forecasting the number of infections using deep learning. Various techniques have been proposed for this task, including recurrent neural networks (RNNs), gated recurrent units (GRUs), long short-term memory networks (LSTMs), graph neural networks (GNNs), and others.

Existing Surveys and Their Limitations

Despite the large amount of literature, there exists no state-of-the-art survey of the subject. The existing surveys provide either a general overview of machine learning applications or an overview of forecasting methods at large. Surveys that provide a general overview of machine learning applications do not delve into an in-depth analysis of forecasting methods. In most cases, general surveys focus on COVID-19 diagnosis, leaving COVID-19 forecasting as a secondary topic.

There exists a small number of surveys dedicated to the broad review of forecasting methods for COVID-19. General forecasting reviews focus mostly on mathematical models such as the susceptible-infected-recovered (SIR) model and its variants, while deep learning methods receive little consideration.

Deep Learning Approaches to COVID-19 Forecasting

There are currently over 150 research papers in the literature that propose deep learning approaches to forecasting the number of COVID-19 infections. A number of approaches are based on MLP, RNN, and GRU models, but the majority are based on the LSTM model and its variants. Among the LSTM variants, convolutional LSTM (ConvLSTM) and multivariate LSTM (M-LSTM) are the most commonly used. The use of M-LSTM rests on the reasonable assumption that the number of COVID-19 cases depends on multiple factors (features). The popularity of LSTM is not surprising given its success on other time-series tasks. Notably, however, most models for COVID-19 forecasting use a window of only 5 previous observations to forecast the next-day observation; given such short input sequences, the utility of the LSTM model is questionable. Among the other existing approaches, spatiotemporal models that combine GNNs with Google mobility data have shown promising results, leveraging information about human movement between cities to model the spread of the pandemic.
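To make the windowing scheme concrete, the step described above (using the 5 previous observations to predict the next day) can be sketched in plain Python. The function name and case counts below are illustrative, not taken from any cited study.

```python
def make_windows(series, window=5):
    """Split a time series into (input window, next-day target) pairs.

    With window=5, each training example uses the 5 previous daily
    case counts to predict the count on the following day, mirroring
    the short input sequences common in the surveyed models.
    """
    X, y = [], []
    for t in range(window, len(series)):
        X.append(series[t - window:t])  # the 5 preceding observations
        y.append(series[t])             # the next-day value to forecast
    return X, y

# Toy daily case counts (illustrative values only).
cases = [10, 12, 15, 20, 26, 33, 41, 50]
X, y = make_windows(cases, window=5)
```

With 8 observations and a window of 5, only 3 training pairs result, which illustrates how short the effective input sequences are.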


Taxonomy of Deep Learning Models for Forecasting COVID-19

A model-based taxonomy is employed to categorize the existing research into distinct subsets. For each model, its general architecture is described along with the specific adjustments made to tailor it to COVID-19 forecasting. The theoretical advantages and disadvantages of the model are discussed, as well as its performance in practice. One of the main factors in the performance of a forecasting model is the training and testing data: the country source and time frame of the data can have a dramatic impact on the accuracy results. These differences in data make it challenging to compare studies directly.

Importance of Forecast Accuracy Metric

The choice of the forecast accuracy metric is an important consideration in model evaluation. Since the forecast values and errors depend closely on the population size, raw measures of accuracy such as mean absolute error (MAE) and root mean squared error (RMSE) are not appropriate for cross-study comparison. It is more suitable to consider the relative error to measure the accuracy of forecasts. In the survey, the mean absolute percentage error (MAPE) is employed to report the accuracy of the forecasting models.
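A minimal sketch shows why the relative error travels better across populations than raw measures such as MAE or RMSE; the numbers below are illustrative.

```python
def mape(actual, forecast):
    """Mean absolute percentage error: mean of |actual - forecast| / |actual|, in percent.

    Being scale-free, MAPE allows forecasts for countries of very
    different population sizes to be compared, unlike MAE or RMSE.
    """
    return 100.0 * sum(abs(a - f) / abs(a)
                       for a, f in zip(actual, forecast)) / len(actual)

# The same 10% relative error yields the same MAPE at both scales,
# while the absolute errors would differ by a factor of 1000.
small = mape([100, 200], [110, 220])
large = mape([100000, 200000], [110000, 220000])
```

Both calls return 10.0, even though the absolute errors differ by three orders of magnitude.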

Structure of the Survey

The survey is structured as follows:

  • Section 2 presents the taxonomy for organizing the current research into distinct categories.
  • Section 3 describes and discusses various approaches to forecasting COVID-19 infections together with the corresponding results.
  • Section 4 discusses the pitfalls of the existing approaches and advises on future research and improvements.
  • Section 5 concludes the paper.

Taxonomy of Forecasting Models for COVID-19

A model-based taxonomy is proposed to categorize the existing research in COVID-19 forecasting. Each major category can be further refined into more specialized subcategories. The general taxonomy of the forecasting models for COVID-19 can be divided into three major categories: autoregression, mathematical modeling, and machine learning.

Autoregressive Methods

Autoregressive methods are based on classical time-series analysis techniques, which include the autoregressive integrated moving average (ARIMA) and generalized autoregressive conditional heteroskedasticity (GARCH) models. ARIMA is a widely used model for time-series analysis: the current value of a series is modeled as a linear combination of its past values plus random Gaussian noise. It is a simple yet effective approach that has been used to forecast COVID-19 in several countries, and in some cases it has been shown to outperform more sophisticated models. In a recent study, the authors employed a vectorized ARIMA model to obtain accurate forecasts for the UAE and Saudi Arabia, with MAPE values of 0.0017% and 0.002%, respectively. The GARCH model is used to capture time-series shocks such as lockdowns; the authors showed that the ARCH model can be used effectively to forecast COVID-19 in the UAE.
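The autoregressive core of these models can be sketched without any library: fit a single AR(1) coefficient by least squares and iterate the recurrence to forecast ahead. This is a didactic sketch of the AR idea, not the vectorized ARIMA used in the cited study.

```python
def fit_ar1(series):
    """Least-squares estimate of phi in the AR(1) model x[t] = phi * x[t-1] + noise."""
    num = sum(series[t - 1] * series[t] for t in range(1, len(series)))
    den = sum(series[t - 1] ** 2 for t in range(1, len(series)))
    return num / den

def forecast_ar1(series, steps=3):
    """Iterate the fitted AR(1) recurrence to produce multi-step forecasts."""
    phi = fit_ar1(series)
    out, last = [], series[-1]
    for _ in range(steps):
        last = phi * last
        out.append(last)
    return out

# On an exactly geometric series, the fit recovers the growth rate (phi = 2).
history = [1.0, 2.0, 4.0, 8.0, 16.0]
future = forecast_ar1(history, steps=3)
```

Full ARIMA adds differencing (the "I") and moving-average terms on top of this autoregressive backbone.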

Mathematical Models

Mathematical models are frequently employed in COVID-19 forecasting. There have been many attempts to model the spread of COVID-19 using stochastic processes such as compartmental and exponential models. The parameters of the SEIR (susceptible-exposed-infected-recovered) model can be determined using an optimization procedure based on gradient descent. The authors employed the SEIR model together with such a parameter optimization procedure to forecast COVID-19 cases in China between January and March 2020; the model produced robust accuracy with a MAPE of 3.8%. A large-scale comparison of probabilistic models, including SEIR-based methods, has also been presented. The authors highlight two models - the COVID-19 Public Forecast model and the UMass-MechBayes model - as producing highly accurate county-level forecasts in the USA. The latter approach uses a nonparametric model of the transmission rate βt, which allows the transmission rate to increase or decrease in each measurement period. Other frequently used mathematical models include the error, trend, seasonality (ETS) model and exponential smoothing (ES) with and without a multiplicative error trend. The authors found that ETS outperforms ES and ARIMA in univariate long-term forecasting, while ES was found to produce the most accurate short-term forecasts.
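A minimal forward-Euler integration of the SEIR compartments mentioned above can be sketched as follows; the parameter values are illustrative placeholders, not those fitted in the cited China study.

```python
def seir_step(S, E, I, R, beta, sigma, gamma, N, dt=1.0):
    """One forward-Euler step of the SEIR ordinary differential equations.

    beta  - transmission rate, sigma - incubation rate (E -> I),
    gamma - recovery rate (I -> R), N - total population size.
    """
    new_exposed = beta * S * I / N
    new_infectious = sigma * E
    new_recovered = gamma * I
    S += dt * (-new_exposed)
    E += dt * (new_exposed - new_infectious)
    I += dt * (new_infectious - new_recovered)
    R += dt * new_recovered
    return S, E, I, R

# Simulate 100 days with illustrative parameters and 10 initial infections.
N = 1_000_000
S, E, I, R = N - 10, 0.0, 10.0, 0.0
for _ in range(100):
    S, E, I, R = seir_step(S, E, I, R, beta=0.4, sigma=0.2, gamma=0.1, N=N)
```

Note that the four compartments always sum to N; gradient-based parameter fitting, as in the cited study, would adjust beta, sigma, and gamma to match observed case counts.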

Machine Learning

Machine learning has been employed successfully in various fields. As a result, machine learning models have been used extensively to provide data-driven forecasts of COVID-19 cases.

Traditional Methods

The traditional methods include support vector machines (SVM), gradient boosting (GB), random forest (RF), k-nearest neighbors (kNN), and other algorithms. One study implemented a Bayesian time-series model together with an RF algorithm inside an epidemiological compartmental model to forecast the number of COVID-19 cases. Another introduced a dynamic kNN-based approach that builds a unique model for each point in time; using 11 historical inputs, it achieves a MAPE of 9% in 10-week-ahead prediction. A more basic approach using polynomial curve fitting was used to forecast the number of cases in India. On the other hand, SVM has been found to underperform exponential smoothing and linear regression. Machine learning is also used in conjunction with other methods. One framework combines mechanistic and machine learning approaches in a unified reinforcement learning framework: the overall trajectory of the disease is estimated by the mechanistic model, whose output is fed into the machine learning model to forecast local variability. A combination of machine learning and ARIMA has been used to construct a hybrid model, and a differential equation model has recently been combined with the GB algorithm to forecast COVID-19 under an imperfect vaccination scenario. Further details about the applications of traditional machine learning models can be found in the literature. Overall, the use of traditional methods in COVID-19 forecasting has been relatively limited, with mixed results.
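The kNN forecasting idea can be sketched as retrieving the k historical windows most similar to the most recent one and averaging the values that followed them. This is a hedged illustration of the general technique; it does not reproduce the 11-input, per-time-point design of the cited dynamic model.

```python
def knn_forecast(series, window=3, k=2):
    """Forecast the next value by nearest-neighbour lookup over past windows.

    Finds the k historical windows closest (in squared distance) to the
    most recent window and averages the values that followed them.
    """
    query = series[-window:]
    candidates = []
    for t in range(window, len(series)):
        past = series[t - window:t]
        dist = sum((p - q) ** 2 for p, q in zip(past, query))
        candidates.append((dist, series[t]))
    candidates.sort(key=lambda c: c[0])
    neighbours = candidates[:k]
    return sum(value for _, value in neighbours) / len(neighbours)

# On a repeating pattern, the nearest windows correctly predict the next value.
prediction = knn_forecast([1, 2, 3, 1, 2, 3, 1, 2], window=2, k=2)
```

Because the method is rebuilt from the raw history at every query, it naturally produces a distinct "model" for each point in time, which is the spirit of the dynamic approach described above.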

Deep Learning

The deep learning category comprises various neural network architectures. The success of neural networks has made them a natural candidate for forecasting, and this category has attracted the greatest interest among researchers, with over 150 research papers devoted to the subject. Since forecasting COVID-19 is a time-series task, the majority of the neural networks are based on the recurrent architecture, in which the forecasted values from previous time steps are used as part of the input to forecast the value at the next time step.

The most basic model is the multi-layer perceptron (MLP), which consists of an input layer, several fully-connected hidden layers, and an output layer. The MLP fits a nonlinear function to the data; it can be used in any regression problem and serves as a robust benchmark. Convolutional neural networks (CNN) are another popular class of models: their strong performance in image classification led to their application in other fields, and several studies apply CNNs to forecasting COVID-19.

The most popular type of neural network is the recurrent model. Recurrent neural networks (RNN) were designed specifically for sequential data. Several extensions of the RNN address the problem of exploding and vanishing gradients that arises in long sequences; the family of recurrent models includes the plain RNN, the gated recurrent unit (GRU), and long short-term memory (LSTM). The success of LSTM on speech recognition tasks prompted its use in many other applications, including forecasting, and LSTM has been the most widely applied model in the literature.

A number of extensions of the LSTM architecture have also been proposed to forecast COVID-19. The LSTM-based models include the plain LSTM, convolutional LSTM (ConvLSTM), bi-directional LSTM (BiLSTM), and multivariate LSTM (M-LSTM). Despite the popularity of LSTM, its use in COVID-19 forecasting is often questionable given the small window of previous values used for forecasting: a number of models in the literature employ LSTM with a window size of 5 or less. Among the other approaches employed in COVID-19 forecasting are graph neural networks (GNN) and variational autoencoders (VAE). GNNs use spatiotemporal information to model the spread of the pandemic; several GNN-based studies combine Google mobility data with COVID-19 time series to forecast the future number of infections.
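To make the LSTM gating mechanism concrete, a single cell step for scalar inputs can be sketched in plain Python. Real forecasting models use vector states and learned weights; the weights below are illustrative placeholders, not trained values.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_cell_step(x, h_prev, c_prev, W):
    """One step of a scalar LSTM cell.

    W maps each gate name to (input weight, recurrent weight, bias).
    The forget/input/output gates decide what to erase from, write to,
    and expose from the cell state c - the mechanism that lets LSTM
    retain information over longer sequences than a plain RNN.
    """
    f = sigmoid(W['f'][0] * x + W['f'][1] * h_prev + W['f'][2])    # forget gate
    i = sigmoid(W['i'][0] * x + W['i'][1] * h_prev + W['i'][2])    # input gate
    o = sigmoid(W['o'][0] * x + W['o'][1] * h_prev + W['o'][2])    # output gate
    g = math.tanh(W['g'][0] * x + W['g'][1] * h_prev + W['g'][2])  # candidate value
    c = f * c_prev + i * g       # updated cell state
    h = o * math.tanh(c)         # new hidden state (the cell's output)
    return h, c

# Run a short window of normalized case counts through the cell.
W = {'f': (0.5, 0.1, 0.0), 'i': (0.5, 0.1, 0.0),
     'o': (0.5, 0.1, 0.0), 'g': (0.5, 0.1, 0.0)}
h, c = 0.0, 0.0
for x in [0.1, 0.2, 0.3, 0.4, 0.5]:
    h, c = lstm_cell_step(x, h, c, W)
```

With input sequences of only 5 steps, as is typical in the surveyed models, this long-range gating machinery has little opportunity to pay off, which is the basis of the criticism above.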

Deep Learning Models for COVID-19 Forecasting: Architectures and Applications

This section delves into each deep learning model presented in the taxonomy, providing details about the architecture of the models and their application to COVID-19 forecasting. During the initial review of the existing literature, 152 existing publications related to forecasting COVID-19 with deep learning were discovered, of which 53 were selected for further analysis in this study. The distribution of the articles according to the model type is presented in the figure.

Distribution of Publications by Model Type

LSTM and its variants are the most widely used and accurate models; however, their performance depends on the data (country and time frame). Among the LSTM extensions, convolutional LSTM and bidirectional LSTM have shown the highest accuracy in comparative studies.

Multi-Layer Perceptron (MLP)

The most basic deep learning architecture is the multi-layer perceptron (MLP). The MLP model is a nonlinear regression algorithm that employs a layered structure to learn the patterns within the data. Concretely, the MLP architecture consists of the input layer, one or more fully-connected hidden layers, and the output layer.
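A minimal forward pass through such a layered structure, with one hidden layer and a ReLU nonlinearity, can be sketched as follows; the weights are illustrative placeholders, not learned parameters.

```python
def relu(values):
    """Elementwise rectified linear unit: max(0, x)."""
    return [max(0.0, v) for v in values]

def dense(inputs, weights, biases):
    """Fully-connected layer: each output is a weighted sum of all inputs plus a bias."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def mlp_forward(x, W1, b1, W2, b2):
    """Input layer -> one fully-connected hidden layer with ReLU -> scalar output."""
    hidden = relu(dense(x, W1, b1))
    return dense(hidden, W2, b2)[0]

# Illustrative 2-input, 2-hidden-unit, 1-output regression network.
W1 = [[1.0, -1.0], [0.5, 0.5]]
b1 = [0.0, 0.0]
W2 = [[1.0, 2.0]]
b2 = [0.5]
y = mlp_forward([3.0, 1.0], W1, b1, W2, b2)
```

Training would adjust W1, b1, W2, and b2 by gradient descent; stacking more hidden layers and nonlinearities is what lets the MLP fit the nonlinear functions mentioned above.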
