Understanding Probability Output in Esri Deep Learning Models

Deep learning models have become increasingly important in geospatial analysis, offering powerful tools for tasks like object detection, pixel classification, and urban growth prediction. Esri's ArcGIS platform provides a comprehensive environment for training and deploying these models. This article explains how to interpret the probability outputs of deep learning models within the Esri ecosystem, covering the training process, model types, and practical applications.

Introduction to Deep Learning in ArcGIS

The field of artificial intelligence (AI) has advanced significantly, with deep learning playing a crucial role in various applications. Deep learning, a subset of machine learning, involves deep neural networks inspired by the human brain. The arcgis.learn module in Esri's ArcGIS API for Python enables GIS analysts and data scientists to easily adopt and apply deep learning techniques to their workflows. This module facilitates the training of state-of-the-art deep learning models through an intuitive API, accelerating the training process and reducing guesswork.

Training Deep Learning Models Using the Train Deep Learning Model Wizard

The Train Deep Learning Model wizard in ArcGIS Pro is an assisted workflow that guides users through training a deep learning model using their collected training data. To access this wizard:

  1. Click the Imagery tab.
  2. Click the Deep Learning Tools drop-down menu.
  3. Choose Train Deep Learning Model.

The wizard consists of three main pages: Get Started, Train, and Result.

Get Started

On the Get Started page, you specify how you want to train the deep learning model:

  • Set the parameters automatically: The model type, parameters, and hyperparameters are automatically configured to build the best model. This option requires an ArcGIS Pro Advanced license.
  • Specify my own parameters: You set the model type, parameters, and hyperparameters manually.

Train

The Train page allows you to set parameter information for training. The parameters vary depending on the option selected on the Get Started page.

Required Parameters

  • Input Training Data: The folders containing the image chips, labels, and statistics required to train the model. This is the output from the Export Training Data For Deep Learning tool.
  • Output Model (Automatic training): The output trained model saved as a deep learning package (.dlpk file).
  • Output Folder (Manual training): The output folder location where the trained model will be stored.

Optional Parameters (Automatic Training)

  • Pretrained Model: A pretrained model used to fine-tune the new model. The input is an Esri model definition file (.emd) or a deep learning package file (.dlpk). Fine-tuning is supported only for models trained using ArcGIS.
  • Total Time Limit (Hours): The total time limit for AutoDL model training.
  • Neural Networks: Specifies the architectures used to train the model. By default, all networks are used.
  • Save Evaluated Models: Specifies whether all evaluated models will be saved.

Optional Parameters (Manual Training)

  • Max Epochs: The maximum number of epochs for which the model will be trained. The default value is 20.
  • Pre-trained Model: A pretrained model used to fine-tune the new model. The input is an Esri model definition file (.emd) or a deep learning package file (.dlpk).
  • Model Type: Specifies the model type used to train the deep learning model.
  • Model Arguments: Arguments specific to the selected model architecture. The available arguments are populated from the Model Type parameter and vary depending on the architecture.
  • Data Augmentation: Specifies the type of data augmentation used. Options include:
    • Default: Uses default data augmentation parameters and values.
    • None: No data augmentation is performed.
    • Custom: Specify user-defined data augmentation values in the Augmentation Parameters parameter.
    • File: Specify fastai transforms for data augmentation using a .json file.
  • Batch Size: The number of training samples processed at one time.
  • Validation %: The percentage of training samples used for validating the model. The default value is 10.
  • Chip Size: The size of the image chips used to train the model.
  • Resize To: Resizes the image chips. This parameter applies to object detection (PASCAL VOC), object classification (labeled tiles), and super-resolution data only.
  • Learning Rate: The rate at which existing information is overwritten with newly acquired information.
  • Backbone Model: Specifies the preconfigured neural network used as the architecture for training the new model.
  • Monitor Metric: Specifies the metric monitored during checkpointing and early stopping.
  • Stop when model stops improving: Specifies whether early stopping is implemented.
  • Freeze Model: Specifies whether the backbone layers in the pretrained model are frozen.
  • Weight Initialization Scheme: Specifies the scheme used to initialize weights for the layer. This parameter is applicable only when multispectral imagery is used.
  • Enable Tensorboard: Specifies whether TensorBoard metrics are enabled during training.

Result

The Result page displays key details of the trained model for review and comparison with other models. Reviewing a model provides insights into how it was trained and its potential performance.

Key Details

  • Model: Use the Browse button to select a model for review.
  • Compare: Use the Compare button to compile metrics of loaded models into a comparison report.
  • Model Type: The name of the model architecture.
  • Backbone: The name of the preconfigured neural network used for training.
  • Learning Rate: The learning rate used in training.
  • Training and Validation Loss: A graph showing training loss and validation loss over the course of training.
  • Analysis of the model: Metrics that vary by model architecture. For example, pixel classification models display precision, recall, and the F1 score for each class.

Understanding Model Architectures

Esri supports various deep learning model architectures, each suited for different tasks. These models are often based on pretrained convolutional neural networks (CNNs) like ResNet, VGG, and Inception, which have been trained on large image datasets like ImageNet. The pretrained models act as feature extractors, which can be fine-tuned for specific geospatial tasks.

Object Detection Models

Object detection models identify objects within an image and their locations in terms of bounding boxes. This is useful for infrastructure mapping, anomaly detection, and feature extraction.

Single Shot Detector (SSD)

The SingleShotDetector model, based on Fast.ai MOOC Version2 Lesson 9, is used for object detection tasks. It uses a pretrained convnet, such as ResNet, as the 'backbone.' The SSD divides the image into a grid, with each grid cell predicting which object (if any) lies within it and its location.
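The grid-based matching described above can be illustrated with a small sketch. This is not the arcgis.learn implementation, only a minimal, assumed example of how a ground-truth box is assigned to the grid cell containing its center:

```python
def center_grid_cell(box, image_size=224, grid_size=4):
    """box = (xmin, ymin, xmax, ymax) in pixels; returns the (row, col)
    of the grid cell that contains the box center."""
    cx = (box[0] + box[2]) / 2
    cy = (box[1] + box[3]) / 2
    cell = image_size / grid_size          # side length of one grid cell
    return int(cy // cell), int(cx // cell)

# A box centered at (100, 60) on a 224-px image with a 4x4 grid
print(center_grid_cell((80, 40, 120, 80)))  # -> (1, 1)
```

Each cell then predicts the class and offsets for the objects assigned to it in this way.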

Pixel Classification Models

Pixel classification models classify each pixel of an image as belonging to a particular class. In GIS, segmentation can be used for land cover classification or for extracting roads or buildings from satellite imagery.
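Conceptually, a pixel classification model produces a vector of per-class scores for every pixel; the predicted class is the one with the highest score. A minimal sketch, assuming the scores are already probabilities:

```python
def classify_pixels(score_map):
    """score_map[row][col] is a list of per-class probabilities.
    Returns a class map (argmax per pixel) and a confidence map."""
    class_map, conf_map = [], []
    for row in score_map:
        class_map.append([max(range(len(p)), key=p.__getitem__) for p in row])
        conf_map.append([max(p) for p in row])
    return class_map, conf_map

# A 2x2 image with three candidate classes per pixel
scores = [[[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]],
          [[0.3, 0.3, 0.4], [0.05, 0.05, 0.9]]]
class_map, conf_map = classify_pixels(scores)
print(class_map)  # -> [[0, 1], [2, 2]]
```

The confidence map is what makes the probability outputs discussed below usable for quality assessment.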

Interpreting Probability Outputs

The output of a deep learning model is typically a probability score for each class or object detected. Understanding these probability scores is crucial for assessing the model's confidence and making informed decisions.

Probability Scores

Probability scores range from 0 to 1, where a higher score indicates a higher confidence in the prediction. For example, in an object detection task, a bounding box around a well pad might have a probability score of 0.95, indicating a high level of confidence that the object is indeed a well pad.
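Internally, a network's raw class scores (logits) are commonly converted into probabilities with the softmax function, which maps them to values between 0 and 1 that sum to 1. A minimal illustration:

```python
import math

def softmax(logits):
    """Convert raw class scores (logits) into probabilities summing to 1."""
    m = max(logits)                         # subtract max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# The largest logit receives the highest probability
probs = softmax([4.2, 1.1, 0.3])
print([round(p, 3) for p in probs])
```

A large gap between the top logit and the rest yields a score close to 1, which is what a confident detection like the 0.95 well pad example looks like numerically.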

Thresholding

A common practice is to set a probability threshold above which the prediction is considered valid. For example, if the threshold is set to 0.7, only predictions with a probability score of 0.7 or higher are accepted.
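Applied to hypothetical detection results, thresholding is a simple filter; the detection records below are illustrative, not output from a real model:

```python
def filter_detections(detections, threshold=0.7):
    """Keep only detections whose probability score meets the threshold."""
    return [d for d in detections if d["score"] >= threshold]

detections = [
    {"label": "well_pad", "box": (10, 10, 50, 50),   "score": 0.95},
    {"label": "well_pad", "box": (60, 80, 90, 120),  "score": 0.55},
    {"label": "well_pad", "box": (200, 40, 240, 90), "score": 0.72},
]

kept = filter_detections(detections, threshold=0.7)
print(len(kept))  # -> 2
```

Raising the threshold trades recall for precision: fewer false positives are accepted, but weakly scored true detections are discarded as well.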

Factors Affecting Probability Outputs

Several factors can affect the probability outputs of a deep learning model:

  • Training Data Quality: High-quality, representative training data is essential for accurate probability outputs.
  • Model Architecture: The choice of architecture can affect probability scores; some architectures produce systematically higher confidence scores than others, even at the same accuracy.
  • Hyperparameter Tuning: Proper hyperparameter tuning can improve the calibration of probability outputs.
  • Data Augmentation: Data augmentation techniques can help the model generalize better and produce more reliable probability scores.

Practical Applications

Understanding probability outputs is crucial for various applications:

Monitoring Well Pads

In the oil and gas industry, deep learning models can be used to detect unregistered well pads. By analyzing satellite imagery and interpreting the probability scores, regulators can monitor new drilling activities and identify potential illegal operations.

Urban Growth Prediction

Urban planners can use deep learning models to predict urban growth patterns. By analyzing land cover data and other spatial variables, these models can generate probability maps indicating the likelihood of urban development in different areas.

Environmental Monitoring

Deep learning models can be used to monitor environmental changes, such as deforestation and desertification. By analyzing satellite imagery and interpreting the probability scores, scientists can track these changes over time and develop strategies for conservation.

Example: Detecting Well Pads Using Deep Learning

Let's consider an example of training a deep learning model to identify well pads from Sentinel-2 imagery using the arcgis.learn module.

Exporting Training Samples

The export_training_data() method generates training samples for training deep learning models, given the input imagery, along with labeled vector data or classified images. The object detection models in arcgis.learn accept training samples in the PASCAL_VOC_rectangles format.

from arcgis.gis import GIS
from arcgis.raster.functions import apply
from arcgis.learn import export_training_data

gis = GIS("home")
well_pads = gis.content.get('ae6f1c62027c42b8a88c4cf5deb86bbf')      # Well pads layer
sentinel_item = gis.content.get("15c1069f84eb40ff90940c0299f31abc")  # Sentinel-2 imagery
sentinel_data = apply(sentinel_item.layers[0], 'Natural Color with DRA', astype='U8')

export_training_data(sentinel_data, well_pads, "PNG",
                     {"x": 448, "y": 448}, {"x": 224, "y": 224},
                     "PASCAL_VOC_rectangles", 75, "well_pads")

Data Preparation

The prepare_data() method automates the data preparation process by reading the training samples and constructing the appropriate fast.ai DataBunch.

from arcgis.learn import prepare_data

data = prepare_data('/arcgis/directories/rasterstore/well_pads', {0: 'Pad'})

Model Training

The arcgis.learn module includes support for training deep learning models for object detection. The models leverage fast.ai's learning rate finder and one-cycle learning, allowing for much faster training.
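A minimal training sketch, assuming the DataBunch produced by prepare_data() above; the epoch count and the saved model name 'well_pad_detector' are illustrative choices, not values from the source:

```python
from arcgis.learn import SingleShotDetector

# Create an SSD object detection model from the prepared data
ssd = SingleShotDetector(data)

# fast.ai's learning rate finder suggests a suitable learning rate
lr = ssd.lr_find()

# Train with one-cycle learning; the epoch count here is illustrative
ssd.fit(epochs=10, lr=lr)

# Inspect predictions on the validation set, then save the trained model
ssd.show_results()
ssd.save('well_pad_detector')
```

The saved model can then be used with the inference tools in ArcGIS Pro, where the probability threshold discussed earlier controls which detections are kept.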

Leveraging R-ArcGIS Bridge

The R-ArcGIS Bridge is an Esri R package that allows the seamless passing of data and analytic results between ArcGIS Pro and R. This bridge enables users to automate analytic processes and maintain changes made to the underlying GIS datasets. For example, land cover raster datasets can be prepared and processed in ArcGIS Pro and then passed via R-ArcGIS Bridge to R as an R data frame for modeling and training.
