Physics-Informed Neural Networks: A Deep Dive Tutorial
Physics-Informed Neural Networks (PINNs) represent a paradigm shift in how we approach solving problems rooted in physical sciences and engineering. By seamlessly integrating the power of deep learning with the constraints of physical laws, PINNs offer a versatile and efficient framework for tackling a wide range of challenges, from predicting heat distribution to modeling fluid dynamics. This article aims to provide a comprehensive tutorial on PINNs, exploring their underlying principles, implementation details, and diverse applications.
Introduction to Physics-Informed Machine Learning
In the realm of machine learning, data-driven methods have traditionally dominated. However, when dealing with physical systems, incorporating prior knowledge about the underlying physics can significantly enhance the accuracy and reliability of our models. PINNs achieve this by embedding physical laws, often expressed as differential equations, directly into the neural network's loss function. This guides the learning process towards solutions that not only fit the available data but also adhere to the governing physical principles.
PINNs seamlessly integrate physics knowledge with data. Choosing between PINNs, data-driven approaches, and traditional numerical methods depends on your application. PINNs differ from traditional neural networks in their ability to incorporate a priori domain knowledge of the problem in the form of differential equations. This additional information enables PINNs to make more accurate predictions outside of the given measurement data.
The Essence of PINNs
At their core, PINNs are neural networks designed to solve forward and inverse problems involving nonlinear partial differential equations. The key innovation lies in leveraging the power of deep learning while constraining solutions to comply with known physical laws. This makes them particularly useful when the physics is fully or partially known, such as a PDE or ODE with unknown coefficients.
How PINNs Work
Consider a scenario where you have noisy measurements, (\theta_{meas}), of a system and want to predict future values, (\theta_{pred}), using a feedforward artificial neural network. A naive neural network trained solely on the available measurements may overfit the noise and perform poorly when predicting values outside the range of the training data. Acquiring more data could enhance predictions, yet this approach may be prohibitively expensive or impossible for many applications.
PINNs offer a solution by incorporating the differential equation governing the system as an additional, physics-informed term in the loss function. The PINN evaluates the residual of the differential equation at additional points in the domain, providing more information to the network without requiring more measurements.
The Loss Function: Guiding the Learning Process
The loss function, ( L ), is the heart of any machine learning model, and PINNs are no exception. The PINN loss function typically consists of several terms:
- Physics-informed loss term, ( L_{Physics} ): This term evaluates the residual of the differential equation at points within the domain. Automatic differentiation (AD) or other numerical differentiation techniques are used to compute the derivatives required in the differential equation.
- Optional terms for initial and boundary conditions, ( L_{Conds} ): These terms evaluate the error between the values predicted by the network and any known initial or boundary data.
- Optional terms for additional measurements, ( L_{Data} ): These terms account for any other available data that can further constrain the solution.
By minimizing this composite loss function, the PINN learns to satisfy both the governing physical laws and the available data, resulting in a more accurate and physically plausible solution.
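As a minimal sketch of such a composite loss for the cooling problem discussed later (the function name, the weights (w), and the fixed rate (R) are illustrative assumptions, not a library API):

```python
import torch
import torch.nn as nn

def composite_pinn_loss(model, collocation_ts, t_bc, T_bc, t_data=None, T_data=None,
                        w_physics=1.0, w_conds=1.0, w_data=1.0, Tenv=25.0, R=0.005):
    """Composite PINN loss: physics residual + condition error (+ optional data error)."""
    # L_Physics: residual of dT/dt = R*(Tenv - T) at the collocation points,
    # with the time derivative computed by automatic differentiation
    temps = model(collocation_ts)
    dT = torch.autograd.grad(temps, collocation_ts,
                             grad_outputs=torch.ones_like(temps),
                             create_graph=True)[0]
    physics = torch.mean((dT - R * (Tenv - temps)) ** 2)

    # L_Conds: error at initial/boundary points where the solution is known
    conds = torch.mean((model(t_bc) - T_bc) ** 2)

    # L_Data: optional extra measurements that further constrain the solution
    data = torch.tensor(0.0)
    if t_data is not None:
        data = torch.mean((model(t_data) - T_data) ** 2)

    return w_physics * physics + w_conds * conds + w_data * data
```

Minimizing this scalar by gradient descent pushes the network toward a solution that satisfies the ODE, the known conditions, and any measurements simultaneously.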
Applications of PINNs
The versatility of PINNs has led to their application in a wide range of fields, including:
- Heat Transfer: PINNs can model heat distribution and transfer processes by embedding the heat equation into the loss function. This ensures that the solutions adhere to the laws of thermodynamics, leading to physically plausible predictions. They can also replace expensive numerical simulations to quickly approximate temperature distributions over parameterized geometries in design optimization applications.
- Computational Fluid Dynamics (CFD): PINNs can approximate velocity, pressure, and temperature fields of fluids by incorporating the Navier-Stokes equations in the loss function.
- Structural Mechanics: PINNs can solve both forward and inverse problems by embedding the governing physical laws, such as equations of elasticity and structural dynamics, directly into the loss function. This enables accurate prediction of structural responses like deformations, stresses, and strains under various loads and conditions, as well as the identification of unknown material properties or external loads based on observed data. Particularly useful in scenarios where traditional analytical solutions are infeasible or data is scarce, PINNs reduce the dependency on extensive data sets by leveraging physical principles to guide the learning process.
- Solving Parametric Partial Differential Equations (PDEs).
- Addressing challenges that prove difficult for traditional methods, such as inverse problems.
PINNs in Practice: A Step-by-Step Implementation
To illustrate the practical implementation of PINNs, let's consider a simplified example: modeling the cooling of a coffee cup.
The Problem: Newton's Law of Cooling
Imagine you have a cup of coffee that is cooling over time. This process is governed by Newton's Law of Cooling:
[\frac{dT}{dt} = R(T_{env} - T)]
where:
- (T) is the temperature of the coffee at time (t).
- (T_{env}) is the ambient temperature of the environment.
- (R) is the cooling rate.
Let's say your coffee starts at a boiling temperature and cools down in a room at 25°C. You want to predict the temperature of the coffee at any given time, even beyond the times for which you have measurements.
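Before turning to the network, note that this ODE has a closed-form solution, which is handy for validating the PINN afterwards (the 100°C starting temperature and the rate (R = 0.005) below are illustrative choices):

```python
import math

def coffee_temperature(t, T0=100.0, Tenv=25.0, R=0.005):
    """Closed-form solution of dT/dt = R*(Tenv - T):
    exponential decay from T0 toward the ambient temperature Tenv."""
    return Tenv + (T0 - Tenv) * math.exp(-R * t)
```

As (t \to \infty) the exponential vanishes and the temperature approaches (T_{env}), exactly as Newton's Law of Cooling predicts.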
Implementing a PINN with PyTorch
Here's how you can implement a PINN to solve this problem using PyTorch:
```python
import torch
import torch.nn as nn


# Define the neural network architecture
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.fc1 = nn.Linear(1, 20)
        self.fc2 = nn.Linear(20, 20)
        self.fc3 = nn.Linear(20, 1)
        self.relu = nn.ReLU()
        # Make r a differentiable parameter included in self.parameters()
        self.r = nn.Parameter(data=torch.tensor([0.]))

    def forward(self, t):
        out = self.fc1(t)
        out = self.relu(out)
        out = self.fc2(out)
        out = self.relu(out)
        out = self.fc3(out)
        return out


def grad(outputs, inputs):
    """Computes the partial derivative of an output with respect to an input."""
    return torch.autograd.grad(
        outputs, inputs, grad_outputs=torch.ones_like(outputs), create_graph=True
    )


# Define the physics loss function
def physics_loss_discovery(model: nn.Module, Tenv, ts):
    """The physics loss of the model."""
    temps = model(ts)
    dT = grad(temps, ts)[0]
    # Use the differentiable parameter model.r instead of a fixed cooling rate
    pde = model.r * (Tenv - temps) - dT
    return torch.mean(pde ** 2)


# Define collocation points
def generate_collocation_points(t_min, t_max, num_points):
    ts = torch.linspace(t_min, t_max, steps=num_points, requires_grad=True).view(-1, 1)
    return ts


# Training loop
def train_pinn(model, optimizer, Tenv, collocation_points, epochs=1000):
    for epoch in range(epochs):
        optimizer.zero_grad()
        loss = physics_loss_discovery(model, Tenv, collocation_points)
        loss.backward()
        optimizer.step()
        if epoch % 100 == 0:
            print(f'Epoch {epoch}, Loss: {loss.item()}')


# Main execution
if __name__ == '__main__':
    # Hyperparameters
    Tenv = 25.0   # Ambient temperature
    # R = 0.005   # True cooling rate (to be discovered)
    learning_rate = 1e-3
    num_epochs = 1000
    num_collocation_points = 1000
    t_min = 0.0
    t_max = 15.0

    # Model and optimizer
    model = Net()
    optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)

    # Generate collocation points
    collocation_points = generate_collocation_points(t_min, t_max, num_collocation_points)

    # Train the PINN
    # NOTE: to actually discover r, a data-loss term on (noisy) measurements must be
    # added to the physics loss; the physics loss alone admits trivial solutions
    # (e.g., r = 0 with a constant temperature profile).
    train_pinn(model, optimizer, Tenv, collocation_points, epochs=num_epochs)

    # Print the discovered cooling rate
    print(f'Discovered cooling rate: {model.r.item()}')

    # Make predictions
    t_test = torch.linspace(0, 30, 100).view(-1, 1)
    T_pred = model(t_test).detach().numpy()

    # Plot the results
    import matplotlib.pyplot as plt
    plt.plot(t_test.numpy(), T_pred)
    plt.xlabel('Time')
    plt.ylabel('Temperature')
    plt.title('Coffee Cooling with PINN')
    plt.show()
```

Explanation of the Code
- Neural Network Architecture: The `Net` class defines a simple feedforward neural network with three fully connected layers and ReLU activation functions. Critically, the cooling rate `r` is defined as an `nn.Parameter`, allowing it to be learned during training.
- Physics Loss Function: The `physics_loss_discovery` function calculates the physics-informed loss based on Newton's Law of Cooling. It computes the derivative of the temperature with respect to time using `torch.autograd.grad` and then calculates the mean squared residual of the differential equation. It uses the differentiable parameter `model.r` instead of a fixed cooling rate.
- Training Loop: The `train_pinn` function iterates through the training epochs, calculating the loss, computing gradients, and updating the network parameters using the Adam optimizer.
- Collocation Points: The `generate_collocation_points` function generates the collocation points, i.e., the points in the domain where the differential equation is enforced.
- Main Execution: The `if __name__ == '__main__':` block sets up the problem, defines the hyperparameters, creates the model and optimizer, generates the collocation points, trains the PINN, and prints the discovered cooling rate. Finally, it generates predictions and plots the results.
Advantages of PINNs in this Example
- Extrapolation: The PINN can accurately predict the temperature of the coffee even beyond the time range of the training data, thanks to the physics-informed regularization.
- Parameter Discovery: The PINN can learn the cooling rate (R) directly from the data, even if it is unknown a priori.
Addressing Challenges and Limitations
While PINNs offer a powerful approach to solving physics-based problems, they also come with their own set of challenges and limitations:
- Complexity of the Differential Equation: Very complex differential equations can lead to "bumpy" loss landscapes, making it difficult for gradient descent to find the optimal solution.
- Choice of Collocation Points: The distribution of collocation points can significantly impact the accuracy and stability of the PINN.
- Balancing Loss Terms: Properly weighting the different terms in the loss function (e.g., physics loss, data loss) is crucial for good performance; imbalances between the partial losses can bias training toward one objective at the expense of the other.
- Gradient Flow Pathologies: Issues like vanishing or exploding gradients can hinder the training process, especially in deep networks.
Strategies for Mitigation
Researchers have developed various strategies to address these challenges:
- Self-Adaptive PINNs: These methods dynamically adjust the weights of the loss terms during training to improve convergence.
- Adaptive Sampling: Techniques like residual-based adaptive sampling focus on placing more collocation points in regions where the residual of the differential equation is high.
- Causality-Respecting PINNs: Enforcing causality constraints can improve the stability and accuracy of PINNs, especially for time-dependent problems.
- Locally Adaptive Activation Functions: These activation functions can help mitigate gradient flow pathologies and improve the expressiveness of the network.
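As an illustration of the second strategy, residual-based adaptive sampling can be sketched as follows: evaluate the PDE residual on a dense candidate set and keep the points where it is largest. The helper names and the top-k selection rule below are illustrative choices, not a standard API.

```python
import torch

def residual_adaptive_points(model, residual_fn, t_min, t_max,
                             num_candidates=2000, num_keep=200):
    """Pick collocation points where the PDE residual is largest.

    residual_fn(model, ts) must return the per-point PDE residual, shape (N, 1).
    """
    candidates = torch.linspace(t_min, t_max, num_candidates).view(-1, 1).requires_grad_(True)
    res = residual_fn(model, candidates)
    # Rank candidates by residual magnitude and keep the worst offenders
    scores = res.detach().abs().view(-1)
    idx = torch.topk(scores, num_keep).indices
    return candidates.detach()[idx].requires_grad_(True)
```

In practice this is re-run every few hundred epochs, so the collocation set migrates toward regions (sharp gradients, boundary layers) where the network currently violates the physics the most.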
Vanilla-PINNs vs. hard-PINNs
In the original method introduced by Raissi et al. (2019), often referred to as vanilla-PINNs in the literature, the initial and boundary conditions necessary for solving the equations are enforced through an additional set of data, termed training points, where the solution is either known or assumed. These constraints are integrated by minimizing a second loss function, typically a measure of error such as the mean squared error. The two loss functions are combined into a total loss function, which is then minimized with a gradient-descent algorithm. An advantageous aspect of vanilla-PINNs is their minimal reliance on extensive training data: only knowledge of the solution at the boundary is required.
Following the approach initially proposed by Lagaris et al. (1998), there is another option: enforce the boundary conditions exactly, which eliminates the need for a boundary training dataset. This is done by constructing a well-behaved trial function that forces the neural network (NN) output to take the prescribed value at the boundary by construction.
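For the coffee-cooling example, the initial condition (T(0) = T_0) can be enforced exactly in this spirit with the trial function (\hat{T}(t) = T_0 + t \cdot N(t)); this particular form is one illustrative choice among many.

```python
import torch
import torch.nn as nn

class HardConstraintNet(nn.Module):
    """Trial function T(t) = T0 + t * N(t), so T(0) = T0 holds by construction
    and no initial-condition loss term is needed."""
    def __init__(self, T0=100.0):
        super().__init__()
        self.T0 = T0
        self.net = nn.Sequential(nn.Linear(1, 20), nn.Tanh(), nn.Linear(20, 1))

    def forward(self, t):
        # The factor t kills the network's contribution at t = 0
        return self.T0 + t * self.net(t)
```

Training such a network then involves only the physics loss on collocation points, since the constraint can never be violated.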
Solving Poisson-Type Equations
To further demonstrate the application of PINNs, let's consider solving Poisson-type equations in 2D Cartesian coordinates.
Problem Setup
The general form of the Poisson equation is:
[-\nabla \cdot ( \mu \nabla u) = f]
where:
- (u(x, y)) is the unknown function.
- (f(x, y)) is a known source term.
- (\mu) is a scalar parameter.
For simplicity, we can focus on the case (\mu = 1), which reduces the equation to the standard Poisson equation:
[-\nabla^2 u = - (u_{xx} + u_{yy}) = f]
where (u_{xx}) and (u_{yy}) are the second-order derivatives of (u) with respect to (x) and (y), respectively.
We can also focus on a case having the exact solution (u(x,y)=x^{2}-y^{2}).
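As a quick sanity check (a plain finite-difference sketch, purely illustrative), we can verify that this exact solution is harmonic, so the matching source term is (f = 0):

```python
def u(x, y):
    """Exact solution u(x, y) = x^2 - y^2."""
    return x ** 2 - y ** 2

def laplacian_fd(f, x, y, h=1e-3):
    """Central finite-difference approximation of u_xx + u_yy at (x, y)."""
    uxx = (f(x + h, y) - 2 * f(x, y) + f(x - h, y)) / h ** 2
    uyy = (f(x, y + h) - 2 * f(x, y) + f(x, y - h)) / h ** 2
    return uxx + uyy
```

Since (u_{xx} = 2) and (u_{yy} = -2), the Laplacian vanishes identically and (f = -(u_{xx} + u_{yy}) = 0) everywhere in the domain.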
Implementation Details
- Data Generation: Generate a data set of training data with points localized at the 4 boundaries. For example, choose 120 points (30 per boundary) with a random distribution. Also generate collocation points inside the domain, e.g., 400 randomly distributed points are used.
- Network Architecture: Choose a network architecture, for example, having 5 hidden layers with 20 neurons per layer.
- Training: Train the PINN using a suitable optimization algorithm (e.g., Adam) and a learning rate (e.g., (lr=2\times 10^{-4})).
- Loss Functions: Define the loss functions (L_{data}) and (L_{PDE}) to enforce the boundary conditions and the Poisson equation, respectively.
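The data-generation step above can be sketched as follows, assuming the unit square ([0,1]^2) as the domain and boundary values taken from the exact solution (the helper name and seed are illustrative):

```python
import torch

def poisson_training_points(n_per_edge=30, n_interior=400, seed=0):
    """Random boundary training points (4 edges of the unit square)
    plus random interior collocation points."""
    g = torch.Generator().manual_seed(seed)
    r = lambda n: torch.rand(n, 1, generator=g)
    zeros, ones = torch.zeros(n_per_edge, 1), torch.ones(n_per_edge, 1)
    # 30 points per edge: bottom (y=0), top (y=1), left (x=0), right (x=1)
    xb = torch.cat([r(n_per_edge), r(n_per_edge), zeros, ones])
    yb = torch.cat([zeros, ones, r(n_per_edge), r(n_per_edge)])
    u_exact = xb ** 2 - yb ** 2        # boundary data from the exact solution
    # 400 randomly distributed interior collocation points
    xc, yc = r(n_interior), r(n_interior)
    return (xb, yb, u_exact), (xc, yc)
```

The boundary triplet feeds (L_{data}), while the interior points are where the PDE residual for (L_{PDE}) is evaluated.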
Inverse Problems and Parameter Discovery
PINNs are particularly well-suited for solving inverse problems, where the goal is to determine unknown parameters or functions within a physical system based on limited observations.
Discovering Unknown Parameters
Consider the advection-diffusion problem:
[-\nabla \cdot ( \mu \nabla u) + \mathbf{v} \cdot \nabla u = f]
where (\mu) is a scalar parameter taking different values. In a 2D direct problem, (\mu) is known, but for inverse problems, (\mu) is considered an unknown. The PINN can be trained to discover this parameter by including it as a learnable parameter in the neural network.
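A minimal sketch of this setup (the architecture is schematic; only the idea matters): (\mu) is registered as an `nn.Parameter`, so the same optimizer that trains the network weights also updates it through the gradient of the physics loss.

```python
import torch
import torch.nn as nn

class InversePoissonNet(nn.Module):
    """Network for u(x, y) with an unknown scalar diffusivity mu learned jointly."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(2, 20), nn.Tanh(), nn.Linear(20, 1))
        # Unknown PDE coefficient; because it is an nn.Parameter, it appears in
        # self.parameters() and is updated alongside the network weights
        self.mu = nn.Parameter(torch.tensor(1.0))

    def forward(self, xy):
        return self.net(xy)
```

Since (\mu) multiplies the diffusion term of the residual, the physics loss is differentiable with respect to it, and the data-loss term on observed values of (u) drives it toward the true coefficient.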