In-Situ Machine Learning with CAMSARI: Overcoming Device Variability in Neuromorphic Spintronic Systems

One of the major hurdles in contemporary electronics lies in the creation and deployment of hardware neural networks capable of rapid and energy-efficient machine learning. Spintronics emerges as a promising avenue in this domain, boasting nanosecond-scale operation speeds and compatibility with existing microelectronic technologies. However, when considering the development of large-scale, functional neuromorphic systems, the variability in device properties poses a significant challenge. This article explores an autonomously operating circuit designed for hardware-aware machine learning, leveraging probabilistic neurons constructed with stochastic magnetic tunnel junctions (MTJs). This approach demonstrates that in situ learning of weights and biases within a Boltzmann machine can effectively mitigate device-to-device variations and accurately learn the probability distributions of meaningful operations, such as a full adder.

Introduction to Probabilistic Computing and Spintronics

Traditional computing systems, rooted in the von Neumann architecture, depend on deterministic binary bits to process information and perform calculations. This approach proves inefficient for combinatorial optimization problems, which are often classified as Non-deterministic Polynomial-time hard (NP-hard) or NP-complete. Examples of such problems include the knapsack problem, integer factorization, and the traveling salesman problem (TSP). Probabilistic bits (P-bits), which serve as the fundamental building blocks for Ising computation hardware, offer a compelling alternative due to their inherent stochasticity.

Spintronics, utilizing the spin of electrons rather than their charge, offers potential advantages in terms of speed and power consumption. Magnetic tunnel junction (MTJ) P-bit devices, in particular, present desirable characteristics such as high speed (on the order of nanoseconds), ultra-low power consumption (on the order of microwatts), and a small footprint (around 10 nm). In order to achieve the stochastic behavior necessary for P-bits, the energy barrier ΔE of the MTJ must be reduced to approximately 1kBT. However, fabricating multiple MTJ P-bits with identical ΔE values poses a significant challenge, especially given the difficulty in precisely controlling such small energy barriers. This inherent difficulty leads to significant intrinsic variation among MTJ P-bit devices.
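The stochastic behavior described above can be sketched in a few lines. The following toy model (an illustration, not the hardware's actual transfer characteristic) treats a P-bit as a binary stochastic neuron whose average state follows a sigmoid of its input:

```python
import math
import random

def pbit_sample(input_signal):
    """One stochastic update of an ideal P-bit (binary stochastic neuron).

    Returns -1 or +1; the probability of +1 follows a sigmoid of the input,
    via the standard form m = sgn(tanh(I) - r) with r uniform in [-1, 1].
    """
    r = random.uniform(-1.0, 1.0)
    return 1 if math.tanh(input_signal) > r else -1

# With zero input an ideal P-bit fluctuates 50/50, so its average state is ~0.
samples = [pbit_sample(0.0) for _ in range(100_000)]
mean_state = sum(samples) / len(samples)
```

A positive input biases the neuron toward +1 and a negative input toward −1, which is exactly the tunable randomness that Ising computation exploits.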

To ensure the accuracy of Ising computation, individual P-bit devices typically require calibration, which becomes prohibitively expensive for large P-bit arrays. Furthermore, within an analog P-bit system, direct measurement and calibration of individual P-bit variations remain elusive. Several methods have been proposed to address this variation issue, including time division multiplexing (TDM), the application of external magnetic fields and voltages, and the configuration of suitable resistances.

Boltzmann Machines and In-Situ Learning

Boltzmann machines, stochastic recurrent neural networks inspired by the Boltzmann distribution, have found widespread application in generative machine learning. Recently, hardware implementations of restricted Boltzmann machines have been proposed. In response to the variability of MTJ P-bit devices, an in situ learning approach based on a fully visible Boltzmann machine (FVBM) has been developed. This approach updates the weight matrix to adapt to P-bit variation, effectively treating P-bits with variations as a unified system and training suitable weights at the system level, rather than relying on individual calibration.
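The in situ learning rule of an FVBM can be sketched as follows. This is a minimal illustration assuming the standard contrastive update ΔW ∝ ⟨vvᵀ⟩_data − ⟨vvᵀ⟩_model; the function name, learning rate, and array shapes are choices made here, not taken from the article:

```python
import numpy as np

rng = np.random.default_rng(0)

def fvbm_weight_update(W, data_states, model_states, lr=0.05):
    """One contrastive learning step for a fully visible Boltzmann machine.

    W            : (N, N) symmetric weight matrix, zero diagonal
    data_states  : (S, N) array of +/-1 samples from the target distribution
    model_states : (S, N) array of +/-1 samples drawn from the (varied) hardware
    The update pushes model correlations toward data correlations, so the
    trained weights absorb device-level variations without per-device calibration.
    """
    corr_data = data_states.T @ data_states / len(data_states)
    corr_model = model_states.T @ model_states / len(model_states)
    W = W + lr * (corr_data - corr_model)
    np.fill_diagonal(W, 0.0)  # no self-coupling
    return W

# Illustrative call with random placeholder samples.
W = np.zeros((4, 4))
data_states = rng.choice([-1, 1], size=(200, 4))
model_states = rng.choice([-1, 1], size=(200, 4))
W_new = fvbm_weight_update(W, data_states, model_states)
```

Because the model-side samples come from the physical P-bit array itself, the learned weights are automatically "hardware-aware": any systematic device bias shows up in the model correlations and is compensated by the update.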


Modeling P-Bit Variation

The variability in P-bit devices can be attributed to variations in their sigmoid curves, which define the probability of a P-bit switching states based on an input voltage or current. Experimental measurements of these sigmoid curves reveal two primary categories of deviation from the ideal curve: stretching or compression in shape, and rigid shifts. These deviations can be characterized by two parameters: α, which describes the degree of stretching and compression, and ΔV, which represents the rigid shift of the sigmoid curve. Accurately extracting α and ΔV simultaneously from a large P-bit array is a crucial and challenging task.

A behavioral model incorporating α and ΔV can be used to fit experimentally measured sigmoid curves. However, these device-to-device variations significantly compromise the accuracy of Ising computation.
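A minimal version of such a behavioral model might look like the following sketch; the sign convention (a positive ΔV shifting the curve to the left) is an assumption here, consistent with the shift direction described later in the article:

```python
import numpy as np

def pbit_sigmoid(V, alpha=1.0, dV=0.0):
    """Behavioral model of a real P-bit's activation curve.

    alpha stretches (alpha < 1) or compresses (alpha > 1) the ideal sigmoid;
    dV shifts the curve rigidly along the voltage axis (sign convention assumed).
    """
    return 1.0 / (1.0 + np.exp(-alpha * (V + dV)))

p_ideal = pbit_sigmoid(0.0)                       # 0.5: unbiased P-bit at zero input
p_varied = pbit_sigmoid(0.0, alpha=0.7, dV=0.4)   # same input, different probability
```

Fitting this two-parameter curve to each measured device yields the (α, ΔV) pair that the compensation and extraction methods below operate on.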

Weight Compensation Method

To address the challenge of P-bit variability, a weight compensation method has been developed. This method compensates for P-bit device variation by re-deriving the weight matrix instead of calibrating each P-bit device individually. Comparing AND gate computations performed with ideal and with real P-bits makes the necessity of addressing P-bit variation clear.

The weight matrix compensation process involves re-deriving the weight and bias matrices to ensure precise Ising computation with the real P-bit array. This process relies on two compensation parameters: Cα = 1/α and CΔV = -diag(1/α)ΔV. These compensation parameters are solely related to P-bit variations, rather than the weight matrix itself. This implies that even with changes in the problem being solved, these compensation parameters remain unchanged, eliminating the need to retrain the weight matrix. The only prerequisite is obtaining the P-bit variation parameters, α and ΔV, treating these variations as inherent and immutable attributes of P-bit devices.
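Under these definitions, applying the compensation reduces to a matrix rescaling. The sketch below assumes the compensated couplings are obtained as diag(1/α)·W and that C_ΔV is folded into the bias vector; the exact mapping in the original work may differ:

```python
import numpy as np

def compensate(W, b, alpha, dV):
    """Re-derive weights and biases so a varied P-bit array computes like an ideal one.

    Uses the two compensation parameters from the text:
        C_alpha = 1 / alpha                  (per-P-bit input rescaling)
        C_dV    = -diag(1 / alpha) @ dV      (per-P-bit shift cancellation)
    Folding C_dV into the bias vector is an assumed convention of this sketch.
    """
    c_alpha = 1.0 / alpha             # shape (N,)
    c_dV = -c_alpha * dV              # elementwise form of -diag(1/alpha) @ dV
    W_comp = np.diag(c_alpha) @ W     # rescale the couplings seen by each P-bit
    b_comp = c_alpha * b + c_dV       # rescale the bias and cancel the shift
    return W_comp, b_comp

# A 2-P-bit example: P-bit 2 has a compressed curve (alpha=2) and a shift of 0.3.
W = np.array([[0.0, 1.0], [1.0, 0.0]])
b = np.array([0.5, -0.5])
W_c, b_c = compensate(W, b, alpha=np.array([1.0, 2.0]), dV=np.array([0.0, 0.3]))
```

Note that `compensate` depends only on the device parameters (α, ΔV), not on the problem encoded in W and b, which is why the same compensation carries over to any new problem without retraining.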

Variation Extraction Algorithm Based on Boltzmann Machine Learning

The weight compensation method hinges on precise knowledge of the α and ΔV values for each P-bit. To meet this need, an automatic variation extraction algorithm has been developed that extracts the device variation of each P-bit in a large array based on Boltzmann machine learning. To enable accurate extraction from an extendable P-bit array, an Ising Hamiltonian based on a 3D ferromagnetic model is constructed, yielding precise and scalable variation extraction.


The Boltzmann machine can be mapped onto the P-bit hardware system, where it trains the weight matrix W for a given data distribution. However, training a weight matrix for each specific problem is time-consuming and lacks transferability and scalability. The objective is therefore to extract the α and ΔV of each P-bit device in the real array just once, treating them as inherent attributes of each P-bit, and then use the weight compensation method to solve diverse unconventional problems.

By observing and analyzing the impact of α and ΔV on the data distributions in Ising computation, the algorithm can effectively extract the P-bit variation. When ΔV > 0, the ideal sigmoid curve rigidly shifts to the left, so every point on the curve moves upward at a given input voltage; consequently, 〈m〉 increases from 〈m_ideal〉 to 〈m_1〉. Conversely, when ΔV < 0, the curve shifts to the right and 〈m〉 decreases. When the coupling between P-bit i and P-bit k in Ising computing is positive, the states of the two P-bits tend to point in the same direction. For P-bit 1, a change in α_1 increased the absolute value of 〈m_2〉, which resulted in the increase in…
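The effect of a rigid shift ΔV on the average state 〈m〉 can be reproduced numerically. This is an illustrative single-P-bit Monte Carlo simulation under the same behavioral sigmoid model (sign conventions assumed):

```python
import numpy as np

rng = np.random.default_rng(1)

def mean_state(alpha, dV, V=0.0, n=200_000):
    """Monte Carlo estimate of <m> for one P-bit with variation (alpha, dV).

    Assumes P(m = +1) = sigmoid(alpha * (V + dV)), so a positive dV shifts
    the curve left and biases the P-bit toward +1 at a given input V.
    """
    p_up = 1.0 / (1.0 + np.exp(-alpha * (V + dV)))
    m = np.where(rng.random(n) < p_up, 1.0, -1.0)
    return float(m.mean())

m_ideal = mean_state(alpha=1.0, dV=0.0)  # unbiased: <m> near 0
m_shift = mean_state(alpha=1.0, dV=0.5)  # left-shifted curve: <m> rises above ideal
```

The extraction algorithm works in the reverse direction: it observes such shifts in the sampled statistics and infers the (α, ΔV) pair that produced them.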

Experimental Validation and Results

The effectiveness, transferability, and scalability of this approach have been demonstrated by successfully solving the 16-city TSP and 21-bit integer factorization on a large P-bit array whose variations were corrected by the automatic extraction and compensation methods. The results of AND gate computation under three scenarios of P-bit variation are illustrative. The ideal P-bit array obtains accurate results: each of the four correct states appears with approximately 18% probability, for a total accuracy of 72%. In contrast, the real P-bit array, with α and ΔV values extracted from experimental data, cannot produce accurate results in Ising computing; its accuracy is 59%, much lower than that of the ideal array. With weight compensation, the real P-bit array achieves accurate computation akin to the ideal P-bit array.
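As an illustration of AND gate Ising computation, the sketch below Gibbs-samples a three-P-bit network using coupling values in the style of the probabilistic-spin-logic literature (the J, h, and β values are assumptions here, not taken from the article); the four states satisfying C = A AND B should dominate the sampled distribution:

```python
import numpy as np

rng = np.random.default_rng(2)

# AND-gate couplings and biases (assumed illustrative values).
J = np.array([[0., -1., 2.],
              [-1., 0., 2.],
              [2., 2., 0.]])
h = np.array([1., 1., -2.])
beta = 1.0  # inverse temperature: larger beta -> more sharply peaked distribution

m = np.ones(3)
counts = {}
for step in range(60_000):
    i = step % 3                                  # sweep P-bits sequentially
    I = J[i] @ m + h[i]                           # synaptic input to P-bit i
    p_up = 1.0 / (1.0 + np.exp(-2.0 * beta * I))  # Gibbs probability of m_i = +1
    m[i] = 1.0 if rng.random() < p_up else -1.0
    key = tuple(int(x) for x in m)
    counts[key] = counts.get(key, 0) + 1

# States consistent with C = A AND B should dominate the histogram.
valid = [(-1, -1, -1), (-1, 1, -1), (1, -1, -1), (1, 1, 1)]
valid_fraction = sum(counts.get(s, 0) for s in valid) / sum(counts.values())
```

Replacing the ideal sigmoid inside the loop with a varied one (different α and ΔV per P-bit) distorts this histogram, which is precisely the accuracy loss the weight compensation method repairs.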

Monte Carlo analysis demonstrates that even a small standard deviation of ΔV produces a significantly large Kullback–Leibler divergence (KLD) for the AND gate computation, whereas the tolerance for α is higher: as long as the standard deviation of α is below 0.5, the computation maintains sufficient accuracy.
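The KLD used as the accuracy metric here can be computed directly from the sampled state distributions. A short sketch (the `p_real` numbers below are made-up illustrative values, not measured data):

```python
import numpy as np

def kld(p, q, eps=1e-12):
    """Kullback-Leibler divergence D(p || q) between two state distributions."""
    p = np.asarray(p, dtype=float) + eps
    q = np.asarray(q, dtype=float) + eps
    p, q = p / p.sum(), q / q.sum()
    return float(np.sum(p * np.log(p / q)))

# Ideal AND-gate distribution from the text: four correct states at ~18% each,
# with the remaining 28% spread over the four incorrect states.
p_ideal = [0.18, 0.18, 0.18, 0.18, 0.07, 0.07, 0.07, 0.07]
p_real = [0.25, 0.10, 0.20, 0.04, 0.12, 0.09, 0.11, 0.09]  # illustrative only
divergence = kld(p_ideal, p_real)
```

A KLD of zero means the measured distribution matches the ideal one exactly; the Monte Carlo tolerance study above amounts to tracking how this divergence grows as the spread of α or ΔV widens.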

P-Bit Device Fabrication and Operation

The P-bit device is the main building block of the Ising computer. It is a binary stochastic neuron (BSN) that fluctuates between −1 and 1 with a probability tuned by the input voltage (or current). The probability as a function of input voltage typically follows a sigmoid-like curve.


An ultra-fast, field-free stochastic SOT P-bit device can be fabricated from a deposited stack consisting of IrMn(7.5)/CoFe(2)/Ru(0.8)/CoFe(2)/CoFeB(1.9)/MgO/CoFeB(1.2)/W(5). The stack is etched into elliptical MTJs with a major axis of 137 nm and a minor axis of 107 nm. The SOT MTJ structure adds a heavy-metal layer beneath the MTJ, which is used to apply a SOT voltage; this voltage modulates the double-well energy barrier via the SOT effect. The elliptical MTJ has a stray field that inherently favors the '0' state. When the SOT voltage is low, the SOT MTJ tends to stay in the '0' state; as the voltage increases, the probabilities of the '0' and '1' states approach 50/50; and when the SOT voltage is sufficiently high, the MTJ is more likely to settle in the '1' state.
