Introduction

The anomalous magnetic moment of the electron has been both predicted by quantum field theory (QFT) and measured experimentally to an astounding level of accuracy, equivalent to measuring your height to within a nanometre. The predicted and measured values agree to 10 decimal places, making this the most accurately verified prediction in the history of physics. Muons are close cousins of the electron: subatomic particles that behave very similarly but are about 200 times heavier. When the same comparison is made for muons, the predicted and experimental values turn out to be ever so slightly different. This disparity suggests that there is physics happening to the muons that we don't yet understand, which could mean new, undiscovered particles, among other possibilities.

 

So what is an anomalous magnetic moment?

At a fundamental level, an electron and its cousins behave like tiny bar magnets, and how strong a magnet each one is is determined by its magnetic moment. The magnetic moment in turn is determined by the spin of the electron, a fundamental property of the particle much like its mass or charge. Using classical physics developed over a hundred years ago, we would expect a magnetic moment of half a Bohr magneton. Quantum mechanics and QFT show us that the actual value is one Bohr magneton plus a little bit extra. Twice this value, measured in Bohr magnetons, is known as the g-factor and is roughly equal to 2.002; the little bit extra is the anomalous magnetic dipole moment. It is, roughly, a measure of how strongly the particle interacts with virtual particles: particles that appear and disappear only briefly.

 

Feynman diagrams, QFT, and theoretical calculations

When muons and electrons interact with a magnetic field, the interaction can be visualized using Feynman diagrams. The primary interaction can be seen below:

A photon (the squiggly line), a particle of light, is a force-carrying particle: it acts like a messenger telling the electron (the straight line) what the magnetic field is and how the electron should move. We can see the photon hit the electron and be absorbed, causing the electron to change direction. If this were the only way a particle could interact with a magnetic field, the g-factor would be exactly 2 and there would be no anomalous magnetic dipole moment.

However, the electron could also interact in the following way:

 

 

The electron emits a photon first, then absorbs the force-carrying photon, and then reabsorbs the photon it emitted. The outcome is the same as before, but this interaction is much less likely to happen, so it contributes only slightly to the g-factor. Adding up the contributions from all possible interactions gives the true g-factor of the particle; using QFT and supercomputers, we can sum the first few hundred most probable interactions and thus calculate the g-factor theoretically to high precision.
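The size of the "little bit extra" can be sketched numerically. The largest loop correction, Schwinger's famous one-loop result, is a = α/2π; the snippet below is a rough illustration rather than a full QFT calculation, and shows how this single term already reproduces g ≈ 2.002:

```python
import math

# Fine-structure constant (CODATA value, truncated)
alpha = 1 / 137.035999

# Dirac's prediction with no loop corrections: g = 2 exactly
g_dirac = 2.0

# Schwinger's one-loop correction, the largest "extra" term
a_one_loop = alpha / (2 * math.pi)

# g-factor including only this first correction
g_one_loop = 2 * (1 + a_one_loop)

print(f"a ≈ {a_one_loop:.6f}")  # ≈ 0.001161
print(f"g ≈ {g_one_loop:.5f}")  # ≈ 2.00232
```

The hundreds of further diagrams summed on supercomputers refine digits far beyond this first term.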

 

Fermilab experiment and experimental value

The E821 experiment at Brookhaven National Laboratory (BNL) experimentally determined the anomalous magnetic dipole moment of the muon, publishing its final results back in 2006. It did so by observing how muons precess in a uniform magnetic field, since the frequency of the precession depends on the g-factor. Muons only live for a few microseconds before decaying into other particles, so the frequency was observed indirectly: by measuring both the energy of the particles emitted when the muons decay and how often those particles are detected, the precession frequency can then be determined.

The experiment found that the experimental value differed from the theoretical value at a 3.6 sigma significance level, meaning there is about a 1 in 3000 chance that the discrepancy was a statistical fluke. The physics community, however, likes to see a significance level of 5 sigma or more, roughly a 1 in 3.5 million chance of a fluke, before declaring a result a discovery. The BNL experiment is being repeated at Fermilab to try to reach this 5 sigma level, with measurements expected to conclude in 2022. The first results from Fermilab were released in April 2021, and combining them with the BNL results gives a 4.2 sigma significance level, which is very promising.
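The quoted "chance of a fluke" figures follow from Gaussian tail probabilities. As a sketch (note that one-sided and two-sided conventions differ by a factor of two, which is why slightly different numbers are quoted in different places):

```python
import math

def p_value(sigma, one_sided=False):
    """Gaussian tail probability of a fluctuation at least `sigma`
    standard deviations from the mean."""
    p = math.erfc(sigma / math.sqrt(2))  # two-sided tail probability
    return p / 2 if one_sided else p

for s in (3.6, 4.2, 5.0):
    print(f"{s} sigma: about 1 in {1 / p_value(s):,.0f} (two-sided)")
```

At 3.6 sigma the two-sided probability is about 1 in 3100, matching the "1 in 3000" figure; the often-quoted "1 in 3.5 million" for 5 sigma corresponds to the one-sided convention.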

Why is there a difference?

Because the muon is so much heavier than the electron, it is more likely to briefly form virtual particles and interact with them before they vanish, adding more to its anomalous magnetic dipole moment than the same process would for an electron. So if there were some undiscovered particle that the theoretical calculations did not take into account, the heavier muon would interact with it roughly 43,000 times more strongly than the electron would (the square of the mass ratio). The change to the anomalous magnetic dipole moment of the electron would be undetectably small, but the much bigger change to that of the muon would be detectable. If the final results out of Fermilab next year reach a 5 sigma significance level, we can say confidently that there are yet-undiscovered particles, or at the very least that there is something wrong with our current theories of how the universe works at the most fundamental level.
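The enhancement factor can be checked with a quick back-of-the-envelope calculation; the scaling of the sensitivity with the square of the lepton mass is the standard expectation for contributions from heavy new particles:

```python
# Contributions of a heavy new particle of mass M to the anomalous moment
# are expected to scale as (m_lepton / M)^2, so the muon's sensitivity
# exceeds the electron's by the mass ratio squared.
m_e = 0.51099895    # electron mass, MeV/c^2 (CODATA, truncated)
m_mu = 105.6583755  # muon mass, MeV/c^2 (CODATA, truncated)

ratio = m_mu / m_e
enhancement = ratio ** 2

print(f"mass ratio ≈ {ratio:.1f}")
print(f"sensitivity enhancement ≈ {enhancement:,.0f}")
```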

 

Machine Learning in General

Shortly after the invention and popularisation of computers, a new field of science emerged: computational physics, centred around computer simulation and computer modelling. As early as 1929, Paul Dirac had proclaimed that the fundamental physical laws needed for chemistry were 'completely known'; the difficulty was that the resulting equations were far too complex to solve by hand. Computational methods changed that, sparking tremendous progress in materials science and physical chemistry. Since then, our understanding of the universe at the most fundamental level has grown continuously, and computational methods have been further enhanced.

To give a couple of examples of these techniques in use: computer simulations allow scientists to predict the properties of a compound (with reasonable accuracy) before it has been made in the laboratory, or to create extensive databases covering the calculated properties of known and hypothetical systems.

Standard computer algorithms, as good as they can be, have many drawbacks, and they can only get us so far before a better method is necessary. When we think about an algorithm, we think about a set of very specific instructions; in the case of a computer algorithm, we tell the machine exactly what to do at every step, so the computer is little more than a calculator. However, it may not always be possible for a human to spell out every step, especially for very complicated tasks. So why can't we give the machine more freedom, so that it can develop the needed algorithm itself, tailored specifically to the task at hand and the data provided? Well, we can! It is called machine learning.

This approach is still relatively new in the physical sciences. However, as the methods mature, we are discovering the vast potential that machine learning has: the potential to change and enhance the way we use computer models today.

But what exactly is machine learning?

Machine learning (ML) is a class of computer algorithms that use data to make and improve their predictions.

For a researcher using ML, the most important thing, the thing that determines the limits of their work, is data. The quality and quantity of the data available directly influence the accuracy of future predictions. Before a data set can be used, it may require featurisation: the process of converting raw data into something more meaningful and more suitable for an algorithm. For example, in spectroscopy the signal is acquired in the time domain, but for analysis it is converted to the frequency domain using the Fourier transform. The more suitable the representation of the data, the better the algorithm will work, although choosing the right representation requires insight into both the underlying physics and the inner workings of the algorithm.
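The spectroscopy example can be sketched in a few lines: a naive discrete Fourier transform (a stand-in for the fast FFT a real analysis would use) turns a time-domain sine wave into a frequency-domain representation where the dominant frequency stands out:

```python
import cmath
import math

# A toy time-domain signal: 64 samples of a sine with 5 cycles per record
n = 64
signal = [math.sin(2 * math.pi * 5 * t / n) for t in range(n)]

def dft_magnitudes(x):
    """Naive O(n^2) discrete Fourier transform; fine for a toy example."""
    n = len(x)
    return [abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n)))
            for k in range(n // 2)]

mags = dft_magnitudes(signal)
peak_bin = max(range(len(mags)), key=lambda k: mags[k])
print(f"dominant frequency bin: {peak_bin}")  # the 5-cycle component
```

In the frequency domain, the single number "bin 5" captures what was spread over 64 time samples, which is exactly why this representation suits many algorithms better.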

The training of a ML model can be divided into three categories based on the type of data used: supervised, semi-supervised or unsupervised.

In supervised learning, which is the most commonly used in physical sciences, the training data consist of input values and associated output values. The model is then trained so that given a specific set of input values the model has never seen before it can predict the output values to an acceptable degree of fidelity.

Unsupervised learning is used if only input values are available. The model then uses said values in an attempt to identify trends, patterns, or clusters in the data.

Semi-supervised learning may be useful when a large set of input values is available but output values exist for only a small portion of them.

When the data has been properly 'seasoned', it is time to choose the model to be used. Researchers have developed a wide range of ML model types (a.k.a. learners), each designed with a specific goal in mind. For example, one type of supervised learning model predicts output values from a discrete set (i.e. it classifies inputs into discrete categories), while another predicts outputs from a continuous set (e.g. determining the polarisation of light after it passes through a material).

In certain situations it is useful to use an ensemble of models: training many copies of the same algorithm on resampled subsets of the data ('bagging'), or feeding the predictions of several different algorithms, or of similar algorithms with different internal parameters (hyperparameters), into a further model ('stacking').

Popular types of learners include artificial neural networks and their many-layered variants, deep neural networks, which strive to mimic the operation of the human brain. Artificial neurons are the processing units of the model: each receives inputs from other neurons and uses them to perform a straightforward calculation. The input from each neuron is weighted differently, and the learning process adjusts those weights so that the prediction becomes as accurate as possible.
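A single artificial neuron can be written down in a few lines; the weights and bias below are arbitrary placeholders that training would adjust:

```python
import math

def neuron(inputs, weights, bias):
    """A single artificial neuron: a weighted sum of the inputs passed
    through a sigmoid activation, squashing the result into (0, 1)."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1 / (1 + math.exp(-z))

# Hypothetical weights; learning would nudge these to reduce prediction error
output = neuron(inputs=[1.0, 2.0], weights=[0.5, -0.25], bias=0.1)
print(f"neuron output: {output:.3f}")
```

A network is just many such units wired together, layer after layer.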

Regardless of the model used, most learners are not fully autonomous and require the values of hyperparameters to be estimated beforehand using heuristics. Since even slight variations in the hyperparameters can cause significant changes to the speed of the learning process, and since selecting the optimal values is time-consuming and difficult, the development of automatic optimisation algorithms is being actively pursued.

After choosing a model, or several candidate models, it is important to evaluate them in order to optimise them and, if there is a choice between two or more, to pick the best one. The test should be done on a sample of data the model has not seen during the training process. This is crucial: otherwise the model might simply learn the answers to the training set and, when faced with new data, fail to give the right output. The standard split is 70-80% of the entire data set for training and 20-30% for testing.
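A minimal train/test split might look like this (a hand-rolled sketch; ML libraries provide equivalent utilities):

```python
import random

def train_test_split(data, test_fraction=0.2, seed=42):
    """Shuffle the data, then hold out a fraction for testing.
    The model must never see the test portion during training."""
    shuffled = data[:]
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]  # train, test

samples = list(range(100))
train, test = train_test_split(samples)
print(len(train), len(test))  # 80 20
```

Shuffling before splitting matters: data collected in order (by time, temperature, batch) would otherwise give a test set unlike the training set.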

As much as we might wish otherwise, ML is not perfect, and there are three fundamental sources of error in models: model bias, model variance, and irreducible errors.

Bias arises when the algorithm is based on wrong assumptions, and as a consequence, it misses underlying relationships. High bias (a.k.a. underfitting) happens when the model is too rigid or when data is insufficiently detailed; both result in the model not being able to describe the relationships between inputs and outputs.

Variance occurs when the model is too sensitive to the training data, which may contain noise or errors due to limitations of the measurement equipment, outliers, and/or missing data. When the model keeps getting better on the training data while plateauing on the testing data, we have high variance (a.k.a. overfitting). It is usually caused by the model being too complex, with too many free parameters.
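Overfitting can be demonstrated with a deliberately memorising model. A 1-nearest-neighbour regressor (a simple stand-in for any overly flexible learner) achieves zero error on its noisy training data yet does noticeably worse on data it has not seen:

```python
import random

random.seed(1)

# Noisy samples of the simple relationship y = 2x
data = [(x / 10, 2 * (x / 10) + random.gauss(0, 0.3)) for x in range(40)]
random.shuffle(data)
train, test = data[:30], data[30:]

def knn_predict(x, k):
    """k-nearest-neighbour regression: average the outputs of the k
    training points closest to x. With k=1 the model memorises the
    training data, noise and all."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    return sum(y for _, y in nearest) / k

def mse(dataset, k):
    return sum((knn_predict(x, k) - y) ** 2 for x, y in dataset) / len(dataset)

train_err = mse(train, k=1)  # zero: every point is its own nearest neighbour
test_err = mse(test, k=1)    # noticeably worse on unseen data
print(f"train MSE: {train_err:.4f}, test MSE: {test_err:.4f}")
```

The gap between training and testing error is the signature of high variance; a smoother model (larger k here) would trade some training accuracy for better generalisation.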

Finally, with the right model that has been thoroughly tested and optimised we can begin to apply it to the problem and find the answers we have been looking for.

Applying Machine Learning to Molecular Spin Relaxation

In quantum mechanics, spin is an intrinsic property that all elementary particles (such as electrons, protons, and neutrinos) possess. It is a form of angular momentum, hence the name 'spin'; contrary to the name, however, the particle does not actually rotate around its axis. It is a mind-boggling concept whose depths we shall not plumb in this article.

All particles can be divided into two categories: those with integer spin (bosons) and those with half-integer spin (fermions). Fermions obey the Pauli exclusion principle: no two identical fermions can occupy the same quantum state. Thanks to this property, fermions make up what we would call ordinary matter.

One of the more basic laws of electromagnetism is that an accelerating electric charge creates a magnetic field, so a 'spinning' electron (which has spin ½) produces one. In a material, only the electrons in the outermost (valence) shells contribute to the total magnetisation. For an object to become a magnet, it has to be made of atoms with partially filled valence shells, and the spins of the electrons in those shells must be aligned with each other; otherwise the spins point in random directions and effectively cancel each other out.

To align the spins, we place the material in a very strong external magnetic field that forces the spins out of their equilibrium positions so that they 'point' in the same direction (this is called the Zeeman effect). Once the external field is removed, however, the electrons will try to return to their original randomness. The period of time for which such a non-equilibrium arrangement can be maintained is called the spin lifetime.
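Assuming the return to equilibrium is a simple exponential decay (a common idealisation), the spin lifetime T1 is the time after which only 1/e of the initial alignment remains:

```python
import math

def magnetisation(t, m0=1.0, t1=1.0):
    """Net magnetisation decaying toward equilibrium (taken as zero here)
    after the external field is removed, with spin lifetime t1."""
    return m0 * math.exp(-t / t1)

# After one spin lifetime, only 1/e (about 37%) of the alignment remains
m_at_t1 = magnetisation(1.0)
print(f"M(T1)/M(0) = {m_at_t1:.3f}")
```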

In insulating materials, the spin lifetime is essentially limited by the interaction between spins and lattice vibrations (phonons), namely the spin-phonon coupling. These interactions allow spins to absorb or emit phonons in order to relax back to equilibrium.

Spin-½ systems are a quite simple prototype of a magnetic material, and understanding them is pivotal for rationalising more complex magnetic compounds. To understand from first principles how phonons relax molecular spins, we apply machine learning to a spin-½ system.

A deeper understanding of spin-lattice relaxation could have an impact on several fields, from the efficiency of MRI contrast agents to the coherence time of both molecular and solid-state qubits. A consequence of the latter application is that engineering the spin-lattice interaction is a primary challenge in the field of spin-based quantum computing.

There are many different models explaining spin relaxation in both nuclear and electronic spins, and they are all based on the Redfield equations, which predict how the populations of the various spin levels change in time due to the interaction of the spins with phonons. The main interactions are the hyperfine interaction (denoted by a matrix A), which couples the two spins, and the Zeeman interaction with external magnetic fields, described by the g tensors. These quantities depend on the atomic positions r in a non-trivial way, and in the paper ML was used to interpolate these high-dimensional tensorial functions.
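For concreteness, a spin Hamiltonian containing these two ingredients typically takes the following textbook form (an illustrative standard notation, not necessarily the exact one used in the paper), with an electron Zeeman term plus the hyperfine coupling between the electron spin S and the nuclear spin I:

```latex
\hat{H} \;=\; \mu_B\, \mathbf{B} \cdot \mathbf{g} \cdot \hat{\mathbf{S}}
\;+\; \hat{\mathbf{I}} \cdot \mathbf{A} \cdot \hat{\mathbf{S}}
```

Because g and A depend on the atomic positions r, every lattice vibration modulates these terms, and that modulation is what lets phonons relax the spin.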

The ML model was then used to efficiently scan the spin interactions along all the molecular degrees of freedom and to calculate their numerical derivatives. These derivatives were used as input to predict, among many other quantities, the relaxation time as a function of temperature and magnetic field for one electronic spin coupled to one nuclear spin in the molecule VO(acac)2.

Such machine-learning-accelerated first-principles calculations allow researchers to probe our fundamental knowledge of the laws of nature, to test what we know, to propose new factors that influence physical phenomena, and to bring us a step closer to predicting the future.