Machine Learning in General
Shortly after the invention and popularisation of computers, a new field of science emerged called computational physics, centred on computer simulation and computer modelling. It sparked tremendous progress in materials science and physical chemistry. The groundwork had been laid earlier: in 1929, Paul Dirac proclaimed that the fundamental physical laws for chemistry are ‘completely known’, the difficulty being that the resulting equations are far too complicated to solve exactly. Since then, our understanding of the universe at the most fundamental level has kept growing, and computational methods have been steadily refined.
For example, computer simulations allow scientists to predict the properties of a compound (with reasonable accuracy) before it has been made in the laboratory, or to create extensive databases of the calculated properties of known and hypothetical systems.
Standard computer algorithms, good as they are, have many drawbacks and can only take us so far before a better method is needed. An algorithm is a set of very specific instructions: with a conventional computer algorithm, we tell the machine exactly what to do at every step, so the computer is little more than a calculator. However, for very complicated tasks it is not always possible for a human to spell out every step. So why can’t we give the machine more freedom so that it can develop the needed algorithm, specifically tailored to the task at hand and the data provided? Well, we can! It is called machine learning.
This idea is still very new. However, as the method becomes more and more mature, we are discovering the vast potential that machine learning has – the potential to change and enhance the way we use computer models today.
But what exactly is machine learning?
Machine learning (ML) is a class of computer algorithms that use data to make and improve their predictions.
For a researcher using ML, the most important thing, and the one that sets the limits of their work, is data. The quality and quantity of the available data directly influence the accuracy of future predictions. Before a data set can be used, it may require featurisation, the process of converting raw data into a form that is more meaningful and more suitable for an algorithm. For example, in spectroscopy the signal is often acquired in the time domain but converted to the frequency domain with the Fourier transform for analysis. The more suitable the representation of the data, the better the algorithm will work, although choosing the right representation requires insight into both the underlying physics and the inner workings of the algorithm.
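The Fourier-transform example can be sketched in a few lines of Python with NumPy. This is only an illustration: the signal, its two frequencies (50 Hz and 120 Hz), the noise level, and the sampling rate are all invented, not taken from any real spectrometer.

```python
import numpy as np

# Hypothetical time-domain signal: two sinusoids plus a little noise (invented data).
rng = np.random.default_rng(0)
sample_rate = 1000                      # samples per second (assumed)
t = np.arange(sample_rate) / sample_rate  # exactly one second of samples
signal = np.sin(2 * np.pi * 50 * t) + 0.5 * np.sin(2 * np.pi * 120 * t)
signal += 0.1 * rng.standard_normal(t.size)

# Featurisation step: move from the time domain to the frequency domain.
spectrum = np.abs(np.fft.rfft(signal))
freqs = np.fft.rfftfreq(t.size, d=1.0 / sample_rate)

# The two strongest peaks describe the signal far more compactly
# than the 1000 raw time samples do.
top_two = np.sort(freqs[np.argsort(spectrum)[-2:]])
print(top_two)
```

A learner fed the peak positions has far less work to do than one fed the raw trace, which is the point of choosing a good representation.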
The training of an ML model falls into one of three categories, depending on the type of data used: supervised, semi-supervised, or unsupervised.
In supervised learning, the type most commonly used in the physical sciences, the training data consist of input values and their associated output values. The model is trained so that, given a set of input values it has never seen before, it can predict the output values to an acceptable degree of accuracy.
Unsupervised learning is used if only input values are available. The model then uses said values in an attempt to identify trends, patterns, or clusters in the data.
Semi-supervised learning may be useful when a large set of input values is available but output values are known for only a small portion of them.
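The difference between the supervised and unsupervised settings can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the data are synthetic, the supervised learner is an ordinary least-squares line fit, and the unsupervised learner is a bare-bones two-cluster k-means written by hand.

```python
import numpy as np

rng = np.random.default_rng(1)

# Supervised: inputs x come with known outputs y (invented rule: y = 3x + 2 plus noise).
x = rng.uniform(0, 10, size=50)
y = 3.0 * x + 2.0 + rng.normal(0, 0.5, size=50)
slope, intercept = np.polyfit(x, y, deg=1)
prediction = slope * 12.0 + intercept   # an input the model has never seen

# Unsupervised: only inputs are available, so we look for clusters instead.
points = np.concatenate([rng.normal(0, 0.5, 100), rng.normal(5, 0.5, 100)])
centres = np.array([points.min(), points.max()])  # simple initial guess
for _ in range(20):
    # Assign each point to its nearest centre, then move each centre to the mean.
    labels = np.abs(points[:, None] - centres[None, :]).argmin(axis=1)
    centres = np.array([points[labels == k].mean() for k in range(2)])

print(prediction, centres)
```

In the first half the model is told the right answers and learns to reproduce them; in the second it is given no answers and can only report the structure it finds.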
Once the data has been properly ‘seasoned’, it is time to choose the model. Researchers have developed a wide range of ML model types (a.k.a. learners), each with a specific goal in mind. For example, one type of supervised learning model predicts outputs from a discrete set (i.e. it classifies inputs into discrete categories), while another predicts outputs from a continuous set (e.g. to determine the polarisation of light after it passes through a material).
In certain situations it is useful to combine an ensemble of different algorithms, or of similar algorithms with different internal parameters (hyperparameters). Such approaches are known as ensemble methods; ‘bagging’ and ‘stacking’ are two common examples.
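As a rough sketch of one ensemble technique, the snippet below implements bagging (bootstrap aggregating): the same learner is trained on many random resamples of the training data and the predictions are averaged. The data, the choice of a degree-5 polynomial as the learner, and the number of models are all arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)

# Invented noisy data drawn from a smooth underlying function.
x = np.linspace(0, 1, 40)
y = np.sin(2 * np.pi * x) + rng.normal(0, 0.3, size=x.size)
x_new = np.linspace(0, 1, 200)

# Bagging: fit the same learner on bootstrap resamples of the training set,
# then average the individual predictions.
n_models = 25
predictions = []
for _ in range(n_models):
    idx = rng.integers(0, x.size, size=x.size)   # sample with replacement
    coeffs = np.polyfit(x[idx], y[idx], deg=5)
    predictions.append(np.polyval(coeffs, x_new))

ensemble_prediction = np.mean(predictions, axis=0)
```

Stacking differs in that the outputs of several different learners are combined by a further model rather than by a plain average.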
Popular types of learners are artificial neural networks and deep neural networks, which are loosely inspired by the operation of the human brain. Artificial neurons are the processing units of the model: each receives inputs from other neurons and uses them to perform a simple calculation. The input from each neuron is weighted differently, and the learning process adjusts these weights so that the prediction is as accurate as possible.
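To make the idea of weighted inputs concrete, here is a minimal sketch of a single artificial neuron trained by gradient descent. The task (reproducing a logical OR), the sigmoid activation, the learning rate, and the number of iterations are assumptions chosen for brevity; real networks chain many such units together.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Four input pairs and the target output of a logical OR (a toy task).
inputs = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
targets = np.array([0, 1, 1, 1], dtype=float)

weights = np.zeros(2)
bias = 0.0
learning_rate = 1.0   # a hyperparameter, chosen by hand here

for _ in range(2000):
    outputs = sigmoid(inputs @ weights + bias)   # weighted sum, then activation
    error = outputs - targets
    # Gradient of the cross-entropy loss with respect to the weights and bias.
    weights -= learning_rate * inputs.T @ error / len(targets)
    bias -= learning_rate * error.mean()

predictions = (sigmoid(inputs @ weights + bias) > 0.5).astype(int)
print(predictions)
```

Learning here is nothing more than nudging the two weights and the bias until the neuron's output matches the targets.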
Regardless of the model used, most learners are not fully autonomous and require the values of their hyperparameters to be set beforehand, often using heuristics. Since even slight variations in the hyperparameters can cause significant changes in the speed of the learning process, and selecting optimal values is time-consuming and difficult, the development of automatic optimisation algorithms is being actively pursued.
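How strongly a hyperparameter can affect the speed of learning is easy to see on a toy problem. The sketch below minimises the invented function f(w) = w² by gradient descent and counts the steps needed for several learning rates; the function, tolerance, and step limit are illustrative assumptions.

```python
def steps_to_converge(learning_rate, tolerance=1e-6, max_steps=10_000):
    """Minimise f(w) = w**2 by gradient descent and count the steps needed."""
    w = 1.0
    for step in range(1, max_steps + 1):
        w -= learning_rate * 2.0 * w   # the gradient of w**2 is 2w
        if abs(w) < tolerance:
            return step
    return None                        # did not converge within max_steps

# A crude manual scan over one hyperparameter.
for lr in (0.01, 0.1, 0.4, 1.01):
    print(lr, steps_to_converge(lr))
```

On this problem a larger learning rate converges in fewer steps up to a point, while a value only slightly above 1 makes the iteration diverge, which is why hand-tuning is tedious and automatic tuning is attractive.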
After choosing a model, or several, it is important to evaluate each one, both to optimise it and, when there is more than one candidate, to select the best. The evaluation should be done on a sample of data the model has not seen during training. This is crucial: otherwise the model might simply memorise the answers to the training set and fail to give the right output when faced with new data. A common split is 70-80% of the entire data set for training and 20-30% for testing.
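A hold-out evaluation of this kind might look as follows. The data set is synthetic and the model is a simple straight-line fit; the 80/20 split is one choice within the range quoted above.

```python
import numpy as np

rng = np.random.default_rng(3)

# Invented data set: 100 samples of a noisy linear relationship.
x = rng.uniform(0, 10, size=100)
y = 2.0 * x + 1.0 + rng.normal(0, 1.0, size=100)

# Shuffle the indices, then hold out 20% of the samples for testing.
indices = rng.permutation(x.size)
n_train = int(0.8 * x.size)
train_idx, test_idx = indices[:n_train], indices[n_train:]

# Fit on the training portion only.
coeffs = np.polyfit(x[train_idx], y[train_idx], deg=1)

# Evaluate on data the model has not seen.
residuals = np.polyval(coeffs, x[test_idx]) - y[test_idx]
test_rmse = float(np.sqrt(np.mean(residuals ** 2)))
print(test_rmse)
```

Shuffling before splitting matters whenever the data are stored in some systematic order.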
Much as we would like it to be, ML is not perfect. There are three fundamental sources of error in models: model bias, model variance, and irreducible error.
Bias arises when the algorithm is based on wrong assumptions and, as a consequence, misses underlying relationships. High bias (a.k.a. underfitting) occurs when the model is too rigid or when the data is insufficiently detailed; in both cases the model cannot describe the relationships between inputs and outputs.
Variance arises when the model is too sensitive to the training data, which may contain noise, measurement errors, outliers, and/or missing values. When the model keeps improving on the training data while its performance on the testing data plateaus or worsens, we have high variance (a.k.a. overfitting). It is usually caused by a model that is too complex, with a large number of adjustable parameters.
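Underfitting and overfitting can be demonstrated by fitting polynomials of increasing degree to the same noisy data. The target function, noise level, sample sizes, and degrees below are arbitrary assumptions; the exact numbers printed depend on the random seed.

```python
import numpy as np

rng = np.random.default_rng(4)

def target(x):
    return np.sin(2 * np.pi * x)

# Small noisy training set and a larger test set from the same invented source.
x_train = rng.uniform(0, 1, size=20)
y_train = target(x_train) + rng.normal(0, 0.2, size=20)
x_test = rng.uniform(0, 1, size=200)
y_test = target(x_test) + rng.normal(0, 0.2, size=200)

def mse(coeffs, x, y):
    return float(np.mean((np.polyval(coeffs, x) - y) ** 2))

errors = {}
for degree in (1, 4, 10):
    coeffs = np.polyfit(x_train, y_train, deg=degree)
    # Store (training error, testing error) for each model complexity.
    errors[degree] = (mse(coeffs, x_train, y_train), mse(coeffs, x_test, y_test))
    print(degree, errors[degree])
```

The straight line is too rigid to follow the curve (high bias), so both of its errors are large. The training error keeps falling as the degree grows, whereas the testing error typically stops improving and often rises for the most flexible model (high variance).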
Finally, with the right model, thoroughly tested and optimised, we can begin to apply it to the problem and find the answers we have been looking for.
Applying Machine Learning to Molecular Spin Relaxation
In quantum mechanics, spin is an intrinsic property of particles such as electrons, protons, and neutrinos. It is a form of angular momentum, hence the name ‘spin’; contrary to what the name suggests, however, the particle does not actually rotate around its axis. It is a mind-boggling concept whose depths we shall not explore in this article.
All particles can be divided into two categories: those with integer spin (bosons) and those with half-integer spin (fermions). Fermions obey the Pauli exclusion principle, which states that no two identical fermions can occupy the same quantum state. Thanks to this property, fermions make up what we call ordinary matter.
One of the basic laws of electromagnetism is that a moving electric charge creates a magnetic field. By analogy, a ‘spinning’ electron (which has spin ½) produces a magnetic field. In a material, only the electrons in the outermost (valence) shells of its atoms contribute to the total magnetisation. For an object to become a magnet, it has to be made of atoms with a partially filled valence shell. Once the spins of the electrons in those valence shells are aligned with each other (otherwise they would point in random directions, effectively cancelling each other out), we have a magnet.
To align the spins, we place the material in a very strong external magnetic field, which forces the spins out of their equilibrium arrangement to ‘point’ in the same direction (the coupling between a spin and the field is the Zeeman interaction). However, once the external field is removed, the spins tend to return to their original randomness. The length of time for which such a non-equilibrium arrangement can be maintained is called the spin lifetime.
In insulating materials, the spin lifetime is essentially limited by the interaction between spins and lattice vibrations (phonons), known as spin-phonon coupling. These interactions allow spins to absorb or emit phonons in order to relax back to equilibrium.
Spin ½ systems are a fairly simple prototype of a magnetic material, and understanding them is pivotal for rationalising more complex magnetic compounds. To understand from first principles how phonons relax molecular spins, we apply machine learning to a spin ½ system.
A deeper understanding of spin-lattice relaxation may have an impact on several fields, for example by improving the efficiency of MRI contrast agents or extending the coherence time of both molecular and solid-state qubits. For the latter in particular, engineering the spin-lattice interaction is a primary challenge in spin-based quantum computing.
There are many different models of spin relaxation for both nuclear and electronic spins, all based on the Redfield equations, which predict how the populations of the various spin levels change in time due to the interaction of the spin with phonons. The main interactions are the hyperfine interaction (described by a tensor A), which couples the electronic and nuclear spins, and the Zeeman interaction with external magnetic fields (described by the tensor g). These quantities depend on the atomic positions r in a non-trivial way and, in the paper, ML was used to interpolate these high-dimensional tensorial functions.
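For orientation, the two interactions are commonly collected into a spin Hamiltonian of the following textbook form. This is a sketch under assumptions, not necessarily the exact expression used in the paper: the symbols B (external magnetic field), S (electronic spin operator), I (nuclear spin operator), and the Bohr magneton μ_B are introduced here for illustration, and further terms may be present in the full treatment.

```latex
\hat{H}(\mathbf{r}) = \mu_B \, \mathbf{B} \cdot \mathbf{g}(\mathbf{r}) \cdot \hat{\mathbf{S}}
                    + \hat{\mathbf{S}} \cdot \mathbf{A}(\mathbf{r}) \cdot \hat{\mathbf{I}}
```

The first term is the Zeeman interaction and the second is the hyperfine coupling; the dependence of g and A on the atomic positions r is what allows lattice vibrations to act on the spin.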
The ML model was then used to efficiently scan the spin interactions along all the molecular degrees of freedom and to calculate their numerical derivatives. These derivatives were used as input to predict, among many other quantities, the relaxation time as a function of temperature and magnetic field for one electronic spin coupled to one nuclear spin in the molecule VO(acac)2.
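Numerical derivatives of a fast surrogate model can be obtained by finite differences, as in the sketch below. It is not the procedure from the paper: `surrogate_g` is a hypothetical analytic stand-in for a trained ML model, and the coordinates and step size are invented.

```python
import numpy as np

def surrogate_g(r):
    """Hypothetical stand-in for a trained ML model that maps atomic
    coordinates to one component of a spin-interaction tensor."""
    return 2.0 + 0.1 * np.sum(r ** 2) + 0.05 * r[0] * r[1]

def numerical_gradient(f, r, step=1e-4):
    """Central finite-difference derivative of f along every coordinate."""
    grad = np.zeros_like(r)
    for i in range(r.size):
        shift = np.zeros_like(r)
        shift[i] = step
        grad[i] = (f(r + shift) - f(r - shift)) / (2.0 * step)
    return grad

r0 = np.array([0.5, -0.2, 1.0])   # invented displaced coordinates
gradient = numerical_gradient(surrogate_g, r0)
print(gradient)
```

Each derivative costs two model evaluations per coordinate, which is affordable with a cheap ML surrogate but would be expensive if every evaluation were a full first-principles calculation.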
Such machine-learning-accelerated first-principles calculations allow researchers to probe our fundamental knowledge of the laws of nature, to test what we know, to propose new factors that influence physical phenomena, and to bring us a step closer to predicting the future.