I. Background
When the 2024 Nobel Prize in Physics was awarded to John Hopfield and Geoffrey Hinton, much of the physics community was taken aback and asked: who are Hopfield and Hinton?
On October 8, the Royal Swedish Academy of Sciences awarded the Nobel Prize in Physics to John J. Hopfield and Geoffrey E. Hinton, which came as a surprise to many. The two are best known as experts in computing and seemed to have little to do with traditional physics research. Their work is undoubtedly great, but on the surface it does not directly advance traditional physics; rather, it used the theory of physics to indirectly advance computer science.
I know that my own knowledge is limited and that I am not very familiar with the research areas of these two luminaries. Perhaps they have made epoch-making contributions to fundamental and applied physics, much as physicists like Yang Zhenning (C. N. Yang) and Li Zhengdao (T. D. Lee) influenced the course of the history of physics. With an attitude of learning and exploration, I would like to look into this physics prize and these two computer experts.
II. Life and achievements
John J. Hopfield, a renowned American physicist whose career has spanned physics and the life sciences, was born in 1933 in Chicago, Illinois. He excelled academically, earning a bachelor's degree from Swarthmore College in 1954 and a Ph.D. in physics from Cornell University in 1958.
He has had an equally illustrious teaching career, holding faculty positions at several leading institutions, including the University of California, Berkeley. He now holds the title of Howard A. Prior Professor of Molecular Biology, Emeritus, a tribute to his contributions to academia. Hopfield was also involved in founding the Ph.D. program in Computation and Neural Systems at Caltech in 1986, which reflects his vision and effort in advancing science education.
In 1982, John Hopfield published his seminal paper "Neural Networks and Physical Systems with Emergent Collective Computational Abilities." In it, he skillfully brought the concept of dynamics from physics into the design of neural networks. This innovative approach not only provided new perspectives on pattern recognition problems but also yielded approximate solutions to a class of complex combinatorial optimization problems. Because of the great influence of this work, the network was later affectionately called the "Hopfield network."
The name "Hopfield network" may sound a bit technical, but the idea is actually quite elegant: a recurrent neural network that acts as a content-addressable memory, storing patterns as stable states of binary units. It is inspired by human memory and tries to mimic the way our brain stores and recalls information.
Another special feature of this network is that it is recurrent: its output feeds back into its input, forming a loop. Every neuron is connected to every other neuron, like a giant interconnected web, which is why it is sometimes called a fully connected network.
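The store-and-recall mechanism described above can be sketched in a few lines of Python. This is a minimal illustration, not Hopfield's original formulation: weights are set by a simple Hebbian rule, and neurons are updated asynchronously until the state settles into a stored pattern.

```python
import numpy as np

def train_hopfield(patterns):
    """Hebbian learning: W is the sum of outer products of the stored patterns."""
    n = patterns.shape[1]
    W = np.zeros((n, n))
    for p in patterns:
        W += np.outer(p, p)
    np.fill_diagonal(W, 0)  # no self-connections
    return W / patterns.shape[0]

def recall(W, state, steps=10):
    """Asynchronous updates: each neuron flips to match the sign of its input."""
    state = state.copy()
    for _ in range(steps):
        for i in np.random.permutation(len(state)):
            h = W[i] @ state
            state[i] = 1 if h >= 0 else -1
    return state

# Store one pattern and recover it from a corrupted copy.
pattern = np.array([1, 1, -1, -1, 1, -1, 1, 1])
W = train_hopfield(pattern[None, :])
noisy = pattern.copy()
noisy[0] = -noisy[0]  # flip one bit
print(np.array_equal(recall(W, noisy), pattern))  # True: the memory is restored
```

Starting from the corrupted state, each update lowers the network's energy until it settles into the stored pattern, which is exactly the "recall from partial information" the text describes.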
This research by Hopfield is more than just a theoretical breakthrough; it provides us with a whole new perspective on how the brain works. His work has fueled early developments in the field of neural networks and has given us a deeper understanding of the complexity of the brain. In short, the Hopfield network is like a model for how our brain works, helping us understand ourselves better.
He also has a long history of collaboration with Bell Labs. In 1987, Bell Labs made a breakthrough by developing a new type of neural network chip based on the principles of Hopfield networks. This was not only a technical achievement but also a cornerstone in the development of modern artificial intelligence.
There is another noteworthy award on Hopfield's honor roll: the 2022 Boltzmann Medal. Named after the famous physicist Ludwig Boltzmann, this award has been given every three years since 1975 to recognize scientists who have not yet won a Nobel Prize, and each scientist can win it only once. Hopfield shared the honor with another scientist, a recognition not only of his personal achievements but also of his contributions to the field of neural networks.
Geoffrey Hinton, a scientist somewhat younger than John Hopfield, was born in 1947 into an extraordinary family of academics in the United Kingdom. Many world-renowned scholars have come from this family: his great-great-grandfather was the mathematician George Boole, whose Boolean algebra in the 19th century laid the foundation of modern computer mathematics; an aunt proposed the concept of "gross national product" in economics; a cousin was a nuclear physicist who worked on the Manhattan Project; and his father was an entomologist and Fellow of the Royal Society.
Hinton's own story is quite legendary. After graduating from high school, he followed his family's tradition and enrolled at King's College, Cambridge. During his years there, however, Hinton wandered among disciplines, including mathematics, physics, chemistry, biology, and philosophy, in search of his academic direction; he eventually chose experimental psychology as the subject of his undergraduate thesis.
Hinton's confusion did not dissipate after college. For a while he worked as a carpenter, making cabinets, shelves, and doors by hand, but it was not enough to make ends meet. In 1972, at the age of 25, Hinton enrolled at the University of Edinburgh, where he began his journey into neural networks. Interestingly, at that time he had not even heard of John Hopfield. His advisor met with him only once a week and often urged him to give up: "Studying machine learning? You're wasting your time."
By 1993, Hinton had hit a low point in his life. His wife had passed away, and one of his two sons had been diagnosed with attention deficit hyperactivity disorder (ADHD). At the same time, neural network research faced a bottleneck: the scientific community had not yet accepted neural networks as the mainstream direction for artificial intelligence. Hinton once said, self-deprecatingly, that by 46 he felt "dead in the water." He thought at the time that his research might only be recognized, and achieve a breakthrough, a hundred years after his death. A related anecdote is well known in the AI world: in 2009, Hinton found in his experiments that NVIDIA's GPU chips were very well suited to running neural networks, but the chips were too expensive for him to buy, so he emailed NVIDIA hoping they would give him one for free for his research. He never received a reply. Whether Jen-Hsun Huang knew about this is unclear; if he did, he might regret it slightly, because three years later Hinton's research made a huge breakthrough.
Geoffrey Hinton was a professor in the Department of Computer Science at the University of Toronto from 2001 to 2014. In 2012, he and two of his students, Alex Krizhevsky and Ilya Sutskever, developed an eight-layer neural network called AlexNet, named after Alex. The network was an instant sensation, winning that year's ImageNet Large Scale Visual Recognition Challenge. It performed so well that the organizers at one point wondered whether the team had cheated, because AlexNet's image recognition accuracy was a huge jump over the second-place finisher.
The following year, Hinton sold his startup to Google and became a vice president there; he left that position in 2023. In 2018, Hinton, together with Yann LeCun and Yoshua Bengio, won the Turing Award, the highest honor in computing.
When the Nobel Prize in Physics was bestowed on Hinton, it thrust him into the global spotlight once more. The scientist known as the "Godfather of AI" was reportedly at a budget hotel in California, preparing for an MRI scan, when he received the call. Hinton hopes that winning the Nobel Prize will give his words more weight and make people pay more attention to the AI safety issues he has been emphasizing. His voice, now more than ever, needs to be heard.
In the world of artificial intelligence, these two scientists are not only masters of the field but are widely recognized as its founders; Geoffrey Hinton in particular is honored as the "Godfather of AI" for his far-reaching influence. It is worth mentioning that Ilya Sutskever, the former chief scientist of OpenAI, is one of Hinton's students.
So what exactly did these two researchers do that was so groundbreaking? Hopfield and Hinton used the tools of statistical physics to develop new approaches to artificial neural networks. Their research drew on several deep areas of physics, including the Boltzmann distribution, spin-glass models, energy functions, and the principle of least action. These concepts may sound abstract, but it was this fundamental research that laid a solid theoretical foundation for the development of artificial intelligence.
III. Relevance of the awarded results to physics
The Nobel Committee for Physics' citation for John Hopfield and Geoffrey Hinton was succinct but profound: using the tools of physics, they made foundational discoveries and inventions that enable machine learning with artificial neural networks, methods that are now revolutionizing science, engineering, and our daily lives. "Foundational" here implies that their contributions were not only pioneering but also far-reaching, and that without the tools of physics these methods would not have been possible. In that light, awarding the Nobel Prize in Physics to these two AI researchers is well deserved.
So what were their specific findings? Hopfield and Hinton developed artificial neural network methods using the tools of statistical physics, touching on areas such as the Boltzmann distribution, spin-glass models, energy functions, and the principle of least action. Hopfield's contribution was a new structure for storing and reconstructing information, the Hopfield network, which can store and reconstruct images and other patterns in data; its mechanism resembles the way the brain recalls a word or concept from related information. Hinton, for his part, invented methods that discover properties of data autonomously, such as the Boltzmann machine, an approach that proved invaluable for the development of today's large artificial neural networks.
John Hopfield's outstanding contribution is the Hopfield network, an innovation inspired by a physical model known as the Ising model. Named after the physicist Ernst Ising, it is a mathematical model in statistical mechanics designed to describe how matter exhibits ferromagnetism.
Let's talk about the Ising model. It was originally proposed by the German physicist Wilhelm Lenz in 1920 and later developed further by his student Ernst Ising. The model depicts, in a simplified mathematical form, how the atomic spins (magnetic moments) in a ferromagnetic material interact with one another and how this behavior varies with temperature.
In the Ising model, the spin of each atom is regarded as a variable that can take the value +1 or -1, representing spin up or spin down. The interaction between neighboring atoms can be described by a parameter J. If J is positive, neighboring atoms tend to have the same spin direction (ferromagnetic behavior); if J is negative, neighboring atoms tend to have opposite spin directions (antiferromagnetic behavior). In addition, the model can include an external magnetic field H that affects the arrangement of the atomic spins.
The core of the Ising model is its Hamiltonian, which describes the energy of the system. For a given spin configuration, the Hamiltonian gives the total energy, which depends on the interactions between spins and on the influence of the external magnetic field. The goal of the model is to find the most probable spin configurations at a given temperature, i.e., those with the lowest energy or those characteristic of thermal equilibrium.
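The Hamiltonian described above can be written down directly. The following toy Python sketch computes the energy of a spin configuration on a small 2D lattice with periodic boundaries, counting each nearest-neighbour bond once; the lattice size is arbitrary.

```python
import numpy as np

def ising_energy(spins, J=1.0, H=0.0):
    """Ising Hamiltonian E = -J * sum_<i,j> s_i s_j - H * sum_i s_i
    on a 2D lattice with periodic boundaries; <i,j> are nearest neighbours."""
    # Each bond is counted once: pair every site with its neighbour below and to the right.
    interaction = (np.sum(spins * np.roll(spins, 1, axis=0))
                   + np.sum(spins * np.roll(spins, 1, axis=1)))
    return -J * interaction - H * np.sum(spins)

# All spins aligned: the ferromagnetic ground state for J > 0.
aligned = np.ones((4, 4), dtype=int)
print(ising_energy(aligned))  # -32.0: 2 bonds per site, 16 sites, each contributing -J
```

For positive J, any flipped spin raises the energy, which is exactly why aligned (ferromagnetic) configurations dominate at low temperature.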
An important application of the Ising model is the study of phase transitions: on two- and three-dimensional lattices, the model exhibits a second-order phase transition at a specific critical temperature. At this point the magnetism of the system disappears, i.e., it changes from an ordered ferromagnetic state to a disordered paramagnetic state. The transition can be described through critical exponents and critical phenomena common to many complex systems, such as the gas-liquid transition, turbulence, and even stock markets and economic systems.
In addition to its applications in physics, the Ising model is widely used in other fields, such as social sciences, biology, and computer science. For example, in social sciences, the Ising model can be used to simulate the propagation and evolution of social views; in biology, it can simulate signaling networks within cells; and in computer science, the principles of the Ising model are used in the design of neural networks, such as the Hopfield network, an artificial neural network capable of storing and recalling information.
Analytic solutions of the Ising model are known in one and two dimensions; the analytic solution of the two-dimensional model was given by Lars Onsager in 1944. In three and higher dimensions no analytic solution has been found, and the model is usually studied by numerical simulation, notably the Monte Carlo method, a statistical sampling technique used to model the thermodynamic behavior of the system.
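The Monte Carlo approach just mentioned can be illustrated with the standard Metropolis algorithm. The sketch below is a toy single-spin-flip version for a small 2D lattice, not production simulation code; the lattice size, temperature, and sweep count are chosen only for illustration.

```python
import numpy as np

def metropolis_sweep(spins, beta, J=1.0, rng=None):
    """One Metropolis sweep over a 2D Ising lattice with periodic boundaries.
    A proposed single-spin flip is accepted with probability min(1, exp(-beta * dE))."""
    if rng is None:
        rng = np.random.default_rng(0)
    n, m = spins.shape
    for _ in range(n * m):
        i, j = rng.integers(n), rng.integers(m)
        # Energy change of flipping spin (i, j): dE = 2 * J * s_ij * (sum of its 4 neighbours)
        nb = (spins[(i + 1) % n, j] + spins[(i - 1) % n, j]
              + spins[i, (j + 1) % m] + spins[i, (j - 1) % m])
        dE = 2.0 * J * spins[i, j] * nb
        if dE <= 0 or rng.random() < np.exp(-beta * dE):
            spins[i, j] = -spins[i, j]
    return spins

def lattice_energy(spins, J=1.0):
    """Total energy, counting each nearest-neighbour bond once."""
    return -J * (np.sum(spins * np.roll(spins, 1, axis=0))
                 + np.sum(spins * np.roll(spins, 1, axis=1)))

# Quench a random configuration at low temperature: ordered domains form
# and the energy drops toward the ferromagnetic ground state.
rng = np.random.default_rng(42)
spins = rng.choice([-1, 1], size=(16, 16))
e_start = lattice_energy(spins)
for _ in range(100):
    metropolis_sweep(spins, beta=1.0, rng=rng)
print(lattice_energy(spins) < e_start)  # True: cooling lowers the energy
```

The acceptance rule samples the Boltzmann distribution, which is precisely the physics that Hopfield networks and Boltzmann machines later borrowed.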
Building on John Hopfield's findings and using the tools of statistical physics, Geoffrey Hinton developed the famous Boltzmann machine, a stochastic neural network that he co-invented with Terry Sejnowski in 1985. Its design was inspired by statistical mechanics, and it was named after the Austrian physicist Ludwig Boltzmann in honor of his significant contributions to that field. The Boltzmann machine works on the basis of these principles of physics.
A Boltzmann machine consists of interconnected units, which are similar to neurons, that randomly decide whether to activate or not, i.e., turn on or off. Each connection in the network has a weight associated with it, and this weight determines the strength and sign of the connection. In a Boltzmann machine, there are two types of units: visible units, which are used to input and output data, and hidden units, which are used to capture structural features of the data. In this way, the Boltzmann machine is able to learn complex patterns in the data and generate new data instances that have similar characteristics to the training data.
Boltzmann machines can in theory learn to represent any distribution given enough hidden units, but training is computationally expensive because of the fully connected structure, and the MCMC sampling involved can converge slowly. To address these challenges, a variant called the Restricted Boltzmann Machine (RBM) is often used: it restricts the network so that connections run only between visible and hidden units, with no visible-visible or hidden-hidden connections. This restriction permits more efficient training algorithms and typically faster convergence.
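The restricted structure makes a standard training shortcut, one-step contrastive divergence (CD-1), practical. The toy Python sketch below trains a tiny RBM on two repeating binary patterns; the layer sizes, learning rate, and epoch count are invented for illustration and not tuned.

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

# Toy binary data: two repeating 6-bit patterns.
data = np.array([[1, 1, 1, 0, 0, 0],
                 [0, 0, 0, 1, 1, 1]] * 20)

n_visible, n_hidden, lr = 6, 4, 0.1
W = 0.01 * rng.standard_normal((n_visible, n_hidden))
a = np.zeros(n_visible)  # visible biases
b = np.zeros(n_hidden)   # hidden biases

for epoch in range(200):
    for v0 in data:
        # Positive phase: sample hidden units given the data.
        ph0 = sigmoid(v0 @ W + b)
        h0 = (rng.random(n_hidden) < ph0).astype(float)
        # Negative phase (CD-1): one reconstruction step instead of full MCMC.
        pv1 = sigmoid(h0 @ W.T + a)
        v1 = (rng.random(n_visible) < pv1).astype(float)
        ph1 = sigmoid(v1 @ W + b)
        # Gradient step on the contrastive-divergence approximation.
        W += lr * (np.outer(v0, ph0) - np.outer(v1, ph1))
        a += lr * (v0 - v1)
        b += lr * (ph0 - ph1)

# The trained RBM should reconstruct a stored pattern from itself.
v = data[0]
recon = sigmoid(sigmoid(v @ W + b) @ W.T + a)
print(np.round(recon, 2))
```

CD-1 replaces the slow MCMC chain of the full Boltzmann machine with a single reconstruction step, which is exactly the efficiency gain the restriction buys.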
Boltzmann machines are of great importance in the field of machine learning, and although they may not be as popular in practice as other models (e.g., deep neural networks), they have played a fundamental role in the development of deep learning and generative models.
The Boltzmann machine is a versatile player in the field of deep learning, appearing in a wide variety of application scenarios:
- Image Recognition and Processing: Imagine a Boltzmann machine as an artist that not only recognizes and classifies pictures, but also detects objects in images and even recognizes faces. It has also come into its own in the field of medical image analysis, helping to detect diseases and segment tissues.
- Natural language processing (NLP): The Boltzmann machine joins forces with other neural network structures to handle tasks such as text categorization, sentiment analysis, and machine translation. Its ability to understand and generate language provides powerful support for processing complex text.
- Recommender systems: The generative modeling properties of Boltzmann machines make them shine in recommender systems. By learning latent relationships between users and items, they can generate personalized recommendation lists, improving recommendation accuracy and user satisfaction.
- Speech recognition: In speech recognition, the Boltzmann machine can extract features of sound signals and combine them with other models, such as Hidden Markov Models (HMMs). Its robustness in complex acoustic environments gives it a significant advantage in this field.
- Unsupervised learning and anomaly detection: The unsupervised learning capability of the Boltzmann machine allows it to excel at tasks such as clustering and anomaly detection. Especially when data labels are missing or scarce, it can extract useful information and discover latent structures or anomalous patterns in the data.
- Drug Discovery and Bioinformatics: In the fields of drug discovery and bioinformatics, the Boltzmann machine is capable of predicting the biological activity of drugs, discovering new drug targets, and so on. Its ability to process high-dimensional data provides an effective means to analyze complex biological systems.
- Deep Belief Networks (DBN): DBNs are generative models built by stacking multiple layers of Restricted Boltzmann Machines (RBMs) to capture high-level abstract features in data. DBNs use unsupervised pre-training to train the model layer by layer, a strategy that makes training more stable and efficient and is especially suited to high-dimensional and unlabeled data.
- Deep Boltzmann Machines (DBM): A DBM is an extension of the Boltzmann machine that contains multiple hidden layers capable of learning complex hierarchies in data. The DBM pre-trains each layer through unsupervised learning and is then fine-tuned through supervised learning to optimize performance on a particular task.
- Restricted Boltzmann Machines (RBM): An RBM is a generative stochastic neural network consisting of two fully connected layers of neurons, a visible layer and a hidden layer. Connections in an RBM are undirected, i.e., symmetric, and there are no connections between neurons in the same layer. RBMs are widely used for tasks such as feature learning, dimensionality reduction, and classification.
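The greedy layer-by-layer pre-training described in the DBN entry can be sketched as: train an RBM on the data, feed its hidden activations upward as "data" for the next RBM, and repeat. Below is a toy Python illustration using CD-1 for each layer; the layer sizes, learning rate, and data are arbitrary choices for the sketch.

```python
import numpy as np

rng = np.random.default_rng(1)
sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

def train_rbm(data, n_hidden, epochs=100, lr=0.1):
    """Train one RBM with CD-1; return its weights and hidden biases."""
    n_visible = data.shape[1]
    W = 0.01 * rng.standard_normal((n_visible, n_hidden))
    a, b = np.zeros(n_visible), np.zeros(n_hidden)
    for _ in range(epochs):
        for v0 in data:
            ph0 = sigmoid(v0 @ W + b)
            h0 = (rng.random(n_hidden) < ph0).astype(float)
            v1 = (rng.random(n_visible) < sigmoid(h0 @ W.T + a)).astype(float)
            ph1 = sigmoid(v1 @ W + b)
            W += lr * (np.outer(v0, ph0) - np.outer(v1, ph1))
            a += lr * (v0 - v1)
            b += lr * (ph0 - ph1)
    return W, b

# Greedy layer-by-layer pre-training: each RBM's hidden activations
# become the "data" for the next RBM in the stack.
data = np.array([[1, 1, 1, 0, 0, 0], [0, 0, 0, 1, 1, 1]] * 10)
layers, x = [], data.astype(float)
for n_hidden in (4, 2):
    W, b = train_rbm(x, n_hidden)
    layers.append((W, b))
    x = sigmoid(x @ W + b)  # propagate up to the next layer

print([W.shape for W, _ in layers])  # [(6, 4), (4, 2)]
```

Each layer learns features of the layer below, which is the "layer-by-layer learning strategy" that made deep networks trainable before modern end-to-end methods.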
The Nobel committee presumably wanted to show how the ideas of physics can inspire the development of deep learning; the justification for the award is that Hopfield and Hinton used ideas from physics to drive interest and progress in deep learning. It is a bit like the table tennis player Ma Long winning the Olympics and the medal going to his coach, because the coaching helped him win the match. It sounds a little unbelievable, but on reflection there is truth in it: an athlete's success cannot be separated from his coach's training and influence. It would, however, be truly outrageous if the medal went to the footballer Cristiano Ronaldo simply because Ma Long often watched his matches and drew inspiration from them.
Many have expressed doubts about the 2024 Nobel Prize in Physics being awarded to the field of artificial intelligence, which does not seem directly related to physics. They believe the prize would more appropriately go to veteran scientists who have made direct contributions to traditional physics. Such award criteria may disappoint researchers who have toiled in the traditional physics world; after all, the Nobel Prize is their highest honor and dream. Machine learning and neural networks follow a research paradigm completely different from traditional physics, and the award did come as a surprise to many physicists.
Traditional physics research emphasizes rigor: award-winning results usually require a solid theoretical foundation, extensive applications, and validated results, all three being indispensable. Machine learning, by contrast, tends to be less theoretically explicit about its approximations, such as which truncations it performs or which channels it selects, which makes its intrinsic mechanisms hard to analyze; the results often resemble a black-box operation. This award might have been more convincing if the validity of machine learning applications in physics had first been rigorously derived and demonstrated theoretically.
The essence of AI is mathematics: statistics and probability. The output of generative AI is essentially a "guess." When you ask an AI a question, the large model matches relevant content in its massive training data (this is what deep learning has distilled), assigns probabilities to candidate continuations, and combines the highest-scoring words and phrases into an answer. We must recognize, however, that this "guess" is not random but a reasoned inference, relying on the mathematical model behind it and on data mining and analysis. Today's AI rests on probability, statistics, and black-box fitting to large datasets; at its root, AI is closer to mathematics than to physics.
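The probabilistic "guess" described above can be illustrated with a toy softmax over candidate words. Real large models are vastly more complex; the vocabulary and scores below are invented purely for illustration.

```python
import numpy as np

def softmax(scores, temperature=1.0):
    """Turn raw scores into a probability distribution over candidates."""
    z = np.asarray(scores, dtype=float) / temperature
    z -= z.max()  # subtract the max for numerical stability
    p = np.exp(z)
    return p / p.sum()

# Hypothetical scores a model might assign to candidate next words.
vocab = ["physics", "prize", "banana", "network"]
scores = [2.0, 1.5, -3.0, 1.0]
probs = softmax(scores)

rng = np.random.default_rng(0)
word = rng.choice(vocab, p=probs)  # high-probability words are chosen most often
print(dict(zip(vocab, np.round(probs, 2))), word)
```

The sampling is random but far from arbitrary: "banana" is almost never chosen, which is the sense in which the model's guess is a reasoned inference rather than a coin flip.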
IV. Gains and losses of AI
The development of artificial intelligence is undoubtedly a powerful driver of social change, and its impact is far-reaching. The large models recently launched by OpenAI in particular have let us witness AI's potential. Through automation and intelligent systems, AI has greatly improved efficiency across industries: in manufacturing it replaces human labor in repetitive tasks and reduces error rates; in the service industry, chatbots provide customer service around the clock; and in healthcare, AI-assisted diagnostic systems can identify diseases quickly and accurately, shortening diagnosis and treatment times.
AI is also capable of intelligent decision support: by analyzing large amounts of data and making predictions, it provides insights that humans may not recognize, leading to better decisions and problem solving in areas such as finance, marketing, and healthcare. Its ability to process and analyze large volumes of data, such as medical records or customer information, is another major strength, helping to identify patterns and trends that may not be immediately apparent to humans.
AI can also come in handy in healthcare and medical research, assisting doctors in diagnosing diseases, developing personalized treatment plans and identifying potential health risks, as well as helping medical research by analyzing large amounts of data and identifying new connections. In addition, AI can perform high-risk tasks, such as exploring deep space, handling hazardous materials, and searching for survivors in disaster zones, that may be too dangerous or difficult for humans.
The development of AI has also advanced scientific research, with applications in fields such as materials science, weather forecasting, and the classification of genetic mutations, demonstrating the critical role of AI in solving complex scientific problems.
However, the development of AI also brings challenges and drawbacks. Advances in AI may lead machines to replace some traditional jobs, especially those involving data processing and standardized operations, which may exacerbate social inequality. AI also poses a huge challenge to personal privacy and information security: because it relies on large amounts of data for learning and optimization, improperly collected, analyzed, or used data may lead to leaks of personal information and even jeopardize national security and social stability.
AI also puts humans in an ethical dilemma. As AI begins to exercise a degree of autonomy in decision-making, we have to face questions of "machine morality," such as the ethical choices a driverless vehicle must make in an accident. AI systems may also perpetuate or even amplify existing social biases, leading to discriminatory outcomes; this is particularly worrying in areas such as criminal justice, lending, and recruitment.
Heavy reliance on AI can erode critical thinking and decision-making skills. The complexity and "black box" operation of AI systems also reduce accountability and increase opacity, making it difficult to hold individuals or organizations responsible for their actions.
While advances in AI drive social progress and economic growth, they may also exacerbate social inequality and reshape the structure of employment. AI and automation may reduce certain low-skilled jobs, affecting the groups that depend on them; automation, robots, and algorithms have already played a role in replacing human work tasks, slowing wage growth and increasing inequality.
AI may also lead to an unequal distribution of income and wealth, with the development of technologies potentially allowing those few who own and control them to benefit enormously, while the majority are at risk of declining incomes. This unequal distribution of income and wealth could lead to a more solidified social stratification, exacerbating the gap between rich and poor.
AI technologies and high-paying jobs tend to be concentrated in certain cities or regions, while other regions may be marginalized. This uneven geographical development may lead to inequitable distribution of resources and exacerbate economic and social disparities between regions.
The development of AI requires a workforce with higher levels of skill and education, but not everyone has access to the necessary training; this could widen social inequality by making it harder for already disadvantaged groups to reach new employment opportunities. AI systems may also inherit and amplify existing social biases: they learn and decide by analyzing historical data, and if that data contains biases, the systems may reproduce them in their decisions, treating certain groups unfairly.
AI's contributions are plain to see, and so are its harms. The great changes AI promises can feel slightly unreal, like a bubble floating in mid-air: bright and eye-catching, yet always seeming to drift, liable to burst at a touch someday.
V. An invisible hand
The awarding of this Nobel Prize in Physics has given some people the feeling that the Nobel committee is showing a kind of favoritism toward AI, even departing from Nobel's original intention in establishing the prize and affecting its purity and sanctity. It feels as if a big invisible hand were manipulating everything behind the scenes.
Let's talk about how Nobel's legacy has been managed and invested. Nobel's estate is carefully managed and invested by the Nobel Foundation to preserve and grow the capital from which the prizes, still awarded today, are paid. According to Nobel's will, his estate was used to establish the Nobel Foundation, a private organization that manages the estate and the awarding of the prizes. The Foundation's main duties are to protect the interests of the Nobel Prizes, represent the Nobel institutions externally, and organize publicity events and award ceremonies.
The Nobel Foundation follows the principle of preserving the principal: it uses only investment proceeds for the prizes, without depleting the original capital. This ensures the sustainability of the awards and protects against inflation. Initially the Foundation invested only in fixed-income securities, but over time its strategy broadened to diversified assets such as equities, real estate, private equity, and hedge funds, in pursuit of asset appreciation.
The Nobel Foundation has adopted an asset allocation strategy known as "532": approximately 50 percent in equities, 30 percent in alternative assets, and 20 percent in fixed-income assets. This allocation aims at high investment returns with manageable risk. The Foundation also enjoys tax exemptions from the Swedish government, which eases its financial pressure and leaves more funds for investment and prize payments.
The Nobel Foundation distributes a portion of each year's investment earnings to the laureates as prize money, while the remaining earnings are reinvested into the principal to keep the fund growing. Through these methods, the Foundation has grown Nobel's estate from an initial 31 million Swedish kronor to several billion kronor, ensuring that the prizes continue to be awarded and that the prize money grows over time.
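The payout rule described above is simple arithmetic, sketched below. Only the "532" weights come from the text; the fund size, return rates, and payout ratio are invented figures for illustration, not the Foundation's actual numbers.

```python
# Toy illustration of the "532" allocation and the payout rule:
# only part of the annual return is paid out, the rest is reinvested.
fund = 6_000_000_000  # hypothetical fund size in SEK
allocation = {"equities": 0.50, "alternatives": 0.30, "fixed income": 0.20}
returns = {"equities": 0.07, "alternatives": 0.05, "fixed income": 0.02}  # assumed

annual_gain = sum(fund * w * returns[k] for k, w in allocation.items())
payout = 0.5 * annual_gain          # hypothetical payout ratio
fund += annual_gain - payout        # the remainder is reinvested into the principal

print(round(annual_gain), round(payout), round(fund))
```

Even with modest assumed returns, paying out only part of the gain leaves the principal growing year after year, which is how a finite estate can fund prizes indefinitely.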
We know that about 50% of the assets are invested in stocks, and the hottest stocks of recent years are mostly related to artificial intelligence. The impact of this Nobel Prize on AI stocks could be large: the market is likely to be optimistic about the future of AI, especially companies involved in machine learning and neural network technology, such as Google and NVIDIA, and investors may increase their positions in anticipation of long-term breakthroughs and commercial applications.
The Nobel enterprise involves not only Swedes but also Norwegians, since the Peace Prize is awarded by the Norwegian Nobel Committee appointed by the Norwegian Parliament; and the Norwegian sovereign wealth fund, which is heavily invested in NVIDIA and Google, is said to be among the Foundation's biggest backers. This background invites a certain association: that the Nobel Prize's apparent favor toward AI can, in some sense, be "justified" by money.
If the Nobel committee really is partial to AI, then AI-related results may receive priority and extra care, which could encourage researchers in many fields to turn to AI. Such a bias toward disciplines outside the traditional ones puts the committee in a dilemma, and one way out may be to create a new prize, such as a Nobel Prize in Science and Technology. Computer science has grown to the point where it can stand alongside traditional physics, chemistry, and biology, so why not establish a new prize? Everything must keep pace with the times, and change is the only path to constancy. If a new prize cannot be created, AI should not be forced into the existing award system, which looks both awkward and embarrassing.