OpenAI’s latest artificial-intelligence (AI) system launched in September with bold promises. The company behind the chatbot ChatGPT introduced o1, its newest suite of large language models (LLMs), as delivering “a new level of AI capability”. San Francisco, California-based OpenAI says that o1 works in a way that is closer to how a person thinks than do previous LLMs.
The release added fresh fuel to a debate that has smouldered for decades: how long will it be before machines can handle the full range of cognitive tasks that human brains can, including generalizing from one task to another, abstract reasoning and planning, and choosing which aspects of the world to investigate and learn from?
Such an “artificial general intelligence”, or AGI, could tackle thorny problems, from climate change and pandemics to finding cures for cancer, Alzheimer’s disease and other conditions. But such immense power would also bring uncertainty and pose risks to humanity. “Bad things can happen because of the misuse of AI, or because we lose control of it,” says Yoshua Bengio, a deep-learning researcher at the University of Montreal in Canada.
The revolution in LLMs over the past few years has prompted speculation that AGI might be tantalizingly close. But given how LLMs are built and trained, some researchers say they will not be enough, on their own, to reach AGI. “There are still some pieces missing,” says Bengio.
What is clear is that questions about AGI matter more than ever. “For most of my life, I thought people who talked about AGI were weirdos,” says Subbarao Kambhampati, a computer scientist at Arizona State University in Tempe. “Now, of course, everybody is talking about it. You can’t say everybody is a weirdo.”
Why the AGI debate has changed
The term artificial general intelligence entered the zeitgeist around 2007, after it was mentioned in a book of the same name edited by the AI researchers Ben Goertzel and Cassio Pennachin. Its precise meaning remains elusive, but it broadly refers to an AI system with human-like reasoning and generalization abilities. Fuzzy definitions aside, for most of AI’s history it has been clear that AGI had not yet arrived. Take AlphaGo, the AI program built by Google DeepMind to play the board game Go. It beat the world’s best human players, but its superhuman qualities are narrow: playing Go is all it can do.
The capabilities of LLMs have fundamentally changed the landscape. Like human brains, LLMs have a broad range of abilities, which has led some researchers to seriously consider the idea that some form of AGI might be imminent (ref. 1), or even already here.
That breadth of capability is all the more startling given that researchers only partially understand how LLMs achieve it. An LLM is a neural network, a machine-learning model loosely inspired by the brain: the network consists of artificial neurons, or computing units, arranged in layers, with tunable parameters that denote the strength of the connections between the neurons. During training, the most powerful LLMs, such as o1, Claude (built by Anthropic in San Francisco) and Google’s Gemini, rely on a method called next-token prediction. The model is repeatedly fed samples of text that have been chopped up into chunks known as tokens; a token can be a whole word or simply a set of characters. The last token in a sequence is hidden, or “masked”, and the model is asked to predict it. The training algorithm then compares the prediction with the masked token and adjusts the model’s parameters so that it makes a better prediction next time.
This process continues, typically using billions of fragments of language, scientific text and programming code, until the model can reliably predict the masked tokens. By this stage, the model’s parameters have captured the statistical structure of the training data, and the knowledge it contains. The parameters are then fixed, and the model uses them to predict new tokens when given fresh queries, or “prompts”, that were not necessarily in its training data, a process known as inference.
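To make the train-then-infer loop concrete, here is a toy sketch in Python. It stands in for next-token prediction with a simple bigram counting model; real LLMs learn billions of neural-network parameters rather than count tables, so treat this only as an illustration of the loop described above, not of how any production system works.

```python
# Toy illustration of next-token prediction: a bigram counting model.
# Real LLMs use deep transformer networks; this only shows the
# train-then-infer loop in miniature.
from collections import Counter, defaultdict

def tokenize(text):
    # Here a "token" is simply a whitespace-separated word.
    return text.split()

def train(corpus):
    # "Training": count which token tends to follow each token.
    counts = defaultdict(Counter)
    for sample in corpus:
        tokens = tokenize(sample)
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1
    return counts  # the fixed "parameters"

def predict_next(counts, prompt):
    # "Inference": predict the most likely token to follow the prompt.
    last = tokenize(prompt)[-1]
    if last not in counts:
        return None
    return counts[last].most_common(1)[0][0]

corpus = [
    "the model predicts the next token",
    "the model adjusts its parameters",
]
params = train(corpus)
print(predict_next(params, "the model"))  # -> "predicts" (ties broken by first occurrence)
```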
LLMs owe much of their power to a type of neural-network architecture known as a transformer, which lets a model learn that certain tokens have a particularly strong influence on others, even when they are widely separated in a sample of text. This allows LLMs to parse language in ways that seem to mimic how humans do, for example distinguishing between the two meanings of the word “bank” in a sentence such as: “When the river bank flooded, the water damaged the bank’s ATM, making it impossible to withdraw money.”
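The mechanism that lets distant tokens influence one another is called attention. The following minimal numpy sketch, with random, untrained weights (real models learn the projection matrices), shows the core computation: each token’s representation becomes a weighted mix of every token’s, with weights derived from pairwise similarity scores.

```python
# Minimal self-attention sketch (numpy), the heart of the transformer.
# Illustrative only: real models learn Wq, Wk and Wv during training.
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """X: (tokens, dim) embeddings; returns attention-weighted values."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    # How strongly each token attends to every other token:
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    # Softmax over tokens turns scores into mixing weights:
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

rng = np.random.default_rng(0)
d = 8
X = rng.normal(size=(5, d))  # 5 token embeddings, e.g. "the river bank flooded ..."
out = self_attention(X, *(rng.normal(size=(d, d)) for _ in range(3)))
print(out.shape)  # (5, 8): each token's representation now mixes in its context
```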
This approach has proved successful in a remarkably wide range of settings, including generating computer programs to solve problems described in natural language, summarizing academic articles and answering mathematics questions.
Other capabilities have emerged along the way, especially as LLMs have grown in size, raising the possibility that AGI could simply appear if LLMs get big enough. One example is chain-of-thought (CoT) prompting. This involves showing an LLM an example of how to break a problem down into smaller steps to solve it, or simply asking the LLM to solve a problem step by step. CoT prompting can lead LLMs to correctly answer questions that previously stumped them. But the process doesn’t work so well with small LLMs.
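In practice, a CoT prompt is just text. Here is a hedged sketch: the wording and the worked example are invented for illustration, and call_llm is a hypothetical stand-in for any chat-completion API.

```python
# Sketch of chain-of-thought prompting. The worked example shows the
# model *how* to decompose a problem; the final line asks it to do the
# same for a new question.
def build_cot_prompt(question: str) -> str:
    worked_example = (
        "Q: A shop sells pens at 3 for $2. How much do 12 pens cost?\n"
        "A: Let's think step by step. 12 pens is 4 groups of 3 pens. "
        "Each group costs $2, so 4 * $2 = $8. The answer is $8.\n\n"
    )
    return worked_example + f"Q: {question}\nA: Let's think step by step."

prompt = build_cot_prompt("A train travels 60 km in 40 minutes. What is its speed in km/h?")
print(prompt)
# answer = call_llm(prompt)  # hypothetical API call, not a real library function
```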
Limitations of LLMs
According to OpenAI, CoT prompting has been folded into the workings of o1 and underpins the model’s prowess. François Chollet, an AI researcher who was at Google in Mountain View, California, until he left in November to start a new company, thinks that o1 incorporates a CoT generator, which creates numerous CoT prompts in response to a user’s query, together with a mechanism for selecting a good prompt from among the candidates. During training, o1 learns not only to predict the next token, but also to select the best CoT prompt for a given query. The addition of CoT reasoning explains, according to OpenAI, why o1-preview, the advanced version of o1, correctly solved 83% of the problems in a qualifying exam for the International Mathematical Olympiad, a prestigious mathematics competition for secondary-school students. By comparison, the company’s previous most powerful LLM, GPT-4o, scored only 13%.
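Chollet’s description suggests a generate-and-select loop. The sketch below is speculative: OpenAI has not published o1’s internals, and call_llm and score_chain are hypothetical placeholders rather than real APIs.

```python
# Speculative sketch of generate-and-select chain-of-thought reasoning:
# produce several candidate chains of thought for one query, then keep
# the one a (learned) selector rates highest.
import random

def call_llm(prompt: str) -> str:
    # Placeholder for a real model call; returns a candidate reasoning chain.
    return f"step-by-step reasoning #{random.randint(0, 999)} for: {prompt}"

def score_chain(chain: str) -> float:
    # Placeholder for a learned selector that rates how promising a chain is.
    return random.random()

def answer_with_cot(query: str, n_candidates: int = 8) -> str:
    # 1. Generate several chains of thought for the same query.
    chains = [call_llm(f"Think step by step: {query}") for _ in range(n_candidates)]
    # 2. Select the chain the scorer rates highest.
    return max(chains, key=score_chain)

print(answer_with_cot("Prove that the sum of two even numbers is even."))
```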
But despite such sophistication, o1 has its limits and does not constitute AGI, say Kambhampati and Chollet. On tasks that require planning, for example, Kambhampati’s team has shown that although o1 performs admirably on tasks that need up to 16 planning steps, its performance degrades rapidly when the number of steps rises to between 20 and 40 (ref. 2). Chollet, meanwhile, challenged o1-preview with his test of abstract reasoning and generalization, designed to measure progress towards AGI. The test takes the form of visual puzzles; solving them requires looking at examples to deduce an abstract rule and then using it to solve new instances of a similar puzzle, something humans find relatively easy.
Regardless of their size, LLMs are limited in their ability to solve problems that require recombining what they have learned to tackle new challenges, says Chollet. “LLMs cannot truly adapt to novelty because they fundamentally lack the ability to take the knowledge they have and recombine it in fairly sophisticated ways, on the fly, to adapt to new situations.”
Can LLMs deliver AGI?
So, will LLMs ever deliver AGI? One point in their favour is that the underlying transformer architecture can process and find statistical patterns in other kinds of information besides text, such as images and audio, provided that there is a way to tokenize those data appropriately. Andrew Wilson, a machine-learning researcher at New York University in New York City, and his colleagues showed that this might be because different kinds of data share a characteristic: such data sets have low “Kolmogorov complexity”, defined as the length of the shortest computer program needed to create them (ref. 3). The researchers also showed that transformers are well suited to learning patterns in data with low Kolmogorov complexity, and that this suitability grows with the size of the model. Transformers can model a wide array of possibilities, increasing the chance that the training algorithm will discover a good solution to a problem, and this “expressivity” increases with size. These are, Wilson says, “some of the ingredients you really need for universal learning”. Although Wilson thinks AGI is currently out of reach, he says that LLMs and other AI systems that use the transformer architecture have some of the key properties of AGI-like behaviour.
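For reference, the standard definition the paragraph invokes can be written compactly. This is textbook notation, not taken from Wilson’s paper:

```latex
% Kolmogorov complexity of a string x, relative to a fixed universal
% Turing machine U: the length |p| of the shortest program p for which
% U, run on p, outputs x.
K_U(x) = \min \{\, |p| \;:\; U(p) = x \,\}
```

Intuitively, data with low Kolmogorov complexity have short descriptions, and the claim is that the kinds of data we care about, text, images, audio, are compressible in this sense, which is what makes one architecture able to model them all.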
There are hints, however, that transformer-based LLMs have limits. For a start, the data used to train the models are running out. Researchers at Epoch AI, a San Francisco-based institute that studies trends in AI, estimate that the existing stock of publicly available textual data used for training could be exhausted somewhere between 2026 and 2032 (ref. 4). There are also signs that the gains LLMs make as they get bigger are not as great as they once were, although it is unclear whether this is because there is less novelty left in the data, given how much has already been used, or because of something else. The latter would bode badly for LLMs.
Raia Hadsell, vice-president of research at Google DeepMind in London, raises a different problem. Powerful though they are, transformer-based LLMs are trained to predict the next token, and she argues that this singular focus is too limited to deliver AGI. Building models that instead generate solutions all at once, or in large chunks, could bring us closer to AGI, she says. Algorithms that could help to build such models are already at work in some existing, non-LLM systems, such as OpenAI’s DALL-E, which generates realistic, if sometimes bizarre, images in response to natural-language descriptions. But these systems lack the broad capabilities of LLMs.
Building a world model
One intuition about the breakthroughs needed to get to AGI comes from neuroscientists. They argue that our intelligence stems from the brain’s ability to build a “world model”, a representation of our surroundings. The model can be used to imagine different courses of action and predict their consequences, and therefore to plan and reason. It can also be used to generalize skills learnt in one domain to new tasks, by simulating different scenarios.
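A minimal sketch of what “imagine, predict, then act” can mean computationally follows. The gridworld, its dynamics and the scoring function are all invented for illustration; nothing here corresponds to how brains, or any particular AI system, actually implement world models.

```python
# Planning with a world model, in miniature: an internal model predicts
# the outcome of each candidate action, and the agent picks the action
# whose imagined outcome scores best.

GOAL = (3, 3)

def world_model(state, action):
    # Internal simulation of the environment: predict the next state.
    x, y = state
    dx, dy = {"up": (0, 1), "down": (0, -1), "left": (-1, 0), "right": (1, 0)}[action]
    return (x + dx, y + dy)

def value(state):
    # Imagined outcomes are scored by (negative) distance to the goal.
    return -(abs(state[0] - GOAL[0]) + abs(state[1] - GOAL[1]))

def plan(state, actions=("up", "down", "left", "right")):
    # Imagine each action's consequence, then choose the best prediction.
    return max(actions, key=lambda a: value(world_model(state, a)))

pos = (0, 0)
for _ in range(6):
    # In reality the environment moves the agent; here the model is exact.
    pos = world_model(pos, plan(pos))
print(pos)  # (3, 3): the agent "reasoned" its way to the goal without trial and error
```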