1. Meta will begin charging for use of its Llama models.
Meta is the world’s standard bearer for open-weight AI. In a fascinating case study in corporate strategy, while rivals like OpenAI and Google have kept their frontier models closed source and charged for their use, Meta has chosen to give its state-of-the-art Llama models away for free.
So it will come as a surprise to many next year when Meta begins charging companies to use Llama.
To be clear: we are not predicting that Meta will make Llama entirely closed source, nor that everyone who uses the Llama models will have to pay for them.
Instead, we predict that Meta will make the terms of Llama’s open-source license more restrictive, such that companies who use Llama in commercial settings above a certain scale will need to start paying to access the models.
Technically, Meta already does a limited version of this today. The company does not allow the very largest companies—the cloud hyperscalers and other companies with more than 700 million monthly active users—to freely use its Llama models.
Back in 2023, Meta CEO Mark Zuckerberg said: “If you’re someone like Microsoft, Amazon or Google, and you’re going to basically be reselling (Llama), that’s something that we think we should get some portion of the revenue for. I don’t think that that’s going to be a large amount of revenue in the near-term, but over the long term, hopefully that can be something.”
Next year, Meta will substantially expand the set of organizations who must pay to use Llama to include many more large and mid-sized enterprises.
Why would Meta make this strategic pivot?
Keeping up with the LLM frontier is incredibly expensive. Meta will need to invest many billions of dollars every year if it wants Llama to stay at or near parity with the latest frontier models from OpenAI, Anthropic and others.
Meta is one of the world’s largest and most deep-pocketed companies. But it is also a publicly traded company that is ultimately answerable to its shareholders. As the cost of building frontier models skyrockets, it is increasingly untenable for Meta to devote such vast sums to train next-generation Llama models with zero expectation of revenue.
Hobbyists, academics, individual developers and startups will continue to be able to use the Llama models free of charge next year. But 2025 will be the year that Meta gets serious about monetizing Llama.
2. Scaling laws will be discovered and exploited in areas beyond text—in particular, in robotics and biology.
No topic in AI has generated more discussion in recent weeks than scaling laws—and the question of whether they are coming to an end.
First introduced in a 2020 OpenAI paper, the basic concept behind scaling laws is straightforward: as the number of model parameters, the amount of training data, and the amount of compute increase when training an AI model, the model’s performance improves (technically, its test loss decreases) in a reliable and predictable way. Scaling laws are responsible for the breathtaking performance improvements from GPT-2 to GPT-3 to GPT-4.
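For readers who want the math, the 2020 paper frames these relationships as simple power laws in which test loss falls smoothly as each resource grows. A representative form is sketched below; the constants and exponents are empirical fits that vary by architecture and dataset, so no specific values are quoted here.

```latex
% Representative power-law scaling relations of the kind reported in the
% 2020 OpenAI paper. L is test loss; N = model parameters, D = training
% tokens, C = training compute. N_c, D_c, C_c and the alpha exponents are
% empirical constants fit to experiments, not theoretical quantities.
\[
  L(N) \approx \left(\frac{N_c}{N}\right)^{\alpha_N}, \qquad
  L(D) \approx \left(\frac{D_c}{D}\right)^{\alpha_D}, \qquad
  L(C) \approx \left(\frac{C_c}{C}\right)^{\alpha_C}
\]
```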
Much like Moore’s Law, scaling laws are not in fact laws but simply empirical observations. Over the past month, a series of reports has suggested that the major AI labs are seeing diminishing returns to continued scaling of large language models. This helps explain, for instance, why OpenAI’s GPT-5 release keeps getting delayed.
The most common rebuttal to plateauing scaling laws is that the emergence of test-time compute opens up an entirely new dimension on which to pursue scaling. That is, rather than massively scaling compute during training, new reasoning models like OpenAI’s o3 make it possible to massively scale compute during inference, unlocking new AI capabilities by enabling models to “think for longer.”
This is an important point. Test-time compute does indeed represent an exciting new avenue for scaling and for AI performance improvement.
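As a concrete illustration of what “scaling compute during inference” can mean in its simplest form, the sketch below samples many reasoning chains for the same question and takes a majority vote over their final answers, a basic recipe often called self-consistency. The `generate` callable and its return format are hypothetical stand-ins rather than any particular lab’s API, and production reasoning models use far more sophisticated machinery than this.

```python
from collections import Counter
from typing import Callable, Tuple


def answer_with_more_inference_compute(
    prompt: str,
    generate: Callable[[str], Tuple[str, str]],  # hypothetical: returns (reasoning_chain, final_answer)
    n_samples: int = 16,
) -> str:
    """Spend extra test-time compute by sampling several reasoning chains
    and returning the most common final answer (simple majority vote)."""
    answers = []
    for _ in range(n_samples):  # more samples = more inference-time compute
        _chain, final_answer = generate(prompt)
        answers.append(final_answer)
    # Ties are broken arbitrarily by Counter ordering.
    return Counter(answers).most_common(1)[0][0]
```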
But another point about scaling laws is even more important and too little appreciated in today’s discourse. Nearly all discussions about scaling laws—starting with the original 2020 paper and extending all the way through to today’s focus on test-time compute—center on language. But language is not the only data modality that matters.
Think of robotics, or biology, or world models, or web agents. For these data modalities, scaling laws have not been saturated; on the contrary, they are just getting started. Indeed, rigorous evidence of the existence of scaling laws in these areas has not even been published to date.
Startups building foundation models for these newer data modalities—for instance, EvolutionaryScale in biology, Physical Intelligence in robotics, World Labs in world models—are seeking to identify and ride scaling laws in these fields the way that OpenAI so successfully rode LLM scaling laws in the first half of the 2020s. Next year, expect to see tremendous advances here.
Don’t believe the chatter. Scaling laws are not going away. They will be as important as ever in 2025. But the center of activity for scaling laws will shift from LLM pretraining to other modalities.
3. Donald Trump and Elon Musk will have a messy falling-out. This will have meaningful consequences for the world of AI.
A new administration in the U.S. will bring with it a number of policy and strategy shifts on AI. In order to predict where the AI winds will blow under President Trump, it might be tempting to focus on the president-elect’s close relationship with Elon Musk, given Musk’s central role in the AI world today.
One can imagine a number of different ways in which Musk might influence AI-related developments in a Trump administration. Given Musk’s deeply hostile relationship with OpenAI, the new administration might take a less friendly stance toward OpenAI when engaging with industry, crafting AI regulation, awarding government contracts, and so forth. (This is a real risk that OpenAI is worried about today.) On the flipside, the Trump administration might preferentially favor Musk’s own companies: for instance, slashing red tape to enable xAI to build data centers and get a leg up in the frontier model race; granting rapid regulatory approval for Tesla to deploy robotaxi fleets; and so forth.
More fundamentally, Elon Musk—unlike many other technology leaders who have Trump’s ear—takes existential AI safety risks very seriously and is therefore an advocate for significant AI regulation. He supported California’s controversial SB 1047 bill, which sought to impose meaningful restrictions on AI developers. Musk’s influence could thus lead to a more heavy-handed regulatory environment for AI in the U.S.
There is one problem with all these speculations, though. Donald Trump and Elon Musk’s cozy relationship will inevitably fall apart.
As we saw time and time again during the first Trump administration, the median tenure of a Trump ally, even the seemingly staunchest, is remarkably short—from Jeff Sessions to Rex Tillerson to James Mattis to John Bolton to Steve Bannon. (And, of course, who can forget Anthony Scaramucci’s ten-day stint in the White House?) Very few of Trump’s deputies from his first administration remain loyal to him today.
Both Donald Trump and Elon Musk are complex, volatile, unpredictable personalities. They are not easy to work with. They burn people out. Their newfound friendship has proven mutually beneficial to this point, but it is still in its honeymoon phase. We predict that, before 2025 has come to an end, the relationship will have soured.
What will this mean for the world of AI?
It will be welcome news for OpenAI. It will be unfortunate news for Tesla shareholders. And it will be a disappointment for those concerned with AI safety, as it will all but ensure that the U.S. government will take a hands-off approach to AI regulation under Trump.
4. Web agents will go mainstream, becoming the next major killer application in consumer AI.
Imagine a world in which you never have to directly interact with the web. Whenever you need to manage a subscription, pay a bill, schedule a doctor’s appointment, order something on Amazon, make a restaurant reservation, or complete any other tedious online task, you can simply instruct an AI assistant to do so on your behalf.
This concept of a “web agent” has been around for years. If something like this existed and worked, there is little doubt that it would be a wildly successful product. Yet no functioning general-purpose web agent is available on the market today.
Startups like Adept—which raised hundreds of millions of dollars with a highly pedigreed founding team but failed to deliver on its vision—have become cautionary tales in this category.
Next year will be the year that web agents finally start working well enough to go mainstream. Continued advances in language and vision foundation models, paired with recent breakthroughs on “System 2 thinking” capabilities as a result of new reasoning models and inference-time compute, will mean that web agents will be ready for primetime.
(In other words, Adept had the right idea; it was just too early. In startups, as in much in life, timing is everything.)
Web agents will find all sorts of valuable enterprise use cases, but we believe that the biggest near-term market opportunity for web agents will be with consumers. Despite all the recent AI fervor, relatively few AI-native applications beyond ChatGPT have yet broken through to become mainstream consumer successes. Web agents will change that, becoming the next true “killer app” in consumer AI.
5. Multiple serious efforts to put AI data centers in space will take shape.
In 2023, the critical physical resource bottlenecking AI growth was GPU chips. In 2024, it has become power and data centers.
Few storylines have gotten more play in 2024 than AI’s enormous and fast-growing energy needs amid the rush to build more AI data centers. After remaining flat for decades, global power demand from data centers is projected to double between 2023 and 2026 thanks to the AI boom. In the U.S., data centers are projected to consume close to 10% of all power by 2030, up from just 3% in 2022.
Today’s energy system is simply not equipped to handle the tremendous surge in demand coming from artificial intelligence workloads. A historic collision between these two multi-trillion-dollar systems—our energy grid and our computing infrastructure—is looming.
Nuclear power has gained momentum this year as a possible solution to this Gordian knot. Nuclear represents an ideal energy source for AI in many ways: it is zero-carbon, available 24/7 and effectively inexhaustible. But realistically, new nuclear energy sources won’t be able to make a dent in this problem until the 2030s, given long research, project development and regulatory timelines. This goes for traditional nuclear fission power plants, for next-generation “small modular reactors” (SMRs) and certainly for nuclear fusion power plants.
Next year, an unconventional new idea to tackle this challenge will emerge and attract real resources: putting AI data centers in space.
AI data centers in space—at first blush, this sounds like a bad joke about a VC trying to combine too many startup buzzwords. But there may in fact be something here.
The biggest bottleneck to rapidly building more data centers on earth is accessing the requisite power. A computing cluster in orbit can enjoy free, limitless, zero-carbon power around the clock: the sun is always shining in space.
Another meaningful advantage to putting compute in space: it helps solve the cooling problem. One of the biggest engineering obstacles to building more powerful AI data centers is that running many GPUs at the same time in a confined space gets very hot, and high temperatures can damage or destroy computing equipment. Data center developers are resorting to expensive, unproven methods like liquid immersion cooling to try to solve this problem. The cold vacuum of space, by contrast, acts as a vast heat sink into which waste heat from computing can be radiated away.
Of course, plenty of practical challenges remain to be solved. One obvious issue is whether and how large volumes of data can be moved cost-efficiently between orbit and earth. This is an open question, but it may prove solvable, with promising work underway using lasers and other high-bandwidth optical communications technology.
A buzzy startup out of Y Combinator named Lumen Orbit recently raised $11 million to pursue this exact vision: building a multi-gigawatt network of data centers in space to train AI models.
As Lumen CEO Philip Johnston put it: “Instead of paying $140 million for electricity, you can pay $10 million for a launch and solar.”
Lumen will not be the only organization taking this concept seriously in 2025.
Other startup competitors will emerge. Don’t be surprised to see one or more of the cloud hyperscalers launch exploratory efforts along these lines as well. Amazon already has extensive experience putting assets into orbit via Project Kuiper; Google has a long history of funding moonshot ideas like this; and even Microsoft is no stranger to the space economy. Elon Musk’s SpaceX could conceivably make a play here too.
6. An AI system will pass the “Turing test for speech.”
The Turing test is one of the oldest and most well-known benchmarks for AI performance.
In order to “pass” the Turing test, an AI system must be able to communicate via written text such that the average human cannot tell whether they are interacting with an AI or with another human.
Thanks to dramatic recent advances in large language models, the Turing test has become a solved problem in the 2020s.
But written text is not the only way that humans communicate.
As AI becomes increasingly multimodal, one can imagine a new, more challenging version of the Turing test—a “Turing test for speech”—in which an AI system must be able to interact with humans via voice with a degree of skill and fluidity that makes it indistinguishable from a human speaker.
The Turing test for speech remains out of reach for today’s AI systems. Solving it will require meaningful additional technology advances.
Latency (the lag between when a human speaks and when the AI responds) must be reduced to near-zero in order to match the experience of speaking with another human. Voice AI systems must get better at gracefully handling ambiguous inputs or misunderstandings in real time—for instance, when they get interrupted mid-sentence. They must be able to engage in long, multi-turn, open-ended conversations while holding earlier parts of the discussion in memory. And crucially, voice AI agents must learn to better understand non-verbal signals in speech—for instance, what it means if a human speaker sounds annoyed versus excited versus sarcastic—and to generate those non-verbal cues in their own speech.
Voice AI is at an exciting inflection point as we near the end of 2024, driven by fundamental breakthroughs like the emergence of speech-to-speech models. Few areas of AI are advancing more rapidly today, both technologically and commercially. Expect to see the state of the art in voice AI leap forward in 2025.
7. Major progress will be made on building AI systems that can themselves autonomously build better AI systems.
The concept of recursively self-improving AI has been a frequent touchpoint in AI circles going back decades.
Back in 1965, for instance, Alan Turing’s close collaborator I.J. Good wrote:
“Let an ultraintelligent machine be defined as a machine that can far surpass all the intellectual activities of any man, however clever. Since the design of machines is one of these intellectual activities, an ultraintelligent machine could design even better machines; there would then unquestionably be an ‘intelligence explosion,’ and the intelligence of man would be left far behind.”
The idea of AI that can invent better AI is an intellectually fascinating concept. But, even today, it retains a whiff of science fiction.
However—while it is not yet widely appreciated—this concept is in fact starting to get more real. Researchers at the frontiers of AI science have begun to make tangible progress toward building AI systems that can themselves build better AI systems.
We predict that next year, this vein of research will burst into the mainstream.
To date, the most notable public example of research along these lines is Sakana’s “AI Scientist.” Published in August, the AI Scientist work represents a compelling proof of concept that AI systems can indeed carry out AI research entirely autonomously.
Sakana’s AI Scientist executes the entire lifecycle of artificial intelligence research itself: reading the existing literature, generating novel research ideas, designing experiments to test those ideas, carrying out those experiments, writing up a research paper to report its findings, and then conducting a process of peer review on its work. It does this entirely autonomously, with no human input. Some of the research papers that the AI Scientist produced are available online to read.
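The loop below is a rough schematic of that lifecycle as described above, useful only for seeing how the stages fit together. It is not Sakana’s code; every object and function name here (`llm`, `run_experiments`, and so on) is a hypothetical placeholder supplied for illustration.

```python
def autonomous_research_cycle(literature, llm, run_experiments, n_ideas: int = 5):
    """Schematic of the fully autonomous research loop described above.
    `llm` and `run_experiments` are hypothetical placeholders provided by
    the caller; this is an illustration, not Sakana's implementation."""
    accepted_papers = []
    ideas = llm.propose_ideas(literature, n=n_ideas)    # read the literature, generate novel ideas
    for idea in ideas:
        plan = llm.design_experiments(idea)             # design experiments to test the idea
        results = run_experiments(plan)                 # carry out those experiments
        draft = llm.write_paper(idea, plan, results)    # write up the findings as a paper
        review = llm.review_paper(draft)                # automated peer review of its own work
        if review["score"] >= review["accept_threshold"]:
            accepted_papers.append(draft)
    return accepted_papers
```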
Rumors abound that OpenAI, Anthropic and other research labs are devoting resources to this idea of “automated AI researchers,” though nothing has yet been publicly acknowledged.
Expect to see much more discussion, progress and startup activity in this field in 2025 as it becomes more widely appreciated that automating AI research is in fact becoming a real possibility.
The most meaningful milestone, though, will be when a research paper written entirely by an AI agent is accepted into a top AI conference for the first time. (Because papers are blindly reviewed, conference reviewers won’t know that a paper was written by an AI until after it has been accepted.) Don’t be surprised to see research work produced by an AI get accepted at NeurIPS, CVPR or ICML next year. It will be a fascinating, controversial and historic moment for the field of AI.
8. OpenAI, Anthropic and other frontier labs will begin “moving up the stack,” increasingly shifting their strategic focus to building applications.
Building frontier models is a tough business to be in.
It is staggeringly capital intensive. Frontier model labs burn historic amounts of cash. OpenAI raised a record $6.5 billion in funding just a few months ago—and it will likely have to raise even more before long. Anthropic, xAI and others are in similar positions.
Switching costs and customer loyalty are low. AI applications are often built to be model-agnostic, with models from different providers frictionlessly swapped in and out based on changing cost and performance comparisons.
And with the emergence of state-of-the-art open models like Meta’s Llama and Alibaba’s Qwen, the threat of technology commoditization constantly looms.
AI leaders like OpenAI and Anthropic cannot and will not stop investing in building cutting-edge models. But next year, in an effort to develop business lines that are higher-margin, more differentiated and stickier, expect to see the frontier labs make a big push to roll out more of their own applications and products.
Of course, one wildly successful example of an application from a frontier lab already exists: ChatGPT.
What other kinds of first-party applications might we expect to see from the AI labs in the new year?
One obvious answer is more sophisticated and feature-rich search applications. OpenAI’s SearchGPT effort is a sign of things to come here.
Coding is another obvious category. Again, initial productization efforts are already underway, with the debut of OpenAI’s canvas product in October.
Might OpenAI or Anthropic launch an enterprise search offering in 2025? Or a customer service product? How about a legal AI or a sales AI product? On the consumer side, one can imagine a “personal assistant” web agent product, or a travel planning application, or perhaps a generative music application.
One of the most fascinating aspects of the frontier labs’ move up the stack to the application layer is that it will bring them into direct competition with many of their most important customers: in search, Perplexity; in coding, Cursor; in customer service, Sierra; in legal AI, Harvey; in sales, Clay; and on and on.
9. As Klarna prepares for a 2025 IPO, the company’s claims about its use of AI will come under scrutiny and prove to be wildly overstated.
Klarna is a “buy now, pay later” provider based in Sweden that has raised close to $5 billion in venture capital since its founding in 2005.
Perhaps no company has made more grandiose claims about its use of AI than has Klarna.
Just a few days ago, Klarna CEO Sebastian Siemiatkowski told Bloomberg that the company has stopped hiring human employees altogether, instead relying on generative AI to get work done.
As Siemiatkowski put it: “I am of the opinion that AI can already do all of the jobs that we as humans do.”
Along similar lines, Klarna announced earlier this year that it had launched an AI customer service platform that fully automates the work of 700 human customer service agents. The company has also claimed that it has stopped using enterprise software products like Salesforce and Workday because it can simply replace them with AI.
To put it directly, these claims are not credible. They reflect a poorly informed understanding of what AI systems are and are not capable of today.
It is not plausible to claim to be able to replace any given human employee, in any given function of an organization, with an end-to-end AI agent. This would amount to having solved general-purpose human-level AI.
Leading AI startups today are working hard at the cutting edge of the field to build agentic systems that can automate specific, narrowly defined, highly structured enterprise workflows—for instance, a subset of the activities of a sales development representative or a customer service agent. Even in these narrowly circumscribed contexts, these agents do not yet work totally reliably, although in some cases they have begun to work well enough to see early commercial adoption.
Why would Klarna make such overstated claims about the value it is deriving from AI?
There is a simple answer. The company plans to IPO in the first half of 2025. Having a compelling AI narrative will be critical to a successful public listing. Klarna remains an unprofitable business, with $241 million in losses last year; it may hope that its AI story will persuade public market investors of its ability to dramatically reduce costs and swing to lasting profitability.
Without doubt, every organization in the world, including Klarna, will enjoy vast productivity gains from AI in the years ahead. But many thorny technology, product and organizational challenges remain to be solved before AI agents can completely replace humans in the workforce. Overblown claims like Klarna’s do a disservice to the field of AI and to the hard-fought progress that AI technologists and entrepreneurs are actually making toward developing agentic AI.
As Klarna prepares for its public offering in 2025, expect to see greater scrutiny and public skepticism about these claims, which so far have mostly gone unchallenged. Don’t be surprised to see the company walk back some of its more over-the-top descriptions of its AI use.
(And of course—get ready for the word “AI” to appear in the company’s S-1 many hundreds of times.)
10. The first real AI safety incident will occur.
As artificial intelligence has become more powerful in recent years, concerns have grown that AI systems might begin to act in ways that are misaligned with human interests and that humans might lose control of these systems. Imagine, for instance, an AI system that learns to deceive or manipulate humans in pursuit of its own goals, even when those goals cause harm to humans.
This general set of concerns is often categorized under the umbrella term “AI safety.”
In recent years, AI safety has moved from a fringe, quasi-sci-fi topic to a mainstream field of activity. Every major AI player today, from Google to Microsoft to OpenAI, devotes real resources to AI safety efforts. AI icons like Geoff Hinton, Yoshua Bengio and Elon Musk have become vocal about AI safety risks.
Yet to this point, AI safety concerns remain entirely theoretical. No actual AI safety incident has ever occurred in the real world (at least none that has been publicly reported).
2025 will be the year that this changes.
What should we expect this first AI safety event to look like?
To be clear, it will not entail Terminator-style killer robots. It most likely will not involve harm of any kind to any humans.
Perhaps an AI model might attempt to covertly create copies of itself on another server in order to preserve itself (known as self-exfiltration). Perhaps an AI model might conclude that, in order to best advance whatever goals it has been given, it needs to conceal the true extent of its capabilities from humans, purposely sandbagging performance evaluations in order to evade stricter scrutiny.
These examples are not far-fetched. Apollo Research published important experiments earlier this month demonstrating that, when prompted in certain ways, today’s frontier models are capable of engaging in just such deceptive behavior. Along similar lines, recent research from Anthropic showed that LLMs have the troubling ability to “fake alignment.”
We expect that this first AI safety incident will be detected and neutralized before any real harm is done. But it will be an eye-opening moment for the AI community and for society at large.
It will make one thing clear: well before humanity faces an existential threat from all-powerful AI, we will need to come to terms with the more mundane reality that we now share our world with another form of intelligence that may at times be willful, unpredictable and deceptive—just like us.
See here for our 2024 AI predictions, and see here for our end-of-year retrospective on them.
See here for our 2023 AI predictions, and see here for our end-of-year retrospective on them.
See here for our 2022 AI predictions, and see here for our end-of-year retrospective on them.
See here for our 2021 AI predictions, and see here for our end-of-year retrospective on them.