What happened with AI in 2024?

The idea this week: despite claims of stagnation, AI research still advanced rapidly in 2024.

Some people say AI research has plateaued. But a lot of evidence from the last year points in the opposite direction:

  • New capabilities were developed and emerged
  • Research indicates existing AI can accelerate science

And at the same time, important findings about AI safety and risk came out (see below).

AI advances might still stall. Some leaders in the field have warned that a lack of good data, for example, may impede further capability growth, though others disagree. Regardless, growth clearly hasn’t stopped yet.

Meanwhile, the aggregate forecast on Metaculus of when we’ll see the first “general” AI system — which would be highly capable across a wide range of tasks — is 2031.

All of this matters a lot, because AI poses potentially existential risks. We think making sure AI goes well is a top pressing world problem.

If AI advances fast, this work is not only important but urgent.

Here are some of the key developments in AI from the last year:

New AI models and capabilities

OpenAI announced in late December that its new model o3 achieved a large leap forward in capabilities. It builds on the o1 language model (also released in 2024), which has the ability to deliberate about its answers before responding. With this more advanced capability, o3 reportedly:

  • Scored a breakthrough 87.5% on ARC-AGI, a test designed to be particularly hard for leading AI systems
  • Pushed the frontier of AI software engineering, scoring 71.7% on SWE-bench Verified, a benchmark built from real software engineering tasks, compared to 48.9% for o1
  • Achieved a 25% score on a new (and extremely challenging) FrontierMath benchmark — while previous leading AI models couldn’t get above 2%

Figure: Scores on the ARC-AGI benchmark from OpenAI’s models since 2019, showing an exponential increase. Chart from Riley Goodside of Scale AI.

Although o3 has not been released publicly yet, it seems clear that it is the most capable language model we’ve seen. It still has many limitations and weaknesses, but it undermines claims that AI progress stalled in 2024.

o3 may be the most impressive advance of 2024, but the year had many other major developments:

  • AI video generation gained steam, as OpenAI released Sora for public use and Google DeepMind launched Veo.
  • Google DeepMind released AlphaFold 3 — a successor to a Nobel Prize-winning AI system — which can predict how proteins interact with DNA, RNA, and other structures at the molecular level.
  • Anthropic introduced the capability for its chatbot Claude to use your computer at your direction.
  • AI systems are increasingly able to take audio and visual inputs, and larger amounts of text, while also engaging with users in voice mode.
  • By combining the models AlphaProof and AlphaGeometry 2, Google DeepMind achieved silver-medal-level performance on problems from the International Mathematical Olympiad.
  • The Chinese company DeepSeek said that its newest model cost only $5.5 million to train, a dramatic decrease from the reported $100 million OpenAI spent training the comparably capable GPT-4.

And there’s a lot more that could be included here! We won’t be surprised if 2025 and 2026 see many more leaps forward in AI capabilities.

AI helping with science

Recent research indicates that AI can help speed up scientific progress, including AI research itself:

Figure: METR found that in a limited time window (two hours), LLMs can outperform humans on a sample of ML research engineering tasks, but not when given longer periods of time.

Some key developments in AI risk and safety research

Meanwhile, we’ve seen a mix of encouraging and worrying results in research on AI safety. Here are a few of the important publications this year:

What you can do

These developments show the fast pace and potential risks of advancing AI. To help, you can:

We also recommend checking out recent posts from our founder Benjamin Todd on:

We plan to continue to cover this topic in the coming year, and we wouldn’t be surprised to see many additional changes and major AI developments. Continue following along with us, and consider sharing this information with your friends by forwarding this email if you find it helpful.

This blog post was first released to our newsletter subscribers.
