Roberta Raileanu

Google DeepMind

The Research and Applied AI Summit (RAAIS) is a community for entrepreneurs and researchers who accelerate the science and applications of AI technology. In the run up to our 10th annual event on June 12th 2026 in London, we’re running a series of speaker profiles to shed more light on what you can expect to learn on the day!


At RAAIS we focus on translating cutting-edge technology and research into production-grade products for real-world problems.

Roberta Raileanu is a Senior Staff Research Scientist at Google DeepMind, where she leads work on the Open-Endedness team, and an Adjunct Professor at UCL, advising PhD students connected to UCL-DARK. Her research focuses on how frontier models are increasingly asked to do long-horizon work: planning, using tools, recovering from mistakes, and continuing to improve through interaction. This exposes a gap between systems that look capable in short bursts and systems that keep acquiring skills in messy environments. Roberta’s research is about closing that gap.

From exploration to open-ended learning

Roberta’s early work was shaped by a classic reinforcement learning problem that keeps resurfacing in new guises: exploration. If an environment gives sparse or delayed reward, brute-force search fails, and the right intrinsic objective can determine whether an agent learns at all.

Two papers anchor this period. RIDE: Rewarding Impact-Driven Exploration for Procedurally-Generated Environments (ICLR 2020) proposes an intrinsic signal that rewards actions changing an agent’s learned state representation, evaluated in procedurally generated settings where revisiting the same state is unlikely. Learning with AMIGo: Adversarially Motivated Intrinsic Goals (ICLR 2021) tackles sparse reward by pairing a goal-generating “teacher” with a goal-conditioned “student,” producing an automatic curriculum of increasingly challenging goals. In parallel, Decoupling Value and Policy for Generalization in Reinforcement Learning (ICML 2021, oral) argues that shared representations for policy and value can contribute to overfitting, and proposes a decoupled approach that improves generalisation on benchmarks like Procgen.
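To make the impact-driven idea concrete, here is a minimal sketch of a RIDE-style intrinsic reward: the agent is rewarded by how much its (learned) state embedding changes, discounted by an episodic visit count so it cannot farm reward by bouncing between the same states. The function names and the identity embedding below are illustrative, not the paper's implementation.

```python
import numpy as np

def impact_driven_reward(embed, state, next_state, episodic_counts):
    """Sketch of an impact-driven intrinsic reward in the spirit of RIDE.

    `embed` stands in for a learned representation network; the reward is
    the change in embedding caused by the transition, scaled down by how
    often the resulting state has been visited this episode.
    """
    delta = float(np.linalg.norm(embed(next_state) - embed(state)))
    # Episodic counts discourage revisiting the same state for reward.
    key = tuple(np.round(next_state, 3))
    episodic_counts[key] = episodic_counts.get(key, 0) + 1
    return delta / np.sqrt(episodic_counts[key])

# Toy usage: identity embedding over 2-D states.
counts = {}
r1 = impact_driven_reward(lambda s: s, np.array([0.0, 0.0]),
                          np.array([1.0, 0.0]), counts)
r2 = impact_driven_reward(lambda s: s, np.array([0.0, 0.0]),
                          np.array([1.0, 0.0]), counts)
print(r1, r2)  # the second visit to the same state earns less
```

In procedurally generated environments, where exact states rarely recur across episodes, this episodic discounting is what keeps the signal meaningful.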

This portfolio matters because open-endedness is not a slogan. It is a technical demand: systems should continue to learn without requiring a human to constantly rewrite the task distribution.

The tool-use gap

Before joining DeepMind, Roberta was a Research Scientist at Meta, where she started and led the Tool Use team for Llama 3. This work focused on enabling models to use tools such as search and code execution, and on generalising to new tools at test time. The products that shipped from this work - Meta AI, Data Analyst, AI Studio, Ads Business Agent - are now used by hundreds of millions of people.

She was also a co-author on Toolformer: Language Models Can Teach Themselves to Use Tools (2023), one of the papers that helped establish tool use as a core capability for language models rather than an afterthought. Toolformer showed that a model can learn when and how to call external APIs - calculators, search engines, translators - with minimal supervision, by generating its own training data from a handful of demonstrations.
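The core of that self-supervision is a filtering step: the model annotates text with candidate API calls, and a candidate is kept only if conditioning on the call and its result lowers the model's loss on the tokens that follow. A schematic version of that filter, with illustrative names and threshold rather than the paper's exact values:

```python
def keep_api_call(loss_without_call, loss_with_call, threshold=0.2):
    """Toolformer-style self-supervised filter, schematically.

    A candidate API call annotation survives only if inserting the call
    (and its returned result) into the context reduces the language
    model's loss on the subsequent tokens by at least `threshold`.
    """
    return (loss_without_call - loss_with_call) >= threshold

print(keep_api_call(2.1, 1.5))  # True: the call helped prediction
print(keep_api_call(2.1, 2.0))  # False: negligible benefit
```

The elegance is that no human labels which calls are useful; the model's own predictive loss does the grading.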

Tool use is not a feature checkbox. It changes what we can reasonably ask models to do, because it introduces feedback loops, memory, and failure recovery. It also introduces new failure modes: an agent that can call a tool can also call it badly, repeatedly, and confidently. Roberta’s treatment of agent behaviour as a sequential decision problem with real constraints - not a prompt-engineering exercise - is exactly the lineage you want when the field moves from “can it answer” to “can it execute.”

Why open-endedness is becoming a practical requirement

At DeepMind, Roberta now leads the Open-Endedness team and is building a new Open-Ended Discovery group focused on autonomously discovering novel artefacts - new knowledge, capabilities, or algorithms - in a self-improving loop.

Open-endedness is sometimes framed as a path to general intelligence. In practice, it is also a path to systems that do not collapse outside curated benchmarks. Most real deployments present a shifting distribution: new tools, new data, new user behaviour, and new adversarial pressures. A model that cannot keep learning becomes a periodic retraining job with brittle edges.

At Meta, Roberta also led an “AI Scientist” effort focused on agents that can iterate through parts of the research loop - implementing methods, running experiments, analysing results, and repeating the cycle. That work has now crystallised into MLGym: A New Framework and Benchmark for Advancing AI Research Agents (2025), which positions evaluation around concrete machine learning research tasks and frames the problem in a way that invites iteration by the broader community rather than one-off demos. If “AI scientist” systems are going to matter, we need ways to compare approaches, reproduce results, and identify what actually moves the needle. A benchmark is not the whole answer, but it forces precision about what the agent is allowed to do, what counts as success, and what is being optimised.

Roberta’s background

Roberta received her PhD in Computer Science from NYU in 2021, advised by Rob Fergus. Before that, she studied Astrophysical Sciences at Princeton, where she worked on theoretical cosmology and supernova simulations - and before that, competed in the International Physics Olympiad and the International Olympiad on Astronomy and Astrophysics. That path from physics instincts to sequential decision-making research shows up in her taste for problems where scale alone is not enough.

She also co-developed and co-teaches a course on open-endedness and general intelligence at UCL, which signals something about where the field is heading: this is becoming a discipline with ideas worth teaching, not a loose collection of intuitions.