In my talk at the 2026 GoldLab Symposium, I argue that training models on text is different from training them on observational data at the edges of known science. Success in that regime needs more than just training bigger generalist models on larger datasets. Beyond the immediate problem of data being scarce at the frontiers of science, there’s a bigger issue: there’s no guarantee that models will learn the underlying mechanisms behind the observations when they lack the scaffold of human understanding of the world that’s baked into text. Instead, I show that models “cheat” via memorization and heuristics.

I use AlphaFold to show that models trained on observations generalize only when they are carefully designed with the right inductive priors. In a few experiments, I show that general-purpose models without domain-specific inductive priors don’t reliably learn the underlying mechanisms behind the observations, even with unlimited data.

I’ve since expanded the talk into a full essay, “Why aren’t there more AlphaFolds?”, which goes deeper into the history (scurvy, Kepler, and Bode’s law), AlphaFold’s inductive priors, and the n-body experiments.