Pretrain Experiments

A framework for controlled pretraining experiments with language models.

Take a language model checkpoint, continue training with targeted data interventions, and evaluate the result — all from a single YAML config. Built to support the experiments in Train Once, Answer All (ICLR 2026).
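As a sketch, a run might be described by a config like the following. The field names here are illustrative assumptions, not the framework's actual schema:

```yaml
# Hypothetical config sketch; the real schema may differ.
model:
  checkpoint: ${CHECKPOINT_DIR}/olmo2-7b/step500000  # env var substituted at load time
data:
  interventions:
    - text: "The capital of France is Paris."
      position: 1000000          # token offset at which to inject
evaluation:
  benchmarks: [mmlu, hellaswag]  # run on every saved checkpoint
wandb:
  project: pretrain-experiments
```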

Features

  • Inject texts or tokens at precise positions in the training data

  • Supports OLMo-2 and OLMo-3; extensible to other training frameworks

  • Run benchmarks and custom evaluation scripts on every checkpoint

  • Automatic Weights & Biases logging

  • YAML configs with environment variable substitution and CLI overrides
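The last feature can be sketched in a few lines: `${VAR}` placeholders are resolved from the environment, and dotted-path CLI arguments override nested config keys. This is an illustrative implementation under assumed conventions, not the framework's actual code:

```python
import os
import re

def substitute_env(text, env=None):
    """Replace ${VAR} placeholders with environment values.

    Unset variables are left as-is. Sketch only; the framework's
    actual substitution syntax may differ.
    """
    env = os.environ if env is None else env
    return re.sub(r"\$\{(\w+)\}",
                  lambda m: str(env.get(m.group(1), m.group(0))),
                  text)

def apply_overrides(config, overrides):
    """Apply dotted-path overrides like 'trainer.lr=1e-4' to a nested dict.

    Hypothetical override syntax, shown for illustration.
    """
    for item in overrides:
        path, value = item.split("=", 1)
        keys = path.split(".")
        node = config
        for key in keys[:-1]:
            node = node.setdefault(key, {})
        node[keys[-1]] = value
    return config

# Example: resolve an env var, then override a hyperparameter from the CLI.
raw = "checkpoint: ${CHECKPOINT_DIR}/step500000"
print(substitute_env(raw, env={"CHECKPOINT_DIR": "/data/olmo"}))
# checkpoint: /data/olmo/step500000

cfg = {"trainer": {"lr": "1e-3"}}
apply_overrides(cfg, ["trainer.lr=1e-4"])
```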