SLDBench & SLDAgent: Automated Scaling Law Discovery
We frame scaling law discovery as a benchmarked "scientific agent" task, curating thousands of experiments across diverse settings into SLDBench. We introduce SLDAgent, an evolution-based agent that jointly searches over scaling-law functional forms and fits their parameters. Across tasks, the discovered laws extrapolate more accurately than established human-derived laws, demonstrating their usefulness in both pretraining and finetuning scenarios.
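SLDAgent's actual implementation is not reproduced here, but the loop the description implies (evolve candidate functional forms, fit each form's free parameters, keep what extrapolates best) can be sketched in a few lines. Everything below is an illustrative assumption: the tiny form library, the resampling "mutation", and the synthetic data; fitness is extrapolation R² on the largest held-out runs.

```python
# A minimal toy sketch, NOT the SLDAgent implementation: evolve candidate
# functional forms for L(N), fit each form's free parameters by least
# squares, and select by extrapolation R^2 on the largest held-out runs.
import numpy as np
from scipy.optimize import curve_fit

rng = np.random.default_rng(0)

# Assumed "genome" library of closed forms L(N; a, b, c); the real agent's
# search space is presumably far richer.
FORMS = {
    "power":   lambda N, a, b, c: c + a * N ** (-b),
    "log":     lambda N, a, b, c: c + a / np.log(b * N + 2.0),
    "sqrt":    lambda N, a, b, c: c + a * (b * N) ** (-0.5),
    "inv_lin": lambda N, a, b, c: c + a / (b * N + 1.0),
}

def r2(y, yhat):
    """Coefficient of determination."""
    return 1.0 - np.sum((y - yhat) ** 2) / np.sum((y - y.mean()) ** 2)

def fitness(name, N, loss, n_holdout=3):
    """Fit on the smaller models, score extrapolation on the largest ones."""
    order = np.argsort(N)
    fit_idx, test_idx = order[:-n_holdout], order[-n_holdout:]
    try:
        p, _ = curve_fit(FORMS[name], N[fit_idx], loss[fit_idx],
                         p0=[1.0, 0.5, 1.0], maxfev=20000)
    except (RuntimeError, ValueError):  # non-convergent or invalid fit
        return -np.inf
    return r2(loss[test_idx], FORMS[name](N[test_idx], *p))

def evolve(N, loss, pop_size=4, generations=5):
    pop = list(rng.choice(list(FORMS), size=pop_size))
    for _ in range(generations):
        pop.sort(key=lambda g: fitness(g, N, loss), reverse=True)
        # Keep the fitter half; "mutate" the rest by resampling new forms.
        pop[pop_size // 2:] = rng.choice(list(FORMS),
                                         size=pop_size - pop_size // 2)
    return max(pop, key=lambda g: fitness(g, N, loss))

# Synthetic runs: loss at increasing model sizes N (placeholder data).
N = np.array([1e6, 3e6, 1e7, 3e7, 1e8, 3e8, 1e9])
loss = 2.0 + 8.0 * N ** (-0.3) + rng.normal(0.0, 0.005, N.size)
print("best form:", evolve(N, loss))
```

Selection pressure on extrapolation, rather than in-sample fit, is what makes a discovered law useful at scales beyond the available training runs.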
SLDBench tasks (a worked fitting sketch follows the list):

- Predicts language-modeling loss as a function of model size N and inference parallelism P.
- Models unigram-normalized loss as a function of non-vocabulary model size N, vocabulary size V, and dataset size D.
- Models finetuning loss as a function of supervised fine-tuning dataset size D.
- Models each domain's pretraining loss as a function of that domain's proportion in the training mixture.
- Models loss as a function of network size N and number of experts E.
- Models pretraining loss as a function of network size N, dataset size D, and number of unique tokens U.
- Models pretraining loss as a function of learning rate l, batch size b, dataset size D, and network size N.
- Models performance (Brier score) as a non-monotonic function of compute C, measured in FLOPs.
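Mechanically, every task above reduces to the same pattern: given a table of runs, propose a closed form over the listed variables and fit its constants. As a concrete, hedged instance, the sketch below fits the Chinchilla parametric law (Hoffmann et al., 2022), the canonical human-derived form for loss over N and D and representative of the baselines such agents are measured against; the runs are synthetic placeholders, not SLDBench data.

```python
# Hedged illustration: fitting the Chinchilla parametric form
#     L(N, D) = E + A / N^alpha + B / D^beta
# over model size N and dataset size D. The data below are synthetic
# placeholders generated from the published Chinchilla constants.
import numpy as np
from scipy.optimize import curve_fit

def chinchilla(X, E, A, alpha, B, beta):
    N, D = X
    return E + A / N ** alpha + B / D ** beta

# Synthetic (N, D) -> loss table using the published fit
# E=1.69, A=406.4, alpha=0.34, B=410.7, beta=0.28.
N = np.array([1e7, 1e7, 3e7, 1e8, 1e8, 3e8, 1e9, 1e9])
D = np.array([1e9, 1e10, 3e9, 1e9, 1e10, 3e10, 1e10, 1e11])
loss = chinchilla((N, D), 1.69, 406.4, 0.34, 410.7, 0.28)

params, _ = curve_fit(chinchilla, (N, D), loss,
                      p0=[1.5, 300.0, 0.3, 300.0, 0.3], maxfev=50000)
print(dict(zip(["E", "A", "alpha", "B", "beta"], params.round(3))))
```

The same pattern extends to the other tasks by swapping in their variables (P, V, E, U, l, b, ...) and candidate functional forms.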
| Rank | Agent | Model | Mean R² | Max R² | Runs | Solution |
|---|---|---|---|---|---|---|
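For reference, the R² columns above are the standard coefficient of determination of a discovered law's predictions on held-out points, with the mean and max taken over an agent's independent runs (our reading of the leaderboard; the benchmark defines the exact protocol):

$$R^2 = 1 - \frac{\sum_i (y_i - \hat{y}_i)^2}{\sum_i (y_i - \bar{y})^2}$$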
```bibtex
@article{lin2025sld,
  title={Can Language Models Discover Scaling Laws?},
  author={Lin, Haowei and Ye, Haotian and Feng, Wenzheng and Huang, Quzhe and Li, Yujun and Lim, Hubert and Li, Zhengrui and Wang, Xiangyu and Ma, Jianzhu and Liang, Yitao and Zou, James},
  journal={arXiv preprint arXiv:2507.21184},
  year={2025}
}
```
Want to connect or contribute new scaling law tasks to SLDBench?
Contact: Haowei Lin (linhaowei@pku.edu.cn)