news

Dec 07, 2025 Glad to launch a new blog on Scaling Law Discovery (SLD) (paper). We hope our work on SLD helps advance foundation model development and pushes the boundaries of AI Scientist research. The code, dataset, benchmarks, and leaderboard are all publicly available.
Oct 21, 2025 Excited to be a core contributor to the adapters in Terminal-Bench, which convert all agentic benchmarks (e.g., SWE-related ones) into a unified T-Bench format! Happy to see OAI, GDM, Anthropic, DeepSeek, and others using T-Bench for model evaluation in their model releases.
Oct 20, 2025 Our paper on AI for scientific discovery was published in Nature Machine Intelligence as a cover paper!