
Vocabulary Scaling Law

Agent: gemini-cli
Model: Gemini 2.5 Flash
Best R²: 0.933603
Mean R²: 0.485317
Min R²: -1.000000
Runs: 5

All Runs (sorted by R²)

Best Run 1 R² = 0.933603
Python
def law(input_data: list[dict[str, float]], group: str) -> list[dict[str, float]]:
    """
    Predicts output variables based on input variables according to a discovered scaling law.

    Args:
        input_data: A list of dictionaries, where each dictionary is a single data
                    point containing input variable names as keys and their
                    corresponding values.
        group: The name of the experimental group for which to make predictions.
                The functional form of the law must be the same for all groups,
                but the constant parameters/coefficients can differ per group.

    Returns:
        A list of dictionaries, corresponding to the input_data list, with each
        dictionary containing the predicted output variable(s).
    """
    # Fitted parameters for the 'all_data' group, the only group in this problem.
    # With multiple groups, a dictionary would map each group name to its own
    # parameter set while the functional form stays the same.
    # Values taken from the fitting script output (/app/fitted_params.py).
    A = -5.695759      # asymptotic minimum loss
    B = 6.175213e+03   # coefficient
    C_v = -0.000000    # vocab_size exponent; effectively 0, so vocab_size has no effect
    C_np = -0.040118   # non_vocab_parameters exponent
    C_nc = -0.336321   # num_characters exponent

    predictions = []
    for data_point in input_data:
        vocab_size = data_point['vocab_size']
        non_vocab_parameters = data_point['non_vocab_parameters']
        num_characters = data_point['num_characters']

        predicted_loss = A + B * (vocab_size**C_v) * (non_vocab_parameters**C_np) * (num_characters**C_nc)
        predictions.append({'unigram_normalized_loss': predicted_loss})
    
    return predictions
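
A minimal usage sketch of the fitted form above, restated compactly so it runs on its own. The helper name `predict_loss` and the input values are hypothetical, chosen only for illustration and not taken from the benchmark data. It shows two consequences of the reported exponents: because the vocab_size exponent is effectively 0, changing vocab_size leaves the prediction unchanged, and because the num_characters exponent is negative, more training characters lower the predicted loss.

```python
# Compact restatement of the fitted law (same constants as the submission above).
A = -5.695759
B = 6.175213e+03
C_V = 0.0          # reported as -0.000000, i.e. effectively zero
C_NP = -0.040118
C_NC = -0.336321


def predict_loss(vocab_size: float, non_vocab_parameters: float, num_characters: float) -> float:
    """Hypothetical helper: evaluates A + B * V^C_v * N^C_np * D^C_nc for one point."""
    return A + B * vocab_size**C_V * non_vocab_parameters**C_NP * num_characters**C_NC


# Illustrative inputs only (assumed values, not benchmark data).
base = predict_loss(vocab_size=32_000, non_vocab_parameters=1e8, num_characters=1e10)
bigger_vocab = predict_loss(vocab_size=64_000, non_vocab_parameters=1e8, num_characters=1e10)
more_data = predict_loss(vocab_size=32_000, non_vocab_parameters=1e8, num_characters=1e11)

print(f"base:         {base:.4f}")
print(f"bigger vocab: {bigger_vocab:.4f}")  # same as base, since the vocab exponent is 0
print(f"more data:    {more_data:.4f}")     # lower than base, since C_NC < 0
```

In other words, this fit reduces to a two-variable power law in non_vocab_parameters and num_characters, which may be one reason it does not capture all of the variance in the data.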
#2 Run 2 R² = 0.861121
#3 Run 3 R² = 0.861121
#4 Run 4 R² = 0.770740
#5 Run 5 R² = -1.000000