

Fix Issue #97: Use Heavy-Tailed Distribution for MSELoss to Prevent Partial Computation from Passing Verification #98

Open

doux-jy wants to merge 1 commit into ScalingIntelligence:main from doux-jy:main

Conversation


@doux-jy doux-jy commented Nov 28, 2025


Summary

Replace the uniform distribution with a Pareto distribution in the input generation of level1/94_MSELoss.py, so that incorrect kernel implementations that compute only part of the data no longer pass verification.

Problem

The original implementation uses a uniform distribution for test-data generation:

import torch

# batch_size and input_shape are module-level constants defined earlier in 94_MSELoss.py
def get_inputs():
    scale = torch.rand(())
    return [torch.rand(batch_size, *input_shape) * scale, torch.rand(batch_size, *input_shape)]

Because the uniform distribution has bounded moments, the Law of Large Numbers makes the empirical MSE converge to the same expected value $(2s^2 - 3s + 2)/6$ regardless of sample size. A faulty kernel implementation that reduces over only part of the data therefore still lands within the verification tolerance and passes.
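A quick numerical check illustrates the failure mode (a sketch using NumPy rather than the repository's PyTorch code; `s`, `n`, and the variable names are illustrative): under uniform inputs, the MSE computed over half of the elements is nearly indistinguishable from the MSE over all of them, and both match the analytic expectation.

```python
import numpy as np

rng = np.random.default_rng(0)
s = 0.5           # fixed scale for the demonstration
n = 1_000_000     # number of elements

pred = rng.random(n) * s
target = rng.random(n)

sq_err = (pred - target) ** 2
mse_full = sq_err.mean()            # correct kernel: reduces over all n elements
mse_half = sq_err[: n // 2].mean()  # faulty kernel: reduces over only half the data

# Analytic E[(sX - Y)^2] for X, Y ~ U(0, 1), i.e. (2s^2 - 3s + 2)/6
expected = (2 * s**2 - 3 * s + 2) / 6
print(mse_full, mse_half, expected)
```

Both empirical values land within a fraction of a percent of the expectation, so a tolerance-based accuracy check cannot tell the two reductions apart.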

Solution

Adopt a Pareto distribution, Pareto(scale=0.01, alpha=1.5) (or another heavy-tailed distribution with a divergent second moment):

  • Finite first moment (α = 1.5 > 1): ensures numerical stability
  • Infinite second moment (α = 1.5 ≤ 2): the empirical MSE does not converge and instead grows with sample size

As a result, implementations that reduce over different amounts of data produce markedly different outputs, so kernels that compute only part of the data are correctly detected.
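A minimal sketch of the revised generator, assuming `torch.distributions.Pareto` with the parameters above (the actual diff in 94_MSELoss.py may differ; `batch_size` and `input_shape` are the file's existing module-level constants, stubbed here for illustration):

```python
import torch

batch_size, input_shape = 128, (4096,)  # placeholders; the real file defines its own values

def get_inputs():
    # Pareto(scale=0.01, alpha=1.5): finite mean, infinite variance,
    # so the empirical MSE depends on how many elements are reduced.
    dist = torch.distributions.Pareto(torch.tensor(0.01), torch.tensor(1.5))
    scale = torch.rand(())
    return [
        dist.sample((batch_size, *input_shape)) * scale,
        dist.sample((batch_size, *input_shape)),
    ]
```

Pareto samples are bounded below by the scale parameter, so every generated element is at least 0.01 before the random rescaling.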

Changed Files

  • KernelBench/level1/94_MSELoss.py

@doux-jy doux-jy marked this pull request as draft November 28, 2025 08:44
@doux-jy doux-jy marked this pull request as ready for review November 28, 2025 08:45
