Simulated seasons and rating robustness


Each simulated "season" consists of 10,000 matchups with randomized pairings, and Elo ratings are updated using the elo.cal function. Species whose ratings remain stable across seasons are considered robust to matchup order.
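The season procedure can be sketched roughly as follows. This is a minimal illustration in Python, not the elo.cal function itself: it substitutes the standard logistic Elo update, and the K-factor (16), the starting rating (1000), the `strength` values, and the rule that a matchup is won with probability proportional to strength are all assumptions made for the example. The function names `simulate_season` and `rating_spread` are hypothetical.

```python
import random
import statistics


def expected_score(r_a, r_b):
    """Standard Elo expected score of A against B (logistic curve, 400-point scale)."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))


def simulate_season(species, strength, n_matchups=10_000, k=16.0, start=1000.0, rng=None):
    """Run one season of randomly paired matchups and return the final ratings.

    `strength` is an assumed latent ability per species, used only to
    generate outcomes; it stands in for whatever produces real results.
    """
    rng = rng or random.Random()
    ratings = {s: start for s in species}
    for _ in range(n_matchups):
        a, b = rng.sample(species, 2)  # random pairing of two distinct species
        p_a = strength[a] / (strength[a] + strength[b])  # assumed outcome model
        score_a = 1.0 if rng.random() < p_a else 0.0
        delta = k * (score_a - expected_score(ratings[a], ratings[b]))
        ratings[a] += delta  # zero-sum update: A gains what B loses
        ratings[b] -= delta
    return ratings


def rating_spread(species, strength, n_seasons=20, seed=0, **kwargs):
    """Return {species: (mean rating, standard deviation)} across seasons."""
    rng = random.Random(seed)
    seasons = [simulate_season(species, strength, rng=rng, **kwargs)
               for _ in range(n_seasons)]
    spread = {}
    for s in species:
        finals = [season[s] for season in seasons]
        spread[s] = (statistics.mean(finals), statistics.stdev(finals))
    return spread


if __name__ == "__main__":
    species = ["A", "B", "C"]
    strength = {"A": 4.0, "B": 2.0, "C": 1.0}
    for name, (mean, sd) in rating_spread(species, strength).items():
        print(f"{name}: mean={mean:.1f} sd={sd:.1f}")
```

Under these assumptions, a small standard deviation across seasons is one way to read "stable": the species ends each season at about the same rating regardless of the order in which its matchups occurred. What threshold counts as stable is not specified in the text and would need to be chosen separately.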

Confidence