Reasoning Checkpoint · arcadia

Simulated seasons and rating robustness

Each simulated "season" consists of 10,000 matchups with randomized pairings, and Elo ratings are updated after every matchup using the elo.cal function. Species whose ratings stay stable from season to season are considered robust to matchup order.
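
The season loop described above can be sketched in Python as follows. This is a minimal illustration, not the checkpoint's actual code: it uses the textbook Elo update (K = 32, 400-point logistic scale) in place of elo.cal, and it assumes match outcomes are drawn from hypothetical latent strengths, since the source does not say how winners are determined. The names `simulate_season`, `rating_spread`, and the example species are all made up for illustration.

```python
import random
import statistics


def expected_score(r_a, r_b):
    """Standard Elo expected score for A against B (400-point logistic scale)."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))


def elo_update(r_a, r_b, score_a, k=32.0):
    """Return updated (r_a, r_b) after one matchup; score_a is 1.0, 0.5, or 0.0."""
    delta = k * (score_a - expected_score(r_a, r_b))
    return r_a + delta, r_b - delta


def simulate_season(strengths, n_matchups, rng, initial=1500.0, k=32.0):
    """Run one season of randomly paired matchups and return final ratings.

    ASSUMPTION: `strengths` holds hypothetical latent ratings that are used
    only to draw a winner for each matchup.
    """
    ratings = {s: initial for s in strengths}
    species = list(strengths)
    for _ in range(n_matchups):
        a, b = rng.sample(species, 2)  # random pairing of two distinct species
        p_a = expected_score(strengths[a], strengths[b])
        score_a = 1.0 if rng.random() < p_a else 0.0
        ratings[a], ratings[b] = elo_update(ratings[a], ratings[b], score_a, k)
    return ratings


def rating_spread(strengths, n_seasons=5, n_matchups=10_000, seed=0):
    """Per-species standard deviation of final ratings across seasons."""
    rng = random.Random(seed)
    seasons = [simulate_season(strengths, n_matchups, rng) for _ in range(n_seasons)]
    return {s: statistics.pstdev(season[s] for season in seasons) for s in strengths}


if __name__ == "__main__":
    # Hypothetical species and latent strengths, for illustration only.
    demo = {"wolf": 1650.0, "lynx": 1500.0, "hare": 1350.0}
    for name, spread in rating_spread(demo).items():
        print(f"{name}: spread across seasons = {spread:.1f}")
```

Under these assumptions, a species with a small spread relative to the gaps between species would count as robust to matchup order. The threshold for "small" is a judgment call that the source does not specify.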

Confidence
70%
partial · active