Association·arcadia

Higher validation set similarity to training data increases data leakage

Model performance increases as the validation set becomes more similar to the training set, indicating data leakage.
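
One way to check for this pattern is to split validation items by how closely they resemble any training item and compare performance across the two groups. The sketch below is illustrative only and is not taken from the source: the function names, the 0.8 threshold, and the use of Python's difflib ratio as a stand-in for alignment-based sequence identity are all assumptions.

```python
from difflib import SequenceMatcher


def max_train_similarity(val_seq, train_seqs):
    """Highest similarity ratio (0..1) between one validation sequence
    and any training sequence. difflib's ratio is only a rough proxy
    for alignment-based sequence identity (assumption)."""
    return max(
        SequenceMatcher(None, val_seq, t, autojunk=False).ratio()
        for t in train_seqs
    )


def accuracy_by_similarity(val_seqs, correct, train_seqs, threshold=0.8):
    """Split validation items at a hypothetical similarity threshold and
    return the fraction of correct predictions in each group."""
    groups = {"high": [], "low": []}
    for seq, ok in zip(val_seqs, correct):
        similar = max_train_similarity(seq, train_seqs) >= threshold
        groups["high" if similar else "low"].append(ok)
    # A group with no members reports None instead of dividing by zero.
    return {k: (sum(v) / len(v) if v else None) for k, v in groups.items()}
```

If the "high" group scores markedly better than the "low" group, that is consistent with the leakage described here, although it does not prove it on its own.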

Confidence
90%
active

Evidence Quote

Model performance better for validation sets with higher similarity to training data due to data leakage

Relationship

Validation set sequence similarity increases Data leakage