Factor·arcadia

Naive clustering-based data split

A popular clustering-based data-splitting method that yields higher apparent model performance because of increased data leakage between the training and test sets.
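The record does not spell out how the naive split is built, so the following is only a minimal sketch of a generic cluster-level split, together with one simple way to probe for leakage. It assumes cluster labels are already available and that Euclidean distance is a meaningful similarity measure. The names `cluster_split` and `leakage_rate`, the toy points, and the distance threshold are all hypothetical and chosen for illustration.

```python
import math
import random
from collections import defaultdict


def cluster_split(labels, test_fraction=0.2, seed=0):
    """Assign whole clusters to train or test; returns (train_idx, test_idx)."""
    clusters = defaultdict(list)
    for idx, label in enumerate(labels):
        clusters[label].append(idx)
    order = sorted(clusters)
    random.Random(seed).shuffle(order)
    target = test_fraction * len(labels)
    train_idx, test_idx = [], []
    for label in order:
        # Fill the test set cluster by cluster until it reaches the target size.
        if len(test_idx) < target:
            test_idx.extend(clusters[label])
        else:
            train_idx.extend(clusters[label])
    return sorted(train_idx), sorted(test_idx)


def leakage_rate(points, train_idx, test_idx, threshold):
    """Fraction of test points with at least one training point within `threshold`."""
    if not test_idx:
        return 0.0
    close = 0
    for t in test_idx:
        if any(math.dist(points[t], points[r]) <= threshold for r in train_idx):
            close += 1
    return close / len(test_idx)


# Toy data (assumed): cluster 2 sits right next to cluster 0.
points = [(0.0, 0.0), (0.1, 0.0), (0.2, 0.1), (5.0, 5.0), (5.1, 5.0), (0.3, 0.1)]
labels = [0, 0, 0, 1, 1, 2]
train_idx, test_idx = cluster_split(labels, test_fraction=0.3, seed=0)
print(train_idx, test_idx, leakage_rate(points, train_idx, test_idx, threshold=0.5))
```

Even when no cluster is shared between the two sets, a test cluster that lies close to a training cluster can still produce a nonzero leakage rate, which is one way a clustering-based split may overstate performance.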

Confidence
80%
active

Source

Study reporting naive split approach and effects on model performance