Association·arcadia
Pseudoreplication and non-independence limit language model generalizability
Claims that human language datasets also exhibit non-independence and pseudoreplication which, if unaccounted for, limit the generalizability of linguistic models.
Confidence
80%
active
Evidence Quote
“Human language datasets also display pseudoreplication and non-independence ... limit generalizability of linguistic models.”
Relationship
Phylogenetic bias decreases Machine learning models for protein design