Association · arcadia
Filtering stringency affects data leakage
More stringent sequence similarity filtering reduces the likelihood of data leakage, which in turn affects model performance
Confidence
90%
active
Evidence Quote
“Stringency of sequence similarity filtering is a proxy for likelihood of data leakage”
Relationship
Sequence similarity filtering stringency decreases Data leakage
Connections (5)
Data leakage and training data bias impact model performance · InferenceChain
Genome contamination yields HGT false positives · Association
Impact of sequence similarity filtering on data leakage and sequence diversity · InferenceChain
Data leakage and training data biases impact model performance · InferenceChain
Data leakage and biases impact biological foundation model performance · InferenceChain