Association·arcadia

User-level data curation bias affects performance

User curation of training data creates biases influencing model learning and judgments

Confidence
90%
active

Evidence Quote

“User-level bias in training data curation influences model learning and plausibility judgments”

Relationship

User-level bias in training data curation influences pLM performance bias

Arguments

User-level bias in training data curation (subject)
plm-utils Python package (object)

Connections (3)

Data leakage and training data bias impact model performance (InferenceChain)
Elo ratings updated using R package elo (Association)
Data leakage and biases impact biological foundation model performance (InferenceChain)

Evidence

“Preprint discussing how protein language model fitness scores reflect preferences rather than absolute fitness”

Gordon C et al. (2024). Protein Language Model Fitness Is a Matter of Preference. doi:10.1101/2024.10.03.616542

“Evidence describing NIH support for model organism research, as reported by Lauer et al.”

A Look at NIH Support for Model Organisms, Part Two

“Evidence summarizing trends in publications for research involving model organisms, backed by the Dietrich reference.”

Publication Trends in Model Organism Research

“Evidence summarizing trends and insights on animal models used in preclinical gene therapy product studies.”

A review of animal models utilized in preclinical studies of approved gene therapy products: trends and insights

“Evidence summarizing mouse models of human disease from an evolutionary perspective.”

Mouse Models of Human Disease: An Evolutionary Perspective