Factor·arcadia
Protein redundancy reduction via CD-HIT
CD-HIT is used to cluster proteins and reduce redundancy, retaining a single representative per cluster. The similarity threshold is 0.90 for transcriptomes and 0.95 for genome-derived datasets, excluding UniProt.
Confidence
90%
active
Source
Methods section of the main paper
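
The clustering step described above can be sketched as a small helper that assembles a CD-HIT command line for each dataset type. This is a minimal sketch, not the authors' pipeline: the function name, the dataset-type labels, and the file names are hypothetical, and mapping the stated similarity threshold onto CD-HIT's `-c` (sequence identity) option with word size `-n 5` is an assumption not confirmed by the source.

```python
import shlex

# Thresholds taken from the factor description; the dictionary keys are
# hypothetical labels chosen for this sketch.
IDENTITY_THRESHOLDS = {
    "transcriptome": 0.90,
    "genome": 0.95,
}


def build_cdhit_command(input_fasta, output_fasta, dataset_type):
    """Return the argument list for a CD-HIT run at the threshold for dataset_type."""
    threshold = IDENTITY_THRESHOLDS[dataset_type]
    return [
        "cd-hit",
        "-i", input_fasta,          # input protein FASTA
        "-o", output_fasta,         # output FASTA with one representative per cluster
        "-c", f"{threshold:.2f}",   # sequence identity threshold (assumed mapping)
        "-n", "5",                  # word size; assumed, suits thresholds of 0.7 to 1.0
    ]


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    cmd = build_cdhit_command("proteins.faa", "proteins.nr.faa", "transcriptome")
    print(shlex.join(cmd))
```

The helper only builds the argument list, so it can be inspected or logged before being passed to `subprocess.run`; an unknown dataset type raises `KeyError` rather than silently falling back to a default threshold.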