Factor·arcadia

Protein redundancy reduction via CD-HIT

Proteins are clustered with CD-HIT to reduce redundancy, retaining a single representative per cluster. The similarity threshold is 0.90 for transcriptomes and 0.95 for genome-derived datasets (excluding UniProt).
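
As an illustration only, the clustering step might be scripted roughly as follows. This is a minimal sketch, not the authors' pipeline: the function names, the dataset labels "transcriptome" and "genome", and the file paths are hypothetical, and it assumes the `cd-hit` executable is on the PATH. Only the two threshold values come from the description above. The flags used (`-i`, `-o`, `-c`, `-n`) are standard CD-HIT options; `-n 5` is the word size CD-HIT recommends for thresholds between 0.7 and 1.0.

```python
import subprocess

# Thresholds from the description; the keys are hypothetical labels.
THRESHOLDS = {"transcriptome": 0.90, "genome": 0.95}


def build_cdhit_command(input_fasta, output_fasta, source_type):
    """Build the cd-hit argument list for one protein FASTA file."""
    threshold = THRESHOLDS[source_type]  # KeyError for unknown dataset types
    return [
        "cd-hit",
        "-i", input_fasta,          # input protein FASTA
        "-o", output_fasta,         # representative sequences are written here
        "-c", f"{threshold:.2f}",   # sequence identity threshold
        "-n", "5",                  # word size suited to thresholds 0.7-1.0
    ]


def run_cdhit(input_fasta, output_fasta, source_type):
    """Run cd-hit and return the path to the representative sequences."""
    cmd = build_cdhit_command(input_fasta, output_fasta, source_type)
    subprocess.run(cmd, check=True)  # raises CalledProcessError on failure
    return output_fasta
```

A call such as `run_cdhit("species.faa", "species.nr.faa", "transcriptome")` would leave one representative per cluster in `species.nr.faa`; CD-HIT also writes the cluster membership to `species.nr.faa.clstr`.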

Confidence
90%
active

Source

Methods section of the main paper