Association·arcadia

Clustering reduces database size, maintains diversity

Claim: clustering the NCBI nr (non-redundant) protein database by sequence similarity reduces its size by more than half while preserving taxonomic diversity in search results.

Confidence
90%
active

Evidence Quote

Clustering the NCBI non-redundant protein database collapses similar sequences, reduces database by over half, and maintains diversity.