Comparison of species richness estimators.
A–D The Chao1bc (blue), ACE (grey), Bootstrap (green), Good-Turing (black), and negative-exponential (orange) estimators are applied to in silico random subsamples of the observed data. Examples for HTLV-1, microbial, and TCR data are shown. Estimates systematically increase with sample size in datasets where rarefaction curves do not plateau (e.g. in I, J, K). Where rarefaction curves do plateau (e.g. in L), estimates are consistent across sample sizes. E–H DivE (red) is applied to the same subsamples as the other estimators. Performance of DivE was evaluated by comparing each estimate (Ŝobs) to the known number of species Sobs in the full observed data (purple line), i.e. error = |Sobs - Ŝobs|/Sobs. In all datasets, DivE accurately estimates the species richness of the full observed data from subsamples of that data. I–L Corresponding HTLV-1, microbial, and TCR rarefaction curves; arrows denote the size of the subsample to which each estimator was applied.
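
The subsampling and error calculation described in this caption can be illustrated with a minimal Python sketch. It is not the authors' code: the function names (`subsample`, `chao1_bc`, `relative_error`) are hypothetical, the data are assumed to be a flat list with one species label per sampled individual, and Chao1bc is assumed to take the commonly used bias-corrected form S_obs + f1(f1 - 1) / (2(f2 + 1)), where f1 and f2 are the numbers of species seen exactly once and exactly twice.

```python
import random
from collections import Counter


def subsample(individuals, n, seed=0):
    """Draw n individuals without replacement (an in silico random subsample)."""
    rng = random.Random(seed)
    return rng.sample(individuals, n)


def chao1_bc(sample):
    """Bias-corrected Chao1 estimate (assumed form, see lead-in)."""
    abundance = Counter(sample)               # species -> number of individuals
    freq_of_freq = Counter(abundance.values())
    s_obs = len(abundance)                    # species observed in this sample
    f1 = freq_of_freq.get(1, 0)               # singletons
    f2 = freq_of_freq.get(2, 0)               # doubletons
    return s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))


def relative_error(s_true, s_est):
    """error = |Sobs - S_hat_obs| / Sobs, as defined in the caption."""
    return abs(s_true - s_est) / s_true
```

A typical use would be to subsample the full data at several sizes, apply an estimator to each subsample, and compare each estimate against the number of species in the full observed data with `relative_error`.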