Classic formula, updated algorithm and application of rarefaction: bias correction in fossil diversity through subsampling
Abstract:
The taxonomic diversity of paleocommunities is a key metric for tracing the evolution of life and the geological events that underlie it. However, the taxonomic richness of fossil collections or compiled data is easily biased by differences in sample size. Rarefaction is a routine statistical method for mitigating such biases by reducing larger collections to the sample size of the smaller ones. In the literature, traditional individual-based rarefaction has increasingly been superseded by coverage-based rarefaction (called shareholder quorum subsampling, SQS, by some paleontologists). However, some case studies still show misunderstandings of this long-standing method, and coverage-based rarefaction has rarely been explained in the Chinese literature. To support better application of the method, this paper introduces the principles, calculation details and recommended usage of rarefaction techniques. The core idea of rarefaction is to resample randomly from the original samples until the subsamples reach a common sampling level, and then to compare the mathematical expectation of the taxonomic richness of these subsamples. The traditional method defines the common level as an equal sample size, such as the number of specimens or of fossil occurrences in the literature. One major drawback of this method is that the information in larger samples is often severely compressed. To address this problem, the updated method, coverage-based rarefaction, resamples until equal sample coverage is achieved. Coverage is measured as the sum of the relative frequencies, in the community, of the taxa present in the subsample. It has been well demonstrated that the updated method reflects the true ratio of taxonomic richness among communities more faithfully.
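As an illustration of the traditional, individual-based approach, the expected richness of a subsample of n specimens can be written down analytically as the standard hypergeometric expectation for sampling without replacement. The sketch below is a minimal version under that assumption; the function name `rarefied_richness` and the two collections are hypothetical and do not come from the paper.

```python
from math import comb


def rarefied_richness(counts, n):
    """Expected number of taxa in a random subsample of n specimens.

    counts: specimen counts per taxon in the full collection.
    The subsample is drawn without replacement, so the chance that a taxon
    with c specimens is missed entirely is C(N - c, n) / C(N, n).
    """
    total = sum(counts)
    if not 0 <= n <= total:
        raise ValueError("n must lie between 0 and the total specimen count")
    denom = comb(total, n)
    # Sum, over taxa, the probability that the taxon appears at least once.
    return sum(1 - comb(total - c, n) / denom for c in counts if c > 0)


# Hypothetical specimen counts per taxon for two collections of unequal size.
large_collection = [40, 25, 15, 10, 5, 3, 1, 1]
small_collection = [12, 6, 4, 2, 1]

# Compare both collections at the size of the smaller one.
n = min(sum(large_collection), sum(small_collection))
print(rarefied_richness(large_collection, n))
print(rarefied_richness(small_collection, n))
```

Because the larger collection is cut down to the size of the smaller one, most of its specimens play no part in the comparison, which is the compression of information described above.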
Both the traditional and the updated rarefaction methods can be implemented either by algorithmic simulation or by analytical derivation, and software such as PAST or iNEXT makes them convenient to apply. The primary requirement for applying rarefaction is that the samples at hand represent the paleocommunity as faithfully as possible. We also suggest several potential directions for further developing rarefaction techniques in quantitative paleontology.
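The simulation route for the coverage-based method can be sketched as follows. This is a simplified illustration, not the exact SQS algorithm nor the estimator implemented in iNEXT: it assumes that the relative frequencies in the full sample stand in for the unknown community frequencies, and the function names, the `quorum` parameter and the default number of trials are all hypothetical. Good's estimator (one minus the proportion of specimens that are singletons) is included as a common way to gauge how complete a sample is.

```python
import random


def goods_coverage(counts):
    """Good's estimate of sample coverage: 1 - singletons / specimens."""
    n = sum(counts)
    singletons = sum(1 for c in counts if c == 1)
    return 1 - singletons / n


def coverage_rarefied_richness(counts, quorum, trials=1000, seed=0):
    """Mean richness of random subsamples grown until coverage >= quorum.

    Assumption: relative frequencies in the full sample approximate the
    community frequencies, so coverage is the summed frequency of the
    taxa already drawn.
    """
    if not 0 < quorum <= 1:
        raise ValueError("quorum must be in (0, 1]")
    rng = random.Random(seed)
    total = sum(counts)
    freqs = [c / total for c in counts]
    # One list entry per specimen, labelled by its taxon index.
    specimens = [taxon for taxon, c in enumerate(counts) for _ in range(c)]
    richness_sum = 0
    for _ in range(trials):
        rng.shuffle(specimens)
        seen = set()
        covered = 0.0
        for taxon in specimens:
            if taxon not in seen:
                seen.add(taxon)
                covered += freqs[taxon]
                # Small tolerance guards against floating-point round-off.
                if covered >= quorum - 1e-12:
                    break
        richness_sum += len(seen)
    return richness_sum / trials
```

In use, two collections would be compared at the same `quorum`, chosen no higher than the lowest estimated coverage among them, rather than at the same number of specimens.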