
The minimum sample size for DNA Barcoding

We recently published a paper on the minimum sample size for DNA barcoding in the journal Ecology and Evolution (doi: 10.1002/ece3.1846). The study used simulated datasets to examine the effects of sample size on four estimators of genetic diversity: mismatch distribution, nucleotide diversity, the number of haplotypes, and maximum pairwise distance. Consistent with an earlier study by Ai-Bing ZHANG et al. (2010, doi:10.1016/j.ympev.2009.09.014), it confirms that larger sample sizes give better results in DNA barcoding. We also found that a minimum of 20 individuals is needed for each subsample.

Dr. A-Rong LUO led the project. She collaborated with researchers and students at Yunnan University, Beijing University of Chemical Technology, Capital Normal University, and the University of Sydney. Mr. Hai-Qiang LAN, a graduate student jointly supervised by Yunnan University of Finance and Economics and the Institute of Zoology, Chinese Academy of Sciences, completed his thesis during the project. The project was mainly supported by grants from the National Natural Science Foundation of China and partially supported by a program of the Ministry of Science and Technology of the People's Republic of China.

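We recently published a paper in Ecology and Evolution on the minimum sample size for DNA barcoding (doi: 10.1002/ece3.1846). Using simulated data, the work compared the effect of sample size on four measures of genetic diversity: mismatch distribution, nucleotide diversity, number of haplotypes, and maximum pairwise distance. In agreement with Ai-Bing ZHANG et al. (2010, doi:10.1016/j.ympev.2009.09.014), we found that the larger the sample size, the better the results of DNA barcoding. Our results also indicate that at least 20 individuals should be sampled from each subpopulation.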

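Dr. A-Rong LUO is the first author. She completed the work in collaboration with researchers from Yunnan University of Finance and Economics, Beijing University of Chemical Technology, Capital Normal University, and the University of Sydney. Through this project, Yunnan University and the Institute of Zoology, Chinese Academy of Sciences, jointly trained a master's student, Hai-Qiang LAN, who completed a thesis. The work was mainly supported by General Program and special discipline program grants from the National Natural Science Foundation of China, and partially supported by a basic research special program of the Ministry of Science and Technology.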

Luo, A., Lan, H., Ling, C., Zhang, A., Shi, L., Ho, S. Y. W. and Zhu, C. (2015), A simulation study of sample size for DNA barcoding. Ecol Evol, 5: 5869–5879. doi:10.1002/ece3.1846

English Abstract:

For some groups of organisms, DNA barcoding can provide a useful tool in taxonomy, evolutionary biology, and biodiversity assessment. However, the efficacy of DNA barcoding depends on the degree of sampling per species, because a large enough sample size is needed to provide a reliable estimate of genetic polymorphism and for delimiting species. We used a simulation approach to examine the effects of sample size on four estimators of genetic polymorphism related to DNA barcoding: mismatch distribution, nucleotide diversity, the number of haplotypes, and maximum pairwise distance. Our results showed that mismatch distributions derived from subsamples of ≥20 individuals usually bore a close resemblance to that of the full dataset. Estimates of nucleotide diversity from subsamples of ≥20 individuals tended to be bell-shaped around that of the full dataset, whereas estimates from smaller subsamples were not. As expected, greater sampling generally led to an increase in the number of haplotypes. We also found that subsamples of ≥20 individuals allowed a good estimate of the maximum pairwise distance of the full dataset, while smaller ones were associated with a high probability of underestimation. Overall, our study confirms the expectation that larger samples are beneficial for the efficacy of DNA barcoding and suggests that a minimum sample size of 20 individuals is needed in practice for each population.

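Chinese Abstract (translated into English):

DNA barcoding can play a useful supporting role in taxonomy, evolutionary biology, and biodiversity assessment for some groups of organisms. However, its efficacy depends on the degree of sampling per species: only a sufficiently large sample allows genetic diversity to be estimated reliably and species to be delimited accurately. Using simulated data, we analysed four measures of genetic diversity related to DNA barcoding: mismatch distribution, nucleotide diversity, number of haplotypes, and maximum pairwise distance. Our results show that mismatch distributions from subsamples of 20 or more individuals resemble that of the full dataset; that nucleotide diversity estimates from subsamples of 20 or more individuals form a bell-shaped distribution around the full-dataset value, whereas those from subsamples of fewer than 20 individuals do not; that increasing the sample size generally increases the number of haplotypes; and that subsamples of 20 or more individuals give a good estimate of the maximum pairwise distance of the full dataset. In summary, our study confirms the importance of sample size in DNA barcoding and recommends sampling at least 20 individuals per population.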
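
The subsampling idea described in the abstract can be illustrated with a small Python sketch. It is not the procedure used in the paper, which this post does not detail. The toy population generator (`toy_population`), the function names, and every parameter value are hypothetical choices made for illustration only. The sketch draws random subsamples of different sizes from a full dataset of aligned sequences and computes three of the four estimators per subsample (nucleotide diversity as the mean pairwise proportion of differing sites, the number of haplotypes, and the maximum pairwise distance), plus a helper for the mismatch distribution.

```python
import random
from collections import Counter
from itertools import combinations


def count_differences(a, b):
    """Number of sites at which two aligned sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def mismatch_distribution(seqs):
    """Counter mapping 'number of differing sites' to 'number of pairs'."""
    return Counter(count_differences(a, b) for a, b in combinations(seqs, 2))


def nucleotide_diversity(seqs):
    """Mean proportion of differing sites over all pairs of sequences."""
    length = len(seqs[0])
    dists = [count_differences(a, b) / length for a, b in combinations(seqs, 2)]
    return sum(dists) / len(dists)


def haplotype_count(seqs):
    """Number of distinct sequences in the sample."""
    return len(set(seqs))


def max_pairwise_distance(seqs):
    """Largest proportion of differing sites between any two sequences."""
    length = len(seqs[0])
    return max(count_differences(a, b) for a, b in combinations(seqs, 2)) / length


def toy_population(n, length, mutation_rate, rng):
    """Hypothetical toy data: n independently mutated copies of one ancestor.

    A 'mutated' site is redrawn from ACGT, so it may keep its original base.
    """
    ancestor = [rng.choice("ACGT") for _ in range(length)]
    population = []
    for _ in range(n):
        seq = [rng.choice("ACGT") if rng.random() < mutation_rate else base
               for base in ancestor]
        population.append("".join(seq))
    return population


def subsample_summary(seqs, size, replicates, rng):
    """Draw subsamples without replacement and summarise each one."""
    rows = []
    for _ in range(replicates):
        sub = rng.sample(seqs, size)
        rows.append((nucleotide_diversity(sub),
                     haplotype_count(sub),
                     max_pairwise_distance(sub)))
    return rows


if __name__ == "__main__":
    rng = random.Random(1)  # fixed seed so repeated runs give the same numbers
    full = toy_population(n=100, length=200, mutation_rate=0.01, rng=rng)
    full_max = max_pairwise_distance(full)
    for size in (5, 10, 20, 40):
        rows = subsample_summary(full, size, replicates=30, rng=rng)
        mean_pi = sum(r[0] for r in rows) / len(rows)
        mean_haps = sum(r[1] for r in rows) / len(rows)
        under = sum(r[2] < full_max for r in rows) / len(rows)
        print(f"n={size:3d}  mean pi={mean_pi:.4f}  "
              f"mean haplotypes={mean_haps:.1f}  "
              f"max distance underestimated in {under:.0%} of subsamples")
```

Because a subsample is a subset of the full dataset, its maximum pairwise distance can never exceed the full-dataset value, so the last column only ever reports underestimation. With real barcode alignments, the list returned by `toy_population` would be replaced by the aligned sequences of one species or population.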
