Avoiding common problems with statistical analysis of biological experiments using a simple nested data simulator

March 12, 2021
March 29, 2021
March 31, 2021
Despite an extensive literature on statistical methods and their proper application to biological data, incorrect analyses remain a critical and widespread problem in research papers. The inherently hierarchical (nested, clustered) structure of biological measurements is often erroneously neglected, leading to pseudo-replication and false-positive results. This, in turn, complicates the correct assessment of statistical power and impairs the optimal planning of experiments. To draw more attention to this problem and to illustrate the importance of directly accounting for the nested structure of biological data, we present a simple open-source simulator of two-level, normally distributed stochastic data. By defining the ‘true’ mean values and the ‘true’ intra- and inter-cluster variances of the simulated data, users can test various scenarios and appreciate both the importance of correct multi-level analysis and the danger of neglecting information about the data structure. Here we apply our nested data simulator to highlight some common mistakes in data analysis and propose a workflow in which the simulator can be used to correctly compare two nested groups of experimental data and to plan new experiments so as to increase statistical power when necessary.
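The idea of a two-level simulator can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names (`simulate_nested_group`, `naive_sem`, `cluster_sem`) and all parameter values are hypothetical, chosen only to show how a 'true' mean, an inter-cluster spread and an intra-cluster spread generate nested data, and how ignoring the clusters shrinks the apparent standard error.

```python
import random
import statistics


def simulate_nested_group(true_mean, inter_sd, intra_sd, n_clusters, n_per_cluster, rng):
    """Return a list of clusters, each a list of normally distributed measurements."""
    group = []
    for _ in range(n_clusters):
        # Each cluster (e.g. one animal or one culture) gets its own mean,
        # drawn around the group's 'true' mean with the inter-cluster spread.
        cluster_mean = rng.gauss(true_mean, inter_sd)
        # Measurements within the cluster scatter around that cluster mean.
        group.append([rng.gauss(cluster_mean, intra_sd) for _ in range(n_per_cluster)])
    return group


def naive_sem(group):
    """Standard error that treats every measurement as independent (pseudo-replication)."""
    pooled = [x for cluster in group for x in cluster]
    return statistics.stdev(pooled) / len(pooled) ** 0.5


def cluster_sem(group):
    """Standard error computed from cluster means, so n is the number of clusters."""
    means = [statistics.fmean(cluster) for cluster in group]
    return statistics.stdev(means) / len(means) ** 0.5


# Hypothetical scenario: 5 clusters with 50 measurements each.
rng = random.Random(0)  # fixed seed so the run is reproducible
group = simulate_nested_group(
    true_mean=1.0, inter_sd=0.5, intra_sd=0.2,
    n_clusters=5, n_per_cluster=50, rng=rng,
)
print("naive SEM (all points pooled):", naive_sem(group))
print("SEM from cluster means:       ", cluster_sem(group))
```

When the inter-cluster spread is not negligible, the pooled estimate is typically much smaller than the one based on cluster means. That overstated precision is what produces false positives when nested data are analysed as if all measurements were independent.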