Open access In silico systems

Avoiding common problems with statistical analysis of biological experiments using a simple nested data simulator

Veronika Alexandrova¹, Mikhail Anisimov¹,², Igor Eltsov³, Anastasia Kilina¹, Iuliia Lopanskaia¹, Liubov Makarova³, Maxim Vovchenko³, Nikita Gudimchuk¹,⁴,²

Correspondence (Nikita Gudimchuk): [email protected]

1. Center For Theoretical Problems of Physico-Chemical Pharmacology RAS, 109029, Srednyaya Kalitnikovskaya 30, Moscow, Russia

2. Lomonosov Moscow State University, Physics Faculty, 119991, Leninskiye Gory 1-2, Moscow, Russia

3. Moscow Institute of Physics and Technology, Dolgoprudny, Russia

4. Dmitriy Rogachev National Medical Research Center of Pediatric Hematology, Oncology and Immunology, Ministry of Healthcare of the Russian Federation, 117997, Samory Machela 1, Moscow, Russia

Despite an extensive literature on statistical methods and their proper application to biological data, incorrect analyses remain a critical and widespread problem in research papers. The inherently hierarchical (nested, clustered) structure of biological measurements is often erroneously neglected, leading to pseudo-replication and false positive results. This, in turn, complicates the correct assessment of statistical power and impairs optimal planning of experiments. To draw more attention to this problem and to illustrate the importance of directly accounting for the nested structure of biological data, we present a simple open-source simulator of two-level, normally distributed stochastic data. By defining the ‘true’ mean values and the ‘true’ intra- and inter-cluster variances of the simulated data, users can test various scenarios and appreciate both the importance of correct multi-level analysis and the danger of neglecting information about the data structure. Here we apply our nested data simulator to highlight some common mistakes in data analysis and propose a workflow in which the simulator can be used to correctly compare two nested groups of experimental data and to plan new experiments optimally, increasing statistical power when necessary.
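The effect of pseudo-replication described above can be illustrated with a minimal sketch. This is not the simulator presented in the article; it is a hypothetical standard-library Python example that assumes two groups with identical 'true' means, 3 clusters per group, 10 measurements per cluster, and equal inter- and intra-cluster standard deviations. All function names and parameter values are illustrative assumptions, and the two-sided 5% critical values of Student's t distribution are hard-coded for the corresponding degrees of freedom (58 for pooled measurements, 4 for cluster means).

```python
import random
import statistics


def simulate_group(true_mean, sd_between, sd_within, n_clusters, n_per_cluster, rng):
    """Simulate two-level normal data: a list of clusters, each a list of measurements."""
    clusters = []
    for _ in range(n_clusters):
        # Each cluster gets its own mean, scattered around the group's 'true' mean.
        cluster_mean = rng.gauss(true_mean, sd_between)
        clusters.append([rng.gauss(cluster_mean, sd_within) for _ in range(n_per_cluster)])
    return clusters


def t_statistic(a, b):
    """Two-sample Student's t statistic with pooled variance."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * statistics.variance(a)
                  + (nb - 1) * statistics.variance(b)) / (na + nb - 2)
    return (statistics.fmean(a) - statistics.fmean(b)) / (pooled_var * (1 / na + 1 / nb)) ** 0.5


def false_positive_rates(n_trials=2000, seed=1):
    """Fraction of 'significant' results when both groups share the same true mean."""
    rng = random.Random(seed)
    t_crit_pooled = 2.002   # two-sided 5% critical value, df = 2 * 30 - 2 = 58
    t_crit_cluster = 2.776  # two-sided 5% critical value, df = 2 * 3 - 2 = 4
    naive_hits = nested_hits = 0
    for _ in range(n_trials):
        g1 = simulate_group(10.0, 1.0, 1.0, 3, 10, rng)
        g2 = simulate_group(10.0, 1.0, 1.0, 3, 10, rng)
        # Naive analysis: pool all measurements and ignore the cluster structure.
        pooled1 = [x for cluster in g1 for x in cluster]
        pooled2 = [x for cluster in g2 for x in cluster]
        if abs(t_statistic(pooled1, pooled2)) > t_crit_pooled:
            naive_hits += 1
        # Nested-aware analysis: compare cluster means, one value per cluster.
        means1 = [statistics.fmean(cluster) for cluster in g1]
        means2 = [statistics.fmean(cluster) for cluster in g2]
        if abs(t_statistic(means1, means2)) > t_crit_cluster:
            nested_hits += 1
    return naive_hits / n_trials, nested_hits / n_trials


naive_rate, nested_rate = false_positive_rates()
print(f"false positive rate, pooled measurements: {naive_rate:.3f}")
print(f"false positive rate, cluster means:       {nested_rate:.3f}")
```

Under these assumptions, the pooled analysis should report a 'significant' difference far more often than the nominal 5%, whereas the comparison of cluster means should stay close to it. Comparing cluster means is only one simple way to respect the nesting; it is used here for brevity and is not necessarily the analysis the article recommends.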

Schematic of a typical biological experiment design, generating nested data

#nested data #statistical analysis #p-value #false positive #false negative #statistical power #simulated data #intra-cluster correlation