Studies and research design in medicine

N. M. Bulanov; O. B. Blyuss; D. B. Munblit; N. A. Nekliudov; D. V. Butnaru; K. B. Kodzoeva; M. Yu. Nadinskaya; A. A. Zaikin

doi:10.47093/2218-7332.2021.12.1.4-17

Abstract

Adequate design is an essential condition for conducting a successful study. This review describes the most common types of research design in medicine. We discuss the differences between the types of observational and interventional studies, their advantages and limitations, and provide examples for each study design. The concept of bias and its potential sources in different studies are covered. We suggest the most suitable approaches to study design for different research objectives and outline approaches to data presentation. During the last decades, several guidelines for conducting and reporting different types of research were proposed and they are also covered in this manuscript.

Keywords

statistics, case report, cross-sectional studies, case-control studies, cohort studies, randomised controlled trials as Topic, meta-analysis as Topic, bias

Scientific studies in medicine can be described as ‘a planned and systematic effort based on evidence for the solution of any health problems using data with high degree of accuracy’ [1]. Thus, the aim of medical research is not limited to the acquisition of pure scientific knowledge, but also implies a beneficial contribution to the wellbeing of individual patients and public health. This imposes great responsibility on scientists and physicians organising and undertaking medical studies. The chances of conducting successful research and obtaining high-quality data depend on the adequacy of the study design, a concept which is often underestimated and overlooked. However, in contrast to errors in the application of statistical methods, it is almost impossible to correct failures of study design after the research has been conducted. Poor planning of the research design is one of the most common causes of manuscript rejection by scientific journals [2]. Unfortunately, inadequate research design is one of the major problems in studies conducted by postgraduate students in Russia.

Designing a study includes two key considerations that are often overlooked. The first and most important step is the proper formulation of the research question, followed by a thorough scientific literature search and definition of the existing gaps in knowledge [3]. The studied problem should be ethical, researchable, novel and clinically significant. Quite often researchers waste their time, effort and resources on problems that have already been studied or, conversely, formulate a research hypothesis that can hardly be tested properly in the real-life setting. In some cases, the authors do not give sufficient attention to the ethics of study conduct and reporting.

The second step in research planning is study design type selection. The potential objectives of medical research include risk factors and disease aetiology, their prevalence and incidence rates, patients’ survival, quality of screening procedures, diagnostic methods, efficacy and safety of treatment, prevention measures and patient-reported outcome measures, all of which may require a different approach to study planning. Some types of study design are more accessible and easier to organise, while others require significant resources and planning. The aim of this review is to provide basic information regarding approaches to research planning, highlighting differences between the most common types of study design, their possible application and limitations, as well as current standards of conducting and reporting research in clinical medicine.

DEFINITIONS

There are several terms used throughout this review, which require definition.

Subject is an individual (patient) participating in the study.

Prospective and retrospective studies. In prospective studies a group of subjects is actively followed by the researchers for a predefined period of time to determine the outcomes that will happen in the future. In retrospective studies, the authors have information about existing outcomes and collect data on past exposures (e.g., from medical records). There are also ambidirectional studies that include both retrospective and prospective phases.

Prevalence is the proportion of a population who have a specific characteristic at a specific time point or in a given time period, regardless of when they first developed the characteristic. Prevalence is reported as a percentage (e.g., 10%, or 10 people out of 100), or as the number of cases per 1000 (or any other number of people, for example 10 000 or 100 000) people.

Incidence indicates the number of new cases of a disease or condition that develop in a population in a specified time period. Incidence is reported as a number of new cases per 1000 (or 10 000, or 100 000) people per a certain time period, for example 10 cases per 1000 people per year.
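The two definitions above reduce to simple scaled proportions. The following is a minimal sketch; the function names and the counts are hypothetical and chosen only to reproduce the examples given in the definitions.

```python
def prevalence_per(existing_cases: int, population: int, per: int = 1000) -> float:
    """Existing cases at one time point, expressed per `per` people."""
    return existing_cases * per / population


def incidence_per(new_cases: int, population_at_risk: int, per: int = 1000) -> float:
    """New cases arising during the period, expressed per `per` people at risk."""
    return new_cases * per / population_at_risk


# Hypothetical counts matching the examples in the definitions.
prevalence_pct = prevalence_per(10, 100, per=100)    # 10 people out of 100
incidence_1000 = incidence_per(10, 1000, per=1000)   # 10 new cases per 1000 people in one year
print(prevalence_pct, incidence_1000)
```

Prevalence counts everyone who has the condition at the time of measurement, whereas incidence counts only new cases and must always be quoted together with its time period.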

Association is a statistical relationship between two variables (e.g., exposure and outcome), however it does not necessarily mean that there is a cause-effect relationship between them.

Causation means that the exposure produces the effect.

Bias is a systematic error in the interpretation of the data due to a factor that has not been accounted for in statistical analysis. In other words, bias is a tendency to overestimate or underestimate a studied parameter. Bias exists in all types of research design and can occur at any stage of the research process – from data accumulation to statistical analysis [4].

Confounding is a distortion of the true relationship between exposure and outcome by the influence of one or more factors, called confounders. Confounders are connected both with the cause and the outcome (Fig. 1). Failure to control for confounding factors can lead to so-called confounding bias.

FIG. 1. Illustration of confounding.
Note: suppose a statistical analysis showed that coffee consumption is a significant risk factor for heart disease. However, available evidence suggests that many subjects who regularly drink coffee are also smokers. In reality, smoking is the true risk factor for heart disease, and if the results of the study are not adjusted for this potential confounder, significant bias may result, and the effect of coffee consumption on the occurrence of heart disease will be overestimated.
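The coffee and smoking example can be made concrete with a small numerical sketch. All counts below are invented purely for illustration; they are constructed so that coffee has no effect within either smoking stratum, yet the crude (unadjusted) comparison suggests that coffee doubles the risk.

```python
def risk_ratio(exp_cases, exp_total, unexp_cases, unexp_total):
    """Risk in the exposed group divided by risk in the unexposed group."""
    return (exp_cases / exp_total) / (unexp_cases / unexp_total)


# Hypothetical counts per stratum:
# (coffee drinkers with disease, all coffee drinkers,
#  non-drinkers with disease, all non-drinkers)
strata = {
    "smokers": (160, 800, 40, 200),      # 20% risk with or without coffee
    "non-smokers": (10, 200, 40, 800),   # 5% risk with or without coffee
}

# Crude analysis: pool the strata and ignore smoking.
crude_counts = tuple(sum(s[i] for s in strata.values()) for i in range(4))
crude_rr = risk_ratio(*crude_counts)

# Stratified analysis: compare coffee drinkers with non-drinkers within each stratum.
stratum_rr = {name: risk_ratio(*counts) for name, counts in strata.items()}

print(f"crude RR = {crude_rr:.2f}")
for name, rr in stratum_rr.items():
    print(f"RR among {name} = {rr:.2f}")
```

The crude ratio exceeds 1 only because coffee drinkers are more often smokers; within each stratum the ratio is 1. Stratification is one simple way of adjusting for a measured confounder.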

Selection bias occurs when systematic differences exist between baseline characteristics of the groups that are compared. As a result, the difference in baseline characteristics of the groups (but not the studied exposure or intervention) can play a major role in producing the different outcomes. Randomisation helps to prevent this via random allocation of interventions to participants.

Performance bias refers to differences that arise because either the researcher or the participant knows which intervention has been allocated.

Measurement bias occurs when individual measurements, for example biochemical, are inaccurate.

Recall bias occurs in retrospective studies when participants do not remember previous events or experiences accurately or omit details.

Observer bias occurs if the investigator knows the exposure status of the subject – this knowledge can influence measurements (in other words the researcher sees what he wants to see). Only double-blinded studies are not prone to observer bias, because neither the subjects nor the investigators know the exposure status.

Information bias results from imperfect definitions of study variables or flawed data collection. An example could be accidental misclassification of people with the disease as controls, thus affecting the discrimination (sensitivity and specificity) of the diagnostic test in case-control studies.

Temporal bias occurs when the researchers assume a wrong sequence of events which misleads the reasoning about causality. Study designs where participants are not followed over time are prone to temporal bias.

Attrition bias occurs when participants leave during a study. Different rates of loss to follow-up in the exposure or control groups, or losses of different types of participants, whether at similar or different frequencies, may change the characteristics of the groups, irrespective of the exposure or intervention. Schulz and Grimes [5] considered that loss of 5% of participants is unlikely to introduce bias, loss of between 5% and 20% might be a source of bias, and loss of 20% of patients or more raises serious concerns about bias.
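The rule of thumb attributed to Schulz and Grimes can be written as a small check. This is only a sketch: the function name and labels are hypothetical, and the treatment of the exact 5% and 20% boundaries follows the wording of the paragraph above.

```python
def attrition_concern(enrolled: int, completed: int) -> str:
    """Classify loss to follow-up using the 5% and 20% thresholds."""
    if enrolled <= 0 or not 0 <= completed <= enrolled:
        raise ValueError("need enrolled > 0 and 0 <= completed <= enrolled")
    lost = enrolled - completed
    # Integer arithmetic avoids floating-point surprises at the boundaries.
    if lost * 100 >= 20 * enrolled:
        return "serious concern about bias"
    if lost * 100 > 5 * enrolled:
        return "possible source of bias"
    return "bias unlikely"


print(attrition_concern(enrolled=200, completed=192))  # 4% lost
print(attrition_concern(enrolled=200, completed=170))  # 15% lost
print(attrition_concern(enrolled=200, completed=150))  # 25% lost
```

The check should be applied to each study group separately as well as overall, because unequal losses between groups can bias the comparison even when the overall loss is small.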

Publication bias occurs when the outcome of a study influences the decision whether to publish it. For example, studies with statistically significant results are more likely to be published than those without.

STUDY CLASSIFICATION

There are several ways to classify study design in medicine based on the data collection technique, causality, relationship with time, descriptive or analytical (inferential) approach and several other parameters (Fig. 2).

FIG. 2. Classification of study design in medicine.

Basic studies

Basic studies, also known as experimental research, are designed to assess cause-outcome relationships between the variation of the independent variable and its effect on the dependent variable in a highly controlled setting. This type of research is required to develop and improve analytical procedures, including biochemical and genetic tests, imaging techniques, statistical methods and models. It also includes animal experiments, cell culture studies, genetic, biochemical, pharmacological and physiological evaluations (see example [6] in Table 1). Basic studies require precise specification and implementation of the procedures and experimental design, e.g. the studied animal species, number of groups and cases, conditions of the experiments, dosages of the studied medications etc. [7]. This allows controlling for potential confounding and achieving high internal validity of the study (low risk of bias). However, the results of basic studies often cannot be directly implemented in the clinical setting; in other words, their external validity is sometimes limited. Standards for conducting and reporting the results of basic studies have been developed, for example the ARRIVE¹ (Animal Research: Reporting of In Vivo Experiments) guidelines for animal experiments [8].

Table 1. Examples of different study designs in the published articles (free full text was available for all articles at the time of manuscript submission)

Study design	Publication	Results
Basic study	Krieger N.S. et al. [6]	Growing mice lacking the proton-activated receptor OGR1 had an increased bone turnover and a greater increase in bone formation and resorption than wild-type mice. The results suggest a role for OGR1 in the response of growing bone to protons
Case report	Shaigany S. et al. [9]	The authors described the first case of an adult male patient who presented with COVID-19-associated Kawasaki-like multisystem inflammatory syndrome, that had previously been described only in children
Case series	Centers for Disease Control and Prevention [10]	The first report of Pneumocystis pneumonia in previously healthy male homosexual patients. It was the first official notice of a new disease, later recognised as HIV/AIDS
Cross-sectional	Mokdad A.H. et al. [12]	This cross-sectional telephone survey demonstrated the high prevalence of obesity and diabetes mellitus among adult individuals in the USA. Obesity was strongly associated with diagnosed diabetes, high blood pressure, high cholesterol levels and several other major health risk factors
Ecological	Shaposhnikov D. et al. [16]	The authors showed the interaction between high temperatures, increased air pollution and daily non-accidental deaths during the Moscow heat wave and wildfire of 2010
Case-control	D’Souza G. et al. [17]	The study of 100 patients with newly diagnosed oropharyngeal cancer and 200 controls without cancer showed that oral human papillomavirus infection was strongly associated with oropharyngeal cancer in subjects with or without the traditional risk factors
Prospective cohort	Rico-Campà A. et al. [19]	The study of 19 899 participants who were followed up every two years between 1999 and 2014 demonstrated that a higher consumption of ultra-processed food was associated with an increased hazard for all-cause mortality
Retrospective cohort	De Blok C.J.M. et al. [20]	The study showed an increased risk for breast cancer in trans women compared to cisgender men, and a lower risk in trans men compared to cisgender women
Pre-post	Chauhan K. et al. [21]	There was a significant improvement in knowledge, attitude and practice towards hand hygiene among medical students after an educational intervention
Non-randomised controlled	Agusti A. et al. [22]	No benefit of hydroxychloroquine was demonstrated either in viral dynamics or in resolution of clinical symptoms in healthcare workers with mild COVID-19
Randomised controlled trial	Wiviott S.D. et al. [23]	In patients with type 2 diabetes treated with SGLT2 inhibitor dapagliflozin the rate of major adverse cardiovascular events was similar to the control group, but the rate of hospitalisation for heart failure was lower
Systematic review	Coughlin S.S. et al. [25]	Social determinants such as neighbourhood disadvantage, immigration status, lack of social support and social isolation play an important role in myocardial infarction risk and survival
Systematic review with meta-analysis	Wang B. et al. [26]	The analysis of 42 studies showed that in patients with COVID-19 the presence of chronic kidney disease and development of acute kidney injury were associated with a significantly increased risk of disease progressing to a severe condition and death

Note: OGR1 – ovarian cancer G-protein-coupled receptor 1; COVID-19 – coronavirus disease 2019; HIV/AIDS – human immunodeficiency virus / acquired immunodeficiency syndrome; SGLT2 – sodium-glucose cotransporter-2.

Observational studies

Observational studies do not utilise any experiments or interventions. The investigated factors cannot be controlled in observational studies; however, their results are closer to the real-life setting. Observational studies are classified as descriptive, which report separate disease cases or cohorts, or analytical, which investigate the associations between the characteristics of patients and outcomes (Fig. 2). Quite often observational studies combine descriptive and analytical approaches. Observational studies include case reports, case series, case-control, cross-sectional, cohort and ecological studies.

Case report and case series

A case report describes rare or remarkable patient and disease characteristics in a single patient (see example [9] in Table 1). If the study includes more than one patient it is called a case series (see example [10] in Table 1). Case reports and case series represent the simplest type of research because they do not require a control group for comparison. However, the authors should provide a clear and detailed description of a well-defined condition in each case to ensure the readers recognise it in clinical practice. In case series the characteristics of all patients must be provided in a similar fashion. The report of a case series usually includes only descriptive statistics: proportions for discrete variables (characteristics which can be defined only as ‘present’ [yes] or ‘not present’ [no]) and means with standard deviations or medians with interquartile ranges for continuous variables (numeric, e.g. systolic blood pressure or serum creatinine concentration).

These types of research are required to generate hypotheses and to plan further, more complicated studies, as well as to inform the professional community about new emerging diseases. They are simple, cheap and easy to perform in a clinical setting, and the data may be collected retrospectively. The drawbacks of case reports and case series include the lack of a comparison group and biased selection of cases, which are all identified in clinical practice and usually represent either the most typical or the most atypical examples of the disease. This leads to poor generalizability of the study results (low external validity). Moreover, any associations discovered in these types of research are prone to unmeasured confounding not identified by the investigators.

The CARE² (standing for CAse REports) guidelines were developed to increase the accuracy, transparency and usefulness of case reports [11].

Cross-sectional studies

Cross-sectional studies (also known as prevalence studies) investigate the prevalence of diseases, risk factors, outcomes or any other health-related characteristics in a particular population at a certain moment in time (see example [12] in Table 1). The main parameter assessed in this type of study is the prevalence of a condition of interest, that is, the proportion of individuals with the condition (e.g., chronic kidney disease) at some moment in time among all the people at risk. To conduct a cross-sectional study, the researchers need to define the studied population, to create the sample population and to determine the presence or absence of the condition of interest in each individual of the sample population. Sampling should be performed in such a way that each combination of individuals in the general population has an equal probability of being selected to achieve adequate representation in the study sample. This is usually achieved by random sampling. However, in some situations convenience sampling (when the sample is taken from a group of people easy to contact or to reach) is also valid. Another requirement is the strict and clear definition of the studied condition and methods of its diagnosis. Algorithms of data acquisition should be similar in all individuals (e.g., questionnaires, electronic documentation, imaging techniques, etc.). The results of cross-sectional studies report the prevalence of the studied condition as a percentage or the number of cases per some number of individuals (e.g., 1000 or 100 000 adults) with a 95% confidence interval (CI).
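As an illustration of reporting prevalence with a 95% CI, the sketch below uses the Wilson score interval. The review does not name a particular interval method, so the choice of Wilson is an assumption made here, and the survey counts are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def prevalence_ci(cases: int, sample_size: int, level: float = 0.95):
    """Return (prevalence, lower, upper) using the Wilson score interval."""
    if sample_size <= 0 or not 0 <= cases <= sample_size:
        raise ValueError("need sample_size > 0 and 0 <= cases <= sample_size")
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% interval
    p = cases / sample_size
    denom = 1 + z**2 / sample_size
    centre = (p + z**2 / (2 * sample_size)) / denom
    half_width = z * sqrt(p * (1 - p) / sample_size + z**2 / (4 * sample_size**2)) / denom
    return p, centre - half_width, centre + half_width


# Hypothetical survey: 50 people with the condition among 500 sampled.
p, low, high = prevalence_ci(50, 500)
print(f"prevalence {p * 1000:.0f} per 1000 (95% CI {low * 1000:.0f} to {high * 1000:.0f})")
```

The Wilson interval was chosen for this sketch because it stays within the range 0 to 1 and behaves reasonably for small samples or low prevalence; statistical packages offer several alternatives.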

In many cases, the researchers in cross-sectional studies (surveys) also acquire data on the prevalence of exposure to factors that can be associated with disease outcome [13]. For example, the scientists can gather information about smoking habits (exposure) and the evidence of cardiovascular diseases (outcome) from each individual in the sample population. In that case each participant will fall into one of four groups: (a) people who have been exposed and have the disease, (b) people who have been exposed, but do not have the disease, (c) people who have not been exposed, but have the disease and (d) people who have not been exposed and do not have the disease. These groups can be represented in a 2 × 2 contingency table (crosstab), where rows indicate exposure and columns indicate outcomes or disease occurrence (Table 2).

Table 2. 2 × 2 contingency table

	Disease	No disease	Total
Exposed	a	b	a + b
Not exposed	c	d	c + d
Total	a + c	b + d	N

Knowing these four numbers we can calculate the following parameters:

the number of all individuals exposed: a + b;
the number of all individuals unexposed: c + d;
the number of all individuals with disease (outcome): a + c;
the number of all individuals without disease (outcome): b + d;
the total number of the studied individuals (N): a + b + c + d;
the prevalence of disease in exposed and unexposed individuals: a/(a+b) and c/(c+d) respectively;
the prevalence of exposure in patients with and without disease: a/(a+c) and b/(b+d) respectively.
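The quantities listed above can be derived directly from the four cell counts of Table 2. A minimal sketch follows; the class name and the counts are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    a: int  # exposed, disease
    b: int  # exposed, no disease
    c: int  # not exposed, disease
    d: int  # not exposed, no disease

    @property
    def total(self) -> int:
        return self.a + self.b + self.c + self.d

    def prevalence_of_disease(self) -> tuple[float, float]:
        """Prevalence of disease in exposed and in unexposed individuals."""
        return self.a / (self.a + self.b), self.c / (self.c + self.d)

    def prevalence_of_exposure(self) -> tuple[float, float]:
        """Prevalence of exposure in those with and without the disease."""
        return self.a / (self.a + self.c), self.b / (self.b + self.d)


# Hypothetical survey of 200 people.
table = TwoByTwo(a=30, b=70, c=10, d=90)
print(table.total)
print(table.prevalence_of_disease())
print(table.prevalence_of_exposure())
```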

To analyse the association between exposure and disease occurrence (outcome) one can calculate either the odds ratio (OR) or the relative risk (RR) with 95% CI using logistic regression. OR and RR calculation will be described in a separate review. It can be performed using statistical software (R, SPSS, etc.) and online calculators³ ⁴.
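For a single 2 × 2 table, OR and RR can also be computed in closed form. The sketch below uses the standard log-scale (Wald) approximation for the 95% CI rather than logistic regression, which is a substitution made here for brevity; the counts are hypothetical and every cell must be non-zero.

```python
from math import exp, log, sqrt

Z_95 = 1.96  # standard normal quantile for a two-sided 95% interval


def _log_scale_ci(estimate: float, se_log: float):
    """Build a CI on the log scale and transform it back."""
    return exp(log(estimate) - Z_95 * se_log), exp(log(estimate) + Z_95 * se_log)


def odds_ratio(a: int, b: int, c: int, d: int):
    """OR = (a*d)/(b*c) with a Wald 95% CI; all cells must be non-zero."""
    estimate = (a * d) / (b * c)
    se_log = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return (estimate, *_log_scale_ci(estimate, se_log))


def relative_risk(a: int, b: int, c: int, d: int):
    """RR = [a/(a+b)] / [c/(c+d)] with a Wald 95% CI."""
    estimate = (a / (a + b)) / (c / (c + d))
    se_log = sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    return (estimate, *_log_scale_ci(estimate, se_log))


# Hypothetical table: a=30, b=70, c=10, d=90 (cell labels as in Table 2).
or_est, or_low, or_high = odds_ratio(30, 70, 10, 90)
rr_est, rr_low, rr_high = relative_risk(30, 70, 10, 90)
print(f"OR = {or_est:.2f} (95% CI {or_low:.2f} to {or_high:.2f})")
print(f"RR = {rr_est:.2f} (95% CI {rr_low:.2f} to {rr_high:.2f})")
```

A confidence interval that excludes 1 indicates a statistically significant association at the 5% level. Regression models are needed when several exposures or confounders have to be analysed together.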

Cross-sectional studies are useful for public health planning, understanding risk factors and aetiology of common diseases, and generating hypotheses for further investigations. However, they provide no data about causal relationships and only describe associations. Cross-sectional studies are less prone to the potential bias that is common in case series and allow the researchers to obtain representative results, because the sample is often taken from the general population or a certain population of interest (e.g., heavy industry workers). At the same time, the generalizability of the conclusions is limited to the particular sampled population, and there is always a risk of selection bias.

Additionally, since cross-sectional studies may assess both exposures and outcomes simultaneously, they are prone to temporal bias, meaning that the results of these studies do not enable the researchers to discern whether the exposure happened before or after the outcome. This is one of the major limitations of cross-sectional studies; however, it can be alleviated by using a data collection questionnaire in which the timeline of exposure and outcome can be specified. On the other hand, there is no loss to follow-up, which is a common problem in longitudinal studies. Cross-sectional studies can be relatively inexpensive and take limited time to conduct; however, this is true only for common conditions and outcomes. Rare diseases (their definition varies worldwide: on average, 40 to 50 cases per 100 000 population [14]; in Russia, 10 cases per 100 000 population⁵) require extremely large sample sizes, which makes cross-sectional studies less suitable for their investigation than case-control studies. The same is true of common diseases of short duration, such as respiratory infections, which are better characterised by their incidence (the number of new cases occurring during some time frame), a measure that cannot be assessed in cross-sectional studies.

The standards of reporting cross-sectional studies are set by the STROBE⁶ (Strengthening the Reporting of Observational Studies in Epidemiology) guidelines for cross-sectional studies [15].

Ecological studies

Ecological (correlation) studies investigate associations between disease occurrence and exposure to potential risk factors. However, incidence and exposures are measured not in individual patients, but in several populations or communities (see example [16] in Table 1). Ecological studies utilise data that have already been collected in population studies or reports and assess associations between different parameters. They are cheap and easy to perform, include large samples and help to generate hypotheses about aetiological relationships. However, ecological studies do not provide data on exposure and outcomes in each individual. As a result, they can lead to interpretation errors when conclusions about individuals are inappropriately inferred from the study results. This phenomenon is called ecological fallacy. It is also not possible to control for confounders in ecological studies, and without additional research no conclusions about true associations can be made.

Case-control studies

Case-control studies are conducted retrospectively and analyse participants identified on the basis of their case status, i.e., presence or absence of the disease or another outcome (see example [17] in Table 1). Subjects with the disease of interest form the case group, and the control group consists of subjects without the disease (Fig. 3). The groups are compared by the presence of one or several potential risk factors. The underlying principle is to identify a significant difference in the frequency (or intensity) of the risk factors between the case and control groups. Both groups should be matched for as many parameters (factors) as possible, except those under investigation, to control for potential confounders. However, there are several assumptions that can be a source of bias in case-control studies. The first is that all cases are representative of the patients with the studied condition. For example, cases chosen among patients admitted to hospital might be different from patients who are treated in out-patient facilities. The second is that controls are representative of the healthy population. This can theoretically be achieved by random sampling; however, it is often impractical. In some cases, it is easier to alleviate the differences between the groups by enrolling patients and controls from a similar setting. For example, if cases are chosen among the patients admitted to hospital, the controls can be chosen from the patients admitted to the same hospital for reasons other than the studied condition. Lastly, the approach to data collection should be similar in both groups and the definitions of disease (outcome) and risk factor must be unambiguous.

FIG. 3. Schematic diagram of a case-control study design.
Note: data analysis in case-control studies is usually performed in a similar way to cross-sectional surveys, where all subjects are divided into four groups depending on the presence of diseases (outcomes) and potential risk factors (see cross-sectional studies section). The association between risk factors and outcomes is represented by the odds ratio or relative risk with 95% confidence interval.

Case-control studies are the most efficient way to investigate risk factors of rare diseases, because other study designs would require enormous sample sizes. Several potential risk factors can be studied at the same time. In case-control studies the scientists are capable of controlling for several confounders if all the important assumptions are satisfied. However, in the real-life setting it is often hard to obtain reliable information about an individual’s exposure status over a long period of time. Case-control studies are prone to recall and sampling bias and other potential sources of systematic errors which can lead to confounding. Nevertheless, well-designed case-control studies can provide evidence for the causal nature of associations. Unlike the cross-sectional design, case-control studies are unsuitable for the assessment of disease prevalence. Reporting of case-control studies is guided by the STROBE statement for case-control studies [15].

A nested case-control study is a variation of a case-control study in which cases and controls are drawn from the population in a large cohort study (see cohort studies section). The researchers minimise time and cost of the study utilising the previously collected data.

Another type of research design similar to case-control studies is a diagnostic accuracy study, in which the performance of a novel diagnostic method is compared to the gold standard. The conduct and reporting of diagnostic accuracy studies are guided by the STARD⁷ (STAndards for Reporting Diagnostic Accuracy) statement [18].
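Diagnostic accuracy is commonly summarised by sensitivity and specificity, the two measures mentioned earlier in the definition of information bias. A minimal sketch, with hypothetical counts obtained by comparing the new test against the gold standard:

```python
def diagnostic_accuracy(tp: int, fp: int, fn: int, tn: int):
    """Return (sensitivity, specificity) of a test judged against the gold standard.

    tp: diseased subjects with a positive test
    fn: diseased subjects with a negative test
    fp: non-diseased subjects with a positive test
    tn: non-diseased subjects with a negative test
    """
    sensitivity = tp / (tp + fn)   # proportion of diseased subjects detected
    specificity = tn / (tn + fp)   # proportion of non-diseased subjects correctly ruled out
    return sensitivity, specificity


# Hypothetical study: 100 subjects with the disease and 200 without it.
sens, spec = diagnostic_accuracy(tp=90, fp=20, fn=10, tn=180)
print(f"sensitivity = {sens:.0%}, specificity = {spec:.0%}")
```

Misclassifying diseased subjects as controls, as in the information bias example, shifts counts between these cells and distorts both measures.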

Cohort studies

A cohort is a group of subjects selected on the basis of certain characteristics, risk factors or outcomes. In cohort studies the researchers identify study participants based on their exposure status and either follow them over time (prospective setting) to identify which participants will develop the outcome, or look back at data created in the past (retrospective setting) prior to the development of the outcome (see examples [19, 20] in Table 1). Thus, cohort studies assess the effect of the potential prognostic factor on the disease occurrence (outcome). In contrast to case-control studies, in cohort studies the subjects are divided on the basis of the exposure, not the outcome (Fig. 4).

FIG. 4. Schematic diagram of a prospective cohort study design.

Prospective cohort studies are the gold standard of observational studies. As in the other types of research design, exposure and outcome should be clearly and identically defined in all cases, and the studied cohort should be representative. The problem is that prospective studies might take years to conduct, and during this time frame the diagnostic criteria for the studied conditions might change. Another assumption which can lead to bias is that the exposure would not change during the study period, which is often not true in the real-world setting. For example, cholesterol levels, blood pressure and smoking status might change over time and require adjustment to avoid potential bias. Retrospective cohort studies (historical cohort studies) can be performed if the researchers have access to detailed and reliable medical documentation of large groups of individuals. In that case the course of the disease from exposure to outcome can be studied at one time. However, there is a high risk of bias due to discrepancies in the medical records.

The association between exposures and outcomes can be measured by using 2 × 2 contingency tables and calculating the odds ratio or relative risk. In cohort studies the time to event is known in each case, which makes it possible to calculate not only disease prevalence, but also incidence rates and hazard ratios with 95% CI, which cannot be measured in case-control studies. Cohort studies are less prone to systematic bias and allow the investigation of multiple exposures and outcomes in a single study. However, prospective cohort studies are expensive and time consuming. They are less suitable for studying rare diseases, but the role of rare exposures can be investigated in cohort studies. The reporting of cohort studies is guided by the STROBE⁶ statement for cohort studies [15].
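Because follow-up time is known for each subject, incidence can be expressed per unit of person-time. The sketch below computes incidence rates per 1000 person-years and their ratio from hypothetical follow-up records. The rate ratio is shown only as a simple illustration: it is not the hazard ratio, which is usually estimated with survival models that are not covered here.

```python
def incidence_rate(follow_up, per: int = 1000) -> float:
    """Events per `per` person-years.

    follow_up: iterable of (years_observed, had_event) pairs, one per subject.
    """
    records = list(follow_up)
    events = sum(1 for _, had_event in records if had_event)
    person_years = sum(years for years, _ in records)
    return events * per / person_years


# Hypothetical cohorts: each pair is (years of follow-up, outcome occurred).
exposed = [(2.0, True), (5.0, False), (3.0, True), (5.0, False)]
unexposed = [(5.0, False), (5.0, False), (4.0, True),
             (5.0, False), (5.0, False), (6.0, False)]

rate_exposed = incidence_rate(exposed)       # 2 events over 15 person-years
rate_unexposed = incidence_rate(unexposed)   # 1 event over 30 person-years
rate_ratio = rate_exposed / rate_unexposed
print(f"{rate_exposed:.1f} vs {rate_unexposed:.1f} per 1000 person-years, "
      f"rate ratio {rate_ratio:.1f}")
```

Subjects who leave the study early or develop the outcome contribute person-time only up to that point, which is why the denominators differ from a simple head count.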

Interventional studies

Interventional (experimental) studies compare the effect of the studied treatment (intervention) in the experimental group with a control group of subjects, who receive either placebo or a different treatment. They can be used to define causative relationships. Sometimes the term ‘trial’ is used to describe interventional studies. There are several types of interventional study design. Guidance on the design, conduct, analysis and evaluation of clinical trials is provided in the guidelines of the ICH (International Council for Harmonisation of Technical Requirements for Pharmaceuticals for Human Use)⁸.

Pre-post (self-controlled, before-and-after) studies

Pre-post studies (PPS) measure the occurrence of an outcome before and after the implementation of a particular intervention (see example [21] in Table 1). PPS can include a single group (arm) of patients, in which the outcome is measured before and after the intervention. Thus, to define the effect of treatment the patients are used as their own controls. Differences in both discrete and continuous parameters before and after the intervention can be analysed. The temporality of PPS allows the researchers to suggest cause-effect relationships between the intervention and outcome. However, the researchers cannot control the other factors that might predetermine the outcome and change unpredictably within the same time frame.
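For a continuous parameter, the before-and-after comparison described above can be sketched as a paired analysis, in which each patient contributes the difference between the two measurements. The Python example below is a minimal sketch under that assumption. The function name `paired_t_statistic` is hypothetical, and the paired t-test is only one possible choice (it presumes approximately normally distributed differences). The p-value is deliberately not computed, because it requires the t distribution, which is not part of the Python standard library.

```python
import math
import statistics


def paired_t_statistic(before, after):
    """Mean paired difference, t statistic and degrees of freedom.

    before, after: measurements of the same patients, in the same order.
    """
    if len(before) != len(after):
        raise ValueError("each patient needs one value before and one after")
    if len(before) < 2:
        raise ValueError("at least two patients are required")

    # Each patient serves as their own control
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation
    if sd_diff == 0:
        raise ValueError("all differences are identical; t is undefined")

    t = mean_diff / (sd_diff / math.sqrt(n))
    return mean_diff, t, n - 1
```

The returned t statistic would then be compared with the t distribution with n − 1 degrees of freedom, for example in a statistical package.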

Non-randomised controlled studies

Non-randomised trials (NRT) compare the outcomes in the experimental group of subjects, who undergo the studied intervention, with a control group where there is no intervention (see example [22] in Table 1). The participants are assigned to the groups without randomisation, i.e., not by chance. The assignment is either performed by the researchers, or in some cases the patients choose whether they want to receive the intervention or not. NRT are easy to conduct and can suggest causative relationships between intervention and outcome. However, they are prone to different types of bias resulting from the lack of randomisation.

Randomised controlled trials

In randomised controlled trials (RCT) a homogeneous group of subjects is randomly (by chance) divided into two separate groups. After that the studied intervention is implemented in one group and the outcomes are compared between the groups (see example [23] in Table 1). Successful randomisation is one of the critical conditions for achieving adequate results. Theoretically the two groups should be identical in all respects, including potential confounders (e.g., age, sex, concomitant medications, disease severity and duration), the only exception being the studied intervention. However, this is very hard to achieve in a real-world setting.
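The random division into two groups described above can be illustrated with a short Python sketch. It uses permuted-block randomisation, which is an assumption made for this example and not a method prescribed in this review. The function name `block_randomise`, the arm labels and the block size are likewise hypothetical. In a real trial the allocation list would be generated and concealed by someone independent of recruitment.

```python
import random


def block_randomise(n_participants, block_size=4,
                    arms=("intervention", "control"), seed=None):
    """Return an allocation list built from randomly permuted blocks.

    Within every complete block each arm appears equally often, which keeps
    the group sizes balanced throughout recruitment.
    """
    if block_size % len(arms) != 0:
        raise ValueError("block_size must be a multiple of the number of arms")

    rng = random.Random(seed)  # a fixed seed makes the list reproducible
    allocations = []
    while len(allocations) < n_participants:
        block = list(arms) * (block_size // len(arms))
        rng.shuffle(block)  # random order within the block
        allocations.extend(block)

    # The last block may be used only partially
    return allocations[:n_participants]
```

Simple (unrestricted) randomisation would be even shorter, but with small samples it can produce groups of noticeably unequal size, which is why blocks are used in this sketch.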

There are several other methods, besides randomisation, that help to improve the quality of RCTs, such as allocation concealment, blinding, intention-to-treat analysis, simultaneous assessment of all study arms, etc. RCTs are expensive, time- and resource-consuming, and require significant numbers of trained research personnel, but if conducted properly they provide the highest level of evidence about cause-effect relationships among all types of clinical trials. As a result, RCTs have become the gold standard of clinical trial design. The CONSORT⁹ (CONsolidated Standards of Reporting Trials) statement was developed to improve the quality and transparency of RCT reporting [24]. The SPIRIT¹⁰ (Standard Protocol Items: Recommendations for Interventional Trials) statement provides evidence-based recommendations for the minimum content of a clinical trial protocol.

Systematic review and meta-analysis

Sometimes several cohort studies or RCTs are conducted over time to investigate the same problem. More often than not their conditions and results differ. Systematic reviews evaluate and interpret the results of all published studies in a given clinical area (see example [25] in Table 1). In contrast to traditional review articles, in which the authors may choose which studies to include, a systematic review must contain all the published data of adequate quality. It is sometimes possible to combine the results of the separate studies included in a systematic review by statistical analysis; this method is called meta-analysis (see example [26] in Table 1). Currently systematic reviews and meta-analyses provide the highest level of evidence in clinical research. The methodology of systematic reviews and meta-analyses is guided by the PRISMA¹¹ (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) statement [27].
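To show what combining the results of separate studies can mean in practice, the Python sketch below pools study estimates with fixed-effect inverse-variance weighting. This is one common approach, chosen here only for illustration. It assumes that all studies estimate the same underlying effect, which often does not hold when study conditions differ, and random-effects models are frequently used instead. The function name `pool_fixed_effect` and its inputs are hypothetical. For ratio measures such as the odds ratio, the estimates would be supplied on the log scale.

```python
import math


def pool_fixed_effect(estimates, standard_errors, z=1.96):
    """Fixed-effect (inverse-variance) pooled estimate with an approximate CI.

    estimates: effect estimates from the individual studies
    standard_errors: their standard errors, in the same order
    """
    if len(estimates) != len(standard_errors) or not estimates:
        raise ValueError("provide one standard error per estimate")
    if any(se <= 0 for se in standard_errors):
        raise ValueError("standard errors must be positive")

    # More precise studies (smaller SE) receive larger weights
    weights = [1 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)

    pooled = sum(w * e for w, e in zip(weights, estimates)) / total_weight
    se_pooled = math.sqrt(1 / total_weight)
    ci = (pooled - z * se_pooled, pooled + z * se_pooled)
    return pooled, se_pooled, ci
```

With two equally precise hypothetical studies the pooled estimate is simply their mean, and the pooled standard error is smaller than that of either study.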

TYPES OF OUTCOMES AND STATISTICAL ANALYSIS

Each type of study design can assess certain categories of outcomes: continuous, binary, rates of events or survival time. There are conventional statistical methods to describe and compare each type of outcome [28]. They will be discussed in future reviews.

CONSIDERATIONS ON STUDY DESIGN CHOICE

To define the most appropriate study design one should consider the research objective, form a hypothesis, analyse the published data in the relevant scientific area, and take into account the resources available to the research team as well as its experience and expertise. It is preferable to conduct a simpler study of high quality than to spend time and effort on a more complicated but flawed design that will produce unreliable data. The suggested application of different study designs depending on the research objective is provided in Table 3.

It should be taken into account that different types of study design are associated with different types of bias and other potential disadvantages, which should be considered in advance and controlled for, if possible (Table 4) [29]. One of the important steps in planning clinical studies is sample size calculation, which helps to control for bias, achieve statistically and clinically significant results and optimise costs. Underestimation of the sample size can lead to statistically non-significant results (while the difference might in fact exist). Overestimation of the sample size results in extra costs, unnecessary exposure of a large number of subjects and detection of statistically significant, but clinically irrelevant differences. There are several approaches to sample size calculation depending on the type of outcome. If the sample size is predefined, it is possible to calculate the study power instead, in order to assess which level of difference between groups (effect size) can be detected in this setting. Sample size calculation can be performed with statistical software (for example, in R) or using online calculators¹² ¹³.
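As a hedged illustration of one such approach, the Python sketch below estimates the number of subjects per group needed to compare two proportions (a binary outcome). It uses the normal-approximation formula for two independent groups of equal size with a two-sided significance level. Other outcome types and designs require different formulas. The function name `sample_size_two_proportions` and its default values (alpha = 0.05, power = 0.80) are assumptions made for this example.

```python
import math
from statistics import NormalDist


def sample_size_two_proportions(p1, p2, alpha=0.05, power=0.80):
    """Subjects needed per group to detect a difference between p1 and p2.

    p1, p2: expected outcome proportions in the two groups
    alpha: two-sided significance level
    power: desired probability of detecting the difference if it exists
    """
    if not (0 < p1 < 1 and 0 < p2 < 1) or p1 == p2:
        raise ValueError("p1 and p2 must differ and lie strictly between 0 and 1")

    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)

    variance_sum = p1 * (1 - p1) + p2 * (1 - p2)
    n = (z_alpha + z_beta) ** 2 * variance_sum / (p1 - p2) ** 2
    return math.ceil(n)  # round up to whole subjects


# Hypothetical example: 30% of events in one group versus 50% in the other
n_per_group = sample_size_two_proportions(0.30, 0.50)  # 91 per group
```

Halving the expected difference roughly quadruples the required sample, which is why the assumed effect size should be justified in the protocol.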

Table 3. Preferred types of study design for different objectives

	Ecological	Cross-sectional	Case-control	Cohort	Interventional
Investigation of rare diseases	+	−	+	−	−
Investigation of rare causes (exposures)	+/−	−	+/−	+	−
Studying multiple effects of exposure	+/−	+/−	−	+	−
Studying multiple exposures	+/−	+/−	+	+	−
Measurement of prevalence	−	+	−	+	−
Measurement of incidence	−	−	+/−	+	−
Measurement of time relationships	−	−	+/−	+	−
Effect of interventions	−	−	−	−	+

Note: + preferred, +/− possible, but less suitable, − unsuitable.

Table 4. Probability of bias and limitations in different types of studies

	Ecological	Cross-sectional	Case-control	Cohort
Selection bias	n/a	M	H	L
Recall bias	n/a	H	H	L
Loss to follow-up	n/a	n/a	L	H
Confounding	H	M	M	M
Time required	L	M	M	H
Cost	L	M	M	H

Note: H – High, L – Low, M – Medium, n/a – not applicable.

The plan of the study, called the research protocol, which includes all considerations and details about the research objective, ethical issues, study design, inclusion and exclusion criteria, outcomes and statistical analysis plan, should be drawn up in advance and carefully implemented. Protocols of clinical trials that involve human subjects should be registered in dedicated registration systems such as the clinicaltrials.gov database¹⁴ or the WHO International Clinical Trials Registry Platform¹⁵ before their results are reported. Through these systems all the important data about each study, including its design, can be obtained and assessed by any researcher. This approach contributes to the transparency of clinical trials, and many medical journals strongly recommend providing the registration number of the trial in the manuscript.

CONCLUSION

Adequate research design is an essential condition for conducting a successful study. Therefore, the choice of study design is among the most important decisions in planning a scientific investigation and should be made wisely and rationally. Different types of observational and interventional studies are suitable for different research objectives. The studied exposures and outcomes should be clearly defined. To control for potential bias the studied population samples should be representative and of sufficient size. Several guidelines for conducting and reporting research in medicine have been proposed to improve the quality of studies and avoid bias.

AUTHOR CONTRIBUTIONS

Nikolay M. Bulanov, Oleg B. Blyuss, Daniil B. Munblit, Nikita A. Nekliudov, Alexey A. Zaikin, Khava B. Kodzoeva and Maria Yu. Nadinskaia participated in writing the text of the manuscript. Oleg B. Blyuss, Daniil B. Munblit and Nikita A. Nekliudov searched and analysed the literature on the review topic. Oleg B. Blyuss, Daniil B. Munblit and Denis V. Butnaru developed the general concept of the article and supervised its writing. All authors participated in the discussion and editing of the work. All authors approved the final version of the publication.

Conflict of interests. The authors declare that there is no conflict of interests.

Financial support. The study was not sponsored (own resources).

1. https://arriveguidelines.org/

2. https://www.care-statement.org/

3. https://www.medcalc.org/calc/odds_ratio.php

4. https://medstatistic.ru/calculators/calcrisk.html

5. https://minzdrav.gov.ru/documents/7025-federalnyy-zakon-323-fz-ot-21-noyabrya-2011-g

6. https://www.strobe-statement.org/

7. https://www.equator-network.org/reporting-guidelines/stard/

8. https://www.ema.europa.eu/en/ich-e9-statistical-principles-clinical-trials

9. https://www.equator-network.org/reporting-guidelines/consort/

10. https://www.spirit-statement.org/

11. https://www.equator-network.org/reporting-guidelines/prisma/

12. https://www.stat.ubc.ca/~rollin/stats/ssize/index.html

13. https://medstatistic.ru/calculators/calcsize.html

14. https://www.clinicaltrials.gov/

15. https://www.who.int/clinical-trials-registry-platform

References

1. Süt N. How can we improve the quality of scientific research and publications? Guidelines for authors, editors, and reviewers. Balkan Med J. 2013; 30(2): 134–135. https://doi.org/10.5152/balkanmedj.2013.009 PMID: 25207088

2. Dhammi I.K., Rehan-Ul-Haq. Rejection of manuscripts: Problems and solutions. Indian J Orthop. 2018; 52(2): 97–99. https://doi.org/10.4103/ortho.IJOrtho_68_18 PMID: 29576635

3. Farrugia P., Petrisor B.A., Farrokhyar F., Bhandari M. Research questions, hypotheses and objectives. Can J Surg. 2010; 53(4): 278–281. PMID: 20646403

4. Smith J., Noble H. Bias in research. Evid Based Nurs. 2014; 17(4): 100–101. https://doi.org/10.1136/eb-2014-101946 PMID: 25097234

5. Schulz K.F., Grimes D.A. Sample size slippages in randomised trials: Exclusions and the lost and wayward. Lancet. 2002; 359(9308): 781–785. https://doi.org/10.1016/S0140-6736(02)07882-0 PMID: 11888606

6. Krieger N.S., Yao Z., Kyker-Snowman K., et al. Increased bone density in mice lacking the proton receptor OGR1. Kidney Int. 2016; 89(3): 565–573. https://doi.org/10.1016/j.kint.2015.12.020 PMID: 26880453

7. Röhrig B., Du Prel J.B., Wachtlin D., Blettner M. Types of study in medical research – Part 3 of a Series on Evaluation of Scientific Publications. Dtsch Arztebl Int. 2009; 106(15): 262–268. https://doi.org/10.3238/arztebl.2009.0262 PMID: 19547627

8. Du Sert N.P., Ahluwalia A., Alam S., et al. Reporting animal research: Explanation and elaboration for the ARRIVE guidelines 2.0. PLoS Biol. 2020; 18(7): e3000411. https://doi.org/10.1371/journal.pbio.3000411 PMID: 32663221

9. Shaigany S., Gnirke M., Guttmann A., et al. An adult with Kawasaki-like multisystem inflammatory syndrome associated with COVID-19. Lancet. 2020; 396(10246): e8–10. https://doi.org/10.1016/S0140-6736(20)31526-9 PMID: 32659211

10. Centers for Disease Control. Pneumocystis Pneumonia – Los Angeles. Morb Mortal Wkly Rep. 1981; 30(21): 250–252. PMID: 6265753

11. Riley D.S., Barber M.S., Kienle G.S., et al. CARE guidelines for case reports: explanation and elaboration document. J Clin Epidemiol. 2017; 89: 218–235. https://doi.org/10.1016/j.jclinepi.2017.04.026 PMID: 28529185

12. Mokdad A.H., Ford E.S., Bowman B.A., et al. Prevalence of obesity, diabetes, and obesity-related health risk factors, 2001. JAMA. 2003; 289(1): 76–79. https://doi.org/10.1001/jama.289.1.76 PMID: 12503980

13. Setia M.S. Methodology series module 3: Cross-sectional studies. Indian J Dermatol. 2016; 61(3): 261–264. https://doi.org/10.4103/0019-5154.182410 PMID: 27293245

14. Richter T., Nestler-Parr S., Babela R., et al. Rare disease terminology and definitions-a systematic global review: report of the ISPOR rare disease special interest group. Value Health. 2015; 18(6): 906–914. https://doi.org/10.1016/j.jval.2015.05.008 PMID: 26409619

15. Von Elm E., Altman D.G., Egger M., et al. The Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) statement: Guidelines for reporting observational studies. PLoS Med. 2007; 4(10): 1623–1627. https://doi.org/10.1371/journal.pmed.0040296 PMID: 17941714

16. Shaposhnikov D., Revich B., Bellander T., et al. Mortality related to air pollution with the Moscow heat wave and wildfire of 2010. Epidemiology. 2014; 25(3): 359–364. https://doi.org/10.1097/EDE.0000000000000090 PMID: 24598414

17. D’Souza G., Kreimer A.R., Viscidi R., et al. Case–control study of human papillomavirus and oropharyngeal cancer. N Engl J Med. 2007; 356(19): 1944–1956. https://doi.org/10.1056/NEJMoa065497 PMID: 17494927

18. Bossuyt P.M., Reitsma J.B., Bruns D.E., et al. STARD 2015: An updated list of essential items for reporting diagnostic accuracy studies. BMJ. 2015; 351: h5527. https://doi.org/10.1136/bmj.h5527 PMID: 26511519

19. Rico-Campà A., Martínez-González M.A., Alvarez-Alvarez I., et al. Association between consumption of ultra-processed foods and all cause mortality: SUN prospective cohort study. BMJ. 2019; 365: l1949. https://doi.org/10.1136/bmj.l1949 PMID: 31142450

20. De Blok C.J.M., Wiepjes C.M., Nota N.M., et al. Breast cancer risk in transgender people receiving hormone treatment: Nationwide cohort study in the Netherlands. BMJ. 2019; 365: l1652. https://doi.org/10.1136/bmj.l1652 PMID: 31088823

21. Chauhan K., Pandey A., Thakuria B. Hand hygiene: An educational intervention targeting grass root level. J Infect Public Health. 2019; 12(3): 419–423. https://doi.org/10.1016/j.jiph.2018.12.014 PMID: 30679038

22. Agusti A., Guillen E., Ayora A., et al. Efficacy and safety of hydroxychloroquine in healthcare professionals with mild SARS-CoV-2 infection: Prospective, non-randomized trial. Enferm Infecc Microbiol Clin. 2020; S0213-005X(20)30413-4. https://doi.org/10.1016/j.eimc.2020.10.023 PMID: 33413989

23. Wiviott S.D., Raz I., Bonaca M.P., et al. Dapagliflozin and cardiovascular outcomes in type 2 diabetes. N Engl J Med. 2019; 380(4): 347–357. https://doi.org/10.1056/NEJMoa1812389 PMID: 30415602

24. Moher D., Hopewell S., Schulz K.F., et al. CONSORT 2010 explanation and elaboration: updated guidelines for reporting parallel group randomised trials. BMJ. 2010; 340: c869. https://doi.org/10.1136/bmj.c869 PMID: 20332511

25. Coughlin S.S., Young L. Social determinants of myocardial infarction risk and survival: A systematic review. Eur J Cardiovasc Res. 2020; 1(1): 1–12. https://doi.org/10.31487/j.ejcr.2020.01.02 PMID: 33089252

26. Wang B., Luo Q., Zhang W., et al. The involvement of chronic kidney disease and acute kidney injury in disease severity and mortality in patients with COVID-19: A meta-analysis. Kidney Blood Press Res. 2021; 46(1): 17–30. https://doi.org/10.1159/000512211 PMID: 33352576

27. Moher D., Liberati A., Tetzlaff J., et al. Preferred reporting items for systematic reviews and meta-analyses: The PRISMA statement. PLoS Med. 2009; 6(7): e1000097. https://doi.org/10.1371/journal.pmed.1000097 PMID: 19621072

28. Kirkwood B., Stern J. Essential Medical Statistics. 2nd ed. Blackwell Publishing; 2003; 512 p.

29. Gallin J., Ognibene F., Johnson L. Principles and practice of clinical research. 4th ed. Elsevier; 2017; 824 p. eBook

About the Authors

N. M. Bulanov

Sechenov First Moscow State Medical University (Sechenov University)
Russian Federation

Nikolay M. Bulanov, Cand. of Sci. (Medicine), Associate Professor, Department of Internal, Occupational Diseases and Rheumatology

8/2, Trubetskaya str., Moscow, 119991

O. B. Blyuss

Sechenov First Moscow State Medical University (Sechenov University); University of Hertfordshire
United Kingdom

Oleg B. Blyuss, Cand. of Sci. (Phys. and Math.), Associate Professor, Department of Paediatrics and Paediatric Infectious Diseases; Senior Lecturer

College Lane, Hatfield, AL10 9AB

D. B. Munblit

Sechenov First Moscow State Medical University (Sechenov University); Imperial College London
United Kingdom

Daniil B. Munblit, PhD, Professor, Department of Paediatrics and Paediatric Infectious Diseases; Honorary Senior Lecturer, Inflammation, Repair and Development Section, National Heart and Lung Institute, Faculty of Medicine

Exhibition Rd, South Kensington, London, SW7 2BU

N. A. Nekliudov

Sechenov First Moscow State Medical University (Sechenov University)
Russian Federation

Nikita A. Nekliudov, student, Institute of Child’s Health named after N.F. Filatov

8/2, Trubetskaya str., Moscow, 119991

D. V. Butnaru

Sechenov First Moscow State Medical University (Sechenov University)
Russian Federation

Denis V. Butnaru, Cand. of Sci. (Medicine), Vice-rector for Research

8/2, Trubetskaya str., Moscow, 119991

K. B. Kodzoeva

Sechenov First Moscow State Medical University (Sechenov University)
Russian Federation

Khava B. Kodzoeva, Postgraduate Student, Department of Internal Medicine Propaedeutics, Gastroenterology and Hepatology

8/2, Trubetskaya str., Moscow, 119991

M. Yu. Nadinskaya

Sechenov First Moscow State Medical University (Sechenov University)
Russian Federation

Maria Yu. Nadinskaia, Cand. of Sci. (Medicine), Associate Professor, Department of Internal Medicine Propaedeutics, Gastroenterology and Hepatology

8/2, Trubetskaya str., Moscow, 119991

A. A. Zaikin

Sechenov First Moscow State Medical University (Sechenov University); University College London
United Kingdom

Alexey A. Zaikin, Cand. of Sci. (Phys. and Math.), Deputy Director, Centre for Analysis of Complex Systems; Professor of Systems Medicine, Institute for Women’s Health and Department of Mathematics

Gower str., London, WC1E 6BT

This work is licensed under a Creative Commons Attribution 4.0 License.

ISSN 2218-7332 (Print)
ISSN 2658-3348 (Online)

Sechenov Medical Journal
