
Why Do We Need Clinical Trials?
February 25, 2026
Why do we test medications and treatments? The answer should be simple and obvious to everyone: to separate what works and is safe from what does not work or is not safe. To know how to use a new medication or treatment, at what doses, and for how long. To obtain this information, medicine and public health developed what we call clinical trials. There are several modalities, and the so-called gold standard — the one that answers these questions with the highest probability of being correct — is the famous “randomized, double-blind, placebo-controlled clinical trial,” or RCT for short.
During the COVID-19 pandemic, RCTs were discussed so extensively that, on one hand, the public became familiar with the idea that medications and vaccines must undergo rigorous, controlled testing; on the other, an illusion of knowledge took hold: most people grasped the general concept but did not learn the details, the possible exceptions, or the choices made to ensure that all volunteers are treated fairly and ethically while scientific rigor is preserved.
We now need to take a step back and devote time to understanding the nuances of trials and the rules they follow.
A clinical trial seeks to be a fair comparison between the medication or therapy we want to understand and something we already know well: an established treatment, nothing, or a placebo. The central idea is to compare similar groups of people under identical conditions, differing only in the drug being tested. By isolating the drug as the only relevant difference between the groups, it becomes easier to attribute any observed effect to the drug.
One group receives the medication. This is the treatment group. The other group is the control group. This group may receive nothing, which would be a negative control, or it may receive a placebo, an “imitation” of the treatment, which may be a flour or sugar pill, a saline injection, and there have even been cases of “sham surgeries,” in which an incision was made at the same location as the surgical procedure, with sutures: everything identical to the treatment being tested, except for the crucial part that is supposed to make a difference.
A controlled clinical trial must ask the following question: I want to know what happens when I apply treatment X. To isolate the specific effect of the treatment from other factors that may influence my observation, I need an idea of what would happen, under the same circumstances, if I did not apply it. This reasoning is called the “counterfactual.”
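The counterfactual idea can be made concrete with a toy simulation (all numbers here are invented for illustration): each simulated person carries both potential outcomes, and a randomized-style split between two large groups recovers the true average effect even though only one outcome per person is ever observed.

```python
import random

rng = random.Random(0)

# Each simulated person has two *potential* outcomes: one with the
# treatment and one without. In a real trial only one of the two is
# ever observed; the other is the counterfactual.
people = []
for _ in range(100_000):
    untreated = rng.gauss(50, 10)              # outcome without treatment
    people.append((untreated + 5, untreated))  # treatment adds 5 points

# Split the identically generated people into two groups and observe
# only one potential outcome per group, as a trial would.
half = len(people) // 2
treated_obs = [with_tx for with_tx, _ in people[:half]]
control_obs = [without_tx for _, without_tx in people[half:]]

effect = sum(treated_obs) / half - sum(control_obs) / half
print(round(effect, 2))  # close to the true effect of 5
```

Because the two groups were generated the same way, the group difference approximates the unobservable person-by-person counterfactual comparison.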
The placebo group is very useful for “blinding” volunteers and, when possible, the researchers and medical team as well: the so-called “double-blind.” The advantage of double-blinding, in which neither volunteers nor the researchers interacting with them know who is receiving the real treatment, is to reduce the impact of the cognitive biases that everyone has. Confirmation bias, for example, the tendency to see what we expect or hope to see, may lead us to “see” results that are not actually there. We may “see” patients improving, or patients may report improvement, motivated by the knowledge that they are receiving a medication they consider promising.
The reverse is also true: if people know they are in the placebo group, they may “see” symptoms that are not actually there. Concerned physicians, who know that these volunteers are not receiving the real treatment, may want to order more tests, provide more care, and treat these patients differently. All of this can bias the evaluation of results and compromise the trial.
Another common type of bias is selection bias. If groups are not “randomized,” that is, if people are not distributed randomly into treatment and control groups, the groups may not be similar to one another. Imagine studying a medication for high blood pressure, but having one group of young athletes and another of sedentary elderly individuals. The example is extreme, but it gives an idea of the importance of randomization.
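What simple randomization does can be sketched in a few lines (the function name and participant IDs below are illustrative, not from any real trial software; real trials use concealed, pre-generated allocation sequences, often with blocking or stratification):

```python
import random

def simple_randomize(participants, seed=None):
    """Shuffle participants and split them into two equal arms.

    Illustrative only: real trials conceal the allocation sequence
    from recruiting staff and often use block or stratified
    randomization to balance known prognostic factors.
    """
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

treatment_arm, control_arm = simple_randomize(
    [f"P{i:03d}" for i in range(1, 101)], seed=42
)
print(len(treatment_arm), len(control_arm))  # 50 50
```

Because chance, not a recruiter's judgment, decides each person's arm, systematic differences between the groups tend to average out as the sample grows.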
Randomization also helps eliminate what we call “confounding factors,” which are exactly that — factors that confound results. Demographic data, such as the age difference in the example above, may be a confounding factor. Other examples include differences in general health status, diet, exercise habits, access to medical and hospital care, place of residence, and even family income and education. All these factors should, in theory, be equalized between the groups of participants in a clinical trial, and if that is not possible, we must use statistical mechanisms to try to “adjust” for these differences.
The size of the groups also matters. The more people involved, the greater the statistical power of the comparison — that is, its ability to detect a real effect. A low-power study (with an insufficient number of volunteers) runs two risks: failing to detect a real effect or detecting an illusory effect. If I work with only a few people, the result that appears may be due to chance, to luck. With a larger group, we reduce this interference. Who does not remember the vaccine clinical trials conducted with thousands of people? Thirty thousand, forty thousand? That was why — to ensure a solid statistical analysis showing that the vaccine worked for a large number of people, with a significant difference between the treatment group (which received the vaccine) and the control group (which received a placebo). The comparison, in this case, should show that many more people in the vaccine group were protected compared to the control group.
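The link between effect size and the number of volunteers can be sketched with the standard normal-approximation formula for comparing two proportions (the attack rates below are invented for illustration, not taken from any actual vaccine trial):

```python
import math
from statistics import NormalDist

def n_per_group(p1, p2, alpha=0.05, power=0.80):
    """Approximate per-group sample size to detect a difference
    between two proportions (two-sided test, normal approximation)."""
    z_a = NormalDist().inv_cdf(1 - alpha / 2)   # ~1.96 for alpha = 0.05
    z_b = NormalDist().inv_cdf(power)           # ~0.84 for 80% power
    p_bar = (p1 + p2) / 2
    num = (z_a * math.sqrt(2 * p_bar * (1 - p_bar))
           + z_b * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return math.ceil(num / (p1 - p2) ** 2)

# Hypothetical attack rates: 1% in the control arm. Halving the risk
# (50% efficacy) requires far more volunteers per arm than cutting it
# to a fifth (80% efficacy).
print(n_per_group(0.01, 0.005))  # thousands per arm
print(n_per_group(0.01, 0.002))  # noticeably fewer
```

With rare outcomes like infection during a trial window, each arm needs thousands of people, which is one reason the pandemic vaccine trials enrolled tens of thousands.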
But could we not compare with the baseline level of disease in the population? Do we really need a control group? The control group should be used whenever possible because it is the best way to ensure a fair comparison. Without it, we lose the guarantee that the groups being compared are similar. Comparing with the population baseline would mean giving up control of the study conditions: we would be comparing different groups of people, at different times, with different disease transmission rates, in different places, and sometimes even with different parameters. Conducting a controlled study is the most reliable way to ensure a fair comparison, knowing that everything was tested under exactly the same conditions.
⸻
And the Exceptions?
There are situations in which a controlled trial is not necessary, but they are rare and generally easy to recognize: when the effect of the treatment is dramatic and obvious. A good example is the use of electrical defibrillation for cardiac arrest. This is an acute emergency in which applying an electrical current to the chest can restore the heartbeat: a dramatic and immediately observable effect. Other dramatic effects described in the book Testing Treatments, by Evans and collaborators, include draining pus from abscesses and blood transfusions for hemorrhage. These are effects that could not plausibly happen spontaneously or because of confounding factors in the counterfactual.
Another dramatic factor is effect size. A classic example is the effect of vitamin C in treating scurvy. In the mid-18th century, the British naval physician James Lind conducted what was likely the first controlled clinical trial in history for a disease and an intervention. Aboard HMS Salisbury in 1747, he divided twelve sailors with scurvy into six pairs, gave each pair a different remedy, and only the pair given oranges and lemons recovered rapidly. At that time, scurvy, which we now know is caused by vitamin C deficiency, killed more sailors than wars. Prolonged periods at sea, without access to fresh fruits and vegetables, created the ideal environment for the disease to develop. Vitamin C is essential for forming collagen, which in turn makes up connective tissue, and it is also needed in metabolic pathways for energy production. Lack of vitamin C causes anemia, weakness, muscle pain, impaired wound healing, blackening of the gums, and tooth loss.
Lind was fortunate. If the effect size had been more subtle, it might not have appeared in such a small group. The same reasoning applies to adverse effects. During vaccine clinical trials in the pandemic, we knew that rare adverse effects would only appear in large numbers of people, perhaps only after the vaccine had already been approved. Which in fact happened: some effects were documented only after certain approved vaccines had been administered to millions. Another classic example of a dramatic effect size is the use of insulin for people with diabetes.
Sample size matters for this reason as well: the larger the sample, the more likely we are to identify smaller effects and rarer adverse effects. The type of outcome we want to measure also helps determine the necessary sample size, that is, how many people must be recruited to have the statistical power to conclude whether the treatment works. The more objective and easy to measure the outcome, the better; if it is subjective and dependent on interpretation, drawing conclusions becomes harder and may require many more people.
⸻
And the Ethical Question?
It is not always possible or ethical to conduct a controlled study with a control or placebo group. The most commonly used standard to define the ethics of a clinical trial is what we call “equipoise.” Equipoise is defined as a genuine state of uncertainty about the merit of a particular treatment. In other words, if no one knows whether the new treatment works or not, or whether it is better than the pre-existing one, it is ethical to conduct a clinical trial. If there is sufficient information that the new treatment works and is better, or that it does not work and is worse, then it would be unethical to conduct the trial.
Another important ethical issue is the use of a placebo group. As we have seen, placebo groups are excellent for reducing cognitive biases and confounding factors, especially when it is possible to conduct a double-blind trial. But it is not always ethical to use a placebo group if that would mean depriving a group of people of receiving already approved and effective treatments. For example, to test a new cancer medication, one cannot allocate sick individuals to a treatment group receiving the new drug and other sick individuals to receive a sugar pill. In such conditions, what is done is to compare the new treatment with the standard therapy for that disease. Returning to the vaccine example, imagine someone wants to test a new vaccine for a disease, but an older vaccine already exists. The new vaccine can then be tested in comparison with the older one, in a “non-inferiority” trial. Here, we seek to show that the new vaccine is at least as good as the older one. But no one goes without protection.
There are various experimental designs that allow for an ethical trial and a fair comparison between groups without depriving anyone of receiving existing treatments. In addition to testing against standard treatment, we can add the new drug in an A + B model. In some cases, it is even possible to include a placebo. For example, I can design a trial where both groups receive the approved standard treatment A. But one of the groups also receives the new medication B. And the control group receives a placebo for B.
To adapt the RCT to ethical standards that do not deprive anyone of appropriate care, the “SAME” framework is generally used: Substitution, Augmentation, Maintenance, and/or Elimination. If people are allocated to a group receiving something distinct from the standard, it is substitution; if they receive something in addition to the standard, it is augmentation; and so on.
An example was the trial of the antiretroviral nevirapine. When this medication was tested, other approved medications for HIV already existed, so the trial was designed as follows: both arms received the standard treatment, zidovudine and didanosine, and only one arm received the “augmentation” with nevirapine. To maintain blinding, the control group received a placebo for nevirapine. The study was published as a “randomized, double-blind, placebo-controlled clinical trial.” And it was entirely ethical.
Effect size can also create an ethical issue. If the effect is very dramatic, it becomes unethical to continue the study with people in the control group without access to the superior treatment. In this case, it is justified to interrupt the study, break the blinding, and offer the treatment to the placebo group.
All of this is determined on a case-by-case basis. It may be perfectly acceptable to use a placebo, or a control that “eliminates” part of the standard procedure, in a clinical condition that causes only discomfort when untreated, and volunteers agree to participate in a clinical trial for a certain period of time, hoping that this will result in a better and more practical medication in the future. For diseases such as cancer or HIV/AIDS this would not be acceptable, and using “augmentation” appears to be the best strategy. There is no single, immutable solution, nor something generalizable like “we must always use placebo” or “we must never use placebo.”
Finally, it is important to emphasize that the double-blind, placebo-controlled RCT is generally the last phase of testing. Before reaching that point, there is a long path of preclinical tests, in cells and in animals (rodents and non-rodents), before clinical trials in humans begin. Phase 1 uses a few dozen people to test only the safety of the new treatment. Phase 2 involves hundreds of people, in whom parameters and markers are measured, such as antibodies, blood markers, and imaging tests; it may or may not include a control group, depending on the experimental design. Finally comes Phase 3, where, if possible, a full RCT is conducted with thousands of people. After everything is completed, if approved, the medication is released to the market, but there is still Phase 4, in which monitoring continues to investigate adverse effects and drug interactions and to see how the new medication behaves in the real world.
All of this takes years, usually at least 5 to 8. Few molecules that appear promising in preclinical tests reach Phase 3. So when we say that something has appeared promising in animals, or in small pilot tests in humans, or even in Phase 1, all we can say is exactly that: it appears promising. To know whether it works, there is no other way but to follow scientific methodology. Regulatory agencies know this. That is why they require specific studies for each phase. And that precaution keeps us safe and equipped with the knowledge necessary to make the best use of each innovation.
⸻
Natalia Pasternak holds a PhD in microbiology, is president of the Instituto Questão de Ciência, adjunct senior researcher at the Center for Science and Society and adjunct professor at the School of International Relations and Public Affairs (SIPA), both at Columbia University (USA). She is an associate researcher at the Department of Microbiology of the Institute of Biomedical Sciences at the University of São Paulo (USP) and co-author of the books Ciência no Cotidiano (Editora Contexto), winner of the Jabuti Prize; Contra a Realidade (Papirus 7 Mares); and Que Bobagem! (Editora Contexto).
⸻
REFERENCES
Pearl, Judea, and Dana Mackenzie. The Book of Why: The New Science of Cause and Effect. New York: Basic Books, 2018.
Evans I, Thornton H, Chalmers I, Glasziou P. Testing Treatments: Better Research for Better Healthcare. 2nd ed. London: Pinter & Martin; 2011.
The James Lind online library. United Kingdom. http://www.jameslindlibrary.org/articles/james-lind-and-scurvy-1747-to-1795/
Freedman B. Equipoise and the ethics of clinical research. N Engl J Med. 1987 Jul 16;317(3):141-5. doi: 10.1056/NEJM198707163170304. PMID: 3600702.
Senn S, Chalmers I. Giving and taking: ethical treatment assignment in controlled trials. Journal of the Royal Society of Medicine.