The Replication Crisis Is Your Problem Too

Most findings from nutrition, psychology, and medicine that shaped your beliefs about health and behavior have not replicated. This is not a footnote — it changes what you should believe and how you should act.

DoOperator Research · May 24, 2026



The replication crisis is often framed as a problem for academic science — a quality control failure in peer review, a pathology of publish-or-perish incentives, a concern for methodologists and journal editors. This framing lets everyone else off the hook. If the crisis belongs to science, it can be safely ignored by people who just want to know what to eat, how to sleep better, or what actually reduces their anxiety.

That move is not available. The findings that fail to replicate are disproportionately the ones most people have heard of: the results that made headlines, the ones cited in bestselling books, the studies referenced by productivity gurus and longevity researchers. Boring, well-replicated findings rarely become cultural touchstones, because effects that survive replication tend to be small and carefully hedged.

What Failed

The list of high-profile failures is long and continues to grow.

Power posing — the finding that holding a confident body posture for two minutes changes hormone levels and risk tolerance — failed to replicate in a large preregistered study. The original result had a sample of 42 participants. It made the cover of TIME, got a TED talk viewed 60 million times, and influenced management training programs worldwide.

Ego depletion, the finding that willpower is a limited resource that gets used up through the day, was the foundation of an entire framework for understanding self-control. A large multi-site replication found no evidence for the effect. The textbooks have yet to catch up with the research.

Social priming — the idea that subtle environmental cues prime behavior, so that elderly-related words make people walk more slowly — collapsed under replication pressure. The effect sizes in the original studies were implausibly large. The protocols turned out to be sensitive to unconscious experimenter influence in ways that invalidated the results.

Nutrition is worse. Studies linking specific foods to cancer, heart disease, or longevity are overwhelmingly observational, riddled with confounding, and generate findings that reverse with disturbing frequency. Eggs were bad, then fine, then good, then a matter of ongoing controversy. Dietary fat caused heart disease until it did not. Red wine was protective until the studies controlling for abstainer bias were run.

Why It Matters for Individual Decisions

The practical response to the replication crisis is not nihilism. It is calibration.

Findings from small, single studies should be held lightly, regardless of how many media outlets covered them. Effect sizes that are implausibly large should be treated with suspicion. Results that depend on specific experimental protocols and have never been independently replicated are evidence, but weak evidence — weaker than the way they are typically described.

Pre-registered, well-powered, independently replicated findings are more reliable. They also tend to show smaller effects. The calibrated view of psychology and nutrition research is that real effects exist but are modest, heterogeneous, and hard to generalize to any individual's specific context.
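A quick simulation can show why small studies that clear the significance bar tend to overstate effects. The sketch below is illustrative only. The true effect of 0.2 standard deviations, the group sizes, and the function name simulate_published_effects are assumptions chosen for the example, not values from any study discussed here. It uses a rough z-threshold of 1.96 as a stand-in for "this result got noticed."

```python
import random
import statistics


def simulate_published_effects(true_effect, n_per_group, n_studies, seed=0):
    """Simulate many two-group studies of the same true effect.

    Returns (all_estimates, significant_estimates), where the second list
    keeps only studies whose positive result clears a rough z > 1.96 bar.
    """
    rng = random.Random(seed)
    all_estimates = []
    significant = []
    for _ in range(n_studies):
        control = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        treated = [rng.gauss(true_effect, 1.0) for _ in range(n_per_group)]
        diff = statistics.fmean(treated) - statistics.fmean(control)
        # Standard error of a difference between two independent means.
        se = (statistics.variance(control) / n_per_group
              + statistics.variance(treated) / n_per_group) ** 0.5
        all_estimates.append(diff)
        # Crude significance filter (a z cutoff, not an exact t-test).
        if diff / se > 1.96:
            significant.append(diff)
    return all_estimates, significant


# Hypothetical scenario: same true effect, small vs. large studies.
TRUE_EFFECT = 0.2
for n in (21, 400):
    everything, significant = simulate_published_effects(TRUE_EFFECT, n, 2000)
    print(f"n per group = {n}")
    print(f"  mean of all estimates:         {statistics.fmean(everything):.2f}")
    print(f"  share clearing the bar:        {len(significant) / len(everything):.2f}")
    print(f"  mean of 'significant' results: {statistics.fmean(significant):.2f}")
```

Under these assumptions, the average across all simulated studies sits near the true effect at either sample size, but the subset of small studies that clears the bar averages well above it. Filtering on significance is what inflates the estimate, which is one reason a striking result from a small sample deserves less weight than its headline suggests.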

This has a concrete implication for behavioral change. When the evidence for a specific intervention is weak, the most honest approach is not to assume the intervention works and adjust for effect size — it is to treat the intervention as a hypothesis worth testing for yourself. The population-level finding, if it exists at all, tells you what was true on average in a study sample under specific conditions. It does not tell you what will happen to you.

The Right Response

The right response to the replication crisis is more experimentation, not less trust in everything.

Systematic self-experimentation (careful, controlled, honest) generates evidence about what actually works for a specific person. This evidence is particular and local rather than general, but it is also direct: it describes what happened to you rather than what happened on average to a study sample. Knowing that a specific change produced a specific outcome in your specific life is different from knowing that a population study found a statistically significant average effect.

The failure modes of population research — confounding, measurement error, sample bias, publication bias — do not disappear in individual experimentation. But they take different forms, and some of them can be addressed through design choices available to the individual experimenter: controlling the intervention carefully, measuring outcomes consistently, maintaining blind conditions where possible, and running the experiment long enough to distinguish signal from noise.
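The design choices above can be sketched in a few lines. This is a minimal illustration, not a validated protocol. The function names randomized_schedule and permutation_test, the 14-day length, and the sleep scores are all hypothetical. The permutation test also assumes days are exchangeable, which daily measurements often violate through carryover and trends, so treat the resulting p-value as a rough guide.

```python
import random
import statistics


def randomized_schedule(n_days, seed=None):
    """Assign about half the days to 'on' and the rest to 'off', in random order.

    Randomizing in advance keeps you from choosing intervention days
    based on how you already feel.
    """
    rng = random.Random(seed)
    schedule = ["on"] * (n_days // 2) + ["off"] * (n_days - n_days // 2)
    rng.shuffle(schedule)
    return schedule


def permutation_test(on_values, off_values, n_permutations=10000, seed=0):
    """Two-sided permutation test on the difference in means.

    Returns (observed_difference, estimated_p_value).
    """
    rng = random.Random(seed)
    observed = statistics.fmean(on_values) - statistics.fmean(off_values)
    pooled = list(on_values) + list(off_values)
    n_on = len(on_values)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = statistics.fmean(pooled[:n_on]) - statistics.fmean(pooled[n_on:])
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add-one correction keeps the estimated p-value strictly above zero.
    return observed, (extreme + 1) / (n_permutations + 1)


# Made-up example: sleep-quality scores (0-10) logged the same way each morning.
schedule = randomized_schedule(14, seed=1)
print(schedule)

on_scores = [6.5, 7.0, 6.0, 7.5, 6.5, 7.0, 6.0]   # invented numbers
off_scores = [6.0, 6.5, 5.5, 6.0, 7.0, 5.5, 6.0]  # invented numbers
observed, p_value = permutation_test(on_scores, off_scores)
print(f"difference in means: {observed:.2f}, permutation p-value: {p_value:.3f}")
```

The schedule addresses control of the intervention, the fixed daily score addresses consistent measurement, and the test gives a rough sense of whether the observed difference stands out from day-to-day noise. Two weeks is short: with so few observations, only a large and consistent difference would be distinguishable from chance.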

The replication crisis reveals that much of what we call evidence-based practice is not as evidence-based as it appeared. That is a reason to be more rigorous about evidence, not a reason to give up on it. The answer to weak science is better science — including, when population data is unavailable or unreliable, the small-sample science of your own life.
