
Beyond P-Values: How EFSA Assesses Biological Relevance in Nutraceutical Trials
Key Takeaways
- EFSA prioritizes biological relevance over statistical significance, requiring meaningful, consistent, and reproducible health impacts for nutraceutical approval.
- Common pitfalls in trials include small effect sizes, short durations, and lack of dose-response data, leading to EFSA rejections.
- How to design a study so that it passes EFSA’s appraisal of biological relevance.
In nutraceutical research, a p-value below 0.05 (p<0.05) often feels like a win. It quickly becomes headlines, “clinically proven” labels, and confidence for investors. But when the same data reaches the European Food Safety Authority (EFSA), that celebration often ends with an unexpected response: “Not sufficient.”
The reason is clear. EFSA doesn’t judge results by numbers alone. A p-value only indicates how unlikely the observed result would be if there were no true effect; it doesn’t reveal if the change is large enough, consistent enough, or relevant enough to improve human health. A small dip in cholesterol or a short-lived biomarker shift may still appear “significant,” but it won’t pass EFSA’s appraisal of biological relevance.
This shift from focusing on p-values to demonstrating real-world impact determines whether a claim gains trust or fades away. This article traces the path from statistical significance to real-world impact as EFSA sees it and offers practical tips for designing trials that meet EFSA standards.
EFSA’s Framework for Biological Relevance
Effective trial design begins with a clear view of what EFSA considers biologically relevant and the type of evidence required to prove it. Understanding EFSA’s framework for biological relevance is the first step.
- The Foundational Guidance (2017): EFSA’s Guidance on the Assessment of the Biological Relevance of Data in Scientific Assessments (2017) remains the cornerstone. It doesn’t just ask whether a result is statistically significant; it asks whether the observed change in humans is genuinely meaningful for health. This guidance laid the groundwork for moving beyond numbers to physiological impact.1
- Statistical Significance versus Biological Relevance: Back in 2011, EFSA’s Scientific Committee established a clear and lasting benchmark. Their statement, “Statistical Significance and Biological Relevance,” made it clear: a p-value below 0.05 does not guarantee approval. Instead, reviewers consider effect size, consistency across outcomes, reproducibility across multiple studies, dose-response relationships, and relevance to normal human physiology.
In other words, it’s not just about proving something happens; it’s about proving it from multiple angles, demonstrating that the finding is not merely a statistical fluke, but a consistent, reproducible, dose-linked effect that truly matters in the real world.2
- The Evolution of Guidance (2024): More recently, EFSA’s Scientific Committee (2024) has pushed the conversation further. Today, dossiers must weave together multiple layers of evidence: epidemiology, mechanistic plausibility, and high-quality human trials. This holistic approach reflects EFSA’s growing demand for proof that is both statistically sound and biologically credible.3
What Does Biological Relevance Mean in Practice?
For EFSA, a p-value below 0.05 is just a number. What matters is whether the change is real, lasting, and meaningful for people’s health. Biological relevance simply means trial outcomes should correlate with an impact on human physiology in real life.
Below are the core elements EFSA looks for in Biological Relevance, and how they apply in the context of nutraceutical trials:
| Component | What EFSA Expects | Why It Matters in Nutraceutical Trials |
| --- | --- | --- |
| Effect Size | A measurable change large enough to influence health, not just a decimal shift. | A 1% drop in LDL-C may be “significant,” but only a 10–15% drop is meaningful for risk reduction. |
| Dose-Response | Evidence that higher intake produces stronger or sustained effects. | Supports causality and helps EFSA judge the “real-world” serving size needed for a benefit. |
| Population Relevance | Results in healthy or at-risk groups, not diseased patients. | EFSA authorizes claims for the general population; diseased cohorts shift evidence into drug territory. |
| Duration & Sustainability | Effects must persist for the duration relevant to the claim. | A short-lived biomarker spike won’t convince EFSA of lasting health impact. |
| Consistency Across Studies | Replication in multiple trials and settings. | A single positive trial is rarely enough; reproducibility strengthens credibility. |
| Mechanistic Plausibility | A clear biological explanation linking the ingredient to the effect. | Mechanism builds trust that the result isn’t a fluke and aligns with known physiology. |
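The dose-response component lends itself to a quick numerical check. The sketch below fits a least-squares slope to arm-level mean changes and tests whether the effect moves in one direction as the dose rises. The doses, the LDL-C values, and the function names are invented for illustration; a real analysis would model individual participant data and pre-specify the trend test in the protocol.

```python
def dose_response_slope(doses, mean_changes):
    """Least-squares slope of mean outcome change per unit of dose."""
    n = len(doses)
    mean_dose = sum(doses) / n
    mean_change = sum(mean_changes) / n
    sxx = sum((d - mean_dose) ** 2 for d in doses)
    sxy = sum((d - mean_dose) * (c - mean_change)
              for d, c in zip(doses, mean_changes))
    return sxy / sxx


def is_monotonic_decrease(values):
    """True if each value is no higher than the one before it."""
    return all(later <= earlier for earlier, later in zip(values, values[1:]))


# Hypothetical arm-level data, invented for illustration only.
doses_g_per_day = [0.0, 1.0, 2.0, 3.0]          # placebo plus three doses
ldl_change_mg_dl = [-1.0, -6.0, -11.0, -13.0]   # mean change from baseline

slope = dose_response_slope(doses_g_per_day, ldl_change_mg_dl)
print(f"Slope: {slope:.1f} mg/dL per g/day")
print("Monotonic decrease:", is_monotonic_decrease(ldl_change_mg_dl))
```

In this made-up data set the reduction grows with dose but flattens between the two highest arms, the kind of plateau that a single-dose trial cannot reveal.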
Pitfalls in Nutraceutical Trials That Trigger EFSA Rejections
Despite statistically significant results (p < 0.05), many nutraceutical trials fail to clear EFSA’s bar for biological relevance. Below are the most common pitfalls that derail otherwise promising dossiers:
- Tiny effect sizes that are statistically significant due to large sample sizes, yet offer negligible real-world benefit.
- Short-duration trials where the measured effect might fade quickly.
- Endpoint mismatch: Using biomarkers or lab measures not validated or relevant for human health (e.g., total antioxidant capacity measured in vitro rather than biomarkers of oxidative stress in vivo).
- Lack of dose-response data: Testing only one dose, or extreme doses that are impractical or unsafe in regular use.
- Failing to replicate: Only one trial, lacking consistency across populations or centers.
- Neglecting baseline characteristics: Enrolling populations with abnormal starting values (e.g., very high cholesterol) when the claim concerns maintenance of normal values makes the results hard to apply to the target population.
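The first pitfall is easy to reproduce with arithmetic. The following sketch holds a tiny effect fixed and increases the sample size until the p-value drops below 0.05, while the standardized effect size stays the same. The 1.3 mg/dL difference, the 30 mg/dL standard deviation, and the helper name `two_sample_z` are assumptions chosen for illustration, and the z-test with a known common SD is a simplification of what a trial statistician would use.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z(mean_diff, sd, n_per_arm):
    """Two-sided z-test for a difference in means between two equal arms.

    Assumes a known, common standard deviation (a simplification).
    Returns the p-value and the standardized effect size (Cohen's d).
    """
    std_error = sd * sqrt(2 / n_per_arm)
    z = mean_diff / std_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    cohens_d = mean_diff / sd
    return p_value, cohens_d


# Hypothetical numbers: a 1.3 mg/dL LDL-C difference (about 1% of a
# 130 mg/dL baseline) with a between-subject SD of 30 mg/dL.
for n in (100, 1_000, 10_000):
    p_value, cohens_d = two_sample_z(mean_diff=1.3, sd=30.0, n_per_arm=n)
    print(f"n per arm = {n:>6}: p = {p_value:.4f}, Cohen's d = {cohens_d:.3f}")
```

Only the largest sample crosses the significance threshold, yet Cohen's d is identical in every row. The effect did not become more relevant; the study simply became large enough to detect it.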
Cracking EFSA’s Code: Best Practices Beyond P-Values
Here are some best practices to design and plan trials that move beyond p-values and stand up to EFSA’s test of biological impact:
- Define “Meaningful” Before the First Subject Is Enrolled: A clinical trial must not conclude with doubts about the biological relevance of the observed effect; the threshold for relevance should be documented in the protocol at the design stage.
  - Define the primary outcomes to be tested in the trial. Secondary outcomes are allowed, but do not overdo them. Do not turn the study into a fishing expedition; focus on the primary hypothesis.
  - Set a threshold for meaningful impact. For example, in a relatively healthy or at-risk population, an LDL-C reduction of about 20 mg/dL signifies a biologically relevant change. In contrast, the same reduction in patients with already elevated LDL may not have the same biological importance. Conversely, a mere 2% change, although statistically detectable in a large sample, has little physiological significance in any group.
  - Anchor the target effect to accepted outcomes. Minimum relevant differences should align with established clinical cut-offs or recognized risk reduction benchmarks that are consistent with EFSA’s guidance on biological relevance.
- Include Multiple Doses / Dose-Response Arm(s): Testing more than one dose allows you to observe whether an increase in dose yields an increased effect, or whether there is a threshold or plateau. This helps EFSA judge dose credibility.
- Choose Endpoints That Are Validated, Relevant, and Recognized
  - EFSA mainly accepts health claims based on validated and established biomarkers. Hence, use biomarkers that EFSA has recognized in its previous health claim guidance documents, for example LDL-C for cardiovascular risk, HbA1c for glucose control, stool transit measures for gut function, and validated inflammatory markers. These endpoints appear in EFSA’s published guidance on cardiovascular health, glucose metabolism, and GI/immune system claims.4,5
  - Avoid surrogate endpoints unless strong evidence exists that they predict meaningful outcomes in humans. For instance, EFSA guidance (Guidance for the scientific requirements for health claims related to antioxidants, oxidative damage and cardiovascular health, 2018) indicates that in vitro antioxidant capacity assays such as Oxygen Radical Absorbance Capacity (ORAC), Ferric Reducing Antioxidant Power (FRAP), and similar tests are not sufficient alone as predictors of human health benefit, unless there is strong evidence linking them to meaningful physiological outcomes.6
- Ensure Duration & Follow-Up Are Long Enough
  - Some effects (like lipid changes, gut microbiome shifts, etc.) may take weeks to months to stabilize.
  - Follow-up periods help show whether the effect persists or regresses over time.
- Statistical Power Plus Clinical Relevance Criterion
  - Don’t just calculate sample size to detect any statistical difference; power the trial to identify a biologically meaningful difference. The p-value indicates how compatible the observed data are with no effect, while the study’s power indicates how likely it is to detect a real effect of a given size if one exists. A well-powered study makes sure that if a meaningful biological effect is present, the trial has a high chance of finding it. Very large studies may flag tiny changes that are statistically significant but unimportant for health, while underpowered studies may miss truly important biological effects entirely. Powering the trial for a pre-specified meaningful difference is what helps separate a statistical curiosity from a change with real-world biological relevance.
  - Consider dual-criterion designs: significance and a minimal clinically relevant threshold (see the literature on dual-criterion designs in clinical trials).7
- Mechanistic / Biomarker Endpoints
  - Including mechanistic or biomarker endpoints (e.g., inflammatory cytokines, oxidative stress metrics) adds biological credibility.
  - These should be secondary or exploratory endpoints, but well designed and measured with validated assays.
- Reproducibility & External Validity
  - Multi-centre or cross-population trials improve generalizability.
  - Compare with prior studies; ensure consistency in effect.
  - Also check for reproducibility within the same study across related outcome measures, to show that the observed effect is not an isolated finding.
  - Typically, EFSA will want to see at least two independent studies showing the same outcome.
- Transparent Reporting & EFSA-Friendly Dossier Design
  - Notify EFSA of the study before it starts (both the applicant and the testing facility must notify).
  - Report effect sizes with confidence intervals, not just p-values.
  - Report negative or non-significant results.
  - Disclose variability, baseline values, and dropouts.
  - Use EFSA’s format and full data disclosure.8
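Two of the practices above, powering for a meaningful difference and applying a dual criterion, can be sketched in a few lines. The first function uses the standard normal-approximation formula for a two-arm parallel trial; the second is a simplified frequentist reading of the dual-criterion idea (the confidence interval excludes zero and the point estimate reaches the pre-specified threshold), not the full design described in the cited literature. The 30 mg/dL standard deviation, the example estimates, and the function names are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def n_per_arm(min_relevant_diff, sd, alpha=0.05, power=0.80):
    """Participants per arm for a two-arm parallel trial.

    Normal approximation, two-sided test, equal SD and equal allocation.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return ceil(2 * ((z_alpha + z_power) * sd / min_relevant_diff) ** 2)


def dual_criterion(estimate, std_error, min_relevant_diff, alpha=0.05):
    """Return (significant, relevant) for an estimated reduction.

    Positive estimates mean a benefit. 'significant' requires the lower
    confidence bound to exceed zero; 'relevant' requires the point
    estimate to reach the pre-specified minimum relevant difference.
    """
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    lower_bound = estimate - z_crit * std_error
    return lower_bound > 0, estimate >= min_relevant_diff


# Hypothetical planning values: SD of 30 mg/dL, threshold of 20 mg/dL.
print("n per arm for 20 mg/dL:", n_per_arm(20.0, 30.0))
print("n per arm for 2.6 mg/dL:", n_per_arm(2.6, 30.0))

# Hypothetical trial results as (estimate, standard error) in mg/dL.
for estimate, std_error in [(2.6, 0.9), (22.0, 6.0), (22.0, 15.0)]:
    print(estimate, std_error, dual_criterion(estimate, std_error, 20.0))
```

The three example results cover the cases that matter: a small effect that is significant but not relevant, an effect that meets both criteria, and a large estimate from an imprecise study that is relevant in size but not statistically significant.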
EFSA’s Official Guidance: Key Takeaways
A few specific points from EFSA’s guidance documents that researchers should integrate:
- EFSA’s Guidance on the Assessment of the Biological Relevance of Data in Scientific Assessments urges applicants to address four main aspects: consistency, dose-response, magnitude of effect, and biological credibility.1
- EFSA expects effects to be meaningful within the context of human physiology and relevant to the normal function or condition of the target population. For example, modest changes in clinical biomarkers that do not translate into changes in risk are of limited value.1
- The 2011 EFSA opinion Statistical Significance & Biological Relevance makes clear that statistical significance is only one part of evaluating findings; effect size, consistency across outcomes, and reproducibility are also critical. EFSA’s 2017 Guidance further states that non-significant results may still be biologically relevant, particularly when the effect is large, related outcome measures align, or the study has limited power. (EFSA Scientific Committee, 2011; EFSA Scientific Committee, 2017 Guidance).1,9
Turning EFSA’s guidance into real-world trial practice is rarely straightforward. The challenge is not just in running statistics or picking endpoints; it is in achieving the delicate balance between biological relevance and regulatory acceptance. That balance becomes possible only when sponsors engage early with experienced research partners, sharing their strategic and regulatory goals before a single protocol is drafted. Published nutraceutical trials have shown this in action—for instance, Hancke et al. (2019) demonstrated how rigorous endpoint selection and a carefully engineered design, developed with CRO expertise, translated into evidence that could withstand both scientific and regulatory scrutiny.10
When this level of dialogue occurs upfront, studies achieve sharper designs, more consistent outcomes, and significantly greater credibility during regulatory review.
EFSA approval is important, but it is only the start. Trials rooted in biological relevance go beyond p-values, providing evidence that regulators respect, clinicians trust, and consumers believe. In a marketplace full of claims, this is what distinguishes fleeting products from lasting brands.
References
- Hardy, A.; Benford, D.; Halldorsson, T.; Jeger, M.J.; Knutsen, H.K.; More, S.; Naegeli, H.; Noteborn, H.; et al. Guidance on the assessment of the biological relevance of data in scientific assessments. EFSA Journal. 2017, 15 (8), 4970. DOI: 10.2903/j.efsa.2017.4970
- EFSA Scientific Committee. Statistical Significance and Biological Relevance. EFSA Journal. 2011, 9 (9), 2372. DOI: 10.2903/j.efsa.2011.2372
- More, S.; Bampidis, V.; Benford, D.; Bragard, C.; Hernández-Jerez, A.; Bennekou, S.H.; Koutsoumanis, K.; Lambre, C.; et al. Scientific Committee guidance on appraising and integrating evidence from epidemiological studies for use in EFSA's scientific assessments. EFSA Journal. 2024, 22 (7), e8866. DOI: 10.2903/j.efsa.2024.8866
- Parma, U.; Martini, D.; Del Rio, D.; Bedogni, G.; Pruneti, C.; Ventura, M.; Passeri, G.; Vitale, M.; Dei Cas, A.; Zavaroni, I.; et al. GP/EFSA/NUTRI/2014/01 Scientific substantiation of health claims made on food: Collection, collation and critical analysis of information in relation to claimed effects, outcome variables and methods of measurement. EFSA Support. Publ. 2018, 15, 1272E. DOI: 10.2903/sp.efsa.2018.EN-1272.
- Hoevenaars, F.; van der Kamp, J.W.; van den Brink, W.; Wopereis, S. Next Generation Health Claims Based on Resilience: The Example of Whole-Grain Wheat. Nutrients. 2020, 12 (10), 2945. DOI: 10.3390/nu12102945
- Turck, D.; Bresson, J.L.; Burlingame, B.; Dean, T.; Fairweather-Tait, S.; Heinonen, M.; Hirsch-Ernst, K.I.; Mangelsdorf, I.; et al. Guidance for the scientific requirements for health claims related to antioxidants, oxidative damage and cardiovascular health. EFSA Journal. 2018, 16 (10), e05136. DOI: 10.2903/j.efsa.2018.5136
- Roychoudhury, S.; Scheuer, N.; Neuenschwander, B. Beyond p-values: a phase II dual-criterion design with statistical significance and clinical relevance. Clinical Trials. 2018, 15 (5). DOI:10.1177/174077451877066
- EFSA Panel on Dietetic Products, Nutrition and Allergies. Scientific and technical guidance for the preparation and presentation of a health claim application (Revision 3). EFSA Journal. 2021, 19 (3), e06554. DOI: 10.2903/j.efsa.2021.6554
- EFSA Scientific Committee. Statistical Significance and Biological Relevance. EFSA Journal. 2011, 9 (9), 2372. DOI: 10.2903/j.efsa.2011.2372
- Hancke, J.L.; Srivastav, S.; Cáceres, D.D.; Burgos, R.A. A double-blind, randomized, placebo-controlled study to assess the efficacy of Andrographis paniculata standardized extract (ParActin®) on pain reduction in subjects with knee osteoarthritis. Phytotherapy Research. 2019, 33 (5), 1469–1479. DOI: 10.1002/ptr.6339