Methods

Evidence identification

Our goal is to provide reliable, up-to-date answers to clinically relevant questions through continuous evidence synthesis. Electronic database searches are conducted at predefined intervals to identify newly published and ongoing studies. In addition, clinicians, researchers, and other stakeholders may submit potentially relevant studies through the platform. All submitted studies undergo formal eligibility assessment using predefined criteria and methodological standards prior to inclusion.

Eligibility criteria and scope of evidence

This platform focuses primarily on evidence derived from randomised controlled trials (RCTs), as these provide the most reliable estimates of intervention effectiveness by minimising bias and confounding. Eligibility criteria are intentionally broad to ensure comprehensive capture of relevant clinical evidence, while remaining structured around clearly defined research questions formulated using the Population, Intervention, Comparator, and Outcome (PICO) framework.

Priority is given to outcomes of direct clinical importance to patients, clinicians, and healthcare systems. These include measures such as symptom resolution, functional recovery, return to activity, re-injury, and adverse events. This approach ensures that each review addresses clinically meaningful questions, reflects real-world decision-making, and provides estimates of effect that are directly interpretable and applicable in practice. Ongoing and prospectively registered trials are also identified and tracked where possible, allowing incorporation of emerging evidence in future updates.

Data Pooling

Meta-analysis is undertaken where studies are sufficiently comparable in terms of population, intervention, comparator, and outcome. For dichotomous outcomes, pooled effect estimates are calculated as odds ratios with 95% confidence intervals. Analyses are conducted on an intention-to-treat basis, using the number of participants randomised as the denominator where available. Absolute risk differences and number needed to treat are derived from pooled effect estimates and baseline event rates.
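As a worked illustration of the last step, the sketch below derives an absolute risk difference and number needed to treat from a pooled odds ratio and a baseline (control-group) event rate. The function name and example values are ours for illustration, not part of the platform.

```python
import math

def absolute_effects(pooled_or: float, baseline_risk: float) -> tuple[float, float]:
    """Derive the absolute risk difference (ARD) and number needed to
    treat (NNT) from a pooled odds ratio and a baseline event rate."""
    # Convert the baseline risk to odds, apply the pooled OR, convert back.
    odds_control = baseline_risk / (1 - baseline_risk)
    odds_treatment = pooled_or * odds_control
    risk_treatment = odds_treatment / (1 + odds_treatment)
    ard = risk_treatment - baseline_risk          # negative = fewer events with treatment
    nnt = math.inf if ard == 0 else 1 / abs(ard)  # participants treated per event changed
    return ard, nnt

# Example: pooled OR 0.5 with a 50% baseline event rate
ard, nnt = absolute_effects(0.5, 0.5)  # ARD ≈ -0.167, NNT ≈ 6
```

Note that because the OR is applied on the odds scale, the ARD and NNT depend on the assumed baseline risk, which is why they must be derived from both quantities rather than from the OR alone.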

Statistical heterogeneity is assessed using the I² statistic, H², tau-squared, and Cochran's Q test. Where heterogeneity is present, the pooled estimate is interpreted as the mean of a distribution of effects rather than a single common effect.
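These heterogeneity statistics all follow from the inverse-variance study weights. The sketch below computes Q, I², H², and a DerSimonian-Laird tau² from per-study log odds ratios and standard errors; it is a simplified illustration under those standard definitions, not the platform's actual analysis pipeline.

```python
def heterogeneity(log_ors: list[float], ses: list[float]) -> tuple[float, float, float, float]:
    """Cochran's Q, I-squared (%), H-squared, and DerSimonian-Laird
    tau-squared from per-study log odds ratios and standard errors."""
    w = [1.0 / se ** 2 for se in ses]                      # inverse-variance weights
    pooled = sum(wi * y for wi, y in zip(w, log_ors)) / sum(w)
    q = sum(wi * (y - pooled) ** 2 for wi, y in zip(w, log_ors))
    df = len(log_ors) - 1
    h2 = q / df if df > 0 else 1.0                         # H-squared = Q / df
    i2 = 100.0 * max(0.0, (q - df) / q) if q > 0 else 0.0  # I-squared as a percentage
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0        # between-study variance
    return q, i2, h2, tau2

# Two sharply discordant studies: Q = 50, I² = 98%, tau² = 0.49
q, i2, h2, tau2 = heterogeneity([0.0, 1.0], [0.1, 0.1])
```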

Certainty of Evidence Assessment

We assess certainty in the evidence using a structured approach informed by the Grading of Recommendations Assessment, Development and Evaluation (GRADE) framework. Consistent with a minimally contextualised approach, we evaluate certainty in relation to a single critical outcome (e.g. re-injury) and prespecified thresholds for clinical importance. We place particular emphasis on the domains of risk of bias and precision, while also considering inconsistency, missing outcome data, and potential small-study effects as contextual factors that may influence confidence in the estimate. Randomised controlled trials begin as high-certainty evidence and are downgraded where concerns arise in these domains.

Risk of Bias

We assess risk of bias at the study level using the PEDro scale, a validated and widely used tool for evaluating methodological quality in randomised controlled trials of rehabilitation and musculoskeletal interventions. We categorise studies as:

  • Low risk of bias: PEDro score 7–10
  • Moderate risk of bias: PEDro score 4–6
  • High risk of bias: PEDro score 0–3

We determine the overall risk of bias for each pooled estimate by considering both the methodological quality of included studies and their relative statistical contribution (weight) to the pooled effect. We downgrade certainty where the pooled estimate is primarily informed by studies at moderate or high risk of bias.
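A minimal sketch of this weighting logic is given below. The 50% weight threshold and function names are our illustrative assumptions; the platform describes the principle but does not state a numeric cut-off.

```python
def pedro_category(score: int) -> str:
    """Map a PEDro score (0-10) to the review's risk-of-bias category."""
    if score >= 7:
        return "low"       # PEDro 7-10
    if score >= 4:
        return "moderate"  # PEDro 4-6
    return "high"          # PEDro 0-3

def downgrade_for_bias(studies: list[tuple[int, float]], threshold: float = 0.5) -> bool:
    """studies: (PEDro score, meta-analytic weight) pairs.
    Flag a downgrade when more than `threshold` of the pooled weight comes
    from studies at moderate or high risk of bias (threshold is illustrative)."""
    total = sum(weight for _, weight in studies)
    at_risk = sum(weight for score, weight in studies if pedro_category(score) != "low")
    return at_risk / total > threshold
```

Weighting by statistical contribution, rather than counting studies, means a single large low-quality trial can trigger a downgrade even when most included studies are high quality.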

Precision and Clinical Importance

We assess precision using a minimal contextualised approach, based on the position of the pooled effect estimate and its 95% confidence interval relative to prespecified thresholds for clinical importance. Effects within these thresholds are considered clinically trivial, while effects outside this range represent potentially clinically meaningful benefit or harm.

We downgrade certainty for imprecision where the confidence interval overlaps the trivial zone (e.g. OR 0.90–1.10), indicating that the data remain compatible with both clinically meaningful and trivial effects, or with benefit and harm. Where the confidence interval lies entirely outside this range, we consider the estimate sufficiently precise to support a clear clinical interpretation.

In addition to this minimal contextualised approach, we also examine statistical precision by comparing the width of the confidence interval with the magnitude of the pooled effect estimate. This is expressed as the ratio of the confidence interval width to the absolute value of the pooled effect size. The confidence interval width is calculated as the difference between the upper and lower bounds of the 95% confidence interval.

This ratio is used descriptively to contextualise estimate stability. When the confidence interval is substantially wider than the point estimate, the estimate is considered unstable and potentially sensitive to additional data. Ratios closer to or below 1 indicate relatively greater precision, whereas larger ratios indicate increasing imprecision and uncertainty in the true treatment effect. This measure is interpreted alongside the broader assessment of imprecision based on confidence intervals and clinical thresholds.
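The calculation itself is simple; a one-function sketch (names ours) makes the interpretation concrete:

```python
def ci_width_ratio(estimate: float, ci_low: float, ci_high: float) -> float:
    """Ratio of the 95% confidence interval width to the absolute pooled
    effect size. Used descriptively: values near or below 1 suggest
    relative precision; larger values suggest instability."""
    return (ci_high - ci_low) / abs(estimate)

# A pooled OR of 2.0 with 95% CI 1.5 to 2.5 gives a ratio of 0.5
ratio = ci_width_ratio(2.0, 1.5, 2.5)
```

A wide interval around a small effect (e.g. OR 0.5, 95% CI 0.1 to 2.0) yields a ratio well above 1, flagging an estimate likely to shift as new trials are added.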

Small-study effects

Where sufficient studies are available, we explore small-study effects using visual inspection of funnel plots and statistical methods such as trim-and-fill. These analyses are interpreted cautiously, particularly when the number of included studies is small. Evidence of asymmetry or attenuation of effect estimates is considered as a potential source of uncertainty, but is not used in isolation to determine certainty ratings.

Statistical Robustness (Supportive Analyses)

If pooled effect sizes are statistically significant, we evaluate statistical robustness using fragility-based metrics. We define the fragility index (FI) as the minimum number of outcome events that must change to alter statistical significance. We interpret this alongside the fragility quotient and the susceptibility index (SI), which we define as the ratio of the fragility index to the number of participants lost to follow-up.

These measures provide additional context regarding the stability and vulnerability of statistical conclusions, particularly in relation to missing outcome data. We use fragility-based metrics as supportive indicators of robustness, but do not use them directly to determine certainty ratings.
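As an illustration, the fragility index for a single two-arm comparison can be computed by flipping non-events to events in one arm until a two-sided Fisher exact test crosses the significance threshold. The sketch below implements this under the simplifying assumption that flips occur in the first arm only; it is an illustrative implementation, not the platform's code.

```python
from math import comb

def fisher_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    n, row1, col1 = a + b + c + d, a + b, a + c
    def p_table(x):  # hypergeometric probability of a table with cell (1,1) = x
        return comb(col1, x) * comb(n - col1, row1 - x) / comb(n, row1)
    p_obs = p_table(a)
    lo, hi = max(0, row1 + col1 - n), min(row1, col1)
    return sum(p_table(x) for x in range(lo, hi + 1)
               if p_table(x) <= p_obs * (1 + 1e-9))

def fragility_index(e1: int, n1: int, e2: int, n2: int, alpha: float = 0.05) -> int:
    """Minimum number of non-events in arm 1 flipped to events before the
    Fisher test loses significance. Simplification: flips in arm 1 only."""
    events, non_events, fi = e1, n1 - e1, 0
    if fisher_two_sided(events, non_events, e2, n2 - e2) >= alpha:
        return 0  # not significant to begin with
    while non_events > 0:
        events, non_events, fi = events + 1, non_events - 1, fi + 1
        if fisher_two_sided(events, non_events, e2, n2 - e2) >= alpha:
            break
    return fi
```

The fragility quotient then follows as the fragility index divided by the total sample size, and the susceptibility index as the fragility index divided by the number lost to follow-up.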

Susceptibility Index Interpretation

The susceptibility index (SI) indicates how vulnerable statistically significant findings may be to missing outcome data. It represents the ratio of the fragility index (FI) to the number of participants lost to follow-up.

Because outcomes for participants lost to follow-up are unknown, the SI evaluates whether the number of missing participants is large relative to the number of event reversals required to change statistical significance. Lower SI values indicate greater vulnerability, as the number of participants lost to follow-up exceeds the number of event reversals needed to alter significance. Higher SI values indicate greater robustness, as statistical significance would only be altered if a relatively large number of missing participants had experienced different outcomes.

As a general guide:

  • SI < 1: high vulnerability to missing outcome data
  • SI 1–5: some vulnerability to missing outcome data
  • SI > 5: relatively robust to missing outcome data
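This guide translates directly into a small helper; the function names and band labels below are ours for illustration.

```python
import math

def susceptibility_index(fragility_index: int, lost_to_follow_up: int) -> float:
    """SI = fragility index / number of participants lost to follow-up."""
    if lost_to_follow_up == 0:
        return math.inf  # no missing outcomes: maximally robust by this measure
    return fragility_index / lost_to_follow_up

def si_band(si: float) -> str:
    """Map an SI value to the interpretive bands used in the guide."""
    if si < 1:
        return "high vulnerability"
    if si <= 5:
        return "some vulnerability"
    return "relatively robust"

# FI of 2 with 10 participants lost: SI = 0.2, high vulnerability
band = si_band(susceptibility_index(2, 10))
```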

We apply these measures alongside cumulative meta-analysis to evaluate how statistical robustness evolves as evidence accrues.

Rating the Overall Certainty of Evidence

We classify overall certainty according to GRADE Working Group definitions:

  • High certainty — we are very confident that the true effect lies close to that of the estimate of the effect.
  • Moderate certainty — we are moderately confident in the effect estimate; the true effect is likely to be close to the estimate, but there is a possibility that it is different.
  • Low certainty — our confidence in the effect estimate is limited; the true effect may be substantially different from the estimate.
  • Very low certainty — we have very little confidence in the effect estimate; the true effect is likely to be substantially different from the estimate.

Randomised controlled trials begin as high-certainty evidence. We downgrade certainty where concerns arise in the domains of risk of bias or precision.

Community Contributions

If you have identified a study that may meet the eligibility criteria for one of our living reviews, you can submit it for consideration here. All suggested studies are screened using the same criteria as formal searches.

Conflicts of Interest

The Knowledge Hub is independent. Sponsors or partners have no influence on evidence selection, analysis, or conclusions.