The level of evidence

As we saw in the ASK section and previously in this module, there are a number of types of questions we can pose; these can, in turn, be answered by a number of different studies designed in various ways.

Information on treatment, prognosis and incidence, aetiology and risk, diagnosis, and prevalence can be derived from different sources and study designs. All study designs, however, are prone to methodological issues that affect how certain we can be that the findings presented are valid. There is no perfect study, but the limitations of a study should be made clear to the reader, as this helps you to interpret and apply the information. Assessing the quality of the study should be the first step when considering whether to implement new concepts in practice.

A number of study types exist and well-designed studies in each of these categories can be used to address veterinary questions of interest. One way to start is to think about how much evidence a particular type of study can provide for the type of question you are asking. As long as the study you are evaluating is of acceptable quality, choosing the type of study that is most appropriate to answering your question is the best way forward. The table below ranks studies in this way, with those that can show the best evidence at the top of the table, down to those that show the least evidence towards the bottom.

In human medicine, D.L. Sackett introduced the ‘pyramid’ or ‘hierarchy’ of evidence to aid in teaching appraisal of the scientific literature. This pyramid was designed to enable people to understand the concept that not all study designs are equal, but it is most applicable to questions about interventions (treatments). It is now widely accepted that appraising literature is more complicated than this pyramid can account for. In veterinary medicine especially, this pyramid is difficult to apply, and beginning with your clinical question may be more straightforward. Again, remember that a good quality study is essential, no matter what level of evidence the study seems to provide.

In your clinical decision-making, you should rely on the strongest evidence available, and therefore determine the level of evidence a paper provides before implementing the information in clinical practice. You may also need to accept that the “best available” evidence may be lower down in this table than you might prefer (e.g. there may only be a handful of individual case reports rather than a systematic review), but take heart – some evidence is better than none!

Also remember that within each level of evidence, individual information should be evaluated and may be considered to be stronger or weaker after a thorough appraisal.

Level of evidence by type of question

| Level of evidence | Treatment (intervention) | Prognosis | Risk | Diagnosis | Prevalence | Incidence |
|---|---|---|---|---|---|---|
| 1 (most robust) | Systematic review and meta-analysis | Systematic review and meta-analysis | Systematic review and meta-analysis | Systematic review and meta-analysis | Systematic review and meta-analysis | Systematic review and meta-analysis |
| 2 | Randomised controlled trial | Cohort study | Cohort study | Diagnostic test evaluation study | Cross-sectional study | Cohort study |
| 3 | Cohort study | | Case-control study | | | |
| 4 | Case report or case study | Case report or case study | Case report or case study | Case report or case study | Case report or case study | Case report or case study |
| 5 (least robust) | Opinion consensus | Opinion consensus | Opinion consensus | Opinion consensus | Opinion consensus | Opinion consensus |

This table has been adapted and simplified from the Oxford Centre for Evidence-based Medicine – Levels of Evidence 2009. Further information about each type of study is provided below.

Systematic reviews and meta-analyses follow a very strict protocol for summarising the evidence. They can be applied to many different types of questions. RCVS Knowledge provides a checklist of issues to consider in the evaluation of a systematic review.

There are many sources of veterinary evidence, and it can be helpful to break them down into primary (original research) and secondary (reviews with commentary on a number of primary studies) sources. With secondary sources, it’s important to distinguish systematic reviews from narrative reviews of the scientific literature.

Systematic reviews
Systematic reviews employ standardised and rigorous methodologies to review scientific literature, with a view to minimising bias. They conduct a comprehensive literature search to identify, appraise, and synthesise all the relevant studies on a particular topic. They will formally and openly report the sources they use as well as the search strategies used to find those sources, so that searches can be peer-reviewed and replicated.
Narrative reviews
Narrative reviews, on the other hand, do not involve explicitly systematic searches of the literature, and so only tend to cover a subset of studies based on availability or author selection. This can introduce an element of selection bias. It is worth noting that narrative reviews can be informative, particularly if a systematic review does not currently exist on a particular topic, but they are not as robust as systematic reviews.

Meta-analysis applies statistical methods to the results of a systematic review to provide a quantitative summary of the information obtained. Traditionally, it focused on estimating a combined measure (e.g. a relative risk or treatment effect), weighting the included studies according to their size. Combining the results of several studies increases the effective sample size and the resulting statistical power, and so improves the precision of the estimate.

For instance, ten studies looking at one specific type of treatment, when taken together, are much more powerful than one study on its own. The type of objective quantitative assessment of the results that a meta-analysis provides enables conclusions to be drawn based on information included in all the studies available. Nowadays, it is widely accepted that meta-analysis also provides an estimate of the relative importance of different factors affecting the outcome of interest. From a clinical point of view, this is very useful, since it helps to identify important risk factors that could apply to a particular patient with the characteristic of interest and the most likely outcome of a particular intervention.
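The idea of combining studies with size-dependent weights can be illustrated with a minimal sketch. It assumes a fixed-effect, inverse-variance approach, which is one common way of weighting (larger studies tend to have smaller standard errors and therefore receive more weight); the text above does not prescribe a particular method. The function name `pool_fixed_effect` and the three study results are hypothetical values chosen purely for illustration.

```python
import math


def pool_fixed_effect(estimates, standard_errors):
    """Combine study estimates using inverse-variance (fixed-effect) weights.

    Assumption: all studies estimate the same underlying effect.
    Returns the pooled estimate and its standard error.
    """
    # Each study is weighted by 1 / variance, so precise studies count more.
    weights = [1.0 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled, pooled_se


# Hypothetical log relative risks and standard errors from three studies.
log_rr = [-0.30, -0.10, -0.25]
se = [0.20, 0.10, 0.40]

pooled, pooled_se = pool_fixed_effect(log_rr, se)
print(f"Pooled log RR: {pooled:.3f} (SE {pooled_se:.3f})")
print(f"Pooled RR: {math.exp(pooled):.2f}")
```

In this sketch the pooled standard error is smaller than that of any single study, which is the gain in precision described above. Real meta-analyses also need to assess heterogeneity between studies, which this example does not do.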

Randomised controlled trials are clinical trials with random allocation of the animals to at least two groups (e.g. intervention and control). Given that the allocation of animals to the intervention of interest is performed randomly, all other characteristics of the population should be equally distributed across the treatment groups, thus decreasing bias. Therefore, evidence of a cause–effect relationship is more credible in these types of studies. The EBVM Toolkit from RCVS Knowledge provides a checklist of issues to consider in the evaluation of controlled trials.
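Random allocation can be sketched in a few lines. The example below is a simplified illustration, assuming simple randomisation into equally sized groups; the function name `allocate`, the group labels, and the animal identifiers are hypothetical, and real trials often use more elaborate schemes (e.g. blocked or stratified randomisation).

```python
import random


def allocate(animal_ids, groups=("intervention", "control"), seed=None):
    """Randomly allocate animals to groups of (near-)equal size.

    Assumption: simple randomisation; a seed is accepted only so that
    the allocation can be reproduced for auditing.
    """
    rng = random.Random(seed)
    shuffled = list(animal_ids)
    rng.shuffle(shuffled)  # the random order decides group membership
    allocation = {group: [] for group in groups}
    for index, animal in enumerate(shuffled):
        # Deal the shuffled animals out to the groups in turn.
        allocation[groups[index % len(groups)]].append(animal)
    return allocation


# Twenty hypothetical animals, identified by number.
allocation = allocate(range(1, 21), seed=42)
for group, animals in allocation.items():
    print(group, sorted(animals))
```

Because chance alone decides which group an animal joins, neither the clinician nor the owner can influence the comparison, which is the source of the reduced bias described above.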

When considering evidence provided by controlled trials without randomisation and by cohort studies, an important point to consider is how well the study was executed: without randomisation, equal distribution of the characteristics of the animals included is not assured, and the results may be biased. However, well-designed and well-analysed studies at this level can provide strong evidence of a cause–effect relationship. Cohort studies are also particularly useful in assessing evidence relating to prognosis. The EBVM Toolkit from RCVS Knowledge provides a checklist of issues to consider in the evaluation of a cohort study.

Case-control studies are considered to be a lower level of evidence for risk factors, given that they are more susceptible to multiple types of bias than cohort studies. However, well-designed and properly analysed case-control studies can provide solid evidence, for instance on risk factors for specific conditions. The EBVM Toolkit also provides a checklist of issues to consider in the evaluation of a case-control study.

Cross-sectional studies are best at measuring the prevalence of disease in a population. Well-executed cross-sectional studies can also provide valuable evidence for certain risk factors, e.g. sex of the animal. The EBVM Toolkit provides a checklist of issues to consider in the evaluation of a cross-sectional study.

Diagnostic test evaluation studies are specifically designed to establish whether a new diagnostic tool accurately identifies disease in ill animals, and absence of disease in healthy animals. Animals are tested for the disease using both the existing ‘gold standard’ diagnostic test and the new diagnostic test, and the sensitivity, specificity and likelihood ratios for the new test are calculated. Guidelines on evaluating diagnostic test evaluation studies are also available.
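The measures named above can be worked through from a two-by-two table comparing the new test with the gold standard. The sketch below uses the standard definitions of sensitivity, specificity and likelihood ratios; the function name `diagnostic_accuracy` and the counts are hypothetical and serve only as an illustration.

```python
import math


def diagnostic_accuracy(tp, fp, fn, tn):
    """Summarise a new test against the gold standard.

    tp / fn: diseased animals testing positive / negative on the new test.
    tn / fp: healthy animals testing negative / positive on the new test.
    """
    sensitivity = tp / (tp + fn)  # proportion of diseased animals detected
    specificity = tn / (tn + fp)  # proportion of healthy animals cleared
    # Likelihood ratios are undefined at the extremes, so use infinity there.
    lr_positive = sensitivity / (1 - specificity) if specificity < 1 else math.inf
    lr_negative = (1 - sensitivity) / specificity if specificity > 0 else math.inf
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "lr_positive": lr_positive,
        "lr_negative": lr_negative,
    }


# Hypothetical study: 50 diseased and 100 healthy animals by gold standard.
result = diagnostic_accuracy(tp=45, fp=10, fn=5, tn=90)
for name, value in result.items():
    print(f"{name}: {value:.2f}")
```

A positive likelihood ratio well above 1 means a positive result makes disease considerably more likely, while a negative likelihood ratio close to 0 means a negative result makes disease considerably less likely.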

Case reports lack a comparison group, so it is very difficult to establish cause–effect relationships, or indeed to be sure whether an intervention made a difference in the first place, as we do not know what would have happened if another course of treatment, or no treatment, had been given.

Opinion consensus (or expert judgement) reports are positions reached by individuals or groups of experts, and are not necessarily supported by clinical research data. However, depending on how the expert opinions are elicited, results from expert opinion studies can have as much weight as the evidence from some other study types, and sometimes may provide better evidence than case reports or case series. For more information on study designs, the Centre for Evidence-based Medicine provides a guide.