Research Paper|Articles in Press

Randomization, blinding, data handling and sample size estimation in papers published in Veterinary Anaesthesia and Analgesia in 2009 and 2019

Published: September 28, 2021

Abstract

Objective

To evaluate reporting of items indicative of bias and weak study design.

Study design

Literature survey.

Population

Papers published in Veterinary Anaesthesia and Analgesia.

Methods

Reporting of randomization, blinding, sample size estimation and data exclusion was compared for papers published 10 years apart. A reporting rate of more than 95% was considered ideal. The availability of data supporting results in a publicly accessible repository was also assessed. Selected papers were randomized and identifiers removed for review, with data from 59 (57 in 2009, two in 2008) and 56 (52 in 2019, four in 2018) papers analyzed. Items were categorized for completeness of reporting using a previously published operationalized checklist. Two reviewers assessed all papers independently.

Results

Full reporting of randomization increased over time from 13.6% to 85.7% [95% confidence interval (CI), 57.8–86.6%; p < 0.0001], as did sample size estimation (from 0% to 20%; 95% CI, 7.6–32.4%; p = 0.002). Reporting of blinding (49.2% and 50.0%; 95% CI, –18.3% to 20.0%; p = 1.0) and exclusions of samples/animals (39.0% and 50.0%; 95% CI, –8.8% to 30.8%; p = 0.3) did not change significantly. Data availability was low (2008/2009, zero papers; 2018/2019, two papers). None of the items studied exceeded the predetermined ideal reporting rate.

Conclusions and clinical relevance

These results indicate that reporting quality remains low, with a risk of bias.

Introduction

Complete reporting of research is fundamental to interpreting study design and supporting study reproducibility. In particular, bias in research can be defined as ‘a systematic error, or deviation from the truth, in results’ and, when present, leads to under- or over-estimation of the true effect of a treatment/intervention (Boutron et al.). Techniques to limit bias include randomizing animals to group allocation, blinding (masking) researchers to these group allocations and/or outcome assessment, and transparency in data handling. In biomedical research involving animals, incomplete reporting of measures to reduce the risk of bias has been associated with inflated effect sizes, suggesting that these measures were not applied when the studies were conducted (Macleod, van der Worp et al.; Macleod, Lawson McLean et al.). These artificially inflated effect sizes contribute to failures in reproducibility and translation, waste financial resources and raise ethical concerns regarding the use of animals in research (Macleod, van der Worp et al.; Freedman et al.). In veterinary medicine, clinical trials in dogs and cats that did not report randomization method, blinding or inclusion and exclusion criteria were associated with a higher proportion of positive treatment effects (Sargeant et al.).
Quality of reporting of items associated with risk of bias in veterinary clinical and laboratory animal research is historically low. A 1998 study of small animal clinical trials found that the randomization method was reported in five of the 23 papers (22%) reviewed (Lund et al.). Of these 23 papers, eight (35%) made a statement regarding blinding, and none reported how sample size was determined (Lund et al.). Sample size estimation is considered an important indicator of study quality. A more recent review of 120 papers published in veterinary journals in 2016 and 2017 found that reporting quality remains low, with full reporting of randomization in 58% of papers, blinding in 49% and exclusion of data in 58% (Rufiange et al.). Sample size estimation was reported in just 26% of papers. That review found 14 papers that did not fully report any of these four items, whereas only one paper fully reported all of them (Rufiange et al.). The importance of complete reporting is underlined by recent work showing that methods described as random by authors are not always truly random (Di Girolamo et al.). For example, use of day of admission to a clinic to assign a treatment intervention should not be considered random assignment: certain patient types may be admitted on particular days of the week, introducing a factor other than chance into the assigned treatment.
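The distinction can be made concrete: chance-based allocation is generated up front from a random process, for example a shuffled block list, rather than from a systematic rule such as admission day. A minimal Python sketch (the group labels and block design are illustrative assumptions, not the methods of any study discussed here):

```python
import random

def block_randomization(n_animals, groups=("treatment", "control"), seed=None):
    """Allocate animals to groups in shuffled blocks, so that
    assignment depends only on chance, not on admission order."""
    rng = random.Random(seed)
    allocation = []
    while len(allocation) < n_animals:
        block = list(groups)   # one slot per group in each block
        rng.shuffle(block)     # chance decides the order within the block
        allocation.extend(block)
    return allocation[:n_animals]

# Example: 10 animals, two groups; reproducible with a fixed seed
print(block_randomization(10, seed=1))
```

Because each block contains one slot per group, the design also keeps group sizes balanced, which a coin flip per animal would not guarantee.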
In laboratory animal research, incomplete reporting led to the development of the Animal Research: Reporting of In Vivo Experiments (ARRIVE) guidelines, first published in 2010 (Kilkenny et al.). Unfortunately, the impact of these guidelines has been below expectations. A review of veterinary papers published in 2015 found that support for the guidelines has had a minimal effect on reporting rates, with little difference between journals that explicitly stated support for the ARRIVE guidelines and those that did not (Leung et al.). Two factors have been suggested as contributing to the limited impact of the ARRIVE guidelines on reporting rates: first, although journals may endorse the guidelines, enforcement is not pursued; and second, researchers have limited awareness of the consequences of incomplete reporting (Percie du Sert et al.). For clinical trials, the Consolidated Standards of Reporting Trials (CONSORT) reporting guidelines are applicable (Moher et al.). As observed with the ARRIVE guidelines, the impact of the CONSORT guidelines has been below expectations in human clinical research (Turner et al.). Both the ARRIVE and CONSORT reporting guidelines are endorsed by Veterinary Anaesthesia and Analgesia. Although it cannot be known when authors and reviewers first became aware of these guidelines, they were formally introduced to the readership of Veterinary Anaesthesia and Analgesia in a 2016 editorial (Axiak Flammer & Trim).
One proposal to improve reporting quality is to simplify reporting to focus on key items reflective of risk of bias and study quality: randomization, blinding, data handling (inclusion and exclusion criteria) and sample size estimation (Landis et al.). These have been termed the ‘Landis 4’ criteria (Cramond et al.).
The aim of this study was to compare the reporting of Landis 4 items in papers published before 2010 (before publication of the ARRIVE guidelines and an update of the CONSORT guidelines in 2010) and 10 years later. The primary objective was to identify any improvement over the 10 year reporting period. A predetermined ideal reporting rate of 95% for individual items was set. A secondary objective was to document data availability in publicly accessible internet repositories for the selected papers. We hypothesized that reporting rates would increase significantly over time.

Material and methods

An operationalized checklist was used to evaluate papers (Table 1). Each checklist item was categorized as fully reported, partially reported, not reported or, where relevant, not applicable. The checklist was based on those used by Rufiange et al. and Cramond et al. Key differences from Rufiange et al. included the process for identifying excluded animals/data (item 4a) and the addition of inclusion criteria as an item (item 5, Table 1).
Table 1 Checklist used to evaluate completeness of reporting for randomization, sample size estimation, blinding, data exclusions, data inclusions and data availability. Checklist based on Rufiange et al. © British Veterinary Association 2019. ∗Applies to multiple experiments/trials included in the same manuscript. Item 5 is an addition to the checklist of Rufiange et al.
Item title | Classification | Descriptor
1) Randomization
Fully reported
• If a method of randomization for allocating samples or animals to experimental groups is described.
• If there is a statement describing that randomization was not possible.
Partially reported
• If randomization is not described for each experiment performed.∗
• If randomization is mentioned but the method used is not described.
Not reported
• If there is no statement of randomization.
2) Sample size estimation
Fully reported
• If sample size is justified based on having adequate power to detect a predetermined difference for the identified primary outcome(s) of interest, including at least three of the four elements (alpha, beta, variability, difference of interest) required to calculate sample size.
Partially reported
• If a sample size estimate is mentioned without an explicit statement of power or the difference to be detected.
• If sample size is not estimated for each of the primary outcome(s), or if primary outcomes have not been specifically identified.
• If fewer than three elements required to calculate sample size are provided.
Not reported
• If there is no mention of sample size estimation in the paper, or the description deviates from the fully and partially reported descriptors.
3) Blinding
Fully reported
• If blinding is reported for group allocation and/or when assessing the primary outcome(s) for each experiment.∗
• If there is a statement that blinding was not possible.
Partially reported
• If a general, non-specific statement of blinding is made, for example ‘this was a blinded study’, without specifying whether blinding was to group allocation, outcome assessment or both.
• If reported blinding is incomplete: it does not include all allocations/outcomes.
Not reported
• If there is no statement on blinding in the manuscript.
4a) Exclusion of samples or animals from the analysis
Fully reported
• If there is a statement that a sample or animal was excluded from analysis.
• If there is a statement that no data were excluded from analysis.
Partially reported
• If the number of animals (or samples) reported in the results matches the description of the number enrolled, but there is no explicit statement regarding data exclusion.
• If the number of animals (or samples) from which data were collected differs from the number enrolled/included for some measures and the statement(s) for exclusion does not cover all measures.
Not reported
• If the number of animals from which data were collected is described in the results and differs from the number enrolled/included.
4b) Defining exclusion criteria
Fully reported
• If there is a description of why, or in which situation(s), data would be excluded.
Partially reported
• Not applicable to this item.
Not reported
• If an exclusion is described without explanation.
• If the total number of animals from which data could be collected changes from the methods to the results without explanation.
• If there is no explanation of why or in which situation(s) data were excluded.
4c) Pre-establishing exclusion criteria
Not applicable
• If item 4b is categorized as not reported.
Fully reported
• If an explicit statement is made that exclusion criteria were pre-established.
Partially reported
• Not applicable to this item.
Not reported
• If there is no explicit statement that exclusion criteria were pre-established.
5) Defining inclusion criteria
Fully reported
• If there is a statement describing criteria for eligibility to be enrolled in the study.
• If there is a statement describing criteria for ineligibility to be enrolled in the study.
Partially reported
• Not applicable to this item.
Not reported
• If there is no statement describing criteria for eligibility or ineligibility to the study.
6) Data availability
Fully reported
• If data are freely available (no need to contact author(s)), for example in a data repository.
Partially reported
• If data are described as being available by contacting the author(s).
Not reported
• If there is no mention of data availability.
Assessments of randomization, inclusion criteria and data availability were applied to all study outcomes listed. Sample size estimation, exclusion criteria (items 4a–c) and blinding were evaluated for primary outcomes. Primary outcomes were identified from the described aims/objectives and hypothesis (if stated). If a paper did not specify primary or secondary outcomes, all outcomes described in the introduction were considered primary. Any outcomes described elsewhere (e.g. in the methods), as well as outcomes introduced with additive transitions such as ‘additionally’, ‘also’ or ‘further’, were considered secondary and were not included in the assessment. If no aims were designated as secondary but the hypothesis included only a selection of the described outcomes, that selection was considered primary.
Journal issues for the periods of interest (2009 and 2019) were initially screened for inclusion by reviewing abstracts within the ‘research papers’ section of the journal’s online table of contents (DSJP). Inclusion criteria applied during initial screening were the same as those applied during full text review (described below). Beginning with the last issue published in each calendar year, abstracts were screened in the order in which they appeared in the table of contents. Each subsequent issue was screened in reverse chronological order (i.e. September, July, May, etc.) until all issues were screened. Full texts of selected papers were downloaded and sent to a third party to remove identifiers (redaction tool of Adobe Acrobat Pro DC Version 2020.013.20064; Adobe Inc., CA, USA) and randomize the order of the papers (random sequence generator tool, random.org) for subsequent review. The third party was an individual unaware of the study goals and uninvolved in paper review.
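As a rough illustration of the re-ordering step (not the exact tooling used, which relied on random.org), a reproducible shuffle of redacted file names might look like the following; the file names are hypothetical:

```python
import random

def randomize_review_order(paper_ids, seed=None):
    """Return papers in a random order so reviewers cannot infer
    publication year or journal issue from the review sequence."""
    rng = random.Random(seed)
    order = list(paper_ids)  # copy; leave the caller's list untouched
    rng.shuffle(order)
    return order

# Hypothetical redacted files prepared by the third party
papers = [f"paper_{i:03d}.pdf" for i in range(1, 6)]
print(randomize_review_order(papers, seed=42))
```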
Using the full texts, the same two reviewers (BAM, PB) assessed papers for inclusion and performed all reviews. Inclusion criteria were: prospective, in vivo studies in which a comparison was made. Predetermined exclusion criteria were papers that were: retrospective, cadaveric, in vitro, observational, pilot or preliminary research studies, brief/short communications and case reports/series. Papers ineligible for inclusion were excluded and replaced. To remain as close as possible to the target years (2009 and 2019), replacement papers were limited to the journal issue in which the 65th paper of each study period was found.
Before beginning paper review, both reviewers practiced applying the operationalized checklist using 12 papers that had been assessed as part of a previous study (Rufiange et al.). Applied categories were discussed as a group by the co-authors, with support from Dr. Rufiange (Université de Montréal, QC, Canada). The results of this process were used to inform the application of the operationalized checklist in the current study. In any case where categories assigned by the two reviewers differed, a consensus was reached following discussion with a third reviewer (DSJP).
To establish data availability, any indication that data supporting study results were available (e.g. online link to supplementary materials) was recorded during paper review. These were checked by the third reviewer (DSJP) after assessment of all papers was completed (to protect the blinding process).

Statistical analysis

To calculate the sample size needed for each item, an assumption of 95% reporting in all items for the 2018/2019 group was made. Sample size calculations were performed using powerandsamplesize.com to compare two proportions (two groups, two-sided equality, sampling ratio of 1:1). Power was set at 90% and the type I error (α) at 0.05 for all sample size calculations. The sample size was based on the results of Rufiange et al. (2019; column 1 of Table 2), using these as estimates of the percentages of items reported for the 2008/2009 group and applying the most conservative estimate for item improvement. This applied to the items randomization (item 1) and defining exclusion criteria (item 4b; full reporting of both items was 75.0%; Rufiange et al.), for which approximately 60 papers per period of interest were needed to identify an improvement of 20% (90% power, alpha 0.05). This estimated sample size was sufficient for the other items studied, as the required increase in reporting levels to 95% was greater than that for randomization and defining exclusion criteria. It was planned to identify 65 papers during abstract review to account for potential exclusions. The item inclusion criteria was not present in the previous checklist, so no existing data were available for estimating its sample size.
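For illustration, this calculation can be reproduced with the standard normal-approximation formula for comparing two independent proportions (the method implemented by tools such as powerandsamplesize.com for two-sided equality with a 1:1 ratio). A sketch using only the Python standard library, applied to the most conservative scenario above (75% full reporting improving to 95%):

```python
import math
from statistics import NormalDist

def n_per_group(p1, p2, alpha=0.05, power=0.90):
    """Sample size per group for a two-sided test of two independent
    proportions (normal approximation, 1:1 sampling ratio)."""
    z = NormalDist()
    z_a = z.inv_cdf(1 - alpha / 2)   # critical value for two-sided alpha
    z_b = z.inv_cdf(power)           # critical value for target power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil(variance * (z_a + z_b) ** 2 / (p1 - p2) ** 2)

# Most conservative items (randomization, defining exclusion criteria):
# 75% full reporting improving to 95%
print(n_per_group(0.75, 0.95))
```

With 90% power and α = 0.05 this returns 62 per group, consistent with the approximately 60 papers per period of interest described above.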
Table 2 Partial and not reporting levels for checklist items. 95% CI, 95% confidence interval of the difference in proportions (2018/2019 – 2008/2009). na, not available
Item | Partial reporting, 2008/2009 (%) | Partial reporting, 2018/2019 (%) | p value (95% CI, %) | Not reported, 2008/2009 (%) | Not reported, 2018/2019 (%) | p value (95% CI, %)
1) Randomization | 79.7 | 5.4 | <0.0001 (–87.9 to –60.7) | 6.8 | 8.9 | 0.94 (–9.4 to 13.7)
2) Sample size estimation | 21.8 | 61.8 | <0.0001 (21.3 to 58.7) | 78.2 | 18.2 | <0.0001 (–76.8 to –43.2)
3) Blinding | 17.0 | 7.1 | 0.19 (–23.3 to 3.7) | 33.9 | 42.9 | 0.43 (–10.5 to 28.4)
4a) Exclusion of samples/animals from analysis | 55.9 | 50.0 | 0.65 (–25.9 to 14.0) | 5.1 | 0 | 0.26 (–12.4 to 2.3)
4b) Defining exclusion criteria | na | na | na | 55.9 | 53.6 | 0.95 (–22.3 to 17.6)
4c) Pre-establishing exclusion criteria | na | na | na | 100 | 100 | na
5) Defining inclusion criteria | na | na | na | 0 | 0 | na
6) Data availability | 0 | 0 | na | 100 | 96.4 | 0.45 (–10.2 to 3.0)
Individual items were compared between time periods (2008/2009 versus 2018/2019) using a two-proportion test. Differences in proportions are reported with 95% confidence intervals (CIs) and p values. Significance was set at p < 0.05. Data supporting the study results are available in a repository:
Pang, Daniel (2021) Reporting quality in Veterinary Anaesthesia and Analgesia over a 10 year period, https://doi.org/10.7910/DVN/TDYGYI, Harvard Dataverse, V1, UNF:6:2jWOeUZw38x1e5F+XaBR+g==
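The confidence intervals reported for differences in proportions are consistent with a continuity-corrected Wald interval, although the exact method is not named; assuming that method, the randomization result can be checked with counts reconstructed from the reported percentages (8/59 fully reported in 2008/2009 versus 48/56 in 2018/2019). A standard-library sketch:

```python
import math
from statistics import NormalDist

def diff_ci(x1, n1, x2, n2, conf=0.95):
    """Wald confidence interval for p2 - p1 with a continuity correction."""
    p1, p2 = x1 / n1, x2 / n2
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    cc = (1 / n1 + 1 / n2) / 2        # continuity correction term
    d = p2 - p1
    return d - (z * se + cc), d + (z * se + cc)

# Randomization: 13.6% (8/59) in 2008/2009 vs 85.7% (48/56) in 2018/2019
lo, hi = diff_ci(8, 59, 48, 56)
print(f"{lo:.1%} to {hi:.1%}")  # matches the reported 57.8-86.6%
```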

Results

A total of 130 papers were included following initial abstract review. To identify this target number of papers, it was necessary to extend the search periods into 2008 and 2018. During full text review, 21 papers (13 from 2018/2019 and eight from 2008/2009) were excluded. Of these, 14 were listed as pilots/short communications, one was a cadaver study, four were observational studies and two were pharmacokinetic studies. To remain close to the initially defined study periods (2009 and 2019) while still achieving an acceptable sample size, six replacement papers were selected (four from 2018 and two from 2008). Replacement papers were restricted to the journal issue in which the 65th paper for each study period was identified. The total number of papers included in the analysis was: 2008/2009, n = 59; 2018/2019, n = 56. Consensus discussion was required for 18 (15.7%) papers. Some papers could not be categorized for individual items. Sample size estimation (item 2) was not applicable in five papers (four in 2008/2009, one in 2018/2019) as the Bland and Altman method was applied (a sample size calculation for this method was published during the 10 year observation period; Lu et al.). For pre-establishing exclusion criteria (item 4c), 63 papers (33 in 2008/2009, 30 in 2018/2019) did not define exclusion criteria (item 4b) and were therefore categorized as not applicable for item 4c.
For randomization, sample size estimation, blinding, sample/animal exclusion (item 4a) and inclusion criteria, levels of full reporting ranged from 0–100% in 2008/2009 and from 20–100% in 2018/2019 (Fig. 1). Between 2008/2009 and 2018/2019, there were significant improvements in the full reporting of randomization, from 13.6% to 85.7% (95% CI, 57.8–86.6%; p < 0.0001), and sample size estimation, from 0% to 20% (95% CI, 7.6–32.4%; p = 0.002; Fig. 1). Full reporting of blinding did not change significantly over time (49.2% in 2008/2009 and 50.0% in 2018/2019; 95% CI, –18.3% to 20.0%; p = 1.0; Fig. 1). Inclusion criteria were fully reported in 100% of papers in both time periods (Fig. 1). Full reporting of exclusion of samples/animals (item 4a) did not differ significantly between time periods (95% CI, –8.8% to 30.8%; p = 0.3; Fig. 1). Defining exclusion criteria (item 4b) was fully reported in 44.1% of papers in 2008/2009 and 46.4% in 2018/2019 (95% CI, –17.6% to 22.3%; p = 0.9), and whether exclusion criteria were pre-established (item 4c) was not reported in any paper in either time period. Data were available in a publicly accessible site for two papers in 2018/2019 (3.6%) and no papers in 2008/2009 (95% CI, –3.0% to 10.2%; p = 0.45). Results for partial and not reporting are presented in Table 2.

Discussion

The main findings of this study are: 1) there have been improvements in reporting of key items between 2008–2009 and 2018–2019; 2) these improvements are small and below reasonable expectations of high-quality reporting; and 3) they reflect a limited impact of reporting guidelines and indicate a risk of bias in the published literature.
There are currently more than 400 reporting guidelines available for a host of study types (Equator Network). The goal of these guidelines is to improve the completeness of study reporting, facilitating evaluation of study design and, by extension, improving the reproducibility of findings (Kilkenny et al.; Moher et al.). The current versions of the CONSORT and ARRIVE guidelines overlap in that both require reporting of key items associated with bias (randomization, blinding and data inclusion and exclusion) and of sample size estimation (Moher et al.; Percie du Sert et al.).
Assessment of the risk of bias in a study depends upon complete reporting of randomization, blinding (also known as masking) and data handling (inclusions and exclusions) (Moher et al.). Animals should be randomly allocated to a treatment/intervention and the method of randomization described. The importance of describing the randomization method has been highlighted by studies showing that non-random methods may be described as random in 7–17% of veterinary clinical trials (Brown; Di Girolamo et al.). Blinding describes whether study investigators (and owners, where applicable) are aware of the assigned treatment/intervention. Blinding can be instituted at several levels, including during animal management, data collection and analysis, and should be specified. Inclusion (eligibility) and exclusion criteria describe how data are handled, informing readers of the population initially included in the trial and the final data analyzed. A complete description of data handling allows readers to interpret the applicability of a study.
Failure to report these items is associated with inflated effect sizes (Macleod, van der Worp et al.; Macleod, Lawson McLean et al.). For example, Macleod, van der Worp et al. identified an approximately twofold increase in apparent treatment efficacy in laboratory animal trials in which randomization or blinding were not reported. These apparently promising preclinical data, from studies using 408 animals, supported the case for advancing the study drug to human clinical trials, where it failed. In a study of 76 veterinary clinical trials (dogs and cats), lower proportions of positive treatment effects were found in studies that reported blinding [of the person(s) administering treatment and evaluating outcome], the method of randomization, or inclusion and exclusion criteria (Sargeant et al.).
In this study, the only items for which full reporting increased significantly over the study periods were randomization and sample size estimation. The reporting rate for randomization observed in 2008–2009 is comparable to previous work showing reporting rates ranging from 16% (2008, n = 70 trials) to 22% (1989, n = 23 trials) (
• Lund E.M.
• James K.M.
• Neaton J.D.
Veterinary randomized clinical trial reporting: a review of the small animal literature.
;
• Sargeant J.M.
• Thompson A.
• Valcour J.
• et al.
Quality of reporting of clinical trials of dogs and cats and associations with treatment effects.
). The higher level of reporting of randomization in 2018/2019 is promising, and similar to that recently reported for veterinary subject-specific journals (75%) (
• Rufiange M.
• Rousseau-Blass F.
• Pang D.S.J.
Incomplete reporting of experimental studies and items associated with risk of bias in veterinary research.
). Interestingly, randomization reporting quality in general veterinary journals (review of papers published in 2013 and 2016/2017) appears lower than for subject-specific journals, with reporting rates of approximately 42–47% compared with 85.7% for 2018/2019 in the current study (
• Di Girolamo N.
• Giuffrida M.A.
• Winter A.L.
• Meursinge Reynders R.
In veterinary trials reporting and communication regarding randomisation procedures is suboptimal.
;
• Rufiange M.
• Rousseau-Blass F.
• Pang D.S.J.
Incomplete reporting of experimental studies and items associated with risk of bias in veterinary research.
). Studies should include a sample size sufficient to identify a scientifically important difference. Incomplete or absent sample size estimation reporting substantially limits the interpretation of results, particularly in the case of negative findings (no statistically significant difference) (
• Hofmeister E.H.
• King J.
• Budsberg S.C.
Sample size and statistical power in the small-animal analgesia literature.
;
• Moher D.
• Hopewell S.
• Schulz K.F.
CONSORT 2010 explanation and elaboration: updated guidelines for reporting parallel group randomised trials.
;
• Giuffrida M.A.
Type II error and statistical power in reports of small animal clinical trials.
;
• Wagg C.R.
• Kwong G.P.S.
• Pang D.S.J.
Application of confidence intervals to data interpretation.
). Consistent with numerous biomedical and veterinary reports, sample size estimation continues to be poorly reported (see, e.g.
• Lund E.M.
• James K.M.
• Neaton J.D.
Veterinary randomized clinical trial reporting: a review of the small animal literature.
;
• Sargeant J.M.
• Thompson A.
• Valcour J.
• et al.
Quality of reporting of clinical trials of dogs and cats and associations with treatment effects.
;
• Giuffrida M.A.
Type II error and statistical power in reports of small animal clinical trials.
;
• Macleod M.R.
• Lawson McLean A.
• Kyriakopoulou A.
• et al.
Risk of bias in reports of in vivo research: a focus for improvement.
;
• Leung V.
• Rousseau-Blass F.
• Beauchamp G.
• Pang D.S.J.
ARRIVE has not ARRIVEd: support for the ARRIVE (Animal Research: Reporting of in vivo Experiments) guidelines does not improve the reporting quality of papers in animal welfare, analgesia or anesthesia.
;
• Rufiange M.
• Rousseau-Blass F.
• Pang D.S.J.
Incomplete reporting of experimental studies and items associated with risk of bias in veterinary research.
). The reporting rate observed here represents an improvement over earlier veterinary studies (0–1%) and is consistent with more recent work, but remains well below desired levels (
• Lund E.M.
• James K.M.
• Neaton J.D.
Veterinary randomized clinical trial reporting: a review of the small animal literature.
;
• Sargeant J.M.
• Thompson A.
• Valcour J.
• et al.
Quality of reporting of clinical trials of dogs and cats and associations with treatment effects.
;
• Giuffrida M.A.
Type II error and statistical power in reports of small animal clinical trials.
). The sample sizes of many veterinary clinical trials are too small to detect a treatment effect smaller than 50% (
• Hofmeister E.H.
• King J.
• Budsberg S.C.
Sample size and statistical power in the small-animal analgesia literature.
;
• Giuffrida M.A.
Type II error and statistical power in reports of small animal clinical trials.
). This may reflect a lack of awareness of the importance of sample size estimation, time pressure to complete a study or funding limitations. The value of data from underpowered studies is questionable, particularly when access to study data is limited.
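As a worked illustration of what a reported sample size estimation entails, the standard normal-approximation formula for comparing two proportions can be sketched in a few lines of Python. The effect sizes below are hypothetical, chosen only to mirror a 50% relative treatment effect; a real study would justify its own expected incidences, alpha and power:

```python
import math

def norm_cdf(x):
    """Standard normal cumulative distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def inv_norm_cdf(q, lo=-10.0, hi=10.0, tol=1e-9):
    """Invert the standard normal CDF by bisection (norm_cdf is monotone)."""
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if norm_cdf(mid) < q:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def sample_size_two_proportions(p1, p2, alpha=0.05, power=0.80):
    """Per-group n to detect p1 vs p2 with a two-sided test (normal approximation)."""
    z_alpha = inv_norm_cdf(1.0 - alpha / 2.0)   # ~1.96 for alpha = 0.05
    z_beta = inv_norm_cdf(power)                # ~0.84 for power = 0.80
    variance = p1 * (1.0 - p1) + p2 * (1.0 - p2)
    n = ((z_alpha + z_beta) ** 2) * variance / (p1 - p2) ** 2
    return math.ceil(n)

# Hypothetical example: halving an incidence from 0.6 to 0.3 (a 50% relative
# effect) needs roughly 40 animals per group at alpha = 0.05, power = 0.80.
print(sample_size_two_proportions(0.6, 0.3))  # → 40
```

Reporting the inputs to such a calculation (expected effect, variability, alpha, power) is precisely what allows a reader to judge whether a negative finding reflects a true absence of effect or an underpowered design.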
By contrast, reporting rates for blinding and data handling were stagnant, and are comparable with earlier studies (blinding reported in approximately 35–48% of studies; inclusion and exclusion criteria described in 44%), suggesting minimal improvement since those initial studies were performed (
• Lund E.M.
• James K.M.
• Neaton J.D.
Veterinary randomized clinical trial reporting: a review of the small animal literature.
;
• Sargeant J.M.
• Thompson A.
• Valcour J.
• et al.
Quality of reporting of clinical trials of dogs and cats and associations with treatment effects.
).
Access to published data allows verification of results and analysis, and enables data from different studies to be combined (such as for meta-analysis) (
• Moher D.
• Hopewell S.
• Schulz K.F.
CONSORT 2010 explanation and elaboration: updated guidelines for reporting parallel group randomised trials.
;
• Wicherts J.M.
• Bakker M.
• Molenaar D.
Willingness to share research data is related to the strength of the evidence and the quality of reporting of statistical results.
;
• Percie du Sert N.
• Hurst V.
• Ahluwalia A.
• et al.
The ARRIVE guidelines 2.0: updated guidelines for reporting animal research.
). It is clear from the present study and a previous one that free access to clinical veterinary research data is extremely limited (
• Rufiange M.
• Rousseau-Blass F.
• Pang D.S.J.
Incomplete reporting of experimental studies and items associated with risk of bias in veterinary research.
). Of 120 in vivo veterinary studies published in 2017, data from a single paper were publicly accessible (
• Rufiange M.
• Rousseau-Blass F.
• Pang D.S.J.
Incomplete reporting of experimental studies and items associated with risk of bias in veterinary research.
). Similar findings have been reported in in vivo biomedical research (
• Iqbal S.A.
• Wallach J.D.
• Khoury M.J.
• et al.
Reproducible research practices and transparency across the biomedical literature.
). When corresponding authors of 29 papers published in the British Medical Journal were contacted by e-mail requesting access to study data, one provided data and a second indicated a willingness to do so without caveats (
• Reidpath D.D.
• Allotey P.A.
Data sharing on medical research: an empirical investigation.
). Full and open access to publicly funded research is proposed by the Organisation for Economic Co-operation and Development (OECD), and data access is encouraged or required by several publishers (
• Pilat D.
• Fukasaku Y.
OECD principles and guidelines for access to research data from public funding.
;
PLoS One
Data availability.
). Beyond simple accessibility, data being Findable, Accessible, Interoperable and Reusable (the FAIR principles) provide a framework for maximizing the potential of data access (
• Wilkinson M.D.
• Dumontier M.
• Aalbersberg I.
• et al.
The FAIR guiding principles for scientific data management and stewardship.
). Reluctance to make data available appears to stem from concern that errors in data analysis will be exposed, potentially leading to different conclusions (
• Anon
A fair share: the concept of sharing primary data is generating unnecessary angst in the psychology community.
). Compared with papers for which data could be accessed, results from papers for which data were withheld were associated with more errors in the reported statistical analyses and with smaller p values (
• Wicherts J.M.
• Bakker M.
• Molenaar D.
Willingness to share research data is related to the strength of the evidence and the quality of reporting of statistical results.
).
Numerous studies show reporting quality in human and animal research remains poor despite the large number of guidelines available and widespread journal endorsement (
• Turner L.
• Shamseer L.
• Altman D.G.
• et al.
Consolidated standards of reporting trials (CONSORT) and the completeness of reporting of randomised controlled trials (RCTs) published in medical journals.
;
• Avey M.T.
• Moher D.
• Sullivan K.J.
• et al.
The devil is in the details: incomplete reporting in preclinical animal research.
;
• Di Girolamo N.
• Giuffrida M.A.
• Winter A.L.
• Meursinge Reynders R.
In veterinary trials reporting and communication regarding randomisation procedures is suboptimal.
;
• Leung V.
• Rousseau-Blass F.
• Beauchamp G.
• Pang D.S.J.
ARRIVE has not ARRIVEd: support for the ARRIVE (Animal Research: Reporting of in vivo Experiments) guidelines does not improve the reporting quality of papers in animal welfare, analgesia or anesthesia.
;
• Totton S.C.
• Cullen J.N.
• Sargeant J.M.
• O’Connor A.M.
The reporting characteristics of bovine respiratory disease clinical intervention trials published prior to and following publication of the REFLECT statement.
;
• Rufiange M.
• Rousseau-Blass F.
• Pang D.S.J.
Incomplete reporting of experimental studies and items associated with risk of bias in veterinary research.
). Several solutions to improve reporting quality have been proposed, including investigator training, improving grant application review, use of standardized field-specific reporting guidelines, not including methods sections in paper word counts, study evaluation during ethics review, institutional oversight of research, simplified guidance for authors and reviewers, and mandatory reporting checklist submission (
• Fisher M.
• Feuerstein G.
• Howells D.W.
• et al.
Update of the stroke therapy academic industry roundtable preclinical recommendations.
;
• Anon
Reducing our irreproducibility.
;
• Collins F.S.
• Tabak L.A.
NIH plans to enhance reproducibility.
;
• Begley C.G.
• Ioannidis J.P.A.
Reproducibility in science: improving the standard for basic and preclinical research.
;
• Freedman L.P.
• Cockburn I.M.
• Simcoe T.S.
The economics of reproducibility in preclinical research.
;
• Curtis M.J.
• Alexander S.
• Cirino G.
• et al.
Experimental design and analysis and their reporting II: updated and simplified guidance for authors and peer reviewers.
;
• Wellcome
Use of animals in research policy.
). Of these, mandatory checklist submission has shown some success, with improved reporting observed; however, reporting rates did not reach 100% (
• Han S.
• Olonisakin T.F.
• Pribis J.P.
• et al.
A checklist is associated with increased quality of reporting preclinical biomedical research: a systematic review.
;
The NCQIP Collaborative Group
Did a change in Nature journals’ editorial policy for life sciences research improve reporting?.
). This may reflect reviewer/editorial discretion or an absence of compliance checks during review (
• Blanco D.
• Biggane A.M.
• Cobo E.
• MiRoR network
Are CONSORT checklists submitted by authors adequately reflecting what information is actually reported in published papers?.
;
The NCQIP Collaborative Group
Did a change in Nature journals’ editorial policy for life sciences research improve reporting?.
). Ultimately, it is unclear with whom final responsibility for ensuring adherence to reporting guidelines lies, and no single group may be fully capable of providing oversight (
• Grindlay D.J.C.
• Dean R.S.
• Christopher M.M.
• Brennan M.L.
A survey of the awareness, knowledge, policies and views of veterinary journal Editors-in-Chief on reporting guidelines for publication of research.
). A survey of veterinary editors-in-chief in 2012 (n = 185, response rate of 36.8%; 68/185) found that 47.1% ‘had no previous knowledge’ of reporting guidelines (
• Grindlay D.J.C.
• Dean R.S.
• Christopher M.M.
• Brennan M.L.
A survey of the awareness, knowledge, policies and views of veterinary journal Editors-in-Chief on reporting guidelines for publication of research.
). Simplifying the process for all parties (authors, reviewers, editorial boards) may build necessary redundancy into the manuscript submission and review process, in which individuals focus on different areas. There is also a strong argument for earlier screening: improving the assessment of study design during grant review could prevent limitations in study design from being carried forward (
• Begley C.G.
• Ioannidis J.P.A.
Reproducibility in science: improving the standard for basic and preclinical research.
;
• Curtis M.J.
• Alexander S.
• Cirino G.
• et al.
Experimental design and analysis and their reporting II: updated and simplified guidance for authors and peer reviewers.
).
This study has the following limitations: 1) a prolific author may be overrepresented in the data set, potentially skewing results, and this was not evaluated; 2) the randomization methods described in individual papers were not evaluated further to assess whether they were truly random.
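For context, a complete description of a randomization procedure names the method, the tool used and, ideally, a seed that makes the allocation reproducible. A minimal sketch of such a procedure, assuming a seeded shuffle with balanced round-robin assignment (the function name, group labels and seed are illustrative, not drawn from any of the surveyed papers):

```python
import random

def allocate(subjects, groups=("treatment", "control"), seed=2019):
    """Reproducibly assign subjects to groups: seeded shuffle, then round-robin.

    The seed makes the allocation auditable; round-robin assignment after the
    shuffle keeps group sizes balanced.
    """
    rng = random.Random(seed)          # dedicated RNG so the seed is explicit
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    return {s: groups[i % len(groups)] for i, s in enumerate(shuffled)}

print(allocate(["a", "b", "c", "d"]))
```

A methods section that reports this level of detail (method, software, seed) lets a reviewer verify that the allocation was truly random, which the checklist used here could only assess from the written description.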

Conclusion

Over the 10 year period studied, reporting of randomization and sample size estimation were the only items that showed significant improvement. These findings show that a considerable gap remains between reporting guideline endorsement and reporting quality in published studies. Although there is no accepted standard for completeness of reporting, it seems reasonable to aim for full reporting (100%) for each paper published in Veterinary Anaesthesia and Analgesia.

Acknowledgements

The authors thank Megan Hass (Faculty of Veterinary Medicine, University of Calgary) for blinding and randomizing articles and Dr. Grace Kwong (Faculty of Veterinary Medicine, University of Calgary) for sample size estimation and statistical analyses. Funding was provided by the O’Brien Centre Summer Studentship, University of Calgary (BAM) and the Natural Sciences and Engineering Research Council of Canada (DSJP).

Authors’ contributions

BAM and PB: study design, data collection and interpretation, preparation of manuscript. DSJP: study conception, study design, data collection and interpretation, preparation of manuscript. All authors read and approved the final version of the manuscript.

Conflict of interest statement

The authors declare no conflict of interest. DSJP is a member of the Editorial Board of this journal but had no access to the review process for this manuscript.

References

• Anon
A fair share: the concept of sharing primary data is generating unnecessary angst in the psychology community.
Nature. 2006; 444: 653-654
• Anon
Reducing our irreproducibility.
Nature. 2013; 496: 398
• Axiak Flammer S.M.
• Trim C.M.
ARRIVE and CONSORT guidelines: do they have a place in Veterinary Anaesthesia and Analgesia?.
Vet Anaesth Analg. 2016; 43: 2-4
• Avey M.T.
• Moher D.
• Sullivan K.J.
• et al.
The devil is in the details: incomplete reporting in preclinical animal research.
PLoS One. 2016; 11: e0166733
• Begley C.G.
• Ioannidis J.P.A.
Reproducibility in science: improving the standard for basic and preclinical research.
Circ Res. 2015; 116: 116-126
• Blanco D.
• Biggane A.M.
• Cobo E.
• MiRoR network
Are CONSORT checklists submitted by authors adequately reflecting what information is actually reported in published papers?.
Trials. 2018; 19: 80
• Boutron I.
• Page M.J.
• Higgins J.P.T.
• et al.
Considering bias and conflicts of interest among the included studies.
in: Higgins J.P.T., Thomas J., Chandler J. (Eds.) Cochrane Handbook for Systematic Reviews of Interventions, Version 6.2 (updated February 2021). Cochrane, 2021
• Brown D.C.
Control of selection bias in parallel-group controlled clinical trials in dogs and cats: 97 trials (2000–2005).
J Am Vet Med Assoc. 2006; 229: 990-993
• Collins F.S.
• Tabak L.A.
NIH plans to enhance reproducibility.
Nature. 2014; 505: 612-613
• Cramond F.
• Irvine C.
• Liao J.
• et al.
Protocol for a retrospective, controlled cohort study of the impact of a change in Nature journals’ editorial policy for life sciences research on the completeness of reporting study design and execution.
Scientometrics. 2016; 108: 315-328
• Curtis M.J.
• Alexander S.
• Cirino G.
• et al.
Experimental design and analysis and their reporting II: updated and simplified guidance for authors and peer reviewers.
Br J Pharmacol. 2018; 175: 987-993
• Di Girolamo N.
• Giuffrida M.A.
• Winter A.L.
• Meursinge Reynders R.
In veterinary trials reporting and communication regarding randomisation procedures is suboptimal.
Vet Rec. 2017; 181: 195
• Equator Network
Enhancing the QUAlity and Transparency Of health Research: search for reporting guidelines.
2021
• Fisher M.
• Feuerstein G.
• Howells D.W.
• et al.
Update of the stroke therapy academic industry roundtable preclinical recommendations.
Stroke. 2009; 40: 2244-2250
• Freedman L.P.
• Cockburn I.M.
• Simcoe T.S.
The economics of reproducibility in preclinical research.
PLoS Biol. 2015; 13: e1002165
• Giuffrida M.A.
Type II error and statistical power in reports of small animal clinical trials.
J Am Vet Med Assoc. 2014; 244: 1075-1080
• Grindlay D.J.C.
• Dean R.S.
• Christopher M.M.
• Brennan M.L.
A survey of the awareness, knowledge, policies and views of veterinary journal Editors-in-Chief on reporting guidelines for publication of research.
BMC Vet Res. 2014; 10: 10
• Han S.
• Olonisakin T.F.
• Pribis J.P.
• et al.
A checklist is associated with increased quality of reporting preclinical biomedical research: a systematic review.
PLoS One. 2017; 12: e0183591
• Hofmeister E.H.
• King J.
• Budsberg S.C.
Sample size and statistical power in the small-animal analgesia literature.
J Small Anim Pract. 2007; 48: 76-79
• Iqbal S.A.
• Wallach J.D.
• Khoury M.J.
• et al.
Reproducible research practices and transparency across the biomedical literature.
PLoS Biol. 2016; 14: e1002333
• Kilkenny C.
• Browne W.J.
• Cuthill I.C.
• et al.
Improving bioscience research reporting: the ARRIVE guidelines for reporting animal research.
PLoS Biol. 2010; 8: e1000412
• Landis S.C.
• Amara S.G.
• et al.
A call for transparent reporting to optimize the predictive value of preclinical research.
Nature. 2012; 490: 187-191
• Leung V.
• Rousseau-Blass F.
• Beauchamp G.
• Pang D.S.J.
ARRIVE has not ARRIVEd: support for the ARRIVE (Animal Research: Reporting of in vivo Experiments) guidelines does not improve the reporting quality of papers in animal welfare, analgesia or anesthesia.
PLoS One. 2018; 13: e0197882
• Lu M.J.
• Zhong W.H.
• Liu Y.X.
• et al.
Sample size for assessing agreement between two methods of measurement by Bland–Altman method.
Int J Biostat. 2016; 12: 20150039
• Lund E.M.
• James K.M.
• Neaton J.D.
Veterinary randomized clinical trial reporting: a review of the small animal literature.
J Vet Intern Med. 1998; 12: 57-60
• Macleod M.R.
• van der Worp H.B.
• Sena E.S.
• et al.
Evidence for the efficacy of NXY-059 in experimental focal cerebral ischaemia is confounded by study quality.
Stroke. 2008; 39: 2824-2829
• Macleod M.R.
• Lawson McLean A.
• Kyriakopoulou A.
• et al.
Risk of bias in reports of in vivo research: a focus for improvement.
PLoS Biol. 2015; 13: e1002273
• Moher D.
• Hopewell S.
• Schulz K.F.
CONSORT 2010 explanation and elaboration: updated guidelines for reporting parallel group randomised trials.
Br Med J. 2010; 340: c869
• Percie du Sert N.
• Hurst V.
• Ahluwalia A.
• et al.
The ARRIVE guidelines 2.0: updated guidelines for reporting animal research.
PLoS Biol. 2020; 18: e3000410
• Pilat D.
• Fukasaku Y.
OECD principles and guidelines for access to research data from public funding.
Data Sci J. 2007; 6: OD4-OD11. https://doi.org/10.2481/dsj.6.OD4
• PLoS One
Data availability.
2019
• Reidpath D.D.
• Allotey P.A.
Data sharing on medical research: an empirical investigation.
Bioethics. 2001; 15: 125-134
• Rufiange M.
• Rousseau-Blass F.
• Pang D.S.J.
Incomplete reporting of experimental studies and items associated with risk of bias in veterinary research.
Vet Rec Open. 2019; 6: e000322
• Sargeant J.M.
• Thompson A.
• Valcour J.
• et al.
Quality of reporting of clinical trials of dogs and cats and associations with treatment effects.
J Vet Intern Med. 2010; 24: 44-50
• The NCQIP Collaborative Group
Did a change in Nature journals’ editorial policy for life sciences research improve reporting?.
BMJ Open Science. 2019; 3: e000035
• Totton S.C.
• Cullen J.N.
• Sargeant J.M.
• O’Connor A.M.
The reporting characteristics of bovine respiratory disease clinical intervention trials published prior to and following publication of the REFLECT statement.
Prev Vet Med. 2018; 150: 117-125
• Turner L.
• Shamseer L.
• Altman D.G.
• et al.
Consolidated standards of reporting trials (CONSORT) and the completeness of reporting of randomised controlled trials (RCTs) published in medical journals.
Cochrane Database Syst Rev. 2012; 11: MR000030. https://doi.org/10.1002/14651858.MR000030.pub2
• Wagg C.R.
• Kwong G.P.S.
• Pang D.S.J.
Application of confidence intervals to data interpretation.
Can Vet J. 2016; 57: 547
• Wellcome
Use of animals in research policy.
2020
• Wicherts J.M.
• Bakker M.
• Molenaar D.
Willingness to share research data is related to the strength of the evidence and the quality of reporting of statistical results.
PLoS One. 2011; 6e26828
• Wilkinson M.D.
• Dumontier M.
• Aalbersberg I.
• et al.
The FAIR guiding principles for scientific data management and stewardship.
Sci Data. 2016; 3: 160018