TY - JOUR
T1 - Double Sampling with Multiple Imputation to Answer Large Sample Meta-Research Questions
T2 - Introduction and Illustration by Evaluating Adherence to Two Simple CONSORT Guidelines
AU - Capers, Patrice L.
AU - Brown, Andrew W.
AU - Dawson, John A.
AU - Allison, David B.
N1 - Funding Information:
This project was supported by NIH grants K12GM088010, P30DK056336, T32HL072757, and R25HL124208. The opinions expressed are those of the authors and do not necessarily reflect those of the NIH or any other organization.
Publisher Copyright:
Copyright © 2015 Capers, Brown, Dawson and Allison.
PY - 2015/3/9
Y1 - 2015/3/9
N2 - Background: Meta-research can involve manual retrieval and evaluation of research, which is resource intensive. Creation of high throughput methods (e.g., search heuristics, crowdsourcing) has improved feasibility of large meta-research questions, but possibly at the cost of accuracy. Objective: To evaluate the use of double sampling combined with multiple imputation (DS + MI) to address meta-research questions, using as an example adherence of PubMed entries to two simple Consolidated Standards of Reporting Trials (CONSORT) guidelines for titles and abstracts. Methods: For the DS large sample, we retrieved all PubMed entries satisfying the filters: RCT, human, abstract available, and English language (n = 322,107). For the DS subsample, we randomly sampled 500 entries from the large sample. The large sample was evaluated with a lower rigor, higher throughput (RLOTHI) method using search heuristics, while the subsample was evaluated using a higher rigor, lower throughput (RHITLO) human rating method. Multiple imputation of the missing-completely-at-random RHITLO data for the large sample was informed by: RHITLO data from the subsample; RLOTHI data from the large sample; whether a study was an RCT; and country and year of publication. Results: The RHITLO and RLOTHI methods in the subsample largely agreed (phi coefficients: title = 1.00, abstract = 0.92). Compliance with abstract and title criteria has increased over time, with non-US countries improving more rapidly. DS + MI logistic regression estimates were more precise than subsample estimates (e.g., 95% CI for change in title and abstract compliance by year: subsample RHITLO 1.050–1.174 vs. DS + MI 1.082–1.151). As evidence of improved accuracy, DS + MI coefficient estimates were closer to RHITLO than the large sample RLOTHI. Conclusion: Our results support our hypothesis that DS + MI would result in improved precision and accuracy. This method is flexible and may provide a practical way to examine large corpora of literature.
AB - Background: Meta-research can involve manual retrieval and evaluation of research, which is resource intensive. Creation of high throughput methods (e.g., search heuristics, crowdsourcing) has improved feasibility of large meta-research questions, but possibly at the cost of accuracy. Objective: To evaluate the use of double sampling combined with multiple imputation (DS + MI) to address meta-research questions, using as an example adherence of PubMed entries to two simple Consolidated Standards of Reporting Trials (CONSORT) guidelines for titles and abstracts. Methods: For the DS large sample, we retrieved all PubMed entries satisfying the filters: RCT, human, abstract available, and English language (n = 322,107). For the DS subsample, we randomly sampled 500 entries from the large sample. The large sample was evaluated with a lower rigor, higher throughput (RLOTHI) method using search heuristics, while the subsample was evaluated using a higher rigor, lower throughput (RHITLO) human rating method. Multiple imputation of the missing-completely-at-random RHITLO data for the large sample was informed by: RHITLO data from the subsample; RLOTHI data from the large sample; whether a study was an RCT; and country and year of publication. Results: The RHITLO and RLOTHI methods in the subsample largely agreed (phi coefficients: title = 1.00, abstract = 0.92). Compliance with abstract and title criteria has increased over time, with non-US countries improving more rapidly. DS + MI logistic regression estimates were more precise than subsample estimates (e.g., 95% CI for change in title and abstract compliance by year: subsample RHITLO 1.050–1.174 vs. DS + MI 1.082–1.151). As evidence of improved accuracy, DS + MI coefficient estimates were closer to RHITLO than the large sample RLOTHI. Conclusion: Our results support our hypothesis that DS + MI would result in improved precision and accuracy. This method is flexible and may provide a practical way to examine large corpora of literature.
KW - CONSORT
KW - adherence
KW - double sampling
KW - meta-research
KW - modeling
KW - multiple imputation
UR - http://www.scopus.com/inward/record.url?scp=85014047472&partnerID=8YFLogxK
U2 - 10.3389/fnut.2015.00006
DO - 10.3389/fnut.2015.00006
M3 - Article
AN - SCOPUS:85014047472
SN - 2296-861X
VL - 2
JO - Frontiers in Nutrition
JF - Frontiers in Nutrition
M1 - 6
ER -