Improving the quality of biomarker discovery research: the right samples and enough of them
Margaret S Pepe, Christopher I Li, Ziding Feng
Abstract
Background: Biomarker discovery research has yielded few biomarkers that validate for clinical use. A contributing factor may be poor study designs.
Methods: The goal in discovery research is to identify a subset of potentially useful markers from a large set of candidates assayed on case and control samples. We recommend the PRoBE design for selecting samples. We propose sample size calculations that require specifying: (i) a definition for biomarker performance; (ii) the proportion of useful markers the study should identify (Discovery Power); and (iii) the tolerable number of useless markers amongst those identified (False Leads Expected, FLE).
Results: We apply the methodology to a study of 9,000 candidate biomarkers for risk of colon cancer recurrence, where a useful biomarker is defined as one with positive predictive value ≥ 30%. We find that 40 patients with recurrence and 160 without recurrence suffice to filter out 98% of useless markers (2% FLE) while identifying 95% of useful biomarkers (95% Discovery Power). Alternative methods for sample size calculation required more assumptions.
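To make the reported operating characteristics concrete, the following minimal sketch converts them into expected counts of markers passing the discovery filter. It is illustrative arithmetic only, not the authors' sample size method. The function name `expected_selection` is hypothetical, and the number of truly useful markers among the 9,000 candidates (`n_useful = 10`) is an assumption, since the abstract does not state it.

```python
def expected_selection(n_candidates, n_useful, false_lead_rate, discovery_power):
    """Expected numbers of useless and useful markers that pass the filter.

    false_lead_rate: fraction of useless markers that pass (0.02 in the abstract).
    discovery_power: fraction of useful markers that pass (0.95 in the abstract).
    """
    n_useless = n_candidates - n_useful
    expected_false_leads = n_useless * false_lead_rate
    expected_true_leads = n_useful * discovery_power
    return expected_false_leads, expected_true_leads


# 9,000 candidates, 2% and 95% are taken from the abstract.
# n_useful=10 is a hypothetical value chosen only for illustration.
false_leads, true_leads = expected_selection(
    n_candidates=9000, n_useful=10, false_lead_rate=0.02, discovery_power=0.95
)
print(f"expected false leads: {false_leads:.1f}")
print(f"expected useful markers identified: {true_leads:.1f}")
```

Under this assumed split, roughly 180 useless markers would be expected to pass alongside most of the useful ones, which illustrates why discovery is followed by validation.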
Conclusions: Biomarker discovery research should draw on quality biospecimen repositories and use sample sizes large enough to identify markers that meet prespecified performance criteria for well-defined clinical applications.
Impact: The scientific rigor of discovery research should be improved.
©2015 American Association for Cancer Research.
Source: PubMed