Main

Prostate cancer is the fourth most common malignancy worldwide, and the second most common among men.1 In 2014, it was estimated that more than a quarter of a million new cases were diagnosed in North America and that the disease accounted for more than 33,000 deaths.2,3 These numbers are likely to increase with the aging of the population. On the basis of data from the Surveillance, Epidemiology, and End Results program, more men were diagnosed with prostate cancer at a younger age and earlier stage in 2004–2005 than in the mid-late 1990s, and disparity between ethnic groups in cancer stage at diagnosis decreased.4 Apart from age,5 ethnic group,5,6 and family history,7,8,9 the risk factors associated with prostate cancer are unclear,5 making primary prevention difficult. Prostate cancer is currently considered to be a complex, multifactorial disease, and the vast majority of familial clustering is attributed to the interaction of multiple shared susceptibility genes with moderate to low penetrance and shared environmental factors within these families.

The natural history of prostate cancer is highly variable.10 In a large proportion of men the disease is indolent, and it is difficult to predict which tumors will be aggressive. The value of aggressive management for localized prostate cancer is debated,11,12,13,14 and only a small proportion of men with early-stage prostate cancer die from the disease within 10 to 15 years of diagnosis. Developing tools to differentiate aggressive and indolent disease is of key importance.

Prostate-specific antigen (PSA) screening was introduced in the late 1980s.15 Meta-analyses of seven randomized controlled trials (RCTs) of screening using PSA testing alone or in combination with digital rectal examination suggested no evidence of benefit in reducing mortality16,17 and some evidence of harm from overdiagnosis.17 Amid substantial debate,18,19,20 the argument has been made for developing more accurate screening tests, including possible genetic markers.21

Since 2001 there have been ~1,000 published studies reporting associations between prostate cancer, single-nucleotide polymorphisms (SNPs), and other genetic variants. To date, genome-wide association (GWA) studies have identified replicated associations between prostate cancer and 100 specific SNPs.21 The magnitude of the odds ratios in these studies was in the range of 1.1 to 2.1, that is, of low penetrance. It is generally accepted that information on single low-penetrance alleles has no value in screening,22 but a small to moderate number of common, low-penetrance variants in combination may be useful in predicting the risk for disease.23
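The contrast between single low-penetrance alleles and combinations of them can be illustrated with a minimal sketch. It assumes independent SNPs acting under a log-additive (multiplicative) model, which is an illustrative assumption and not a model taken from the studies reviewed. The per-allele odds ratios are hypothetical values chosen within the reported 1.1 to 2.1 range, and `combined_odds_ratio` is a hypothetical helper name.

```python
import math


def combined_odds_ratio(per_allele_ors, risk_allele_counts):
    """Odds ratio relative to a person carrying no risk alleles.

    Assumes independent SNPs and a log-additive model (an assumption
    made for illustration only). Each count must be 0, 1, or 2.
    """
    if len(per_allele_ors) != len(risk_allele_counts):
        raise ValueError("one allele count is needed per SNP")
    if any(c not in (0, 1, 2) for c in risk_allele_counts):
        raise ValueError("risk allele counts must be 0, 1, or 2")
    log_or = sum(math.log(o) * c
                 for o, c in zip(per_allele_ors, risk_allele_counts))
    return math.exp(log_or)


# Hypothetical per-allele odds ratios, not values from any reviewed panel.
hypothetical_ors = [1.1, 1.2, 1.3, 1.5, 2.1]

# One risk allele at the weakest SNP only.
one_allele = combined_odds_ratio(hypothetical_ors, [1, 0, 0, 0, 0])

# One risk allele at every SNP.
all_heterozygous = combined_odds_ratio(hypothetical_ors, [1, 1, 1, 1, 1])
```

Under these assumptions a single weak allele leaves the odds almost unchanged, whereas carrying one risk allele at each of the five hypothetical SNPs multiplies the odds roughly fivefold. Whether such a gradient translates into useful prediction depends on how common each combination is, which this sketch does not model.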

The Centers for Disease Control and Prevention, through the Office of Public Health Genomics, and the Evaluation of Genomic Applications in Practice and Prevention (EGAPP) project, partnered with the Agency for Healthcare Research and Quality (AHRQ) to apply the ACCE (analytic validity; clinical validity; clinical utility; and ethical, legal, and social implications) framework24 to evidence of the use of SNP-based genotyping panels to assess risk of prostate cancer. The review addressed three research questions:

  • What is the analytic validity of currently available SNP-based panels designed for prostate cancer risk assessment; that is, how well do the panels measure the genetic variation they are intended to measure?

  • What is the clinical validity of currently available SNP-based panels designed for prostate cancer risk assessment; that is, what is the accuracy with which the panels identify or predict prostate cancer, or differentiate risk for aggressive from indolent disease?

  • What is the clinical utility of currently available SNP-based panels for prostate cancer risk assessment in terms of the process of care, health outcomes, harms, and economic considerations? Thus, if SNP panels assess genotype accurately, and, if so, accurately predict or stratify a person’s risk, does such risk prediction or stratification lead to altered clinical decision making and/or change in personal behavior sufficient to alter important disease outcomes, and are there any direct harms of such an approach?

Supplementary Figure S1 online illustrates how the use of SNP test panels may result in different types of intermediate and final outcomes, including adverse events.

The methods and findings of the review up to October 2011 were reported in an AHRQ Evidence Report.25 Here, we update this review with evidence up to April 2013.

Methods

With the input of a technical expert panel, we divided the research questions into a series of subquestions (Supplementary Appendix SI online) and developed a protocol following AHRQ guidelines26 to guide the identification and assembly of evidence to address them. The protocol was peer reviewed before the review commenced.

Data sources

Standard systematic review methodology was applied. MEDLINE, Cochrane CENTRAL, Cochrane Database of Systematic Reviews, and EMBASE databases were searched from their inception to April 2013, inclusive (Supplementary Appendix SII online). The websites of relevant specialty societies and organizations were searched, as well as the reference lists of eligible studies.

Eligibility criteria

To be eligible, studies had to have been published in English and report evaluation of the application of SNP analysis to human populations. The SNP analysis had to be across more than one gene and be commercially available, and at least one of the gene variants included in the panel must have been validated in a GWA study. The commercial availability of a test panel was defined as “a clinical test offered (or soon to be offered) by a certified laboratory, or licensed or certified kit reagent test panels sold for use by clinical service laboratories within continental North America.”25 The criterion of having been validated in a GWA study was imposed because many associations with candidate genes have not been replicated.27 We operationalized this criterion by checking the list of included SNPs against a list developed by reviewing original articles indexed in the National Human Genome Research Institute GWA catalog.28 Validation required observation of association in one or more independent data sets with a significance level of P < 10⁻⁵. Panels that included a SNP that was reported to be in linkage disequilibrium with a SNP that had been validated in a GWA study were included. Study designs varied by question; case reports, GWA studies, and simulation studies were excluded.

Study selection

Titles, abstracts, and full texts were screened sequentially by two independent reviewers (J.L. and R.C.). Any conflicts were resolved by a third reviewer with content expertise (B.W. or J.B.). Editorials, commentaries, and qualitative studies were excluded. No restrictions were placed on study setting, minimum sample size, or duration of follow-up (further details on eligibility criteria are provided in Supplementary Appendix SIII online, and excluded studies are listed in Supplementary Appendix SIV online).

Data extraction and risk of bias assessment

Data on study characteristics, SNP panels, and metrics specific to each research question were abstracted by trained data abstractors using standardized forms and a reference guide. Key study elements were reviewed by a second investigator with respect to outcomes, seminal-population characteristics, and characteristics of the SNP panel. Disagreements were resolved by consensus.

With regard to research question 1, because we did not find studies directly evaluating analytic validity, we extracted data on the genotyping technologies applied, either from the studies that assessed clinical validity or from references cited by these studies. Following the approach recommended by EGAPP,29 information on overall genotyping accuracy rates, SNP call rates, and concordance upon retesting was extracted on sets of genes, which included genes in addition to those included in the SNP panels reviewed. Concerning research question 2, risk of bias was assessed using the Newcastle-Ottawa Scale (NOS),30 supplemented by selected items from the Quality Assessment of Diagnostic Accuracy Studies (QUADAS) tool.31 We used the NOS anticipating that most studies would be observational, and because it was not clear how well the QUADAS tool would apply to genetic tests. The selected items from the QUADAS tool were whether (i) the spectrum of participants was representative of the patients who would receive the test in practice; (ii) the selection criteria were clearly described; and (iii) uninterpretable, indeterminate, or intermediate test results were reported. Because the number of studies assessing specific panels was small, we did not perform formal statistical tests for publication bias.32,33

Data synthesis

A qualitative descriptive approach was used to summarize study characteristics and outcomes. Multiple publications for the same study were grouped together and treated as a single study, with the most current data reported for the presentation of summary results. Standardized summary tables explaining important study and target population characteristics, as well as study results, were created. Quantitative synthesis and subgroup analyses were planned but not performed because of the heterogeneity of outcomes across the studies and/or because of insufficient data.

Role of the funding source

This systematic review was funded under contract from the AHRQ, which provided project oversight and assisted with internal and external review of the draft evidence report. The AHRQ did not participate in the literature search, determination of study eligibility criteria, data analysis or interpretation, or the preparation and review of the manuscript for publication. The authors worked with a seven-member technical expert panel. This panel included experts in urologic cancer, cancer genetic testing, molecular diagnostics and pathology, prognostic markers and outcomes research, and it helped to set the scope of the review and provided input on methodological and substantive issues during the review.

Results

The search yielded 2,813 unique citations (Figure 1). In total, 1,967 (70%) were excluded following the initial level of title and abstract screening. The full text of the remaining 846 citations was screened, and from these a total of 21 articles34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54 were eligible. All were considered primarily relevant to research question 2 (clinical validity), but they also provided data that permitted extrapolation to address research question 1 (analytic validity). No studies were identified that addressed research question 3 (clinical utility).

Figure 1

Flow diagram depicting the flow of studies through the screening process. The 22 studies included in this figure comprise refs. 34–54, plus ref. 58 (which compared the results of different statistical analyses as applied to ref. 51).

1. What is the analytic validity of currently available SNP-based panels designed for prostate cancer risk assessment?

No direct assessment of the analytic validity of any SNP-based panels was identified in the literature search. On the basis of the 14 articles that were identified as providing information relevant to the assessment of the clinical validity of SNP panels, reported overall (i.e., including genetic markers included in the panels and variants of other genes) genotyping accuracy rates ranged up to >99.9%; SNP call rates were usually reported in the range of 98 to 99% (with a low of 89%), and reported concordance upon retesting was usually >99%.25,34,35,36,37,38,39,40,41,42,43,45,46,47,48,49,50,51,53,54,55 However, the methodologies described as the basis for determining analytical validity were not uniform across all analytes for some panels; in multiple cases, the SNP call rate of a given test panel was reported on the basis of data from two or more different chip platforms or analytical techniques. No evidence was identified about sources of variation in accuracy or analytical validity across different test platforms.
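For readers unfamiliar with the two quality metrics quoted above, the following minimal sketch shows one conventional way of computing a SNP call rate and a retest concordance. The definitions are simplified, the function names are hypothetical, and the genotype data are invented for illustration; the reviewed studies may have used platform-specific definitions.

```python
def call_rate(genotypes):
    """Fraction of attempted genotype calls that produced a result.

    A missing call is represented by None (an assumption of this sketch).
    """
    if not genotypes:
        raise ValueError("no genotypes supplied")
    called = sum(g is not None for g in genotypes)
    return called / len(genotypes)


def concordance(first_run, repeat_run):
    """Fraction of samples called in both runs whose genotypes agree."""
    pairs = [(a, b) for a, b in zip(first_run, repeat_run)
             if a is not None and b is not None]
    if not pairs:
        raise ValueError("no sample was called in both runs")
    return sum(a == b for a, b in pairs) / len(pairs)


# Invented genotypes for one SNP across five samples, typed twice.
first_run = ["AA", "AG", None, "GG", "AG"]
repeat_run = ["AA", "AG", "AG", "GG", "AA"]

example_call_rate = call_rate(first_run)                  # 4 of 5 calls made
example_concordance = concordance(first_run, repeat_run)  # 3 of 4 pairs agree
```

Neither metric compares the call against a reference genotype, which is one reason such figures permit only indirect inference about analytic validity.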

2. What is the clinical validity of currently available SNP-based panels designed for prostate cancer risk assessment?

Twenty-one articles describing 18 distinct SNP-based panels were identified as eligible (Supplementary Table S1 online). The analyses were based on 18 base studies, one of which was used in seven articles, one in four articles, two in three articles, four in two articles, and the remainder in single articles only (Supplementary Table S2 online). There was overlap between the panels assessed (Supplementary Table S5 online). The properties of a five-SNP panel were investigated in six articles,34,35,36,37,41,44 four of which also considered family history.34,35,36,41 This five-SNP panel, first described by Zheng et al.,34 is the basis of the Focus 5 predictive test for prostate cancer. A patent application has been filed by Xu et al.56 The test has been marketed by Proactive Genomics.57 The properties of an 11-SNP panel were investigated in two articles,38,44 and of a 33-SNP panel in four;49,50,51,52 one further article58 compared the results of the application of alternative statistical analyses with the data described by Kader et al.51 The other 15 panels included between 2 and 35 SNPs, but each was investigated in a single study only; several of these considered family history and age in the risk prediction model. All but two evaluations were case-control (association) studies, and were heterogeneous in terms of the composition of each panel (specific SNPs and the number included), the inclusion of other risk factor data, the populations in which they were evaluated, and the metrics used to judge the performance of the panel as a “test.” One evaluation was a cross-sectional study,46 and one was a cohort study of survival among men with prostate cancer.47 The studies made use of samples that had already been collected and/or were being assembled for research purposes. 
Four articles were wholly or partially based on cases and controls nested in an RCT of multimodal screening36,39,48,49 (one of which pooled these data with data from cohort studies48); two were wholly or partially based on cases and controls nested in the placebo arm of a chemoprevention trial,51,52 and one in a cohort study.43 Other sources of ascertainment of cases included cancer registries34,35,38,39,43,44,47,49 and clinical series.36,37,40,41,42,43,45,46,49,50 Sources of controls included population registries,34,38,39,44,49 random digit dialing,35,43 volunteers,36,37,41,42,46,53 and screening studies.40,45 Thus, none of the studies was performed in clinical settings in which the panels were being used to guide clinical management.

In terms of the ability of multigene panels to stratify future risk and/or screen for current disease, across six studies, the range of observed diagnostic odds ratio for the five-SNP panel when other variables were not included was 2.4 to 4.5 (Figure 2). Receiver-operator characteristic curves were computed in two of these studies, with the reported area under the curve (AUC) ranging from 58 to 73%, depending on the study and inclusion of other variables. AUCs across all panels ranged between 58 and 74%. Within individual studies, the incremental gain in AUC observed when the predictive model including the SNP data was compared with the best alternative non-SNP model (i.e., the absolute improvement in AUC) ranged from 2.5 to 11.0%.34,35,38,41,46,48,51,52,53 Of note, the largest increases were observed in comparison with non-SNP models whose AUCs were the lowest.48,52
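The two performance metrics used above can be made concrete with a short sketch. The diagnostic odds ratio is computed from a 2×2 table, and the AUC is computed as the probability that a randomly chosen case has a higher score than a randomly chosen control, with ties counted as one half. The counts and scores are invented for illustration and the function names are hypothetical; they do not reproduce any analysis in the reviewed articles.

```python
def diagnostic_odds_ratio(tp, fp, fn, tn):
    """Odds of a positive test in cases divided by the odds in controls."""
    if fp == 0 or fn == 0:
        raise ValueError("undefined when fp or fn is zero")
    return (tp * tn) / (fp * fn)


def auc(case_scores, control_scores):
    """Area under the ROC curve via pairwise comparison of scores.

    Equals the probability that a random case outscores a random
    control; tied pairs contribute one half.
    """
    if not case_scores or not control_scores:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for c in case_scores:
        for k in control_scores:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Invented 2x2 table: 60 true positives, 40 false positives,
# 40 false negatives, 60 true negatives.
example_dor = diagnostic_odds_ratio(tp=60, fp=40, fn=40, tn=60)

# Invented risk-allele counts for four cases and four controls.
example_auc = auc([3, 4, 2, 5], [2, 1, 3, 2])
```

The incremental gain in AUC reported in the text is simply the difference between two such AUC values, one for the model with SNP data and one for the best non-SNP model, computed on the same participants.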

Figure 2

Forest plot of odds ratios and tabulation of associated areas under receiver-operator characteristic curve for the five-SNP and other panels for prostate cancer risk assessment. NR, not reported; SNP, single-nucleotide polymorphism.

Data on the ability of multigene panels to distinguish clinically important from latent disease were available for the 5-SNP,36,59 14-SNP,39 11-SNP,38 25-SNP,48 17-SNP,53 and 35-SNP panels.46 Regardless of the operational definition of “clinically important” prostate cancer, none of the evaluations suggested that any of these panels performed well in distinguishing between more and less aggressive disease.

Prediction of prostate cancer mortality in affected men was evaluated for the 5-SNP panel, with and without inclusion of family history,35 the 6-SNP panel,43 and the 16-SNP panel.47 Follow-up periods ranged from 3.7 to 10 years. There was no association between risk alleles and prostate cancer mortality for any of the panels,35,43,47 and there was no increase in the AUC of a model based on age, PSA, Gleason score, and tumor stage when SNP panel data were added.35

No data were found that directly addressed the question of whether factors such as ethnicity, gene–gene interaction, or gene–environment interaction affect the predictive value of multigene panels or the interpretation of their results. For one of the panels,42 we noted the development of separate tests for SNPs in steroid hormone pathway genes for non-Hispanic whites and Hispanic whites.

3. What is the clinical utility of currently available SNP-based panels for prostate cancer risk assessment in terms of the process of care, health outcomes, harms, and economic considerations?

No eligible studies addressing any component of clinical utility, including effects of using the panels on process of care, health outcomes, harms, and economic outcomes, were identified.

Risk of bias assessment of individual studies

For research question 1, information on the validity of the overall genotyping accuracy rates, SNP call rates, and concordance upon retesting was extracted for sets of genes that included genes in addition to those included in the SNP panels. For research question 2, the reference standard for cases was histopathological diagnosis in all of the studies, but latent or undiagnosed cancer was not checked for in control groups with two exceptions.41,46 Autopsy studies of men over 50 years of age who had died from other causes demonstrated a frequency of histologically proven prostate cancer of 30 to 40%.25 However, there are clearly ethical constraints to taking prostate tissue samples from asymptomatic men in order to exclude an undiagnosed disease. In one of the studies, controls were selected from the same group of men referred to prostate cancer centers who had either a PSA value ≥4.0 ng/ml or an abnormal digital rectal examination and who had no biopsy evidence of prostate cancer.41 The results of the clinical validity evaluation of the five-SNP panel in this study were similar to those of the other studies in which this panel was evaluated.35,36,37 In all of the studies, it seems unlikely that the index test result affected the decision to undertake prostate biopsy or the interpretation of histopathological examination of biopsy specimens. Because all of the studies were conducted in research contexts, however, it is not clear that decision making incorporated the same clinical data that would have been available in routine practice.

The execution of the genotyping component of the index test was adequately described in all but two of the studies.44,52 Most of the studies related to participants of European origin, and those that did not adjusted for ethnicity or conducted analyses restricted to participants of European origin. This is likely to have limited the risk of bias resulting from population stratification, that is, the presence within a population of subgroups among which allele (or genotype or haplotype) frequencies and disease risks differ.60 However, some of the other variables included in risk scores may have been prone to differential error because of the retrospective case-control design used in all but the PLCO Trial,36,39,49 the pooled data from the PLCO and ATBC trials and four cohorts,48 the follow-up of the placebo arm of the REDUCE trial,51,61 the PHS,43 and the San Antonio cohort.42

By combining the results of the NOS30 evaluation and the QUADAS31 criteria for the individual studies, all studies of the five-SNP panel were found to have a moderate risk of bias (Supplementary Tables S3 and S4 online). Based on three domains in the NOS30 (selection of controls, comparability of cases and controls, and method of ascertainment of cases and controls), along with limited data about genotyping methods and quality control, lack of specification of which candidate nongenetic variables were initially examined or considered for inclusion in the risk models, and lack of information about how these variables were assessed, the overall risk of bias was assessed as being at least “moderate.” Using the same approach, the assessments of the other 14 panels were based on single studies, reported in 11 articles,37,38,39,40,41,42,43,44,45,46,47 and these all were also considered to have at least a moderate risk of bias (Supplementary Tables S3 and S4 online).

We were unable to assess the extent of publication bias in this literature. Overall, it is unlikely that any of the biases identified would be sufficient to alter the interpretation of the findings from (at best) inadequacy of evidence to clearly positive supporting evidence for any of the SNP panels reviewed.

Discussion

We identified a number of SNP panels that we considered as meeting the definition of “close to commercially available.” They were widely variable in their makeup, containing a range of different SNPs, many combined with other risk factor data in predictive algorithms. We could not draw robust conclusions regarding their analytic validity. Following EGAPP guidance29 in drawing inferences from data available on accuracy and call rates, and concordance upon retesting for genotyping of large numbers of SNPs, including but not limited to the SNPs that were the components of the panels, it is likely that the analytic validity of genotyping of the five-SNP panel is high in research settings. However, questions remain about potential errors that could influence test results when applied in clinical management. This concern also applies to the other panels assessed, for which data were available only from single studies.

We acknowledge that we were unable to assess the extent of publication bias. In an attempt to address publication bias, we assembled a list of companies believed to have developed, or to be in the process of developing, SNP-based panels. On behalf of the authors, the Scientific Resource Center of AHRQ directly contacted 40 companies known to provide either test services or diagnostic reagents potentially relevant to the key questions in an effort to elicit unpublished sources of information.25 No response was received. Because of this lack of response, we did not pursue this approach further. Over the period 2010–2014, we reviewed information at human and clinical genetics scientific conferences and asked that information on genetic risk testing for prostate cancer be sent to us, but we received no follow-up.

The studies of clinical validity were predominantly done with participants of European origin, and so the applicability of these findings to men of other ancestral or ethnic groups is limited. Overall, these studies showed statistically significant associations between combinations of SNPs and risk of prostate cancer. When assessed using test evaluation designs, however, the risk models based on SNP panel data alone improved the AUC only marginally compared with non-SNP-based tests (which performed poorly overall) in distinguishing cases from noncases, distinguishing clinically meaningful from latent cancer, or in stratifying the prognosis of confirmed cases. It would be expected that test panels that include additional validated genetic markers, especially genetic variants shown to be causal as distinct from indicating a chromosomal risk region, will offer greater clinical validity than the panels considered in this review. None of the evaluations was conducted in a routine clinical setting, further limiting applicability of these findings.

For the most-investigated five-SNP panel, the maximal AUCs with the inclusion of SNPs ranged between 63 and 73%, and would not in themselves be considered useful for individual risk prediction. While we recognize that the AUC may not be optimal in assessing models that stratify individuals into risk categories,62 it has been suggested that proposed tests with an AUC of 75% or less are unlikely to be clinically useful.63 By way of comparison, AUCs for risk prediction models for breast cancer have ranged between 53 and 66%.64 The median AUC for the widely investigated Framingham Risk Score, when coronary heart disease was the outcome examined in 57 studies, was 77% (interquartile range: 71–83%).65 In the single study of the five-SNP panel that investigated mortality, there was no difference between SNP-based and non-SNP-based models. In the single study of the panel that addressed differences by Gleason score, as well as aggressive and nonaggressive disease, there was no association with scores derived from the five-SNP panel.

The next most investigated panel was the 33-SNP panel, for which the maximal AUCs ranged between 61 and 64%. The results were very similar, irrespective of the different methods of statistical analysis used.49,58 The AUC for the 11-SNP panel was 65%38; in a subset analysis limited to men without a reported family history of prostate cancer, the positive predictive value of the panel was 37%.44 There were only single studies of the other panels, almost all of which reported panel development, with no information on internal or external validation. When AUC was reported, it was in the range of 62 to 74%. Any increase in AUC compared with models not incorporating the SNP combinations was small. In the few studies that investigated the distinction between clinically important and latent/asymptomatic prostate cancer or prognosis, no associations with risk scores derived from the SNP panels were observed.

We are aware that the deCODE PrCa test was launched in 2008. This was further developed as the deCODE ProstateCancer test, available in different versions for men of European, African, and Asian descent, the details of which were available only on a website.25 The test could be ordered as part of deCODE Complete, which analyzed genetic risk factors for 47 traits and conditions, or deCODE Cancer, which analyzed genetic risk factors for seven types of cancer.66 A patent application was filed by Gudmundsson and Sulem67 in May 2010. deCODE developed a partnership with ARUP Laboratories in 2010 to offer these tests. Amgen bought deCODE in 2012. In addition, prostate cancer was one of the diseases included in direct-to-consumer genetic risk assessments for multiple diseases offered by deCODE, Navigenics, and 23andMe.68 All of the deCODE tests and the Navigenics test seem to have since been discontinued. In the United States, 23andMe no longer offers health-related genetic reports69; in Canada and the United Kingdom, reports on over 100 conditions are offered, but prostate cancer is not listed.70 The Myriad myRisk test, available through health-care providers, is designed to be a hereditary cancer risk test, combining information on family history with multiple genes for which the associated cancer risk is at least two to three times the general population risk and which show syndromic overlap (thus the genes listed as associated with prostate cancer, namely BRCA1, BRCA2, TP53, CHEK2, and NBN, are each associated with at least one other type of cancer).

Thus, currently available or documented SNP panels proposed for prediction of risk for prostate cancer have poor discriminative ability. Only two of the panels were validated in a data set independent of the data in which the panel was developed and by independent teams of investigators. None of the articles considered calibration, that is, the agreement between the proportion predicted to have the outcome and the proportion observed among the participants in whom the panel was tested. Evaluation of calibration is important if predictions based on a test panel are used to inform those tested or health professionals when making decisions.71 Moreover, discrimination and calibration have limited usefulness for clinical decision making. On the one hand, a panel with good discrimination in a research context may not be clinically useful if the threshold for clinical decision making is outside the range of predictions provided by the panel.71 On the other hand, a model with relatively poor discrimination may be clinically useful if there is little evidence or consensus to guide clinical choice between alternative managements; none of the studies used a decision-analytic approach.72
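The notions of calibration and of a decision-analytic evaluation mentioned above can be sketched as follows. The first function compares mean predicted risk with the observed proportion of cases within groups of increasing predicted risk. The second computes net benefit at a chosen risk threshold using the commonly cited formulation from decision curve analysis, net benefit = TP/n − (FP/n) × pt/(1 − pt); we assume, without confirmation from the source, that this is the kind of approach intended. All data are invented and the function names are hypothetical.

```python
def calibration_table(predicted, observed, n_groups=2):
    """Return (mean predicted risk, observed proportion) per risk group.

    Participants are sorted by predicted risk and split into n_groups
    groups of nearly equal size.
    """
    pairs = sorted(zip(predicted, observed))
    n = len(pairs)
    table = []
    for g in range(n_groups):
        chunk = pairs[g * n // n_groups:(g + 1) * n // n_groups]
        if not chunk:
            continue
        mean_pred = sum(p for p, _ in chunk) / len(chunk)
        obs_rate = sum(o for _, o in chunk) / len(chunk)
        table.append((mean_pred, obs_rate))
    return table


def net_benefit(predicted, observed, threshold):
    """Net benefit of acting on everyone with predicted risk >= threshold."""
    if not 0 < threshold < 1:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(predicted)
    tp = sum(1 for p, o in zip(predicted, observed) if p >= threshold and o == 1)
    fp = sum(1 for p, o in zip(predicted, observed) if p >= threshold and o == 0)
    return tp / n - (fp / n) * threshold / (1 - threshold)


# Invented predicted risks and outcomes (1 = prostate cancer diagnosed).
predicted = [0.1, 0.2, 0.3, 0.4, 0.6, 0.8]
observed = [0, 0, 1, 0, 1, 1]

example_table = calibration_table(predicted, observed, n_groups=2)
example_net_benefit = net_benefit(predicted, observed, threshold=0.25)
```

In this invented example the lower-risk group has a mean predicted risk of 0.2 against an observed proportion of one third, which would indicate underprediction in that group. Net benefit is informative only relative to the alternatives of acting on all men or on none at the same threshold.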

No evidence was found that addressed the important questions of clinical utility. This is not surprising given that this field is in the early stages of development.73,74 However, even if the review had identified more compelling evidence to support clinical validity, this would not in itself provide any direct evidence of the value of SNP-based test panels in reducing morbidity and mortality. The overall benefits of genomic approaches to risk assessment and screening will also depend on the consistent application of appropriate diagnostic strategies, which in turn will depend, at least in part, on clinicians’ willingness to trust the results of initial screening. The most important limitation with PSA-based screening is its lack of specificity (i.e., a high rate of false positives).15,16,17 Improving this by using SNP-based panels would reduce unnecessary diagnostic investigations and their associated morbidity and costs. However, this would be successful only if patients were willing to trust negative screen results, given a prevailing culture that seems to promote higher levels of screening as “better” screening practice.75,76,77 Thus, SNP-based screening panels need to demonstrate increased specificity, and may also need to demonstrate superior levels of sensitivity, compared with PSA-based screening for patients and their physicians to have confidence in their use, especially in view of the debate about PSA-based screening.18,78,79,80 Some studies have suggested that the use of SNP-based models in stratifying PSA thresholds could improve PSA-based screening.81,82

SNP-based panels may also have a role in stratifying future risk of prostate cancer in men who are currently unaffected. This would permit surveillance strategies to be tailored according to risk category: Those at highest risk could be offered more frequent screening and those at lowest risk could avoid unnecessary surveillance. However, this assumes that it would be possible to optimize surveillance strategies and ensure valid screening tests. It might also be assumed that men at higher risk would be more motivated to make positive lifestyle changes, although there is no evidence that this actually occurs from a trial of genetic and environmental risk assessment in the context of colorectal cancer screening,83 or from studies based on other forms of risk stratification.84,85 It has also been argued that while the risk of a disease outcome varies between risk strata, the risk of harm from treatment is more uniform.86 Thus some individuals could benefit more from treatment than others, but all would be at similar risk of harm.

It is also hoped that SNP-based panels may improve the overall tailoring of treatment so that only those men who are at risk of aggressive disease are offered radical surgical interventions. Evaluations of the prognostic accuracy of such panels would be a first step, but definitive evidence from rigorous trials would still be required to determine the overall utility of such an approach. A recently completed RCT, in which men were enrolled soon after PSA testing entered routine clinical practice, found that, compared with watchful waiting in men with clinically localized prostate cancer, radical prostatectomy did not reduce all-cause or prostate cancer mortality over a follow-up period of at least 12 years.87 Two RCTs initiated before PSA testing became widespread gave conflicting results about the efficacy of radical prostatectomy compared with watchful waiting.88 One of these trials highlighted important concerns with quality of life for both radical prostatectomy and watchful waiting.89 Syntheses of observational evidence are significantly hampered by serious methodological issues, including considerable variation in outcome reporting, lack of controls or risk adjustment, and overlap between studies.90 An RCT comparing watchful waiting with radical prostatectomy is ongoing in the United Kingdom.13,91

Taken together, these data show that benefits from improvements in prostate cancer risk prediction, screening, and prognostic stratification will depend to a large extent on clearer evidence that surveillance, diagnostic, and treatment strategies in themselves lead to reductions in morbidity and mortality.

Future research should include direct assessment of the analytic validity of specific panels and sources of variation in accuracy or analytical validity across different panels. It should focus on evaluating clinical validity more extensively and robustly in participants who are more representative of general clinical populations, and on directly comparing SNP-based panels with the existing standard of care. In addition to the consideration of discrimination and calibration, it would be helpful to use decision analysis methods.92 The development of SNP-based panels that could be applied to prostate cancer is still at an early stage. Incorporation of additional SNPs that increase the proportion of the polygenic variance accounted for by measured genetic variants would be expected to increase the absolute difference in risk between extreme tails of the distribution of a SNP panel.93 It has also been observed that adding a polygenic risk score (that is, a score based on SNP alleles associated with disease that do not achieve either nominal statistical significance (P < 0.05) or stringent genome-wide statistical significance) does not improve risk prediction for prostate cancer over replicated SNPs from GWA studies.94 These observations suggest a need to identify and validate further genetic markers to enable larger SNP panels to be developed. It has been estimated that known prostate cancer risk variants explain about a third of the familial risk of the disease among populations of European origin.21 SNPs identified from GWA studies are markers for the region of risk in which the causal SNP is located; the magnitude of risk associated with truly causal variants would be expected to be greater than with the risk markers identified so far. Therefore, the quest to develop future panels useful in risk stratification depends on further characterization of the regions of genetic risk already identified, as well as possible additional markers.95 More emphasis needs to be placed on distinguishing between aggressive and nonaggressive disease, and investigators should consider the possibility for subgroup analyses at the planning stage of studies. We also note the broader context in which risk stratification would be applied, in which there is a need for clearer evidence of the effectiveness of surveillance, diagnostic, and treatment strategies in reducing morbidity and mortality. This suggests that it would be very valuable to assemble a stakeholder panel to help prioritize research needs regarding the potential application of genomic profiling in prostate cancer risk assessment, as recommended in a recent paper on prioritization criteria methodology.96
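
As an illustration only, the distinction between discrimination and decision-analytic evaluation can be sketched in a few lines of Python. The sketch assumes that discrimination is summarized by the AUC, computed as the proportion of case-control pairs in which the case receives the higher score, and that the decision analysis takes the form of a net benefit calculation at a chosen risk threshold (the usual decision curve formulation). This may not be the specific method intended in the cited work. The function names and the toy risk values are hypothetical and are not drawn from any study in this review.

```python
from typing import Sequence


def auc(case_scores: Sequence[float], control_scores: Sequence[float]) -> float:
    """Proportion of case-control pairs in which the case has the higher
    score; tied pairs count as one half. Equivalent to the area under the
    ROC curve."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


def net_benefit(risks: Sequence[float], outcomes: Sequence[int],
                threshold: float) -> float:
    """Net benefit of acting on everyone whose predicted risk is at or above
    `threshold`: true positives per person minus false positives per person,
    with false positives weighted by the odds of the threshold."""
    n = len(risks)
    true_pos = sum(1 for r, y in zip(risks, outcomes) if r >= threshold and y == 1)
    false_pos = sum(1 for r, y in zip(risks, outcomes) if r >= threshold and y == 0)
    return true_pos / n - (false_pos / n) * threshold / (1.0 - threshold)


# Hypothetical predicted risks for three cases and three controls.
toy_risks = [0.9, 0.6, 0.4, 0.5, 0.3, 0.2]
toy_outcomes = [1, 1, 1, 0, 0, 0]

toy_auc = auc(toy_risks[:3], toy_risks[3:])
toy_nb = net_benefit(toy_risks, toy_outcomes, threshold=0.45)
print(f"AUC = {toy_auc:.3f}, net benefit at 0.45 = {toy_nb:.3f}")
```

The AUC depends only on how cases rank relative to controls. Net benefit also depends on the threshold at which action would be taken, and so on the relative weight given to missed disease and to unnecessary intervention. A panel may therefore improve one measure without improving the other.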

Conclusion

The potential value of using SNP-based panels in prostate cancer risk assessment includes risk stratification, screening for undiagnosed disease, and assessing prognosis. We identified 18 SNP panels that we considered to fulfill the definition of “close to commercially available.” Their makeup varied widely: they contained from 2 to 35 different SNPs, and many were combined with other risk factor data in predictive algorithms.

With regard to stratifying future risk and/or screening for current disease, a five-SNP panel was evaluated in six articles. Five of the other 17 panels were investigated in single studies only. AUCs across all panels ranged from 58% to 74%. Thus, all of the panels had AUCs below 75%, the threshold below which tests are in general considered unlikely to be clinically useful. Any increase in AUC compared with models not incorporating the SNP combinations was small. In the few studies that investigated the distinction between clinically important and latent/asymptomatic prostate cancer or prognosis, no associations with risk scores derived from the SNP panels were observed. Thus, currently available or documented SNP panels proposed for prediction of risk for prostate cancer have poor discriminative ability.

No evidence was found that addressed the important questions of clinical utility relating to process of care, health outcomes, harms, and economic outcomes; a significant gap in the literature has been identified. However, even if the review had identified compelling evidence to support clinical utility, this in itself would not provide any direct evidence of the value of SNP-based test panels in reducing morbidity and mortality. Any benefit from improvements in prostate cancer risk prediction, screening, and prognostic stratification will depend to a large extent on clearer evidence about other components of the chain of evidence, in particular whether surveillance, diagnostic, and treatment strategies in themselves lead to reductions in morbidity and mortality.

Disclosure

The authors declare no conflict of interest.