Main

The human epidermal growth factor receptor 2 gene ERBB2 (HER2) is amplified and its protein is overexpressed in many cancer types.1, 2 Overexpression of ERBB2 is an established therapeutic target in breast and gastric cancers and is successfully exploited in the clinic using a variety of anti-ERBB2 agents, leading to remarkable outcome improvements.3, 4 Although the comprehensive molecular characterization of human colorectal cancer has identified ERBB2 amplification as a potential therapeutic target5 and ERBB2 overexpression has been controversially linked to prognosis,6, 7, 8, 9 the clinical significance of ERBB2 alterations remains elusive. Recently, we and others have found that activation of ERBB2 signaling causes resistance to anti-EGFR therapy in a fraction of metastatic colorectal patients, wild type for RAS codons 12–13.10, 11, 12 Of more relevance for the clinic, we have demonstrated that the combination of the anti-ERBB2 monoclonal antibody trastuzumab and the dual EGFR/HER2 small-molecule inhibitor lapatinib, but not either drug alone, is effective in inducing durable tumor shrinkage in ERBB2-amplified metastatic CRC patient-derived xenografts.10

Reasoning that ERBB2 could also represent a valuable therapeutic target in KRAS wild-type metastatic colorectal cancer patients resistant to anti-EGFR treatment, we designed the HERACLES trial, a phase II trial testing the combination of trastuzumab and lapatinib in ERBB2-positive metastatic colorectal cancer patients refractory to standard treatment, including cetuximab or panitumumab.11 Prior to starting the HERACLES trial, we elected to develop colorectal cancer-specific criteria for the definition of ERBB2 positivity. Immunohistochemistry and fluorescent or silver in situ hybridization (FISH or SISH) are current standard methodologies to detect, respectively, ERBB2 protein expression and gene amplification on formalin-fixed paraffin-embedded tumor samples. These are routinely used to establish ERBB2 status in breast and gastric cancer but have not been customized for colorectal cancer for which the reported rate for ERBB2 positivity ranges enormously from <1% to >50%.6, 7, 8, 9, 12, 13, 14, 15, 16, 17, 18, 19, 20 The aim of the present study was to develop a validated ERBB2 scoring system for colorectal cancer with the goal of identifying ERBB2-positive patients suitable for enrollment in the HERACLES trial.

Materials and methods

Study Design

This was a two-step study. Step 1 was conducted on formalin-fixed paraffin-embedded archival samples (archival test cohort). A consensus panel of three pathologists (i) defined, by similarity with breast and gastric cancers, the technical protocols of assessment for two immunohistochemistry staining protocols and two in situ hybridization methods; (ii) appointed one member of the consensus panel (MG) to review all samples; (iii) established the criteria by which samples for collegial review were to be selected by the appointed pathologist for consensus revision; (iv) collegially read and discussed the characteristics of the selected samples during a day-long consensus session, and (v) as a result of the consensus review formulated a diagnostic algorithm for ERBB2 positivity in colorectal cancer, referred to as HERACLES Diagnostic Criteria. Step 2 was conducted on samples from KRAS 12/13 wild-type metastatic colorectal cancer patients prospectively screened for the HERACLES trial (clinical validation cohort). A centralized pathology laboratory (Niguarda Cancer Center, Milan, Italy) processed all samples, including those already tested for ERBB2 at HERACLES participating centers. The study was conducted in accordance with the Declaration of Helsinki and Good Clinical Practice. Informed consent, allowing the use of the patient’s surgical/bioptical specimen for diagnostic purposes, was available for each archival sample of the archival test cohort, while for the screening (clinical validation cohort), patients signed a protocol-specific informed consent approved by independent Ethical Committees.

ERBB2 Status

ERBB2 expression analysis by immunohistochemistry was performed manually using the HercepTest antibody (Dako A/S, Glostrup, Denmark) and automatically on the BenchMark Ultra system using the VENTANA 4B5 antibody, following the manufacturers’ instructions in both cases.

ERBB2 amplification analysis by FISH was performed with a PathVysion HER-2 DNA Probe Kit (Abbott Laboratories, Des Plaines, IL, USA) and SISH with a VENTANA 4B5 Inform HER2 dual-color assay on the BenchMark Ultra system (Inform HER2 DNA dual-color assay; Roche Tissue Diagnostics, VENTANA Medical Systems, SA). Scoring and evaluation for in situ hybridization were performed by counting ERBB2 and CEN17 signals from 100 nuclei per case. Non-tumor tissue (normal colon mucosa) was used as an internal negative control. Samples with an ERBB2/CEN17 ratio ≥2.0 were considered amplified. Images were captured with the Axiovision software using an Axio Zeiss Imager 2 microscope for IHC and SISH and the ISIS Metasystems software using an Axio Zeiss Imager Z1 microscope for FISH.
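The amplification call described above can be sketched in a few lines. This is only an illustration: it assumes the ratio is computed as total ERBB2 signals divided by total CEN17 signals over the counted nuclei, which the text does not state explicitly, and the names `NucleusCount`, `erbb2_cen17_ratio`, and `is_amplified` are hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence

# Cutoff taken from the text: a ratio >= 2.0 is considered amplified.
AMPLIFICATION_RATIO_CUTOFF = 2.0


@dataclass(frozen=True)
class NucleusCount:
    """Signal counts for one nucleus (hypothetical record type)."""
    erbb2_signals: int
    cen17_signals: int


def erbb2_cen17_ratio(nuclei: Sequence[NucleusCount]) -> float:
    """Ratio of summed ERBB2 signals to summed CEN17 signals.

    Assumption: the ratio is taken over totals across all counted
    nuclei (100 per case in the study), not averaged per nucleus.
    """
    total_erbb2 = sum(n.erbb2_signals for n in nuclei)
    total_cen17 = sum(n.cen17_signals for n in nuclei)
    if total_cen17 == 0:
        raise ValueError("no CEN17 signals counted; ratio is undefined")
    return total_erbb2 / total_cen17


def is_amplified(nuclei: Sequence[NucleusCount]) -> bool:
    """Apply the >= 2.0 cutoff to the computed ratio."""
    return erbb2_cen17_ratio(nuclei) >= AMPLIFICATION_RATIO_CUTOFF
```

A case with six ERBB2 and two CEN17 signals in every nucleus would give a ratio of 3.0 and be called amplified under this sketch; a case with two of each would give 1.0 and would not.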

Analysis

A sample was considered evaluable for review when all test results were present. At least one evaluable sample was required to consider a patient evaluable. In the case of multiple samples, the highest immunohistochemistry value defined the patient score. In step 1, scoring procedures for immunohistochemistry and in situ hybridization took into account the standard scoring systems for breast and gastric cancers.21, 22, 23 In the first analysis, three staining parameters, i.e., pattern of membranous reactivity, intensity of reactivity, and percentage of immunoreactive cells, were combined as shown in Figure 1a. Only samples with an ERBB2/CEN17 ratio ≥2 or staining equivocal (2+) or positive (3+) in at least 10% of cells were selected for review, together with a small set of samples scoring immunohistochemistry 0/1+ as negative controls. In step 2, only the VENTANA 4B5 and FISH kits were used for ERBB2 determination. All samples were centrally scored according to HERACLES Diagnostic Criteria.
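The evaluability, patient-score, and review-selection rules in this paragraph can be illustrated with a minimal sketch. It is a simplification under stated assumptions: each sample carries a single immunohistochemistry result and a single in situ hybridization ratio (the study used two methods of each), the random negative controls are not modeled, and the `Sample` type and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Sample:
    """One tumor sample (hypothetical, simplified record)."""
    ihc_score: Optional[int]          # 0, 1, 2 or 3; None if the test is missing
    percent_stained: Optional[float]  # % immunoreactive cells; None if missing
    ratio: Optional[float]            # ERBB2/CEN17 ratio; None if missing


def is_evaluable(sample: Sample) -> bool:
    """A sample is evaluable only when all test results are present."""
    return None not in (sample.ihc_score, sample.percent_stained, sample.ratio)


def patient_score(samples: Sequence[Sample]) -> Optional[int]:
    """Highest IHC score among evaluable samples; None if none is evaluable."""
    scores = [s.ihc_score for s in samples if is_evaluable(s)]
    return max(scores) if scores else None


def selected_for_review(sample: Sample) -> bool:
    """Ratio >= 2, or 2+/3+ staining in at least 10% of cells."""
    if not is_evaluable(sample):
        return False
    amplified = sample.ratio >= 2.0
    stained = sample.ihc_score >= 2 and sample.percent_stained >= 10.0
    return amplified or stained
```

Returning `None` for a patient with no evaluable sample keeps "not evaluable" distinct from a true score of 0.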

Figure 1

Initial criteria for ERBB2 determination of immunohistochemistry scores (0/1+/2+/3+) (a); photomicrography of typical ERBB2 protein expression (VENTANA 4B5 and HercepTest) (b); and ERBB2 gene amplification (fluorescent (FISH) or silver in situ hybridization (SISH)) (c) in colorectal cancer archival samples.

We used percentages for qualitative variables and mean and s.d. for quantitative variables. Receiver operating characteristic (ROC) curves were used to determine test performances for each immunohistochemistry method; accuracy, sensitivity, specificity, and positive and negative predictive values were calculated in the two series using in situ hybridization as the gold standard and a cutoff for immunohistochemistry positivity as defined by the HERACLES Diagnostic Criteria.
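For readers less familiar with these measures, the sketch below shows how they follow from a 2×2 table of immunohistochemistry calls against the in situ hybridization reference. The counts are purely illustrative and are not taken from the study tables; `ConfusionCounts` and `performance` are hypothetical names.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """IHC call versus in situ hybridization (reference standard)."""
    tp: int  # IHC positive, amplified
    fn: int  # IHC negative, amplified
    tn: int  # IHC negative, not amplified
    fp: int  # IHC positive, not amplified


def _safe_div(numerator: int, denominator: int) -> float:
    """Return NaN instead of raising when a denominator is zero."""
    return numerator / denominator if denominator else math.nan


def performance(c: ConfusionCounts) -> dict:
    """Standard diagnostic performance measures from a 2x2 table."""
    total = c.tp + c.fn + c.tn + c.fp
    return {
        "accuracy": _safe_div(c.tp + c.tn, total),
        "sensitivity": _safe_div(c.tp, c.tp + c.fn),
        "specificity": _safe_div(c.tn, c.tn + c.fp),
        "ppv": _safe_div(c.tp, c.tp + c.fp),
        "npv": _safe_div(c.tn, c.tn + c.fn),
    }


# Illustrative counts only (assumption, not study data).
example = ConfusionCounts(tp=13, fn=1, tn=14, fp=2)
metrics = performance(example)
```

Sensitivity and specificity depend only on the test, whereas the predictive values also shift with the prevalence of amplification in the cohort being tested.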

Results

Archival Test Cohort (Step 1)

The characteristics of the 359 patients with colorectal adenocarcinoma who provided 482 tumor samples for the archival test cohort and the Study Consort are reported in Supplementary Figures S1A and B. Of the 482 samples, 66 (14%, derived from 49 patients) were from KRAS exon 2 mutated cancers. Overall, 134 samples (22 KRAS mutated) could not be assessed with all planned procedures because of insufficient tumor material (N=36, 27%), initial technical problems with the first SISH method used (INFORM HER2-VENTANA, N=94, 70%) and sub-optimal tissue preparation (N=4, 3%), leaving 348 evaluable samples (44 KRAS mutated) (Supplementary Figure S1B). Fifty-eight percent of the evaluable KRAS wild-type samples were from primary cancers, 13% from metastatic sites, and 29% from both primary and metastatic sites (Supplementary Figures S1C and D).

Immunohistochemistry and SISH results are reported in Table 1. Of the 44 KRAS mutated samples (Table 1A), none was ERBB2 amplified or expression positive (3+). Equivocal (2+) staining was observed in six samples with VENTANA 4B5 and one sample with HercepTest, leaving 97% HercepTest and 86% VENTANA 4B5 negative samples. In KRAS wild-type samples (Table 1B), immunohistochemistry positivity (3+) with strong circumferential membrane immunostaining was observed in 14 cases (4.6%) with HercepTest and 16 cases (5.3%) with VENTANA 4B5. Equivocal staining (2+) with weak-to-moderate circumferential basolateral or lateral immunoreactivity was seen in >10% of cells and was more frequent with VENTANA 4B5 (N=23, 7.6%) than with HercepTest (N=3, 1%). VENTANA 4B5 did not stain (0) or faintly stained (1+) 87% of samples (0, N=170, 1+, N=95; total N=265); none of these samples was amplified by SISH. HercepTest stained negative (0) in 74% of samples (N=226), including one SISH-positive sample (staining 3+ with VENTANA 4B5) and faintly stained (1+) the remaining 61 samples (20%), none of which was amplified. The comparative analysis between immunohistochemistry and SISH results, used as the ‘gold standard’, is also reported in Table 1. Overall, ERBB2 amplification in KRAS wild-type cases was observed in 17 samples (5.6% of samples, corresponding to 5.1% of patients), and of these 16 (94.1%) scored positive (3+) and 1 equivocal (2+) with VENTANA 4B5. With HercepTest, 14 (82.4%) were positive (3+), 2 equivocal (2+) and 1 negative (0).

Table 1 Results of the immunohistochemistry test with HercepTest or VENTANA 4B5 and silver in situ hybridization (SISH)

Concordance between samples from primary and metastatic tumor sites in the same patients could be assessed for 95 paired samples from 47 patients (1 patient had two metastatic site samples, Supplementary Table S1). There were four ERBB2-amplified cases (8.5%): VENTANA 4B5 stained all four paired samples positive (3+), while HercepTest stained only three of the pairs positive. With HercepTest, the remaining amplified case scored equivocal (2+) in the primary and predominantly negative (0) in the paired liver metastasis. This latter case, however, was also the only one showing clear intra-sample heterogeneity out of the 17 ERBB2-amplified cases, with areas of strong 3+ reactivity to VENTANA 4B5 coexisting with areas of no (0) or faint reactivity (1+) corresponding to ERBB2-amplified and non-amplified tumor areas, respectively, in both the primary and the metastatic tumors (Figure 2).

Figure 2

Atypical pattern of heterogeneous ERBB2 expression and amplification in colorectal cancer (a–c) and liver metastasis (d–f) from a single patient (#01114) clearly shows clusters of tumor cells with different immunoreactivity ranging from 0 to 3+, corresponding to areas of high (arrows) and no amplification. (a, b, d, e) ERBB2 protein expression by VENTANA 4B5 at ×4 (a, d) and ×20 (b, e). (c, f) ERBB2 gene copy number by fluorescent in situ hybridization at ×63 magnification.

Consensus Panel of Pathologists Review

The Consensus Panel of Pathologists reviewed only cases that were informative for all three analyses. Although a formal technique, such as Delphi, was not used to reach consensus, the three pathologists reviewed 30 positive (3+) or equivocal (2+) cases and a random selection of six 0/1+ cases, derived in total from 25 patients; these were read at multi-head microscopes and discussed until a consensus was reached. According to the Consensus Panel of Pathologists, all ERBB2-amplified cases examined (N=14) showed a typical immunoreactivity pattern consisting of circumferential or basolateral or lateral immunohistochemistry staining of the cancer cells, resulting in highly homogeneous and intensely stained tumor areas, with normal mucosa or liver tissue background of moderate intensity. Cytoplasmic staining of the cancer cells was not considered. The Consensus Panel of Pathologists agreed that the percentage of immunoreactive tumor cells within each sample significantly differed between VENTANA 4B5 and HercepTest (Figures 3a and b). With VENTANA 4B5, 93% (N=13) of the 14 amplified samples scored immunohistochemistry positive in ≥50% of cells, mostly clustering at or above the 90% cellularity mark, while non-amplified positive (3+) or equivocal samples (2+) clustered below the 30% cellularity mark. Interestingly, the only sample with less than a 50% cellularity score was the sample with high intra-sample heterogeneity shown in Figure 2. In contrast, HercepTest-positive (3+) amplified samples were widely scattered, with almost half of the cases showing ≤50% cellularity (one quarter showing <30%). The performances of HercepTest and VENTANA 4B5 methods were calculated considering SISH as the reference gold standard.
Both immunohistochemistry tests showed excellent accuracy with ROC analysis, with VENTANA 4B5 marginally more accurate (VENTANA 4B5: area under curve: 0.98; 95% confidence interval (CI), 0.95–1.0; HercepTest: area under curve: 0.95; 95% CI, 0.85–1.0; P=0.63; Supplementary Figure S3). Because of the limited number of cases, it was not possible to calculate the best cutoff value within separate immunohistochemistry scores. Therefore, we conservatively combined equivocal (2+) and positive (3+) staining samples and used two different empirical cutoffs for percentages of immunoreactive cells (≥10% or ≥50%). The best VENTANA 4B5 performance was observed with a 50% cellularity cutoff, showing 96.7% accuracy, 100% sensitivity, and 94.1% specificity (Table 2). At the same cutoff, HercepTest was more specific (100%) but considerably less sensitive (71.4%). VENTANA 4B5 always slightly outperformed HercepTest, with the latter’s best performance, observed at a 10% cellularity cutoff, showing 90% accuracy, 92.9% sensitivity, and 87.5% specificity. False negatives were only present when HercepTest was used. False positive rates with VENTANA 4B5 were 30% and 3.3% with cellularity cutoffs of 10% and 50%, respectively. The Consensus Panel of Pathologists also evaluated SISH performance in comparison to FISH by blindly reading 30 paired slides randomly selected from both the positive/equivocal (N=23) and the negative (N=7) immunohistochemistry samples. Concordance was 100% (data not shown) with example patterns shown in Figure 1b. At the end of the review process, the Consensus Panel of Pathologists formulated a set of specific criteria for determining ERBB2 positivity in colorectal cancer, referred to as HERACLES Diagnostic Criteria.
The assessment of ERBB2 positivity in colorectal cancer according to these criteria is a two-tier process whereby locally assessed samples, if negative (0/1+) or equivocal (2+) in <50% of cells, are excluded from further testing, while centralized immunohistochemistry and in situ hybridization re-testing is carried out according to specific staining intensities and cellularity cutoffs as reported in Table 3.

Figure 3

Archival test cohort: consensus panel results: scatter plot of the immunohistochemistry score and percentage of immunoreactive cells obtained with VENTANA 4B5 (a) or HercepTest (b) versus amplification assessed with silver in situ hybridization (SISH; black triangles).

Table 2 Performances of two immunohistochemistry methods against in situ hybridization as the 'gold standard' for detection of ERBB2 gene amplification
Table 3 Consensus panel recommendations on ERBB2 scoring for colorectal cancer (HERACLES Diagnostic Criteria)

Clinical Validation Cohort (Step 2)

We screened 830 KRAS wild-type colorectal cancer patients according to HERACLES Diagnostic Criteria. ERBB2 overexpression (2+/3+) and amplification prevalence rates are reported in Table 4 together with the summarized archival results for comparison. The prevalence of overexpressed cases significantly decreased from the archival to the screening cohort, from 13.7 to 8.4%, respectively (P=0.02). In contrast, the prevalence of confirmed ERBB2 amplification was about 5% in both series (5.1 vs 5.2%). Interestingly, all immunohistochemistry-positive (3+) cases were amplified in both cohorts, while the percentage of amplified equivocal (2+) tumors increased from 4.3 to 27% from the archival to the screening cohort, respectively. VENTANA 4B5 test performances in the screening compared with the archival data sets using the ≥50% cellularity cutoff were identical for accuracy (96.7%), sensitivity (100%), and negative predictive value (100%). Specificity was higher in the screening data set than in the test data set (96.6 vs 94.1%, respectively), while, conversely, the positive predictive value decreased from 93 to 61% (Supplementary Table S2).

Table 4 Prevalence of ERBB2 expression and amplification in KRAS exon 2 WT patients in the archival and screening cohorts

Discussion

While exploring ERBB2 as an actionable therapeutic target in colorectal cancer in the HERACLES trial, by assessing the activity of dual ERBB2 inhibition with trastuzumab and lapatinib, we investigated the prevalence of ERBB2 overexpression and amplification in more than a thousand prevalently metastatic colorectal cancer patients. The aim of the study was threefold.

First, a panel of ERBB2 expert pathologists established that breast and gastric criteria for ERBB2 positivity determination could also be made suitable to score ERBB2 accurately and reproducibly in colorectal cancer. Technically, the Consensus Panel of Pathologists selected VENTANA 4B5 over HercepTest for protein expression determination because of the lack of false negatives with the former. On the other hand, VENTANA 4B5, which recognizes both ERBB2 and ERBB4, had a higher false positive rate than the HercepTest, which uses an ERBB2-specific polyclonal antibody. This could be related to a cross-reaction of VENTANA 4B5 to ERBB4, or to operator-dependent factors resulting from the stronger retrieval processing procedure and the darker chromogen of the method. Given the relatively low prevalence of ERBB2-positive cases in colorectal cancer, the panel chose sensitivity over specificity to maximize the identification of potential ERBB2-positive patients. The panel, recognizing that SISH analysis is more adaptable to processing of large sample batches (as in the archival collection) while FISH is more suitable for one-by-one screening, and because of the nearly perfect concordance between these two methods, also selected FISH analysis to determine gene amplification in the clinical validation cohort.

Second, we defined an ERBB2 diagnostic algorithm, referred to as HERACLES Diagnostic Criteria. It focused on ERBB2 amplification, as preclinical data supporting the HERACLES trial suggested amplification and not only overexpression as the predictive marker for response to anti-ERBB2 treatment.10 Within this algorithm, to minimize false positives owing to unspecific staining, we disregarded cytoplasmic ERBB2 expression,19, 24 as in breast cancer only membrane-bound ERBB2 expression is associated with ERBB2 gene amplification.19 As in breast cancer, in our colorectal cancer series, ERBB2 overexpression associated with gene amplification was also already clearly membrane bound at low magnification. Interestingly, normal colon mucosa generally stained with higher intensity than both the normal ductal epithelium and the normal gastric epithelium (Supplementary Figure S2). In breast and gastric cancer, the background staining noise can lead to reading biases, ultimately impinging on the concordance between ERBB2 immunohistochemistry and in situ hybridization.25, 26 The presence of a moderately intense normal background might thus also have contributed to the scoring of the false negative case observed with the HercepTest in our test cohort. Normalization protocols to deal with this issue in breast cancer have been both suggested25 and highly criticized.27 No formal normalization protocol was used in the present study, but the normal mucosa reactivity was a factor constantly considered by the pathologist during the scoring process, and the main reason why centralized testing has been recommended.

Third, to have a better estimate of the number of patients needing to be screened in order to achieve the intended sample size, we determined the prevalence of ERBB2 amplification in a representative population from the catchment area of the planned trial. In our study, the prevalence rate for ERBB2 amplification in KRAS wild-type samples was almost identical, i.e., 5.1 vs 5.2% in the archival test and in the clinical screening cohorts, respectively, thus prospectively validating the HERACLES Diagnostic Criteria. ERBB2 overexpression rates were also almost identical for positive (3+) cases (4.7 vs 4.0%), whereas equivocal (2+) cases halved from the test to the validation cohort (9.0 vs 4.5%, respectively), suggesting a learning curve by local and central pathologists.

ERBB2 overexpression and amplification rates in colorectal cancer range widely owing to differences in technical approaches, antibodies, scoring protocols, cellular localization, and cellularity cutoffs.6, 7, 8, 9, 12, 13, 14, 15, 16, 17, 18, 19, 20

The present study is unique because the only study with an equally large sample size, albeit without clinical validation, reported much lower rates of both ERBB2 expression (2.7%) and positive amplification (1.6%).8 This difference very likely resides in the fact that while our findings have been obtained on a primarily metastatic population, Heppner et al.8 tested only primary tumors using a different staining antibody (SP3) and a different in situ hybridization method (CISH), and employed gastric cancer criteria to test and score the samples. Interestingly, the characteristics of ERBB2 positivity in our colorectal cancer samples are more similar to those of breast cancer than to those of gastric cancer.3, 22, 28 ERBB2 staining in gastric cancer is not directly correlated to ERBB2 amplification, with up to 30% of ERBB2 amplification-positive cases showing only focal staining or diffuse staining in <30% of tumor cells.24 On the contrary, ERBB2 protein expression and gene amplification in our colorectal cancer samples tallied quite accurately. None of the ERBB2 expression-negative colorectal cancers was amplified, whereas all immunohistochemistry-positive colorectal cancers were, leaving only the equivocal staining tumors in a grey zone. In addition, the cellularity of ERBB2 amplification is quite homogeneous, with all positive cases displaying the amplification in >50% of cells. Another key point is intra-sample heterogeneity. In gastric cancer, areas strongly expressing ERBB2 at low magnification are often intermixed with areas displaying a much weaker membranous signal, and are only discernible at medium or high magnification corresponding to superimposable areas of high and no or patchy amplification.21 A similar pattern was only observed in 1 out of the 53 amplified cases in our colorectal cancer series, precisely in the one case also showing high heterogeneity between primary tumor and liver metastasis samples.
On the other hand, ERBB2 membrane distribution in equivocal staining colorectal cancer samples is preferentially basolateral (U shaped) as in gastric21 but not in breast cancer.4, 28 However, >70% of colorectal cancer equivocal (2+) cases are not amplified and those that are amplified tend to show a slightly more patchwork-type intensity. Variability in the interpretation of immunohistochemistry staining and in situ hybridization results is to be expected in colorectal cancer, as testified by the still ongoing controversies on ERBB2 assessment in breast and gastric cancers, despite the continuous evolution of scoring guidelines.21, 23, 29, 30

The crux of the matter, however, to ensure accurate identification of patients that might benefit from ERBB2-targeted therapy, across all tumor types, is whether the level of ERBB2 expression tallies or not with the level of ERBB2 amplification. In breast cancer, the level of ERBB2 protein is a known predictive factor of response to trastuzumab in metastatic disease3 and of complete pathological response to the neoadjuvant therapy combination of lapatinib and trastuzumab.31 Surprisingly, however, the level of amplification has been proven so far to be truly outcome predictive only for neoadjuvant therapy.32, 33, 34 With gastric cancer in the ToGA trial, a trastuzumab-based regimen was found more effective in patients with tumors highly expressing the ERBB2 protein.4 A positive correlation between levels of gene amplification and overall survival after trastuzumab-based therapy has also been recently established.28 Early results from the HERACLES trial suggest that in colorectal cancer ERBB2 is a positive predictive marker for anti-ERBB2 targeted therapy.35

In conclusion, in a series of 1086 colorectal cancers, we optimized a diagnostic decisional algorithm referred to as HERACLES Diagnostic Criteria for ERBB2 protein overexpression and documented that ERBB2 amplification occurs in 5% of the population. This study is the first clinical step toward a paradigmatic precision medicine path, initiated by functionally validating ERBB2 as a therapeutic target in patient-derived xenografts from anti-EGFR therapy-resistant, KRAS wild-type metastatic colorectal cancer patients. Data from this study have provided important information on the stratification criteria for the HERACLES trial, a phase II clinical trial testing the combination of trastuzumab and lapatinib in metastatic colorectal cancer patients resistant to cetuximab or panitumumab.