Main

Biomarkers have been historically considered analytes measured in the blood/sera to determine systemic events. Identification of biomolecules in tissues can have more value than circulating biomarkers as they are accompanied by spatial information, they are closer to the ‘action,’ and they carry contextual information. Often, the context (or its absence) defines the results and validity of the assay (eg, a transcription factor localized to the nucleus). In tissues, the coexistence of multiple cell types in different functional states is a rich source of potential data. This complexity is even more pronounced in biomarker studies of tumor tissues with altered biological composition and frequent aberrant expression of molecules. For example, identification of integral membrane proteins or mRNAs in the cell nucleus, or of transcription factors in the cytoplasm, may carry biological information about function that can be inferred from localization.

In the clinical diagnostic setting, the vast majority of usage of immunohistochemistry (IHC) is not measurement but binary assessment of the contextual information of the biomarker.1 IHC has also been used for measurement. The ability to estimate the level of expression of a given marker within a specific tissue compartment (HER2 in the membrane of breast cancer epithelial cells) has led to assays that have gained FDA approval and to prescription of drugs to subsets of cancer populations that could not be achieved by assays where tissue is ground up or assays where analytes are measured in blood.

Here, we examine the IHC assay and extensions of this assay (quantitative immunofluorescence (QIF)) for measurement of diverse analytes in tissue. We describe the methods for in situ measurement using chromogens or fluorophores and the advantages and disadvantages of each. We also describe methods for quantification of these biomolecules and a vision for translation of these methods to the clinical CLIA lab setting.

TISSUE BIOMARKER SIGNAL DETECTION SYSTEMS

Chromogenic Staining

Chromogens are molecules that allow detection of a target using enzyme-based precipitation reactions. They are used in IHC as they allow visualization of the immune complex (and hence the antigen) in the context of tissue architecture. Hematoxylin, the blue component of the hematoxylin and eosin stain, binds to negatively charged molecules (predominantly nucleic acids) and provides a counterstain for the chromogen. Different chromogenic compounds are commercially available in a range of colors.2 The most widely used, 3,3’-diaminobenzidine (DAB), is a highly thermochemically stable polybenzimidazole that provides brown-colored staining.3 The chromogen deposition occurs through a redox reaction4 catalyzed by an enzyme conjugated to an antibody or oligonucleotide detection scaffold.5, 6 This allows direct, bright-field light microscopy assessment of spatial distribution and quantity of a target in counterstained slide preparations.

Optimal chromogenic staining relies on the deposition of a sufficient amount of substrate to block light.7 In the case of DAB, a ‘desirable’ image is produced when the deposition of substrate leads to an absorbance of 1–2 units. This means that 90 to 99% of the light signal is blocked. Although this creates a contrast that is easy to read, it hampers the use of multiple colocalized chromogens on routine assays. Still, different colored chromogens may be used simultaneously to recognize the presence of two different targets and determine their relationship to each other. Chromogens have a dynamic range of nearly one log and are not compatible with in vivo imaging. However, chromogenic-based assays are widely used in biosciences and anatomic pathology because of their ability to localize the antigen in a familiar morphological context, easy interpretation, and simple equipment requirements.
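The relationship between absorbance and blocked light quoted above follows from the standard definition of absorbance, A = -log10(I/I0). The short sketch below only illustrates that arithmetic; the function names are ours and are not taken from any imaging package.

```python
def transmitted_fraction(absorbance: float) -> float:
    """Fraction of incident light that passes through, from A = -log10(I/I0)."""
    return 10 ** (-absorbance)


def blocked_percent(absorbance: float) -> float:
    """Percentage of incident light blocked at a given absorbance."""
    return 100.0 * (1.0 - transmitted_fraction(absorbance))


# Absorbance range described in the text for a 'desirable' DAB image.
for a in (1.0, 2.0):
    print(f"A = {a:.0f}: about {blocked_percent(a):.0f}% of the light is blocked")
```

Because each additional absorbance unit removes another 90% of the remaining light, very little signal is left to discriminate intensities once a stain approaches 2 units.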

Fluorescent Staining

Fluorescent reporters are widely used as labels in biology and medicine. They are molecules capable of absorption and emission of light at different wavelengths. Absorption of light results in a transition from ground- to excited-electronic state. Then the internal relaxation of the excited state results in radiative decay that emits light (photons), usually at a longer wavelength than the absorption peak.8 Various organic molecules, such as xanthenes, cyanines, and Alexa® dyes,9 are commercially available and encompass a wide excitation/emission spectrum from ∼350 to 800 nm. Advances in nanomaterials have generated new types of inorganic fluorescent molecules with superior photophysical characteristics. Among them, quantum dots10 are luminescent, nanometer-sized semiconductor crystals that possess high quantum yield, narrow emission wavelength, and high resistance to photobleaching and chemical degradation. Covalent binding of functional groups11 has allowed their application to immune- or oligonucleotide-based assays using the avidin/streptavidin–biotin reaction. However, their size, hydrophobicity, and specific solvent requirements have limited their use in immunofluorescent applications.

Optical detection of molecules using fluorescence depends on photon emission. Available fluorophores that emit in the visible region of the spectrum possess a broad dynamic range, 2 to 2.5 logs, that makes them good for both visualization and quantification.9 Simultaneous measurement of multiple fluorescent dyes is extensively used for in vitro and in vivo assays, as well as in archival tissues. For appropriate results, the user must select combinations of dyes that have distinct, non-overlapping emission spectra. The limit to the number of dyes used in combination is a function of the emission bandwidth, commonly limiting routine use to 4 or 5 fluorophores synchronously. Postprocessing of the signal to remove overlapping signal or spectral unmixing12 can expand the number of multiplexed fluorophores to 6–8 or more (Richard Levenson, personal communication). New inorganic fluorescent nanomaterials may further increase the number of detectable combinations because of their narrower spectra.

Signal Amplification Systems

Amplification systems aim to increase the ease of detection of the signal by increasing the number of chromogenic or fluorescent molecules associated with each epitope. The first breakthrough involved using enzymatic amplification. When an enzyme is linked to the Fc region of an antibody, cyclic enzymatic activity results in multiple deposition events so that a single molecule of enzyme might cleave thousands of molecules of substrate, resulting in at least 3–4 log amplification.13, 14, 15 The most common methods for enzymatic amplification are the peroxidase (horseradish peroxidase (HRP)) and phosphatase (alkaline phosphatase (AP)) reactions.5, 16 Another approach is the biotin–avidin method. Here antibodies are biotinylated, and each avidin molecule can bind up to 4 biotins.17 The avidin may be directly linked to a fluorophore or chromogen or, more commonly, to an enzyme, thus further amplifying the enzymatic system.18, 19 Another example of additive amplification can be achieved by the use of long-chain polymers conjugated with HRP.20, 21 Incubation with a primary antibody is followed by binding of a secondary antibody conjugated to dextran or similar polymeric backbone that is then conjugated to 100 or more HRPs. Even further signal intensification can be accomplished through the deposition of tyramine compounds. Oxygen free radicals, liberated by peroxidase enzymatic reaction, result in crosslinking of tyramine to nearby proteins. The tyramine that is conjugated to biotin or directly to fluorophore provides another enzymatic amplification step to enhance visualization of antibody–antigen interaction.13 In each case, these amplification systems are typically used to saturation. That is, sufficient substrate is provided such that the amount of conjugated enzyme is the limiting factor for the deposition of signal (chromogen or fluorophore).

Some newer methods involve the use of DNA amplification to increase the signal. Rolling circle amplification (see below)22 uses a ligation reaction to form circular DNA or simply conjugation of circular DNA to a secondary antibody. This is then followed by addition of a short, complementary DNA primer and an enzyme to produce a single-stranded concatameric DNA molecule composed of thousands of copies of the original circles. Fluorophore-labeled oligonucleotide detectors are then used to bind the single-strand sequences. It has largely been used for nucleotide sequence detection, although it has been adapted to immune-based assays.23

Multiplexed Biomarker Detection Assays

Simultaneous interrogation of multiple targets in a single sample (ie, target multiplexing) provides information on tissue distribution, marker colocalization, and synchronous-level quantitation. This approach has been extensively applied on immune-based assays24, 25, 26, 27 and requires a combination of primary antibodies from different species together with species and isotype-specific secondary antibodies or other methods to prevent secondary antibody crossreactivity. The concurrent use of chromogenic reporters requires deposition of different substrates and is limited by the ability to distinguish different colors on routine light microscopy. It is also limited by the physics of the process. That is, if the first chromogen absorbs 95% of the light, only 5% is left for second or subsequent chromogens, limiting the multiplexing capacity. Fluorescent compounds are better suited for this purpose as they use emitted light rather than light absorption. However, fluorophores are limited by the overlapping photon emission spectra.
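The light budget that limits chromogenic multiplexing can be made concrete with a small calculation. This is a simplified sketch that assumes the chromogens are colocalized in the same light path and absorb independently, so that transmitted fractions multiply; the function name is illustrative only.

```python
def light_remaining(blocked_fractions):
    """Fraction of incident light left after passing through stacked chromogens.

    Assumption: each chromogen independently blocks the given fraction of the
    light that reaches it, so the transmitted fractions multiply.
    """
    remaining = 1.0
    for blocked in blocked_fractions:
        if not 0.0 <= blocked <= 1.0:
            raise ValueError("blocked fractions must lie between 0 and 1")
        remaining *= 1.0 - blocked
    return remaining


# A first chromogen blocking 95% of the light leaves 5% for any later chromogen.
print(light_remaining([0.95]))
# A second, colocalized chromogen blocking half of what is left halves it again.
print(light_remaining([0.95, 0.50]))
```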

These problems can be addressed to some extent by either spectral unmixing or cycling of fluorescent substrates. Spectral unmixing7, 12, 28, 29 refers to the pixel-by-pixel determination of the relative contribution of each spectrum to the overall signal intensity, providing the means for marker colocalization determination.30 This approach also allows postprocessing of the signal to unmix or subtract fluorophores or chromogens to allow visualization with better signal-to-noise ratio. Postprocessing can also be used to pseudo-color the resulting image to provide a more familiar appearance for fluorescent images.31, 32 Cycling or sequential staining and capturing of fluorophores provides another approach to high-level multiplexing. Gerdes et al33 used sequential fluorescent staining and alkaline quenching, resulting in imaging of up to 61 targets in a single tissue sample. They also show consecutive staining of the same protein for up to 10 cycles. However, this method is limited by the fact that some of the markers showed decreased detection sensitivity as a result of the dye inactivation process, thus limiting linear quantification.34
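As an illustration of the pixel-by-pixel idea behind spectral unmixing, the sketch below performs a plain linear least-squares unmixing of one pixel into two fluorophores. It is a minimal example under stated assumptions: the reference spectra and pixel values are hypothetical, the signal is assumed to be a linear mixture, and no non-negativity constraint or noise model is applied, unlike what commercial unmixing software may do.

```python
def unmix_two(pixel, spectrum_a, spectrum_b):
    """Least-squares abundances (a, b) with pixel ~ a*spectrum_a + b*spectrum_b.

    Solves the 2x2 normal equations directly; all inputs are per-channel lists
    of equal length.
    """
    saa = sum(x * x for x in spectrum_a)
    sbb = sum(x * x for x in spectrum_b)
    sab = sum(x * y for x, y in zip(spectrum_a, spectrum_b))
    spa = sum(p * x for p, x in zip(pixel, spectrum_a))
    spb = sum(p * x for p, x in zip(pixel, spectrum_b))
    det = saa * sbb - sab * sab
    if det == 0:
        raise ValueError("reference spectra are collinear; cannot unmix")
    a = (spa * sbb - spb * sab) / det
    b = (spb * saa - spa * sab) / det
    return a, b


# Hypothetical reference emission spectra sampled in three detection channels.
dye_a = [0.70, 0.25, 0.05]
dye_b = [0.10, 0.60, 0.30]
# Hypothetical pixel built as 200 units of dye A plus 50 units of dye B.
pixel = [200 * x + 50 * y for x, y in zip(dye_a, dye_b)]
print(unmix_two(pixel, dye_a, dye_b))
```

The more the reference spectra overlap, the closer the determinant is to zero and the more noise is amplified in the recovered abundances, which is one way to see why emission bandwidth limits multiplexing.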

MEASUREMENT OF CANCER BIOMARKERS

Measurement of Proteins

In the clinical setting, most pathology laboratories perform in situ protein detection using single-marker chromogenic IHC with primary monoclonal antibodies and secondary antibodies conjugated with polymer-based amplification systems. Typically, diverse areas from one slide are evaluated and a trained observer (eg, a pathologist or researcher) renders an integrated categorical estimation of results. This approach has become the pathology standard because of its simplicity, low cost (eg, requires only a traditional light microscope), and preservation of contextual morphological information. Routine examples of in situ protein biomarkers used in the clinical setting include tissue differentiation markers (eg, keratins, vimentin, S100-protein, CD45, CD3, CD20, CD30, TTF-1, CDX2, BRST-2, HMB-45, RCC, HEPAR1, and so on), microorganism-related proteins (eg, viral antigens (hepatitis, HHV-8, CMV, EBV, BKV, and so on), H. pylori, Klebsiella, Pneumocystis, fungal elements, and so on), and specific anticancer therapeutic targets (ERα, PgR, AR, HER2, ALK, ROS1, and so on). In all these cases the presence of the protein of interest is evaluated in strict correlation with the cell type and compartment where it is detected. Typically, the markers are interpreted in a binary manner as present or absent, a result that supports (and/or rules out) a given diagnosis. However, the output format and threshold for positivity/negativity are marker specific and subjective.35 Efforts to quantify in situ protein signals have been pursued predominantly for anti-cancer therapeutic targets. In these rare cases (ER, PgR, HER2, Ki67, and maybe others at some institutions), semiquantitative scoring systems assess the location, relative intensity, and estimated percentage of positive cells. 
For the breast cancer markers ERα, PgR, and HER2, guidelines have been written by expert panels in attempts to unify and standardize the IHC approach.25, 36, 37, 38 Although it is the current standard, traditional IHC has been slow to adopt more rigorous quantitative methods.

To date, only two areas of anatomic pathology use systematic immunofluorescence (IF) studies for characterization of specific autoimmune/inflammatory disorders (eg, renal pathology and dermatopathology). This is ironic as IF predated IHC in early efforts to visualize proteins in situ. Fluorescence-based IHC or IF is also widely used in research laboratories39, 40 with single-target and multiplexing approaches, using diverse illumination devices (mercury/xenon light sources, LED illumination systems, and laser beams) and imaging modalities (eg, epifluorescence with optical lenses, pinhole-based confocal microscopy, spinning disc-based confocal imaging, multi-photon imaging, and total internal reflection (TIRF) devices). Comprehensive description of each of these modalities and their most common uses is beyond the scope of this essay and has been described elsewhere.41 In general, the major advantages of IF include its broad dynamic range, capability for multiplexing using different fluorescence channels, amenability for colocalization studies, fluorescence energy transfer protocols (FRET), and signal quantification using digital pixel measurements. These methodologies and others have been extensively used in preclinical mechanistic studies in cell and molecular biology, but surprisingly little in clinical labs in anatomic pathology. This is difficult to understand as increased desire to find mechanisms to match patient subsets to drugs brings increasing demands and complexity to biomarker studies. In fact, the absence of rigorous quantitative tests may in part be a cause for failure of some recent biomarker-driven clinical trials.42

During the past decade, developments allowing more accurate and automated signal quantification of IHC- and/or IF-stained slides have become available including the inForm® software (Caliper/Perkin-Elmer), TissueStudio® (Definiens/Leica), Ariol® (Genetix/Leica Microsystems), VIAS™ (Ventana Medical Systems), AQUA® (HistoRx/Genoptix), and ImageScope® (Aperio Technologies/Leica), among others. These platforms use image segmentation and feature extraction-based signal quantification algorithms to measure the signal in selected areas or cells.43, 44 In particular, some of these systems (Ariol, ImageScope, and VIAS) have received FDA approval for clinical use in breast cancer.43, 44 Other systems, including MultiOmyx, inForm, Definiens, TissueGnostics, Visiopharm, and the AQUA method, use multiplexed IF and can measure targets in defined tissue regions, by feature extraction, more complicated spatially defined compartments, or compartments defined by molecular colocalization with specific proteins to generate objective (ie, observer-independent) region-specific scores.43, 44 For example (Figure 1), simultaneous visualization of cytokeratin (tumor cells), CD3 (T lymphocytes), CD8 (cytotoxic T cells), CD20 (B lymphocytes), and DAPI signal (all nuclei) allows characterization and quantification of TILs in the tumor and stromal areas.45, 46 Beyond traditional multiplexing, a cycling approach has recently been described where 60–100 protein targets are identified in the same sample.33, 47 The next generation of multiplexing is also at a very early stage. 
Two groups have reported the ability to multiplex up to 40 (with promises of 100 or more) proteins in FFPE tumor tissues using antibodies labeled with isotopic metals in the lanthanide series followed by detection using secondary ion mass spectrometry.48, 49 These rare earth metals produce a highly distinct signal in mass spectrometry with minimal if any signal overlap, showing the potential to overcome most of the current limitations of multiplexed in situ protein measurement. However, the mass spectrometry-based detection systems are still under development and will require further technological advancements before broad adoption.
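The idea of a compartment defined by molecular colocalization can be sketched in a few lines. The example below is a deliberately simplified illustration and is not the AQUA algorithm or any vendor's method: it assumes a fixed intensity threshold on a mask marker (for example, cytokeratin) defines the tumor compartment and reports the mean target intensity within it. The function name, threshold, and pixel values are hypothetical.

```python
def compartment_score(target, mask_marker, mask_threshold):
    """Mean target intensity over pixels where the mask marker meets a threshold.

    `target` and `mask_marker` are same-shaped 2D lists of pixel intensities.
    Returns None when no pixel falls inside the compartment.
    """
    total = 0.0
    area = 0
    for target_row, mask_row in zip(target, mask_marker):
        for t, m in zip(target_row, mask_row):
            if m >= mask_threshold:  # pixel belongs to the compartment
                total += t
                area += 1
    if area == 0:
        return None
    return total / area


# Hypothetical 2x3 images: cytokeratin defines the tumor mask.
cytokeratin = [[200, 180, 10],
               [5, 220, 8]]
target = [[50, 70, 90],
          [100, 60, 40]]
print(compartment_score(target, cytokeratin, mask_threshold=100))
```

Normalizing the summed signal by compartment area is what makes scores comparable between samples with different amounts of tumor; stromal signal (the pixels below the threshold here) does not contribute.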

Figure 1

Multiplexing targets in FFPE tissues using immunofluorescence. (a) Schematics of the serial multiplexing protocol for simultaneous staining of CD3 (red-colored text), CD8 (green), CD20 (purple), cytokeratin (yellow), and DAPI (blue) in formalin-fixed, paraffin-embedded tissues. The primary antibodies, isotype-specific secondary antibodies, and fluorescence detection system are indicated. (b) Representative low-power microphotographs showing a hematoxylin and eosin-stained preparation of human tonsil (upper left). The same section was stained with the multiplexing TILs protocol indicated in (a) and the fluorescence images in each channel (same magnification) are shown for each marker.

Measurement of RNAs

Until recently, the in situ detection of mRNA using nonradioactive in situ hybridization (ISH) strategies was largely confined to identification of relatively high abundance transcripts, largely for research purposes.50, 51 Similarly, clinical use of in situ RNA detection was limited to identification of highly expressed EBV-associated transcripts (LMP-1 and EBER) using chromogenic ISH to support the diagnosis of some epithelial and lymphoid neoplasms (eg, nasopharyngeal carcinoma, endemic Burkitt’s lymphoma, lymphomatoid granulomatosis, posttransplant lymphomas, and so on). More recently, novel in situ hybridization strategies using increased numbers of hybridization probes per target, in situ target sequence amplification, and novel signal enhancement methods have allowed detection of low-abundance mRNA transcripts in conventional FFPE tissues. Coupling of these methods with sensitive signal measurement/quantification tools has opened new avenues for the use of RNAs as cancer biomarkers.

There are four methodologically distinct platforms for in situ mRNA detection using fluorescence-based signal detection that have the potential to detect single mRNA molecules. They are: (1) paired probe-based ISH assays (RNAscope® and QuantiGene RNAview®); (2) single-tagged multiple probes ISH (Stellaris® assay); (3) locked nucleic acid-based RNA detection (LNA™ probes); and (4) in situ amplification/labeling-based systems (rolling-circle amplification with padlock probes). For information regarding additional RNA ISH protocols and novel DNA ISH methodologies for cancer diagnostics, we refer the readers to comprehensive reviews published elsewhere.50, 52, 53, 54, 55

Perhaps the most prominently used method is the paired probes method for mRNA ISH (also known as Z-probes or branched probes), based on the contiguous hybridization of various pairs of 14–20-nucleotide-long RNA oligonucleotides spanning typically an ∼1 kb area. Each probe is designed with a target-specific sequence, a spacer, and a tail sequence that is recognized by the signal amplification HRP- or AP-based system only when serially aligned with its partner probe (and not with potentially nonspecific, singly bound probes).56 The major advantages of this method are the high-level signal amplification, the noise reduction achieved by the paired Z-probe method, and the facilitation of the parallel use of positive and negative control probes (eg, Ubiquitin C or GAPDH as positive controls; and bacterial DapB as negative control) to determine sample integrity and experimental quality. Figure 2 shows an example of PTEN, UbC, and DapB stained in serial TMA sections and quantified using the AQUA method by multiplexing with pancytokeratin stain to define the tumor compartment. The paired probe system is available in two commercial assays (RNAscope and QuantiGene RNAview), providing a vast array of target probes and possibilities for customized probe design. The two commercial platforms share the paired probe design, but may differ in their signal detection method.

Figure 2

Measurement of PTEN mRNA in FFPE tissues using in situ hybridization with the paired probes assay (RNAscope) coupled to quantitative fluorescence. (a) Representative fluorescence microphotographs showing in situ detection of PTEN mRNA (upper left, red fluorescence channel), Ubiquitin C mRNA (UbC, red channel, middle panel), and DapB mRNA (red channel, right panel) in archival breast cancer specimens. The lower panels show the cytokeratin stain in each tissue sample (green fluorescence channel). UbC was used as positive control for the presence of measurable mRNA and bacterial DapB was used as negative control and noise indicator for each sample and in each run. (b) Chart showing the levels of PTEN mRNA (blue columns), UbC mRNA (red columns), and DapB (green columns) in archival FFPE breast cancer samples. Serial sections from a tissue microarray including samples from 238 breast carcinomas (YTMA128) were stained simultaneously for each mRNA target and with cytokeratin protein. The levels of each marker were measured in the tumor compartment using the AQUA technology and are expressed as arbitrary units of fluorescence (y axis). Only spots including available scores for all three mRNA markers are included in the chart.
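The use of control probes described in this caption can be expressed as a simple filtering step. The sketch below is illustrative only: the spot identifiers and scores are hypothetical, and the quality rule (positive control above negative control) is an assumed example of a run-level check, not the criterion used to build the chart.

```python
MARKERS = ("PTEN", "UbC", "DapB")


def complete_spots(scores, markers=MARKERS):
    """Keep only spots that have a score for every marker."""
    return {
        spot: values
        for spot, values in scores.items()
        if all(values.get(marker) is not None for marker in markers)
    }


def passes_controls(values):
    """Assumed QC rule: the positive control (UbC) must exceed the noise (DapB)."""
    return values["UbC"] > values["DapB"]


# Hypothetical scores in arbitrary units of fluorescence.
scores = {
    "spot_01": {"PTEN": 850.0, "UbC": 2400.0, "DapB": 120.0},
    "spot_02": {"PTEN": 300.0, "UbC": None, "DapB": 110.0},   # missing UbC score
    "spot_03": {"PTEN": 95.0, "UbC": 100.0, "DapB": 130.0},   # control failure
}

complete = complete_spots(scores)
passing = [spot for spot, values in complete.items() if passes_controls(values)]
print(sorted(complete), passing)
```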

The single-label probe mRNA ISH approach was first described targeting each transcript of interest with 30–50 short (17–22 nucleotides long), singly fluorescently labeled RNA oligonucleotides.57, 58 This method allowed simultaneous allocation of many fluorescent molecules to each target transcript and was shown to be sensitive, specific, and suitable for FFPE samples. This method also allowed multiplexed target detection using different fluorescence channels. Although earlier studies have used variations on the single probe theme, none are widely published. More recently, a more comprehensive version of this assay became commercially available as the Stellaris RNA FISH and includes predesigned target oligonucleotides with bound fluorophores as well as an online webtool for personalized probe design. To date, diverse studies have communicated results using this platform in in vitro cell preparations.59, 60, 61, 62, 63, 64, 65, 66 However, to our knowledge, only two reports (from the same researchers) have used the Stellaris FISH assay to interrogate the association between expression of RIP2 and KIF14 transcripts in human breast cancer specimens.67, 68

The locked nucleic acid (LNA)-based RNA in situ detection is based on oligonucleotide probes made with chemically modified nucleotides that increase duplex stability at higher temperatures and improve specificity as compared with conventional RNA probes. The use of anti-digoxigenin HRP-conjugated antibodies after hybridization allows using signal amplification systems to detect the molecules. In particular, digoxigenin double-labeled and relatively short (12–24 nucleotide) LNA probes have been successfully used to detect mRNAs in cells and tissue preparations.69, 70 However, the main use of this technology in human tumors has been to detect and measure microRNAs.71, 72 Using this approach coupled to quantitative fluorescence, we have shown prognostic value of miR-221 in human breast cancer73 and the tumor suppressive role of miR-205 in human melanoma.74 Others have used this method to show expression of small RNAs (microRNAs and lncRNAs) in diverse human tissues and in archival biopsy material from various tumor types including pancreatic, breast, colorectal, nasopharyngeal, and lung carcinomas.75, 76, 77, 78, 79, 80, 81, 82

The padlock probes/rolling-circle amplification system was described nearly a decade ago and was originally used for DNA FISH and genotyping.83, 84, 85 This system has also been used for tissue mRNA visualization.86, 87 Padlock probes are linear oligonucleotides that bind reverse-transcribed cDNA of the mRNA of interest. After hybridization, probes are circularized by high-stringency ligation and the circular DNA padlock probe can act as template for rolling-circle replication steps using DNA polymerase and several amplification steps. The successive amplification generates multiple concatemers including the target sequence and several linker probe sequences. These linker sequences serve then as hybridization sites for fluorescently labeled oligonucleotides that are used to recognize the target. Using initial reverse transcription with LNA primers, this method was successfully used to detect single mRNA transcripts in paraformaldehyde-fixed human and mouse tissues.86 Moreover, the high primer specificity and the high fidelity ligase step also allowed the identification of single-base substitutions of the target transcript. This approach was recently used to identify mRNA mutations and characterize the expression of 39 different transcripts (including 21 targets from the Oncotype DX test) using a novel ligation-based sequencing bar-coding system in fixed frozen sections from human breast tumors.69 This method of in situ mRNA measurements could be performed in cytological imprints and FFPE tissues,88 and an automated platform to analyze this assay has been developed as an ImageJ plugin for signal quantification.88, 89

STANDARDIZATION AND MEASUREMENT IN THE RESEARCH LAB AND IN THE CLIA-CERTIFIED LAB

Quantitative Standardization of Predictive Cancer Biomarkers

Current ASCO/CAP guidelines for the interpretation of estrogen receptor (ER)36 and human epidermal growth factor receptor 2 (HER2, ERBB2)37 consider qualitative, chromogen-based IHC for status determination in breast cancer. Guidelines have been published by the CAP to validate antibodies90 and the FDA has cleared a number of assays for both pathologist-read and semiquantitative analysis of hormone receptors and HER2.91 However, even with protocol-locked robotic stainers and prediluted antibodies, IHC is still subject to considerable variability because of lack of tissue-based standardization92, 93 and subjective pathologist interpretation,94 among other factors. The effect of ‘by-eye’ assay optimization (acceptable to both CAP and the FDA), which is the current standard used to determine the quality of chromogenic staining, is still highly variable. A small study done by placing breast cancer TMAs into the work flow of two separate CLIA labs showed that variability is still present (Figure 3). Although that study was not a rigorous comparison of multiple labs and was limited by the fact that it was done on TMAs, the discordance is concerning and further studies have been proposed. This level of discordance has not been reported in systems read ‘by eye’ but that may represent the inaccuracy of human-based assessment compared with machine assessment. It may also be the result of subtle differences in antibody concentration because, as illustrated by McCabe et al95 and Welsh et al,96 antibody concentration can affect the scoring and the signal-to-noise threshold, thereby potentially changing the apparent expression of a given case. Regardless of the scientific basis, discordance studies are both politically and logistically challenging. To our knowledge, a comprehensive, quantitative assessment of biomarker concordance between multiple institutions has not been done.

Figure 3

Discordance in predictive biomarker assessment between CLIA-certified labs. (a) A TMA with close to 500 spots was put into the daily work flow of two CLIA labs doing estrogen receptor (ER). The spots were then read independently by an author on this work, according to the 2010 ASCO/CAP guidelines (≥1% of cells positive) as part of an effort to determine whether discordance was a function of percentage of cells positive for ER. Note that the overall discordance between the labs, with both using FDA-approved methods on automated staining platforms, is 18.7%. (b) Analysis of discordant spots revealed that they are distributed across the range of percentage of positive nuclei. A limitation of this work is that it does not represent whole tissue sections, but rather single TMA spots.
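An overall discordance figure of the kind reported in this caption can be computed from paired reads as sketched below. This is a minimal illustration under stated assumptions: the per-spot percentages are hypothetical, and a spot is called positive when its percentage of positive cells is at or above a 1% cutoff.

```python
def er_positive(percent_positive_cells, cutoff=1.0):
    """Binary ER call; assumes positivity at or above the cutoff (in percent)."""
    return percent_positive_cells >= cutoff


def discordance_rate(lab_a, lab_b, cutoff=1.0):
    """Fraction of paired spots whose binary ER calls differ between two labs."""
    if len(lab_a) != len(lab_b):
        raise ValueError("both labs must score the same spots")
    if not lab_a:
        raise ValueError("no spots to compare")
    discordant = sum(
        er_positive(a, cutoff) != er_positive(b, cutoff)
        for a, b in zip(lab_a, lab_b)
    )
    return discordant / len(lab_a)


# Hypothetical percentage of ER-positive cells for the same six spots in two labs.
lab_a = [0.0, 0.5, 5.0, 40.0, 90.0, 0.0]
lab_b = [0.0, 2.0, 5.0, 0.0, 95.0, 0.0]
print(f"{100 * discordance_rate(lab_a, lab_b):.1f}% of spots are discordant")
```

Keeping the per-spot percentages rather than only the binary calls makes it possible to ask, as in panel (b), whether discordant spots cluster near the cutoff or are spread across the range.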

A few studies have examined the effects of antibody concentration and staining conditions on cut-point for estrogen receptor. Using a panel of ER-negative and -positive breast cancer cell lines, Welsh et al96 determined a threshold concentration that separated the cell line groups. By using quantitative IF in the same cells, ER concentration was translated into a continuous score. When these results were compared with conventional assessment of chromogenic IHC, 10–20% of ER-positive cases from two large breast cancer cohorts tested QIF positive/IHC negative. In this case, the decreased dynamic range of chromogenic IHC or perhaps low signal overwhelmed by hematoxylin counterstain did not allow detection of patients with lower levels of ER who would benefit from tamoxifen treatment. Assay/antibody selection also plays a role in predicting outcome. Cheang et al97 showed that with SP1, a higher affinity rabbit antibody, the 8% of cases that were discordant (positive for SP1 and negative for 1D5) had outcomes similar to concordant positives. Using a similar approach, Welsh et al98 showed that the ER-positivity threshold was a function of the antibody tested in two retrospective cohorts using the same two validated antibodies. These studies showed that 7–11% of cases were positive using the antibody clone SP1, but not 1D5. It is notable that both of these antibodies are FDA cleared, suggesting FDA clearance is not currently as standardized and reflective of outcome for biomarkers as it is for therapeutics. It is also notable that when tested ‘by eye’ no difference was seen between the antibodies.99

Quantitative in situ assessment of mRNA is less well studied and used much less in the clinic compared with IHC. Before its mainstream usage, the performance, reproducibility, and interassay variation of the in situ mRNA detection strategies will require careful validation. The relatively lower level of expression of mRNAs compared with proteins and the sensitivity requirements for measuring low-abundance transcripts are still a major concern. In addition (and analogous to detection of proteins), the use of different signal amplification and detection systems and the design of probes targeting transcript regions of variable size could affect interassay reproducibility. Future studies will be required to address these points before common usage of in situ RNA measurements as a clinical tool.

Transition from the Research Lab to the Clinical Lab

Evaluation of tissue biomarkers can reveal their prognostic/predictive value and lead to the development of companion diagnostics. The implementation of such tests in clinical practice as a laboratory-developed test (LDT) requires validation of the test, as outlined in Fitzgibbons et al.90 An LDT is an in vitro diagnostic test that is designed, manufactured, and used within a single laboratory. LDTs can be used to evaluate a wide variety of analytes and can range from relatively simple tests to rather complex assays, such as multiplexed detection of numerous biomarkers. Although LDTs are important to the continued development of optimized diagnostic tests, their widespread use and direct impact on clinical decisions has raised concern in the US Food and Drug Administration.100 If the test is intended to be sold as a kit that can be performed in multiple labs, then the test is an in vitro diagnostic (IVD) and requires FDA clearance to be sold in the United States. Whether the test is an LDT or an FDA-cleared IVD, there is still no guarantee that it will be reimbursed. The Centers for Medicare & Medicaid Services (CMS) and other third-party payers make individual decisions on reimbursement based on clinical utility. Clinical utility101 should be distinguished from analytic or clinical validity. Clinical utility refers to the actionable value of the test, as determined by high levels of evidence102 that result in changes in patient care as a function of the outcome of the test. The lack of independent review and evidence for clinical utility of LDTs is one of the FDA’s main concerns. As CLIA labs may produce and sell LDTs without any proof of clinical validity or utility (they only need to establish analytic validity), they may market and sell tests without proven value to the patient. To address this issue, the FDA has issued a draft guidance on future regulation of devices (including diagnostic tests like LDTs). 
They outline a timeline to regulation of LDTs to assure that the tests used in the provision of health care are safe and effective.

LDTs and IVDs in Clinical Trials and in the CLIA Lab

Recent advances in immunotherapy have opened new opportunities to patients with advanced-stage solid tumors. Targeting of co-inhibitory molecules, such as programmed death-1 (PD-1) and its ligand PD-L1, has resulted in unprecedented and lasting responses in patients previously treated with diverse standard therapies.103, 104 IHC-based PD-L1 testing has been considered as a criterion for inclusion in clinical trials, but its evaluation has not been well defined, and assays show wide variability and subjective interpretation. Moreover, the specificity and reproducibility of commercially available antibodies have not been assessed.45, 105, 106 Several ‘positivity’ cut-points have been proposed and used in different tumor elements.103 Although PD-L1 membranous positivity in more than 5% of tumor cells was found to have predictive value in the first reports, some negative cases still exhibited response.104 Other trials have used different criteria for positivity, including distinguishing stromal from epithelial staining and using trial-specific cut-points. Although there is justified fear of depending on LDTs as companion diagnostic tests, it is not clear that specific labeling and FDA-cleared IVD tests will be the solution. Current immune checkpoint trials are using highly variable and drug-specific cut-points, as well as vendor-specific labs and tests. It is possible that specific drugs may be approved with ‘labeling’ for companion diagnostic assays that have different requirements and different cut-points. This scenario emphasizes the need for standardization in IHC-based testing.
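The dependence of a ‘positive’ call on the chosen cut-point can be illustrated with a minimal sketch. The function name `pdl1_call`, the cell counts, and the 1% and 50% cut-points are hypothetical and chosen only for illustration; only the 5% threshold comes from the reports cited above, and this is not the scoring algorithm of any specific assay.

```python
def pdl1_call(positive_tumor_cells: int, total_tumor_cells: int,
              cutpoint_percent: float) -> bool:
    """Return True when the percentage of membrane-positive tumor cells
    is strictly greater than the cut-point (hypothetical scoring rule)."""
    if total_tumor_cells <= 0:
        raise ValueError("total_tumor_cells must be positive")
    percent_positive = 100.0 * positive_tumor_cells / total_tumor_cells
    return percent_positive > cutpoint_percent


# Hypothetical case: 30 of 1000 scored tumor cells show membranous staining (3%).
# The 1% and 50% cut-points are illustrative assumptions, not values from the text.
calls = {cp: pdl1_call(30, 1000, cp) for cp in (1.0, 5.0, 50.0)}
print(calls)  # {1.0: True, 5.0: False, 50.0: False}
```

Under these assumptions the same specimen is ‘positive’ at one cut-point and ‘negative’ at the others, before any variability in antibody, staining, or reader interpretation is taken into account.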

The solution to this problem is not clear. IHC for EGFR failed as a companion diagnostic test in the past for Erbitux.107 In the case of the MET IHC test, it is possible that its subjectivity or limited reproducibility contributed to the recent failure of the MetMab phase 3 trial.42 These mishaps suggest that, in the future, pathologists and oncologists will need to move to objective testing so that protein measurement can serve as a reliable companion diagnostic test. Standardization and measurement are also likely to be important as target molecules such as mRNAs and small noncoding RNAs enter the predictive biomarker field. One possible solution is the use of QIF. It is capable of objective, multiplexed interrogation of routine FFPE tissues and can accommodate rigorous standardization. Thus, we believe it has the potential to be used to develop next-generation companion diagnostic tests. QIF has been used to predict response to therapy25, 93, 108 and to accomplish objective and reproducible assay validation.93 However, resistance to adopting QIF has historically prevented the use of these tests in routine CLIA lab settings. To date, QIF has been introduced in a couple of diagnostics labs (Genoptix and Clarient/GE), but with limited market uptake. It will be interesting to see whether QIF will advance to prominence for tissue-based companion diagnostics, or whether other technologies with similar potential for standardization and quantification48, 49, 109 will fill the need for precision tissue-based research and diagnostics.

CONCLUSION

QIF allows objective, in situ interrogation of biomolecules in tissues. This method can be coupled to immune- and oligonucleotide-based assays to detect analytes at the protein and RNA levels. As a recently available LDT in the CLIA lab context, QIF has proven to be a unique tool for assay validation/standardization and investigation of relevant targets for research and clinical purposes. As pathology and oncology move from qualitative to quantitative assessment, and as measurement of biomarkers demands accuracy and precision, new test methods will need to be adopted. It will be interesting to follow these developments over the next few years. One possibility is that IHC will be relegated to a binary qualitative test and that other modalities, such as mRNA measurement by RT-PCR or NanoString, will be used for companion diagnostic testing. However, it is also possible that the drive to quantification will elevate IHC, in the form of QIF, to bring protein assessment to the quantitative level needed for reproducible companion diagnostic tests.