Introduction

Miniaturized multiplexed serological assays have been applied to characterize the serological response against a variety of pathogens including HBV, HCV, Helicobacter pylori, Mycobacterium tuberculosis (Mtb) and influenza1,2,3,4,5. In addition, (whole) proteome arrays enabled the definition of reactive antigen sets for a variety of pathogens such as Mtb6, Plasmodium falciparum (3D7 strain)7, HPV8, Burkholderia pseudomallei9 and Coxiella burnetii10,11,12. Arrays containing thousands of recombinant human proteins were used for the discovery of antibodies directed against self-antigens, which could be potential biomarkers for diagnostic purposes13,14,15. In current clinical practice, serological assays are well established within the field of autoimmune diseases16,17,18. All of these serological assays require quality-controlled sample testing procedures.

Prior to implementation into diagnostics, appropriate assay validation has to be achieved. FDA guidelines for the development of immunoassays state that “sufficient QC samples should be used to ensure control of the assay”19. As a consequence, such quality control samples should be available for assay validation, as well as for large scale screening and diagnostic purposes20. Quality control samples are necessary within every assay to ensure that the assay performs within specifications and should be reviewed before interpretation of the results of individual serum samples. The purpose of a quality control is to confirm that all experimental steps in an assay experiment were executed correctly and to allow data to be compared over a longer period of time.

Reference samples for sandwich immunoassays targeting serum proteins can be easily generated by spiking the target analytes into a plasma or serum matrix. However, any serological assay is based on the presence of human antibodies specific for the selected antigens. For singleplex assays it is usually sufficient to select a serum with strong reactivity towards its respective antigen. However, identifying single sera with appropriate reactivity against a multitude of different antigens, as required for antigen arrays, has been very difficult if not impossible in many cases. Moreover, a single serum covering all targeted antigens would only be available in limited amounts as a quality control and may thus restrict test development, validation and clinical evaluation. A common but surprisingly poorly documented approach is to create pools from multiple sera in order to ensure reactivity towards all target antigens and to generate a sufficiently large quality control stock21,22.

Here we present a mathematical approach towards a sample pooling strategy in which the composition of such a pool is calculated from an available data set with the aim that the pooled sample shows a positive response for each analyte. The threshold for a positive signal is defined as a multiple of the signal measured in the negative control population. In our case we chose four times the negative control signal. If a signal in the pool exceeds this threshold, the analyte is covered.

Results

The serological response of 142 serum samples obtained from patients with active tuberculosis (TB) was analyzed using a bead array consisting of 71 TB proteins (Supplement A). The serological response of these sera was heterogeneous, ranging from 2–69 TB-associated proteins per serum sample (Figure 1). Out of our 142 sera we found no serum reactive to all 71 antigens under investigation (Supplement B). Our mathematical approach identified sets of positive sera, which could be pooled to generate a quality control serum to react with all 71 TB proteins. This strategy allowed us to create defined reference samples, revealing a simultaneous serological response against all TB antigens employed in the assay.

Figure 1

Distribution of samples that give positive signals.

The plot shows the number of TB-associated proteins reacting with a given number of serum samples. No single serum revealed a serological reactivity to all Mtb antigens. Only a small number of samples revealed a serological reactivity for a broad range of TB antigens.

Appropriate data sets of the serological response pattern against the targeted antigens for a set of available samples provided the basis for our calculation. A mathematical model was developed to predict reactivity characteristics of a given sample pool. We hypothesized that those values could be estimated from the quantitative serological response measured for the individual samples. Our first assumption was that if two samples are combined, assay signals would add up (see Figure 2). The second assumption was that on average assays show dilution linearity with a slope of 1.0.

Figure 2

Schematic representation of a sample pooling.

The red lines represent an antigen-specific threshold. Samples (A–C) show individual responses and none exceeds all thresholds. By combining them, a new pooled sample with a positive response to all antigens is created.

An integer linear program (ILP) was constructed from the model, the screening data and the threshold vector. The objective was to maximize the number of serum reactivities, given a fixed number of serum samples from which the pool should be generated. The relative dilution of the sample pool was kept identical to the dilution of the individual serum. The optimization yielded suggestions for optimal sample pools, differing in composition and size (Supplement C).

For the verification of our theoretical results, the following experiments were performed. In a first experiment the assumption of additivity of the individual signal values of the pooled samples was tested. A suspension bead array displaying the different tuberculosis antigens was incubated with human serum samples. Bound human IgG antibodies were detected with an R-PE-labeled anti-human IgG. The read-out was performed on a fluorescence-based bead array reader (Luminex FlexMAP3D). Sample pools were created by successively pooling samples in the scheme S1, S1 + S2, S1 + S2 + S3, … up to a pool consisting of six samples. As shown in Figure 3A–E a strong correlation (R > 0.98) was observed between the values predicted from single sample screenings and the signal generated by the sample pool. The slope of the linear regression was 1.04 for the least complex pool and decreased to 0.7 for the pool containing six samples. These data support our hypothesis of signal additivity. It is notable that the signals generated by the pools for a given antigen became stronger as the number of sera in the pool increased (Figure 3). A larger number of different paratopes for the same antigen originating from different individual sera could also explain this observation.
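The agreement between predicted and measured pool signals can be quantified as in the following sketch. It is an illustration only, not the analysis code used for Figure 3: the function names `pearson_r` and `ols_slope` are hypothetical, and the MFI values are invented so that the regression slope is 0.7.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def ols_slope(x, y):
    """Slope of the least-squares regression line of y on x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

# Hypothetical per-antigen MFI values (not measured data).
predicted = [100.0, 200.0, 300.0, 400.0]
measured = [70.0, 140.0, 210.0, 280.0]

r = pearson_r(predicted, measured)      # perfectly linear toy data
slope = ols_slope(predicted, measured)  # measured signal is 70% of predicted
```

A high correlation together with a slope below 1.0, as in this toy data, means that the pool preserves the signal pattern while the absolute signals fall short of the prediction.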

Figure 3

Correlation between predicted and measured pool data.

The graphs show the strong correlation between predicted and measured results, although the linear slope is 0.7 for the most complex pool (E). (A) Correlation of prediction and measurement for pool S1 + S2 (B) pool S1 + S2 + S3 (C) pool S1 + S2 + S3 + S4 (D) pool S1 + S2 + S3 + S4 + S5 (E) pool S1 + S2 + S3 + S4 + S5 + S6.

In a subsequent experiment our algorithm was applied to find optimal pools for the total panel of 71 TB antigens. Here a data set derived from 142 previously tested serum samples was used as input. The allowed range for dilution of a single sample within the pool was 1:200 to 1:2000. Interestingly, we found that we had to consider that the signal intensities of each individual serum added to the pool are “diluted” by the other sera during the pooling process (Figure 3). The final serum dilution of the pool was set to 1:200, according to the standard dilution of our serological TB assay. An artificial cutoff for each TB antigen was calculated as four times the value measured in the negative control sample. The algorithm suggested ten solutions consisting of up to ten parts of up to four different samples.

Thus, as expected, it was not possible to cover all analytes by a single sample or by pooling two individual samples. Our algorithm suggested pools consisting of at least three samples to reveal a serological response to all TB antigens. The signals of the pool for all analytes were higher than the defined threshold. The measured values correlate with the predicted values with a correlation coefficient of 0.98 (see Figure 4 A). The correlation between the pool and the three single samples is comparatively low as shown in Figures 4 B–D. This shows that no sample stands out in the pool and that the signal pattern is the result of the composition of all three samples.

Figure 4

Correlation of the calculated sample pool MFI values and the measured pool MFI values.

While (A) the correlation between the sample pool and the predicted values (a weighted sum of single sample values) is high, the single samples (B–D) show weaker correlation with the pool. These data show that the unique coverage characteristic of the serum pool is due to the combination of the three samples.

Discussion

We have created a technical quality control for multiplexed antigen assays to make sure that none of the antigens used in the assay has lost its antigenicity and that all technical steps are executed correctly.

With this mathematical model, we can create quality control samples for roughly 60000 samples from only 1.5 mL of three pooled serum samples. We also created a second pool consisting of four samples (10 parts: 5 parts sample 1, 3 parts sample 2 and 1 part each of samples 3 and 4) with a correlation coefficient of 0.9 between predicted and observed MFI values (data not shown). Once the first pool is running out, one can easily create a second pool consisting of different samples. Our results demonstrate that our mathematical model for sample pools makes adequate predictions. We demonstrated that quality controls for multiplex antigen assays can be created by the systematic selection and pooling of samples. Our systematic approach is scalable and can be easily adapted to other assay platforms. We believe that our method provides an important tool for diagnostic assay development and test evaluation.

A software tool for the generation of reference samples is available at http://webservices.nmi.de/samplepooler.

Methods

Coupling of antigens to magnetic carboxylated beads

An automated bead handler (KingFisher 96, Thermo Scientific, Schwerte, Germany) was used to couple the antigens to magnetic carboxylated beads (MagPlex Microspheres, Luminex-Corp., Austin, TX). The Mtb proteins were covalently coupled to the beads using EDC/sulfo-NHS chemistry. The bead stock was vortexed and sonicated thoroughly for at least 10 s. Three hundred μL of beads from each bead stock (1.25 × 10^7 beads/mL) were transferred to the respective wells. Beads were washed with 250 μL activation buffer (100 mM Na2HPO4 + 0.005% (v/v) Triton X-100). The carboxyl groups on the magnetic beads were activated with 120 μL activation buffer + 15 μL EDC (50 mg/mL) + 15 μL sulfo-NHS (50 mg/mL in water-free DMSO) for 20 min at room temperature with agitation. Activated beads were washed two times with 250 μL coupling buffer (50 mM MES + 0.005% (v/v) Triton X-100). Antigens diluted to a concentration of 100 μg/mL in coupling buffer were incubated with the activated beads and agitated for 2 h at RT. Coupled beads were washed with 250 μL wash buffer, resuspended in 200 μL block store buffer (PBS + 1% (w/v) BSA) containing 0.05% NaN3 and stored at 4°C until further use.

Coupling control

The coupling efficiency of the His-tagged antigens was controlled using an anti-His antibody. Bound anti-His antibody was visualized using a secondary R-PE conjugated anti-species antibody.

The mouse-anti-His antibody (Qiagen, Hilden, Germany) was diluted in block store buffer (10 μg/mL, 1 μg/mL, 0.1 μg/mL, 0.01 μg/mL). For each antibody concentration 30 μL of the prepared bead suspension was distributed on a 96 half-well plate (1 μL of each individual bead type). A plate magnet was placed under the plate and the supernatant was removed by quickly inverting the plate. Beads were resuspended in 50 μL of diluted antibody and incubated on a shaker for 45 min at RT in the dark. Beads were washed twice with 100 μL wash buffer with the plate magnet as described before. 50 μL of a PE-conjugated goat-anti-mouse antibody (5 μg/mL in block store buffer) was added and incubated on a shaker for 30 min at RT in the dark. Beads were washed twice with 100 μL wash buffer and were resuspended in 100 μL block store buffer. Measurements were performed using a Luminex FlexMAP3D instrument with Luminex xPONENT software (settings: sample size: 80 μL, time out: 60, bead count: 100 per bead sort). Binding events were displayed as median fluorescence intensities.

After a successful coupling control the antigen-coupled beads were pooled to generate a master mix.

Bead-based serological assay

A serological bead-based assay was performed to detect human IgG antibodies directed against individual Mycobacterium tuberculosis proteins.

Serum samples were stored on ice. Serum dilution and incubation were performed at room temperature.

Samples were diluted in assay buffer (PBS + Low Cross Buffer (Candor, Wangen, Germany) + 0.5% BSA (w/v)) supplemented with 10% E. coli lysate. After dilution, samples were incubated for 20 min on a shaker.

Incubation protocol

The assay was performed in a semi-automated fashion using a bead handling system (KingFisher 96, Thermo Scientific, Schwerte, Germany).

A master bead mix containing antigen-coated beads was prepared in assay buffer without E. coli lysate and distributed on a 96 well PCR plate. The beads were transferred from the bead source plate to 50 μL of the diluted human serum samples and were incubated for 2 h at room temperature. Unbound antibodies were removed by washing the beads twice with 100 μL PBS + 0.05% Tween20. To visualize bound human antibodies, the beads were incubated with 50 μL of an R-PE labeled goat-anti-human IgG antibody (5 μg/mL) for 1 h at room temperature. After washing twice with 100 μL PBS + 0.05% Tween20 the beads were resuspended in 100 μL assay buffer without E. coli lysate. Measurements were performed using a Luminex FlexMAP3D instrument with Luminex xPONENT software (settings: sample size: 80 μL, time out: 60, bead count: 100 per bead sort). Binding events were displayed as median fluorescence intensities.

Algorithm

The screening of a set of samples S1, S2, …, Sn generated a dataset M and mj(Si) designates the MFI-signal for target j in sample Si.

An important premise is that the assays have a predominantly linear characteristic in the range of interest. If a pooled sample P is created from the samples Sk and Si, the MFI values approximately add up:

$$ m_j(P) \approx m_j(S_k) + m_j(S_i) $$

If a sample is diluted by a factor α ≤ 1, the MFI changes linearly:

$$ m_j(\alpha S_i) \approx \alpha \, m_j(S_i) $$

We have created pools by successively adding samples (S1, S1 + S2, S1 + S2 + S3, …).
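The two premises, additivity and dilution linearity, can be combined into a simple prediction of the pool signal. The sketch below assumes that the pool is measured at the same total dilution as the individual samples, so that each sample contributes in proportion to its share of the pool (a weighted sum of the single sample values). The function name `predict_pool` and all numbers are hypothetical.

```python
def predict_pool(signals, parts):
    """Predict per-analyte MFI of a pool from single-sample MFI values.

    signals[i][j] is the MFI of sample i for analyte j at the assay dilution.
    parts[i] is the integer number of parts of sample i in the pool.
    Assumes signal additivity and dilution linearity (slope 1.0).
    """
    total_parts = sum(parts)
    n_analytes = len(signals[0])
    return [
        sum(p * s[j] for p, s in zip(parts, signals)) / total_parts
        for j in range(n_analytes)
    ]

# Two hypothetical samples measured on two analytes.
signals = [[1000.0, 100.0],
           [200.0, 800.0]]

equal_pool = predict_pool(signals, [1, 1])   # [600.0, 450.0]
skewed_pool = predict_pool(signals, [3, 1])  # [800.0, 275.0]
```

The example illustrates why each serum is effectively "diluted" by the others: a strong signal of 1000 in one sample drops to 600 once the sample makes up only half of the pool.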

The threshold for a positive signal is usually defined by a multiple of the mean intensities measured in a negative control population. The threshold for target j is designated tj. Furthermore the decision variable xi describes whether the sample Si is included in the pool or not.
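As a minimal sketch of the threshold definition, the following code derives per-analyte thresholds from a negative control and flags which analytes a pool covers. The factor of four follows the choice described in this work; the function names and all MFI values are hypothetical, and `>=` is used for the comparison as an assumption about how the boundary case is treated.

```python
def thresholds(negative_control, factor=4.0):
    """Threshold t_j for each analyte: factor times the negative control MFI."""
    return [factor * value for value in negative_control]

def covered(pool_signal, t):
    """True for each analyte whose pool MFI reaches its threshold."""
    return [signal >= limit for signal, limit in zip(pool_signal, t)]

# Hypothetical MFI values for three analytes.
negative_control = [50.0, 120.0, 30.0]
pool_signal = [260.0, 400.0, 150.0]

t = thresholds(negative_control)   # [200.0, 480.0, 120.0]
flags = covered(pool_signal, t)    # [True, False, True]
n_covered = sum(flags)             # 2
```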

In our approach the number of samples is fixed to a maximum and the number of antigens quality controlled by the resulting pool is the target to be optimized. By allowing integer values for xi, the constraint

$$ \sum_{i} x_i = X_{max} $$

fixes the number of parts a pool consists of to Xmax. For example, a pool with Xmax = 5 could consist of 3 parts S3 and 2 parts S7.

The set of decision variables a1, a2, …, am indicates whether an analyte should be covered by the pool or not. By defining the coverage as

$$ \sum_{i} x_i \, m_j(S_i) \ge a_j \, t_j \quad \text{for all } j $$

it is ensured that the sum of MFIs has to exceed the threshold tj only if aj = 1. The term to maximize is the number of analytes that can be quality controlled using the pooled sample:

$$ \max \sum_{j} a_j $$

Another constraint is that the resulting pool should have the same matrix dilution as used in the normal sample preparation. If the input samples have been measured in a 1:n dilution resulting in values mj(Si), the values need to be scaled accordingly to a dilution of 1:(n · Xmax). The upper bound for Xmax is defined by the limit of dilutional linearity. For example, if the limit is 1:2000 and the original dilution was 1:200, the maximum for Xmax would be 10.
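The arithmetic behind the upper bound can be written out as a short sketch. The function names are hypothetical; dilutions are represented by the denominator of the ratio, for example 200 for 1:200.

```python
def xmax_upper_bound(assay_dilution, linearity_limit):
    """Largest number of parts for which a single part stays in the linear range."""
    return linearity_limit // assay_dilution

def part_dilution(assay_dilution, x_max):
    """Effective dilution (denominator) of one part in a pool of x_max parts."""
    return assay_dilution * x_max

upper = xmax_upper_bound(200, 2000)       # 10
most_diluted = part_dilution(200, upper)  # 2000, i.e. 1:2000
```

With ten parts, a sample that contributes a single part is present at 1:2000, which is exactly the stated limit of dilutional linearity.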

Pseudocode

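INPUT: M, Xmaxupperbound

Xmax := 1

WHILE Xmax ≤ Xmaxupperbound

BEGIN

scale the measured values mj(Si) in M by 1/Xmax

solve ILP: maximize $\sum_{j} a_j$

subject to $\sum_{i} x_i = X_{max}$ and $\sum_{i} x_i \, m_j(S_i) \ge a_j \, t_j$ for all j

Xmax := Xmax + 1

END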
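The loop over pool sizes can be illustrated with a small runnable sketch. Note that it does not use an ILP solver: as a stand-in, it enumerates every possible composition by brute force, which returns the same optimum but is only feasible for a handful of samples and parts. The function names, the data matrix and the thresholds are all hypothetical.

```python
from itertools import combinations_with_replacement

def best_pool(M, t, x_max):
    """Return (coverage, parts) for the best pool of exactly x_max parts.

    M[i][j] is the MFI of sample i for analyte j at the assay dilution,
    t[j] is the threshold for analyte j.
    """
    n, m = len(M), len(t)
    best = (-1, None)
    # Each multiset of x_max sample indices is one candidate composition.
    for combo in combinations_with_replacement(range(n), x_max):
        parts = [combo.count(i) for i in range(n)]
        # Scale by 1/x_max: the pool keeps the same total dilution.
        pool = [sum(parts[i] * M[i][j] for i in range(n)) / x_max
                for j in range(m)]
        coverage = sum(1 for j in range(m) if pool[j] >= t[j])
        if coverage > best[0]:
            best = (coverage, parts)
    return best

def scan_pool_sizes(M, t, x_max_upper):
    """Solve the pooling problem for every pool size up to the upper bound."""
    return {x: best_pool(M, t, x) for x in range(1, x_max_upper + 1)}

# Three hypothetical samples, each strongly reactive to one of three analytes.
M = [[900.0, 50.0, 60.0],
     [40.0, 700.0, 30.0],
     [20.0, 60.0, 800.0]]
t = [200.0, 200.0, 200.0]

results = scan_pool_sizes(M, t, 3)
# results[3] is (3, [1, 1, 1]): only the three-sample pool covers all analytes.
```

For a data set of realistic size, such as 142 samples and up to ten parts, the number of compositions is far too large for enumeration, which is why an integer linear programming solver is the appropriate tool there.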

Generation of sample pool

To generate the pool, 5 μL of each of the three samples indicated by the algorithm were diluted 1:200 in assay buffer + 10% E. coli lysate, aliquoted at 60 μL and stored at −80°C.