Introduction

Fluorescently labeled antibodies and flow cytometry have been the workhorse for single-cell data generation in many fields of biosciences since its development in the late 1960s1. The ability to rapidly collect quantitative data from millions of single cells has driven the understanding of heterogeneity in complex cellular mixtures, and led to the development of many fluorescence-based functional assays2,3,4,5. The diverse utility of flow cytometry has driven constant demand for an expansion in the number of parameters to be simultaneously measured. Development of novel fluorophores and advances in laser technology have provided a steady increase in the number of parameters that can be measured on state-of-the-art machines, roughly doubling each decade since the 1970s (Roederer’s Law for Flow Cytometry)6.

The development from single-color flow cytometry to ultra high-parameter flow cytometry has allowed an enormous growth in the data collected per cell. In our own field of immunology, high-parameter flow cytometry panels have become necessary, with multiple markers required to identify cellular lineages, major subsets, and activation markers. A key limitation with high-parameter flow cytometry, however, is the spectral overlap of fluorescent dyes7. This results in the spillover of fluorescence to detectors different from the detector assigned to each dye (in classical flow cytometry). Removing this unwanted spillover, i.e. compensating, is a necessary preliminary step in the data analysis of multi-color flow cytometry.

State-of-the-art flow cytometers, with ~30 channels, make compensation increasingly difficult as the number of channels grows, due to the unavoidable overlap between emission spectra of fluorescent dyes. The difficulty of experimental design has followed the growth in fluorophore options, to the point where the development, refinement, and validation of ultra-high parameter panels can take months to years of expert input4,8,9,10. Indeed, the development of mass cytometry as an alternative technology is largely driven by its lack of spillover11, as otherwise the technology compares unfavorably to flow cytometry in several aspects6.

Unlike the extensive development efforts in fluorophore generation, fluidics refinement, and laser addition, the basis for dealing with spillover in flow cytometry has largely remained unchanged. Current compensation algorithms are based upon the algorithm for spillover calculation proposed by Bagwell and Adams, when flow cytometers worked with only a few fluorophores12. While aspects of data processing have been refined since then, such as autofluorescence correction, the basic compensation strategy of calculating the spillover signal between defined positive and negative populations remains the traditional approach used across the majority of software packages. These approaches provide an estimation of the spillover matrix, in which the degree of spectral spillover between channels is estimated from single-color controls. A compensation matrix is obtained by inverting the spillover matrix, by which spillover is compensated out from experimental datasets. While effective in low-parameter datasets, where spillover is moderate to start with, in the case of high-parameter data this method often requires manual adjustment before proceeding with downstream analyses. This manual tuning entails manipulating a matrix with several hundred coefficients, which can be challenging and time-consuming, thus severely constraining panel design in practice. This approach requires single-color controls with well-defined positive and negative populations, which often forces the single-color controls to differ from those of the actual panel, increasing the complexity of the experiment.

Spectral flow cytometry is a refinement of classical flow cytometers, expanding the number of parameters simultaneously measured. In these systems, spectral unmixing is used to discriminate between the spectra of similar fluorophores. The unmixing is carried out in a different way, but obtaining the spectral signature of each fluorophore is also based on single-color controls. As with compensation, unmixing requires the calculation of spillover to every detector, with more detectors used than fluorophores. Both classical flow cytometers, and the spectral systems potentially replacing them over the upcoming decades, are therefore limited by the accuracy of spillover calculation.

We have developed an algorithm, AutoSpill, to compensate flow cytometry data. This approach uses single-color controls, making it compatible with existing datasets and protocols. Unlike other compensation approaches, however, it calculates spillover coefficients by means of robust linear models. This method produces better estimation of spillover coefficients, without requiring well-defined positive and negative populations. Moreover, AutoSpill uses this improved estimation of the spillover matrix only as the initial value for an iterative algorithm that automatically refines the spillover matrix until achieving, for practical purposes, virtually perfect compensation for the given set of controls. In addition to providing optimal spillover matrices for compensating (or unmixing in spectral systems), and given that AutoSpill does not rely on well-defined positive and negative populations, it can calculate the autofluorescence spectrum of cells by treating it as an extra endogenous dye. Thus, it allows effective detection and removal of autofluorescence from experimental data.

A linear modeling approach can equally be used to estimate the increase in fluorescence noise or spread caused by compensating spillover. Thus, we also propose a second algorithm, AutoSpread, which calculates spillover spreading coefficients with linear models, thereby providing a spillover spreading matrix (SSM) without the need for well-defined positive and negative populations in the single-color controls.

Together, AutoSpill and AutoSpread remove limiting constraints of traditional compensation methods, easing the preparation of compensation controls in high-parameter flow cytometry, making errors less likely, and facilitating the practical implementation of ultra high-parameter flow cytometry. AutoSpill is available through open-source code and a freely available web service (https://autospill.vib.be). AutoSpill and AutoSpread are available in FlowJo v.10.7.

Results

Tessellation allows robust gating

A critical first step in the processing of flow cytometry data is the elimination of cellular debris and other non-cellular contamination. This stage is typically performed by manual or automated gating of particles with the expected size and granularity, based on forward scatter and side scatter. In order to develop a fully automated pipeline, we sought to encode this initial cellular gating in the AutoSpill algorithm (Supplementary Software 1). After numerous tests on data provided by collaborating immunologists, we settled on a multi-step process with two tessellations, which demonstrated the required features of robust cell or bead identification. Figure 1 shows the initial gating for one single-color control of each set of controls. The multi-step process robustly identified the cellular fractions as desired, regardless of the presence of high amounts of cellular debris in the HS1 and HS2 datasets (Fig. 1, second and third columns). It also worked correctly with beads (Be1 dataset), which exhibited substantially different forward-scatter/side-scatter profiles (Fig. 1, fourth column). For all channels and all datasets, the gate selected the cell/bead population in the desired density maximum, without needing manual adjustment.

Fig. 1: Tessellation allows robust initial gating.
figure 1

Results of gating before the calculation of compensation, using forward (FSC) and side scatter (SSC) parameters (as shown in the axes), for different samples with cells or beads. Columns show one gate example for each dataset, as indicated. Rows show the successive steps of the algorithm for each example: ad bound calculation (dashed black line) and first tessellation (in blue), to identify the density maxima (blue points, with numbers showing decreasing order of density value); eh region identification (solid black line) around the target maximum; il second tessellation (in blue), to isolate the target maximum from close maxima inside the region (point color and number as in ad); mp calculation of the boundary gate (black closed curve), by a threshold on density and a convex hull; qt gate summary provided to package/website users, with same line, point, and color code as in ad. Pseudo-color represents cellular density. Raw datasets are available at FlowRepository with IDs FR-FCM-Z2SS (MM1) [https://flowrepository.org/id/FR-FCM-Z2SS], FR-FCM-Z2ST (HS1 & HS2) [https://flowrepository.org/id/FR-FCM-Z2ST], and FR-FCM-Z2SV (Be1) [https://flowrepository.org/id/FR-FCM-Z2SV].

Robust linear regression effectively estimates spillover coefficients

The estimation of spillover coefficients is based on the comparison between the level of fluorescence detected in the primary channel (i.e. the detector dedicated to the dye or fluorophore, in classical systems, or the detector with highest signal, in spectral systems) and the secondary channels (i.e. every other detector). The linear relationship between the fluorescence levels of primary and secondary channels is not visible in the usual bi-exponential scale (Fig. 2, first and third columns), but it becomes apparent in linear scale (Fig. 2, second and fourth columns). The linear relationship between the primary and secondary channels shows that the ratio of fluorescence between the two channels is constant across a broad range of fluorescence levels. Thus, a linear regression can be used to properly identify the slope between the two channels, that is, the spillover coefficient. As fluorescence data is heteroskedastic, owing to the effects of photon-counting statistics, robust linear regression, which efficiently estimates the relationship while down-weighting outlier points, is more suitable for this purpose than an ordinary linear regression, which has increased sensitivity to outliers that violate the normality of the data. We sought to compare robust linear regression to the traditional approach. As our robust gating (described above) provided notable benefits on downstream spillover calculation, independent of calculation method, we applied the same initial robust gating strategies to both the traditional and robust linear regression approaches. Other than the use of improved robust gating (applied so as to not overestimate the advantages of our approach), the traditional calculation used the standard approach of identifying positive and negative peaks and selecting the median. The robust linear model approach produces a similar result to that achieved by the “traditional” calculation of a slope between the median values of the positive and negative populations12, which is the method usually employed (Fig. 2, first and second columns). Notably, however, the use of linear regression also allows the robust calculation of the slope in cases that the traditional approach was not designed to deal with: low numbers of positive events (Fig. 2b), without a well-defined positive population (Fig. 2c), or without well-defined positive and negative populations (Fig. 2d). The quality of compensation can be evaluated by the difference between the obtained compensation and the ideal one, with perfectly compensated data showing an exactly vertical distribution of data along the primary fluorophore (i.e. zero slope). While traditional estimation of spillover was successful to some extent in producing low-error compensation, in particular when distinct positive and negative populations were present (Fig. 2a, first and second columns), errors were identified in particular channels, especially when populations did not conform to good separation (Fig. 2c, first and second columns). Traditional algorithms struggle in the case of poor separation between positive and negative populations due to the requirement to identify two distinct populations to calculate a slope between (Fig. 2c). In extreme cases this can result in the identification of two populations within the negative cell cluster, and grossly wrong slope calculations. AutoSpill, by contrast, treats data at the single-cell level, utilizing expression data even when the positive and negative populations are low in frequency or form a tail from the negative population, driving a large correction in error (Fig. 2c). In all cases, linear regression resulted in less compensation error (Fig. 2, third and fourth columns).

Fig. 2: Robust linear regression effectively estimates spillover coefficients.
figure 2

Each row ad, eh, il, mp shows a compensation example from the MM1 dataset, with the primary and secondary channels, indicated, respectively, in the y-axes and x-axes. Compensation results are displayed using positive and negative populations (first column (a, e, i, m), bi-exponential scale; second column (b, f, j, n), linear scale), and robust linear regression (third column (c, g, k, o), bi-exponential scale; fourth column (d, h, l, p), linear scale). The linear relationship between the levels of fluorescence is not visible in bi-exponential scale, but it is very clear in linear scale. Uncompensated data is displayed in blue and compensated data in black. Dim points correspond to gated-out events, not used in the calculation. Lines in the second and fourth columns (linear scale) show regressions of uncompensated (blue) and compensated (black) data. The slope coefficient of the latter provided the compensation error (number at the bottom right of each panel). Vertical green dashed lines are shown as a reference for perfectly compensated data. Raw dataset is available at FlowRepository with ID FR-FCM-Z2SS (MM1) [https://flowrepository.org/id/FR-FCM-Z2SS].

Iterative reduction of compensation error yields optimal spillover coefficients

The spillover coefficients obtained in the first iteration step by robust linear regression produced low-error estimates of the spillover matrix for all channels (Fig. 3, with representative example in Fig. S1). While this error level outperformed that of the traditional approach (Figs. 3 and S1C), some channels exhibited a residual degree of over compensation or under compensation (Fig. S1E). While such errors are small, they nonetheless produce overcompensation or undercompensation noticeable in bi-exponential scale, which visually amplifies fluorescence levels close to zero. In a high-parameter flow cytometry panel, with multiple fluorophores present on small subpopulations, such errors can accumulate to the point of making individual channels effectively unusable. We therefore developed an iterative approach, by which the spillover results obtained through the robust linear regression approach (Fig. S1E) were used as the starting point for an additional round of robust linear regression. This process repeats, successively obtaining better spillover matrices allowing for further reduction of error in the compensated data, until pre-defined criteria are met. This iterative refinement of the spillover matrix reduced the compensation errors to negligible values (Figs. 3 and S1F).

Fig. 3: Iterative reduction of compensation error yields optimal spillover coefficients.
figure 3

Each row shows the reduction of compensation error that occurs over the iterative process for each dataset: a MM1, b HS1, c HS2, and d Be1. Left column displays the comparison of probability density functions of observed spillover coefficient errors within each dataset (with the density plots displaying the aggregate of individual spillover coefficient errors, i.e. off-diagonal elements of spillover matrix). Density plots are displayed for the same dataset in each plot, using different approaches: after traditional compensation, based on identifying positive and negative populations (red), after the first step of AutoSpill (green), and at the final step of the iterative process of AutoSpill (blue). Errors are displayed in log-scale scale of absolute values, separated for under-compensation (solid lines) and over-compensation (dashed lines) errors to demonstrate no bias between positive and negative errors. Right column displays the convergence of AutoSpill, with points showing standard deviations of errors (brown), maximum absolute error (orange), and the moving average of the decrease in the standard deviation of errors (pink), used to detect oscillations. Linear regressions were carried out in linear scale (triangles) or bi-exponential scale (circles). Empty triangles (same color code) show compensation errors resulting from calculating spillover coefficients with positive and negative populations. Dashed lines display the thresholds for changing from linear to bi-exponential (10−2, on the maximum absolute error, to provide better compensation in the bi-exponential scale used for visualization), reaching convergence (10−4, on the maximum absolute error), and detecting oscillations (10−6, on the moving average of the decrease in the standard deviation of errors). Raw datasets are available at FlowRepository with IDs FR-FCM-Z2SS (MM1) [https://flowrepository.org/id/FR-FCM-Z2SS], FR-FCM-Z2ST (HS1 and HS2) [https://flowrepository.org/id/FR-FCM-Z2ST], and FR-FCM-Z2SV (Be1) [https://flowrepository.org/id/FR-FCM-Z2SV].

While effective in most cases, this strategy for reducing compensation error can become compromised when using controls with low fluorescence levels in the primary channel or other fluorescence artifacts. Under these circumstances, iterations gave rise to oscillations in the observed compensation errors before reaching convergence (Fig. 3a, c). In order to deal with these extreme cases, we applied a fraction of the update to the spillover matrix, slowing down convergence and further decreasing compensation error (Fig. 3a, c).

Overall, the iterative refinement of spillover coefficients was effective at reducing errors in compensation. In the four representative datasets reported here, the refinement reduced error from the initial compensation step in 4–6 orders of magnitude (Fig. 3). This improvement was observed even with subsampling single color controls to very low numbers of cells (Fig. S2A), although an increased number of iterations was required to achieve convergence (Fig. S2B). This low error amounts to optimal spillover coefficients and compensation matrices, relative to the quality of the single-color controls used as input, and therefore it removes a key challenge to successful compensation in high-dimensional flow cytometry.

Removal of autofluorescence through compensation with an additional autofluorescence channel

Cells produce autofluorescence, due to the interaction of the constituent organic molecules with the incoming photons. The amount of autofluorescence varies between cell types, and it is, for example, higher on cells from the myeloid lineage13,14. This can create problems in the analysis of certain flow cytometry datasets. Although the amount of autofluorescence varies between cell types, the spillover from autofluorescence observed in an unstained control (Fig. 4a) behaved similarly to the spillover detected from (exogenous) fluorescent dyes (Fig. 2, first and third columns), with the key feature of not having well-defined positive and negative populations. The capacity of AutoSpill to estimate spillover coefficients without needing these populations allowed the treatment of autofluorescence as coming from an endogenous dye, whose single-color control was an unstained control, and whose fluorescence level was recorded in an extra empty channel assigned to a dummy dye. We therefore tested the ability of AutoSpill to compensate out autofluorescence, which was in issue in the HS1 and HS2 datasets. In effect, we were able to use the extra channel to measure the intensity of autofluorescence and greatly reduce its impact onto the other channels (Fig. 4b, c). Importantly, the empty channel assigned to autofluorescence worked best when it was the channel with higher level of signal in the unstained control. This way, the most autofluorescent channel was sacrificed during panel design to enhance resolution across all the other channels. As this process of autofluorescence removal is based on the calculation of spillover in the unstained control, autofluorescence removal requires all of the single-color control samples to be run from the same base cell type as the experimental samples. Autofluorescence removal is therefore not possible in AutoSpill, or any other computational approaches of which we are aware, when single color controls come from disparate sources (such as using beads or cellular mixes with different baseline autofluorescence). While autofluorescence removal is effective in a mixed cellular population in which different cell types have quantitatively different levels of autofluorescence, the process may fail if the sample includes a mixture cells which qualitatively differ in their autofluorescence spectrum. Autofluorescence subtraction in samples with minimal autofluorescence could, in principle, add low degree of noise to the data. We therefore suggest that users manually inspect unstained samples for variance in fluorescence and only use the autofluorescence subtraction option if autofluorescence is detected in the sample.

Fig. 4: Removal of autofluorescence through compensation with an additional autofluorescence channel.
figure 4

a, b Examples, for the unstained control of the HS1 dataset, of compensation of spillover from the autofluorescence channel (y-axes) to two secondary channels (x-axes). Uncompensated data is displayed in blue and compensated data in black. Resulting compensation errors (slope coefficients of the regressions on compensated data) are shown at the bottom left or right of each panel. Vertical green dashed lines are shown as a reference for perfectly compensated data. cf Compensation of two channels (one case per row) severely affected by autofluorescence in the HS1 dataset (left, (c, e), without autofluorescence channel; right, (d, f), with autofluorescence channel), with primary channels in y-axes and the secondary channels in x-axes. Same color and line code, and number with compensation error, as in a, b. g, h Comparison of probability density functions of spillover skewness in the HS1 dataset, without (g) or with (h) autofluorescence channel. Errors are displayed in log-scale of absolute values, separated in positive (solid lines) and negative (dashed lines) values to document any bias between positive and negative errors. Autofluorescence, causing spurious positive spillover, corresponds to anomalously large positive skewness in the affected channels (left). Raw datasets are available at FlowRepository with IDs FR-FCM-Z2SS (MM1) [https://flowrepository.org/id/FR-FCM-Z2SS] and FR-FCM-Z2ST (HS1) [https://flowrepository.org/id/FR-FCM-Z2ST].

Linear models for estimation of the SSM

Spillover spreading is defined as the incremental increase in standard deviation of fluorescent intensity in one parameter caused by the increase in fluorescent intensity of another parameter. Calculation of SSM coefficients, while not a standard step in the analysis pipeline, is a useful tool for machine quality control of consistency in sensitivity and performance, and can aid in minimizing interference during the design of high parameter flow cytometry panels15. The SSM coefficients can be calculated by comparing the fluorescent intensity in the primary detector to the standard deviation of fluorescence in the secondary detector, for a pair of positive and negative populations in a single-color control corresponding to the primary detector15. It can also be demonstrated that the linearity of this relationship for different sizes \(\sqrt{{{\Delta }}F}\), and that the estimation of each spillover spreading coefficient is machine-dependent and compensation-matrix-dependent, but is, however, dataset independent15. Here, we used quantile partitioning and linear regression to estimate the linear relationship observed by Nguyen et al. thereby allowing the inclusion of events above, below, or in-between the positive and negative populations of the original approach.

The events of each single-color control were partitioned quantile-wise in the primary detector, and the standard deviation of the level of fluorescence was estimated, for each quantile bin, in every secondary detector. Next, two linear regressions were used to estimate, first, the standard deviation at zero fluorescence, and second, the spillover spreading coefficient. Coefficients deemed non-significant using an F-test were replaced with zeros, as well as any negative coefficients. The majority of quantiles were, in fact, subsamples of the traditional positive and negative populations, but the inclusion of additional quantiles improved the precision of AutoSpread in estimating spillover spreading effects, because all these events conform to the same linear relationship, assuming that they are on-scale and in the linear range of the flow cytometer (Fig. 5a). As a result, AutoSpread accurately estimated spillover spreading for datasets whose compensation matrices successfully orthogonalized the fluorescent signals present in the single-color controls (Fig. 5b).

Fig. 5: Linear models for estimation of the Spillover Spreading Matrix (SSM).
figure 5

Examples are shown for the datasets MM1 (left, a, c, e) and HS1 (right, b, d, f). a, b Regression carried out over the gated events of one single-color control of each dataset, with no well-defined positive and negative populations, with the primary and secondary channels as indicated, respectively, in the y- and x-axes. Uncompensated data points are displayed in blue and compensated ones in black. Regression from uncompensated (resp. compensated) data is displayed with dashed (resp. solid) lines, in black (resp. gray) when the regression coefficient is significant and positive (resp. non-significant or non-positive). c, d Comparison of probability density functions of the differences between the results obtained with AutoSpread vs. the usual SSM algorithm. This shows the small difference between both calculations. Values are displayed in log-scale of the absolute value of the difference, separated in positive (solid lines) and negative (dashed lines) values to assess any bias between positive and negative errors. e, f Comparison between results obtained with AutoSpread vs the usual SSM algorithm, but with the omission of the first regression in AutoSpread, which leads to a systematic downward bias in AutoSpread results. Same scale and line code as in c, d. Raw datasets are available at FlowRepository with IDs FR-FCM-Z2SS (MM1) [https://flowrepository.org/id/FR-FCM-Z2SS] and FR-FCM-Z2ST (HS1) [https://flowrepository.org/id/FR-FCM-Z2ST].

The adjustment step of AutoSpread (the first regression) was critical. The adjustment removed the minor quadratic effect caused by σ0 in the initial estimates, thereby allowing a more accurate estimation of the coefficients \({\mathrm{{S{S}}}}_{C}^{P}\). If this adjustment step were skipped, that is, if the β’s were taken as the spillover spreading coefficients, then spreading effects would be consistently underestimated. In that case, comparison against the traditional SSM algorithm would show a clear negative bias (Fig. 5c). Including the adjustment, step eliminated that bias. For datasets whose single-color controls were contaminated by uncompensated signals (e.g. autofluorescence), both AutoSpread and the traditional SSM calculation may fail to accurately estimate spillover spreading. Initial gating that actively eliminates such effects, as well as the use of an extra autofluorescence channel, can alleviate the problem for both algorithms.

Biological utility of AutoSpill

To demonstrate the biological utility of improving the spillover matrix, we compared downstream analyses resulting from data compensated with AutoSpill versus the current traditional compensation algorithm. Here we used flow cytometry panels built to address biological questions that required antibody sets close to machine limits, or the analysis of highly autofluorescent cells, i.e., contexts where the greatest advantages of AutoSpill can be observed. First we compared the results of gating based on automated compensation calculation built into FlowJo v10.6 (traditional) with the results achieved by uploading an AutoSpill-generated spillover matrix into the same gating experiment in FlowJo. Analyzing 18- and 28-parameter flow cytometry datasets (MM3 and MM2, respectively), we identified multiple examples of poor discrimination of well-described immunological populations due to over- and undercompensation (Fig. 6a). A substantial fraction of the error introduced by traditional compensation calculation was due to inefficiencies in the gating component driving major errors in spillover calculation (which can be corrected by the user through manual regating), with the remaining quotient due to residual errors in spillover calculation even with corrected gates (as seen in Fig. S1). AutoSpill corrected both aspects of the pipeline and produced quality results (Fig. 6a). Next, as AutoSpill was incorporated into FlowJo v10.7 during the course of manuscript review, we were able to run a typical user experience test, with all sample compensation and analysis performed within FlowJo v10.7 either using the traditional algorithm or with the AutoSpill option enabled. As with the website pipeline, the FlowJo v10.7 AutoSpill option corrected several obvious compensation flaws (Fig. 6b). While these errors can readily be identified as compensation errors, AutoSpill also corrected less obvious downstream analyses. For example, in the 18-parameter MM3 dataset, where we gated for CD4+CD8−CD25+ lymphocytes, the population was 10-fold lower using traditional compensation algorithms than with AutoSpill, despite similar compensation identified between the CD4, CD8, and CD25 channels (Fig. 6b). Backgating the missing CD25+ population identified the problem as undercompensation between the CD25 and CD19 channels, leading to elimination of more than 90% of the CD25+ population during early gating stages (Fig. 6c). Finally, we display two clear examples of the benefit of autofluorescence reduction, both based on highly autofluorescent myeloid populations (MM4 and MM5 datasets). First, microglia, a brain-resident macrophage-like population, are often described as having low expression of MHCII during homeostasis16. This is a key difference from brain-resident macrophages, with high baseline MHCII expression, and determines the ability of the cell to present antigen to CD4 T cells. Using traditional compensation algorithms, low expression of MHCII was detected on 40% of microglia. This figure, however, dropped to near 0% when autofluorescence reduction was added (Fig. 6d), consistent with the complete absence of MHCII expression at the mRNA level in single-cell transcriptome analysis17. We validated the result by including microglia from MHCII knockout mice, where a similar level of background MHCII expression was observed (Fig. 6d), demonstrating that autofluorescence reduction gave the biologically correct outcome. As an independent example, we investigated Foxp3 expression, the key lineage-determining factor of regulatory T cells. Foxp3 expression has also been reported on various autofluorescent lineages, including thymic epithelium18, lung epithelium9, tumor cells19, and macrophages20. While expression outside the regulatory T cell lineage was later demonstrated to be due to autofluorescence artifacts21,22,23,24, the incorrect reports resulted in research misdirection for several years. Using high dimensional analysis on a Foxp3GFP reporter line and traditional compensation, low expression of the reporter was detected in 10% of the CD11b+ macrophage population (Fig. 6e). This expression was almost entirely eliminated through the use of the autofluorescence correction of AutoSpill, and was validated against wildtype mice, which do not have a GFP reporter present (Fig. 6e). Together, these practical examples demonstrate the added value of AutoSpill to flow cytometry analysis.

Fig. 6: Biological utility of AutoSpill.
figure 6

Downstream analyses of data compensated by either the traditional compensation algorithm or AutoSpill. a Plots were prepared and compensated using FlowJo v.10.6, using either the default traditional algorithm or uploading the spillover matrix generated by AutoSpill. Representative flow cytometry plots illustrating errors corrected by AutoSpill (MM2 dataset). be All plots were prepared from the same FCS files and compensated using FlowJo v.10.7, using either the traditional algorithm or the AutoSpill option. b Representative flow cytometry plots illustrating errors corrected by the AutoSpill option in FlowJo v10.7, c including hierarchical gating for CD4+CD8+CD25+ lymphocytes (MM3 dataset). d The CD4+CD25+ population gated in b was backgated to identify the source of population loss in the traditional algorithm (MM3 dataset). e MHCII expression on known negative cells (CD4 T cells), known positive cells (CD11b+ splenocytes), and microglia (MM4 dataset). f Percent positive was thresholded using CD4 T cells as the negative. MHCII knockout microglia we`re used as a true negative staining control. g Foxp3GFP expression on known bimodal cells (CD4+ splenocytes) and CD11b+ macrophages (MM5 dataset). h The positive population was thresholded using the negative CD4 T cell peak. Wildtype mice, without the GFP transgene, were used as a true negative staining control. Pseudo-color represents cellular density. For gating strategy see Fig. S3. Traditional and AutoSpill spillover matrices are provided on Mendeley data: [http://dx.doi.org/10.17632/mtdww9hd3m.1]. MM2, MM3, MM4, MM5 Raw datasets are available at FlowRepository with IDs FR-FCM-Z2SW (MM2) [https://flowrepository.org/id/FR-FCM-Z2SW], FR-FCM-Z2SJ (MM3) [https://flowrepository.org/id/FR-FCM-Z2SJ], FR-FR-FCM-Z2SK (MM4) [https://flowrepository.org/id/FR-FCM-Z2SK], and FR-FCM-Z2SL (MM5) [https://flowrepository.org/id/FR-FCM-Z2SL].

Discussion

Flow cytometry has been a revolutionary force in single-cell analysis. The ability to rapidly analyze protein expression of millions of cells at single-cell level, coupled with the purification capacity of fluorescence-activated cell sorting, has provided a remarkable tool for understanding cellular heterogeneity and function. Initial limitations were overcome through ingenious technical developments: the number of fluorescent parameters were expanded through the development of new dyes and lasers, intracellular staining protocols were optimized for the detection of intracellular (and even post-translationally modified) proteins, RNAflow techniques allowed measurement at the RNA level25, and numerous non-antibody-based dyes were able to detect processes from redox potential26 to organelle content and status27. The very utility of the technique has pushed flow cytometry to its technical barrier—the desire to measure everything on every cell has driven up the number of parameters that can be distinctly measured. The constraints imposed by overlapping fluorescent spectra are arguably the largest limit to the potential of flow cytometry, yet progress in the mathematical underpinnings of the analysis have substantially lagged behind the advances in the chemical and physical bases of the technology.

Newer single-cell technologies, most notably mass cytometry and single-cell RNA-Seq, do not have the spillover issues of flow cytometry. Mass cytometry is a direct competitor to flow cytometry, also primarily utilizing antibody-based detection of single-cell expression28. As the heavy metal labels do not overlap, mass cytometry panels can be built up in an modular manner, without the same design constraints required for flow cytometry29. While spectral flow cytometry and mass cytometry can readily run more than 40 parameters, classical flow cytometry experiments struggle to use more than 30 parameters, due to the challenge of distinguishing signals from each dye or fluorophore. Nonetheless, flow cytometry has major advantages over mass cytometry, most notably the speed of data acquisition (around 50-fold more rapid data collection) and the ability to sort live cells. The other main competitor to flow cytometry is single-cell RNA-Seq28. While initially limited to measurement of RNA content in a semi-quantitative manner, the advent of barcoded antibodies in protocols such as CITE-Seq30 and Abseq31 provided data directly comparable to that of flow cytometry. As barcoding approaches have no practical limit concerning compensation issues, they can compete with flow cytometry. Even in this case, however, flow cytometry has distinct technological advantages. In addition to the previously mentioned advantage of live-cell sorting, flow cytometry produces data at an unparalleled speed, with more than 106 cells measured per minute, and with a data format enabling immediate analysis. In terms of price, current flow cytometry assays are several orders of magnitude cheaper than RNA-Seq, with costs on the order of 10 USD per 106 cells28. Flow cytometry is therefore very much a living technology, with important advantages over competitor technologies and limited only by the parameter barrier.

The latest iteration of flow cytometry is spectral flow cytometry, a refinement where more channels (detectors) are used than dyes. Spectral flow cytometry allows for enhanced discrimination of fluorophores, including those that share a main channel, by calculating the dye origin through fluorescence at minor channels where the emission spectrum differs32. Spectral unmixing (assignment of detector signal to dyes) requires the generation of an accurate spillover matrix, which can be performed in a mathematically identical manner to the spillover matrix of traditional flow cytometry, regressing each dye against each other, but producing a rectangular rather than square spillover matrix (as channels > dyes). While different algorithms have been proposed on the methodology of applying this spillover matrix to unmix the spectral data33,34, each benefits from the use of a more correct spillover matrix. As AutoSpill focuses on improving the estimation of the spillover coefficients, rather than on how these coefficients are used, the implementation of the AutoSpill algorithm to spectral cytometry data can therefore yield similar benefits to that observed with traditional flow cytometry data. Indeed, spectral systems may well be the more compelling use case, as the system encourages dye crowding and the use of dyes with overlapping spectra. Moreover, spectral systems almost always have sufficient spectral resolution to orthogonalize autofluorescence from the other fluorescent spectra present in a sample. It is in these more complex cases where AutoSpill provides the greatest benefit.

We have presented here a compensation method which greatly reduces compensation error and expands the possible number of parameters in flow cytometry experiments. The use of robust linear regression and iterative refinement allows the calculation of spillover matrices without the need for using controls with well-defined positive and negative populations, thus permitting the use of the actual panel antibodies for the controls in many experiments. This method can be applied to any flow panel from 4 to 6 fluorophores up to multi-color staining sets with more than 30 fluorescent dyes. Given that the typical number of gated events in single-color controls is at least in the order of thousands, the amount of data points available enables this approach to reduce compensation errors to such small values that the resulting compensation is, in practical terms, functionally perfect for the given set of single-color controls. On the other hand, the method needs some level of fluorescence in the primary channel for each control (or at least in one of the detectors for spectral systems), to be able to regress the spillover coefficients.

An added feature of AutoSpill is the ability to compensate out autofluorescence. Although some methods have been proposed35,36,37, typically it is not possible to remove autofluorescence, with the exception of some spectral systems38,39. By default, AutoSpill does not use an unstained control, but it can be included and assigned to an extra unused channel in the flow cytometer. Data collected in this extra channel can be treated as coming from an endogenous fluorescent dye, which results in the inclusion of autofluorescence levels in the calculation of spillover coefficients and ensuing compensation. This optional approach is recommended when there are non-negligible levels of autofluorescence in one or several channels (as observed from an unstained control), and one of those high-autofluorescence channels is not used in the design of the panel. As autofluorescence can be increased by physiological and cellular processes13,40, the ability to compensate out autofluorescence can remove distortions appearing as false positives, where cellular changes are mistakenly identified as altered expression of a marker, while the signal is in fact caused by autofluorescence. This approach will be of particular utility in the study of cell populations with high intrinsic autofluorescence, such as myeloid-lineage cells13,14 or tumor cells41,42.

In comparison with previous compensation methods, which do not guarantee an upper bound on the compensation error, AutoSpill provides a spillover matrix with such a guarantee, given a set of controls. Therefore, it is possible now to address a new question: To which extent a set of single-color controls is sufficient to ensure proper compensation of data obtained with a complete panel, that is, not just for the set of controls. In our experience, some panels still require minor modifications of the spillover matrix, which implies that the single-color controls do not fully describe the fluorescence properties of the complete panel, probably because of second-order phenomena such as secondary fluorescence or other interactions between dyes. Thus, this remains an open question.

While we demonstrate the utility of this method using eight representative datasets, the tool has been beta-tested more than 1000 times over a period of 22 months by more than 100 collaborating immunologists. This has allowed the development of a robust algorithm, designed to accommodate diverse datasets and to deal with less-than-perfect data arising in real-world experiments. The code is open source and is released with a permissive license, allowing integration into existing flow cytometry analysis pipelines in academia and industry. To increase access by research communities in immunology and other fields, we also provide a website (https://autospill.vib.be) that allows the upload of sets of single-color controls for calculating the spillover matrix with AutoSpill, produced in formats compatible with common software for flow cytometry analysis. As we have demonstrated by including AutoSpill in FlowJo v.10.7, this algorithm is suitable for integration into commercial software, allowing for rapid and widespread uptake of superior flow cytometry compensation.

Methods

Datasets

Collaborating immunologists beta-tested AutoSpill over a period of 22 months, which allowed extensive testing and improvement of the algorithm for niche cases. Among these datasets, four are used as examples here, covering mouse cells, human cells, and beads. Compensation using AutoSpill, with default parameters, was carried out for each of these four sets of single-color controls: mouse splenocytes (MM1 dataset), human PBMCs (HS1 and HS2 datasets), and beads (Be1 dataset). We also analyzed four fully stained datasets, as examples of biological utility: mouse splenocytes (MM2 and MM3 datasets), and mouse microglia (MM4 and MM5 datasets). Data collection complied with all relevant ethical regulations for animal research and work with human participants. All animal experiments were performed in accordance with the University of Leuven Animal Ethics Committee guidelines or the Babraham Institute Animal Welfare and Ethics Review Body. Animal husbandry and experimentation complied with existing European Union and national legislation and local standards. Sample sizes for mouse experiments were chosen in conjunction with the ethics committees to allow for robust sensitivity without excessive use. For human experiments, written informed consent was obtained from all participants and the ethics committee of University Hospitals Leuven approved the study.

Be1 dataset, beads

UltraComp eBeadsTM Compensation Beads (Thermofisher) were used to optimize fluorescence compensation settings for multi-color flow cytometric analysis at a Symphony flow cytometer. UltraComp eBeadsTM were stained with the following fluorochrome-labeled anti-human antibodies: anti-CD8–BUV805 (1:200, clone SK1), anti-CD4–BUV496 (1:50, clone SK3), anti-CD86–BUV737 (1:50, clone 2331 FUN-1), anti-CD141–BUV615-P (1:50, clone 1A4), anti-CD56–BUV563 (1:50, clone NCAM 16.2), anti-CD16–BUV395 (1:50, clone 3G8), anti-CD123–BB660-P (1:50, clone 7G3), anti-CD80–BB630 (1:50,clone L307.4), anti-CD21–BV785 (1:50, clone B-ly4), anti-CD27–BV750-P (1:40,clone L128), anti-BAFF-R–BV650 (1:50, clone 11C1), anti-CD94–BV605 (1:50, clone HP-3D9), anti-CD40–APC-R700 (1:50, clone 5C3) (all BD bioscience); anti-CD3–PerCP-Vio700 (1:50, clone REA613) (Miltenyi Biotec); anti-CD57–FITC (1:100, clone TB01), anti-CD14–PE-Cy5.5 (1:200, clone TuK4), fixable viability dye eFluor780 (1:1000) (all eBioscience); anti-CD24–BV711 (1:50, clone ML5), anti-CD19–BV510 (1:25, clone HIB19), anti-HLA-DR–BV570 (1:40, clone L243), anti-IgM–BV421 (1:100, clone MHM-88), anti-CD11c–APC (1:40, clone 3.9), anti-CD38–PE/Dazzle 594 (1:100, clone HB-7), anti-CD10–PE-Cy5 (1:50, clone HI10a), and anti-IgD–PE-Cy7 (1:100, clone IA6-2) (all BioLegend).

HS1 dataset, human peripheral blood mononuclear cells (PBMCs)

PBMCs were isolated from heparinized blood samples of human healthy donors using Ficoll-Paque density centrifugation (MP biomedicals), frozen and then stored in liquid nitrogen. Frozen PBMCs were thawed and counted, and cell concentration was adjusted to 1 × 106 for each single-color control. Cells were plated in a V-bottom 96-well plate, washed once with PBS (Fisher Scientific) and stained with live/dead marker and fluorochrome-conjugated antibodies against surface markers: anti-CD8–BUV805 (1:200, clone SK1), anti-CD4–BUV496 (1:50, clone SK3), anti-CD95–BUV737 (1:100, clone DX2), anti-CD4–BUV615-P (1:50, SK3), anti-CD28–BB660-P (1:100, clone CD28.2), anti-CD4–BB630 (1:50, clone SK3), anti-CD4–BV750-P (1:50, clone SK3), anti-CD31–BV480 (1:100, clone WM59), anti-CXCR5–BV650 (1:25, clone RF8B2), anti-CD4–PE (1:100, clone SK3), anti-CD4–PE-Cy5 (1:50, clone SK3) (all BD Biosciences); anti-CD3–PerCP-Vio700 (1:50, clone Rea613) (Miltenyi Biotec); anti-CD3–FITC (1:50, clone UCHT1), anti-CD4–PE-Cy5.5 (1:50, clone SK3), anti-CCR7–PE-Cy7(1:50, clone 3D12), anti-CD4–APCeFluor780 (1:50, clone SK3) (all eBioscience); anti-CD4–BV786 (1:50, clone SK3), anti-CD4–BV711 (1:50, clone SK3), anti-CD4–BV605 (1:50, clone SK3), anti-HLA-DR–BV570 (1:40, clone L243), anti-CD127–BV421 (1:25, clone A019D5), anti-CD4–PE/Dazzle 594 (1:100, clone SK3), anti-CD4–AF647 (1:50, clone SK3) (all BioLegend).

Samples were stained for 60 min at 4 °C, washed twice in PBS/1% FBS (Tico Europe), and then fixed and permeabilized with Foxp3 Transcription Factor Staining Buffer Set (eBioscience), according to manufacturer’s instructions. Cells were stored overnight at 4 °C and were then acquired on a Symphony flow cytometer with Diva software (BD Biosciences). A minimum of 5 × 104 events were acquired for each sample.

HS2 dataset, human PBMCs

Frozen PBMCs from human healthy donors were processed as for the HS1 datasset and stained with live/dead marker and fluorochrome-conjugated antibodies against the following surface markers: anti-CD8–BUV805 (1:200, clone SK1), anti-CD4–BUV496 (1:50, clone SK3), anti-CD95–BUV737 (1:100, clone DX2), anti-CD28–BB660-P (1:100, clone CD28.2), anti-ICOS–BB630 (1:50, clone DX29), anti-CXCR3–BV785 (1:25, clone 1C6), anti-PD-1–BV750-P (1:25, clone EH12.1), anti-CXCR5–BV650 (1:25, clone RF8B2), anti-CCR2–BV605 (1:25, clone 1D9), anti-CD31–BV480 (1:100, clone WM59) (all BD Biosciences); anti-CD3–PerCP-Vio700 (1:50, clone REA613) (Miltenyi Biotec); anti-CD45RA–FITC (1:50, clone HI100), anti-CD14-PE–Cy5.5 (1:200, clone TuK4), anti-CCR7-PE–Cy7 (1:50, clone 3D12), fixable viability dye eFluor780 (all eBioscience); anti-CD25–BV711 (1:25, clone BC96), anti-HLA-DR–BV570 (1:40, clone L243), anti-CD127–BV421 (1:25, clone A019D5), and anti-CCR4–PE/Dazzle 594 (1:100, clone L291H4) (all BioLegend).

Samples were stained for 60 min at 4 °C, washed twice in PBS/1% FBS (Tico Europe), and then fixed and permeabilized with Foxp3 Transcription Factor Staining Buffer Set (eBioscience), according to manufacturer’s instructions. Cells were stained overnight at 4 °C with anti-Ki67–BUV615-P, anti-CTLA-4–PE-Cy5, anti-RORγt–PE (BD Biosciences), and anti-FOXP3–AF647 (BioLegend) anti-human intracellular antibody. Samples were acquired on a Symphony flow cytometer (BD Biosciences).

MM1 dataset, mouse splenocytes

Splenocytes from C57Bl/6 mice were disrupted with glass slides, filtered through 100 μm mesh, and red blood cells lysed. Cells were fixed and permeabilized with Foxp3 transcription factor staining buffer set (eBioscience) according to the manufacturer’s instructions, and stained overnight at 4 °C with Fixable Viability Dye eFluor780 (eBioscience) or the following antibodies: anti-CD4–BV421 (1:200, clone GK1.5), anti-CD24–BV510 (1:400, clone M1/69), anti-CD3–BV570 (1:250, clone 145-2C11), anti-CD4–BV605 (1:200, clone RM4-5), anti-CD3–BV650 (1:400, clone 145-2C11), anti-CD4–BV711 (1:200, clone GK1.5), anti-CD4–BV785 (1:200, clone GK1.5), anti-CD3–AF488 (1:1000, clone 145-2C11)/anti-CD4–AF488 (1:200, clone RM4-5)/anti-TCRβ–AF488 (1:2000, clone H57-597), anti-CD4–PerCP-Cy5.5 (1:200, clone RM4-5), anti-CD4–PE-594 (1:200, clone RM4-5), anti-CD8–PE-Cy7 (1:2000, clone 53-6.7), anti-MHC-II–AF700 (1:1000, clone M5/114.15.2) (all Biolegend), anti-CD19–BV750 (1:500, clone 1D3), anti-CD3–BB630-P (1:1000, clone 145-2C11)/anti-Thy1.2–BB630-P (1:4000, clone 53-2.1), anti-CD45.2–BB660-P2 (1:1000, clone 104)/anti-CD3–BB660-P2 (1:1000, clone 145-2C11), anti-TCRβ–BB790-P (1:2000, clone H57-597), anti-CD4–BUV395 (1:200, clone GK1.5), anti-IgD–BUV496 (1:2000, clone 11-26c.2a), anti-CD3–BUV563 (1:400, clone 145-2C11), anti-CD3–BUV615-P (1:400, clone 145-2C11), anti-CD19–BUV661 (1:250, clone 1D3), anti-CD21–BUV737 (1:500, clone 7G6), anti-CD8–BUV805 (1:250, clone 53-6.7) (all BD Biosciences), anti-CD4–PE (1:500, clone RM4-5)/anti-CD3–PE (1:2000, clone 145-2C11)/anti-CD8–PE (1:500, clone 53-6.7), anti-IgM–PE-Cy5 (1:2000, clone Il/41), anti-CD3–PE-Cy5.5 (1:8000, clone 145-2C11) or anti-CD4–APC (1:1000, clone RM4-5) (all eBioscience). For some fluorophores, multiple antibodies were used in the same compensation control, which is indicated by slashes. Samples were acquired on a Symphony flow cytometer (BD Biosciences).

MM2 dataset, mouse splenocytes

Splenocytes from C57Bl/6 mice were disrupted with glass slides, filtered through 100 μm mesh, and red blood cells lysed. Cells were stained with Fixable Viability Dye eFluor780 (eBioscience), fixed and permeabilized with Foxp3 transcription factor staining buffer set (eBioscience) according to the manufacturer’s instructions, and stained overnight at 4 °C with the following antibodies: anti-CD4–BV421 (1:2000, clone N418), anti-CD24–BV510 (1:2000, clone M1/69), anti-Ly6G–BV570 (1:2000, clone 1A8), anti-XCR1–BV650 (1:2500, clone ZET), anti-CD19–BV785 (1:400, clone 1D3), anti-CD3–AF488 (1:1000, clone 145-2C11), anti-PDCA-1–PerCP-Cy5.5 (1:1000, clone 927), anti-CD23–PE (1:5000, clone B3B4), anti-CD64–PE-594 (1:500, clone X54-5/7.1), anti-CD172a–PE-Cy7 (1:5000, clone P84), anti-CD45–APC (1:10,000, clone 30-F11), anti-MHCII–AF700 (1:2000, clone M5/114.15.2) (all Biolegend), anti-IgE–BV605 (1:5000, clone R35-72), anti-CD93–BV711 (1:2000, clone AA4.1), anti-CD11b–BV750 (1:2000, clone M1/70), anti-CD80–BB630-P (1:2000, clone 16-10A1), anti-CD95–BB660-P2 (1:10,000, clone Jo2), anti-TCRβ–BB790-P (1:2000, clone H57-597), anti-CD103–BUV395 (1:1000, clone M290), anti-IgD–BUV496 (1:2000, clone 11-26c.2a), anti-Ly6C–BUV563 (1:500, clone AL-21), anti-Siglec F–BUV615-P (1:1000, clone E50-2440), anti-c-Kit–BUV661 (1:5000, clone 2B8), anti-CD21/35–BUV737 (1:5000, clone 7G6), anti-CD8a–BUV805 (1:500, clone 53-6.7) (all BD Biosciences), anti-IgM–PE-Cy5 (1:1000, clone Il/41) and anti-NK1.1–PECy5.5 (1:2000, clone PK136) (eBioscience). Compensation controls were stained as described in the MM1 dataset. Samples were acquired on a Symphony flow cytometer (BD Biosciences).

MM3 dataset, mouse splenocytes

Splenocytes from C57Bl/6 mice were disrupted with glass slides, filtered through 100 μm mesh, and red blood cells lysed. Cells were stained with Fixable Viability Dye eFluor780 (eBioscience), anti-CD90.2–BV510 (1:250, clone 53-2.1), anti-CD25–BV650 (1:200, clone PC61), anti-CD45–BUV395 (1:500, clone 30-F11) (all Biolegend), anti-CD127–PE (1:100, clone A7R34) and anti-B220–PE-Cy5 (1:200, clone RA3-6B2) (all eBioscience). Cells were fixed and permeabilized with Foxp3 transcription factor staining buffer set (eBioscience) according to the manufacturer’s instructions, and stained overnight at 4 °C with the following antibodies: anti-T-bet–BV421 (1:200, clone 4B10), anti-CD8–BV785 (1:2000, clone 53-6.7), anti-NKp46–FITC (1:500, clone 29A1.4), anti-NK1.1–PE-Cy5.5 (1:2500, clone PK136), anti-MHCII–AF700 (1:2000, clone M5/114.15.2) (all Biolegend), anti-CD11b–eFluor450 (1:1000, clone M1/70), anti-GATA3–PE-Cy7 (1:100, clone L50-823), anti-CD3–biotin (1:1000, clone 145-2C11), anti-RORt–APC (1:500, clone AFKJS-9) (all eBioscience), anti-TCRβ–BB790-P (1:4000, clone H57-597), anti-CD4–BUV496 (1:500, clone GK1.5), and anti-CD19–BUV661 (1:2000, clone 1D3) (all BD Biosciences). Antibodies used for compensation controls were anti-CD25–BV421 (1:200, clone PC61), anti-CD44–BV510 (1:200, clone IM7), anti-CD3–BV650 (1:200, clone 17A2), anti-CD8–BV785 (1:2000, clone 53-6.7), anti-NK1.1–PE-Cy5.5 (1:2500, clone PK136), anti-MHCII–AF700 (1:2000, clone M5/114.15.2) (all Biolegend), anti-CD11b–eFluor450 (1:1000, clone M1/70), anti-TCRβ–FITC (1:500, clone H57-597), anti-B220–PE-Cy5 (1:200, clone RA3-6B2), anti-CD23–PE-Cy7 (1:500, clone B3B4), anti-CD8–biotin (1:200, clone 53-6.7), anti-Foxp3–APC (1:200, clone FJK-16s), anti-CD69–PE (1:200, clone H1.2F3) (all eBioscience), anti-TCRβ–BB790-P (1:4000, clone H57-597), anti-CD103–BUV395 (1:500, clone M290), anti-CD4–BUV496 (1:200, clone GK1.5), and anti-CD19–BUV661 (1:2000, clone 1D3) (all BD Biosciences). Streptavidin AF350 (1:200, Invitrogen) was used to identify biotinylated antibody. Samples were acquired on a Yeti/ZE5 flow cytometer (Propel Labs/BioRad).

MM4 dataset, mouse microglia

MHCII knockout mice43 were used on the B6 background. Leukocytes and microglia were extracted from mouse brains by chopping with a razor blade, digested in 0.4 mg/ml collagenase D (Sigma-Aldrich), and separated over 40% Percoll (GE Healthcare). Microglia were stained with anti-MHCII–FITC (1:200, clone M5/114.15.2, eBioscience), anti-CD11b–PE-Cy7 (1:500, clone M1/70, eBioscience), anti-CD45–APC (1:1000, clone 30-F11, eBioscience), anti-CD4–PE-Dazzle594 (1:500, clone GK1.5, BioLegend), and fixable viability dye eFluor780 (eBioscience). Samples were acquired on an Aurora spectral cytometer (Cytek).

MM5 dataset, mouse splenocytes

Foxp3DTR-GFP mice44 were used on the B6 background. Splenocytes were disrupted with glass slides, filtered through 100 μm mesh, and red blood cells lysed. Splenocytes anti-CD11b–PE-Cy7 (1:2000, clone M1/70, eBioscience), anti-CD45–APC (1:1000, clone 30-F11, eBioscience), anti-CD4–PE-Dazzle594 (1:500, clone GK1.5, BioLegend), and fixable viability dye eFluor780 (1:4000, eBioscience). Samples were acquired on an Aurora spectral cytometer (Cytek).

General implementation details of AutoSpill

AutoSpill was implemented in R v.3.6.3, using the packages flow core v.1.52.1, flowWorkspace v.3.34.1, ggplot2 v.3.3.2, moments v.0.14, and RColorBrewer v.1.1-2. Further details on packages specific to particular steps of the algorithm are listed below.

Initial gating

The initial gate was calculated independently for each control, over the 2d-density of events on forward and side scatter (FSC-A and SSC-A parameters). To robustly detect the population of interest, two tessellations were successively carried out to isolate the desired density peak. First, data were trimmed on extreme values (1% and 99%). Then, maxima were located numerically by a moving average (window size 3) on a soft estimation of the 2d-density (bandwidth factor 3). Maxima were used to generating non-overlapping tiles covering the entire 2d dataset (tessellation). The first tessellation was carried out on these density maxima, and the tile corresponding to the highest maximum was selected, ignoring peaks with lower values of both FSC-A and SSC-A (<5% of range). A rectangular region in the FSC-A/SSC-A-plane was chosen by using the median and 3 × the mean absolute deviation of the events contained in the selected tile. A second, finer 2d-density estimation (bandwidth factor 2) was obtained on this region, followed again by numerical detection of maxima (window size 2) and tessellation by the maxima. A final 2d-density estimation (bandwidth factor 1) was obtained on the tile containing the highest maximum, with the gate being defined as the convex hull enclosing the points that belonged to this tile and had a density larger than a threshold (33% of range).

Tessellations were carried out with package deliver v.0.1-28, density estimations with packages MASS v.7.3-51.6, surface interpolations with package fields v.10.3, and spatial operations with packages sp v.1.4-2 and tripack v.1.3-9.

Robust linear models for estimation of spillover coefficients

The linearity of the quantum mechanical nature of photons implies that the ratio between the average fluorescence level (that is, the average number of photons) detected in any two detectors and from any dye is equal to the ratio between the corresponding values of the emission spectrum of the dye, regardless of the level of fluorescence. As the value of the spillover coefficient for the primary channel (the channel assigned to the dye in the single-color control, in classical systems) is usually normalized to one, the spillover coefficient of every secondary channel is equal to the fluorescence ratio above. This implies that each spillover coefficient can be directly read from the slope of a linear regression considering the fluorescence in the primary channel as the independent variable and the fluorescence in the secondary channel as the dependent variable (that is, with x and y swapped for the usual representation of single-color controls when compensating). Thus, the absence of spillover corresponds to a zero slope in this regression, that is, to the vertical direction in the usual plot where the primary channel is displayed in the y-axis. To protect the algorithm against distortions in the data, especially those coming from autofluorescence issues, robust linear regression was used, giving lower weights to events farther away from the estimated regression line. Robust linear models (motivated by the heteroscedastic data with outliers) were implemented with the package MASS v.7.3-51.6, with default parameters, i.e. M-estimation with Huber weighting and the parameter k = 1.345.

Refinement of spillover matrix

After the first iteration of the algorithm, applying on compensated data the same kind of calculation used for the spillover coefficients, on channels in classical systems or on dyes in spectral systems, would produce zero values with perfect compensation, corresponding to perfectly vertical compensation plots. Otherwise, errors in compensation would yield non-zero values reflecting residual spillover. Overcompensated data would amount to excessively negative values in the secondary channel/dye, corresponding to a negative slope. Similarly, undercompensation would produce excessively positive values in the secondary channel/dye, corresponding to a positive slope.

Observed errors in compensation arise from errors in the estimation of the spillover coefficients. Crucially, it can be proved that, for the average event at any level of fluorescence, the error matrix T in the calculation of the spillover matrix S can be calculated from the observed compensation errors E as

$${\bf{T}}=-{\bf{E}}{\bf{U}}\ ,$$
(1)

U = S + T is the (erroneous) spillover matrix used to compensate the data (see below).

By successively applying Eq. (1), that is, by iteratively refining the spillover matrix and recalculating the compensation, errors in the spillover matrix and errors in compensation can be reduced to a negligible magnitude. The algorithm starts working in linear scale, and switches to bi-exponential scale when the maximum compensation error across all single-color controls is less than a threshold fixed a priori (10−2). To be used in Eq. (1), compensation errors obtained in the bi-exponential scale are transformed back to a linear scale, by using the two points in the regression line with extreme values in the primary channel. Iterations stop near the convergence of the algorithm when the maximum compensation error across all single-color controls is less than a threshold of 10−4.

While effective in most cases, this strategy for reducing compensation error can become compromised when using controls with low fluorescence levels in the primary channel or other fluorescence artifacts. In these situations, iterations can give rise to oscillations in the observed compensation errors before reaching convergence. To deal with these extreme cases, oscillations are detected by a moving average (size 10, initial value 1) of the decrease in the standard deviation of spillover errors. When this moving average gets below a threshold of 10−6, a fraction (10%) of the update to the spillover matrix is applied in Eq. (1), slowing down convergence and further decreasing compensation error.

Spillover error

In a flow cytometry system with c channels, let us consider the spillover matrix for a set of d single-color controls, that is for d dyes, with d ≤ c. We concentrate on the dye i = 1…d during the following argument.

For any event in the flow cytometer, we have the following two-row vectors: the true event data x, with length d, and the observed event data y, with length c. On average for any level of fluorescence, true and observed events are related linearly through the d × c spillover matrix S, according to

$${\bf{x}}{\bf{S}}={\bf{y}}.$$
(2)

Classical flow cytometry systems have c = d, and compensation is usually achieved by inverting the spillover matrix S and multiplying by the observed data y. Spectral systems feature c > d, and compensation is usually called unmixing and is not unequivocally defined, because Eq. (2) produces an overspecified system of equations. In the following, and for simplicity, we refer to unmixing in spectral systems also as compensation.

Independently of the compensation method used, when the spillover matrix S is estimated as U = S + T, thus with some error T, it unavoidably gives rise to incorrectly compensated data x + p, which verifies, on average,

$$({\bf{x}}+{\bf{p}})({\bf{S}}+{\bf{T}})={\bf{y}}.$$
(3)

Therefore,

$${\bf{x}}{\bf{T}}=-{\bf{p}}{\bf{U}}.$$
(4)

The vectors x and p, and the matrices S, T, and U, have the following properties:

  • Because x represents the true value of events in the single-color control for dye i, then xi > 0 and xj = 0, for all j ≠ i.

  • The ith row of the spillover matrix S is normalized with 1 = Sir ≥ Sis ≥ 0, for some r = 1…c and every s ≠ r.

  • The row normalization of S implies that the true value of the dye in the control, xi, can always be obtained from the observed value yr, as Eq. (2) implies yr = xiSir = xi. Therefore, pi = 0, irrespective of errors in the estimation of the spillover matrix.

  • Also because of the row normalization of the spillover matrix, the estimation of the spillover coefficient Sir = 1 will always be exact, i.e. Uir = 1 and Tir = 0, irrespective of errors in the estimation of the spillover matrix.

Let us consider now the LHS of Eq. (4), i.e. the row vector xT. Its sth coefficient, for any s = 1…c, equals

$${({\bf{x}}{\bf{T}})}_{s}=\mathop{\sum }\limits_{j=1}^{d}{x}_{j}{T}_{js}={x}_{i}{T}_{is}.$$
(5)

Note that (xT)r = 0.

Let us consider the RHS of Eq. (4), i.e. the row vector−pU. Its sth coefficient, for any s = 1…c, equals

$${(-{\bf{p}}{\bf{U}})}_{s}=-\mathop{\sum }\limits_{j=1}^{d}{p}_{j}{U}_{js}.$$
(6)

Note that the summation term piUis = 0.

Equations (46) imply that, for any s = 1…c,

$${T}_{is}=-\mathop{\sum }\limits_{j=1}^{d}\frac{{p}_{j}}{{x}_{i}}{U}_{js}.$$
(7)

The ratio pj/xi can be considered as the compensation error for the average event, corresponding to a spurious signal assigned to dye j, caused by incorrectly compensated spillover from dye i. Equation (3) implies that the ratio pj/xi is invariant w.r.t. the level of fluorescence, and thus it can be estimated by regressing pj vs. xi.

Let us define the compensation error matrix E as the d × d matrix with coefficients

$${E}_{ij}=\frac{{p}_{j}}{{x}_{i}}.$$
(8)

Note that Eii = 0. We can then rewrite Eq. (7) as

$${T}_{is}=-\mathop{\sum }\limits_{j=1}^{d}{E}_{ij}{U}_{js}=-{\bf{E}}(i,* )\ {\bf{U}}(* ,s),$$
(9)

for any s = 1…c.

In summary, Eq. (9) allows to calculate the ith row of the spillover error matrix T. By repeating the same argument for every dye, we can obtain all the rows i = 1…d, and thus the complete matrix as

$${\bf{T}}=-{\bf{E}}{\bf{U}}.$$
(10)

Box 1

Linear models for estimation of SSM

Successful compensation equilibrates around zero the fluorescence levels in all secondary channels, but with the cost of accentuating undesirable variance or spread in those channels. Again for quantum mechanical reasons, the variance in fluorescence for any (compensated or uncompensated) channel/dye grows linearly with the fluorescence level, and therefore the coefficients of the SSM can be estimated with linear regression.

We start with the formula for an SSM coefficient \({\mathrm{{S{S}}}}_{C}^{P}\), which characterizes the incremental standard deviation induced in parameter C by the spillover from parameter P15,

$${\mathrm{{S{S}}}}_{C}^{P}=\frac{\sqrt{{\sigma }_{{\rm{positive}}}^{2}-{\sigma }_{{\rm{negative}}}^{2}}}{\sqrt{{F}_{{\rm{positive}}}-{F}_{{\rm{negative}}}}}\ ,$$
(11)

where σpositive and σnegative are the standard deviations in C-fluorescence in positive and negative populations, respectively, and FpositiveFnegative is the difference in P-fluorescence intensity between them. While the traditional algorithm estimates the above quantities using medians and robust standard deviations of fluorescence in the positive and negative populations, we will, for the sake of linear regression, let our negative be the theoretical quantity when P-fluorescence (F) is equal to zero, while the standard deviation is an unknown quantity, which we call σ0. This assumption, introduced for practical computation, excludes the quadratic effect that σ0 imparts. The effect of this exclusion is negligible, as: (i) σ0 (characterization of the cytometer’s machine noise) is guaranteed to be small when compared to the standard deviation introduced by the Poisson process of counting photons (otherwise the cytometer cannot generate meaningful data), (ii) compensation controls used during SSM calculation include negative populations that reside close to zero, and (iii) the result of a small σ0 and presence of a population near-zero dramatically reduces the impact of the σ0 quadratic effect on the model because they guarantee that the data reside on a near-linear region of a parabola. This gives us the following equation relating F to σ, which is suitable for estimating σ0 by linear regression:

$$\sigma =\sqrt{F}\ \beta +{\sigma }_{0}\ .$$
(12)

Notice that the slope β is not equal to the spillover spreading coefficient \({\mathrm{{S{S}}}}_{C}^{P}\), except in the unique case where σ0 equals zero. We thus proceed with the estimation of σ0 as the first step of AutoSpread.

To supply data for the regression, we partition the events of the single-color control for parameter P by quantile. For controls with a large number of events, we use 256 quantiles, but we allow as few as 8 to ensure enough events in each quantile to estimate standard deviation reliably. For each other parameter C, we calculate in each quantile the robust standard deviation of fluorescence (the 84th percentile minus the median) as the estimate of σ and the median fluorescence as the estimate of F. The F values may be negative and/or close to zero, so they are passed through a square-root-like transform defined by \({f}_{\sqrt{}}(x)={\rm{sign}}(x)\ (\sqrt{| x| +1}-1)\) prior to regression, instead of the simple square root function. The resulting regression provides an estimate of σ0.

Using the estimate of σ0, AutoSpread calculates for each quantile the estimate of \(\sigma ^{\prime} \), defined by \(\sigma ^{\prime} ={f}_{\sqrt{}}({\sigma }^{2}-{\sigma }_{0}^{2})\), and these adjusted standard deviation estimates provide the data for the second regression, \(\sigma ^{\prime} =\sqrt{F}\ {\mathrm{{S{S}}}}_{C}^{P}\). This regression is calculated without an intercept term because the adjustment of σ0 forces it to zero.

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.