Spatial-linked alignment tool (SLAT) for aligning heterogenous slices

Xia, Chen-Rui; Cao, Zhi-Jie; Tu, Xin-Ming; Gao, Ge

doi:10.1038/s41467-023-43105-5

Article
Open access
Published: 09 November 2023

Chen-Rui Xia^1,2^na1,
Zhi-Jie Cao ORCID: orcid.org/0000-0002-0026-671X^1,2^na1,
Xin-Ming Tu¹^nAff3 &
…
Ge Gao ORCID: orcid.org/0000-0001-6470-8815^1,2

Nature Communications volume 14, Article number: 7236 (2023)

Abstract

Spatially resolved omics technologies reveal the spatial organization of cells in various biological systems. Here we propose SLAT (Spatially-Linked Alignment Tool), a graph-based algorithm for efficient and effective alignment of spatial slices. Adopting a graph adversarial matching strategy, SLAT is the first algorithm capable of aligning heterogeneous spatial data across distinct technologies and modalities. Systematic benchmarks demonstrate SLAT’s superior precision, robustness, and speed over existing state-of-the-art methods. Applications to multiple real-world datasets further show SLAT’s utility in enhancing cell-typing resolution, integrating multiple modalities for regulatory inference, and mapping fine-scale spatial-temporal changes during development. The full SLAT package is available at https://github.com/gao-lab/SLAT.

Introduction

Recently emerging spatial omics technologies enable profiling the location, intercommunication, and functional cooperation of native cells through fluorescence in situ hybridization (seqFISH¹, MERFISH², seqFISH+³ and Xenium⁴) and spatial barcoding (10× Visium⁵, HDST⁶, Slide-seqV2⁷, Stereo-seq⁸ and spatial-ATAC-seq⁹) from multiple tissue “slices,” revealing tissue structure heterogeneity and shedding light on the underlying physiological and pathological mechanisms^8,10.

Properly aligning cells that share common molecular identity (e.g., cell type) and spatial context across multiple slices, especially those generated from distinct sources, is critical for their follow-up analysis. For example, inter-technology alignment effectively bridges technologies complementary in spatial resolution and omics coverage⁴, while aligning various time points during spatially dynamic processes like embryogenesis helps identify key spatial-temporal changes and their molecular underpinnings. However, current spatial alignment algorithms^11,12,13 are mostly designed for homogeneous alignments (e.g., three-dimensional reconstruction from consecutive slices^14,15), and can hardly handle heterogeneous slices, which often involve complex non-rigid deformations, uneven spatial resolutions, and complex batch effects.

Here, we introduce SLAT (Spatially-Linked Alignment Tool), a unified framework for aligning both homogeneous and heterogeneous single-cell spatial datasets. By modeling the intercellular relationship as a spatial graph, SLAT adopts graph neural networks and adversarial matching for robustly aligning spatial slices. In addition to its superior performance revealed by systematic benchmarks, as the first algorithm capable of aligning heterogeneous spatial data, SLAT introduces a wide range of application scenarios including alignment across distinct technologies and experimental conditions. SLAT is publicly accessible at https://github.com/gao-lab/SLAT and will be continuously updated as spatial omics technologies evolve.

Results

Align heterogenous spatial omics data via graph adversarial matching

By modeling the spatial topology per slice as a spatial graph where each cell is connected to its nearest neighbors by edges, we reformulate the slice-alignment task as a graph-matching problem. To correct potential cross-dataset batch effects, SLAT employs a Singular Value Decomposition (SVD)-based strategy to project omics profiles of the cells into a shared low-dimensional space (“Methods”); the resulting embeddings in turn serve as node features of the spatial graphs.

Multilayer lightweight graph convolutional networks are incorporated to propagate and aggregate information between cells and their neighbors via stepwise concatenations, generating a holistic representation with information at multiple scales from individual cells to local niches as well as global positions. Then, SLAT solves a minimum-cost bipartite matching problem between the spatial graphs through a dedicated adversarial component¹⁶ to align cells from different slices (Fig. 1, “Methods”).

**Fig. 1: Architecture of SLAT framework.**

Notably, apart from congruent regions that can be well aligned across slices, heterogeneous alignment often involves distinct regions that reflect biologically relevant spatial alterations. To avoid artificially aligning such regions (i.e., over-alignment), SLAT adopts an adaptive clipping strategy during adversarial matching to retain only the closest pairs between slices in terms of their cosine similarity in the embedding space, taken as reliable anchors for guiding the whole alignment procedure (Supplementary Fig. 1). In particular, SLAT utilizes an empirical probabilistic matching strategy and returns matches with high confidence only (default p value threshold = 0.05, see “Methods”).
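As an illustration of retaining only the closest pairs by cosine similarity, the following minimal sketch matches each cell in one slice to its most similar cell in the other and keeps only pairs above a cutoff. It is not the SLAT implementation: the function name `cosine_match` and the fixed cutoff `min_similarity` are hypothetical, and the fixed cutoff merely stands in for the empirical probabilistic threshold described in “Methods”.

```python
import numpy as np

def cosine_match(z1, z2, min_similarity):
    """Match every row of z2 to its most cosine-similar row of z1.

    z1, z2: (N1, P) and (N2, P) embedding matrices.
    min_similarity: hypothetical fixed cutoff (an assumption of this sketch).
    """
    z1n = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2n = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    sim = z2n @ z1n.T                      # (N2, N1) cosine similarities
    best = sim.argmax(axis=1)              # best partner in slice 1 per slice-2 cell
    score = sim[np.arange(sim.shape[0]), best]
    keep = score >= min_similarity         # discard low-confidence pairs
    return best, score, keep

# Toy embeddings: the third cell of slice 2 has no good partner in slice 1.
z1 = np.array([[1.0, 0.0], [0.0, 1.0]])
z2 = np.array([[2.0, 0.1], [0.1, 3.0], [-1.0, -1.0]])
best, score, keep = cosine_match(z1, z2, min_similarity=0.9)
```

In this toy case the first two cells of the second slice are matched with high similarity, while the third is left unmatched instead of being forced onto a dissimilar partner.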

Systematic benchmarks suggest SLAT is accurate, robust, and fast

To evaluate SLAT’s performance over existing algorithms which are designed for homogeneous alignment^12,14, systematic benchmarks were conducted based on consecutive, homogeneous slices from the same tissue generated by three representative technologies spanning a wide range of throughput, resolution, and technological routes: 10× Visium⁵, MERFISH², and Stereo-seq⁸ (Fig. 2a, Supplementary Table 1, and Supplementary Fig. 2).

**Fig. 2: Evaluation on homogeneous spatial alignment.**

We first ran synthetic tests where a spatial slice and its rotated and noise-perturbed copy were fed into the alignment algorithms (“Methods”). We examined how the alignments could be used to correct the artificial rotation as well as how they compare with the known ground truth matching (Supplementary Fig. 3). Both SLAT and PASTE achieved high accuracy, while the performance of spatially unaware algorithms Seurat and Harmony degraded substantially with increasing levels of noise (Supplementary Fig. 3). To our surprise, STAGATE produced low accuracy across all noise levels, which could be attributed to its lack of noise reduction and batch correction components.

We then tried to align pairs of distinct slices in the same datasets (Supplementary Fig. 4). Due to the lack of ground truth matchings, we quantify alignment accuracy by the fraction of cells correctly matched in expert-curated cell types and spatial regions, respectively (Fig. 2b, c, also see Supplementary Fig. 5), and the joint accuracy is defined as the fraction of cells where both cell types and spatial regions are correctly matched (Fig. 2d).
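The three accuracy measures can be sketched as follows. This is an illustrative computation on toy labels, not the benchmark code; the function name `alignment_accuracy` and the `(n_pairs, 2)` layout of `matches` are assumptions made for this example.

```python
import numpy as np

def alignment_accuracy(type_a, type_b, region_a, region_b, matches):
    """Return (cell-type, region, joint) accuracy over matched cell pairs.

    matches: (n_pairs, 2) integer array; column 0 indexes slice A and
    column 1 indexes slice B (a hypothetical layout for this sketch).
    """
    type_a, type_b = np.asarray(type_a), np.asarray(type_b)
    region_a, region_b = np.asarray(region_a), np.asarray(region_b)
    matches = np.asarray(matches)
    i, j = matches[:, 0], matches[:, 1]
    type_ok = type_a[i] == type_b[j]        # same annotated cell type
    region_ok = region_a[i] == region_b[j]  # same annotated spatial region
    joint_ok = type_ok & region_ok          # both must agree
    return type_ok.mean(), region_ok.mean(), joint_ok.mean()

# Toy annotations for four matched pairs (synthetic labels).
type_a = ["neuron", "glia", "neuron", "glia"]
region_a = ["cortex", "cortex", "thalamus", "thalamus"]
type_b = ["neuron", "glia", "glia", "neuron"]
region_b = ["cortex", "thalamus", "thalamus", "cortex"]
matches = [(0, 0), (1, 1), (2, 2), (3, 3)]

type_acc, region_acc, joint_acc = alignment_accuracy(
    type_a, type_b, region_a, region_b, matches
)
```

Here the cell-type and region accuracies are both 0.5, but the joint accuracy is only 0.25, because only one pair agrees on both annotations.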

Overall, SLAT outperformed PASTE¹⁴ and STAGATE¹² in all three datasets (Fig. 2b–d). Of note, PASTE exhibited particularly low cell type accuracy in the MERFISH dataset (Fig. 2c and Supplementary Fig. 5) where different cell types are spatially interlaced (Supplementary Fig. 6), probably due to its excessive reliance on the spatial distance between cells over molecular features. Consistent with the synthetic test, we found that STAGATE produced suboptimal alignments, while its performance improved for split slices which are free of batch effects (“Methods,” Supplementary Fig. 7). By combining spatial context with transcriptome profiles, SLAT is better equipped to distinguish transcriptionally similar but spatially distinct cell groups. Consistently, we found that SLAT also achieved significantly higher alignment accuracy than the conventional spatially unaware algorithms Seurat¹⁷ and Harmony¹⁸ (Fig. 2c, d). In particular, although these algorithms matched cell types reasonably well, their alignment largely disarranged spatial regions (Fig. 2b and Supplementary Figs. 4 and 5). The comparisons were also consistent in alternative metrics based on micro- and macro-F1 scores (Supplementary Fig. 8), as well as the correction of artificial rotation (Supplementary Fig. 9).

Meanwhile, we noticed a gap between cell type macro- and micro-F1 scores (by ~0.15) in the MERFISH and Stereo-seq datasets for all methods (Supplementary Fig. 8), which is largely due to mismatches in rare cell types (Supplementary Fig. 10). For example, “OD mature 3” (6 cells) and “OD mature 4” (10 cells) in the MERFISH dataset tend to be aligned to “OD mature 2”, a transcriptionally similar cell type, by SLAT. Similar matchings were also produced by Seurat and Harmony, while PASTE and STAGATE incorrectly aligned most “OD mature 3” and “OD mature 4” cells with “Inhibitory,” “Endothelial,” or “Ambiguous.” Transcriptionally more distinct rare cell types such as “Pericytes” and “Microglia” were well-aligned by SLAT, Harmony and Seurat, but misaligned by spatially aware algorithms PASTE and STAGATE, suggesting that SLAT’s combination of graph convolution and adversarial learning could integrate spatial context and molecular features more effectively. On the other hand, mismatches in the Stereo-seq dataset could mostly be attributed to inconsistent taxonomies in the source annotation, e.g., cells annotated as various organs were partially aligned with “connective tissue,” which does exist in most organs.
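To see why mismatches in rare cell types separate the two scores, consider the following sketch with synthetic labels (the class names and counts are made up for illustration and are not taken from the benchmark). Micro-F1 pools all cells, so a handful of errors barely moves it, whereas macro-F1 averages per-class scores, so a fully mismatched rare class pulls it down sharply.

```python
import numpy as np

def micro_macro_f1(y_true, y_pred):
    """Compute micro- and macro-averaged F1 from two label arrays."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    labels = np.unique(np.concatenate([y_true, y_pred]))
    tp = np.array([np.sum((y_true == c) & (y_pred == c)) for c in labels])
    fp = np.array([np.sum((y_true != c) & (y_pred == c)) for c in labels])
    fn = np.array([np.sum((y_true == c) & (y_pred != c)) for c in labels])
    denom = 2 * tp + fp + fn
    # Per-class F1; classes with an empty denominator score 0.
    per_class = np.where(denom > 0, 2 * tp / np.maximum(denom, 1), 0.0)
    micro = 2 * tp.sum() / (2 * tp.sum() + fp.sum() + fn.sum())
    macro = per_class.mean()
    return micro, macro

# Synthetic example: 94 cells of a common type are matched correctly,
# while all 6 cells of a rare type are matched to the common type.
y_true = ["common"] * 94 + ["rare"] * 6
y_pred = ["common"] * 100
micro, macro = micro_macro_f1(y_true, y_pred)
```

With these toy labels, micro-F1 is 0.94 while macro-F1 drops to about 0.48, since the rare class contributes an F1 of zero to the average.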

SLAT’s lightweight graph convolutional component effectively improves model robustness. Evaluation with subsampled data of various sizes (spanning from 200 to 102,400) showed that SLAT consistently provides the best results, even with as few as 200 cells (Supplementary Fig. 11, “Methods”). Further inspection confirmed that the performance of SLAT remains highly robust to a wide range of hyperparameter settings and against random corruption of the spatial graph (Supplementary Fig. 12a, b, “Methods”).

As technologies continue to evolve, the throughput of spatial single-cell experiments is constantly increasing¹⁹. Implemented as a neural network and optimized for parallelism, SLAT is highly scalable: benchmarks showed that SLAT is consistently the fastest method across all data sizes and aligns slices each with over 100,000 cells in just 3 min (Fig. 2e). We also noticed that PASTE fails to run when the cell number exceeds 25,600, partly due to its memory-intensive implementation of optimal transport.

Matching heterogeneous datasets across distinct technologies and modalities

Spatial alignment across multiple technologies and modalities amalgamates complementary information towards a comprehensive in situ view of cellular states²⁰. Benefiting from its unique design, SLAT provides reliable alignment across heterogeneous datasets, which is largely beyond the capability of current alignment algorithms.

We first used SLAT to align spatial transcriptomic datasets generated by two distinct technologies with varying scale and detectability^8,21. Specifically, the seqFISH dataset contains over 10,000 cells with only 350 genes detected in total^3,21, whereas the Stereo-seq dataset contains about 5,000 cells covering over 20,000 genes⁸. Meanwhile, the cell types in the two datasets were annotated at different resolutions: the seqFISH dataset has finer-grained annotation (21 cell types, Supplementary Fig. 13a) than the Stereo-seq dataset (11 cell types, Supplementary Fig. 13b). The high-quality alignment produced by SLAT (Fig. 3a) enables an accurate label transfer for cell typing improvement (Supplementary Fig. 13c). For example, cells labeled as “Neural crest” in the Stereo-seq dataset were aligned to seqFISH regions with four major cell types (Fig. 3b and Supplementary Fig. 13d), refining cell typing which was further validated by known marker genes (Fig. 3c and Supplementary Fig. 13e). Notably, slices in the two datasets also differ in spatial structure with substantial non-rigid deformations (Supplementary Fig. 13a, b). Such a challenging scenario caused methods other than SLAT to fail (Supplementary Figs. 14 and 15a). Similar cross-scale alignment can also be achieved on Visium and Xenium slices⁴, where SLAT accurately pinpoints a rare group of triple-positive breast tumor cells, while all other methods failed (Supplementary Fig. 16).

**Fig. 3: Spatial alignment across distinct technologies and modalities.**

Cross-modality spatial alignment is more challenging, partly because the disjoint feature spaces invalidate our SVD-based cross-dataset matrix decomposition strategy as well as other canonical batch correction methods. Benefiting from the modular design of SLAT (Fig. 1), we employed the graph-linked multi-modality embedding strategy we proposed before²² to project cells of different modalities into a shared embedding space before feeding them into the lightweight graph convolutional networks (LGCNs; “Methods”). With this extension, SLAT successfully produced a spatial alignment across RNA (Stereo-seq) and ATAC (spatial-ATAC-seq) slices. While the different modalities featured drastically different spatial resolutions (0.2 μm for Stereo-seq but 20 μm for spatial-ATAC-seq), SLAT managed to align them well (Fig. 3d and Supplementary Fig. 17a). Cell-type labels transferred from Stereo-seq to spatial-ATAC-seq based on the alignment were consistent with anatomical features and with the accessibility of tissue-specific genes (Fig. 3e, f and Supplementary Fig. 17b). We also experimented with other spatial alignment methods by feeding the same multi-modality embeddings as input, but found suboptimal results (Supplementary Figs. 15b and 18). Joint regulatory analysis using the aligned cell pairs and transferred labels effectively identified various key regulators in the heart, such as Jund and Ctnnb1^23,24 (“Methods,” Supplementary Fig. 17c), which could not be identified using the spatially unaware GLUE embeddings (Supplementary Fig. 19).

Mapping fine-scale spatial-temporal transitions by developmental alignment

Embryonic development is a highly dynamic process with extensive spatial-temporal transitions involving the generation, maturation, and functional alteration of organs and tissues at specific timepoints. To probe spatial-temporal dynamics during early development, we used SLAT to align two spatial atlases of mouse embryonic development at E11.5 and E12.5⁸ (Fig. 4a).

**Fig. 4: Developmental alignment of mouse embryo.**

Most cells in the brain, heart, and liver were well aligned with high alignment similarity scores, consistent with their spatial-temporal conservation at the two timepoints (Fig. 4b and Supplementary Fig. 20a–c). In addition, several regions were enriched with less-aligned cells of lower alignment similarity scores, which may be attributed to both biological development (e.g., the newly emerged organs kidney and ovary at region I^8,25,26,27, the rapid enlargement of lung primordium and its displacement from the upper part of the heart to the lower part at region II²⁸ as shown in Supplementary Fig. 20d, and the disappearance of branchial arch at corresponding position of region III in E11.5⁸) and technical variation (e.g., tissue loss at corresponding position of region IV in E11.5).

We next followed up on kidney and ovary, two newly emerging organs at around E11.5–12.5^8,25,26,27. SLAT accurately identified them both as developing from the “Urogenital ridge” at E11.5 (highlighted by the green box in Fig. 4c). Consistent with previous reports, SLAT alignment showed precisely that the ovary develops directly from a single area²⁹ (left panel of Fig. 4d) while the kidney develops from two separate areas corresponding to the mesonephros and metanephros structures in early kidney development^25,30,31 (right panel of Fig. 4d and Supplementary Fig. 20e). In addition, based on the clustering analysis of aligned cells, we further identified a group of rare nephron progenitor cells located in the mesonephros, likely corresponding to an ephemeral mesonephric tubule during mesonephros development³² (Fig. 4e, f). Interestingly, the results show that the mesonephros is spatially adjacent to the ovary, which is consistent with the well-documented developmental cascade whereby a part of the mesonephros develops into the fallopian tube after degeneration (Fig. 4d). Our findings revealed the unique value of heterogeneous spatial alignment for interrogating the spatial-temporal process of organogenesis.

We also attempted the same alignment task with other methods (Supplementary Fig. 21). The spatially unaware algorithms (Seurat, Harmony) simply missed the mesonephros, while the spatially aware algorithms (PASTE, STAGATE) aligned it to incorrect cells. Of interest, SLAT’s unique capability for heterogeneous spatial alignment further enables spatiotemporally progressive alignment across multiple development stages: applying SLAT to mouse embryo slices spanning E9.5–E16.5 effectively recapitulated the development dynamics of multiple organs across all three germ layers⁸ (Supplementary Fig. 22).

Discussion

One of the essential challenges in spatial omics alignment is to appropriately model spatial context. Early methods such as Splotch¹³ and Eggplant³³ model slices as rigid bodies and require manual annotation of landmark spots to guide the alignment. PASTE¹⁴ eliminates the need for landmark annotation by considering gene expression of all spatial spots, but its reliance on exact spatial distance impedes application to heterogeneous alignments involving complex non-rigid structural alterations. By combining spatial graph convolution and adversarial matching, SLAT achieves reliable spatial alignment for both homogeneous and heterogeneous slices in an unsupervised, data-oriented manner.

3D reconstruction from consecutive slices is a common application of spatial alignment. SLAT supports reconstruction from multiple slices via progressive pairwise alignment (e.g., Supplementary Fig. 23 for a consecutive stack of four slices from the same E15.5 mouse embryo). While similar 3D stacking can also be achieved with PASTE¹⁴, SLAT is better equipped to account for non-rigid structural shifting and alteration among slices, enabling adaptive correction for potential deformation artifacts (see Supplementary Fig. 24 for a quantitative assessment).

We noticed that other topology-based metrics have been proposed as performance measurements, such as the edge score, which quantifies preservation of spatial neighbors³⁴ (see Supplementary Fig. 25a and “Methods”). However, we argue that these metrics could be intrinsically biased because they measure only the continuity of matching, which could be particularly problematic in the presence of structural changes (see Supplementary Fig. 25b and “Methods” for a counterexample).

SLAT’s unique ability to conduct heterogeneous spatial alignments promises a wide range of biological applications. In particular, such spatial alignment enables identifying spatially resolved changes such as key alterations of spatial patterns during development. Meanwhile, aligning slices from the same tissue generated by different technologies could enable in silico data enhancement, ultimately combining their complementary advantages in spatial resolution and genomic coverage. In addition, proper cross-modal alignment further sheds light on key regulators and corresponding regulatory circuits.

SLAT is fast. In fact, generating the input cell embeddings is the most time-consuming step, while training and inferring with the SLAT core model takes only about 10 s for 10⁶ cells. Once the input embeddings are ready, aligning millions of cells can be completed in near real-time, enabling an efficient search of a massive database within an affordable timeframe. SLAT’s blazing speed sets the stage for some exciting applications such as inferring spatially dependent causal mechanisms by systematically comparing multiple Perturb-map slices³⁵, or constructing whole organ 3D atlases involving thousands of slices³⁶.

Last but not least, designed as a flexible framework, SLAT can be readily adapted and extended. For instance, additional information such as expert curation may be incorporated into the coordinate matching module to help distinguish symmetric structures and polish the final alignment (“Methods”). Meanwhile, the spatial graph modeling technique employed in SLAT may also be adapted to address other problems, such as comparative alignment across species.

Overall, SLAT provides a unified framework for various spatial integration scenarios. To promote its application by the research community, the SLAT package, along with detailed tutorials and demo cases, is available online at https://github.com/gao-lab/SLAT.

Methods

SLAT framework

Joint modeling of spatial coordinates and omics features

We denote a spatial omics dataset as $\mathscr{D}=\left\{\left(\mathbf{g}^{(i)},\, \mathbf{s}^{(i)}\right),\, i=1,\, 2,\ldots,\, N\right\}$, where $N$ is the number of spots or cells, $G$ is the number of omics features, and $\mathbf{g}^{(i)}\in \mathbb{R}^{G}$ and $\mathbf{s}^{(i)}\in \mathbb{R}^{2}$ are the raw omics features (e.g., genes) and spatial coordinates of cell $i$, respectively. For datasets containing non-identical omics features, we use their overlapping features. For ease of notation, we denote the combination of omics features and spatial coordinates across all cells in a dataset as matrices $\mathbf{G}\in \mathbb{R}^{N\times G}$ and $\mathbf{S}\in \mathbb{R}^{N\times 2}$, respectively. Subscripts such as $\mathbf{G}_{1},\, \mathbf{G}_{2}$ and $\mathbf{S}_{1},\, \mathbf{S}_{2}$ are added to distinguish the two datasets being aligned.

In attempts to correct inter-sample batch effects, we employ an SVD-based cross-dataset matrix decomposition strategy as a preprocessing step. To begin with, we denote the log-normalized and scaled omics matrices of two spatial datasets as $\widetilde{\mathbf{G}}_{1}\in \mathbb{R}^{N_{1}\times G}$ and $\widetilde{\mathbf{G}}_{2}\in \mathbb{R}^{N_{2}\times G}$. We then apply SVD to their matrix product as follows:

$$\widetilde{\mathbf{G}}_{1}\widetilde{\mathbf{G}}_{2}^{\top}=\mathbf{U}\boldsymbol{\Sigma}\mathbf{V}^{\top}$$

(1)

Using the decomposed matrices, we obtain the batch-corrected embeddings of the two datasets as:

$$\begin{array}{c}\mathbf{X}_{1}=\mathbf{U}_{1:M}\boldsymbol{\Sigma}_{1:M}^{\frac{1}{2}}\\ \mathbf{X}_{2}=\mathbf{V}_{1:M}\boldsymbol{\Sigma}_{1:M}^{\frac{1}{2}}\end{array}$$

(2)

where $\mathbf{U}_{1:M}\in \mathbb{R}^{N_{1}\times M}$, $\mathbf{V}_{1:M}\in \mathbb{R}^{N_{2}\times M}$, and $\boldsymbol{\Sigma}_{1:M}\in \mathbb{R}^{M\times M}$ are truncated decompositions corresponding to the $M$ largest singular values.
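Equations (1) and (2) can be sketched with a dense SVD as follows. This is a minimal illustration on random toy matrices, not the packaged implementation; the function name `svd_embed` is hypothetical, and a dense decomposition of the full $N_{1}\times N_{2}$ product is assumed here only because the toy data are small.

```python
import numpy as np

def svd_embed(g1, g2, m):
    """Project two scaled omics matrices into a shared M-dimensional space.

    g1: (N1, G) and g2: (N2, G) matrices over the same G features.
    Returns X1 = U_{1:M} Sigma_{1:M}^{1/2} and X2 = V_{1:M} Sigma_{1:M}^{1/2}.
    """
    u, s, vt = np.linalg.svd(g1 @ g2.T, full_matrices=False)
    sqrt_s = np.sqrt(s[:m])
    x1 = u[:, :m] * sqrt_s      # scale the first M left singular vectors
    x2 = vt[:m].T * sqrt_s      # scale the first M right singular vectors
    return x1, x2

# Toy data standing in for log-normalized, scaled expression matrices.
rng = np.random.default_rng(0)
g1 = rng.standard_normal((30, 8))
g2 = rng.standard_normal((40, 8))
x1, x2 = svd_embed(g1, g2, m=8)
```

Because the product of the two toy matrices has rank at most 8, choosing $M=8$ makes $\mathbf{X}_{1}\mathbf{X}_{2}^{\top}$ reproduce it up to numerical error, which is a convenient sanity check; with a smaller $M$ the embeddings give the best rank-$M$ approximation instead.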

The SVD-based correction performed well, achieving joint accuracy comparable to (and somewhat higher than) that of the state-of-the-art method Harmony^18,37 (Supplementary Fig. 12c). Notably, this preprocessing step is also flexible and allows for modular input (e.g., embedding produced by dedicated algorithms like GLUE²² for cross-modality integration).

We model the spatial information of cells in $\mathscr{D}$ as a spatial graph $\mathscr{G}=(\mathscr{V},\, \mathscr{E},\, \mathbf{X})$, where each node $v_{i}\in \mathscr{V}$ corresponds to a cell with the batch-corrected embedding $\mathbf{x}^{(i)}\in \mathbb{R}^{M}$ as its node attribute, and the edges connect the $K$-nearest neighbors in the spatial space. We also denote the adjacency matrix of the graph as $\mathbf{A}\in \{0,1\}^{N\times N}$, where $\mathbf{A}_{i,j}=1$ if the edge $(v_{i},\, v_{j})\in \mathscr{E}$, and $\mathbf{A}_{i,j}=0$ otherwise. In particular, for cross-technology alignment, different technologies may have distinct spatial resolutions, in which case we can select a different $K$ for each slice according to its spatial resolution. SLAT also supports building the spatial graph by radius, where all cells located within a specific radius are taken as neighbors.
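Both ways of building the spatial graph can be sketched with NumPy as below. The function names are hypothetical, the dense pairwise distance matrix is only practical for small slices, and symmetrizing the $K$-nearest-neighbor graph is an assumption of this sketch (the text does not state how asymmetric neighbor relations are handled).

```python
import numpy as np

def pairwise_distances(coords):
    """Dense Euclidean distances with the diagonal set to infinity."""
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)   # a cell is not its own neighbor
    return dist

def knn_adjacency(coords, k):
    """Binary adjacency linking each cell to its k nearest neighbors."""
    dist = pairwise_distances(coords)
    n = dist.shape[0]
    nbrs = np.argsort(dist, axis=1)[:, :k]
    adj = np.zeros((n, n), dtype=int)
    adj[np.repeat(np.arange(n), k), nbrs.ravel()] = 1
    return np.maximum(adj, adj.T)    # symmetrize (assumption of this sketch)

def radius_adjacency(coords, radius):
    """Binary adjacency linking all cells within the given radius."""
    return (pairwise_distances(coords) <= radius).astype(int)

# Four cells on a line; the last one is far from the others.
coords = [[0.0, 0.0], [1.0, 0.0], [2.5, 0.0], [10.0, 0.0]]
adj_knn = knn_adjacency(coords, k=1)
adj_rad = radius_adjacency(coords, radius=1.5)
```

In this toy layout the $K$-nearest-neighbor graph still links the distant fourth cell to the rest of the slice, whereas the radius graph leaves it isolated, which illustrates how the two constructions differ when spatial density is uneven.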

SLAT formulates the alignment of two spatial datasets $\mathscr{D}_{1}=\left\{\left(\mathbf{x}_{1}^{(i)},\, \mathbf{s}_{1}^{(i)}\right),\, i=1,\, 2,\ldots,\, N_{1}\right\}$ and $\mathscr{D}_{2}=\left\{\left(\mathbf{x}_{2}^{(i)},\, \mathbf{s}_{2}^{(i)}\right),\, i=1,\, 2,\ldots,\, N_{2}\right\}$ as a minimum-cost bipartite matching problem of their corresponding spatial graphs $\mathscr{G}_{1}=(\mathscr{V}_{1},\, \mathscr{E}_{1},\, \mathbf{X}_{1})$ and $\mathscr{G}_{2}=(\mathscr{V}_{2},\, \mathscr{E}_{2},\, \mathbf{X}_{2})$:

$$\min_{\mathscr{M}}\sum_{\left(v_{i},\, v_{j}\right)\in \mathscr{M}}\left\|\mathbf{z}_{1}^{(i)}-\mathbf{z}_{2}^{(j)}\right\|,\quad \mathrm{s.t.}\ \mathscr{M}\subset \mathscr{V}_{1}\times \mathscr{V}_{2},\ \left|\mathscr{M}\right|=m,$$

(3)

where $\mathbf{z}_{1}^{(i)},\, \mathbf{z}_{2}^{(j)}\in \mathbb{R}^{P}$ are node embeddings of $v_{i}$ and $v_{j}$ in $\mathscr{G}_{1}$ and $\mathscr{G}_{2}$, respectively, $\mathscr{M}$ is a set of matches of fixed size $m$, and $P$ is the dimensionality of node embeddings. Similarly, we use matrices $\mathbf{Z}_{1}\in \mathbb{R}^{N_{1}\times P}$ and $\mathbf{Z}_{2}\in \mathbb{R}^{N_{2}\times P}$ to denote the combination of cell embeddings of all cells in the two datasets.

Previous work has demonstrated that the above matching problem is equivalent to minimizing the Wasserstein distance between node embeddings from different graphs¹⁶. SLAT follows the same approach with adaptations for spatial omics data. Below, we explain how node or cell embeddings $\mathbf{Z}_{1},\, \mathbf{Z}_{2}$ can be obtained and optimized for spatial graph alignment.

Construction of holistic cell representations

An accurate alignment of spatial omics datasets should align cells that are similar in both the molecular modality and the spatial context. In particular, the spatial context can involve various resolutions, ranging from microenvironments to global positions within the tissue. Inspired by previous work^16,38,39, we first employ the lightweight graph-convolutional network (LGCN) to derive a holistic cell representation with all such information integrated for each dataset. An LGCN propagates and aggregates information along the spatial graph through stepwise concatenations:

$$\widetilde{\mathbf{X}}=f_{\mathrm{LGCN}}\left(\mathbf{A},\, \mathbf{X}\right)=\mathrm{Concat}\left(\mathbf{X},\, \hat{\mathbf{A}}\mathbf{X},\, \hat{\mathbf{A}}^{2}\mathbf{X},\ldots,\, \hat{\mathbf{A}}^{L}\mathbf{X}\right),$$

(4)

where $\hat{\mathbf{A}}=\widetilde{\mathbf{D}}^{-\frac{1}{2}}\widetilde{\mathbf{A}}\widetilde{\mathbf{D}}^{-\frac{1}{2}}$, $\widetilde{\mathbf{A}}=\mathbf{A}+\mathbf{I}$, $\widetilde{\mathbf{D}}$ is the diagonal degree matrix of $\widetilde{\mathbf{A}}$, and $L$ is the maximal number of steps. The resulting cell representation $\widetilde{\mathbf{X}}\in \mathbb{R}^{N\times (L+1)M}$ is a concatenation of multi-level information. The first $M$ dimensions correspond to no graph propagation, which is simply a copy of the omics data $\mathbf{X}$. The second $M$ dimensions correspond to one-step graph propagation, reflecting the composition of a cell’s immediate neighbors, which form its microenvironment. The information coarsens as the number of steps increases, gradually becoming a representation of rough locations within the tissue. Thus, $\widetilde{\mathbf{X}}$ contains informative features for spatial alignment at multiple levels of spatial context.
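The propagation in Eq. (4) involves no trainable weights, so it can be written directly in NumPy. The sketch below uses a dense adjacency matrix on a three-cell toy graph; the function name `lgcn` is hypothetical, and a sparse matrix would presumably be needed for real slices.

```python
import numpy as np

def lgcn(adj, x, n_steps):
    """Concatenate X, A_hat X, ..., A_hat^L X along the feature axis.

    adj: (N, N) binary adjacency matrix without self-loops.
    x: (N, M) node features; n_steps: the number of propagation steps L.
    """
    adj = np.asarray(adj, dtype=float)
    a_tilde = adj + np.eye(adj.shape[0])            # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a_tilde.sum(axis=1))
    # Symmetric normalization: D^{-1/2} (A + I) D^{-1/2}
    a_hat = d_inv_sqrt[:, None] * a_tilde * d_inv_sqrt[None, :]
    blocks = [np.asarray(x, dtype=float)]
    for _ in range(n_steps):
        blocks.append(a_hat @ blocks[-1])           # one more propagation step
    return np.concatenate(blocks, axis=1)

# Three cells on a path graph, two features each, two propagation steps.
adj = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]])
x = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
h = lgcn(adj, x, n_steps=2)
```

The output has $(L+1)M=6$ columns: the first two are the input features, the next two mix each cell with its immediate neighbors, and the last two aggregate over two-step neighborhoods.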

The cell representation $\widetilde{\mathbf{X}}$ is constructed separately for each dataset by using dataset-specific omics profiles and adjacency matrices:

$$\begin{array}{c}\widetilde{\mathbf{X}}_{1}=f_{\mathrm{LGCN}}\left(\mathbf{A}_{1},\, \mathbf{X}_{1}\right),\\ \widetilde{\mathbf{X}}_{2}=f_{\mathrm{LGCN}}\left(\mathbf{A}_{2},\, \mathbf{X}_{2}\right).\end{array}$$

(5)

Adversarial graph alignment

Based on the holistic cell representations $\widetilde{\mathbf{X}}_{1},\, \widetilde{\mathbf{X}}_{2}$ described above, we use adversarial alignment to learn cell embeddings $\mathbf{Z}_{1},\, \mathbf{Z}_{2}$ that minimize the Wasserstein distance for graph matching¹⁶.

Specifically, we apply a multilayer perceptron denoted as $f_{\mathrm{Z}}$ to mitigate systematic bias in $\widetilde{\mathbf{X}}_{1}$ and $\widetilde{\mathbf{X}}_{2}$ that may arise from differences in omics distribution or spatial topology across datasets:

$$\begin{array}{c}\mathbf{Z}_{1}=f_{\mathrm{Z}}\left(\widetilde{\mathbf{X}}_{1}\right),\\ \mathbf{Z}_{2}=f_{\mathrm{Z}}\left(\widetilde{\mathbf{X}}_{2}\right).\end{array}$$

(6)

We then introduce the Wasserstein discriminator $f_{\mathrm{D}}$, which uses $\mathbf{Z}_{1},\, \mathbf{Z}_{2}$ as input and tries to maximize the following Wasserstein loss $L_{\mathrm{W}}$ to estimate the Wasserstein distance:

$$L_{\mathrm{W}}=\frac{1}{\left|\mathscr{V}_{1}\right|}\sum_{v_{i}\in \mathscr{V}_{1}}f_{\mathrm{D}}\left(\mathbf{z}_{1}^{(i)}\right)-\frac{1}{\left|\mathscr{V}_{2}\right|}\sum_{v_{j}\in \mathscr{V}_{2}}f_{\mathrm{D}}\left(\mathbf{z}_{2}^{(j)}\right),$$

(7)

where ${{{{{{\bf{z}}}}}}}_{1}^{\left(i\right)}$ is a row in ${{{{{{\bf{Z}}}}}}}_{1}$ corresponding to cell ${v}_{i}$, and ${{{{{{\bf{z}}}}}}}_{2}^{\left(j\right)}$ is a row in ${{{{{{\bf{Z}}}}}}}_{2}$ corresponding to cell ${v}_{j}$. The transformation ${f}_{{{{{{\rm{Z}}}}}}}$ can then be adversarially trained to minimize (7), for aligning the distribution of cell embeddings in the two datasets properly. However, different single-cell spatial datasets may contain different cell-type proportions or distinct spatial regions. It is thus unreasonable to assume identical distribution of their cell embeddings as assumed in the standard scheme described above⁴⁰. Inspired by a previous study¹⁶, we use the output of the Wasserstein discriminator ${f}_{D}$ as a dynamic clipping criterium to select $c\times {N}_{1}$ and $c\times {N}_{2}$ cells from the two datasets with minimum Wasserstein distance for adversarial training (Supplementary Fig. 1):

$$\begin{array}{c}{\mathscr{V}}_{1}^{\prime}=\underset{{v}_{i}\in {\mathscr{V}}_{1}^{\prime}\subseteq {\mathscr{V}}_{1}}{{\rm{arg}}\,k\,{\rm{min}}}\,{f}_{\rm{D}}\left({\bf{z}}_{1}^{\left(i\right)}\right),\quad \left|{\mathscr{V}}_{1}^{\prime}\right|=c\times {N}_{1},\\ {\mathscr{V}}_{2}^{\prime}=\underset{{v}_{j}\in {\mathscr{V}}_{2}^{\prime}\subseteq {\mathscr{V}}_{2}}{{\rm{arg}}\,k\,{\rm{max}}}\,{f}_{\rm{D}}\left({\bf{z}}_{2}^{\left(j\right)}\right),\quad \left|{\mathscr{V}}_{2}^{\prime}\right|=c\times {N}_{2},\end{array}$$

(8)

where $c$ is a hyperparameter between 0 and 1. These cells correspond to the most reliable anchors to guide the alignment. The Wasserstein discriminator loss ${L}_{{{{{{\rm{W}}}}}}}$ is then modified accordingly as follows:

$${L}_{\rm{W}}=\frac{1}{\left|{\mathscr{V}}_{1}^{\prime}\right|}\sum\limits_{{v}_{i}\in {\mathscr{V}}_{1}^{\prime}}{f}_{\rm{D}}\left({\bf{z}}_{1}^{\left(i\right)}\right)-\frac{1}{\left|{\mathscr{V}}_{2}^{\prime}\right|}\sum\limits_{{v}_{j}\in {\mathscr{V}}_{2}^{\prime}}{f}_{\rm{D}}\left({\bf{z}}_{2}^{\left(j\right)}\right).$$

(9)

This approach ensures that distinct regions across two spatial datasets will not be forcibly aligned. The results show that SLAT performs best when $c=0.6$, although the performance depends only weakly on $c$ (Supplementary Fig. 12a).
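The selection in Eq. (8) and the clipped loss in Eq. (9) can be sketched in a few lines of plain Python. This is a minimal illustration rather than the scSLAT implementation: the function name `clipped_wasserstein_loss` and the toy score lists are hypothetical, the discriminator outputs are taken as given, and rounding $c\times N$ to the nearest integer is an assumption.

```python
def clipped_wasserstein_loss(scores_1, scores_2, c=0.6):
    """Eqs. (8)-(9): keep a fraction c of cells per slice, then average f_D.

    scores_1 / scores_2 hold the discriminator output f_D(z) for every cell
    of slice 1 / slice 2 (hypothetical inputs; f_D itself is not modelled).
    """
    k1 = max(1, round(c * len(scores_1)))  # |V1'| = c * N1 (rounding assumed)
    k2 = max(1, round(c * len(scores_2)))  # |V2'| = c * N2 (rounding assumed)
    kept_1 = sorted(scores_1)[:k1]                # arg-k-min over slice 1
    kept_2 = sorted(scores_2, reverse=True)[:k2]  # arg-k-max over slice 2
    return sum(kept_1) / k1 - sum(kept_2) / k2


# Toy discriminator outputs for two slices with five cells each.
scores_1 = [0.9, 0.2, 0.4, 0.8, 0.1]
scores_2 = [-0.7, 0.0, -0.1, -0.9, 0.3]

full_loss = clipped_wasserstein_loss(scores_1, scores_2, c=1.0)     # Eq. (7)
clipped_loss = clipped_wasserstein_loss(scores_1, scores_2, c=0.6)  # Eq. (9)
```

With $c=1$ the function reduces to Eq. (7). With $c<1$ only the lowest-scoring cells of the first slice and the highest-scoring cells of the second slice contribute, which under the sign convention of Eq. (7) are the cells the discriminator separates least.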

To avoid a degenerate solution where all embeddings collapse to a single point, we adopt an additional reconstruction term to ensure that the embeddings retain sufficient information to reconstruct the input, which also enhances model stability. We use a simple multilayer perceptron network denoted as ${f}_{\rm{R}}$ for data reconstruction, yielding the following reconstruction loss:

$${L}_{{{{{{\rm{R}}}}}}}=\frac{1}{\left|{{{{{{\mathscr{V}}}}}}}_{1}\right|}\mathop{\sum }\limits_{{v}_{i}\in {{{{{{\mathscr{V}}}}}}}_{1}}{\Big|\Big|}{f}_{{{{{{\rm{R}}}}}}}\left({{{{{{\bf{z}}}}}}}_{1}^{\left(i\right)}\right)-{{{{{{\bf{x}}}}}}}_{1}^{\left(i\right)}{\Big|\Big|}+\frac{1}{\left|{{{{{{\mathscr{V}}}}}}}_{2}\right|}\mathop{\sum }\limits_{{v}_{j}\in {{{{{{\mathscr{V}}}}}}}_{2}}{\Big|\Big|}{f}_{{{{{{\rm{R}}}}}}}\left({{{{{{\bf{z}}}}}}}_{2}^{\left(j\right)}\right)-{{{{{{\bf{x}}}}}}}_{2}^{\left(j\right)}{\Big|\Big|}.$$

(10)

We assessed the necessity of the Wasserstein discriminator through an ablation study on the homogeneous (Stereo-seq) and heterogeneous (spatial-ATAC-seq vs. Stereo-seq) alignments, respectively. We found that ablation of the Wasserstein discriminator did not significantly affect accuracy in homogeneous alignments where the difference in the spatial domain is negligible (Supplementary Table 2). However, it did substantially influence accuracy in the heterogeneous alignment of Stereo-seq (0.22 μm) vs. spatial-ATAC-seq (20 μm), where the spatial domains differ drastically in resolution (Supplementary Table 3).

Overall objective

Finally, the overall training objective of SLAT can be summarized as follows:

$$\mathop{{{{{\mathrm{max}}}}} }\limits_{{f}_{{{{{{\rm{D}}}}}}}}\alpha \cdot {L}_{{{{{{\rm{W}}}}}}},$$

(11)

$$\mathop{{{{{\mathrm{min}}}}} }\limits_{{f}_{{{{{{\rm{Z}}}}}}},{f}_{{{{{{\rm{R}}}}}}}}\alpha \cdot {L}_{{{{{{\rm{W}}}}}}}+\left(1-\alpha \right)\cdot {L}_{{{{{{\rm{R}}}}}}},$$

(12)

where ${L}_{\rm{W}}$ and ${L}_{\rm{R}}$ are defined by Eqs. (9) and (10), respectively, and $\alpha$ is a hyperparameter balancing the contribution of adversarial alignment and data reconstruction. We use the Adam optimizer, a variant of stochastic gradient descent (SGD), to train the SLAT model.
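As a worked illustration of Eqs. (10) and (12), the sketch below evaluates the reconstruction loss and the weighted objective for toy inputs. It assumes the norm in Eq. (10) is Euclidean; the function names, the toy vectors and the value of $\alpha$ are hypothetical and are not taken from the scSLAT package.

```python
import math


def reconstruction_loss(recon_1, x_1, recon_2, x_2):
    """Eq. (10): mean reconstruction error per slice, summed over both slices.

    The Euclidean norm is an assumption; Eq. (10) only writes ||.||.
    """
    def mean_error(recon, x):
        return sum(math.dist(r, t) for r, t in zip(recon, x)) / len(x)

    return mean_error(recon_1, x_1) + mean_error(recon_2, x_2)


def generator_objective(l_w, l_r, alpha):
    """Eq. (12): the quantity minimized with respect to f_Z and f_R."""
    return alpha * l_w + (1 - alpha) * l_r


# Toy inputs: two cells in slice 1, one cell in slice 2.
x_1, recon_1 = [(0.0, 0.0), (1.0, 1.0)], [(3.0, 4.0), (1.0, 1.0)]
x_2, recon_2 = [(0.0, 0.0)], [(0.0, 1.0)]

l_r = reconstruction_loss(recon_1, x_1, recon_2, x_2)
objective = generator_objective(l_w=0.2, l_r=l_r, alpha=0.5)  # alpha is illustrative
```

The discriminator step in Eq. (11) uses only the $\alpha \cdot {L}_{\rm{W}}$ term, so in an alternating training loop the two objectives share ${L}_{\rm{W}}$ but are optimized over different sets of parameters.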

Coordinate matching

Apart from the core model described above, we also provide options to use additional information to match the spatial coordinates ${\bf{S}}$ of two slices, which can help distinguish symmetric structures (e.g., left and right hemispheres of the brain) and improve the final matching quality. The goal of coordinate matching is to roughly align different slices in terms of their overall orientation by estimating an affine transformation matrix ${\bf{M}}$. The exact strategy depends on the type of information available.

First, guided by expert knowledge of tissue structures, ${\bf{M}}$ can be computed by combining the scaling, rotation, and translation operations required to obtain a rough alignment between the two slices. Second, if imaging data such as H&E staining images are available, ${\bf{M}}$ can be estimated following our tutorial based on SimpleElastix⁴¹, which is a state-of-the-art medical image registration tool. Finally, if no other information is available, we also provide a default solution based on iterative closest point (ICP)⁴², a point-cloud registration algorithm, where we treat the spatial datasets as point clouds on a two-dimensional plane and use geometric features for registration. With the obtained ${\bf{M}}$ matrix, we register the coordinates of the two datasets by the following transformation:

$${\bf{S}}_{2}^{\prime}={\bf{S}}_{2}\cdot {\bf{M}}.$$

(13)
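The first option, composing scaling, rotation and translation into a single matrix, can be sketched as follows. The sketch assumes that coordinates are stored as row vectors and extended to homogeneous form $[x,\,y,\,1]$ so that translation fits into a $3\times 3$ matrix; this layout and the helper names `affine_matrix` and `transform` are illustrative assumptions, not the package's API.

```python
import math


def affine_matrix(scale, angle_deg, tx, ty):
    """3x3 matrix M for row vectors [x, y, 1]: scale, rotate (CCW), translate."""
    a = math.radians(angle_deg)
    cos_a, sin_a = math.cos(a), math.sin(a)
    return [
        [scale * cos_a, scale * sin_a, 0.0],
        [-scale * sin_a, scale * cos_a, 0.0],
        [tx, ty, 1.0],
    ]


def transform(coords, m):
    """Apply S' = S . M to a list of (x, y) points in homogeneous form."""
    moved = []
    for x, y in coords:
        row = (x, y, 1.0)
        new_x = sum(row[k] * m[k][0] for k in range(3))
        new_y = sum(row[k] * m[k][1] for k in range(3))
        moved.append((new_x, new_y))
    return moved


# Scale by 2, rotate by 90 degrees counter-clockwise, then shift by (1, 0).
m = affine_matrix(scale=2.0, angle_deg=90.0, tx=1.0, ty=0.0)
moved = transform([(1.0, 0.0), (0.0, 1.0)], m)
```

Here the point $(1, 0)$ is rotated to $(0, 1)$, scaled to $(0, 2)$ and shifted to $(1, 2)$, while $(0, 1)$ ends at $(-1, 0)$.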

Quality assessment and probabilistic matching

With the cell embeddings ${\bf{Z}}_{1},\,{\bf{Z}}_{2}$ learned by the SLAT core model and the matched spatial coordinates ${\bf{S}}_{1}$ and ${\bf{S}}_{2}^{\prime}$, we match dataset ${\mathscr{D}}_{2}$ with ${\mathscr{D}}_{1}$ using the following strategy. For each cell $i$ in dataset ${\mathscr{D}}_{1}$, SLAT first selects the closest $K$ cells from ${\mathscr{D}}_{2}$ in matched spatial coordinates ${\bf{S}}$ as a candidate set ${\mathscr{C}}_{i}$. We then compute the cosine similarity between the embedding of cell $i$ and each candidate cell in ${\mathscr{C}}_{i}$, and evaluate its significance by comparing it with a null distribution obtained from 1,000 randomly sampled cell pairs. The final match set ${\mathscr{M}}_{i}$ consists of cells from the candidate set with p-values less than 0.05:

$${{{{{{\mathscr{C}}}}}}}_{i}={\underset{{v}_{j}\in {{{{{{\mathscr{C}}}}}}}_{i}\subseteq {{{{{{\mathscr{V}}}}}}}_{2}}{{{{{{\mathrm{arg}}}}}} \, k\,{{{{{\mathrm{min}}}}}} }} {\Big |\Big |}{{{{{{\bf{s}}}}}}}_{1}^{\left(i\right)}-{{{{{{{\bf{s}}}}}}}_{2}^{{\prime} }}^{\left(j\right)}{\Big | \Big |},\Big | {{{{{{\mathscr{C}}}}}}}_{i} \Big |=K,$$

(14)

$$p-{{{{{\rm{value}}}}}}\left({v}_{i},\, {v}_{j}\right)=P\left(\cos \left({{{{{{\bf{z}}}}}}}_{1},\, {{{{{{\bf{z}}}}}}}_{2}\right) \, > \cos \left({{{{{{\bf{z}}}}}}}_{1}^{\left(i\right)},\,{{{{{{\bf{z}}}}}}}_{2}^{\left(j\right)}\right)\right)$$

(15)

$${{{{{{\mathscr{M}}}}}}}_{i}=\{{v}_{j}\in {{{{{{\mathscr{C}}}}}}}_{i}{{{{{\rm{|}}}}}}p-{{{{{\rm{value}}}}}}({v}_{i},\, {v}_{j}) \, < \, 0.05\},$$

(16)

where ${{{{{{\bf{s}}}}}}}_{1}^{\left(i\right)}$ and ${{{{{{{\bf{s}}}}}}}_{2}^{{\prime} }}^{\left(j\right)}$ are rows in the coordinate matrices ${{{{{{\bf{S}}}}}}}_{1}$ and ${{{{{{\bf{S}}}}}}}_{2}^{{\prime} }$, respectively. $K$ is the size of spatial neighborhood as described above.
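The three steps in Eqs. (14)–(16) can be sketched in plain Python as below. This is an illustrative reading, not the scSLAT code: the function names are hypothetical, embeddings are assumed to be non-zero vectors, and the null distribution is assumed to be built by pairing one randomly chosen cell from each slice.

```python
import math
import random


def cosine(u, v):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))


def null_similarities(z1, z2, n_pairs=1000, seed=0):
    """Similarities of randomly paired cells (pairing scheme is an assumption)."""
    rng = random.Random(seed)
    return [cosine(rng.choice(z1), rng.choice(z2)) for _ in range(n_pairs)]


def match_cell(i, z1, z2, s1, s2_moved, k, null_sims, p_cutoff=0.05):
    """Indices of slice-2 cells matched to cell i of slice 1."""
    # Eq. (14): the K spatially closest cells in the transformed slice 2.
    candidates = sorted(range(len(s2_moved)),
                        key=lambda j: math.dist(s1[i], s2_moved[j]))[:k]
    matches = []
    for j in candidates:
        observed = cosine(z1[i], z2[j])
        # Eq. (15): share of null similarities that exceed the observed one.
        p_value = sum(s > observed for s in null_sims) / len(null_sims)
        if p_value < p_cutoff:  # Eq. (16)
            matches.append(j)
    return matches


# Toy example: one query cell, three cells in the second slice.
s1 = [(0.0, 0.0)]
s2_moved = [(0.1, 0.0), (0.2, 0.0), (5.0, 5.0)]
z1 = [(1.0, 0.0)]
z2 = [(1.0, 0.1), (-1.0, 0.0), (1.0, 0.0)]
null_sims = [0.0] * 1000  # stand-in null distribution for a deterministic demo
matches = match_cell(0, z1, z2, s1, s2_moved, k=2, null_sims=null_sims)
```

In this toy case the third cell has an identical embedding to the query but lies outside the spatial neighborhood, so it never enters the candidate set; the second cell is spatially close but dissimilar in embedding and is rejected by the significance filter.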

For convenience of 3D visualization, we only plot the alignment with the smallest p-value for each cell in 3D plots (Figs. 3 and 4 and Supplementary Figs. 4, 14, 16, 18, 21–23, and 29).

Systematic benchmarks

Benchmark datasets

We selected 10× Visium, MERFISH and Stereo-seq as representative spatial technologies for benchmarking alignment methods. The 10× Visium dataset comes from consecutive slices of human dorsolateral prefrontal cortex, containing about 3,000 spots per slice, each spanning 50 μm with over 20,000 genes detected⁴³. The MERFISH dataset comes from consecutive slices of the mouse hypothalamic preoptic region, with subcellular resolution but only 151 genes detected in total²¹. The Stereo-seq dataset comes from consecutive slices of an E15.5 mouse embryo, containing over 100,000 single cells per slice, divided into 25 cell types with complex spatial organization, and detects over 20,000 genes in total⁸. We used all available slices in these datasets (9 Visium slices, 12 MERFISH slices and 4 Stereo-seq slices). Results for the first two slices of each dataset are presented in Fig. 2, while the reported accuracy statistics were aggregated across all slice pairs. Cell type annotation and tissue region segmentation were obtained from the original authors whenever possible. Since the Stereo-seq dataset does not provide tissue segmentation, we segmented the most prominent regions in the embryo, including Brain, Jaw and face, Spinal cord, Heart, Lung, Liver, and Belly, under the guidance of an expert in mouse anatomy. The spatial segmentation of the MERFISH dataset is provided in publication figures but not in raw data, so we re-segmented the data as guided by the figures.

Slice processing

For each technology, we first removed the cells/spots that are unannotated in both slices, then rotated the second slice by a random angle before feeding it to the alignment methods.

Benchmarked methods

The benchmarked methods Harmony, PASTE, STAGATE, and SLAT were executed using the Python packages “harmonypy” (v0.0.6), “paste-bio” (v1.3.0), “STAGATE_pyG” (latest commit 8b9c8ef), and “scSLAT” (v0.2.0), respectively, in Python (v3.8). Seurat was executed using the R package “Seurat” (v4.1.1) in R (v4.1.3). For each method, we used the default data preprocessing steps recommended by the original authors, and searched for the best hyperparameters for each method starting from their default settings (Supplementary Fig. 26). For all experiments involving SLAT, we used the default hyperparameters (SVD dimensionality: 30, graph neighbor: 50, LGCN layer: 3, MLP hidden layer dimension: 256, embedding hidden size: 2048, learning rate: 1e-4, dynamic clipping ratio: 0.6) unless otherwise stated.

Benchmark tasks

We benchmarked the alignment methods based on the following four tasks: (1) duplicate slice alignment, (2) real world alignment, (3) split slice alignment, and (4) scalability test:

In duplicate slice alignment, we duplicate and add noise to the first slice of each benchmark dataset by sampling from a negative binomial distribution centered at the measured expression count. We set the inverse dispersion parameter to different values to simulate varying noise levels:

$${\rm{NB}}\left(x;\mu,\,\theta\right)=\frac{\Gamma\left(x+\theta\right)}{\Gamma\left(\theta\right)\Gamma\left(x+1\right)}{\left(\frac{\mu}{\theta+\mu}\right)}^{x}{\left(\frac{\theta}{\theta+\mu}\right)}^{\theta},$$

(17)

where $\mu$ and $\theta$ are the mean and inverse dispersion, respectively. The duplicate slices were also rotated by 60° to avoid leaking information in spatial orientation¹⁴.
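As a numerical check of this parameterization, the sketch below evaluates the negative binomial probability mass function in its standard mean and inverse-dispersion form, whose normalizing constant is $\Gamma(x+\theta)/(\Gamma(\theta)\Gamma(x+1))$, and confirms that it sums to one with mean $\mu$ and variance $\mu+\mu^{2}/\theta$. The function name `nb_pmf` and the chosen values of $\mu$ and $\theta$ are illustrative, and the function assumes $\mu>0$.

```python
import math


def nb_pmf(x, mu, theta):
    """Negative binomial pmf with mean mu > 0 and inverse dispersion theta."""
    log_coef = (math.lgamma(x + theta) - math.lgamma(theta)
                - math.lgamma(x + 1))
    log_prob = (x * math.log(mu / (theta + mu))
                + theta * math.log(theta / (theta + mu)))
    return math.exp(log_coef + log_prob)


mu, theta = 10.0, 5.0    # illustrative values, not taken from the benchmarks
support = range(2000)    # wide enough that the truncated tail is negligible

total = sum(nb_pmf(x, mu, theta) for x in support)
mean = sum(x * nb_pmf(x, mu, theta) for x in support)
variance = sum((x - mean) ** 2 * nb_pmf(x, mu, theta) for x in support)
```

Smaller values of $\theta$ inflate the variance term $\mu^{2}/\theta$, which is how varying the inverse dispersion simulates different noise levels around the measured count.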

In real world alignment, the slices to be aligned are different slices in the benchmark datasets produced from the same position of the same tissue^5,8,43 (Supplementary Fig. 2). We also rotate the slices by 60° before feeding to the alignment algorithms. Each algorithm was run eight times with different model random seeds.

In split slice alignment, we split the first slice of each benchmark dataset into two pseudo-slices of equal size by randomly sampling the cells without replacement. Each algorithm was run eight times with different model random seeds.

For the scalability test, we randomly subsampled the Stereo-seq dataset used in the real world alignment benchmark to a range of cell numbers (3200, 6400, 12800, 25600, 51200, 102400). The subsampling process was repeated eight times with different random seeds.

Evaluation metrics

For synthetic tests on duplicated slices, the ground truth one-to-one matching is known. For each cell $i$ in the original slice, we find the cell ${j}^{*}$ with maximal matching score on the duplicate slice, ${j}^{*}={\rm{argmax}}_{{v}_{j}\in {\mathscr{C}}_{i}}\cos \left({\bf{z}}_{1}^{\left(i\right)},\,{\bf{z}}_{2}^{\left(j\right)}\right)$, and compute the ground truth accuracy as follows:

$${{{{{\rm{ground}}}}}}\,{{{{{\rm{truth}}}}}}\,{{{{{\rm{accuracy}}}}}}=\frac{1}{N}\mathop{\sum }\limits_{i=1}^{N}{1}_{{j}^{*}=i}$$

(18)

where ${1}_{{j}^{*}=i}$ is an indicator function that evaluates to 1 when ${j}^{*}=i$ and 0 otherwise.
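Eq. (18) amounts to counting how often the top-scoring match of cell $i$ is its own duplicate. A minimal sketch, assuming cells in the duplicate slice keep the same index as in the original slice (the function name and the toy match list are hypothetical):

```python
def ground_truth_accuracy(best_match):
    """Eq. (18): fraction of cells i whose top-scoring match j* equals i."""
    return sum(j == i for i, j in enumerate(best_match)) / len(best_match)


# best_match[i] is the index j* chosen for cell i; cells 0, 1 and 3 are correct.
best_match = [0, 1, 3, 3, 2]
accuracy = ground_truth_accuracy(best_match)
```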

In real world spatial alignments, the exact ground truth matching does not exist, but intuitively, a proper spatial alignment should align cells matched in both molecular profile and spatial context. Thus, we reported the cell type matching accuracy and spatial region matching accuracy simultaneously in the form of contingency tables. For convenience, we also defined a “joint accuracy” as the proportion of cells with both cell type and spatial region matched correctly, which corresponds to the upper right corner of the contingency table. We also report the micro and macro F1 scores for cell type matching, spatial region matching, and joint matching.
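The contingency table and joint accuracy can be computed from matched cell pairs as sketched below. The function name, the dictionary layout of the table and the cell type labels are hypothetical and chosen only for illustration.

```python
def contingency(pairs):
    """Proportions keyed by (cell type matched?, spatial region matched?)."""
    table = {(ct, rg): 0 for ct in (True, False) for rg in (True, False)}
    for (type_a, region_a), (type_b, region_b) in pairs:
        table[(type_a == type_b, region_a == region_b)] += 1
    n_pairs = len(pairs)
    return {key: count / n_pairs for key, count in table.items()}


# Each pair is ((cell type, region) in slice 1, (cell type, region) in slice 2).
pairs = [
    (("Neuron", "Brain"), ("Neuron", "Brain")),
    (("Neuron", "Brain"), ("Glia", "Brain")),
    (("Cardiomyocyte", "Heart"), ("Cardiomyocyte", "Heart")),
    (("Hepatocyte", "Liver"), ("Hepatocyte", "Lung")),
]
table = contingency(pairs)
joint_accuracy = table[(True, True)]  # both cell type and region agree
```

Reporting the full table rather than a single number separates errors in molecular identity from errors in spatial context, which a joint score alone would conflate.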

For a more comprehensive assessment, we also used the following alternative metrics to quantify the quality of alignments. The first is the estimation of the rotation angle that corrects the artificial rotation. Specifically, based on the spatial alignment of each benchmarked method, we estimated the optimal corrective rotation angle by solving the Procrustes problem⁴⁴, which is then compared with the ground truth to calculate the deviation. However, it is worth noting that rotation estimation may not reliably reflect alignment quality when non-rigid deformations exist.
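For two-dimensional coordinates the rotation-only Procrustes problem has a closed-form solution, which the sketch below uses in place of the general SVD-based solver. After centering both point sets, the angle that maximizes the summed inner product between rotated source points and their matched targets is $\mathrm{atan2}$ of the summed cross products over the summed dot products. The function names and toy coordinates are hypothetical, and matched pairs are assumed to be given in the same order.

```python
import math


def centered(points):
    """Subtract the centroid from a list of (x, y) points."""
    cx = sum(p[0] for p in points) / len(points)
    cy = sum(p[1] for p in points) / len(points)
    return [(x - cx, y - cy) for x, y in points]


def estimate_rotation_deg(source, target):
    """Counter-clockwise angle (degrees) best rotating source onto target."""
    dot = cross = 0.0
    for (ax, ay), (bx, by) in zip(centered(source), centered(target)):
        dot += ax * bx + ay * by
        cross += ax * by - ay * bx
    return math.degrees(math.atan2(cross, dot))


# Rotate a toy slice by 60 degrees, shift it, and recover the angle.
original = [(0.0, 0.0), (2.0, 0.0), (2.0, 1.0), (0.0, 3.0)]
angle = math.radians(60.0)
rotated = [(x * math.cos(angle) - y * math.sin(angle) + 5.0,
            x * math.sin(angle) + y * math.cos(angle) - 2.0)
           for x, y in original]

estimated = estimate_rotation_deg(original, rotated)
deviation = abs(estimated - 60.0)  # compared with the known artificial rotation
```

With an imperfect alignment, mismatched pairs pull the estimate away from the true angle, so the deviation serves as an indirect indicator of matching quality.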

The second is the edge score, which quantifies how well an alignment preserves neighborhoods^34,45:

$$\begin{array}{c}{\rm{A}}_{nm}=\left\{\begin{array}{ll}1, & {\rm{if}}\ {n}^{\prime}\sim {m}^{\prime}\\ -1, & {\rm{if}}\ {n}^{\prime}\nsim {m}^{\prime}\end{array}\right.\\ {\rm{edge}}\,{\rm{score}}=\frac{1}{N}\sum\limits_{n=1}^{N}\sum\limits_{m\in {\mathscr{N}}_{\left(n\right)}}{\rm{A}}_{nm}\end{array}$$

(19)

However, we found that the edge score could be deceived in certain situations. For example, Supplementary Fig. 25b shows two graphs with known ground truth node pairing information. “Alignment 1” is the correct matching (gray lines) and “Alignment 2” has four mismatched pairs (highlighted by red lines), but the two alignments get the same edge score.
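This failure mode can be reproduced on a toy example, which is not the one in Supplementary Fig. 25b. The sketch assumes that ${n}^{\prime}$ denotes the node matched to $n$, that $\sim$ denotes adjacency in the second graph, and that ${\mathscr{N}}_{\left(n\right)}$ is the neighborhood of $n$ in the first graph; the function and variable names are hypothetical.

```python
def edge_score(neighbors_1, edges_2, mapping):
    """Eq. (19) under the stated assumptions about n', ~ and N(n)."""
    total = 0
    for n, neighbors in neighbors_1.items():
        for m in neighbors:
            pair = frozenset((mapping[n], mapping[m]))
            total += 1 if pair in edges_2 else -1
    return total / len(neighbors_1)


# Graph 1 is the path a-b-c, graph 2 is the path x-y-z.
neighbors_1 = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
edges_2 = {frozenset(("x", "y")), frozenset(("y", "z"))}

correct = {"a": "x", "b": "y", "c": "z"}    # assumed ground truth pairing
mirrored = {"a": "z", "b": "y", "c": "x"}   # two of three nodes mismatched
scrambled = {"a": "x", "b": "z", "c": "y"}  # breaks one of the two edges

score_correct = edge_score(neighbors_1, edges_2, correct)
score_mirrored = edge_score(neighbors_1, edges_2, mirrored)
score_scrambled = edge_score(neighbors_1, edges_2, scrambled)
```

The mirrored mapping preserves every edge and therefore receives the same score as the correct one, even though two nodes are paired wrongly. Only the scrambled mapping, which breaks an edge, is penalized.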

Benchmark workflow

We used Snakemake (v7.12.0) to manage the whole benchmark workflow. All benchmarked methods were allocated 16 cores of an Intel Xeon Platinum 8358 CPU, 128 GB of RAM, and an NVIDIA A100 GPU with 80 GB VRAM by the Slurm workload management system.

Hyperparameter robustness

We tested SLAT’s robustness to key hyperparameters including: (1) SVD dimension $M$, (2) number of LGCN layers $L$, (3) learning rate of the SLAT model, (4) MLP hidden layer dimension, (5) dimension $P$ of SLAT embedding, (6) dynamic clipping ratio $c$.

We ran SLAT on the same slices as in Fig. 2b. For Stereo-seq slices containing more than 100,000 cells, we randomly subsampled 8,000 cells from each slice to save time. Every experiment was run 8 times with different model random seeds.

In this test, we also demonstrated the advantage of the LGCN architecture, especially in the MERFISH dataset, where cell type and spatial region are not significantly correlated (Supplementary Fig. 3). We found a substantial decrease in joint accuracy without LGCN (Supplementary Fig. 12a), mainly caused by deteriorated spatial region matching (Supplementary Fig. 28).

Robustness to noise in the spatial graph

In practice, the spatial graph may be imperfect due to technical limitations. Therefore, we tested SLAT’s robustness to graph corruption in the same dataset used in the hyperparameter robustness evaluation. Specifically, we randomly masked the edges in the graph at increasing ratios (from 0.1 to 0.9). Every experiment was run 8 times with different masking random seeds.
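One simple way to implement such a corruption is to drop a fixed fraction of edges uniformly at random without replacement, as sketched below. Uniform sampling, the rounding of the number of dropped edges, and the names `mask_edges` and `kept_by_ratio` are assumptions made for illustration.

```python
import random


def mask_edges(edges, ratio, seed=0):
    """Drop a fraction `ratio` of edges uniformly at random; return the rest."""
    rng = random.Random(seed)
    n_drop = round(ratio * len(edges))
    dropped = set(rng.sample(range(len(edges)), n_drop))
    return [edge for idx, edge in enumerate(edges) if idx not in dropped]


# Toy graph with 100 edges, masked at ratios 0.1, 0.2, ..., 0.9.
edges = [(i, i + 1) for i in range(100)]
kept_by_ratio = {r / 10: len(mask_edges(edges, r / 10)) for r in range(1, 10)}
```

Changing `seed` gives a different set of masked edges at the same ratio, which corresponds to repeating the experiment with different masking random seeds.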

Heterogeneous alignment across distinct technologies and modalities

Visium and Xenium data alignment

The Visium and Xenium datasets were generated from consecutive slices of the human breast cancer tissue sample⁴. In order to maintain the comparability with the original paper, we chose the exact same slices used in their analysis (see Fig. 4c of ref. ⁴). We further selected the shared region between the two slices as the original authors reported⁴ for follow-up analysis.

Considering that the Xenium slice contains more than 100,642 cells while the Visium slice contains fewer than 3841 spots in the same physical region, we used different neighbor sizes proportional to cell density ($K=5$ for Visium, and $K=130$ for Xenium) when constructing the spatial graphs, in order to ensure that the GCNs have similar spatial receptive fields. SLAT was then run with otherwise default parameters. We selected Visium triple positive spots based on the number of aligned Xenium triple positive cells (Supplementary Fig. 29c); seven spots with more than 2 aligned cells were chosen. Following the original paper, we performed differential gene expression analysis of the SLAT-identified and manually curated triple positive spots against all other spots, respectively, using the Scanpy⁴⁶ function “scanpy.tl.rank_genes_groups” with parameter “method=‘wilcoxon’”.

For comparison, all methods benchmarked (PASTE, Harmony, Seurat, and STAGATE) were applied to the same datasets with their default parameters (Supplementary Fig. 16).

SeqFISH data and Stereo-seq data alignment

For Stereo-seq and seqFISH, we chose the E9.5 slice with the most complete cell type annotation by the original authors (i.e., the slices with the least proportion of unannotated cells), respectively. To align the chosen seqFISH and Stereo-seq slices, we ran SLAT using $K=20$ for the Stereo-seq slice and $K=50$ for the seqFISH slice to balance cell density, while using default values for other parameters. We next refined cell type annotations in the Stereo-seq dataset based on the higher resolution annotation in seqFISH through label transfer for “Neural crest” cells (Fig. 3b). We also manually annotated “Neural crest” cells for independent validation. Scanpy⁴⁶ was used for the analysis following its official tutorial: the data were log-normalized using the functions “sc.pp.normalize_total” and “sc.pp.log1p”. Highly variable genes were identified with “sc.pp.highly_variable_genes”. The first 50 principal components after PCA (“sc.tl.pca”) were used to generate neighborhood graphs (“sc.pp.neighbors”) for computing UMAP embeddings (“sc.tl.umap”) and Leiden clustering (“sc.tl.leiden”). The “Neural crest” cells were annotated via marker genes of different germ layers (Foxc2 and Vcan for mesoderm; Msx1, Mif, and Dik1 for ectoderm; see Supplementary Fig. 13e).

For comparison, all methods benchmarked (PASTE, STAGATE, Seurat and Harmony) were applied to the same datasets with their default parameters (Supplementary Fig. 14).

Spatial-ATAC-seq and Stereo-seq data alignment

For cross-modality alignment, the slices to align were two 11.5-day mouse embryo datasets from Stereo-seq (RNA) and spatial-ATAC-seq (ATAC), respectively. For Stereo-seq, we chose the E11.5 slice with the most complete cell type annotation by the original authors (i.e., the slice with the least proportion of unannotated cells). For spatial-ATAC-seq, we chose the E11.5 slice with the highest spatial resolution (20 μm). Given that the current spatial-ATAC-seq data did not cover the entire embryo due to technical limitations, we extracted the anatomically corresponding regions from the Stereo-seq dataset under expert guidance.

To project cells from different modalities into a shared latent space, we employed the graph-linked multi-modality embedding strategy we proposed before²², built the graph with $K=50$ for Stereo-seq and $K=20$ for spatial-ATAC-seq, and then ran SLAT with default hyperparameters. Based on the resulting alignment, we transferred cell type labels from Stereo-seq to the unannotated spatial-ATAC-seq data, and then applied SCENIC+⁴⁷ for joint regulatory inference.

To compare with benchmarked methods (PASTE, STAGATE, Seurat and Harmony), we also used the same multi-modality embedding as their input, followed by their default pipelines. The exceptions are Seurat and STAGATE, which do not support low-dimensional embeddings as input and thus cannot be compared. In addition, we also compared with the original multi-modality embeddings²² directly (Supplementary Fig. 18).

Spatial-temporal alignment

We applied SLAT to align E11.5 and E12.5 heterogeneous mouse embryo Stereo-seq datasets with the default hyperparameters. In order to maintain comparability with the original paper, we chose the exact same slices used in their analysis (see Fig. 3a of ref. ⁸). Regions with lower SLAT similarity scores were marked in Fig. 4b. To focus on kidney development, we extracted cells labeled as “Kidney” in E12.5 and their aligned cells in E11.5, then clustered these cells using the standard Scanpy clustering pipeline mentioned above and annotated them via well-defined kidney markers^31,32 (Osr1, Foxc1 and Podxl for nephron progenitors; Uncx, Nr2f2, Dach, Wt1, Nphs1, and Cd44 for kidney; see Fig. 4f). To further demonstrate the robustness of SLAT, we reran the analysis on two additional slices randomly chosen from E11.5 and E12.5, and obtained similar results (Supplementary Fig. 30).

For comparison, all methods benchmarked (PASTE, STAGATE, Seurat and Harmony) were applied to the same datasets with their default parameters (Supplementary Fig. 21).

Statistics and reproducibility

No statistical method was used to predetermine the sample size. No data were excluded from the analyses. The experiments were not randomized. The investigators were not blinded to allocation during experiments and outcome assessment. More information is provided in the Reporting Summary file.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Data availability

All relevant data supporting the key findings of this study were already published and were obtained from public data repositories. The data used in this study are provided in Supplementary Table 1, including publication and downloading URLs. All benchmarking data and Source data are provided with this paper.

Code availability

The SLAT framework was implemented in the “scSLAT” Python package, which is available at https://github.com/gao-lab/SLAT⁴⁸. For reproducibility, the scripts for all benchmarks were assembled using Snakemake (v7.12.0) and are also available in the above repository.

References

Lubeck, E., Coskun, A. F., Zhiyentayev, T., Ahmad, M. & Cai, L. Single-cell in situ RNA profiling by sequential hybridization. Nat. Methods 11, 360–361 (2014).
Chen, K. H., Boettiger, A. N., Moffitt, J. R., Wang, S. & Zhuang, X. Spatially resolved, highly multiplexed RNA profiling in single cells. Science 348, aaa6090 (2015).
Eng, C.-H. L. et al. Transcriptome-scale super-resolved imaging in tissues by RNA seqFISH. Nature 568, 235–239 (2019).
Janesick, A. et al. High resolution mapping of the breast cancer tumor microenvironment using integrated single cell, spatial and in situ analysis of FFPE tissue. Preprint at https://doi.org/10.1101/2022.10.06.510405 (2022).
Ståhl, P. L. et al. Visualization and analysis of gene expression in tissue sections by spatial transcriptomics. Science 353, 78–82 (2016).
Vickovic, S. et al. High-definition spatial transcriptomics for in situ tissue profiling. Nat. Methods 16, 987–990 (2019).
Stickels, R. R. et al. Highly sensitive spatial transcriptomics at near-cellular resolution with Slide-seqV2. Nat. Biotechnol. 39, 313–319 (2021).
Chen, A. et al. Spatiotemporal transcriptomic atlas of mouse organogenesis using DNA nanoball-patterned arrays. Cell 185, 1777.e21–1792.e21 (2022).
Deng, Y. et al. Spatial profiling of chromatin accessibility in mouse and human tissues. Nature 609, 375–383 (2022).
Kuppe, C. et al. Spatial multi-omic map of human myocardial infarction. Nature 608, 766–777 (2022).
Long, Y. et al. Spatially informed clustering, integration, and deconvolution of spatial transcriptomics with GraphST. Nat. Commun. 14, 1155 (2023).
Dong, K. & Zhang, S. Deciphering spatial domains from spatially resolved transcriptomics with an adaptive graph attention auto-encoder. Nat. Commun. 13, 1739 (2022).
Äijö, T. et al. Splotch: robust estimation of aligned spatial temporal gene expression data. Preprint at https://doi.org/10.1101/757096 (2019).
Zeira, R., Land, M., Strzalkowski, A. & Raphael, B. J. Alignment and integration of spatial transcriptomics data. Nat. Methods 19, 567–575 (2022).
Bergenstråhle, J., Larsson, L. & Lundeberg, J. Seamless integration of image and molecular analysis for spatial transcriptomics workflows. BMC Genomics 21, 482 (2020).
Gao, J., Huang, X. & Li, J. Unsupervised graph alignment with Wasserstein distance discriminator. In Proc. 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining 426–435 (ACM, 2021).
Stuart, T. et al. Comprehensive integration of single-cell data. Cell 177, 1888.e21–1902.e21 (2019).
Korsunsky, I. et al. Fast, sensitive and accurate integration of single-cell data with Harmony. Nat. Methods 16, 1289–1296 (2019).
Moses, L. & Pachter, L. Museum of spatial transcriptomics. Nat. Methods 19, 534–546 (2022).
Eisenstein, M. Seven technologies to watch in 2022. Nature 601, 658–661 (2022).
Lohoff, T. et al. Integration of spatial and single-cell transcriptomic data elucidates mouse organogenesis. Nat. Biotechnol. 40, 74–85 (2022).
Cao, Z.-J. & Gao, G. Multi-omics single-cell data integration and regulatory inference with graph-linked embedding. Nat. Biotechnol. 40, 1458–1466 (2022).
Ricci, R. et al. Distinct functions of junD in cardiac hypertrophy and heart failure. Genes Dev. 19, 208–213 (2005).
Ombrato, L. et al. Generation of neighbor-labeling cells to study intercellular interactions in vivo. Nat. Protoc. 16, 872–892 (2021).
Harding, S. D. et al. The GUDMAP database – an online resource for genitourinary research. Development 138, 2845–2853 (2011).
Liu, C.-F., Liu, C. & Yao, H. H.-C. Building pathways for ovary organogenesis in the mouse embryo. Curr. Top Dev. Biol. 90, 263–290 (2010).
Cao, J. et al. The single-cell transcriptional landscape of mammalian organogenesis. Nature 566, 496–502 (2019).
Warburton, D. et al. Lung organogenesis. Curr. Top Dev. Biol. 90, 73–158 (2010).
Wear, H. M., McPike, M. J. & Watanabe, K. H. From primordial germ cells to primordial follicles: a review and visual representation of early ovarian development in mice. J. Ovarian Res. 9, 36 (2016).
Bouchard, M. Transcriptional control of kidney development. Differentiation 72, 295–306 (2004).
Reidy, K. J. & Rosenblum, N. D. Cell and molecular biology of kidney development. Semin. Nephrol. 29, 321–337 (2009).
Miao, Z. et al. Single cell regulatory landscape of the mouse kidney highlights cellular differentiation programs and disease targets. Nat. Commun. 12, 2277 (2021).
Andersson, A. et al. A landmark-based common coordinate framework for spatial transcriptomics data. Preprint at bioRxiv https://doi.org/10.1101/2021.11.11.468178 (2021).
Douglas, J. et al. Metrics for evaluating network alignment. In ACM Conference on Web Search and Data Mining (WSDM), Workshop on Graph Techniques for Adversarial Activity Analytics (GTA3) 1–6 (ACM, 2018).
Dhainaut, M. et al. Spatial CRISPR genomics identifies regulators of the tumor microenvironment. Cell 185, 1223.e20–1239.e20 (2022).
Peng, H. et al. Morphological diversity of single neurons in molecularly defined cell types. Nature 598, 174–181 (2021).
Tran, H. T. N. et al. A benchmark of batch-effect correction methods for single-cell RNA sequencing data. Genome Biol. 21, 12 (2020).
Wu, F. et al. Simplifying graph convolutional networks. In Proceedings of the 36th International Conference on Machine Learning 6861–6871 (ICML, 2019).
Wei, R., Yin, H., Jia, J., Benson, A. R. & Li, P. Understanding non-linearity in graph neural networks from the Bayesian-inference perspective. In Adv. Neural Inf. Process. Syst. 35, 34024–34038 (NeurIPS, 2022).
Arjovsky, M., Chintala, S. & Bottou, L. Wasserstein generative adversarial networks. In Proc. 34th International Conference on Machine Learning 214–223 (PMLR, 2017).
Marstal, K., Berendsen, F., Staring, M. & Klein, S. SimpleElastix: a user-friendly, multi-lingual library for medical image registration. In 2016 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) 574–582 (2016).
Besl, P. J. & McKay, N. D. A method for registration of 3-D shapes. IEEE Trans. Pattern Anal. Mach. Intell. 14, 239–256 (1992).
Maynard, K. R. et al. Transcriptome-scale spatial gene expression in the human dorsolateral prefrontal cortex. Nat. Neurosci. 24, 425–436 (2021).
Kabsch, W. A solution for the best rotation to relate two sets of vectors. Acta Crystallogr. A 32, 922–923 (1976).
Conte, D., Foggia, P., Sansone, C. & Vento, M. Thirty years of graph matching in pattern recognition. Int. J. Pattern Recogn. 18, 265–298 (2004).
Wolf, F. A., Angerer, P. & Theis, F. J. SCANPY: large-scale single-cell gene expression data analysis. Genome Biol. 19, 15 (2018).
Bravo González-Blas, C. et al. SCENIC+: single-cell multiomic inference of enhancers and gene regulatory networks. Nat. Methods 20, 1355–1367 (2023).
Xia, C.-R. et al. Spatial-linked alignment tool (SLAT) for aligning heterogenous slices. zenodo. https://doi.org/10.5281/zenodo.10042913 (2023).


Acknowledgements

We thank Mr. H. Yang for his helpful guidance on mouse embryonic development and anatomy, as well as Dr. Z. Zhang, Dr. L. Tao, Dr. F. Tang, Dr. X.S. Xie, Dr. C. Li, and Dr. J. Lu at Peking University for their helpful discussions and comments during the study. This work was supported by funds from the National Key Research and Development Program of China (2022ZD0115004), as well as the State Key Laboratory of Protein and Plant Gene Research, the Beijing Advanced Innovation Center for Genomics (ICG) at Peking University, the Changping Laboratory, and the Shaw Foundation Hong Kong Limited. The research of G.G. was supported in part by the National Program for Support of Top-Notch Young Professionals.

Author information

Xin-Ming Tu
Present address: Paul Allen School of Computer Science and Engineering, University of Washington, Seattle, WA, 98195, USA
These authors contributed equally: Chen-Rui Xia, Zhi-Jie Cao.

Authors and Affiliations

State Key Laboratory of Protein and Plant Gene Research, School of Life Sciences, Biomedical Pioneering Innovative Center (BIOPIC) and Beijing Advanced Innovation Center for Genomics (ICG), Center for Bioinformatics (CBI), Peking University, 100871, Beijing, China
Chen-Rui Xia, Zhi-Jie Cao, Xin-Ming Tu & Ge Gao
Changping Laboratory, 102206, Beijing, China
Chen-Rui Xia, Zhi-Jie Cao & Ge Gao

Authors

Chen-Rui Xia
Zhi-Jie Cao
Xin-Ming Tu
Ge Gao

Contributions

G.G. and Z.-J.C. conceived the study and supervised the research. C.-R.X. and Z.-J.C. designed and implemented the computational framework and conducted benchmarks and case studies with guidance from G.G. X.-M.T. proposed and implemented the matrix decomposition-based batch correction strategy. C.-R.X., Z.-J.C., X.-M.T., and G.G. wrote the manuscript.

Corresponding authors

Correspondence to Zhi-Jie Cao or Ge Gao.

Ethics declarations

Competing interests

The authors declare no competing interests.

Peer review

Peer review information

Nature Communications thanks Qing Nie, Mingyu Yang, and the other, anonymous, reviewer(s) for their contribution to the peer review of this work. A peer review file is available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Peer Review File

Reporting Summary

Source data

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Xia, CR., Cao, ZJ., Tu, XM. et al. Spatial-linked alignment tool (SLAT) for aligning heterogenous slices. Nat Commun 14, 7236 (2023). https://doi.org/10.1038/s41467-023-43105-5

Received: 02 June 2023
Accepted: 31 October 2023
Published: 09 November 2023
DOI: https://doi.org/10.1038/s41467-023-43105-5
