Main

The vertebrate brain is organized into subregions that are specialized in function and distinct in cytoarchitecture and connectivity. This spatial specialization of function and structure is established by developmental processes involving intrinsic genetic programs and/or external signalling10. Although gene expression can change during cell maturation and remains dynamic in response to internal cellular conditions and external stimuli, a core transcriptional program that maintains cellular identity usually remains steady in mature neurons11. Thus, resolving the expression of core sets of genes that distinguish different types of neuron provides insight into the functional and structural specialization of neurons.

Many large brain structures are spatially organized into divisions, or modules, within which neurons are more similar in morphology, connectivity and activity. In the cortex these modules usually involve a set of adjacent cortical areas that are highly interconnected3,4,7 and correlated in neuronal activity5,6. Many cortical areas also share the same medium- and fine-grained transcriptomically defined neuronal types9,12. Whether and how the areal and modular organization of cortical connectivity and activity is reflected in the transcriptomic signatures of areas is unknown.

To address this question, here we apply BARseq13,14 to interrogate gene expression and the distribution of excitatory neuron types across nine mouse forebrain hemispheres at high spatial resolution. BARseq is a form of in situ sequencing15 in which Illumina sequencing-by-synthesis chemistry is used to achieve a robust readout of both endogenous messenger RNAs and synthetic RNA barcodes. These RNA barcodes are used to infer long-range projections of neurons. We have previously used BARseq to identify the projections of neuronal types defined by gene expression14,16 and/or their locations13,17, and to identify genes associated with differences in projections within neuronal populations14. Importantly, we showed that BARseq can resolve transcriptomically defined cell types of cortical neurons at cellular resolution by sequencing dozens of cell-type markers14. Because BARseq has high throughput and low cost compared with many other spatial techniques18,19,20,21,22,23,24, it is ideally suited for studying the spatial organization of gene expression at cellular resolution over whole-brain structures such as the cortex.

Here we use BARseq as a standalone technique for sequencing gene expression in situ at brain-wide scale in nine animals, with or without binocular enucleation, to resolve the distribution of neuronal populations and gene expression across the cortex. We generate high-resolution maps of 10.3 million cells with detailed gene expression, including 4,194,658 cortical cells. We find that, although most neuronal populations are found in multiple cortical areas, the composition of neuronal populations is distinct across areas. The neuronal compositions of highly connected areas are more similar, suggesting a modular transcriptomic organization of the cortex that matches cortical hierarchy and modules defined by connectivity in previous studies3,4,7. By comparing littermates with and without binocular enucleation, we then show that peripheral inputs have a critical role in shaping cortical gene expression and area-specific cell-type compositional profiles.

BARseq maps brain-wide gene expression

Recent single-cell transcriptomic studies8,12,25,26,27 have used different nomenclatures to refer to cell types across hierarchical levels. To avoid confusion we first define our cell-type nomenclature. The highest hierarchical level, or H1 type, divides neurons into excitatory neurons, inhibitory neurons and other cells; this level is the ‘class’ level in many studies. Within each H1 type we subdivide neurons into H2 types, which are sometimes referred to as ‘subclasses’8,9. Cortical excitatory neurons fall into nine H2 types that are shared across most cortical areas. This division refines the traditional projection-based intratelencephalic (IT)/pyramidal tract (PT)/corticothalamic (CT) neuron classification28 as follows: PT and CT neurons correspond to L5 extratelencephalic neurons (ET) and L6 CT neurons, respectively, whereas IT neurons are subdivided into L2/3 IT, L4/5 IT, L5 IT, L6 IT, NP (near-projecting neurons), Car3 and L6b. This division follows recent single-cell RNA sequencing studies but differs from the classical tripartite classification of IT/PT/CT neurons. Each H2 type can be further divided into H3 types (‘cluster’ or ‘type’ level in some studies8,9). Previous reports showed that H1 and H2 types are largely shared across most cortical areas, but the expression of many genes is localized to specific parts of the cortex both during development10,29 and in the adult30. Clusters at the H3 level appear to be enriched in neurons from different parts of the cortex8,12,31, but the detailed distribution of neuronal populations at this higher granularity across cortical areas remains unclear.

To assess the distribution of neuronal populations across the cortex we first generated a pilot dataset by applying BARseq to interrogate the expression of 104 cell-type marker genes (Supplementary Table 1) in 40 hemibrain coronal sections covering the whole forebrain in one animal (Fig. 1a,b). We applied the same approach that we used previously to resolve excitatory neuron types in the motor cortex14 (Supplementary Note 1, Fig. 1c and Extended Data Fig. 1a–f show marker gene selection and overall strategy), and found 2,167,762 cells across the whole hemisphere. Removal of cells with an insufficient number of reads (20 reads per cell and five genes per cell minimum) resulted in 1,259,256 cells after quality control (Supplementary Methods), with a mean of 60 unique reads per cell and 27 genes per cell (Extended Data Fig. 1g,h). At the gross anatomical level many genes were differentially expressed across major brain structures and cortical layers (Fig. 1a). These expression patterns were consistent with in situ hybridization patterns in the Allen Brain Atlas30 (Extended Data Fig. 1i and Supplementary Note 1). Thus, our pilot dataset recapitulated the known spatial distribution of gene expression.

Fig. 1: BARseq reveals brain-wide gene expression.
figure 1

a, Images showing mRNA reads of all 40 slices (bottom) and close-up images of three representative slices (top). For clarity, only 17 out of 104 genes (indicated on the right) are plotted. Inset on the left shows an illustration of mRNA detection using BARseq. b, Left, decoded genes and cell segmentations (middle) from a representative imaging tile (out of 4,385 tiles across 40 slices) corresponding to the dashed box in a. Right, close-up images of this area showing the last sequencing cycle, hybridization cycle, decoded genes and cell segmentation. c, Single-cell cluster assignment performance using the full transcriptome, top principal components (PCs) and the 104-gene panel with or without subsampling to match the sensitivity of BARseq for H2 (top) and H3 (bottom) clusters. Scale bars, 1 mm (a), 100 μm for full-tile images (b), 10 μm for the boxed area (b). cDNA, complementary DNA.

BARseq distinguishes neuronal types

We next identified transcriptomic types of neurons by de novo and hierarchical clustering based on single-cell gene expression in the pilot dataset (Fig. 2a and Supplementary Methods). Clustering all cells resulted in 24 clusters, which we then combined into three H1 types (642,340 excitatory neurons, 427,939 inhibitory neurons and 188,977 other cells) based on the expression of Slc17a7 and Gad1 (Extended Data Fig. 2a and Supplementary Methods). Of these 1.2 million cells, 517,428 were in the cortex and were the focus of our analyses. Based on the fraction of excitatory neurons expressing both Slc17a7 and Gad1, we estimated that the probability of segmentation errors in which two neighbouring cells were merged (that is, doublet rate) would be 5–7% (Extended Data Fig. 2b,c and Supplementary Note 2). The 24 clusters, comprising the three H1 types, largely corresponded to coarse anatomical structures in the brain (Fig. 2b). For example, different clusters were enriched in the lateral and ventral groups of the thalamus, the intralaminar nuclei, the epithalamus, the medial, basolateral and lateral nuclei of the amygdala, the striatum and the globus pallidus (Fig. 2b). These results recapitulate the clear distinction of transcriptomic types across anatomically defined brain structures as observed in whole-brain, scRNA-seq studies26,31,32,33,34.

Fig. 2: BARseq captures gene expression and spatial distribution of cortical excitatory cell types.
figure 2

a, Workflow of hierarchical clustering. b, Gene expression (left) and H1 clusters (right) in a representative slice. Major anatomical divisions distinguished by H1 clusters are labelled: SUB, subiculum; DG, dentate gyrus; CP, caudate putamen; GPe, globus pallidus, external segment; LA, lateral amygdala; BLA, basolateral amygdala; MEA-pd (pv), medial amygdalar nucleus, posterodorsal (posteroventral); EPI, epithalamus; APN, anterior pretectal nucleus; LAT, lateral group of the dorsal thalamus; VENT, ventral group of the dorsal thalamus; PF, parafascicular nucleus; HY, hypothalamus. c, UMAP plot of gene expression of excitatory neurons, coloured by H2 type. CA, cornu Ammonis; MOB, main olfactory bulb; MH, medial habenula; AON, anterior olfactory nucleus; RSP, retrosplenial area. d, Marker gene expression in cortical excitatory H3 types. Colours indicate mean expression level and dot size indicates fraction of cells expressing the gene. The dendrogram (top) shows hierarchical clustering of pooled gene expression within each H3 type. e, Overlap (Jaccard index) between BARseq H3 types and scRNA cell types9. Dashed boxes indicate parent H2 types. Min., minimum; max., maximum.

We then reclustered the excitatory and inhibitory neurons separately into H2 types (Fig. 2a,c and Extended Data Fig. 2d) to improve the resolution of clustering. At this level we recovered major inhibitory neuron subclasses (Pvalb, Sst, Vip/Sncg, Meis2-like and Lamp5), all excitatory subclasses that are shared across the cortex (L2/3 IT, L4/5 IT, L5 IT, L6 IT, L5 ET, L6 CT, NP, Car3 and L6b) and an excitatory subclass specific to the medial cortex (RSP) observed in previous cortical scRNA-seq datasets8,9,12,35. The H2 types expressed known cell-type markers and other highly differentially expressed genes (Fig. 2d). For example, Cux2 is expressed mostly in superficial-layer IT and Car3 neurons, Fezf2 in NP and L5 ET neurons and Foxp2 specifically in L6 CT neurons (Supplementary Note 3 provides a detailed description). Although we generated the full 40-section dataset in two batches (Supplementary Methods) we did not observe strong batch effects, as evidenced by the intermingling of excitatory neurons from different slices across the two batches in the uniform manifold approximation and projection (UMAP) plot (Extended Data Fig. 2e). Thus the H2 types recapitulated, at medium granularity, known neuronal types identified in previous scRNA-seq datasets8,9,12.

We then reclustered each excitatory H2 type into H3 types (Fig. 2a). To quantify how well H3 types corresponded to reference transcriptomic types identified in previous scRNA-seq studies, we used a k-nearest-neighbour-based approach to match each H3 type to leaf-level clusters recorded in ref. 9 (Supplementary Methods). We found that cortical H2 types had a one-to-one correspondence with subclass-level cell types in the scRNA-seq data (Fig. 2e). Within each H2 type, the H3 types differentially mapped onto single or small subsets of leaf-level clusters in the scRNA-seq data (Fig. 2e; Extended Data Fig. 2f shows matching of clusters outside of the cortex). Both H2 and H3 types were organized in an orderly fashion along the depth of the cortex, recapitulating the laminar organization of cortical excitatory neurons (Extended Data Fig. 2g,h and Supplementary Note 3). At a coarse spatial resolution the H3 types were also found in cortical areas similar to matching clusters in previous scRNA-seq datasets (Extended Data Fig. 2i–k and Extended Data Fig. 3). For example, the H3-type PT AUD and its corresponding scRNA-seq cluster (242_L5_PT CTX) were both enriched in lateral cortical areas (TEa-PERI-ECT) and auditory cortex (AUD), whereas H3-type PT CTX P and its corresponding scRNA-seq clusters (245_L5_PT CTX and 259_L5_PT CTX) were enriched in the visual cortex. Therefore, these results demonstrate that our pilot dataset resolved fine-grained transcriptomic types of cortical excitatory neurons that were consistent with previous scRNA-seq datasets9 and recapitulated their areal and laminar distribution9,12,36. The high resolution and cortex-wide span of our dataset now enabled us to resolve the spatial enrichment of gene expression and the distribution of neuronal subpopulations across the cortex at micrometre-level resolution.

Gene expression patterns across the cortex

Gene expression varies substantially across the cortex30,37 but most cortical areas largely share the same H2 types, or subclasses, of excitatory neurons9,12. Therefore it is unclear how differences in the organization of neuronal subpopulations lead to area-specific gene expression. Three sources of variation could contribute to gene expression differences across areas (Fig. 3a). First, the composition of H2 types may drive differences in gene expression across the cortex (Fig. 3a (left), the cell-type composition model). For example, the ratio of H2-type X to -type Y might be high in the visual but low in the motor cortex, so genes that are expressed more highly in X than in Y will be more highly expressed in the visual cortex. Second, the expression of some genes may vary across space regardless of H2 type—that is, they change consistently across space in multiple H2 types (Fig. 3a (middle), the spatial gradient model). In this model, gene A may be more highly expressed in the visual than in the motor cortex in types X and Y. Finally, the expression of some genes may vary across space in an H2-specific manner (Fig. 3a (right), the area-specialized cell-type model). For example, gene A may be more highly expressed in the visual cortex than in the motor cortex in H2 type X but not in H2 type Y.

Fig. 3: Spatial variations of gene expression across the cortex.
figure 3

a, Three models of differential gene expression across cortical areas. H2 types are indicated by cell shape; cool-toned shapes indicate H2-type-specific genes; warm-toned shapes indicate area-specific genes in model 2 and H3-type-specific genes in model 3. b, Expression of selected spatially variant NMF modules plotted on cortical flatmaps. c, Expression of selected marker genes in each NMF module. d, Histogram showing the number of areas (out of 37) in which an H3 type was detected: an H3 type was considered present in an area if that area contained at least 3% of that H3 type. e, Spatial distribution of example L4/5 IT and L5 ET H3 types across the cortex plotted on cortical flatmaps. Colours indicate relative cell count in each cubelet; grey lines delineate cortical areas. f, Distribution of L4/5 IT H3 types in an example coronal section. Dashed lines indicate area borders in CCF. Magnified views of dashed boxes are shown on the right. Brackets indicate barrels in the barrel cortex. g, Cortical areas defined in CCF (left) and those predicted by H3 type (middle) and cubelet gene expression (right). h, Fraction of correctly predicted cubelets using H3-type composition, cubelet gene expression and shuffled control. Each box shows the performance of n = 100 resampled trials. Boxes show median, and quartiles and whiskers indicate range after exclusion of outliers. Dots indicate outliers. i, Matrix showing the AUROC of pairwise classification between combinations of cortical areas. Areas are sorted by modules, which are colour coded on the left. The dendrogram was calculated using similarity of H3-type composition; clusters were obtained based on the matrix and are shown in the grey-bordered boxes. j, Cortical flatmaps coloured by cell-type-based modules (left) and by connectivity-based modules identified by Harris et al.3 (right).

To determine the contribution of each source to the variation in gene expression across areas we discretized the cortex on each coronal slice into 20 spatial bins (Supplementary Methods and Extended Data Fig. 4a). We then assessed how much of the variation in bulk gene expression across bins could be explained by either space or composition of H2 or H3 types using one-way analysis of variance (Extended Data Fig. 4b,c and Supplementary Methods). We found that all three models contribute to the spatial variation of gene expression, and that the model that contributes most to variation varies across genes (Extended Data Fig. 4b–d and Supplementary Note 4). Because the spatial patterns of many genes were similar, we sought to extract basic spatial components that were shared across genes and H2 types using non-negative matrix factorization (NMF)38 (Supplementary Note 4 and Extended Data Fig. 4e,f). We found that the majority of NMF components were patterned not in broad gradients along major spatial axes, but rather were concentrated in areas that were functionally related and highly interconnected (Fig. 3b and Extended Data Fig. 4g). For example, NMF5 was found mostly in visual areas whereas NMF8 was predominantly in somatosensory areas. Other NMF modules, including NMF1 (medial areas) and NMF10 (lateral areas), were present in combinations of areas that were functionally distinct but also highly interconnected3,4. Spatially variant genes were usually strongly associated with only one or two components (Fig. 3c and Extended Data Figs. 4h and 5), and the association recapitulated known spatial patterns of these genes. For example, Tenm3 was expressed mostly in posterior sensory areas including the visual cortex, auditory cortex and part of the somatosensory cortex30 (Extended Data Fig. 4d, bottom); Tenm3 was strongly associated with NMF5 (Fig. 3c), which was also expressed in the same sets of areas (Fig. 3b). Thus, gene expression varies along sets of interconnected areas, suggesting an intriguing link between gene expression and intracortical connectivity across areas.

Cell-type-defined cortical modules

The spatially varying NMF modules were obtained after controlling for variability in the composition of H2 types, but not of H3 types. Therefore we hypothesized that these modules reflected differences in the composition of H3 types across cortical areas. Consistent with this hypothesis, each H3 type was enriched in a small subset of NMF modules and H3 types also overlapped with their corresponding NMF modules in space (Extended Data Figs. 6 and 7a,b, Supplementary Methods and Supplementary Note 5). To further assess the areal distribution of H3 types we rediscretized the cortex on each coronal slice into ‘cubelets’ of similar width along the mediolateral axis across all slices (Supplementary Methods and Extended Data Fig. 4a). These cubelets were of similar physical size and were narrower on the mediolateral axis than the spatial bins used in the previous analysis; this higher lateral resolution makes it easier to assign cubelets to individual cortical areas. We found that H3 types were shared by multiple cortical areas and were not specific to any single area (each H3 type was found in between six and 12 areas, median ± 1 s.d., Fig. 3d; Fig. 3e shows distributions of example H3 types and Supplementary Fig. 1). Thus the distinctness of neighbouring cortical areas cannot be explained simply by the presence or absence of an area-specific H3 type. However, we noticed that the compositional profiles of H3 types often changed abruptly near area borders defined in the Allen Common Coordinate Framework v.3 (CCF)39 (Fig. 3f and Extended Data Fig. 7c,d). Most salient changes occurred at the lateral and medial areas, which is consistent with scRNA-seq data9. Within the dorsolateral cortex, although neighbouring cortical areas sometimes shared sets of H3 types, their proportions typically changed at or near area borders. Using either gene expression or the compositions of H3 types in each cubelet, we could accurately predict cubelet locations and cortical area labels (75% correct using gene expression and 69% correct using H3-type composition, compared with 8% in shuffled control; Supplementary Note 5, Fig. 3g,h and Extended Data Fig. 7e–i). Thus both cubelet gene expression and H3-type composition are highly predictive of locations along the tangential plane of the cortex and the identity of the cortical areas.

We next assessed the similarity and modularity of cortical areas based on how well these could be distinguished by their H3-type composition (Fig. 3i and Supplementary Methods). In brief, we built a distance matrix between cortical areas based on how well they can be distinguished pairwise using H3 type composition then performed Louvain clustering on the distance matrix. We identified six clusters, each of which consisted of more than one area (Fig. 3i, grey-bordered boxes); these included two clusters corresponding to the visio-auditory areas and one cluster each for the association areas, somatosensory cortex, motor cortex and lateral areas. This modular organization is robust to small errors in CCF registration (Extended Data Fig. 7j and Supplementary Methods). We further combined these clusters with singlet areas (PL, RSPd and RSPv) that did not cluster with any other area into cortical modules based on similarity in H3-type composition. These modules largely included the visio-auditory, somatomotor, association, medial and lateral areas, respectively (Fig. 3i). Notably, these cell-type-based modules were largely consistent with cortical modules that are highly connected (connectivity-based modules)3 (Fig. 3j). Thus, highly interconnected cortical areas share similar groups of H3 types and, consequently, characteristic transcriptomic signatures.

Cell types are robust to enucleation

Transcriptomic types, areas and modules reflect cortical organization at different scales, suggesting that they may be generated through different developmental mechanisms. As a first step in understanding the developmental processes that contribute to cortical organization at different scales, we applied BARseq to examine how the postnatal removal of peripheral sensory input alters the organization of cortical transcriptomic types. Thalamocortical projections have a central role in shaping the identities and borders of cortical areas10,40,41, and loss of postnatal visual inputs affects gene expression in VISp and other areas40,42,43. How peripheral inputs shape cortical neuronal types and the characteristic cell-type compositional profiles of cortical areas, however, is unclear. For example, altered gene expression could result in new cell types that are not seen in a normal brain; alternatively, it could enrich or deplete existing cell types (Fig. 4a). Because BARseq is cost effective and has high throughput, it is uniquely suited for interrogating changes in neuronal gene expression and cell-type compositional profiles on a brain-wide scale across many animals, with or without developmental perturbation.

Fig. 4: BARseq consistently detects cell types across eight enucleated and control animals.
figure 4

a, Models of possible effects of removing peripheral sensory inputs postnatally, including generation of new cell types (left) and/or enrichment and depletion of existing cell types (right). b, We collected brain-wide transcriptomic data from four littermate pairs: within each pair one mouse was enucleated (Enu.) at P1 and the other was a sham control (left, n = 8 animals). A representative stack of 32 slices from one brain (bottom) and close-up images of matching coronal slices from all eight brains (top) are shown. For clarity, only 17 out of 104 genes (indicated on the right) are plotted. c, Genes and read counts per cell for the pilot dataset and the enucleation and control littermates (in order plotted, n = 0.6, 1.0, 1.0, 1.1, 1.3, 1.2, 1.1, 1.2 and 1.1 million biologically independent cells). Boxes indicate the median and first and third quartiles; whiskers extend to the most extreme value up to 1.5 times interquartile range from each bound; remaining dots are plotted individually. df, UMAP plots of gene expression of excitatory neurons from all eight animals. Neurons are colour coded by animal (d), by H2 type (e) and by condition (enucleated or control, f). Labels show only H2 types in cortex.

We performed binocular enucleation on four mice at postnatal day 1 and collected their brains at postnatal day 28, along with those of four matched littermate controls (n = 8 animals) (Fig. 4b). We performed BARseq using an improved microscope that achieved better data quality and much faster data acquisition compared with the pilot dataset (2.3 days per brain; Supplementary Methods and Supplementary Note 6). In total, the full dataset contained 9.1  million quality controlled cells covering most of the forebrain of all eight animals (Fig. 4b), with a median of 87 reads per cell and 37 genes per cell (Fig. 4c). Cells from individual brains were interdigitated with those from other brains in UMAP space, suggesting that there were minimal batch effects (Fig. 4d and Extended Data Fig. 8a). Therefore, we performed de novo clustering hierarchically on the concatenated data of 3,957,252 excitatory neurons, 1,526,182 inhibitory neurons and 3,635,402 other cells at the H1 level. The fraction of other cells was significantly higher than that in the pilot dataset, probably because the improved data quality allowed more cells with lower read counts to pass quality control and be included. We then reclustered the excitatory neurons into 35 H2 types (Fig. 4e) and 154 H3 types, including 12 H2 types and 70 H3 types predominantly found in the cortex. These H3 types in the new dataset closely matched those in the pilot dataset (Extended Data Fig. 8b,c; Supplementary Note 6 shows mapping to the pilot dataset). Notably, no H3 type was strongly enriched or depleted in enucleated brains compared with control (Extended Data Fig. 8f; Supplementary Note 7 and Extended Data Fig. 8d–i provide detailed analyses). Although we cannot fully rule out the possibility that minor changes in gene expression were missed at our transcriptomic resolution, these results suggest that enucleation did not lead to the creation of new cell types at the H3 level; rather, the main effect of enucleation was probably reflected in changes in the compositional profiles of H3 types.

Enucleation alters cell-type make-up

Having established that enucleation did not create new H3 types, we sought to characterize enucleation-induced changes in area-specific H3-type composition. We divided the cortex into cubelets using an approach similar to that used for the pilot data (Supplementary Methods). This discretization resulted in about 270 neurons per cubelet, with a mean distance of 181 µm between adjacent cubelets in a section. To visualize H3 type composition we plotted UMAP analysis based on the fraction of H3 types in each cubelet (Fig. 5a–c). Consistent with the absence of batch effects seen in single-neuron gene expression (Fig. 4d), cubelets from all eight animals mixed smoothly in most areas (Fig. 5a). Colour coding of cubelets by condition (Fig. 5b), however, revealed an ‘island’ (left) within which cubelets from the two populations (enucleated versus control) were largely segregated. This island contained mainly cubelets from VISp and other visual areas (Fig. 5b,c, insets). To quantify differences in the compositional profiles of H3 types between control and enucleated brains we trained a classifier to assess how distinct cubelets from each cortical area were between the two conditions (Supplementary Methods). If enucleation consistently altered the compositional profile of H3 types in a cortical area, then we would expect the classifier to predict whether a cubelet was from a control or an enucleated animal based on its H3 type composition above chance level. In most cortical areas the classifier performed at chance level, but VISp cubelets were highly predictive of condition (Fig. 5d; area under the receiver operating characteristic (AUROC) 0.90 ± 0.06 compared with shuffled AUROC 0.56 ± 0.27, median ± s.d.; P = 2 × 10−33 using rank sum test and Bonferroni correction). Two higher visual areas (VISpm and VISl; AUROC median ± s.d. 0.70 ± 0.12 and 0.66 ± 0.15; and shuffled AUROC median ± s.d. 0.50 ± 0.25 and 0.44 ± 0.25; P = 3 × 10−9 and 7 × 10−9, respectively, comparing each area with shuffled control using rank sum test and Bonferroni correction) and a non-visual area (SSp-ll; AUROC 0.57 ± 0.09 and shuffled AUROC 0.42 ± 0.18, median ± s.d.; P = 2 × 10−8 compared with shuffled control using rank sum test and Bonferroni correction) were also predictive above chance level, although the predictive powers were much lower. Thus, enucleation largely affected the relative composition of H3 types within visual areas.

Fig. 5: Enucleation alters visual area identities within the visio-auditory module.
figure 5

ac, UMAP plots of H3 type composition of cubelets from all eight brains, colour coded by animal (a), experimental condition (b) and cortical area (c). Insets show amplified views of boxed areas. d, Flatmap showing how distinct an area was between control and enucleated brains (AUROC). Colours indicate ΔAUROC between scores with or without shuffling of brain conditions. Red asterisks indicate false discovery rate below 0.05. e, Representative slices from a pair of littermates showing selected L2/3 IT and L6 IT H3 types enriched or depleted in VISp after enucleation. f, Fractions of enriched or depleted H3 types in each area. g, Fractions of H3 types in VISp and their fold change after enucleation. Colours indicate log(fold change) and circle size indicates the fraction of VISp neurons belonging to each H3 type. h, Shifts in areal identity for VISp (left), VISl (top right) and VISpm (bottom right). Colours indicate fraction of enriched or depleted neighbours in enucleated brains compared with control. The area of interest in each plot is shaded in grey and indicated by an orange asterisk. Areas outlined in black indicate the visio-auditory module. i, Models of cortical cell-type organization. Cortical modules are established independently of peripheral inputs (left). Under normal conditions, cortical areas follow wire-by-similarity (top). Areas within a module have similar cell types and are more interconnected. When peripheral inputs are removed (bottom), the cell-type compositional profile of the affected area shifts towards other areas within the same module. Modules are denoted by patterned backgrounds. Coloured dots and stars indicate cell types in the two modules, respectively; coloured arrows indicate thalamic inputs; black arrows within the cortex indicate connectivity.

The effect of enucleation can be observed directly in the distribution of H3 types in the primary visual area (Fig. 5e and Extended Data Fig. 9a). For example, many L2/3 IT M–L_2 neurons (Fig. 5e, yellow dots) were found in VISp in control animals, but L2/3 IT L–M neurons (Fig. 5e, green dots) became enriched in VISp in enucleated animals. Similarly, L6 IT DL neurons (Fig. 5e, purple dots) were found in higher numbers in the VISp of enucleated animals compared with control animals. To systematically examine how enucleation affected the compositional profiles of cortical excitatory cell types in each area we looked for H3 types that were enriched or depleted in enucleated brains using an analysis of variance model, adjusting for litter and area effects (Supplementary Methods). We found that 46 H3 types in 18 areas across the whole cortex were either over- or under-represented in enucleated animals compared with control. VISp had the most H3 types (16) whose compositions were altered by enucleation (Fig. 5f). The affected H3 types were found across most H2 types, with the strongest enrichment or depletion of H3 types of L2/3 IT, L4/5 IT and L6 IT (Fig. 5g). Intriguingly, L6b/CT A–L_2, a transitional type between L6 CT and L6b H2 types usually found only in lateral areas, was also highly enriched in VISp after enucleation. The affected H3 types remained in their characteristic sublaminar positions (Extended Data Fig. 9b) and overall changes were consistent with, but broader than, those observed during dark rearing during the critical period42 (Extended Data Fig. 9c and Supplementary Note 8). The top enriched H3 types, including L2/3 IT M–L, L2/3 IT L–M, L4/5 IT M–L, L6 IT DL and L6b/CT A–L_2, were all enriched in medial and lateral areas in the control brains, including areas immediately medial and lateral to the visual areas (Extended Data Fig. 9d). Thus, enucleation broadly shifted neurons in VISp towards H3 types that were usually enriched in the medial and lateral areas in control brains.

Peripheral inputs sculpt area identities

Because enriched H3 types were consistently found in medial and/or lateral areas in control animals, we wondered whether enucleation had also shifted overall area identity—as defined by the H3-type compositional profiles—of the visual cortex towards other areas. To examine how area identities had changed after enucleation we used a nearest-neighbour-based approach inspired by MetaNeighbor44 to assess the similarity of cubelets in both control and enucleated brains to other cubelets in control brains (Supplementary Methods). If enucleation had shifted the compositional profile of an area towards a target area, cubelets from the affected area in the enucleated brain would then have had more neighbours in the target area than cubelets from the same area in the control brain. For each cubelet in a littermate pair we found the 20 cubelets with closest match in H3-type composition in control brains from the other three pairs of littermates. We then calculated the similarity, quantified by AUROC for assigning cubelets from each area to areas in the control brains based on nearest neighbours (Extended Data Fig. 9e). All three visual areas (VISp, VISl and VISpm, circled in Extended Data Fig. 9e) remained highly similar to the same areas in control brains (AUROC 0.97 and 0.98 for control and enucleated VISp, 0.90 and 0.92 for control and enucleated VISl and 0.88 and 0.94 for control and enucleated VISpm, respectively), indicating that their H3-type compositions remained highly distinct from other areas despite the changes induced by enucleation. However, all three visual areas also shifted towards the identities of neighbouring regions as judged by the fraction of neighbours from an area (Fig. 5h). For example, VISp cubelets from the enucleated brains had higher AUROC scores with both VISl and VISpm than those from control brains (0.85 and 0.89 for enucleated cubelets and 0.76 and 0.83 for control cubelets; Extended Data Fig. 9e). Consistent with the high AUROC scores observed, VISp cubelets from enucleated brains also had more neighbours in VISl and VISpm (Fig. 5h). Similarly, VISl cubelets from enucleated brains had more neighbours in auditory areas and VISpm cubelets from enucleated brains had more neighbours in VISam and RSPagl (Fig. 5h, insets). Notably, all three areas shifted towards neighbouring areas that were physically further away from VISp and were within the visio-auditory module (black outlines in Fig. 5h). To examine whether these changes reflected a shift in area borders or a change in composition across an area, we plotted each cubelet from enucleated brains and coloured them by differences in the number of neighbour cubelets in VISl (Extended Data Fig. 9f, top) and VISpm (Extended Data Fig. 9f, bottom). In VISp the enrichment of neighbours in VISl and VISpm was found in cubelets across the whole area. In particular, cubelets that had more neighbours in VISpm after enucleation (red dots in VISp in Extended Data Fig. 9f, bottom) appeared to be concentrated at the centre of VISp rather than at the borders, suggesting that changes in similarity among these areas reflected an overall change in cell-type composition rather than a shift in area borders. Thus, enucleation shifted the H3-type composition-defined area identities of the visual areas towards neighbouring areas within the visio-auditory module.

Discussion

Using BARseq, we generated cortex-wide maps of transcriptomic types of excitatory neurons at high transcriptomic and spatial resolution across nine animals. These maps not only elaborate the distribution of cortical excitatory neuron types previously revealed by single-cell studies9,12, but also provide an ‘anchor’ to associate other neuronal properties and activity with neuronal types. Thus, our spatial cell-type map provides a foundational resource for understanding the structural and functional specialization of cortical areas. We focused on the cortex, but the same approach can be applied to any other brain region with adequately designed gene panels. When examining large numbers of genes, overcoming optical crowding by computational demixing of overlapping signals45 and/or optimization of cell segmentation using more recent approaches24,46,47 may further improve the ability to resolve single-cell gene expression accurately.

Our results suggest that the cell-type compositional profiles of cortical areas reflect their modular organization seen in connectivity studies: cortical areas that are highly interconnected also have similar H3 types (Fig. 5i, top). This ‘wire-by-similarity’ relationship is not a trivial consequence of cell-type-specific connectivity observed at a cortex-wide scale, because cortical neurons of the same type are not necessarily highly connected (for example, Sst neurons48). Thus, wire-by-similarity does not describe the connectivity of individual neuronal types but rather reflects how divisions within a large brain region (that is, areas within the cortex) relate to each other in terms of cell types and connectivity. Future studies using BARseq to map the projections of neuronal types at cellular resolution, from multiple cortical areas and at multiple developmental time points, can help resolve the single-cell basis of the wire-by-similarity organization.

The combination of single-cell resolution, high transcriptomic resolution and broad interrogation across many cortical areas allowed us to describe in detail how gene expression and cell-type compositional profiles change after removal of peripheral sensory inputs. Overall, the effects of enucleation suggest that peripheral activity refines the cell-type compositional profiles of cortical areas. Enucleation affected IT neurons in all layers and also L6b/CT neurons, a broader population than the L2/3 IT neurons affected by dark rearing (Fig. 5g)42. However, enucleation did not completely abolish the distinction between primary and secondary visual areas, as observed by Chou et al.40 after genetic ablation of thalamocortical axons (Supplementary Note 8). Thus, together with previous studies, our results suggest a consistent model: the physical connections established by thalamocortical axons are needed to define the primary visual cortex, and peripheral activity sharpens cell-type composition across both the primary visual cortex and neighbouring higher visual areas within a cortical module (Fig. 5i, bottom).

BARseq stands out among spatial transcriptomic methods with its high throughput (about 2.3 days per brain on one microscope), low cost (approximately US $2,000 per brain) and high reproducibility. These features make it possible to compare brain-wide spatial gene expression across many animals, thus providing a path to go beyond a single-reference brain atlas31,33,34 towards a ‘pan-transcriptomic’ atlas that captures population diversity. Furthermore, combining interrogation across multiple individuals with perturbations enables the discovery of causal relationships. Whereas we studied the effect of developmental perturbations, the same approach can also be used in neuropsychiatric disease models, ageing studies, cross-species comparison and other experimental perturbations. Our approach based on BARseq can be broadly applied to link brain-wide, network-level dynamics with detailed changes in gene expression in single neurons, and to establish causal relationships between developmental processes and brain-wide cell-type organization.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.