Main

The wide array of debilitating disorders affecting the central nervous system (CNS) makes it a primary target for the application of novel therapeutic modalities, as evidenced by the rapid development and implementation of gene therapies targeting the brain. AAVs are increasingly used as delivery vehicles owing to their strong clinical safety record, low pathogenicity and stable, non-integrating expression in vivo1. AAVs were first approved for gene therapy in humans in 2012 to treat lipoprotein lipase deficiency2 and have, more recently, been approved to treat spinal muscular atrophy3, retinal dystrophy and hemophilia4,5. Many more advanced-stage clinical trials using AAVs are underway6. Currently, most AAV-based gene therapies rely on naturally occurring serotypes with highly overlapping tropisms7, limiting the applicability, efficacy and safety of novel gene therapies. At the same time, the inability to broadly and efficiently target many therapeutically relevant cell populations within organs that have traditionally been refractory to AAV delivery has hindered novel exploratory and therapeutic efforts. These constraints, which are especially prevalent for the CNS, motivated us to enhance AAV efficiency and specificity through directed evolution.

Naturally occurring AAVs have evolved to broadly infect cells8, which is a desirable characteristic for the survival of the virus but undesirable for targeting specific cell types. This limitation has typically been addressed by injecting viral vectors directly into the area of interest in the CNS9, through either intracranial or intrathecal injection. Although direct injection is a valuable technique for targeting focal cell populations, it requires considerable surgical expertise, and it cannot address applications that require broad and uniform area coverage (for example, all cortex or all striatum) or that target regions that are surgically difficult to access owing to the invasiveness of direct injections (for example, cerebellum and dorsal raphe)9,10,11. Systemic administration into the bloodstream is an appealing solution to the limitations of direct injections, as gene therapy vectors can be non-invasively delivered throughout the body. However, naturally occurring AAV serotypes tend to target non-CNS tissues, notably the liver, at high levels7,8. The liver is an immunologically active organ, with large populations of phagocytic cells that play a critical role in immune activation12,13. Viral targeting to these tissues can trigger immune responses, such as liver toxicity14,15, reducing the safety and limiting the efficacy of systemic injection. Naturally occurring serotypes also have severely limited transduction efficiency in the brain owing to the stringency of the blood–brain barrier, consequently requiring the production of large, high-quality vector titers. Enabling and refining both direct and systemic injections will best equip the community with flexible options to pursue a broader range of neuroscience applications.

To target a precise anatomical region and/or cell type in conjunction with systemic injection, specificity can be obtained or refined through the inclusion of cell-type-specific promoters16,17,18, enhancer elements19,20,21,22,23 and microRNA target sites24,25 into AAV viral genomes, or viral capsids can be engineered to alter their tissue tropism. Previously, we harnessed the power and specificity of Cre transgenic mice to apply increased selective pressure to viral engineering, leading to variants AAV-PHP.B and AAV-PHP.eB, which cross the blood–brain barrier after intravenous administration and broadly transduce cells throughout the CNS22,26. These variants transformed the way that the CNS is studied in rodents, opening up completely novel approaches for measuring and affecting brain activity, in both exploratory and translational contexts18,27,28,29. However, these variants also transduce cells in the liver and other off-target organs. Therefore, the development of AAV variants with decreased liver targeting is particularly important to avoid strong, systemic immune responses30,31,32.

To achieve specificity with the viral capsid, it is imperative that one applies both positive and negative selective pressure to engineered capsid libraries. The M-CREATE method33 applies next-generation sequencing (NGS) of synthetic libraries with built-in controls to screen viral variants across multiple Cre transgenic lines for both positive and negative features. Using the M-CREATE method, we selected a panel of novel variants that are both highly enriched in the CNS and targeted away from varying peripheral organs. Transgene expression after delivery with AAV.CAP-B10, described herein, was found to be highly specific for neurons in the CNS, significantly decreased in all peripheral organs assayed and targeted away from the liver in mice. Notably, although AAV-PHP.B failed to translate to non-human primates (NHPs)34,35, here we show robust transgene expression after intravenous administration of newly engineered variants in the adult marmoset CNS with minimal liver expression compared to AAV9 and AAV-PHP.eB.

Results

Engineering AAV capsids at the three-fold point of symmetry

The most commonly altered position within the AAV capsid is the surface-exposed loop containing amino acid (AA) 588, because it is the site of heparan sulfate binding in AAV2 (ref. 36) and is amenable to peptide display37,38. The only known receptors for AAV9 are N-linked terminal galactose39 and the AAV receptor40,41, but the possibility of co-receptors is still unexplored. Binding interactions with cell surface receptors occur near the three-fold axis of symmetry of the viral capsid (Fig. 1a) where the surface-exposed loop containing AA455 of AAV9 is the farthest protruding42. Having already engineered AAV-PHP.B and AAV-PHP.eB at the AA588 loop for enhanced CNS transduction22,26, and with the additional goal of decreasing viral transduction of peripheral organs, we theorized that introducing diversity into the AA455 loop would enhance the interaction with existing mutations of the AA588 loop and refine transduction. Thus, we engineered and performed two rounds of selection with a 7-AA substitution library of the AA455 loop, between AA452 and AA460 (Fig. 1a) in AAV-PHP.eB.

Fig. 1: Capsid engineering locations and CAP-B library characterization.

a, Left, AAV9 capsid surface model illustrates the location of the protruding loop structures, in yellow. Right, capsid zoom-in shows the spike created by the AA579–AA594 and the AA449–AA468 variable regions of one AAV9 monomer interacting with the AA487–AA504 variable region of a second monomer. b, Experimental workflow. First, a library of variants is designed by mutating AAs between the AA452 and AA458 sites on an AAV-PHP.eB backbone. Cre-Lox sites are inserted into the viral genome to enable detection of genomes that reach Cre-recombinase-containing cells. This viral library is used to create viral capsids containing their corresponding genome. Next, variants are injected into transgenic mouse lines, where Cre recombination occurs after transduction. After 3 weeks of expression, relevant tissues are harvested from each transgenic line, and the DNA is isolated, amplified and prepared for sequencing. After NGS, variants are ranked based on enrichment in select tissues, and the top 10–20% performers repeat the selection process above. Top enriched variants with unique tissue tropism profiles after two rounds of selection are validated individually for transgene expression by in vivo screening in mice. Variants that display similar brain, but reduced liver, transgene expression are selected for pooled marmoset testing. c, Heat map plotting the enrichment scores of a subset of the 22 variants identified after two rounds of selection in vivo across Cre transgenic lines shows enrichment and specificity for the CNS. The sequences for the 7-AA substitutions present in each of the variants show divergence from the parent AAV-PHP.eB sequence (see also Extended Data Fig. 1). d, mNeonGreen was packaged in each variant under control of the ubiquitous CAG promoter and intravenously administered to mice at a dose of 5 × 10^11 viral genomes per animal. Transgene expression was assayed by mNeonGreen fluorescence throughout brain and liver.
Expression from ssAAV9, ssAAV-PHP.eB, ssAAV.CAP-B1, ssAAV.CAP-B2, ssAAV.CAP-B8, ssAAV.CAP-B10, ssAAV.CAP-B18 and ssAAV.CAP-B22 packaging CAG-mNeonGreen was assessed after 2 weeks. Direct comparison of the expression profiles in brain and liver of the top-performing variants shows a correlation between validated tropisms and those predicted by the NGS data. Scale bars, 2 mm.

We applied the M-CREATE method developed by Kumar et al.33 to identify AAV variants with desired tropism after systemic administration. Initially, we generated a library of AAV capsid sequences, theoretically containing 1.28 billion 7-mer AA substitutions, and produced the corresponding viruses in HEK293 cells that package a replication-incompetent version of their own genome with a polyadenylation sequence flanked by Cre-Lox sites (Fig. 1b). This viral library is then injected into transgenic animals expressing Cre recombinase in specific cell populations. Variants that successfully transduce Cre+ cells have their genome flipped, and the corresponding sequences are recovered. To exert simultaneous positive and negative selective pressure, we performed parallel selections of this library in multiple transgenic lines expressing Cre recombinase in different cell populations. The differential expression of Cre in the transgenic lines allows for shifts in the tropism of each viral variant to be compared between tissue types, enabling the recovery of unique sequences with a desired transduction profile after only two rounds of selection (for example, strong CNS and weak liver transduction).

We injected viral libraries into four separate Cre transgenic mouse lines to select viral variants specific to the CNS: Tek-Cre for endothelial cells throughout the body; hSyn1-Cre for neurons of the CNS and peripheral nervous system; GFAP-Cre for astrocytes; and TH-Cre for midbrain cells. Using the described Cre-dependent method, we recovered viral DNA from all tissues across each mouse line and identified 434,770 capsid variants capable of tissue transduction through NGS. Afterwards, we ranked variant sequences from each tissue by their enrichment score, defined as the relative abundance of the sequence found within the specific tissue over the relative abundance of that sequence within the injected viral library. After ranking by enrichment, we selected the top 10–20% of variants with desired profiles from each tissue for the next round of selection.
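The enrichment score defined above can be illustrated with a minimal sketch. This is not the M-CREATE pipeline itself; the function names, the toy read counts and the placeholder 7-mer sequences are all hypothetical and serve only to make the ratio of relative abundances concrete.

```python
def relative_abundance(counts):
    """Convert raw read counts per variant into fractions of the total."""
    total = sum(counts.values())
    return {seq: n / total for seq, n in counts.items()}


def enrichment_scores(tissue_counts, library_counts):
    """Relative abundance in the tissue divided by relative abundance
    in the injected library, for every variant present in the library."""
    tissue_freq = relative_abundance(tissue_counts)
    library_freq = relative_abundance(library_counts)
    return {
        seq: tissue_freq.get(seq, 0.0) / lib_f
        for seq, lib_f in library_freq.items()
    }


def top_fraction(scores, fraction):
    """Return the highest-scoring variants, keeping the given fraction
    of the ranked list (at least one variant)."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    keep = max(1, int(len(ranked) * fraction))
    return ranked[:keep]


# Toy read counts (hypothetical, not data from this study).
library = {"AAAAAAA": 100, "CCCCCCC": 100, "DDDDDDD": 200}
brain = {"AAAAAAA": 30, "CCCCCCC": 10, "DDDDDDD": 10}

scores = enrichment_scores(brain, library)
best = top_fraction(scores, 0.2)
```

A score above 1 means the variant is over-represented in the tissue relative to the injected library, and a score below 1 means it is depleted. In practice, variants absent from the tissue or sequenced at very low depth would need additional handling that this sketch omits.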

Capsid engineering refines expression patterns in mice

We used the enriched sequences from the first round of selection to synthesize a library for the second round of selection. The second-round library contained codon sequences of 82,710 unique variants and duplicate, codon-modified versions of each variant as a replicate. Using the same selection methods and transgenic lines as round 1, we determined the enrichment of each variant across tissues and transgenic lines. The second round of selection reduced the number of enriched variants by two orders of magnitude, and we identified 39,034 sequences that result in CNS enrichment and decreased targeting of the liver. Using these sequences, we differentiated a subset of AAs in each position that contributed to this tropism profile (Extended Data Fig. 1). We ranked sequences by their specificity to CNS tissues and compared against the enrichment in peripheral tissues across the different transgenic lines, selecting 22 sequences with diverse sequence identity (Fig. 1c) for further testing in vivo.
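One way to picture the combined positive and negative selection is a filter on per-tissue enrichment scores. The sketch below is an illustration under stated assumptions, not the ranking procedure used in the study: it assumes a log10 transform of the enrichment score, treats a positive log value in brain and a negative log value in liver as the selection criteria, and ranks by the difference between the two. The variant names, scores and pseudocount are hypothetical.

```python
import math


def log_enrichment(score, pseudocount=1e-6):
    """log10 of an enrichment score; the pseudocount (an assumption)
    avoids taking the log of zero for variants absent from a tissue."""
    return math.log10(score + pseudocount)


def select_cns_specific(brain_scores, liver_scores):
    """Keep variants that are enriched in brain (log score > 0) and
    depleted in liver (log score < 0), ranked by brain-minus-liver
    log enrichment, highest first."""
    selected = []
    for seq, brain in brain_scores.items():
        liver = liver_scores.get(seq, 0.0)
        log_brain = log_enrichment(brain)
        log_liver = log_enrichment(liver)
        if log_brain > 0 and log_liver < 0:
            selected.append((seq, log_brain - log_liver))
    selected.sort(key=lambda item: item[1], reverse=True)
    return selected


# Toy enrichment scores (hypothetical, not data from this study).
brain_scores = {"V1": 8.0, "V2": 2.0, "V3": 0.5, "V4": 4.0}
liver_scores = {"V1": 0.1, "V2": 3.0, "V3": 0.2, "V4": 0.5}

ranked = select_cns_specific(brain_scores, liver_scores)
ranked_names = [name for name, _ in ranked]
```

Here V2 is rejected because it is also enriched in liver, and V3 is rejected because it is depleted in brain. A real selection across several Cre lines and tissues would combine more than two scores per variant.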

To validate the enrichment profile obtained from sequencing, we individually screened each of these 22 capsid variants for their tissue tropism after systemic injection in mice. We produced capsid variants, packaging a mNeonGreen fluorescent reporter under regulation of the ubiquitous CAG promoter, and administered each variant to wild-type (WT) mice by intravenous injection at a dose of 5 × 10^11 viral genomes per animal. After 2 weeks of expression, we imaged brains and livers and quantified fluorescence to assess mNeonGreen expression. Six variants displayed similar or higher fluorescence in the brain to AAV-PHP.eB as well as reduced liver expression in mice (Fig. 1d): AAV.CAP-B1, AAV.CAP-B2, AAV.CAP-B8, AAV.CAP-B10, AAV.CAP-B18 and AAV.CAP-B22. Owing to the negligible liver expression of AAV.CAP-B10, we characterized this variant further in mice, whereas we selected all six variants for pooled testing in marmosets.

AAV.CAP-B10 yields CNS-specific transgene expression in mice

To characterize the performance of AAV.CAP-B10 in comparison to AAV9 and AAV-PHP.eB, we packaged a nuclear-localized enhanced green fluorescent protein (EGFP) under regulation of a ubiquitously expressed CAG promoter, enabling quantification across cell types throughout the body. The variants were injected into mice at a dose of 1 × 10^11 viral genomes per animal, and tissue was collected after 3 weeks of expression. Detailed cell counts of overall EGFP expression throughout the body showed that expression after delivery with AAV.CAP-B10 is highly specific to the CNS. In the brain, delivery with AAV.CAP-B10 yielded a similar number of EGFP-expressing cells with a similar level of expression per cell to AAV-PHP.eB, whereas both AAV.CAP-B10 and AAV-PHP.eB had many more cells expressing EGFP than AAV9 (Fig. 2a). In the spinal cord, 40% fewer cells displayed EGFP expression after delivery with AAV.CAP-B10 than with AAV-PHP.eB but approximately 16-fold more than AAV9 (Extended Data Fig. 2). Conversely, the number of cells expressing EGFP after delivery with AAV.CAP-B10 was significantly reduced in the liver compared to both AAV-PHP.eB (~50-fold) and AAV9 (>100-fold). EGFP expression after delivery with AAV.CAP-B10, but not AAV-PHP.eB, was significantly dimmer per cell than AAV9 in the liver (~tenfold) (Fig. 2b). This trend was maintained in the other peripheral organs, with significantly fewer cells displaying EGFP expression after delivery with AAV.CAP-B10 than AAV9 (Extended Data Fig. 2).

Fig. 2: AAV.CAP-B10 tissue expression profile is biased toward the brain, with a significant decrease in liver targeting.

a, ssAAV9, ssAAV-PHP.eB and ssAAV.CAP-B10, packaging a nuclear-localized GFP under the control of the CAG promoter, were intravenously injected into male adult mice at 1 × 10^11 viral genomes per mouse. GFP fluorescence was assessed after three weeks of expression. Quantification of the total number of cells expressing GFP in the brain (P = 0.0016 (AAV9 versus AAV-PHP.eB), P = 0.0003 (AAV9 versus AAV.CAP-B10) and P = 0.4345 (AAV-PHP.eB versus AAV.CAP-B10)) and the average brightness per cell (P = 0.06 (AAV9 versus AAV-PHP.eB), P = 0.0043 (AAV9 versus AAV.CAP-B10) and P > 0.999 (AAV-PHP.eB versus AAV.CAP-B10)) show an increase in transgene expression efficiency in the brain of AAV.CAP-B10 similar to AAV-PHP.eB. b, Quantification of the total percentage of cells expressing GFP in the liver (P = 0.0127 (AAV9 versus AAV-PHP.eB), P < 0.0001 (AAV9 versus AAV.CAP-B10) and P = 0.0047 (AAV-PHP.eB versus AAV.CAP-B10)) and average brightness per cell (P = 0.080 (AAV9 versus AAV-PHP.eB), P = 0.0009 (AAV9 versus AAV.CAP-B10) and P = 0.48 (AAV-PHP.eB versus AAV.CAP-B10)) show an iterative decrease in transgene expression efficiency from AAV-PHP.eB to AAV.CAP-B10. a, b, n = 6 mice per group, mean ± s.e. Statistical significance was determined using Brown–Forsythe and Welch ANOVA tests with the Dunnett T3 correction for multiple comparisons for transgene expression and the Kruskal–Wallis test with Dunn’s correction for multiple comparisons for brightness. c, d, Within the brain, AAV.CAP-B10 is biased toward neurons (P = 0.44 (cortex), P = 0.21 (hippocampus), P = 0.81 (thalamus), P = 0.56 (striatum), P = 0.31 (midbrain) and P = 0.82 (total)), stained with αNeuN (Abcam, 177487). Expression and percentage of neurons expressing transgene. e, f, Expression and percentage of astrocytes expressing transgene (P = 0.00002 (cortex), P = 0.0034 (hippocampus), P = 0.017 (thalamus), P = 0.057 (striatum), P = 0.0070 (midbrain) and P = 0.0012 (total)), stained with αS100 (Abcam, 868).
g, h, Expression and percentage of oligodendrocytes expressing transgene (P = 0.0039 (cortex), P = 0.00046 (hippocampus), P = 0.0047 (thalamus), P = 0.0085 (striatum), P = 0.00019 (midbrain) and P = 0.0018 (total)), stained with αOlig2 (Abcam, 109186). c–h, n = 6 mice per group, except for hippocampal NeuN+ counts of AAV.CAP-B10, where n = 3, mean ± s.e. Statistical significance was determined using two-sided Welch’s t-tests. The contribution of cells from different classifications to overall EGFP expression was measured and indicates a shift toward neuronal specificity of AAV.CAP-B10 compared to AAV-PHP.eB (see also Extended Data Figs. 2–4). Scale bars, 200 µm. A.U., arbitrary units; NS, not significant; vg, viral genomes.

AAV.CAP-B10 brain transgene expression is neuronal specific

To further characterize transgene expression in the CNS after delivery with AAV.CAP-B10 compared to AAV-PHP.eB, we co-stained for neurons (αNeuN), astrocytes (αS100), oligodendrocytes (αOlig2) and Purkinje cells (αCalbindin) and quantified the efficiency of EGFP expression after delivery with each capsid in each cell type across various brain regions (Fig. 2c–h and Extended Data Fig. 3). Whereas neurons displayed EGFP expression with similar efficiencies after delivery by AAV.CAP-B10 or AAV-PHP.eB across brain regions (Fig. 2c,d), roughly four- to fivefold fewer astrocytes and oligodendrocytes display EGFP expression after delivery by AAV.CAP-B10 compared to AAV-PHP.eB (Fig. 2e–h). This indicates that the AAV.CAP-B10 mutation confers a bias for neurons over other cell types, an interesting deviation from AAV9, which mostly targets astrocytes in the brain43,44. A noteworthy indication from the NGS data for AAV.CAP-B10 was this variant’s decreased presence in the cerebellum, which was also evidenced by a decrease in the overall cerebellar fluorescence in the initial characterization (Fig. 1d). When comparing the cerebellar expression after delivery with AAV.CAP-B10 to that of AAV-PHP.eB, we found a significant, roughly fourfold decrease in the number of Purkinje cells displaying EGFP expression (Extended Data Fig. 4).

Engineered variants maintain robust tropism in marmosets

Of primary concern for the therapeutic applicability of variants engineered in rodents is how well their transgene expression profiles translate to NHPs. Therefore, we sought to characterize the marmoset CNS transgene expression profile of a subset of the variants validated in mice, along with AAV9 and AAV-PHP.eB as controls. We produced a pool of eight viruses (AAV9, AAV-PHP.eB, AAV.CAP-B1, AAV.CAP-B2, AAV.CAP-B8, AAV.CAP-B10, AAV.CAP-B18 and AAV.CAP-B22), each packaging an HA-tagged frataxin (FXN) with a unique molecular barcode under control of the ubiquitous CAG promoter. We opted to use FXN because it is an endogenous protein expressed throughout the body, and previous efforts to characterize NHP transgene expression after delivery with naturally occurring and engineered serotypes have resulted in deleterious effects for the host, potentially due to the packaging of exogenous transgenes such as GFP30,45,46. We also included a separate 12-base RNA barcode in each genome to distinguish the contribution of each virus from the rest by NGS. The eight viruses were pooled at equal ratios and intravenously injected into two adult marmosets at doses of 1.2 × 10^14 (marmoset pool virus 1 (MPV1)) and 7 × 10^13 (MPV2) viral genomes per kg (Extended Data Table 1). After 6 weeks of expression, during which no adverse health effects were observed, the brains and livers were recovered, and sections were taken for RNA sequencing and immunohistochemistry.

Staining for the HA tag revealed that robust and broad expression was achieved in the adult marmoset brain, with strong expression of the viral pool observed in cortical, subcortical and cerebellar layers throughout (Fig. 3a,b). In the liver, moderate FXN expression was observed after staining for the HA tag (Fig. 3c). From multiple slices per animal distributed throughout the brain and liver, we extracted RNA, performed NGS and quantified the relative expression levels of each of the barcoded viruses. AAV.CAP-B10 recapitulated the trends observed in mice (albeit at lower levels), with a ~sixfold increase in RNA in the brain and a ~fivefold decrease in RNA in the liver compared to AAV9 (Fig. 3d,e). AAV.CAP-B22 exhibited the largest increase in RNA levels in the brain, with a greater than 12-fold increase in the brain compared to AAV9, but relatively similar levels in the liver (Fig. 3d,e). Based on their unique expression profiles in the pooled screen, we selected AAV.CAP-B10 and AAV.CAP-B22 for individual characterization of their CNS transgene expression profile in adult marmosets compared to AAV9 and AAV-PHP.eB.
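The barcode-based comparison of pooled capsids can be sketched as a simple read-counting exercise. The code below is illustrative only: the 12-base barcode sequences, the read layout and the counts are hypothetical, and because the viruses are described as pooled at equal ratios, the sketch assumes no further correction for input abundance.

```python
from collections import Counter

# Hypothetical 12-base barcodes; the real barcode sequences are not given here.
BARCODES = {
    "ACGTACGTACGT": "AAV9",
    "TTGCAAGGCTTA": "AAV.CAP-B10",
    "GGATCCTTAGCA": "AAV.CAP-B22",
}


def count_barcodes(reads, barcodes):
    """Count reads per capsid by exact barcode match anywhere in the read."""
    counts = Counter({name: 0 for name in barcodes.values()})
    for read in reads:
        for barcode, name in barcodes.items():
            if barcode in read:
                counts[name] += 1
                break  # a read is assigned to at most one capsid
    return counts


def fold_change_vs_reference(counts, reference="AAV9"):
    """Express each capsid's read count relative to the reference capsid."""
    ref = counts[reference]
    if ref == 0:
        raise ValueError("reference capsid has no reads")
    return {name: n / ref for name, n in counts.items()}


# Toy reads (hypothetical): barcode flanked by two bases on each side.
reads = (
    ["AA" + "ACGTACGTACGT" + "CC"] * 4
    + ["AA" + "TTGCAAGGCTTA" + "CC"] * 8
    + ["AA" + "GGATCCTTAGCA" + "CC"] * 12
)

counts = count_barcodes(reads, BARCODES)
fold = fold_change_vs_reference(counts)
```

Exact matching keeps the example short; a production analysis would typically anchor the barcode to a fixed position in the amplicon and tolerate sequencing errors.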

Fig. 3: Characterization of pooled capsid transgene expression in NHPs.

a, AAV.CAP-B1, AAV.CAP-B2, AAV.CAP-B8, AAV.CAP-B10, AAV.CAP-B18 and AAV.CAP-B22, along with AAV9 and AAV-PHP.eB as controls, packaging a human FXN fused to an HA tag under the control of the ubiquitous CAG promoter, were pooled and intravenously injected into two adult marmosets at doses of 1.2 × 10^14 (MPV1) and 7 × 10^13 (MPV2) viral genomes per kg total. Six sections distributed throughout the brain of one marmoset (MPV2) showed robust expression after immunostaining for the HA tag (Cell Signaling Technologies, C29F). Scale bars, 2 mm. b, Magnified frames from a for a variety of cortical and sub-cortical regions. c, Liver section taken from one marmoset (MPV2) shows minimal expression after immunostaining for the HA tag (Cell Signaling Technologies, C29F). d, NGS quantification within brain tissue of the unique RNA barcode associated with each virus shows a substantial increase for several variants, including a more than 12-fold increase in RNA levels of AAV.CAP-B22 and a fivefold increase for AAV.CAP-B10 compared to AAV9. e, NGS quantification of pooled variant injections, showing relative targeting away from the liver for the different variants. AAV.CAP-B22 contributes similar RNA levels as AAV9, but AAV.CAP-B10 contributes more than fivefold less. n = 2 marmosets per group. Scale bars, 2 mm.

We intravenously injected individual adult marmosets with AAV9, AAV-PHP.eB, AAV.CAP-B10 or AAV.CAP-B22, each packaging an HA-tagged FXN under the control of the ubiquitous CAG promoter, at a dose of 7 × 10^13 viral genomes per kg (Extended Data Table 2). The brains were recovered, and sections were taken for sequencing and immunofluorescence between 34 d and 42 d after intravenous injection. To characterize the CNS transgene expression after delivery with the four variants, we stained sections for the presence of the HA tag. In stained sections, both AAV.CAP-B10 and AAV.CAP-B22 conditions displayed a marked increase in FXN expression efficiency within cortical regions in comparison to AAV9 and AAV-PHP.eB (Fig. 4a). Each of AAV-PHP.eB, AAV.CAP-B10 and AAV.CAP-B22 displayed reduction in liver FXN expression relative to AAV9 (Fig. 4c).

Fig. 4: Characterization of single-variant expression after delivery with each of AAV9, AAV-PHP.eB, AAV.CAP-B10 and AAV.CAP-B22 in marmosets.

Human FXN fused to an HA tag is packaged in each variant under control of the ubiquitous CAG promoter. Marmosets were injected at a dose of 7 × 10^13 viral genomes per kg (see also Extended Data Figs. 5 and 6). a, Cortical expression is compared for AAV9, AAV-PHP.eB, AAV.CAP-B10 and AAV.CAP-B22 by immunostaining for the HA tag (Roche, 3F10) on the FXN transgene in conjunction with NeuN (Abcam, 177487). A qualitative increase in transgene expression efficiency for AAV.CAP-B10 and AAV.CAP-B22 is observed in comparison to AAV9 and AAV-PHP.eB. Displayed sections are taken from a similar plane and cortical region. Scale bar, 200 µm. b, Similar regions across all layers of the marmoset cortex are quantified, where AAV.CAP-B10 shows ~fourfold increase in HA-positive neurons over AAV9 (P = 0.0299) and AAV-PHP.eB (P = 0.0295), respectively. Differences among other groups are not significant. n = 2 for AAV9, n = 3 for AAV-PHP.eB, n = 4 for AAV.CAP-B10 and n = 4 for AAV.CAP-B22, mean ± s.e. Significance is determined by two-tailed Welch’s t-test. c, Liver expression is compared for AAV9, AAV-PHP.eB, AAV.CAP-B10 and AAV.CAP-B22 by immunostaining for the HA tag (Roche, 3F10) on the FXN transgene in conjunction with DAPI. Scale bar, 200 µm. d, In the liver, we quantified the fraction of HA-positive cells out of the total nuclei stained, where AAV.CAP-B10 had expression in ~17-fold fewer cells than AAV9. No significant differences were observed among AAV-PHP.eB, AAV.CAP-B10 or AAV.CAP-B22. n = 1 for AAV9, n = 3 for AAV-PHP.eB, n = 2 for AAV.CAP-B10 and n = 3 for AAV.CAP-B22, mean ± s.e. e, Five sections distributed throughout the brain, spinal cord and DRG of the marmoset show robust expression after immunostaining for the HA tag (Roche, 3F10). Scale bar, 5 mm. f, Magnified frames from e display expression across cortical, sub-cortical, cerebellar, spinal column and DRG regions. Scale bar, 500 µm.

After delivery with AAV.CAP-B22, FXN expression was observed across broad cell types in the marmoset brain. Qualitatively, many more astrocytes (measured by co-staining with S100β) displayed FXN expression compared to delivery with AAV9 or AAV.CAP-B10 (Extended Data Fig. 5). The FXN expression levels of AAV.CAP-B22 had high variability among animals in both the brain and liver, and, therefore, we observed no significant difference in the neuronal FXN expression after delivery with AAV.CAP-B22 relative to AAV9 or AAV-PHP.eB.

The neuronal specificity of AAV.CAP-B10 in mice was recapitulated in marmoset brains, where HA-positive cells were highly correlated with NeuN staining across brain regions. In the marmoset cortex, AAV.CAP-B10 displays a statistically significant ~fourfold increase in HA-positive neurons over AAV9 and AAV-PHP.eB (Fig. 4b). Notably, this increased protein expression in cortical neurons after delivery with AAV.CAP-B10 occurs despite no significant difference between the bulk viral genome and transcript measurements of AAV.CAP-B10 and AAV9 or AAV-PHP.eB (Extended Data Fig. 6b), further highlighting tropism differences between AAV9 and AAV.CAP-B10. The decreased liver targeting of AAV.CAP-B10 in mice was also recapitulated in the marmoset, where there were ~17-fold fewer cells expressing the transgene compared to an AAV9 control (Fig. 4d). Viral genome and transcript measurements corroborate the protein expression data in the liver (Extended Data Fig. 6a). We observed broad and robust transgene expression after delivery with AAV.CAP-B10 across cortical, sub-cortical and cerebellar regions as well as in the spinal column and the dorsal root ganglia (DRG) (Fig. 4e,f).

Discussion

The power of directed evolution and AAV engineering to confer novel tropisms and tissue specificity has broadened potential research applications and enabled new therapeutic approaches in the CNS47,48,49,50,51,52. Our results show that introducing diversity at multiple locations on the capsid surface into native or previously engineered variants can provide useful features, such as increased transgene expression efficiency and tissue or neuronal specificity in WT mice and marmosets. This finding has broad implications for the field of AAV engineering, and the variants discovered using the M-CREATE method22,26,33 have the potential to provide targeted gene delivery in WT animals.

Novel variants from this library achieved overall similar levels of transgene expression throughout the mouse brain as the previously engineered parent AAV (AAV-PHP.eB) but with striking deviations in cell type specificity and targeting or, notably, targeting away from other organs. Across the two engineering efforts, iterative modifications made to adjacent loops first conferred a new phenotype—the ability to cross the blood–brain barrier efficiently—and then refined the phenotype away from other organs or toward specific cell types. Most notably from this dataset, AAV.CAP-B10 is targeted away from the entire periphery and exhibits specificity for neurons over other cell types in the CNS. Of the 82,710 variants that comprised our second round of selection, roughly 39,000 exhibited positive enrichment in the brain and negative enrichment in the liver. Of those, the variants tested in vivo in mice showed expression patterns that correlated with their NGS enrichments and rankings. Furthermore, the presence of additional sequences that were positively enriched in the brain and negatively enriched in peripheral organs indicates the richness of this dataset and engineering location, with many more potentially interesting variants to be identified and characterized. Together, these results indicate that future engineering efforts can also take a stepwise approach toward attaining specificity for certain targets, with each engineering round refining and enhancing novel tropisms identified in the previous round. Notably, this implies that AAV.CAP-B10 might be further refined in its specificity to neuronal sub-classes53,54 in the CNS or away from the DRG, which has been associated with toxicity in NHP studies55.

The therapeutic use of engineered AAVs for treating disease has increased exponentially in recent years, including the use of systemically administered AAVs for CNS gene therapies. The brain is a prime target for gene therapy of a wide array of diseases, such as Huntington’s disease, Parkinson’s disease and Friedreich’s ataxia56,57,58, and, by developing gene therapy vectors for systemic administration, we can enable the wide coverage and non-invasive delivery conducive to effective treatment of these diseases. To minimize side effects typically associated with systemic delivery, gene therapy vectors should be engineered for specificity toward the therapeutic target of interest in conjunction with decreased off-target expression. Basic research in NHP brains, compared to rodent research, has been hampered by few options for genetic access. Gene delivery vectors could bridge this gap and enable eventual use in gene therapy applications. However, the translation of engineered variants from mice to NHPs has been challenging. Two generations of blood–brain-barrier-crossing engineered AAV variants (AAV-PHP.B and AAV-PHP.eB) have failed34 to increase transgene expression in the brain after systemic injection in NHPs. Therefore, variants engineered in this study, including AAV.CAP-B10 and AAV.CAP-B22, represent notable progress toward evolving variants that are capable of efficiently, specifically and safely delivering gene therapies to the CNS in NHPs and, ultimately, in humans. Robust transgene expression in the CNS, decreased liver targeting and neuronal bias of AAV.CAP-B10 observed after two rounds of selection in mice were recapitulated in marmosets, indicating that M-CREATE in rodents can be effectively used as a screening platform before direct screening or validation in NHPs, thus greatly increasing the throughput of the engineering process. 
Together, these results constitute an important step forward toward achieving the goal of engineered AAV vectors that can be used to broadly deliver gene therapies to the CNS in humans.

Methods

Plasmids

The first-round viral DNA library was generated by amplification of a section of the AAV-PHP.eB capsid genome between AA450 and AA599 using NNK degenerate primers (Integrated DNA Technologies) to substitute AA452–AA458 with all possible variations. The resulting library inserts were then introduced into the rAAV-ΔCap-in-cis-Lox plasmid via Gibson assembly as previously described26. The resulting capsid DNA library, rAAV-Cap-in-cis-Lox, contained a diversity of ~1.28 billion variants at the AA level. The second-round viral DNA library was generated similarly to the first round, but, instead of NNK degenerate primers at the AA452–AA458 location, a synthesized oligo pool (Twist Bioscience) was used to generate only selected variants. This second-round DNA library contained a diversity of ~82,000 variants at the AA level.
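The ~1.28 billion figure follows from 20 possible amino acids at each of seven positions. The short sketch below checks that arithmetic and enumerates the NNK codon set (N = any base, K = G or T) against the standard genetic code. It is a worked example only; the variable names are illustrative, and the DNA-level count is the theoretical codon-sequence space, not a measured library size.

```python
from itertools import product

# Standard genetic code, with codons ordered by base order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# NNK: any base at positions 1 and 2, G or T at position 3.
nnk_codons = [a + b + c for a in BASES for b in BASES for c in "GT"]

encoded = {CODON_TABLE[codon] for codon in nnk_codons}
nnk_amino_acids = sorted(encoded - {"*"})
nnk_stop_codons = [codon for codon in nnk_codons if CODON_TABLE[codon] == "*"]

POSITIONS = 7  # seven substituted positions (AA452 to AA458)
aa_diversity = len(nnk_amino_acids) ** POSITIONS   # amino-acid-level variants
dna_diversity = len(nnk_codons) ** POSITIONS       # NNK codon-level sequences
```

NNK covers all 20 amino acids with 32 codons while retaining only one stop codon (TAG), which is why it is a common choice for randomized peptide libraries.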

The AAV2/9 REP-AAP-ΔCap plasmid transfected into HEK293T cells for library viral production was modified from the AAV2/9 REP-AAP plasmid previously used26 by deletion of the AAs between 450 and 592. This modification prevents production of a WT AAV9 capsid during viral library production after a plausible recombination event between this plasmid co-transfected with rAAV-ΔCap-in-cis-Lox containing the library inserts.

Three rAAV genomes were used in this study. The first, pAAV-CAG-mNeonGreen (Addgene, 99134), uses a single-stranded (ss) rAAV genome containing the fluorescent protein mNeonGreen under control of the ubiquitous CMV-β-actin-intron-α-globin hybrid promoter (CAG). The second, pAAV-CAG-NLS-GFP (Addgene, 104061), uses an ssAAV genome containing the fluorescent protein EGFP flanked by two nuclear localization sites, PKKKRKV, under control of the CAG promoter. The third, pAAV-CAG-FXN-HA, uses an ssAAV genome containing an HA-tagged FXN protein under control of the CAG promoter and harboring a unique 12-bp sequence in the 3′ untranslated region to differentiate different capsids packaging the same construct.

Viral production

rAAVs were generated according to established protocols59. In brief, HEK293T cells (American Type Culture Collection) were triple transfected using polyethylenimine; virus was collected after 120 h from both cell lysates and media and purified over iodixanol (OptiPrep, Sigma-Aldrich). The isolated variants investigated in vivo (AAV.CAP-B1, AAV.CAP-B2, AAV.CAP-B8, AAV.CAP-B10, AAV.CAP-B18 and AAV.CAP-B22) have similar production titer to AAV9, with normal titers around 1 ± 0.7 × 10^12 viral genomes per 15-cm dish.

A modified protocol was used for transfection and purification of viral libraries. First, to prevent mosaic capsid formation, only 10 ng of rAAV-Cap-in-cis-Lox library DNA was transfected (per 150-mm plate) to decrease the likelihood of multiple library DNAs entering the same cell. Second, virus was collected after 60 h, instead of 120 h, to limit secondary transduction of producer cells. Finally, instead of polyethylene glycol precipitation of the viral particles from the media, as performed in the standard protocol, media were concentrated more than 60-fold for loading onto iodixanol.

Animals

All rodent procedures were approved by the Institutional Animal Care and Use Committee of the California Institute of Technology. Transgenic animals, expressing Cre under the control of various cell-type-specific promoters, and C57Bl/6J WT mice (000664) were purchased from Jackson Laboratory. Transgenic mice included Syn1-Cre (3966), GFAP-Cre (012886), Tek-Cre (8863) and TH-Cre (008601). Mice were housed under standard conditions between 71 °F and 75 °F, in 30–70% humidity and on a light cycle of 13 h on and 11 h off. For round 1 and round 2 selections from the viral library, we used one male and one female mouse from each transgenic line (aged 8–12 weeks), as well as a single male C57Bl/6J mouse. For validation of individual viral variants, male C57Bl/6J mice aged 6–8 weeks were used. Intravenous administration of rAAV vectors was performed via injection into the retro-orbital sinus.

Marmoset (C. jacchus) procedures for MPV1 and MPV2 and for marmoset single variant 1–4 (MSV1–4) and MSV11–13 were approved by the Animal Care and Use Committee of the National Institute of Mental Health (NIMH). MPV1, MPV2, MSV1–4 and MSV11–13 were born and raised in NIMH colonies and housed in family groups under standard conditions of 27 °C and 50% humidity. They were fed ad libitum and received enrichment as part of the primate enrichment program for NHPs at the National Institutes of Health (NIH). For AAV infusions, animals were screened for endogenous neutralizing antibodies. None of the animals that were screened showed any detectable blocking reaction at 1:5 dilution of serum (Penn Vector Core, University of Pennsylvania). They were then housed individually for several days and acclimated to a new room before injections. Two animals were used for the pooled injection study, both males, aged 7.6 (MPV1) and 11.5 (MPV2) years (Extended Data Table 1). Five animals were injected with single variants for characterization, but only four were usable (Extended Data Table 2), as one animal (AAV.CAP-B22 injected, 6.9 years, female, 0.475 kg) was found dead (27 d after injection), and, at necropsy, the pathology report indicated chronic nephritis unrelated to the virus. The day before infusion, the animals’ food was removed. Animals were anesthetized with isoflurane in oxygen; the skin over the femoral vein was shaved and sanitized with an isopropanol scrub; and the virus (Extended Data Tables 1 and 2) was infused over several minutes. Anesthesia was withdrawn, and the animals were monitored until they became active, upon which they were returned to their cages. Activity and behavior were closely monitored over the next 3 d, with daily observations thereafter.

Marmoset procedures for MSV8–10 were approved by the Committee on Animal Care of the Massachusetts Institute of Technology (MIT), and all experiments were performed in accordance with the relevant guidelines and regulations. Marmosets were born and raised in an MIT facility accredited by the Association for Assessment and Accreditation of Laboratory Animal Care. Marmosets were housed in social groups under standard conditions of 23.3 ± 1.1 °C, 50% ± 20% humidity and a 12-h light/dark cycle. They were fed ad libitum with standard diet as well as fruits, vegetables and various protein sources. Periodic neutralizing antibody testing of animals in the facility did not reveal significant levels of neutralizing antibodies against AAV9. Each of the animals was injected with single variants for characterization. The day before infusion, the animals’ food was removed. Animals were sedated by alfaxalone; the skin over the cephalic vein was shaved and sanitized with an isopropanol scrub; and the virus (Extended Data Table 2) was infused through a 24-gauge catheter over several minutes. After the viral infusion was completed, animals were recovered on a warm water blanket (38 °C) until they regained normal motor functions. Then, animals were returned to their cages and monitored closely for normal behavior over the next 4 d, followed by daily observations thereafter.

Marmoset procedures for MSV5–7 were approved by the Institutional Animal Care and Use Committee of Shenzhen Institute of Advanced Technology (SIAT), Chinese Academy of Sciences. Marmosets were born and raised in SIAT colonies and housed in family groups under standard conditions of 22 ± 1 °C and 40–70% relative humidity. The marmoset breeding and housing facilities are accredited by the Association for Assessment and Accreditation of Laboratory Animal Care. Although animals were not screened for endogenous neutralizing antibodies, the animals were born and raised in the animal facility, and the housing environment for each animal was clean and isolated to prevent bacterial and viral infection. Therefore, the possibility of the animals carrying neutralizing antibodies against AAV is low. Before injection, marmosets were separated from family groups, housed two animals per room for several days and acclimated to a new room before injections. Each of the animals was injected with single variants for characterization, but one animal (AAV.CAP-B22 injected, 2 years, male, 0.364 kg) was found dead (29 d after injection); at necropsy, the pathology report indicated that the death was unrelated to the virus. The day before infusion, the animals’ food was removed. Animals were anesthetized with isoflurane in air; the skin over the saphenous vein was shaved and sanitized with an ethanol scrub; and the virus (Extended Data Table 2) was infused over several minutes. Anesthesia was withdrawn, and the animals were monitored until they became active, upon which they were returned to their cages. Activity and behavior were closely monitored over the next 3 d, followed by daily observations thereafter.

DNA/RNA recovery and sequencing

Round 1 and round 2 viral libraries were injected into C57Bl/6J and Cre transgenic animals (Syn1-Cre, GFAP-Cre, Tek-Cre and TH-Cre) at a dose of 8 × 10^10 viral genomes per animal, and rAAV genomes were recovered 2 weeks after injection, as described in the M-CREATE protocol33. To determine the number of variants included in round 2, 0.01 times the enrichment of the top variant in each tissue was set as a threshold, and variants above that threshold were included. Mice were euthanized, and most major organs were recovered, snap-frozen on dry ice and placed into long-term storage at −80 °C. Tissues collected included brain, spinal cord, DRG, liver, lungs, heart, stomach, intestines, kidneys, spleen, pancreas, testes, skeletal muscle and adipose tissue. Then, 100 mg of each tissue (~250 mg for brain hemispheres and <100 mg for DRG) was homogenized in TRIzol (Life Technologies, 15596) using a BeadBug (Benchmark Scientific, D1036), and viral DNA was isolated according to the manufacturer’s recommended protocol. Recovered viral DNA was treated with RNase, underwent restriction digestion with SmaI (found within the inverted terminal repeats) to improve later rAAV genome recovery by polymerase chain reaction (PCR) and purified with a Zymo DNA Clean and Concentrator kit (D4033). Viral genomes flipped by Cre recombinase in select transgenic lines (or pre-flipped in WT animals) were selectively recovered using the following primers: 5′-CTTCCAGTTCAGCTACGAGTTTGAGAAC-3′ and 5′-CAAGTAAAACCTCTACAAATGTGGTAAAATCG-3′, after 25 cycles of 98 °C for 10 s, 60 °C for 15 s and 72 °C for 20 s... 
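
The round 2 inclusion rule described above (a per-tissue threshold set at 0.01 times the enrichment of the top variant) can be sketched as follows. This Python example is a minimal illustration rather than the M-CREATE implementation: the function name, the data layout and the example peptide sequences are hypothetical, enrichment is assumed to be on a linear (non-log) scale, and pooling the selected variants across tissues as a union is an assumption that the text does not specify.

```python
def select_round2_variants(enrichment_by_tissue, fraction=0.01):
    """Select variants whose enrichment exceeds fraction x the top enrichment
    in at least one tissue.

    enrichment_by_tissue maps tissue name -> {variant: linear enrichment}.
    Taking the union across tissues is an assumption made for illustration.
    """
    selected = set()
    for tissue, scores in enrichment_by_tissue.items():
        if not scores:
            continue  # no recovered variants for this tissue
        threshold = fraction * max(scores.values())
        selected.update(v for v, score in scores.items() if score > threshold)
    return selected


# Hypothetical 7-mer variants and enrichment values, for illustration only.
example = {
    "brain": {"AAAAAAA": 100.0, "CCCCCCC": 2.0, "DDDDDDD": 0.5},
    "liver": {"DDDDDDD": 40.0, "EEEEEEE": 0.3},
}
round2 = select_round2_variants(example)
print(sorted(round2))
```

In this toy example, the brain threshold is 1.0 and the liver threshold is 0.4, so a variant that falls below the threshold in one tissue can still be carried forward if it passes in another.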

After Zymo DNA purification, samples from the WT C57Bl/6J animals were serially diluted from 1:10 to 1:10,000, and each dilution was further amplified around the library variable region. This amplification was done using the following primers: 5′-ACGCTCTTCCGATCTAATACTTGTACTATCTCTCTAGAACTATT-3′ and 5′-TGTGCTCTTCCGATCTCACACTGAATTTTAGCGTTTG-3′, after ten cycles of 98 °C for 10 s, 61 °C for 15 s and 72 °C for 20 s, to recover 73 bp of viral genome around and including the 21-bp variable region and add adapters for Illumina NGS. After PCR cleanup, these products were further amplified using NEBNext Dual Index Primers for Illumina sequencing (New England Biolabs, E7600), after ten cycles of 98 °C for 10 s, 60 °C for 15 s and 72 °C for 20 s. The amplification products were run on a 2% low-melting-point agarose gel (Thermo Fisher Scientific, 16520050) for better separation and recovery of the 210-bp band. The dilution series was analyzed for each WT tissue and the highest concentration that resulted in no product from WT tissue on the gel was chosen for the amplification of the viral DNA from the transgenic animal tissues. This process was performed to differentiate between viral genomes flipped before packaging or due to Cre in the animal. Pre-flipped viral genomes should be avoided to minimize false positives in the NGS results.

All Cre-flipped viral genomes from transgenic animal tissues were similarly amplified (using the dilutions that do not produce pre-flipped viral genomes) to add Illumina sequencing adapters and subsequently for index labeling. The amplified products now containing unique indices for each tissue from each animal were run on a low-melting-point agarose gel, and the correct bands were extracted and purified with a Zymoclean Gel DNA Recovery kit.

Packaged viral library DNA was isolated from the injected viral library by digestion of the viral capsid and purification of the contained ssDNA. These viral genomes were amplified by two PCR amplification steps, like the viral DNA extracted from tissue, to add Illumina adapters and then indices. Correct bands were extracted and purified after gel electrophoresis. This viral library DNA, along with the viral DNA extracted from tissue, was sent for deep sequencing using an Illumina HiSeq 2500 System (Millard and Muriel Jacobs Genetics and Genomics Laboratory, California Institute of Technology).

A pool of eight viruses (AAV9, AAV-PHP.eB, AAV.CAP-B1, AAV.CAP-B2, AAV.CAP-B8, AAV.CAP-B10, AAV.CAP-B18 and AAV.CAP-B22), each packaging CAG-FXN-HA with a unique 12-bp barcode, was injected into two adult marmosets (Extended Data Table 1). After 6 weeks, animals were euthanized, and brain and liver were recovered and snap-frozen. Then, 1-mm coronal sections from each tissue (4 mm for the brain) were homogenized in TRIzol (Life Technologies, 15596) using a BeadBug (Benchmark Scientific, D1036), and total RNA was recovered according to the manufacturer’s recommended protocol. Recovered RNA was treated with DNase, and cDNA was generated from the mRNA using SuperScript III (Thermo Fisher Scientific, 18080093) and oligo(dT) primers according to the manufacturer’s recommended protocol. Barcoded FXN transcripts were recovered from the resulting cDNA library, as well as the injected pool, using the following primers: 5′-TGGACCTAAGCGTTATGACTGGAC-3′ and 5′-GGAGCAACATAGTTAAGAATACCAGTCAATC-3′, after 25 cycles of 98 °C for 10 s, 63 °C for 15 s and 72 °C for 20 s, using Q5 DNA polymerase in five reactions using 50 ng of cDNA or viral DNA each as a template. After Zymo DNA purification, samples were diluted 1:100 and further amplified around the barcode region using the following primers: 5′-ACGCTCTTCCGATCTTGTTCCAGATTACGCTTGAG-3′ and 5′-TGTGCTCTTCCGATCTTGTAATCCAGAGGTTGATTATCG-3′, after ten cycles of 98 °C for 10 s, 55 °C for 15 s and 72 °C for 20 s. After PCR cleanup, these products were further amplified using NEBNext Dual Index Primers for Illumina sequencing (New England Biolabs, E7600), after ten cycles of 98 °C for 10 s, 60 °C for 15 s and 72 °C for 20 s. The amplification products were run on a 2% low-melting-point agarose gel (Thermo Fisher Scientific, 16520050) for better separation and recovery of the 210-bp band. All indexed samples were sent for deep sequencing as before.

Individual variants (AAV9, AAV-PHP.eB, AAV.CAP-B10 and AAV.CAP-B22) packaging CAG-FXN-HA with unique 12-bp barcodes were injected into 13 individual adult marmosets (Extended Data Table 2). After between 5 and 6 weeks, animals were euthanized, and brain and liver were recovered and snap-frozen. For DNA extraction from these tissues, 25-mg sections from the cortex and the liver were processed using a QIAamp DNA Mini Kit (Qiagen, 51304) to obtain purified viral and genomic DNA from the samples. For RNA transcript extraction, 100-mg sections from the cortex and the liver (taken from consistent sections of tissue across animals) were homogenized in TRIzol (Life Technologies, 15596) using a BeadBug (Benchmark Scientific, D1036), and total RNA was recovered according to the manufacturer’s recommended protocol. From purified RNA, cDNA was generated with SuperScript IV VILO MasterMix (Thermo Fisher Scientific, 11766050) and oligo(dT) primers according to the manufacturer’s recommended protocol (including the DNase step).

Viral genome and RNA transcript copy numbers were determined through qPCR as described59 using FXN-HA-specific primers: 5′-GACCTAAGCGTTATGACTGG-3′ and 5′-AATCTGGAACATCGTATGGG-3′. Within each sample, the viral genome or transcript copy number was normalized on a per-cell basis by quantifying GAPDH transcripts in each sample using the GAPDH-specific primers: 5′-TGTTCCAGTATGATTCCACC-3′ and 5′-GATGACCCTTTTGGCTCC-3′. DNA and RNA in tissues from animals MSV5, MSV6 and MSV7 were analyzed at the SIAT, whereas MPV1–2, MSV1–4 and MSV8–13 were analyzed at the California Institute of Technology.
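
The normalization described above amounts to expressing FXN-HA copy number relative to GAPDH copy number within the same sample. The Python sketch below illustrates one common way to do this and is not taken from the cited protocol: it assumes absolute quantification from a standard curve of the form Ct = slope × log10(copies) + intercept, and the function names and curve parameters are hypothetical.

```python
def copies_from_ct(ct, slope, intercept):
    """Invert a standard curve Ct = slope * log10(copies) + intercept.

    The standard-curve approach is an assumption; slope and intercept would
    come from a dilution series of a template of known copy number.
    """
    return 10 ** ((ct - intercept) / slope)


def copies_per_gapdh(target_ct, gapdh_ct, target_curve, gapdh_curve):
    """Return target (FXN-HA) copies per GAPDH copy for one sample.

    Each curve is a (slope, intercept) tuple for the corresponding primer pair.
    """
    target_copies = copies_from_ct(target_ct, *target_curve)
    gapdh_copies = copies_from_ct(gapdh_ct, *gapdh_curve)
    return target_copies / gapdh_copies


# Hypothetical curve parameters and Ct values, for illustration only.
curve = (-3.32, 38.0)
ratio = copies_per_gapdh(target_ct=28.04, gapdh_ct=24.72,
                         target_curve=curve, gapdh_curve=curve)
print(f"FXN-HA copies per GAPDH copy: {ratio:.3f}")
```

Normalizing within each sample in this way compensates for differences in input material and extraction yield between tissues and animals.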

NGS data alignment and processing

Raw FASTQ files from NGS runs were processed with M-CREATE data analysis code (available on GitHub at https://github.com/GradinaruLab/mCREATE), which aligns the data to an AAV9 template DNA fragment containing the 21-bp diversified region between AA452 and AA458, for the two rounds of AAV evolution/selection, or to an FXN-HA template containing the 12-bp unique barcode, for the marmoset virus pool. The pipeline to process these datasets involved filtering to remove low-quality reads, using a quality score for each sequence, and eliminating bias from PCR-induced mutations or high GC content. The filtered dataset was then aligned by a perfect string match algorithm and trimmed to improve the alignment quality. For the AAV engineering, read counts for each sequence were pulled out and displayed along with their enrichment score, defined as the relative abundance of the sequence found within the specific tissue over the relative abundance of that sequence within the injected viral library. For the pooled barcodes, read counts for each sequence were pulled out and normalized to the respective contribution of that barcode to the initial, injected pooled virus to account for small inequalities in the amount of each member of the pool that was injected into the marmosets.
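
The enrichment score defined above can be written as a short calculation. The following Python sketch is a simplified illustration and not the code from the M-CREATE repository; the function names and example read counts are hypothetical, and skipping sequences that are absent from the injected library is an assumption made to avoid division by zero.

```python
def relative_abundance(counts):
    """Convert raw read counts {sequence: reads} into fractions of all reads."""
    total = sum(counts.values())
    return {seq: n / total for seq, n in counts.items()}


def enrichment_scores(tissue_counts, library_counts):
    """Relative abundance in tissue divided by relative abundance in the
    injected library, for each sequence recovered from the tissue."""
    tissue_freq = relative_abundance(tissue_counts)
    library_freq = relative_abundance(library_counts)
    return {
        seq: tissue_freq[seq] / library_freq[seq]
        for seq in tissue_freq
        if seq in library_freq  # assumption: ignore reads absent from the library
    }


# Hypothetical read counts, for illustration only.
tissue = {"variant_A": 30, "variant_B": 10}
library = {"variant_A": 10, "variant_B": 30, "variant_C": 60}
scores = enrichment_scores(tissue, library)
for seq, score in sorted(scores.items()):
    print(f"{seq}: enrichment = {score:.2f}")
```

The same ratio applies to the pooled barcode data, with the injected pool taking the place of the viral library, so that a barcode slightly over-represented in the injected pool is not mistaken for a better-performing capsid.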

Tissue preparation, immunohistochemistry and immunofluorescence

Mice were euthanized with Euthasol and transcardially perfused with ice-cold 1× PBS and then freshly prepared, ice-cold 4% paraformaldehyde (PFA) in 1× PBS. All organs were excised and post-fixed in 4% PFA at 4 °C for 48 h and then sectioned at 50 µm with a vibratome. Immunofluorescence was performed on floating sections with primary and secondary antibodies in PBS containing 10% donkey serum and 0.1% Triton X-100. Primary antibodies used were rabbit anti-NeuN (1:200, Abcam, 177487), rabbit anti-S100 (1:200, Abcam, 868), rabbit anti-Olig2 (1:200, Abcam, 109186) and rabbit anti-Calbindin (1:200, Abcam, 25085). Primary antibody incubations were performed for 16–20 h at room temperature. The sections were then washed and incubated with secondary Alexa Fluor 647-conjugated anti-rabbit Fab fragment antibody (1:200, Jackson ImmunoResearch, 711-607-003) for 6–8 h at room temperature. For nuclear staining, floating sections were incubated in PBS containing 0.2% Triton X-100 and DAPI (1:1,000, Sigma-Aldrich, 10236276001) for 6–8 h and then washed. Stained sections were then mounted with ProLong Diamond Antifade Mountant (Thermo Fisher Scientific, P36970).

Marmosets were euthanized (Euthanasia, VetOne) and perfused with 1× PBS. One hemisphere of the brain (cut into coronal blocks) and the liver were flash-frozen in 2-methylbutane (Sigma-Aldrich, M32631) chilled with dry ice. The other hemisphere and organs were removed and post-fixed with 4% PFA at 4 °C for 48 h. These organs were then cryoprotected using 10% glycerol followed by 20% glycerol and flash-frozen in 2-methylbutane chilled with dry ice. The blocks of tissue were sectioned on an AO sliding microtome, except for spinal cord and DRG, which were cryosectioned. Then, 50-µm slices (20-µm for spinal cord and DRG) were collected in PBS, and immunohistochemistry or immunofluorescence was performed on floating sections.

To visualize cells expressing the HA-tagged FXN from the variant pool, slices were incubated overnight at room temperature with a rabbit anti-HA primary antibody (1:200, Cell Signaling Technologies, C29F4). After primary incubation, sections were washed in PBS and then incubated with a biotinylated goat anti-rabbit secondary antibody (1:200, Vector Laboratories, BA1000) for 1 h at room temperature. Sections were again washed in PBS and incubated for 2 h in ABC Elite (Vector Laboratories, PK6100) as outlined by the supplier. The ABC peroxidase complex was visualized using 3,3′-diaminobenzidine tetrahydrochloride hydrate (Sigma-Aldrich, D5637) for 5 min at room temperature. Sections were then mounted for visualization.

To visualize cells expressing the HA-tagged FXN for individual variant injections, immunofluorescence staining was performed on floating sections with primary and secondary antibodies in PBS containing 10% donkey serum and 0.1% Triton X-100. Primary antibodies used were rat anti-HA (1:200, Roche, 3F10), rabbit anti-NeuN (1:200, Abcam, 177487) and rabbit anti-S100 beta (1:200, Abcam, 52642). Primary antibody incubations were performed for 16–20 h at room temperature. The sections were then washed and incubated with secondary anti-rat Alexa Fluor 488 (1:200, Thermo Fisher Scientific, A-21208) and anti-rabbit Alexa Fluor 647 (1:200, Thermo Fisher Scientific, A32795). Stained sections were then washed with PBS and mounted with ProLong Diamond Antifade Mountant.

Imaging and quantification

All CAG-mNeonGreen-expressing tissues were imaged on a Zeiss LSM 880 confocal microscope using a Fluar ×5 0.25 M27 objective, with matched laser powers, gains and gamma across all samples of the same tissue. The acquired images were processed in Zen Black 2.3 SP1 (Zeiss).

All CAG-NLS-GFP-expressing tissues were imaged on a Keyence BZ-X all-in-one fluorescence microscope at 48-bit resolution with the following objectives: PlanApo-λ ×20/0.75 (1 mm working distance) or PlanApo-λ ×10/0.45 (4 mm working distance). For co-localization of GFP expression to antibody staining, in some cases the exposure time for the green (GFP) channel was adjusted to facilitate imaging of high- and low-expressing cells while avoiding oversaturation. In all cases in which fluorescence intensity was compared between samples, exposure settings and changes to gamma or contrast were maintained across images. To minimize bias, multiple fields of view per brain region and peripheral organ were acquired for each sample. For brain regions, the fields of view were matched between samples and chosen based on the antibody staining rather than GFP signal. For peripheral tissues, fields of view were chosen based on the DAPI or antibody staining to preclude observer bias.

Marmoset tissue sections transduced with the variant pool were examined and imaged on a Zeiss AxioImager Z1 with an Axiocam 506 color camera. Acquired images were processed in Zen Blue 2 (Zeiss).

Marmoset tissues transduced with individual variants were imaged on a Zeiss LSM 880 confocal microscope. Tissues from animals MSV5, MSV6 and MSV7 were imaged at the SIAT, whereas all other marmosets were imaged at the California Institute of Technology. Whole tissue sections were imaged using a Fluar ×5 0.25 M27 objective with a gallium arsenide phosphide photomultiplier tube detector at a pixel size of 1.25 µm × 1.25 µm. Images for cortex quantification were taken with an LD-LCI Plan-Apochromat ×10/0.45 M27 objective with a photomultiplier tube detector at a pixel size of 0.42 µm × 0.42 µm. Images for liver quantification were taken with an LD-LCI Plan-Apochromat ×10/0.45 M27 objective with a photomultiplier tube detector at a pixel size of 0.42 µm × 0.42 µm or an LD-LCI Plan-Apochromat ×20/0.8 M27 with a gallium arsenide phosphide photomultiplier tube detector at a pixel size of 0.156 µm × 0.156 µm (MSV5, MSV6 and MSV7). Imaging at each magnification was performed with matched laser powers, gains and gamma across all samples of the same tissue imaged in the same location, whereas images taken in different locations were matched by maximizing the range indicator. The acquired images were processed in Zen Black 2.3 SP1 (Zeiss). Quantification of transgene expression in marmoset cortex was performed through manual cell counting of co-localized HA+ cells with NeuN+ staining in maximum intensity projection images. Quantification of transgene expression in the marmoset liver was performed through manual cell counting of co-localized HA+ cells with DAPI staining in maximum intensity projection images.

All image processing was performed with the Keyence BZ-X Analyzer (version 1.4.0.1). Co-localization between the GFP signal and antibody or DAPI staining was performed using the Keyence BZ-X Analyzer with the hybrid cell count automated plugin. Automated counts were validated and routinely monitored by comparison with manual hand counts, and the differences were found to be below the margin of error for manual counts.

To compare total cell counts and fluorescence intensity throughout the brain between samples, an entire sagittal section located 1,200 µm from the midline was imaged using matched exposure conditions with the Keyence BZ-X automated XY stitching module. Stitched images were then deconstructed in the Keyence BZ-X Analyzer suite and run through the hybrid cell count automated plugin to count the total number of cells in the entire sagittal section. Average fluorescence intensity was calculated by creating a mask of all GFP-positive cells throughout the sagittal section and measuring the integrated pixel intensity of that mask. The total integrated pixel intensity was divided by the total cell count to obtain the fluorescence intensity per cell. In all cases where direct comparisons of fluorescence intensity were made, exposure settings and post-processing contrast adjustments were matched between samples.
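
The per-cell intensity calculation described above can be illustrated with a small example. The Python sketch below is not the Keyence BZ-X Analyzer workflow; it assumes that the image and the GFP-positive mask are available as equally sized two-dimensional arrays (here, nested lists) and that the cell count has already been obtained separately. The function names and pixel values are hypothetical.

```python
def integrated_intensity(image, mask):
    """Sum the pixel values of `image` wherever `mask` is True."""
    return sum(
        pixel
        for image_row, mask_row in zip(image, mask)
        for pixel, inside in zip(image_row, mask_row)
        if inside
    )


def intensity_per_cell(image, mask, cell_count):
    """Integrated intensity of the masked region divided by the cell count."""
    if cell_count <= 0:
        raise ValueError("cell_count must be positive")
    return integrated_intensity(image, mask) / cell_count


# Hypothetical 2 x 3 image with a mask covering two GFP-positive pixels.
image = [[10, 200, 0],
         [50, 180, 5]]
mask = [[False, True, False],
        [False, True, False]]
per_cell = intensity_per_cell(image, mask, cell_count=2)
print(f"Fluorescence intensity per cell: {per_cell}")
```

Because the result scales directly with raw pixel values, the comparison between samples is meaningful only when exposure and contrast settings are matched, as stated above.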

Statistics and reproducibility

For the initial characterization of brain and liver expression in mice (Fig. 1d), the experiment was repeated with n = 3. For the statistical analysis in mice and related graphs (Fig. 2 and Extended Data Figs. 2–4), a single data point was defined as two tissue sections per animal, with multiple technical replicates per section when possible. Technical replicates were defined as multiple fields of view per section, with the following numbers for each region or tissue of interest: cerebellum = 3, cortex = 4, hippocampus = 3, midbrain = 1, striatum = 3, thalamus = 4, liver = 4, spleen = 2, testis = 2, kidney = 2, lung = 2, spine = 1, DRG = 1 and whole sagittal = 1. Unless otherwise noted, all experimental groups were n = 6, determined using preliminary data and experimental power analysis. Normality was tested to ensure that the data matched the assumptions of the statistical tests used.

For the initial pooled experiments in marmosets (Fig. 3), the experiment was repeated in two separate animals (Extended Data Table 1). For the statistical analysis in marmoset and related graphs (Fig. 4 and Extended Data Figs. 5 and 6), a single data point was defined as a single tissue section from a single animal. The data were collected across four separate cohorts in three separate locations (Extended Data Table 2), and the trend was recapitulated in each. For representative brain images and quantification (Fig. 4a,b and Extended Data Fig. 6b), the experiment was repeated n times, where n = 2 (AAV9), n = 3 (AAV-PHP.eB), n = 4 (AAV.CAP-B10) and n = 4 (AAV.CAP-B22), except for astrocyte staining (Extended Data Fig. 5), which was repeated n = 2 (AAV9), n = 3 (AAV.CAP-B10) and n = 2 (AAV.CAP-B22). For representative liver images and quantification (Fig. 4c,d and Extended Data Fig. 6a), the experiment was repeated n times, where n = 1 (AAV9), n = 3 (AAV-PHP.eB), n = 2 (AAV.CAP-B10) and n = 3 (AAV.CAP-B22). Global brain analysis of AAV.CAP-B10 nervous system expression (Fig. 4e,f) was performed with n = 2. No statistical methods were used to predetermine sample sizes, but our sample sizes were similar to those reported in our previous publications34,54. Data distribution was assumed to be normal, but this was not formally tested.

No data were excluded from analysis. Allocation of organisms and samples to separate groups was random, and animals were allocated to experimental conditions based on availability in each cohort. The investigators were not blinded to allocation during experiments and outcome assessment.

Microsoft Excel for Microsoft 365 (version 2107) and GraphPad Prism 8 for Windows (version 8.4.3 (686)) were used for statistical analysis and data representation. For all statistical analyses, significance is represented as *P ≤ 0.05, **P ≤ 0.01, ***P ≤ 0.001 and ****P ≤ 0.0001; not significant, P > 0.05.
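
The significance notation can be expressed as a simple lookup. The Python sketch below is illustrative only and is not part of the analysis performed in Excel or Prism; the function name is hypothetical, and it assumes that a P value of exactly 0.05 receives a single asterisk.

```python
def significance_label(p_value):
    """Map a P value to the asterisk notation used in the figures."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must lie between 0 and 1")
    # Cutoffs are checked from most to least stringent.
    for cutoff, label in ((0.0001, "****"), (0.001, "***"),
                          (0.01, "**"), (0.05, "*")):
        if p_value <= cutoff:
            return label
    return "n.s."


for p in (0.00005, 0.0005, 0.005, 0.03, 0.2):
    print(f"P = {p}: {significance_label(p)}")
```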

Reporting Summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.