The best way to understand the specificity and infectivity of gene therapy vectors is through their atomic structure. However, for many vectors this basic blueprint is unavailable. Michael Chapman and his team at Florida State University have now determined the structure of a representative of one of the major groups of gene therapy vectors [1] (Figure 1).

Figure 1

Surface topology of AAV2 colored according to distance from the center (red closest, white furthest), viewed down the 3-fold axis (left) and 5-fold axis (right) of symmetry. The authors would like to thank Michael Chapman for kindly providing these images.

Adeno-associated virus serotype 2 (AAV2), the subject of the new study, is a small nonpathogenic parvovirus requiring a helper virus to complete its lytic life cycle. AAV2 and other AAV serotypes are promising candidates as vehicles for therapeutic gene transfer. Recombinant (r) AAV vectors have been used to transduce numerous tissue types in many animal species, resulting in long-term gene expression without severe immune consequences [2].

Several mutagenic approaches have previously been used to genetically modify AAV2 in order to alter its specificity and infectivity: random mutagenesis [3], alanine scanning mutagenesis [4] and site-directed insertions based on the previously determined [5] crystal structure of the distantly related canine parvovirus (CPV) [6,7]. Yet it is a good bet that data collected using these approaches and others [8,9] have only just scratched the surface compared with what is now possible using rational modifications based on the crystal structure of AAV2. This crystal structure is the Rosetta Stone that will allow the existing mutagenesis data to be deciphered and that should direct all future genetic modifications of AAV2 and the other serotypes.

One of the challenges that Chapman's group faced in determining the structure was obtaining sufficient virus free of helper adenovirus. This was no small task, since 2–5 mg of virus (equivalent to 1–2 × 10^15 particles) was required in each preparation for X-ray crystallography. Additionally, at the high concentration of material required for crystallography, the virus precipitated from solution, so a specific co-solvent was needed to increase solubility.

Once these problems of production and concentration were overcome, there remained a seemingly intractable problem of virus biology. The virion of AAV2, like that of all parvoviruses, is composed of more than one capsid protein, all translated from the same reading frame and therefore overlapping in sequence. The smallest of these proteins (Vp3) is the most abundant. As a result, some of the non-overlapping domains of the larger capsid proteins cannot be resolved using X-ray crystallography. However, genetic and infectivity data show that Vp3-only virus particles bind heparin and compete with the normal virus in binding assays, but are themselves non-infectious [10]. The lack of infectivity of the Vp3-only virus may result from a lack of phospholipase activity, previously mapped to the Vp1 non-overlapping domain (unresolved in this crystal structure) [11]. Thus genetic and crystallographic studies are both needed to obtain a full understanding of the structure and biology of this virus.

Although several parvovirus crystal structures have been determined, this structure is the first from the dependovirus genus. The parvovirus structures share a strikingly similar core β-barrel motif, which consists of anti-parallel β-sheets. Between the β-strands are looped-out domains, the longest of which lies between β-strands G and H: the GH-loop. This loop is approximately 230 amino acids in length in the AAV serotypes and many autonomous parvoviruses [12]. Despite this similarity in length, it is very diverse in amino acid sequence and, consequently, in its surface topology and its cell-surface receptor and antibody binding properties.

One of the new study's most interesting findings is the difference in topology between AAV2 and the other parvovirus crystal structures. In AAV2, three clusters of three peaks are centered about the three-fold axis of symmetry. Each peak is formed by the interaction of two adjacent subunits, yet the sequences that compose these structures all come from the GH-loop. In contrast, CPV has a large spike at the three-fold axis, which is also made up of two capsid subunits [5,12,13]. Interestingly, in the insect densovirus, which has 134 fewer amino acids in its GH-loop, there are few surface features near or at the three-fold axis of symmetry [14]. These examples illustrate that functional viruses can be assembled from very different GH-loops. Given the apparent tolerance of parvoviruses to GH-loop variability, we predict a flurry of structure/function studies based on genetic modifications of the GH-loop. In similar fashion, modifications of the HI loop in adenoviruses resulted in improved adenoviral vectors.

Extensive interactions between capsid subunits at the three-fold axis may play a crucial role during subunit assembly (Figure 1, left). In the homologous structures of CPV, glycine residues at the base of these loops allow for flexibility such that loops from adjacent capsid proteins can fold over themselves [13]. Yet in AAV2 the loop structure between amino acids 485 and 517 (Vp1 numbering) is sandwiched between the adjacent subunit's loop structures βGH2-3 and βGH12-13. Additionally, the authors suggest that interactions at the five-fold axis may drive assembly (Figure 1, right).

Regardless of how AAV2 is assembled, the important question of what influences receptor binding is now answered. Many AAV mutations, widely spaced between amino acids 509 and 591 (Vp1 numbering), have been isolated that interfere with binding and hence virus infectivity [3,4,6]. We now know how these mutations fit into a clear picture. The region between the peaks is lined with basic amino acids that are centered about the three-fold axis of symmetry [1]. This implies that three subunits are required for the interaction between AAV2 and the cell-surface receptor heparan sulfate.

The AAV2 crystal structure will have a big impact on the field of targeted gene therapy. It has already allowed informed interpretation of previously available genetic data and localization of important functions, including receptor and antibody binding, to the GH-loops. With this information, a combination of site-directed and insertional mutagenesis, serotype and class domain swaps, and shuffling within the GH-loop domains could be used to produce much-improved gene transfer vectors. Additionally, regions of opposing peaks lining the pocket of each three-fold cluster can be exchanged with antibody binding sites or immune cell recognition sequences. Determination of the atomic structure of AAV2 represents a major step in rationally developing this vector, as well as the other AAV serotypes, into the safest possible targeted gene therapy tool for long-term expression. The new crystal-clear view of AAV2 will launch several years of exciting research.