Main

A non-ribosomal peptide synthetase (NRPS) module incorporates a single residue into a peptide natural product. Each module contains a peptidyl carrier protein (PCP) that is post-translationally modified with a phosphopantetheine cofactor6, an adenylation domain that loads the amino-acid substrate onto the PCP cofactor, and a condensation domain that catalyses peptide bond formation. NRPSs then use a carboxy (C)-terminal thioesterase or reductase domain to catalyse product release. Structures of individual domains1 provide insight into the NRPS structural mechanism. Interestingly, the adenylation domains have been shown to adopt two catalytic conformations7. First the adenylate-forming conformation activates the amino-acid substrate using ATP to form an aminoacyl adenylate and pyrophosphate. A C-terminal subdomain then rotates by ~140° to form the thioester-forming conformation that is used to install the amino acid onto the PCP7. These two functional states have been observed in structures of the phenylalanine activating adenylation domain of gramicidin synthetase8 and the complexes between adenylation and carrier proteins obtained with mechanism-based inhibitors9,10. Once loaded, both the pantetheine and loaded substrate have been shown to interact transiently with the core of the carrier protein11,12. The structure of SrfA-C, the terminal module from surfactin biosynthesis, contains a condensation–adenylation-PCP–thioesterase architecture and is to date the only structure of an intact NRPS module13. The condensation and adenylation domains share an extensive interface and were proposed to form the core of the module13. Lacking the pantetheine modification, this apo-structure shows the PCP domain directed towards the condensation domain. The other active sites are 40–60 Å from the pantetheinylation site, indicating that extensive domain rearrangements are required to complete the NRPS catalytic cycle. 
Movement of the PCP domain, potentially coupled to the adenylation C-terminal subdomain rotation7, is necessary for delivery of the peptide intermediates to the different catalytic domains.

We determined structures of two NRPSs with the same architecture as SrfA-C (Extended Data Fig. 1), but with holo-proteins that show functional interactions between the PCP and catalytic domains (Fig. 1). First we present two structures of AB3403 from the human pathogen Acinetobacter baumannii (protein annotation ABBFA_003403 in strain AB307-0294) that belongs to an uncharacterized biosynthetic pathway implicated in motility14, and biofilm15 and pellicle16 formation. We describe the structures of holo-AB3403 obtained without ligands and also upon crystallization in the presence of MgATP and glycine, which among the proteinogenic amino acids serves as the best substrate (Extended Data Fig. 2). Second, we present the structure of EntF from Escherichia coli, showing the PCP cofactor covalently trapped with a mechanism-based inhibitor to model thioester formation within the adenylation domain. These results provide views of two distinct steps in the NRPS catalytic cycle and demonstrate how the domain rotation within the adenylation domain mediates the delivery of the PCP between the two catalytic domains.

Figure 1: Ribbon diagrams of complete NRPS modules.

a, Domain architecture of three structurally characterized termination modules. b–d, The protein structures of (b) AB3403, (c) EntF, and (d) SrfA-C are shown with domains coloured white (condensation), pink and red (adenylation domain N- and C-terminal subdomains), green-cyan (PCP), and blue (thioesterase). The phosphopantetheine moieties of AB3403 and EntF, and inhibitor Ser-AVS, are highlighted.


The structures of AB3403 were determined at 2.7 and 2.9 Å resolution (Extended Data Table 1). No prior structure exists of an NRPS condensation domain bound to a ligand; the holo-AB3403 protein shows the pantetheine cofactor residing in the active site (Fig. 2 and Extended Data Fig. 3a). The two lobes of the condensation domain adopt the closed orientation seen recently in the CDA synthetase condensation domain17. Contacts are made between the pantetheine and the helix running from Glu20 to Leu30, in particular Tyr26 and Ile27, which forms one wall of the tunnel through which the pantetheine approaches the active site (Fig. 2b). Additionally, Tyr37 forms a hydrogen bond with the amide of the cysteamine moiety of the pantetheine cofactor. As the main chain carbonyl of Tyr37 hydrogen bonds to the main chain amide of the catalytic His145, this is a critical interaction to close the two lobes and bring the active histidine into proper position.

Figure 2: NRPS domain structures.

a, The condensation domain of AB3403 (white) was aligned with SrfA-C (yellow) and EntF (orange) on the basis of the condensation C-terminal subdomain. The AB3403 PCP is included. b, The AB3403 condensation domain highlights residues that form the hydrophobic tunnel through which the pantetheine passes. c, Superposition of adenylation domains of AB3403 (pink and maroon for N- and C-terminal subdomains), SrfA-C (yellow) and gramicidin synthetase, GrsA (cyan), with phenylalanine and AMP molecules of GrsA. The dotted line highlights the alternative position of the catalytic lysines of AB3403 and SrfA-C. d, The EntF adenylation domain active site shows a covalent linkage from the pantetheine to the Ser-AVS inhibitor. e–g, Electron density calculated with coefficients of the form Fo − Fc, generated before inclusion of ligands and contoured at 3σ, is shown for the (e) AB3403 condensation, (f) AB3403 adenylation, and (g) EntF adenylation domains.


Holo-AB3403 therefore illustrates the conformation that is adopted to properly deliver the pantetheine of the PCP to the condensation domain. The PCP is rotated ~30° relative to the orientation of the PCP domain of SrfA-C (Extended Data Fig. 4). The AB3403 PCP interface with the condensation domain is composed of residues from helix α2, the helix that follows the pantetheinylation site at Ser1006, and the loops that precede and follow this helix. In particular, residues Phe999 to Tyr1032 face the condensation domain. Leu1007 and Val1010 form a hydrophobic interaction with Leu22 and Ile80 of the condensation domain. The side chain of Lys1011 forms a hydrogen bond with the main chain carbonyl of Gln78. Finally, Val1026, Ala1027, and Ala1030 on the PCP helix α3 form a hydrophobic interaction with Tyr26 and Leu30. Arg344 of the condensation domain, which is positioned on an insertion compared with SrfA-C, interacts with the phosphate from the cofactor.

The AB3403 adenylation domain (Fig. 2c) is precisely positioned in the adenylate-forming conformation, unlike the adenylation domain of SrfA-C, which is in an open conformation that may be used for substrate binding or release5. The lysine of the conserved catalytic A10 motif7,18 interacts with a phosphate oxygen from AMP and a carboxylate oxygen of glycine, and superimposes with the homologous lysine in the gramicidin synthetase domain. In SrfA-C, the homologous lysine is ~12 Å away.

The thioesterase domain of AB3403 is structurally similar to the homologous domains of both SrfA-C and EntF (Extended Data Fig. 5), the latter of which has been characterized by NMR and crystallography in complex with the upstream PCP domain19,20. Despite the similarities in domain structure, the thioesterase domain of AB3403 is in a markedly different location compared with SrfA-C (Fig. 3a). Interestingly, in this new position the thioesterase domain cradles the back face of the PCP domain. The thioesterase domains of SrfA-C or AB3403 do not make substantial contacts with the other catalytic domains.

Figure 3: Conformational dynamics in NRPS modules.

a, Alternative locations of the thioesterase domain in SrfA-C and AB3403. b, Representative electron microscopy class averages of EntF. The smaller thioesterase (TE) domain is observed in various positions relative to the condensation (C)–adenylation (A) di-domain. Overall EntF adopts a variety of extended (top) to compact (bottom) conformations. c, The interface between the condensation C-terminal subdomain and the adenylation domain is shown for SrfA-C, AB3403, and EntF. The adenylation surface is shown in white, highlighting in red the regions that interact with the condensation domain. The right panel shows this interface, rotated by 90° around the y axis, with the condensation domain omitted for clarity.


We next examined the delivery of the holo-PCP to the adenylation domain in a different NRPS protein. We have previously used targeted mechanism-based inhibitors, harbouring a vinylsulfonamide moiety that traps the thioester-forming reaction21 to characterize functional adenylation-PCP di-domain interactions9,10. These inhibitors mimic the native aminoacyl adenylate, but contain a Michael acceptor positioned to react with the pantetheine thiol. EntF crystallized only in the presence of the serine adenosine vinylsulfonamide (Ser-AVS) inhibitor (Fig. 2d and Extended Data Fig. 6) that limits conformational flexibility to promote crystallization. Crystals of the EntF protein diffract to 2.8 Å (Extended Data Table 2). No electron density was observed for the thioesterase domain although the intact protein was present in the crystal lattice (Extended Data Fig. 7).

The condensation domain of EntF is similar to the closed AB3403 conformation (Fig. 2a). The adenylation domain adopts the catalytic thioester-forming conformation of prior adenylation-PCP proteins9,10, demonstrating that the conformation is compatible with a full NRPS module. The active site of the EntF adenylation domain identifies conserved residues (Fig. 2d) that have been shown to play important catalytic roles in other members of this enzyme superfamily7. Arg863 interacts with the cofactor phosphate, while Gly864 and Gln865 form one wall of the pantetheine tunnel. Interactions with the nucleotide occur between Asp840 and the ribose hydroxyls, and between Tyr746 and Tyr852 and the adenine ring. The inhibitor serine binds in the binding pocket formed by Asp648, Ser722, and Asp754 (Fig. 2d).

The lack of density for the thioesterase domain in EntF suggested multiple conformations in the crystal lattice. This is not surprising given the limited interactions in SrfA-C and AB3403 between the thioesterase domains and the other domains. To assess thioesterase conformational mobility, we examined EntF by negative-stain electron microscopy followed by classification and averaging of single-particle projections (Extended Data Fig. 8). The class averages revealed primarily a tri-lobed density with two neighbouring globular densities of similar size attributed to the condensation and adenylation domains and a smaller lobe attributed to the thioesterase domain (Fig. 3b). The positioning of the thioesterase domain assumes a surprisingly wide range of distances and angles relative to the other domains.

The large interface of the SrfA-C condensation and adenylation domains13 suggested they constitute a catalytic platform, upon which the other domains move. We therefore compared the interfaces of the three NRPS modules (Fig. 3c). The interface in AB3403 is 1,023 Å2, comparable in size to the 1,097 Å2 interface of SrfA-C. In contrast, the interface in EntF is only 780 Å2, resulting from the rotation of the adenylation C-terminal subdomain to the thioester-forming conformation.

Additionally, the conformation of the interface is not conserved between all three proteins. Alignment of the structures on the basis of the amino (N)-terminal subdomains of the adenylation domain shows that the condensation domains of AB3403 and EntF differ slightly from each other and more significantly from SrfA-C. In AB3403 and EntF, the condensation domains are rotated by ~25° relative to the adenylation domains. Furthermore, the EntF condensation domain is shifted closer towards the adenylation domain. Structural comparisons suggest that this alternative conformation in EntF may not be compatible with the adenylate-forming conformation. The three different condensation–adenylation domain conformations, the adenylate-forming incompatibility seen in EntF, and the multiple extended and compact conformations seen in the electron microscopy data suggest that the condensation–adenylation domain platform may be more dynamic than previously proposed13.

The new structures confirm the hypothesis7 that the adenylation domain conformational change is a structural mechanism to guide the PCP between active sites in the context of complete NRPS modules. The rotation of the adenylation domain C-terminal subdomain from the adenylate-forming conformation in AB3403 to the thioester-forming conformation of EntF delivers the PCP into the adenylation domain for loading. The recent structure of loaded holo-PCP has shown the interaction of the substrate with the PCP core which may help to promote release of the substrate from the adenylation domain11. This interaction also alters the surface electrostatic potential of regions that interact with the neighbouring catalytic domains, including α2 and α3, and may influence the PCP delivery to neighbouring catalytic domains. Finally, this transfer is further assisted by the linker region that joins the adenylation C-terminal subdomain with the PCP domain, which includes important contacts that are preserved in the adenylate- and thioester-forming conformations22, as well as the open conformation of SrfA-C.

The basic NRPS catalytic cycle requires that the PCP visits three adjacent catalytic domains in a coordinated manner. The two catalytic conformations of the adenylation domain7 require that the full cycle has four catalytic structural states (Fig. 4). Specifically, (I) the adenylation domain catalyses amino-acid adenylation, (II) the PCP is delivered to the adenylation domain for thioester-formation to load the PCP, (III) the PCP is delivered to the condensation domain to receive the upstream peptide, and finally (IV) the peptide is delivered to a downstream condensation, thioesterase, or reductase domain for release.

Figure 4: Dynamics of the NRPS cycle.

The four-stage catalytic cycle of an NRPS module. The pantetheine cofactor is represented by the wavy line with a terminal thiol, SH. The aminoacyl adenylate intermediate is represented by AA-AMP. The thioester between the amino acid and the cofactor is shown as S-AA. Finally, the peptide bound to the upstream carrier protein (purple) is abbreviated Pep. Following the condensation reaction, the peptide is extended by one amino acid (Pep + 1) and presented to the thioesterase domain. The revised NRPS structural cycle is highlighted in yellow showing that only three structural states are required.


Our results show that states I and III are identical and only three distinct conformations are required to accommodate the four catalytic states of the NRPS cycle (Fig. 4, yellow). The protein first adopts an adenylate-forming conformation, seen in AB3403, state I, to catalyse amino-acid adenylation. Through the domain rotation of the adenylation C-terminal subdomain, the PCP is delivered to the adenylation domain to load the pantetheine cofactor, as seen in the crystal structure of EntF, state II. Return of the PCP to the condensation domain delivers the loaded PCP for receipt of the upstream peptide, state III. Critically, as seen in AB3403, the adenylation domain can activate a second amino acid to prime the system for another cycle. The ability to simultaneously catalyse peptide bond formation and amino-acid adenylation at two active sites significantly increases the overall catalytic efficiency and throughput of the NRPS module. Finally, although no structure exists of a full NRPS module with the PCP directed into the thioesterase or other downstream domain in state IV, the structure of AB3403 also offers a new view of the thioesterase domain and suggests the peptide-loaded PCP could be delivered to the downstream thioesterase domain through a simple rotation.
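The bookkeeping behind this three-conformation argument can be sketched in a few lines of Python. The mapping is an illustrative reading of the text rather than part of the original analysis: the conformation labels are hypothetical shorthand, and the state IV entry is a proposal for which no full-module structure exists.

```python
# Illustrative mapping of the four catalytic states to structural conformations.
# The labels are hypothetical shorthand; state IV is proposed, not observed.
ADENYLATE_FORMING = "adenylate-forming, PCP at condensation domain"
THIOESTER_FORMING = "thioester-forming, PCP at adenylation domain"
RELEASE = "PCP at thioesterase domain (proposed)"

CATALYTIC_STATES = {
    "I": ("amino-acid adenylation", ADENYLATE_FORMING),
    "II": ("thioester formation to load the PCP", THIOESTER_FORMING),
    "III": ("receipt of the upstream peptide", ADENYLATE_FORMING),
    "IV": ("delivery of the peptide for release", RELEASE),
}


def distinct_conformations(states):
    """Return the structural conformations in order of first appearance."""
    seen = []
    for _reaction, conformation in states.values():
        if conformation not in seen:
            seen.append(conformation)
    return seen


conformations = distinct_conformations(CATALYTIC_STATES)
print(len(conformations), "conformations for", len(CATALYTIC_STATES), "catalytic states")
```

Because states I and III share one entry, the four catalytic steps collapse onto three structural states.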

The modular architecture of NRPSs as well as their capacity to catalyse unusual chemistry23,24 offer the potential for generating novel products through engineering enzyme activity and the combination of heterologous domains. These efforts have been limited by deficiencies in our understanding of the functional interactions between domains and within active sites. The new views of two essential catalytic states in the NRPS cycle, an appreciation of the greater dynamics of NRPS systems, and the structures of holo-NRPS proteins with relevant ligands will provide the necessary insights to guide these engineering efforts. In addition, these studies complement the recent visualization of modular polyketide synthases by cryo-electron microscopy25 to set the stage for investigations of the structural foundation of even larger, multi-modular biosynthetic proteins.

Methods

No statistical methods were used to predetermine sample size. The experiments were not randomized. The investigators were not blinded to allocation during experiments and outcome assessment.

Expression, purification, and crystallization of AB3403

The human pathogen A. baumannii contains an uncharacterized NRPS cluster that has been implicated in motility and biofilm formation; the product of this operon is unknown. This operon contains eight genes. In strain AB307-0294 (ref. 26), from which the NRPS gene was cloned, this operon consists of genes ABBFA_003399 to ABBFA_003406. In the more commonly used ATCC17978 strain, the same genes are encoded by A1S_0119 to A1S_0112. The ABBFA_003403 (designated AB3403 herein) protein sequence is available at GenBank under accession number ACJ56070.1.

The gene encoding AB3403 was PCR-amplified from AB307-0294 genomic DNA26 (courtesy of T. A. Russo). The amplified fragment was cloned into the pET15b-TEV expression vector27 and confirmed by DNA sequencing. The vector provides a His5-tag, linker, and tobacco etch virus (TEV) protease recognition site that, upon treatment with TEV protease, yields a final recombinant product with glycine and histidine preceding the initial methionine residue.

The AB3403 pET15b-TEV construct was transformed into E. coli (BL21-DE3) cells. Transformed cells were grown in LB media to an absorbance at 600 nm (A600 nm) of 0.6 at 37 °C. Protein expression was induced by addition of 0.5 mM isopropyl-β-d-thiogalactoside (IPTG) and cells were incubated overnight at 16 °C. Cells were harvested by centrifugation, flash-frozen in liquid nitrogen, and stored at −80 °C. Selenomethionine-labelled protein was generated in M9 minimal media using a metabolic inhibition protocol28. All purification steps were identical to those used for the native protein.

For purification, cells were resuspended in a buffer containing 50 mM HEPES (pH 7.5), 250 mM NaCl, 10 mM imidazole, 0.2 mM TCEP. Cells were lysed by mechanical disruption (Branson Sonifier) and the resulting lysate was clarified by centrifugation at 235,000g for 45 min. The cell lysate was passed over a His-trap (GE Healthcare) immobilized metal ion affinity column and washed with lysis buffer containing 50 mM imidazole. Bound proteins were eluted with the same buffer containing 300 mM imidazole. The protein was incubated with TEV protease and dialysed against a TEV cleavage buffer (50 mM HEPES (pH 8.0), 250 mM NaCl, 0.2 mM TCEP, and 0.5 mM EDTA) for 16 h at 4 °C. This partly purified protein was then phosphopantetheinylated by incubation with His6-tagged non-specific phosphopantetheinyl transferase Sfp (10 nM), 12.5 mM MgCl2, and 1 mM CoA for 60 min at 20 °C. The clarified protein was then passed over the His-trap column a second time to remove uncleaved protein, the TEV protease, Sfp, and other contaminating proteins. The holo-AB3403 protein in the column flow-through was pooled, dialysed against a size exclusion buffer containing 50 mM HEPES (pH 7.5), 150 mM NaCl, 0.2 mM TCEP, and further purified by gel filtration (Superdex200). Protein concentration was assessed after dialysis against a crystallization buffer (25 mM HEPES (pH 7.5), 50 mM NaCl, 0.2 mM TCEP) using an extinction coefficient at 280 nm of 157,570 M−1 cm−1.
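The concentration measurement in the last step follows the Beer-Lambert law, A = ε·c·l. The sketch below shows the arithmetic using the extinction coefficient quoted above; the 1 cm path length, the function names, and the molar mass argument are assumptions made for illustration (the molar mass of AB3403 is not stated in the text and must be supplied by the user).

```python
# Beer-Lambert estimate of protein concentration from absorbance at 280 nm.
EPSILON_280 = 157_570.0  # M^-1 cm^-1, extinction coefficient quoted in the text


def molar_concentration(a280, epsilon=EPSILON_280, path_cm=1.0):
    """Return concentration in mol/l from A280 (path length assumed to be 1 cm)."""
    if epsilon <= 0 or path_cm <= 0:
        raise ValueError("epsilon and path length must be positive")
    return a280 / (epsilon * path_cm)


def mg_per_ml(a280, molar_mass_da, epsilon=EPSILON_280, path_cm=1.0):
    """Convert A280 to mg/ml; molar_mass_da is user-supplied (not given in the text)."""
    # mol/l multiplied by g/mol gives g/l, which is numerically equal to mg/ml.
    return molar_concentration(a280, epsilon, path_cm) * molar_mass_da


example_molar = molar_concentration(1.5757)
print(f"{example_molar:.2e} M")
```

An absorbance of 1.5757 in a 1 cm cell would correspond to about 10 μM protein under these assumptions.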

Crystallization conditions for holo-AB3403 were initially identified from a sparse matrix screen at 20 °C. Final crystals for native and SeMet-labelled holo-AB3403 were grown at 14 °C by hanging-drop vapour diffusion against 0.75–0.95 M potassium citrate, 0.01–0.025 M glycine, and 0.05 M bis-tris propane (BTP) (pH 8.0). Highest-quality native crystals were obtained using a protein concentration of 5.5 mg ml−1 with a protein to cocktail ratio of 1.5:1. SeMet protein was crystallized in the same manner with a protein concentration of 7.5 mg ml−1 and 1:1 protein to cocktail ratio. To obtain crystals in the presence of ligands, the protein was pre-incubated for 45 min at 4 °C with 2 mM MgCl2, and 1.5-fold molar excess of ATP and glycine.

Structure determination of AB3403

Crystals of holo-AB3403 were cryoprotected by stages using either ethylene glycol or potassium citrate for native and SeMet protein, respectively. The native protein crystals were cryo-protected with cocktails containing 1.0 M potassium citrate, 0.3 M glycine, 0.05 M BTP (pH 8.0), and increasing (8, 16, and 24%) v/v ethylene glycol. The SeMet-labelled protein was cryo-protected with cocktails containing 0.3 M glycine, 0.05 M BTP (pH 8.0) and increasing (1.0, 1.2, 1.4, and 1.6 M) potassium citrate. Crystals derived from protein co-crystallized with ligands included the same concentration of MgCl2, ATP, and glycine in the cryo-protectant cocktails.

Diffraction data were collected on APS beamline 23-IDB. The native data (2.7 Å) were collected using a multi-crystal, multi-data set strategy using two crystals. A complete low-resolution scan was taken for one crystal followed by a higher-resolution scan of the best diffracting crystal. A high-resolution region of the second crystal was combined with the two scans from the first crystal. The optimal regions were identified with the JBLU-ICE software at the GM/CA beamline. A single peak wavelength data set (3.35 Å) was collected for SeMet-labelled protein. The liganded protein data set was collected with a single crystal.

Diffraction data were indexed, merged, and scaled using iMOSFLM29 in space group P4x212. Structure determination was performed with PHENIX30 using a combination of experimental single-wavelength anomalous diffraction (SAD) phasing and phased molecular replacement. A partial molecular replacement solution was positioned through PHASER with a sculpted (PHENIX sculptor) model derived from PheA (PDB accession number 1AMU)8 and CytC1 (PDB 3VNR). Using this partial molecular replacement model, the selenium sites were identified with the SAD data from SeMet-labelled crystals. An initial model was produced with PHENIX Autobuild that contained ~65% of the protein molecule, spread across multiple symmetry related molecules. This model was combined into a single protein chain, built and refined iteratively against native data using ARP-WARP31, COOT32, and PHENIX refine.

The final refinements were performed with translation-libration-screw-rotation (TLS) parameterization33 with groups consisting of residues 1:191, 191:445, 446:480, 481:862, 863:959, 960:973, 974:1044, and 1054:1318, roughly defining the NRPS domain (or subdomain) boundaries. The protein is complete from residues Asn2 to Pro1319 with two small disordered loops in the adenylation domain at Asn500–Asp501 and Gly627–Gly630. The latter loop is part of the conserved serine/threonine- and glycine-rich P-loop that is involved in binding the triphosphate of the nucleotide7. Additionally, the condensation domain contains electron density for a diacylglycerol lipid molecule that co-purified with the protein and potentially derived from the bacterial membrane during cell disruption. Diffraction and refinement statistics are presented in Extended Data Table 1. Experimental electron densities of the ligands of both structures are presented in stereo format in Extended Data Fig. 3.

Purification of EntF

The enterobactin biosynthetic cluster of E. coli has been used as a model system in many studies. The full-length EntF, containing the condensation, adenylation, PCP, thioesterase domain architecture, loads serine onto the PCP domain. The condensation domain then recognizes the external carrier protein EntB that has been loaded with 2,3-dihydroxybenzoate (DHB) by the activity of the freestanding adenylation domain EntE. The DHB-serine amide is then transferred to the thioesterase domain while two additional cycles of synthesis complete the enterobactin trilactone.

The EntF protein used in this study (GenBank P11454) was described previously22,34. The entF gene was PCR amplified from E. coli JM109 and cloned into a pET15-TEV vector with an N-terminal 5× His-tag and a TEV protease cleavage site22. The entF vector was transformed into E. coli (BL21-DE3) cells for protein expression. Cells were grown in lysogeny broth (LB) media to A600 nm = 0.6 at 37 °C before protein induction with 1 mM IPTG. Cells were grown overnight at 16 °C and collected by centrifugation. The cell pellets were flash frozen in liquid nitrogen. Selenomethionine-labelled EntF was expressed in M9 minimal media as described28.

For purification both of native and of SeMet-labelled protein, cells were resuspended in lysis buffer containing 50 mM Tris-HCl pH 7.5, 400 mM NaCl, 0.2 mM TCEP, 10% glycerol, and 10 mM imidazole. Cells were lysed via sonication and centrifuged at 235,000g for 45 min. Initial purification was achieved with a His-trap immobilized metal ion affinity column. Protein was eluted using lysis buffer with 300 mM imidazole. EntF was incubated with TEV protease overnight at 4 °C in a cleavage buffer containing 50 mM Tris pH 7.5, 400 mM NaCl, 0.2 mM TCEP, 10% glycerol, and 0.5 mM EDTA. Although expressed in E. coli, phosphopantetheinylation was assured by the addition of 10 nM Sfp, 1 mM CoA, and 12.5 mM MgCl2. The reaction was incubated at room temperature (22 °C) for 1–2 h. The holo-EntF was run over an immobilized metal ion affinity column once more to remove uncleaved protein along with Sfp. A final polishing step was performed with a Superdex 200 16/600 column in a final dialysis buffer containing 50 mM EPPS pH 8.0, 150 mM NaCl, 0.2 mM TCEP, 1 mM MgCl2, and 10% glycerol. Before crystallization, the Ser-AVS inhibitor was added at a concentration four times that of EntF and allowed to incubate for 2–4 h at room temperature.

For electron microscopy, native EntF was purified as above with the exception that a minimal dialysis buffer was used, which contained 50 mM EPPS pH 8.0, 100 mM NaCl, and 0.2 mM TCEP. No inhibitor was added.

Crystal conditions for the Ser-AVS inhibited EntF were first identified using the Hauptman-Woodward high-throughput screen35. Large diffraction-quality native and SeMet crystals were grown using hanging drop vapour diffusion at 20 °C. A crystallization cocktail, consisting of 100 mM BTP pH 7.5, 125–150 mM MgCl2, and 22–28% PEG 4000, was diluted 1:1 with the final dialysis buffer. The hanging drops then combined protein at 30 mg ml−1 and the undiluted crystallization cocktail at a ratio of 1:2. This ‘batch mimic’ limited the differences between the drop and reservoir and has been successful with other protein samples in our laboratory36.
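The 'batch mimic' set-up can be made concrete with a short dilution calculation. This is a sketch under stated assumptions: volumes are treated as additive, the 1:2 ratio is read as protein to cocktail by volume, the 1:1 diluted cocktail is taken to be the reservoir, and the protein solution is assumed to contribute no PEG. The helper name `dilute` is hypothetical.

```python
# Compare PEG 4000 concentration in the reservoir and in the freshly mixed drop.
def dilute(concentration, parts_solution, parts_diluent):
    """Concentration after mixing parts_solution with parts_diluent (additive volumes)."""
    total = parts_solution + parts_diluent
    if total <= 0:
        raise ValueError("total volume must be positive")
    return concentration * parts_solution / total


PEG_RANGE_PERCENT = (22.0, 28.0)  # PEG 4000 range of the cocktail, from the text

# Reservoir: cocktail diluted 1:1 with the final dialysis buffer.
reservoir = [dilute(c, 1, 1) for c in PEG_RANGE_PERCENT]
# Drop: 1 part protein solution plus 2 parts undiluted cocktail (assumed reading).
drop = [dilute(c, 2, 1) for c in PEG_RANGE_PERCENT]

for label, values in (("reservoir", reservoir), ("drop", drop)):
    print(f"{label}: {values[0]:.1f}-{values[1]:.1f}% PEG 4000")
```

Under this reading the drop starts at two-thirds and the reservoir at one-half of the cocktail concentration, so little equilibration is needed compared with a conventional vapour diffusion experiment.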

Structure determination of EntF

Native EntF crystals were cryoprotected by the addition of 2,3-butanediol directly to the crystallization drop to a final concentration of ~10%. SeMet crystals were cryoprotected similarly except with glycerol to a final concentration of ~20%. Diffraction data were collected on APS beamline 23-IDB using the rastering option to find the optimal spots on both the native and the SeMet crystals. Diffraction data were indexed, merged, and scaled using iMOSFLM29 in space group P4x212. Structure determination for the SeMet inflection data was performed in PHENIX30 using a PhaserEP MR-SAD with a partial molecular replacement solution that was obtained using a sculpted model (generated with PHENIX sculptor) derived from the Pseudomonas aeruginosa bidomain adenylation-PCP protein PA1221 (PDB 4DG9)9. Automated model building with BUCCANEER was used to build ~65% of the structure37. This partial model from the SeMet data was used as a molecular replacement model for the native data, and the remaining portion of the protein was built by hand (excluding the thioesterase domain, which was unresolved and constitutes about 19% of the protein). This model was built and refined iteratively using COOT32 and PHENIX refine. TLS refinement33 was used in final stages with groups consisting of residues 5:186, 187:429, 430:444, 445:857, 858:964, 965:971, and 972:1045.

The final model showed density for the condensation, adenylation, and PCP domains of EntF; no density was observed for the thioesterase domain. Diffraction and refinement statistics are presented in Extended Data Table 2.

In general, the overall quality of the density was weaker for the N-terminal subdomain of the condensation domain, residues 1–186, probably reflecting the higher mobility of this region of the protein. The average B-factors for different regions of the protein (Extended Data Table 2) support this conclusion.

Negative-stain electron microscopy analysis of EntF

EntF, purified as described above, was prepared for electron microscopy using the conventional negative staining protocol38, and imaged at room temperature with a Tecnai T12 electron microscope operated at 120 kV using low-dose procedures. Images were recorded at a magnification of ×71,138 and a defocus value of ~1.5 μm on a Gatan US4000 CCD camera. All images were binned (2 pixels × 2 pixels) to obtain a pixel size of 4.16 Å on the specimen level. Particles were manually excised using e2boxer (part of the EMAN 2 software suite)39. Two-dimensional reference-free alignment and classification of particle projections was performed using ISAC40. A total of 17,431 projections of EntF were subjected to ISAC, producing 133 classes consistent over two-way matching and accounting for 5,344 particle projections (Extended Data Fig. 8B).
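The particle accounting above can be checked with simple arithmetic. The sketch below uses only the numbers quoted in the text; the variable names are illustrative, and the unbinned pixel size is inferred from the stated 2 × 2 binning rather than reported directly.

```python
# Particle accounting for the ISAC classification of EntF projections.
TOTAL_PROJECTIONS = 17_431      # projections subjected to ISAC
CLASSIFIED_PROJECTIONS = 5_344  # projections in classes stable over two-way matching
N_CLASSES = 133

fraction_classified = CLASSIFIED_PROJECTIONS / TOTAL_PROJECTIONS
mean_per_class = CLASSIFIED_PROJECTIONS / N_CLASSES

BINNED_PIXEL_A = 4.16  # angstroms per pixel after binning
BIN_FACTOR = 2
unbinned_pixel_a = BINNED_PIXEL_A / BIN_FACTOR  # inferred, not stated in the text

print(f"{fraction_classified:.1%} of projections retained")
print(f"{mean_per_class:.1f} projections per class on average")
print(f"{unbinned_pixel_a:.2f} angstroms per unbinned pixel")
```

Roughly a third of the picked projections end up in stable classes, with about 40 projections per class on average.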

Synthesis of serine adenosine vinylsulfonamide

Ser-AVS was synthesized using the protocol summarized in Extended Data Fig. 6. All reactions were performed under an inert atmosphere of dry Ar in oven-dried (150 °C) glassware. 1H and 13C NMR spectra were recorded on a Varian 600 MHz spectrometer. Proton chemical shifts are reported in parts per million from an internal standard of residual chloroform (7.26 p.p.m.) or methanol (3.31 p.p.m.), and carbon chemical shifts are reported using an internal standard of residual chloroform (77.3 p.p.m.) or methanol (49.1 p.p.m.). Proton chemical data are reported as follows: chemical shift, multiplicity (s, singlet; d, doublet; t, triplet; m, multiplet; br, broad), integration, coupling constant. High-resolution mass spectra were obtained on an Agilent TOF II time of flight/mass spectrometry (TOF/MS) instrument equipped with either an ESI or APCI interface. Thin-layer chromatography (TLC) analyses were performed on TLC silica gel 60F254 from EMD Chemical, and were visualized with ultraviolet light or 10% PMA solution. Purifications were performed by flash chromatography on silica gel (Dynamic Adsorbents, 60A).

Materials. Chemicals, reagents, and solvents were purchased from Sigma Aldrich, Chem-Impex, or Acros Organic Fischer, and were used as received. An anhydrous solvent-dispensing system (J. C. Meyer) using two packed columns of neutral alumina was used for drying tetrahydrofuran (THF), Et2O, while two packed columns of molecular sieves were used to dry DMF and the solvents were dispensed under argon. Compound 1 was purchased from Chem-Impex and used as received. Compounds 2 (ref. 41) and 4 (ref. 10) were synthesized according to the reported procedures.

tert-Butyl (R,E)-4-(2-(N-(tert-butoxycarbonyl)sulfamoyl)vinyl)-2,2-dimethyloxazolidine-3-carboxylate (3). To a solution of tert-butyl (2) (395 mg, 1.0 mmol, 2.0 equiv) in 1:3 DMF–THF (4 ml) at −78 °C was added a 1 M solution of LiHMDS in THF (2.0 ml, 4.0 equiv) dropwise over 15 min, and the solution was stirred at −78 °C for an additional 15 min. Next, Garner’s aldehyde (1) (115 mg, 0.5 mmol, 1.0 equiv) in THF (1 ml) was added to the reaction over 15 min. The solution was gradually warmed to 25 °C and stirred for 15 h. The solvent was removed in vacuo and the mixture was taken up in H2O (30 ml). The pH was adjusted to 3–4 with 1 N aqueous HCl, and the mixture was extracted with ethyl acetate (EtOAc) (3 × 20 ml). The combined organic layers were washed with H2O (30 ml) and saturated aqueous NaCl (30 ml), dried (MgSO4), and concentrated. Purification by flash chromatography (10% EtOAc–hexanes to 50% EtOAc–hexanes) afforded the title compound 3 as a colourless oil (150 mg, 74%): retardation factor (Rf) = 0.50 (50:50 EtOAc–hexanes); [α] +0.9 (c 0.02, CH2Cl2); 1H NMR (600 MHz, CD3OD) δ 1.45 (s, 3H), 1.48 (m, 9H), 1.51 (s, 9H), 1.60 (s, 3H), 3.83–3.85 (m, 1H), 4.15 (dd, J = 12.0, 6.0 Hz, 1H), 4.56–4.58 (m, 1H), 6.64 (d, J = 18 Hz, 1H), 6.77–6.81 (m, 1H); 13C NMR (150 MHz, CD3OD) δ 28.41, 28.47, 28.80, 28.81, 58.7, 68.3, 84.19, 84.22, 95.8, 130.6, 145.7, 152.2, 152.7; HRMS (ESI–) calculated for C17H29N2O7S [M − H]− 405.1701, found 405.1721 (error 4.9 p.p.m.).
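The stoichiometry quoted in this procedure can be checked with a few lines of arithmetic. The following Python sketch is illustrative only: the function names are hypothetical, and the neutral formula C17H30N2O7S for compound 3 is an assumption inferred from the [M − H] ion formula reported above, combined with standard average atomic masses.

```python
# Illustrative bookkeeping for the olefination step; all names are hypothetical.

ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999, "S": 32.06}

def mmol_from_solution(volume_ml: float, molarity: float) -> float:
    """Millimoles delivered by volume_ml of a solution (ml x mol/l = mmol)."""
    return volume_ml * molarity

def equivalents(reagent_mmol: float, limiting_mmol: float) -> float:
    """Molar equivalents of a reagent relative to the limiting reagent."""
    return reagent_mmol / limiting_mmol

def molar_mass(formula: dict) -> float:
    """Average molar mass (g/mol) from an element -> count mapping."""
    return sum(ATOMIC_MASS[el] * n for el, n in formula.items())

def percent_yield(isolated_mg: float, limiting_mmol: float, mw: float) -> float:
    """Isolated mass as a percentage of the theoretical mass (mmol x g/mol = mg)."""
    return 100.0 * isolated_mg / (limiting_mmol * mw)

aldehyde_mmol = 0.5                           # Garner's aldehyde (1), limiting reagent
lihmds_mmol = mmol_from_solution(2.0, 1.0)    # 2.0 ml of 1 M LiHMDS

# Assumed neutral formula of compound 3 (one H more than the [M - H] ion).
mw_3 = molar_mass({"C": 17, "H": 30, "N": 2, "O": 7, "S": 1})

print(equivalents(1.0, aldehyde_mmol))           # compound 2
print(equivalents(lihmds_mmol, aldehyde_mmol))   # LiHMDS
print(round(percent_yield(150.0, aldehyde_mmol, mw_3)))
```

Under these assumptions the script reproduces the 2.0 and 4.0 equivalents and the 74% yield stated in the procedure.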

Ser-AVS

To a solution of N6,N6-bis(tert-butoxycarbonyl)-2′,3′-O-isopropylideneadenosine (4) (73 mg, 0.14 mmol, 1.1 equiv), vinylsulfonamide (3) (52 mg, 0.13 mmol, 1.0 equiv) and PPh3 (56 mg, 0.21 mmol, 1.7 equiv) in THF (1 ml) at 0 °C was added a solution of DIAD (42 μl, 0.21 mmol, 1.7 equiv) in THF (1 ml) over 1 h using a syringe pump. The solution was gradually warmed to 23 °C and stirred overnight. The mixture was filtered over a short pad of silica gel, which was washed with 20% EtOAc–hexanes (100 ml). The filtrate was concentrated to afford crude 5 (Rf = 0.45, 50:50 EtOAc–hexanes), which was used in the next step without further purification. To crude 5 from the previous step was added 80% aqueous trifluoroacetic acid (1 ml) at 0 °C. The solution was stirred for 6 h at 0 °C then concentrated. Recrystallization from 1:20 MeOH–Et2O (5 ml) afforded the title compound (32 mg, 47%) as a colourless film: [α] −10.3 (c 0.600, MeOH); 1H NMR (600 MHz, CD3OD) δ 3.30–3.39 (m, 2H), 3.67–3.70 (m, 1H), 3.83 (dd, J = 11.6, 4.1 Hz, 1H), 4.05–4.08 (m, 1H), 4.22–4.25 (m, 1H), 4.34–4.35 (m, 1H), 4.77–4.81 (m, 1H), 5.94 (d, J = 6.2 Hz, 1H), 6.70 (dd, J = 15.4, 6.5 Hz, 1H), 6.77 (d, J = 15.4 Hz, 1H), 8.27 (s, 1H), 8.29 (s, 1H); 13C NMR (150 MHz, CD3OD) δ 45.8, 54.1, 62.3, 72.9, 74.8, 85.8, 91.7, 121.3, 134.8, 137.0, 143.2, 149.9, 151.3, 156.1; HRMS (ESI+) calculated for C14H22N7O6S [M + H]+ 416.1347, found 416.1339 (error 1.9 p.p.m.).
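The mass errors quoted with the HRMS data follow from the usual definition of mass accuracy, |found − calculated| / calculated × 10^6. A minimal Python sketch (the function name is hypothetical) that recomputes the values reported for compound 3 and for Ser-AVS:

```python
def ppm_error(calculated_mz: float, found_mz: float) -> float:
    """Absolute mass error in parts per million relative to the calculated m/z."""
    return abs(found_mz - calculated_mz) / calculated_mz * 1e6

# m/z values as reported in the text for compound 3 and Ser-AVS.
for label, calc, found in [
    ("compound 3, [M - H]", 405.1701, 405.1721),
    ("Ser-AVS, [M + H]", 416.1347, 416.1339),
]:
    print(f"{label}: {ppm_error(calc, found):.1f} p.p.m.")
```

Rounded to one decimal place, these give 4.9 and 1.9 p.p.m., in agreement with the reported errors.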

Kinetic analysis of AB3403

Substrate preference for the adenylation domain of holo-AB3403 was established by the pyrophosphate exchange assay42, in which radiolabelled PPi is incorporated into ATP in the reverse reaction. One micromolar holo-AB3403 was added to 2 mM ATP, 0.2 mM NaPPi, 50 mM HEPES (pH 7.5), 100 mM NaCl, 10 mM MgCl2, 0.15 μCi [32P]PPi, and 5 mM substrate. Reactions (100 μl) were incubated for 10 min at 37 °C, then quenched with 0.5 ml 1.2% charcoal, 0.1 M unlabelled PPi, and 0.35 M perchloric acid. The charcoal was pelleted by centrifugation, washed twice with 1 ml H2O, and resuspended in 0.5 ml H2O for scintillation counting.

To determine the apparent kinetic constants for ATP and glycine for the holo-AB3403 adenylation domain, the NADH consumption assay, monitored at 340 nm (refs 43, 44), was used with full-length AB3403. Hydroxylamine was used as a surrogate for the pantetheine in the second partial reaction to displace AMP for use in the coupled consumption assay45. Standard reactions contained 50 mM HEPES (pH 7.5), 15 mM MgCl2, 2 mM ATP, 3 mM phosphoenolpyruvate, 0.2 mM NADH, 5 U myokinase, 5 U pyruvate kinase, 6.5 U lactate dehydrogenase, and 150 mM buffered hydroxylamine. Apparent kinetic constants were determined by varying the concentration of either ATP or glycine with the other substrate in excess. Reactions were initiated by the addition of 0.001 mM enzyme. Calculations were done using PRISM software.
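As an illustration of how absorbance traces from such a coupled assay can be turned into apparent kinetic constants, the Python sketch below converts an A340 slope into a rate and fits the Michaelis–Menten equation. It is not the analysis performed in PRISM: it uses a Hanes–Woolf linearization as a simple stand-in for nonlinear regression, and it rests on stated assumptions, namely the standard NADH extinction coefficient (6,220 M−1 cm−1 at 340 nm), a 1 cm path length, and two NADH oxidized per AMP released (myokinase converts AMP + ATP into two ADP). All function names and the data points are hypothetical, with the data generated from the model itself.

```python
NADH_EXT_COEFF = 6220.0  # M^-1 cm^-1 at 340 nm (standard literature value)
NADH_PER_AMP = 2         # assumption: myokinase yields 2 ADP per AMP

def amp_rate_uM_per_min(dA340_per_min: float, path_cm: float = 1.0) -> float:
    """Convert an A340 slope (per min) into an AMP release rate (uM/min)."""
    nadh_rate_molar = abs(dA340_per_min) / (NADH_EXT_COEFF * path_cm)  # Beer-Lambert
    return nadh_rate_molar * 1e6 / NADH_PER_AMP

def michaelis_menten(s: float, km: float, vmax: float) -> float:
    """Initial rate predicted by the Michaelis-Menten equation."""
    return vmax * s / (km + s)

def fit_hanes_woolf(substrate, rates):
    """Estimate (Km, Vmax) from the linear form S/v = S/Vmax + Km/Vmax."""
    y = [s / v for s, v in zip(substrate, rates)]
    n = len(substrate)
    mean_x = sum(substrate) / n
    mean_y = sum(y) / n
    sxx = sum((x - mean_x) ** 2 for x in substrate)
    sxy = sum((x - mean_x) * (yi - mean_y) for x, yi in zip(substrate, y))
    slope = sxy / sxx                      # equals 1 / Vmax
    intercept = mean_y - slope * mean_x    # equals Km / Vmax
    vmax = 1.0 / slope
    return intercept * vmax, vmax

# Synthetic, noise-free data (hypothetical): Km = 0.5 mM, Vmax = 10 uM/min.
conc_mM = [0.1, 0.25, 0.5, 1.0, 2.0, 5.0]
rates = [michaelis_menten(s, 0.5, 10.0) for s in conc_mM]

km_app, vmax_app = fit_hanes_woolf(conc_mM, rates)
print(f"apparent Km = {km_app:.3f} mM, Vmax = {vmax_app:.3f} uM/min")
print(f"rate for a slope of -0.1244 A340/min: {amp_rate_uM_per_min(-0.1244):.2f} uM/min")
```

With real, noisy data a direct nonlinear fit is generally preferable, since the linearization distorts the error structure; the linear form is used here only to keep the sketch free of external dependencies.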