Accelerating the design of pili-enabled living materials using an integrative technological workflow

Huang, Yuanyuan; Wu, Yanfei; Hu, Han; Tong, Bangzhuo; Wang, Jie; Zhang, Siyu; Wang, Yanyi; Zhang, Jicong; Yin, Yue; Dai, Shengkun; Zhao, Wenjuan; An, Bolin; Pu, Jiahua; Wang, Yaomin; Peng, Chao; Li, Nan; Zhou, Jiahai; Tan, Yan; Zhong, Chao

doi:10.1038/s41589-023-01489-x

Article
Published: 27 November 2023


Yuanyuan Huang^1,2,3,
Yanfei Wu¹,
Han Hu⁴,
Bangzhuo Tong ORCID: orcid.org/0000-0002-5463-5699⁴,
Jie Wang^1,2,
Siyu Zhang⁵,
Yanyi Wang^1,2,
Jicong Zhang^1,2,
Yue Yin ORCID: orcid.org/0000-0002-7693-3910⁶,
Shengkun Dai¹,
Wenjuan Zhao ORCID: orcid.org/0000-0002-4353-9554¹,
Bolin An^1,2,
Jiahua Pu^1,2,
Yaomin Wang^1,2,
Chao Peng ORCID: orcid.org/0000-0002-6814-2676⁶,
Nan Li¹,
Jiahai Zhou ORCID: orcid.org/0009-0000-6573-3203^1,3,
Yan Tan ORCID: orcid.org/0000-0001-9656-8861⁴ &
…
Chao Zhong ORCID: orcid.org/0000-0002-6638-3652^1,2,3

Nature Chemical Biology volume 20, pages 201–210 (2024)

Abstract

Bacteria can be programmed to create engineered living materials (ELMs) with self-healing and evolvable functionalities. However, further development of ELMs is greatly hampered by the lack of engineerable nonpathogenic chassis and corresponding programmable endogenous biopolymers. Here, we describe a technological workflow for facilitating ELMs design by rationally integrating bioinformatics, structural biology and synthetic biology technologies. We first develop bioinformatics software, termed Bacteria Biopolymer Sniffer (BBSniffer), that allows fast mining of biopolymers and biopolymer-producing bacteria of interest. As a proof-of-principle study, using existing pathogenic pilus as input, we identify the covalently linked pili (CLP) biosynthetic gene cluster in the industrial workhorse Corynebacterium glutamicum. Genetic manipulation and structural characterization reveal the molecular mechanism of the CLP assembly, ultimately enabling a type of programmable pili for ELM design. Finally, engineering of the CLP-enabled living materials transforms cellulosic biomass into lycopene by coupling the extracellular and intracellular bioconversion ability.

Main

The emerging field of engineered living materials (ELMs) seeks to create engineered biomaterials with distinctive ‘living’ attributes such as autonomous growth, self-healing and environmental responsiveness that are only found in natural living materials^1,2. Recent advances in, and the integration of, synthetic biology and materials science tools have led to the development of a wide range of remarkable ELMs with applications in biosensors^3,4, bioremediation^5,6, biomedicine⁷, biomanufacturing^8,9, wearable devices¹⁰ and electronics¹¹. Depending on the source of their structural components, ELMs can be produced either by harnessing engineered cells to simultaneously make the material and incorporate new functionalities into it (known as self-organizing living materials or biological ELMs)¹ or by embedding living cells in an organic or inorganic matrix (referred to as hybrid living materials)¹². Self-organizing living materials aim to recapitulate the autonomous, adaptive and versatile properties of natural living materials, and represent opportunities to harness engineered biological systems for new capabilities¹. For example, by building on endogenous biopolymers from microorganism systems, such as intracellular membraneless organelles^13,14, extracellular amyloid fibers of bacterial biofilms^15,16, bacterial cellulose (BC) from Komagataeibacter rhaeticus³ and fungal mycelium¹⁷, a diverse range of self-organizing living materials has been created, with emerging functionalities ranging from acoustic properties⁸ to underwater adhesion¹⁸ and mechanical strengthening¹⁹.

Despite the advances in ELMs, further development and application of self-organizing living materials face severe challenges due to the lack of engineerable chassis and corresponding programmable endogenous biopolymers in microorganisms, particularly in nonpathogens. At present, only model microbial systems, such as Escherichia coli and Bacillus subtilis along with their extracellular amyloid fibers¹, and several nonmodel systems, including BC-producing K. rhaeticus²⁰, the surface-layer protein-containing Caulobacter crescentus²¹ and the dominant bacterial component of Pantoea agglomerans in native feedstocks of fungus¹⁷, have been successfully harnessed in ELM design.

Here, we develop an integrative technological workflow for ELMs by rationally combining tools from bioinformatics, structural biology and synthetic biology (Fig. 1). Building on those tools, our technological workflow facilitates the mining, understanding and harnessing of bacterial biopolymers of interest and appropriate biopolymer-producing bacteria, as structural building blocks and hosts, for living material design in a systematic and sequential manner (Fig. 1). Our workflow was initially motivated by the state-of-the-art methodologies applied to search for metabolic bioactive molecules in microorganisms, for which the combination of genome mining, structure elucidation and pathway engineering has greatly accelerated the discovery and engineering of bioactive chemical compounds in diverse microorganisms²². We set out to develop bioinformatics software, which we termed Bacteria Biopolymer Sniffer (BBSniffer), by harnessing diverse bioinformatics tools. From a large library of bacterial genome sequences, BBSniffer can assist in searching for a specific functional biopolymer of interest (including proteins, polysaccharides and other biopolymers), and correspondingly generate a screened list of biopolymer-producing bacteria, including functionally useful, experimentally tenable nonpathogens (Fig. 1a).

**Fig. 1: An integrative technological workflow for mining, understanding, designing and engineering structural building blocks for living materials by combining tools from bioinformatics, structural biology and synthetic biology.**

As a proof-of-principle study, using the well-studied Gram-positive pilus structure widely found in pathogenic bacteria as a reference²³, we uncovered the biosynthetic gene cluster (BGC) of the covalently linked pili (CLP) fiber in the BBSniffer-screened industrial workhorse Corynebacterium glutamicum. Through genetic manipulation, bioimaging and structural characterization, we identified the Spa2 protein as the major component of the CLP fiber structure and gained insights into the molecular mechanism of CLP assembly (Fig. 1b). Using structure-guided design, we ultimately developed a type of engineerable extracellular protein scaffold that can be genetically appended with diverse functional peptides or proteins at multiple sites of the Spa2 protein (Fig. 1c). Finally, we rationally engineered the CLP fibers as programmable living materials in the industrial C. glutamicum ATCC 14067 strain with synthetic biology tools, enabling efficient use of cellulose biomass for lycopene production by coupling extracellular enzymatic degradation capacity with intracellular bioconversion ability. Leveraging the synergy of bioinformatics, structural biology and synthetic biology, we demonstrate the rational design of pili-enabled living materials from scratch. Our systematic approach opens up new opportunities for designing self-organizing living materials with tailored functionalities.

Results

Mining biopolymer-producing bacteria through BBSniffer

Through billions of years, a large and diverse library of natural microbial biopolymers has evolved, such as extracellular polysaccharides, proteinaceous fibers and intracellular membraneless organelles²⁴. These biopolymers possess outstanding physicochemical and/or mechanical properties²⁴ and are ripe for harvesting as functional building blocks for biomaterials and living materials design. However, information about the producers, synthetic pathways and assembly mechanisms for most biopolymers, particularly those in nonpathogens, remains elusive, making it difficult to fully exploit them²⁴. Genome mining by various software tools such as antiSMASH²⁵, PRISM²⁶ and CLUSEAN²⁷ has been demonstrated as a powerful approach to detect and characterize BGCs of bioactive chemical compounds in microorganisms. However, these software programs cannot be used to mine BGCs of biopolymers in microorganisms because the rule-based screening algorithm in these tools is limited to only identifying microbial secondary metabolites.

To facilitate efficient mining of functional biopolymer-producing chassis, we set out to develop software, BBSniffer, by integrating several bioinformatics tools (Fig. 2a). The BBSniffer software, which couples the rule-based BGC identification algorithm of antiSMASH with an additional rule specified to define the BGC of interest, is specifically designed to detect BGCs of biopolymers in bacteria (for details, see the Methods section), and can automatically classify the strains based on the internal bacterial database into pathogens, industrial microorganisms and other nonpathogens. In the classification, a given strain, usually containing well-known information about the BGC and assembly mechanism of a specific biopolymer of interest, is used as a reference. Based on the defined reference, BBSniffer can further build a distance-based phylogenetic tree using JolyTree and generate a list of distance scores for all mined industrial microorganisms using the reference strain as a benchmark (Fig. 2a). Finally, BBSniffer uses the list of distance scores as a ranking reference to recommend candidate strains with uncovered BGCs for biopolymers of interest in genomes. Additionally, the software can output useful information for all the candidate strains such as anaerobes and/or aerobes, condition of growth and availability of genetic manipulation tools, therefore providing guidance for users in choosing strains for further engineering of living materials.
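The classify-then-rank step described above can be illustrated with a minimal Python sketch. This is not the BBSniffer implementation: the `Strain` record, the `rank_candidates` helper, the strain names and the distance values are all hypothetical, and the sketch simply assumes that a smaller distance to the reference strain means a better-ranked candidate.

```python
from dataclasses import dataclass


@dataclass
class Strain:
    name: str
    category: str    # "pathogen", "industrial" or "other_nonpathogen"
    distance: float  # hypothetical distance score relative to the reference strain


def rank_candidates(strains, top_n=5):
    """Keep industrial strains only and return those closest to the reference."""
    industrial = [s for s in strains if s.category == "industrial"]
    return sorted(industrial, key=lambda s: s.distance)[:top_n]


# Made-up strains and distances, purely for illustration.
strains = [
    Strain("strain_A", "industrial", 0.12),
    Strain("strain_B", "pathogen", 0.01),
    Strain("strain_C", "industrial", 0.05),
    Strain("strain_D", "other_nonpathogen", 0.08),
]

for s in rank_candidates(strains, top_n=2):
    print(s.name, s.distance)
```

In the actual software the distances come from a JolyTree phylogeny; here they are plain numbers so that the ranking logic can be read in isolation.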

**Fig. 2: Mining functional biopolymer-producing bacteria in nature through BBSniffer.**

To demonstrate the usefulness of BBSniffer, we initially applied this software to mine ideal producers of the sortase-assembled CLP, which has frequently been found to be associated with Gram-positive pathogenic bacteria²³. The CLP is well known for its robust structure with strong tensile strength²⁸, providing a promising building block for living materials design. To search for potential nonpathogenic industrial strains that might contain similar pili structures, we used the technical terms ‘pilin’ and ‘sortase’ as input for initial exploration (Supplementary Fig. 1). We found 2,665 probable bacterial genomes associated with CLP in the UniProt database and downloaded them from the National Center for Biotechnology Information (NCBI). This number was further reduced to 1,162 through a one-step screening process with modified antiSMASH v.6.0, which uses specific rules to define the identification of a CLP-BGC based on the presence of both pilin and sortase proteins in a region of specified length in a genome (Supplementary Fig. 1). After all CLP-containing bacteria were classified, 102 industrial microorganisms distributed in Bacillus, Bifidobacterium, Corynebacterium, Lacticaseibacillus and Lactococcus were identified (Fig. 2b and Supplementary Data 1). These industrial microorganisms, and the reference strain of the pathogenic Corynebacterium diphtheriae NCTC 13129 (ref. ²³), were used to build a distance-based phylogenetic tree using JolyTree (Fig. 2c) and to generate a list of distance scores (Fig. 2d and Supplementary Data 1). Finally, the top five scored strains, C. glutamicum BE, ATCC 14067, YI, ATCC 13869 and AJ1511 (those with genomes closest to that of the reference strain), along with other corresponding information about culture conditions and the availability of genetic manipulation tools (for example, the CRISPR–Cas system), were output as candidates for further CLP engineering (Fig. 2e).
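The screening rule described above, which requires both a pilin and a sortase within a genomic region of specified length, can be sketched as a simple co-localization check. The function below is a hypothetical illustration rather than the modified antiSMASH rule itself: the gene tuples, the `has_clp_bgc` name and the 20,000 bp default window are assumptions, since the text does not state the actual region length.

```python
def has_clp_bgc(genes, max_span=20_000):
    """Return True if any pilin and any sortase fit within max_span bp.

    genes: list of (kind, start, end) tuples on a single contig, where kind
    is an annotation label such as "pilin" or "sortase".
    max_span: hypothetical window size; the real rule uses a specified
    length that is not given in the text.
    """
    pilins = [g for g in genes if g[0] == "pilin"]
    sortases = [g for g in genes if g[0] == "sortase"]
    for _, p_start, p_end in pilins:
        for _, s_start, s_end in sortases:
            # Length of the region that covers both genes.
            span = max(p_end, s_end) - min(p_start, s_start)
            if span <= max_span:
                return True
    return False


# Toy annotations with made-up coordinates.
genes_near = [("pilin", 1_000, 2_500), ("sortase", 3_000, 3_900)]
genes_far = [("pilin", 1_000, 2_500), ("sortase", 500_000, 501_000)]

print(has_clp_bgc(genes_near))
print(has_clp_bgc(genes_far))
```

A genome that carries only one of the two gene types, or carries both far apart, is rejected, which is the kind of filter that would shrink a candidate list such as the 2,665 genomes mentioned above.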

To demonstrate the generalizability of BBSniffer, we next explored the bacterial producers of three typical biopolymers (BC, gas vesicle (GV) and bacterial microcompartment (BMC)). Depending on the input technical terms, BBSniffer recommended the industrial strains Zymomonas mobilis ATCC 29192, ATCC 10988, ATCC 31821 and several others, from 7,109 mined strains, for engineering BC (Extended Data Fig. 1a and Supplementary Data 2); the antibiotic producers Streptomyces venezuelae ATCC 14585, Streptomyces lydicus 103, Streptomyces lavendulae subsp. lavendulae Del-LP and several others, from 489 mined strains, for engineering GVs (Extended Data Fig. 1b and Supplementary Data 3); and the nonpathogenic E. coli M70, M24, M11957 and several others, from 4,241 mined strains, for engineering BMCs (Extended Data Fig. 1c and Supplementary Data 4). These recommendations follow the list of distance scores relative to the input reference strains Komagataeibacter xylinus E25, Halobacterium salinarum 91-R6 and Salmonella typhimurium LT2, respectively.

To validate the performance of our BBSniffer software for mining the aforementioned four different types of biopolymer, we considered two categories of search results (for details, see the Methods section). First, we checked the detection accuracy of the BBSniffer software in searching all the literature-reported and experimentally well-characterized strains (usually containing well-defined information about the specific biopolymers and corresponding BGCs) by calculating their coverage rate. Second, for search results that have not yet been identified or reported in the literature, we applied the existing knowledge and features of biopolymer BGCs as the major reference and then manually inspected the annotation results of those predicted biopolymer BGC strains. Across these two categories, the success rates of the BBSniffer software for detecting strains containing the BGC of CLP, BC, GV and BMC are 93.6, 80.5, 93.3 and 93.8%, respectively (Supplementary Data 1–4 and Supplementary Table 1). These results demonstrate that the BBSniffer software is applicable for mining various biopolymer-producing bacteria of specific interest.

Probing the CLP assembly in C. glutamicum

As a proof of concept, we next investigated the CLP assembly in the industrial workhorse C. glutamicum ATCC 14067 (referred to as ^CgCLP) by combining the approaches of genetic manipulation, morphological characterization, mass spectrometry analysis and X-ray crystallography (Supplementary Fig. 2). C. glutamicum is a ‘generally recognized as safe’ strain with well-established gene editing tools that is widely used for the industrial-scale production of valued products such as amino acids, diamines, terpenoids and other chemicals²⁹. In C. glutamicum, the CLP-BGC contains three predicted pilin-encoding genes, spa1 (NCBI locus tag: CEY17_01465), spa2 (NCBI locus tag: CEY17_01470) and spa3 (NCBI locus tag: CEY17_01485), as well as two sortase-coding genes, srtC1 (NCBI locus tag: CEY17_01475) and srtC2 (NCBI locus tag: CEY17_01480) (Fig. 3a), which is similar to the SpaH-type (a relatively less well-studied pili type) CLP gene cluster in the pathogenic C. diphtheriae³⁰. Confirming that the ^CgCLP-BGC is responsible for fiber formation, we observed no filamentous structures at the C. glutamicum cell surface on deletion of the CLP-BGC, while the filamentous structure phenotype was rescued on complementing the BGC of CLP (Fig. 3a).

**Fig. 3: Probing the composition and molecular assembly of the CLP in *C. glutamicum*.**

We next generated polyclonal antibodies against recombinant Spa1, Spa2 and Spa3 to determine the composition of ^CgCLP. Transmission electron microscopy (TEM) images of the ^CgCLP with immunogold labeling showed that the ^CgCLP fibers comprise two minor pilins, Spa1 and Spa3, and a major pilin, Spa2 (Supplementary Fig. 3). A whole-cell filtration enzyme-linked immunosorbent assay (ELISA)¹⁵, TEM and atomic force microscopy (AFM) imaging used to assess the specific roles of the three pilins in the ^CgCLP assembly showed that cells defective for Spa1 (Δspa1), Spa3 (Δspa3) or both (Δspa1Δspa3) could still produce fibers (Fig. 3a and Extended Data Fig. 2a). By contrast, cells lacking Spa2 (Δspa2) could not produce any fiber, and overexpression of Spa2 (Spa2) promoted the formation of abundant long fibers surrounding the cell surface (Fig. 3a and Extended Data Fig. 2a). TEM and AFM images also showed that fiber formation was completely blocked in cells lacking both SrtC1 and SrtC2 (ΔsrtC1ΔsrtC2) (Supplementary Fig. 4). Collectively, these findings verified that the major pilin Spa2 is an indispensable building block for the sortase-catalyzed ^CgCLP assembly and production, similar to the role of the most highly studied SpaA in pili assembly in the pathogenic C. diphtheriae³¹. Despite this similarity, the wide variation in the size and sequences of major pilin proteins from diverse Gram-positive pathogens³² makes it challenging to predict whether the structural principles characterized for the CLP of other hosts are also applicable to ^CgCLP.

Unlike the non-CLP produced in Gram-negative bacteria²³, the CLP monomer subunits are typically joined via intermolecular isopeptide bond catalyzed by sortase conferring enormous tensile strength²⁸. Furthermore, the CLP subunits contain auto-catalyzed intramolecular isopeptide bonds that are less susceptible to proteolytic cleavage and can dissipate mechanical energy²³ imparting the robustness of CLP. In addition, several pilin proteins in the CLP structure of different strains contain additional disulfide bonds that further enhance stability³². Having identified the Spa2 major pilin as the essential building block for ^CgCLP fiber production, we next explored whether any intermolecular isopeptide bond, disulfide bond or intramolecular isopeptide bond forms during the ^CgCLP assembly.

First, the purified ^CgCLP polymers were excised from Coomassie blue-stained SDS–PAGE gels (Supplementary Fig. 5) and then digested in-gel with trypsin and AspN endoproteinase. Liquid chromatography with tandem mass spectrometry was used to analyze the digestion products, and verify the presence of the intermolecular isopeptide bond, as indicated by the elimination of a water molecule and thus a slight decrease of molecular weight as a result of isopeptide bond formation. Specifically, the peptide peak with m/z 832.9²⁺ (Fig. 3b and Supplementary Table 2) suggested that the major pilin of Spa2 was cross-linked between K194 in the N terminus of Spa2_i and T477 in the C terminus of Spa2_i+1 (Lys194-Thr477).

Quadrupole time-of-flight mass spectrometry analysis of a recombinant variant of Spa2 (Spa2^cut) (Supplementary Fig. 6) secreted by C. glutamicum cells indicated a molecular weight of 46,504.6 Da (Supplementary Fig. 7), which is about 54.7 Da less than the expected value calculated from the secreted Spa2^cut amino acid sequence. This detected mass is consistent with the loss of three NH₃ units and two 2H units, which can be explained by the formation of three intramolecular isopeptide bonds (loss of one molecule of ammonia, 17 Da) and two disulfide bonds (loss of two hydrogen atoms, 2 Da) in Spa2.
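The mass bookkeeping in the paragraph above can be checked with a few lines of Python. This is only a worked-arithmetic sketch: it uses the nominal values quoted in the text (17 Da per isopeptide bond, 2 Da per disulfide bond), and the function and constant names are hypothetical.

```python
# Nominal mass losses quoted in the text.
NH3_LOSS_DA = 17.0  # one ammonia lost per intramolecular isopeptide bond
H2_LOSS_DA = 2.0    # two hydrogen atoms lost per disulfide bond

OBSERVED_MASS_DA = 46_504.6    # measured mass of secreted Spa2^cut
OBSERVED_DEFICIT_DA = 54.7     # reported shortfall versus the sequence-based value


def expected_mass_loss(n_isopeptide, n_disulfide):
    """Total mass lost on forming the given numbers of covalent bonds."""
    return n_isopeptide * NH3_LOSS_DA + n_disulfide * H2_LOSS_DA


predicted = expected_mass_loss(n_isopeptide=3, n_disulfide=2)
print(f"predicted loss: {predicted:.1f} Da")
print(f"reported deficit: {OBSERVED_DEFICIT_DA:.1f} Da")
print(f"difference: {abs(predicted - OBSERVED_DEFICIT_DA):.1f} Da")
```

With three isopeptide bonds and two disulfide bonds the nominal prediction is 3 × 17 + 2 × 2 = 55 Da, within about 0.3 Da of the reported deficit. Other bond counts fit less well: two isopeptide bonds and two disulfide bonds would give only 38 Da.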

To probe the structural features of the major pilin Spa2, we solved its X-ray crystal structure at 2.73 Å resolution (Protein Data Bank (PDB) ID 7WOI) (Extended Data Fig. 3a and Supplementary Table 3) by the molecular replacement method using PHASER with the coordinates predicted by AlphaFold Colab as a template (Supplementary Fig. 8a). Spa2 is arranged in three tandem Ig-like domains, comprising the N-domain (residues 36–197, pink), M-domain (residues 198–343, blue) and C-domain (residues 344–469, green), giving an elongated molecule of roughly 125 Å in length (Extended Data Fig. 3a). These three tandem Ig-like domains of Spa2 are similar to those of the major pilins SpaA (PDB 3HR6, root-mean-square deviation (r.m.s.d.) 6.5 Å over 270 alpha-carbon (C_α) atoms, Supplementary Fig. 8b) and SpaD (PDB 4HSS, r.m.s.d. 4.0 Å over 311 C_α atoms, Supplementary Fig. 8c) from the human pathogen C. diphtheriae^32,33. The Spa2 crystals adopt head-to-tail stacking such that the N-domain in Spa2_i abuts the C-domain in Spa2_i+1 (Extended Data Fig. 3a and Supplementary Fig. 9), which is consistent with the result that the Spa2 monomers are connected via the intermolecular isopeptide bond between K194 in the N terminus of Spa2_i and T477 in the C terminus of Spa2_i+1 (Fig. 3b). Together, these results indicate that the biological assembly of the ^CgCLP fiber occurs via the head-to-tail polymerization of Spa2 monomers.
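For readers unfamiliar with the r.m.s.d. values quoted above, the quantity is the square root of the mean squared distance between paired C_α atoms. The sketch below assumes the two structures have already been superposed and their atoms paired; real structure-comparison tools perform that alignment first. The `rmsd` function and the toy coordinates are hypothetical and are not taken from the deposited structures.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two paired lists of (x, y, z) points.

    Assumes the structures are already superposed and that point i in
    coords_a corresponds to point i in coords_b.
    """
    if not coords_a or len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Two toy "structures" with two atoms each (coordinates in Å, made up).
a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
b = [(0.0, 0.0, 2.0), (1.0, 2.0, 0.0)]
print(rmsd(a, b))
```

Each toy atom is displaced by 2 Å, so the mean squared distance is 4 Å² and the r.m.s.d. is 2 Å. Applied to the 270 or 311 paired C_α atoms mentioned above, the same formula yields the reported 6.5 Å and 4.0 Å figures only after the superposition step that this sketch omits.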

Furthermore, interpretation of electron-density maps clearly showed three common isopeptide bonds and two unique disulfide bonds in the structure of Spa2 (Extended Data Fig. 3b,c and Supplementary Fig. 10). Formation of multiple covalent bonds was also verified by liquid chromatography–tandem mass spectrometry analysis of the pepsin-digested Spa2^cut products (Extended Data Fig. 4). The isopeptide bonds linked Lys57 and Asn195 with catalytic Glu158 in the N-domain; Lys203 and Asn318 with catalytic Asp246 in the M-domain; and Lys355 and Asn466 with catalytic Glu435 in the C-domain (Extended Data Fig. 3b and Supplementary Fig. 10a). Notably, the presence of three intramolecular isopeptide bonds distributed across the three domains of the major pilin Spa2 in C. glutamicum is similar to the major pilin SpaD from the pathogenic C. diphtheriae³³, but differs from the major pilin SpaA from the pathogenic C. diphtheriae, which lacks an isopeptide bond in the N-terminal domain³². In addition, two disulfide bonds were formed, one in the N-domain between Cys97 and Cys128 and one in the C-domain between Cys380 and Cys432 (Extended Data Fig. 3c and Supplementary Fig. 10b). Notably, the presence of two disulfide bonds in Spa2 is unusual in comparison with other major pilins in human pathogens, such as Spy0128 (PDB 3B2M) from Streptococcus pyogenes³⁴ and BcpA (PDB 3KPT) from Bacillus cereus³⁵, which lack a disulfide bond, and SpaA and SpaD from C. diphtheriae, which contain only one disulfide bond in the C-terminal domain^32,33.

To explore the intermolecular polymerization between Spa2 monomers, we next conducted functional assays with various Spa2 mutant variants to test their roles in ^CgCLP formation in vivo. Indeed, the K194A and LPLTG_474-478→LALAA variants blocked ^CgCLP production, confirming that both Lys194 in the N-domain and LPLTG_474-478 in the C-domain participate in Spa2 monomer polymerization (Extended Data Fig. 3d,e). To further test how the intramolecular isopeptide bonds and disulfide bonds in the Spa2 monomer contribute to the formation and stabilization of ^CgCLP, a series of Spa2 variants were generated. Alanine substitutions of Glu158, Asp246 and Glu435 (E158A, D246A, E435A) that abolished one or two intramolecular isopeptide bonds (Supplementary Fig. 11a–c) had no notable impact on ^CgCLP production (Extended Data Fig. 3d,e). Only the double mutation variant D246A/E435A abolished all three intramolecular isopeptide bonds in Spa2 (Supplementary Fig. 11d), and it produced only 44.9% of the ^CgCLP produced by Spa2 cells (Extended Data Fig. 3e). Abrogation of the disulfide bonds in the N- and C-domains of Spa2 with the C97A and C380A variants, respectively, dramatically reduced the extent of ^CgCLP formation (Extended Data Fig. 3d,e). The C97A/C380A double mutant variant completely blocked ^CgCLP formation (Extended Data Fig. 3d,e). Taken together, these results indicate that both isopeptide and disulfide bonds contribute to the formation of CLP in C. glutamicum, with the disulfide bonds appearing to be the most important element for its stabilization.

Programming ^CgCLP as an extracellular protein scaffold

As an extracellular matrix, the CLP fibers can be conveniently and reliably positioned directly outside cells (Fig. 4a). Because these extracellular fibers not only possess extraordinarily high tensile strength owing to their extensive inter- and intramolecular isopeptide bonds²⁸, but also contain unique amino acids such as cysteine residues, the CLP structure may serve as an attractive building block for various applications. For example, our preliminary study revealed that the Spa2 protein of the ^CgCLP, containing four cysteine residues, could automatically promote local mineralization of CdS on the fibers (Extended Data Fig. 5), potentially resulting in photocatalytic applications similar to previous work based on photocatalyst-mineralized living biofilms³⁶.

**Fig. 4: In situ functionalization of CLP as a programmable extracellular protein scaffold.**

In addition, the proteinaceous nature of the CLP fibers makes them potentially amenable to elaboration by genetic engineering. To determine suitable fusion sites for appending peptides and/or proteins to Spa2, guided by both the Spa2 crystal structure and our characterization of specific functional domains within Spa2, we selected four different positions to test the fusion of a protein of interest (POI), with one site at the N terminus of Spa2 and three sites in the M-domain, which lacks a disulfide bond (Fig. 4a). The Δspa2 strain, in which extracellular ^CgCLP formation is abrogated, was used to harbor the exogenous plasmid for Spa2 fusion protein expression and to test whether ^CgCLP fiber production was restored (Fig. 4a).

Using the fluorescent reporter protein mCherry, we set out to identify which of the interrogated positions generate functional fusion proteins while retaining the sortase-catalyzed CLP formation capacity of Spa2. We explored four sites for mCherry insertion: Q35 (E1) in the N terminus of Spa2 (after the peptidase cleavage site, Supplementary Fig. 6c), G215 in loop 1 of the M-domain (E2), G236 in loop 2 of the M-domain (E3) and G336 in the β23-sheet of the M-domain (E4) (Fig. 4a). Fluorescence intensity and quantitative ELISA both showed that the cells expressing the fusion protein at each site fluoresced and produced corresponding fibers at varied levels (Fig. 4b). Confocal microscopy showed that mCherry fluorescence was detected for all engineered variants, with fluorescence evident at extracellular sites on the C. glutamicum cells (Fig. 4c), consistent with TEM imaging results showing that the mCherry-functionalized ^CgCLP proteins formed extracellular fibers surrounding the cells (Supplementary Fig. 12). Combining the results of fluorescence intensity, ELISA quantification analysis, confocal microscopy and TEM imaging, we concluded that E1 and E2 are the most suitable sites for fusion of a functional POI, yielding abundant functionalized ^CgCLP fibers.

To explore how the functionalized peptides and/or proteins of varied sequence length affect the secretion and assembly of ^CgCLP, we next assessed the expression of a variety of Spa2 fusion proteins (six POIs, each fused at the E1 position) (Fig. 4d), including one with a 6-His tag; one containing a SpyCatcher protein and one with its partner, SpyTag³⁷; one with Mfp3S peptide that can promote interfacial adhesion¹⁹; one with Venus fluorescence reporter protein³⁸ and one with catalytic protein of endo-1,4-β-glucanase from Clostridium cellulolyticum (CcEgl)³⁹. All of these fusion proteins were successfully expressed, secreted and formed ^CgCLP (Fig. 4e and Extended Data Fig. 2b). Appropriate assays, such as imaging and enzyme activity assays, also confirmed that each of the POIs was functional even after CLP fiber formation (Extended Data Fig. 6).

To further quantify how fused polypeptides and/or proteins of varied sequence length and different features affect the amount of CLP biopolymer produced by the strains, we applied a whole-cell filtration ELISA¹⁵. The analysis showed that the 6His-Spa2, SpyTag-Spa2, Mfp3Spep-Spa2 and SpyCatcher-Spa2 strains produced 1.53-, 1.35-, 1.45- and 1.27-fold as many CLP fibers as the wild type, respectively (Extended Data Fig. 2b), whereas the Venus-Spa2 and CcEgl-Spa2 strains produced only 22.2 and 34.7% of the wild-type amount of CLP fibers, respectively (Extended Data Fig. 2b). These results indicate that polypeptides with longer sequences or certain secondary structures may affect the conformation of Spa2 and decrease the efficiency of sortase-catalyzed polymerization. However, the sortase-mediated polymerization is not completely abolished by fusion of POIs to Spa2 monomers, indicating that proteins of various types and sizes can be engineered into the generally programmable extracellular protein scaffold of ^CgCLP.

To assess whether our programmable ^CgCLP extracellular protein scaffold can support the coassembly of multiple heterologous proteins, we conducted experiments in the Δspa2 strain with the well-established split-Venus system⁴⁰ (Fig. 4f). Coassembly of two distinct proteins did not disturb ^CgCLP assembly, as indicated by TEM images (Supplementary Fig. 13) and the results of ELISA quantification analysis (Extended Data Fig. 2c). The highest fluorescence intensity was observed in cells where the split-Venus components were simultaneously fused with Spa2 (Fig. 4g,h). Almost no fluorescence was detected when N-Ven and C-Ven were simultaneously secreted without anchoring to the ^CgCLP scaffold (Fig. 4g,h). These results indicated that the split components can be coassembled in the extracellular ^CgCLP scaffold. This programmable CLP in C. glutamicum thus opens up a new possibility for the design of pili-enabled living materials.

Pili-enabled ELMs for biomass-to-chemical conversion

The successful coassembly of two types of fusion protein within Spa2, shown with the split-Venus system, suggested the potential for metabolic channeling applications. Cellulosic biomass is an abundant source of fixed, renewable carbon that represents a promising alternative to fossil petroleum as a feedstock for producing a wide range of chemicals. C. glutamicum cells do not have an endogenous capacity to degrade cellulose chains into sugar monomers⁴¹. We therefore sought to coassemble multiple cellulases into a catalytic cascade for extracellular degradation of cellulose into glucose to support production of specific chemicals of interest (for example, lycopene) in C. glutamicum (Fig. 5a). We tested this idea by coassembling the endo-1,4-β-glucanase from Trichoderma reesei (TrEgl)³⁹ and the β-glucosidase from Saccharophagus degradans (SdBgl)⁴² in the ^CgCLP fiber, as these two enzymes are known to work together to degrade cellulose into glucose via enzyme cascade reactions (Fig. 5a).

**Fig. 5: ELMs based on the programmable *C. glutamicum* pilus structure for lycopene production from biowastes.**

We first constructed a C001 chassis (Δspa2Δdec) with deletion of both the spa2 gene (Δspa2, for the abrogation of ^CgCLP formation) and a 43,702 basepair (bp) region between CEY17_RS03380 and CEY17_RS03560 (Δdec, for accumulation of the precursor)⁴³ (Extended Data Fig. 7a). To construct the basal lycopene-producing strain C002, we introduced a plasmid, P1, for isopropyl-β-d-thiogalactoside (IPTG)-inducible expression of the dxs gene and crtEBI gene cluster (Extended Data Fig. 7b,c). We then constructed the C003 strain for cellulose degradation by transforming the C002 strain with a plasmid, P2, for simultaneous expression of the TrEgl-spa2 and SdBgl-spa2 genes (Extended Data Fig. 7b,c). The C003 strain coassembled TrEgl and SdBgl in the ^CgCLP fiber on the extracellular scaffold (Extended Data Figs. 2d and 7d) and enabled the degradation of carboxymethylcellulose sodium (CMC-Na, an ether derivative of cellulose) in the medium, as indicated by the medium turning from a viscous gel to a thin solution (Fig. 5b). By contrast, the C004 strain, which secreted both TrEgl and SdBgl simultaneously without anchoring them to the ^CgCLP scaffold, did not show similar behavior (Fig. 5b and Extended Data Figs. 2d and 7d).

By using the CLP-based extracellular protein scaffold of the C003 strain for multi-enzyme colocalization (Fig. 5c), the ELMs drastically improved the degradation efficiency and produced a four-fold higher yield of glucose than the simultaneously secreted cellulases (C004 strain, Fig. 5c). As shown in Fig. 5d, the lycopene titer in the C003 strain reached 0.83 mg g⁻¹ dry cell weight after 36 h of culture in M63 medium with CMC-Na as the sole carbon source. Admittedly, this yield is substantially lower than that achieved by C. glutamicum using a combined pathway engineering approach⁴⁴. Nevertheless, our work illustrates an example of applying ELMs for lycopene production by coupling extracellular enzymatic degradation capacity with intracellular bioconversion ability. In addition, the C. glutamicum-based ELMs could simultaneously sustain self-growth and lycopene production using cellulose biomass waste as the sole carbon source, therefore expanding both the functionalities and application scenarios of existing engineered strain systems^15,16. Future improvements in lycopene yield with our constructed ELMs could be achieved by integrating state-of-the-art directed enzyme evolution⁴⁵ and machine-learning approaches⁴⁶ to enhance the performance of the coassembled cellulases in the ^CgCLP fiber and thereby improve the conversion efficiency of cellulose to glucose. In addition, further efforts in pathway optimization⁴⁴ are expected to enhance the metabolic flux from glucose to lycopene.

Collectively, we here demonstrate a new type of ELM by programming our engineerable ^CgCLP biopolymer to anchor multiple cellulases on the extracellular scaffold. Our work thus offers an example of direct ‘upcycling’ of a renewable waste resource into a value-added chemical by combining extracellular and intracellular bioconversion abilities. This programmable CLP-enabled living material would push the boundary of self-organizing living biofilm materials, which have mainly been based on the programmable amyloid fibers of E. coli and B. subtilis biofilms^15,16. By reengineering the CLP-based extracellular protein scaffold with other types of enzyme, or by rewiring other intracellular synthesis pathways in our engineered strain, one can envision that our ELMs would exhibit more tunable functionalities (such as degrading polyethylene terephthalate with PETase⁴⁷ and degrading chitin with chitinases⁴⁸) or enable direct ‘upcycling’ of cellulose, a renewable waste resource, into other value-added chemicals.

Discussion

Recent studies have harnessed both model and nonmodel microorganisms with their endogenous biopolymers to create self-organizing living materials with diverse functionalities. Despite those advances, previous efforts were mainly based on existing molecular biology knowledge of a limited number of endogenous biopolymers in certain bacteria. Further bottom-up design and application of self-organizing living materials has been hampered by the lack of engineerable chassis and limited access to programmable building blocks in nonpathogens.

In this work, we have established an integrative technological workflow for efficient design of living materials by rationally combining tools from bioinformatics, structural biology and synthetic biology. The workflow enables rapid mining of biopolymer-producing nonpathogenic chassis, understanding of the molecular assembly mechanism of biopolymers and engineering of biopolymer building blocks for living materials in a systematic and sequential manner. By using this approach, one can search for biopolymer-producing strains in nonpathogenic industrial workhorses among a large library of bacteria and accordingly perform new ELMs design from scratch, regardless of the current knowledge of endogenous biopolymers. For example, uncovering the biosynthesis pathway of polysaccharides, such as surface capsular polysaccharides in probiotics, can be harnessed to engineer bacteria for living therapeutics, while mining the biosynthesis pathway of the highly negatively charged polyamides, such as poly-γ-d-glutamic acid, in nonpathogens may enable bioremediation of heavy metal ion-contaminated rivers.

Our BBSniffer software, based on bacterial databases, is currently limited to searching for biopolymer-producing bacterial strains; however, it could be extended to the design of fungal and mammalian cell-enabled living materials by incorporating more relevant databases. In addition, although BBSniffer proves useful in recommending various biopolymer-producing strains with output information such as culture conditions and genetic manipulation tools, its performance in searching for different types of biopolymer BGC may vary to a certain degree, because the BGCs and assembly mechanisms of different biopolymer systems have been deciphered to variable extents. In the future, to improve the performance of BBSniffer, machine-learning approaches could be applied to refine the characteristics of a specific biopolymer BGC and its assembly mechanisms and thus generate more tailored and accurate detection rules for each type of biopolymer.

To further improve the design of ELMs in the future, our established workflow can be coupled with machine-learning methods and protein computation tools, such as AlphaFold2, which excels in protein structure prediction from amino acid sequences⁴⁶, and trRosetta, which enables the de novo design of proteins with new functions⁴⁹. For example, using such combined approaches, various proteinaceous building blocks with predictable behaviors can be designed to generate ELMs with tailored properties. We envision that our approach will guide future efforts in mining and deciphering diverse biopolymers of interest in nonpathogens or even industrial workhorse bacteria, and eventually accelerate the design of living materials with tailored functionalities.

Methods

BBSniffer software and associated workflow

Open-source BBSniffer software was developed to uncover target biopolymer biosynthesis in nature and generate a screened list of the biopolymer-producing bacteria. Specifically, the complete workflow of BBSniffer is provided below.

Genome and protein profile extraction

Input protein family IDs of related structural materials, obtained from the InterPro and Pfam databases, were searched against the UniProt database to obtain the proteins of interest and the genomes in which these proteins are located. Query results were saved in table format under the following columns: protein family ID, entry name, reviewed, protein names, genes, organism, organism ID, length and sequence of amino acids.

For genome extraction, genome sequences and GenBank files were downloaded based on organism IDs (NCBI taxa IDs) in the table using the NCBI-genome-download tool (https://github.com/kblin/ncbi-genome-download). UniProt entries were accessed via API queries (https://www.uniprot.org/help/api_queries). In UniProt query results, an organism ID can be either a species-level ID or a strain-level ID. If an organism ID from the results was given at the species level, all complete and chromosome-level genomes of the strains under the same species ID were downloaded. All the downloaded genomes were saved in .fasta format.

For protein profile extraction, proteins sharing the same input protein family ID were aligned with Clustal Omega, and the related hidden Markov model (HMM) profiles were then generated by HMMER with default parameters. FASTA and GenBank files were processed with Biopython.
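As a rough illustration of this step, the sketch below builds the two command lines for a single protein family. The file names and the helper names `build_profile_commands` and `run_profile_build` are hypothetical, the Stockholm alignment format is an assumption (the text only states that HMMER was run with default parameters), and none of this is taken from the BBSniffer source.

```python
import subprocess
from typing import List


def build_profile_commands(family_fasta: str, alignment: str, hmm_out: str) -> List[List[str]]:
    """Return the Clustal Omega and hmmbuild invocations for one protein family."""
    return [
        # Align all sequences of the family; Stockholm output is an assumption.
        ["clustalo", "-i", family_fasta, "-o", alignment, "--outfmt=st", "--force"],
        # Build the profile HMM from the alignment with default parameters.
        ["hmmbuild", hmm_out, alignment],
    ]


def run_profile_build(family_fasta: str, alignment: str, hmm_out: str) -> None:
    """Run both tools in order; requires clustalo and hmmbuild on PATH."""
    for command in build_profile_commands(family_fasta, alignment, hmm_out):
        subprocess.run(command, check=True)
```

Separating command construction from execution keeps the sketch inspectable on a machine where neither tool is installed.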

Biopolymer BGC detection

AntiSMASH²⁵ was modified to identify the biopolymer BGC of interest in the downloaded genomes. The modified antiSMASH algorithm uses a new rule to identify the target biopolymer cluster type. To qualify as a target biopolymer BGC, the genes encoding functional proteins for biopolymer biosynthesis (identified using the ‘HMM-profiles’) in the downloaded genomes (the ‘genome sequences file’) must be present within a region of specified length in a genome, as defined by a ‘rule file’. The rule file stipulates that the maximum DNA length between the coding sequences of protein families in a genome should be 20 kbp and the DNA length of the extra region outside the coding sequence of protein families in a genome should be exactly 8 kbp. In addition, the rule file also contains a description file specifying the path of the profile HMMs. All downloaded genome sequences were then analyzed for the target biopolymer BGC by the modified antiSMASH. Finally, the detected genomes with biopolymer BGC information were integrated into table format by a Python script.
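The distance rule can be illustrated with a minimal sketch, which is not the antiSMASH implementation. It assumes that HMM hits on one contig are available as (family, start, end) tuples, that every required family must be present in a region, and that the 8 kbp extra region is added on each side; the names `Hit` and `find_candidate_regions` are hypothetical.

```python
from typing import List, Set, Tuple

Hit = Tuple[str, int, int]  # (protein family ID, CDS start, CDS end); hypothetical format

MAX_GAP_BP = 20_000   # maximum distance between family coding sequences (rule file)
EXTENSION_BP = 8_000  # extra region outside the coding sequences (assumed per side)


def _finish(group: List[Hit], required: Set[str], contig_length: int) -> List[Tuple[int, int]]:
    """Return the extended region if the group contains every required family."""
    if not required.issubset({hit[0] for hit in group}):
        return []
    start = max(0, min(hit[1] for hit in group) - EXTENSION_BP)
    end = min(contig_length, max(hit[2] for hit in group) + EXTENSION_BP)
    return [(start, end)]


def find_candidate_regions(hits: List[Hit], required: Set[str],
                           contig_length: int) -> List[Tuple[int, int]]:
    """Group hits separated by at most MAX_GAP_BP and report complete groups."""
    regions: List[Tuple[int, int]] = []
    group: List[Hit] = []
    for hit in sorted(hits, key=lambda h: h[1]):
        # Close the current group when the gap to its last CDS end exceeds the cutoff.
        if group and hit[1] - max(h[2] for h in group) > MAX_GAP_BP:
            regions.extend(_finish(group, required, contig_length))
            group = []
        group.append(hit)
    if group:
        regions.extend(_finish(group, required, contig_length))
    return regions
```

For example, a pilin hit and a sortase hit lying 3.5 kbp apart form one candidate region, whereas a lone pilin hit 84 kbp downstream is discarded because no sortase lies within 20 kbp of it.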

Building an internal bacterial database

Next, identified strains containing the target biopolymer BGC were classified into ‘pathogens’, ‘industrial microorganisms’ and ‘other nonpathogens’ based on the annotation of an integrated bacterial database. The bacterial database was created in the following manner. First, all bacteria information was downloaded from the PATRIC database using PATRIC’s P3 scripts (https://docs.patricbrc.org/cli_tutorial/index.html). Second, strains were classified as ‘pathogens’ if they were associated with any disease information in the PATRIC database. Third, the label ‘industrial microorganisms’ was assigned to bacteria included in PROBIO (http://bidd.group/probio/download.htm), MicrobiomePost.com (https://microbiomepost.com/probiotic-strain-database/), the list of Generally Recognized as Safe organisms (https://www.cfsanappsexternal.fda.gov/scripts/fdcc/index.cfm?set=GRASNotices) or the strains or species used for chemical or biomaterial production (extracted from several publications). Finally, strains that were neither pathogens nor industrial microorganisms were classified as ‘other nonpathogens’. In addition, we used the BacDive database (https://bacdive.dsmz.de/), a free resource and the largest database worldwide for standardized bacterial information, to annotate the strains with additional information, including whether they are anaerobes and/or aerobes and their growth requirements (for example, medium and growth conditions). Following a strategy similar to that of the literature-reported SynBioStrainFinder database⁵⁰ (a microbial strain database of manually curated CRISPR–Cas genetic manipulation system information), we annotated our own bacterial database with additional information regarding the available genetic manipulation tools (such as the CRISPR–Cas system) of the industrial microorganisms. The internal bacterial database we built for BBSniffer is available at GitHub (https://github.com/xbiome/BBSniffer/blob/publish/Database/Bacteria_database.xlsx).
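The three-way labeling can be summarized in a short sketch. The function name `classify_strain` and the set-based inputs are hypothetical, and giving the ‘pathogens’ label precedence when a strain appears in both lists is an assumption that follows the order of the steps above rather than a documented behavior of BBSniffer.

```python
from typing import Set


def classify_strain(strain: str, disease_associated: Set[str], industrial: Set[str]) -> str:
    """Assign one of the three class labels used in the internal bacterial database."""
    # Any disease association recorded in PATRIC marks the strain as a pathogen.
    if strain in disease_associated:
        return "pathogens"
    # Membership in the probiotic, GRAS or production-strain lists marks it as industrial.
    if strain in industrial:
        return "industrial microorganisms"
    # Everything else falls into the remaining class.
    return "other nonpathogens"
```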

Phylogenetic tree construction

After classification, genomes of industrial microorganisms and the reference strain were used to construct a distance-based phylogenetic tree using JolyTree⁵¹. The generated tree was saved in Newick format by default and then visualized using the ggtree R package⁵².

Candidate strain generation

The candidate strains were ranked with two rules: (1) whether they belong to the ‘industrial microorganisms’ class and (2) the distance to the selected reference strain, which can be extracted from the distance matrix of the JolyTree results. The lower the score of a strain, the closer it is to the reference strain and, thus, the higher its ranking in the recommendation list. Finally, the top five strains (distance ranked from low to high) are recommended as the candidate strains for further engineering of targeted biopolymers.
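A minimal sketch of this ranking is given below. It assumes that the distances to the reference strain have already been read from the JolyTree distance matrix into a dictionary, and it treats rule (1) as a filter; the function name `recommend_candidates` and the alphabetical tie-break are illustrative choices, not part of the described method.

```python
from typing import Dict, List, Set


def recommend_candidates(distance_to_reference: Dict[str, float],
                         industrial: Set[str], top_n: int = 5) -> List[str]:
    """Return up to top_n industrial strains, closest to the reference strain first."""
    # Rule (1): keep only strains labeled as industrial microorganisms.
    eligible = [s for s in distance_to_reference if s in industrial]
    # Rule (2): a lower distance means a higher rank; names break ties deterministically.
    ranked = sorted(eligible, key=lambda s: (distance_to_reference[s], s))
    return ranked[:top_n]
```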

All the steps mentioned above were wrapped as a standalone Python program named BBSniffer, accessible at GitHub (https://github.com/xbiome/BBSniffer/).

Assessment of the performance of BBSniffer

We performed further bioinformatics experiments for all four types of biopolymer (sortase-assembled CLP, BC, GV and BMC) and assessed the success rate of the BBSniffer software to illustrate the power of our established BBSniffer workflow. Specifically, we assessed the final success rate of the BBSniffer software by taking two general cases into account. First, we checked the detection accuracy of the BBSniffer software in searching all the literature-reported and experimentally well-characterized strains (usually containing well-defined information about the specific biopolymers and corresponding BGCs) by calculating their coverage rate. Second, for those search results that have not yet been identified or reported in literature, we applied the existing knowledge and features of biopolymer BGCs as the main reference and then manually inspected the annotation results of the strains predicted to contain a biopolymer BGC. We eventually assessed the overall performance of our established workflow based on the results of the aforementioned four types of biopolymer.

We defined the final success rate of the BBSniffer software for searching the target biopolymer containing strains based on the equation:

$$\begin{array}{rcl}
\eta & = & \dfrac{\mathrm{A1}' + \mathrm{A2}'}{\mathrm{A1} + \mathrm{A2}}\\[6pt]
\mathrm{R1} & = & \dfrac{\text{the coverage number based on BBSniffer detection results}}{\text{the number of BGC-containing strains reported in literature}} = \dfrac{\mathrm{A1}'}{\mathrm{A1}}\\[6pt]
\mathrm{R2} & = & \dfrac{\text{the number of verified strains that contain the BGC based on manual inspection}}{\text{the number of randomly picked up strains based on BBSniffer detection results}} = \dfrac{\mathrm{A2}'}{\mathrm{A2}}
\end{array}$$

where η is the success rate of the BBSniffer software for searching the target biopolymer-containing strains. R1 is defined as the detection accuracy rate of the BBSniffer software using the BGC-containing strains reported in literature as references. We first searched published papers that contain clear information about strains carrying the target biopolymer BGC. Using this information as a reference, we calculated the coverage rate of the BBSniffer software by comparing the experimentally characterized biopolymer BGCs with the BBSniffer detection results.

R2 is defined as the detection accuracy rate of the BBSniffer software in predicting software-mined but previously unidentified strains (for which no information about the presence of the target biopolymer BGC is available). Several software-mined strains were randomly chosen and, for each selected strain, we performed further bioinformatics analyses to annotate the functional genes in the biopolymer BGC. Then, using the well-defined biopolymer BGC features as a reference, we determined whether the strain indeed contains the related biopolymer BGC. Note that the BBSniffer-detected strains with the target BGC were functionally annotated through emapper⁵³ and NCBI BLAST.
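The three quantities defined above can be computed as in the following sketch. The function name `success_rates` and the counts in the usage note are hypothetical and serve only to show the arithmetic; they are not results from this study.

```python
from typing import Tuple


def success_rates(a1_detected: int, a1_reported: int,
                  a2_verified: int, a2_sampled: int) -> Tuple[float, float, float]:
    """Return (R1, R2, eta) from the counts A1', A1, A2' and A2."""
    if a1_reported <= 0 or a2_sampled <= 0:
        raise ValueError("A1 and A2 must be positive")
    r1 = a1_detected / a1_reported   # coverage of literature-reported strains
    r2 = a2_verified / a2_sampled    # fraction of sampled predictions confirmed manually
    # Pooled success rate over both sets of strains.
    eta = (a1_detected + a2_verified) / (a1_reported + a2_sampled)
    return r1, r2, eta
```

With hypothetical counts of 8 detected out of 10 reported strains and 15 verified out of 20 sampled strains, R1 is 0.8, R2 is 0.75 and η is 23/30, or roughly 0.77. Note that η pools the counts, so it is weighted by the sizes of the two sets rather than being the plain average of R1 and R2.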

To determine whether the predicted strains contain the target CLP-BGC, we used the well-defined CLP-BGC features as a reference: there must be at least one pilin and one sortase in the detected BGC²³. We determined whether the predicted strains possess the capability for the biosynthesis of BC by following the well-studied rule that BcsA and BcsB are necessary and sufficient for forming the BC polysaccharide chain in vitro⁵⁴. GV formation involves the primary structural protein GvpA and additional required Gvp proteins, all of which are encoded in the gvp gene clusters¹⁴. Therefore, we determined whether the predicted strains have the capability to form GV with the following rule: GvpA and at least one other Gvp protein must exist in the detected GV-BGC. BMCs comprise multiple shell proteins surrounding enzyme cargos, typically encoded in a single gene cluster⁵⁵; we used these features as a reference to determine whether the predicted strains contain the BMC-BGC.
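The three explicit presence rules (CLP, BC and GV) can be written as a small checker. This is a sketch under stated assumptions: the function name `verify_bgc` and the plain-text gene labels are hypothetical, real annotations from emapper or BLAST would first need to be mapped onto such labels, and the BMC rule is omitted because the text describes it only qualitatively.

```python
from typing import Iterable


def verify_bgc(kind: str, gene_labels: Iterable[str]) -> bool:
    """Apply the manual-inspection rule for one biopolymer type to annotated gene labels."""
    labels = {label.lower() for label in gene_labels}
    if kind == "CLP":
        # At least one pilin and one sortase in the detected BGC.
        return "pilin" in labels and "sortase" in labels
    if kind == "BC":
        # BcsA and BcsB must both be present.
        return {"bcsa", "bcsb"} <= labels
    if kind == "GV":
        # GvpA plus at least one other Gvp protein.
        other_gvp = {name for name in labels if name.startswith("gvp") and name != "gvpa"}
        return "gvpa" in labels and bool(other_gvp)
    raise ValueError(f"no rule sketched for biopolymer type {kind!r}")
```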

General methods

The original DNA sequences were fully synthesized (GENEWIZ) or generated by PCR. All PCR products were generated using KOD DNA polymerase (Toyobo). All plasmid construction was performed using T4 DNA ligase (New England BioLabs) for ligations or the NEBuilder HiFi DNA Assembly Master Mix (New England BioLabs) for assembly. All plasmids and markerless strains were confirmed by DNA sequencing (GENEWIZ). Primers and protein sequences are listed in Supplementary Tables 4 and 5.

Growth media

C. glutamicum ATCC 14067 was provided by S. Zheng’s research group at the South China University of Technology. For recovery, C. glutamicum ATCC 14067 was grown overnight in brain heart infusion (BHI) liquid medium (37 g l⁻¹ BHI, Becton, Dickinson and Company) at 30 °C and 250 rpm. For ^CgCLP formation, C. glutamicum ATCC 14067 was inoculated into M63 liquid medium (15.6 g l⁻¹ M63 Broth (Sangon Biotech, Guangzhou, China), supplemented with 1 mM MgSO₄, 0.2% (wt/vol) glucose) and cultivated in an incubator at 30 °C without shaking for 2–3 days. Antibiotics for C. glutamicum culture were kanamycin (25 μg ml⁻¹) and chloramphenicol (7.5 μg ml⁻¹).

IPTG at 1 mM or theophylline at 1 mM was used to induce gene expression in C. glutamicum. Trans1-T1 (TransGen Biotech) was used as the cloning host for plasmid manipulation, and E. coli BL21 (DE3) (New England BioLabs) was used for protein expression. E. coli was cultured in Luria-Bertani medium (10 g l⁻¹ peptone, 5 g l⁻¹ yeast extract, 10 g l⁻¹ NaCl) at 37 °C, or at 16 °C when applicable for protein expression. Antibiotics for E. coli culture were kanamycin (50 μg ml⁻¹) and chloramphenicol (30 μg ml⁻¹). IPTG at 0.5 mM was used to induce gene expression in E. coli. All bacterial strains used in this study are listed in Supplementary Table 6.

TEM and immunogold labeling

TEM imaging

C. glutamicum cells cultured for 2–3 days in M63 medium were collected and washed twice in PBS buffer, and 20 μl of the M63 liquid culture at an optical density (OD₆₀₀) of roughly 1 was deposited onto carbon-coated TEM grids for 5–10 min. The samples were washed twice with 50 μl of PBS buffer and three times with 20 μl of water, and the excess solution was then quickly wicked away with filter paper. The fixed cells were negatively stained with 15 μl of 2% (w/v) uranyl acetate solution for 1 min and dried for 10 min under an infrared lamp. Samples were examined in a JEOL JEM-1400 TEM at an accelerating voltage of 120 kV.

Immunogold labeling

Partial coding sequences of the ^CgCLP pilins Spa1 (residues 44–473 of Spa1, Spa1_ab), Spa2 (residues 35–469 of Spa2, Spa2_ab) and Spa3 (residues 31–235 of Spa3, Spa3_ab) were expressed in E. coli, purified and injected into rabbits to prepare the specific polyclonal antibodies α-Spa1, α-Spa2 and α-Spa3 (ChinaPeptides), respectively. For immunogold labeling, 20 μl of M63 liquid culture at an OD₆₀₀ of roughly 1 was placed on carbon-coated grids for 10 min, then washed twice with PBS buffer and three times with water. The samples were blocked with PBS containing 1% bovine serum albumin (BSA) for 30 min. The solution was wicked off with filter paper and the fixed cells were stained with a pilin primary antibody diluted 1:200 in PBS with 1% BSA for 1 h, followed by washing and blocking. Samples were stained with 10 nm gold-decorated goat antirabbit IgG (Bioss) diluted 1:50 in PBS with 1% BSA for 45 min, followed by washing three times with PBS and five times with water. Then, negative staining, drying and imaging were performed. Double immunogold labeling experiments were performed according to a previous publication with some modifications⁵⁶. Briefly, after the primary antibody incubation, samples were incubated with PBS containing 3% paraformaldehyde and 2% glutaraldehyde for 2 h. Samples were washed three times with PBS and incubated with 0.02 M glycine in PBS for 10 min. The immunogold labeling process was then performed with the second pilin antibody and gold-decorated goat antirabbit IgG of a different size (5, 15 or 30 nm), followed by negative staining, drying and imaging.

Quantitative assay of CLP via whole-cell filtration ELISA

A whole-cell filtration ELISA method previously used to detect the presence of extracellular amyloids¹⁵ was adopted for quantitative analysis of CLP. Briefly, C. glutamicum strains were cultured for 48 h in M63 liquid medium, and cultures were collected, washed and diluted to an OD₆₀₀ of 0.1 in Tris-buffered saline with 0.1% Proclin 300 (TBS + 0.1% Proclin 300) on ice. Then, 25 μl of the diluted culture was loaded in a Multiscreen-GV 96-well filter plate, followed by washing, blocking and incubating with α-Spa2 (diluted 1:5,000), and then washing, blocking and incubating with goat antirabbit horseradish peroxidase-conjugated secondary antibody (diluted 1:5,000) (Sangon Biotech). Subsequently, a chromogenic reaction was performed with ultra-3,3′,5,5′-tetramethylbenzidine and terminated by the addition of 2 M H₂SO₄. Finally, the absorbance of the product was measured at 450 nm (with a reference wavelength of 650 nm) using a Cytation reader (BioTek).

Protein crystallization and structure determination

The final purified protein was concentrated to 20 mg ml⁻¹ in 10 mM Tris-HCl pH 8.0 and 50 mM NaCl for crystallization. The sitting drop vapor diffusion technique was used to crystallize the Spa2 protein. Crystals were obtained by mixing 4 μl of Spa2 protein with 4 μl of reservoir solution (0.2 M sodium sulfate, 0.1 M Bis-Tris propane pH 7.5, 20% (w/v) polyethylene glycol 3350) and incubating the mixture at 18 °C for 1–2 weeks (Supplementary Fig. 14a). The crystals were soaked in a cryoprotectant solution consisting of the reservoir solution and 20% (v/v) glycerol, and then quickly frozen in liquid nitrogen. Diffraction data were collected on the BL18U1 beamline at the Shanghai Synchrotron Radiation Facility with flash-frozen crystals (at 100 K in a stream of nitrogen gas). The data were processed with XDS software and then further processed using STARANISO (a server of Global Phasing Company).

The recombinant Spa2 crystal form diffracted to 2.73 Å resolution (Supplementary Fig. 14b) and belongs to the space group P2₁2₁2₁, with unit-cell parameters a = 45.7 Å, b = 64.1 Å, c = 442.0 Å, α = β = γ = 90.0° and two molecules in the asymmetric unit. The structure was solved by the molecular replacement method using PHASER⁵⁷, with the Spa2 coordinates predicted by AlphaFold Colab⁴⁶ as the template. Further manual model building was carried out using COOT. The model was refined with PHENIX. Data collection, phasing and refinement statistics are given in Supplementary Table 3. Structure figures were prepared using PyMOL v.2.3.4 (https://pymol.org/2/).

Enzymatic activity assay

The enzyme activity of cellulases against carboxymethylcellulose sodium salt (CMC-Na, Sigma) was detected using a 3,5-dinitrosalicylic acid (DNS) assay⁵⁸. Cells of TrEgl-Spa2_SdBgl-Spa2 (C003 strain) and TrEgl_SdBgl (C004 strain) at an OD of 10 were concentrated to 500 μl and incubated in 2 ml of 50 mM acetic acid (pH 4.8) with 1% (w/v) CMC-Na substrate at 50 °C for 30 min. The reaction was stopped by adding DNS and boiling for 10 min; reducing sugars were detected at 540 nm. One unit of enzyme activity was defined as the number of cells that released 1 μmol of glucose from cellulose at 50 °C in 1 min. The enzyme activity of endo-1,4-β-glucanase was determined using the manual assay kit (K-CellG5-2V, Megazyme).

Quantitative analysis of lycopene by HPLC

The lycopene-producing plasmid pZ9-dxs_crtEBI was transferred into strain TrEgl_SdBgl to construct the recombinant strains of C003 and C004 for the use of cellulose to produce lycopene. C003 and C004 strains were inoculated into 10 ml of BHI with 25 μg ml⁻¹ kanamycin and 7.5 μg ml⁻¹ chloramphenicol, and cultured for 12 h at 30 °C with shaking at 200 rpm. The cells were then transferred into 50 ml of modified M63 medium (15.6 g l⁻¹ M63 broth, supplemented with 1 mM MgSO₄, 2% (wt/vol) CMC-Na) at an initial OD₆₀₀ of 3 and cultured for 2 days at 30 °C, with or without the addition of 1 mM IPTG.

A previous approach was adopted for the quantitative analysis of lycopene production⁴⁴. IPTG-induced and uninduced cells (1 ml) were separately collected into 2 ml tubes of lysing matrix Y (M.P. Biomedicals) by centrifugation at 13,523g for 5 min. The pellets were resuspended in a 60% hexane and 40% acetone mixture and lysed using the FastPrep^R-24 5G bead beating grinder and lysis system (M.P. Biomedicals) for lycopene extraction. Lysis was performed in six cycles of 30 s each, separated by 1 min intervals.

The samples were centrifuged at 18,406g for 10 min at 4 °C, and the resulting supernatant was then transferred to brown 2 ml screw cap glass vials (Agilent Technologies) and directly subjected to high-performance liquid chromatography (HPLC) analysis. The quantification of lycopene was performed on an Agilent 1260 series HPLC system (Agilent Technologies) using a YMC Carotenoid column (250 × 4.6 mm I.D., YMC), with detection via a diode array detector at 450 nm. For separation, binary gradient elution was applied to change the eluent from 100% eluent A of methanol:methyl tert-butyl ether:water (81:15:4) to 100% eluent B of methanol:methyl tert-butyl ether:water (7:90:3) over 90 min at a flow rate of 1.0 ml min⁻¹ at 20 °C with an injection volume of 10 μl.

Statistics and reproducibility

All results presented in graphs show the mean ± s.d. of at least three technical replicates. A two-tailed Student’s t-test was used to calculate P values. Sample sizes for all the micrograph data (confocal images, TEM and AFM) were at least three independent biological replicates, and the replicate experiments yielded similar results.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Data availability

All data that support the findings of this study are included in this published article. The internal bacterial database used for BBSniffer is available at GitHub (https://github.com/xbiome/BBSniffer/blob/publish/Database/Bacteria_database.xlsx). The structure factors and coordinates of Spa2 have been deposited in the PDB under ID no. 7WOI. Source data are provided with this paper.

Code availability

The code of BBSniffer software has been deposited in GitHub: https://github.com/xbiome/BBSniffer/.

References

Tang, T. C. et al. Materials design by synthetic biology. Nat. Rev. Mater. 6, 332–350 (2021).
Huang, Y. et al. Engineering microbial systems for the production and functionalization of biomaterials. Curr. Opin. Microbiol. 68, 102154 (2022).
Gilbert, C. et al. Living materials with programmable functionalities grown from engineered microbial co-cultures. Nat. Mater. 20, 691–700 (2021).
Cao, Y. et al. Programmable assembly of pressure sensors using pattern-forming bacteria. Nat. Biotechnol. 35, 1087–1093 (2017).
Pu, J. et al. Virus disinfection from environmental water sources using living engineered biofilm materials. Adv. Sci. 7, 1903558 (2020).
Liu, A. P. et al. The living interface between synthetic biology and biomaterial design. Nat. Mater. 21, 390–397 (2022).
Praveschotinunt, P. et al. Engineered E. coli Nissle 1917 for the delivery of matrix-tethered therapeutic domains to the gut. Nat. Commun. 10, 5580 (2019).
Dai, Z. et al. Versatile biomanufacturing through stimulus-responsive cell-material feedback. Nat. Chem. Biol. 15, 1017–1024 (2019).
Dai, Z. et al. Living fabrication of functional semi-interpenetrating polymeric materials. Nat. Commun. 12, 3422 (2021).
Nguyen, P. Q. et al. Wearable materials with embedded synthetic biology sensors for biomolecule detection. Nat. Biotechnol. 39, 1366–1374 (2021).
Bird, L. J. et al. Engineered living conductive biofilms as functional materials. MRS Commun. 9, 505–517 (2019).
Rodrigo-Navarro, A., Sankaran, S., Dalby, M. J., Del Campo, A. & Salmeron-Sanchez, M. Engineered living biomaterials. Nat. Rev. Mater. 6, 1175–1190 (2021).
Bracha, D., Walls, M. T. & Brangwynne, C. P. Probing and engineering liquid-phase organelles. Nat. Biotechnol. 37, 1435–1445 (2019).
Bourdeau, R. W. et al. Acoustic reporter genes for noninvasive imaging of microorganisms in mammalian hosts. Nature 553, 86–90 (2018).
Nguyen, P. Q., Botyanszki, Z., Tay, P. K. R. & Joshi, N. S. Programmable biofilm-based materials from engineered curli nanofibres. Nat. Commun. 5, 4945 (2014).
Huang, J. et al. Programmable and printable Bacillus subtilis biofilms as engineered living materials. Nat. Chem. Biol. 15, 34–41 (2019).
McBee, R. M. et al. Engineering living and regenerative fungal-bacterial biocomposite structures. Nat. Mater. 21, 471–478 (2022).
An, B. et al. Programming living glue systems to perform autonomous mechanical repairs. Matter 3, 2080–2092 (2020).
Wang, Y. et al. Living materials fabricated via gradient mineralization of light-inducible biofilms. Nat. Chem. Biol. 17, 351–359 (2021).
Caro-Astorga, J., Walker, K. T., Herrera, N., Lee, K. Y. & Ellis, T. Bacterial cellulose spheroids as building blocks for 3D and patterned living materials and for regeneration. Nat. Commun. 12, 5027 (2021).
Charrier, M. et al. Engineering the S-layer of Caulobacter crescentus as a foundation for stable, high-density, 2D living materials. ACS Synth. Biol. 8, 181–190 (2018).
Kim, L. J. et al. Prospecting for natural products by genome mining and microcrystal electron diffraction. Nat. Chem. Biol. 17, 872–877 (2021).
Ramirez, N. A., Das, A. & Ton-That, H. New paradigms of pilus assembly mechanisms in gram-positive actinobacteria. Trends Microbiol. 28, 999–1009 (2020).
Moradali, M. F. & Rehm, B. H. Bacterial biopolymers: from pathogenesis to advanced materials. Nat. Rev. Microbiol. 18, 195–210 (2020).
Blin, K. et al. antiSMASH 6.0: improving cluster detection and comparison capabilities. Nucleic Acids Res. 49, W29–W35 (2021).
Skinnider, M. A., Merwin, N. J., Johnston, C. W. & Magarvey, N. A. PRISM 3: expanded prediction of natural product chemical structures from microbial genomes. Nucleic Acids Res. 45, W49–W54 (2017).
Weber, T. et al. CLUSEAN: a computer-based framework for the automated analysis of bacterial secondary metabolite biosynthetic gene clusters. J. Biotechnol. 140, 13–17 (2009).
McConnell, S. A. et al. Protein labeling via a specific lysine-isopeptide bond using the pilin polymerizing sortase from Corynebacterium diphtheriae. J. Am. Chem. Soc. 140, 8420–8423 (2018).
Zhao, N. et al. Development of a transcription factor-based diamine biosensor in Corynebacterium glutamicum. ACS Synth. Biol. 10, 3074–3083 (2021).
Mandlik, A., Swierczynski, A., Das, A. & Ton-That, H. Pili in gram-positive bacteria: assembly, involvement in colonization and biofilm development. Trends Microbiol. 16, 33–40 (2008).
Ton‐That, H., Marraffini, L. A. & Schneewind, O. Sortases and pilin elements involved in pilus assembly of Corynebacterium diphtheriae. Mol. Microbiol. 53, 251–261 (2004).
Kang, H. J., Paterson, N. G., Gaspar, A. H., Ton-That, H. & Baker, E. N. The Corynebacterium diphtheriae shaft pilin SpaA is built of tandem Ig-like modules with stabilizing isopeptide and disulfide bonds. Proc. Natl Acad. Sci. USA 106, 16967–16971 (2009).
Kang, H. J. et al. A slow-forming isopeptide bond in the structure of the major pilin SpaD from Corynebacterium diphtheriae has implications for pilus assembly. Acta Crystallogr. D. Biol. Crystallogr. 70, 1190–1201 (2014).
Kang, H. J., Coulibaly, F., Clow, F., Proft, T. & Baker, E. N. Stabilizing isopeptide bonds revealed in gram-positive bacterial pilus structure. Science 318, 1625–1628 (2007).
Budzik, J. M. et al. Intramolecular amide bonds stabilize pili on the surface of bacilli. Proc. Natl Acad. Sci. USA 106, 19992–19997 (2009).
Wang, X. et al. Photocatalyst-mineralized biofilms as living bio-abiotic interfaces for single enzyme to whole-cell photocatalytic applications. Sci. Adv. 8, eabm7665 (2022).
Reddington, S. C. & Howarth, M. Secrets of a covalent interaction for biomaterials and biotechnology: SpyTag and SpyCatcher. Curr. Opin. Chem. Biol. 29, 94–99 (2015).
Lau, Y. H., Giessen, T. W., Altenburg, W. J. & Silver, P. A. Prokaryotic nanocompartments form synthetic organelles in a eukaryote. Nat. Commun. 9, 1311 (2018).
Tang, H. et al. Efficient yeast surface-display of novel complex synthetic cellulosomes. Microb. Cell. Fact. 17, 122 (2018).
Kodama, Y. & Hu, C. D. An improved bimolecular fluorescence complementation assay with a high signal-to-noise ratio. BioTechniques 49, 793–805 (2010).
Lin, K., Han, S. & Zheng, S. Application of Corynebacterium glutamicum engineering display system in three generations of biorefinery. Microb. Cell. Fact. 21, 14 (2022).
Anusree, M., Wendisch, V. F. & Nampoothiri, K. M. Co-expression of endoglucanase and β-glucosidase in Corynebacterium glutamicum DM1729 towards direct lysine fermentation from cellulose. Bioresour. Technol. 213, 239–244 (2016).
Heider, S. A., Peters-Wendisch, P. & Wendisch, V. F. Carotenoid biosynthesis and overproduction in Corynebacterium glutamicum. BMC Microbiol. 12, 198 (2012).
Li, C. et al. Heterologous production of α-Carotene in Corynebacterium glutamicum using a multi-copy chromosomal integration method. Bioresour. Technol. 341, 125782 (2021).
Wang, Y. et al. Directed evolution: methodologies and applications. Chem. Rev. 121, 12384–12444 (2021).
Jumper, J. et al. Highly accurate protein structure prediction with AlphaFold. Nature 596, 583–589 (2021).
Lu, H. et al. Machine learning-aided engineering of hydrolases for PET depolymerization. Nature 604, 662–667 (2022).
Ma, X. et al. Upcycling chitin-containing waste into organonitrogen chemicals via an integrated process. Proc. Natl Acad. Sci. USA 117, 7719–7728 (2020).
Anishchenko, I. et al. De novo protein design by deep network hallucination. Nature 600, 547–552 (2021).
Cai, P. et al. SynBioStrainFinder: a microbial strain database of manually curated CRISPR/Cas genetic manipulation system information for biomanufacturing. Microb. Cell. Fact. 21, 87 (2022).
Criscuolo, A. A fast alignment-free bioinformatics procedure to infer accurate distance-based phylogenetic trees from genome assemblies. Res. Ideas Outcomes 5, e36178 (2019).
Yu, G., Lam, T. T.-Y., Zhu, H. & Guan, Y. Two methods for mapping and visualizing associated data on phylogeny using ggtree. Mol. Biol. Evol. 35, 3041–3043 (2018).
Huerta-Cepas, J. et al. Fast genome-wide functional annotation through orthology assignment by eggNOG-Mapper. Mol. Biol. Evol. 34, 2115–2122 (2017).
Römling, U. & Galperin, M. Y. Bacterial cellulose biosynthesis: diversity of operons, subunits, products, and functions. Trends Microbiol. 23, 545–557 (2015).
Kennedy, N. W., Mills, C. E., Nichols, T. M., Abrahamson, C. H. & Tullman-Ercek, D. Bacterial microcompartments: tiny organelles with big potential. Curr. Opin. Microbiol. 63, 36–42 (2021).
Budzik, J. M., Marraffini, L. A. & Schneewind, O. Assembly of pili on the surface of Bacillus cereus vegetative cells. Mol. Microbiol. 66, 495–510 (2007).
McCoy, A. J. et al. PHASER crystallographic software. J. Appl. Crystallogr. 40, 658–674 (2007).
Dong, C. et al. Engineering Pichia pastoris with surface-display minicellulosomes for carboxymethyl cellulose hydrolysis and ethanol production. Biotechnol. Biofuels 13, 108 (2020).

Acknowledgements

This work was partially sponsored by the National Key R&D Program of China (grant no. 2020YFA0908100) (C.Z.), the National Science Fund for Distinguished Young Scholars (grant no. 32125023) (C.Z.), the Shenzhen Science and Technology Program (grant no. ZDSYS20220606100606013) (C.Z.), the National Natural Science Foundation of China (grant no. U1932204) (C.Z.), the Guangdong Basic and Applied Basic Research Foundation (grant no. 2021A1515110149) (Y.H.), the National Natural Science Foundation of China (grant no. 32301226) (Y.H.), the Shenzhen Science and Technology Program (grant no. ZDSYS20210623091810032) (J.Z.) and the National Natural Science Foundation of China (grant nos. 32271501 and 31971354) (N.L.). We thank the Electron Microscopy Center and the Analytical Instrumentation Center at the School of Physical Science and Technology, ShanghaiTech University, and the Core Facility at Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences. We also thank the staff of beamline BL18U1 of the Shanghai Synchrotron Radiation Facility for access and help with the X-ray data collection, and S. Zheng (South China University of Technology) for the kind gift of the C. glutamicum ATCC 14067 strain and the RecET/Cre-loxP system-related plasmids.

Author information

Authors and Affiliations

Key Laboratory of Quantitative Synthetic Biology, Shenzhen Institute of Synthetic Biology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
Yuanyuan Huang, Yanfei Wu, Jie Wang, Yanyi Wang, Jicong Zhang, Shengkun Dai, Wenjuan Zhao, Bolin An, Jiahua Pu, Yaomin Wang, Nan Li, Jiahai Zhou & Chao Zhong
Center for Materials Synthetic Biology, Shenzhen Institute of Synthetic Biology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
Yuanyuan Huang, Jie Wang, Yanyi Wang, Jicong Zhang, Bolin An, Jiahua Pu, Yaomin Wang & Chao Zhong
Faculty of Synthetic Biology, Shenzhen Institute of Advanced Technology, Shenzhen, China
Yuanyuan Huang, Jiahai Zhou & Chao Zhong
Shenzhen Xbiome Biotech Co. Ltd, Shenzhen, China
Han Hu, Bangzhuo Tong & Yan Tan
School of Physical Science and Technology, ShanghaiTech University, Shanghai, China
Siyu Zhang
National Facility for Protein Science in Shanghai, Shanghai Advanced Research Institute, Chinese Academy of Sciences, Shanghai, China
Yue Yin & Chao Peng

Contributions

C.Z. and Y.H. conceived the concept and directed the research. C.Z. and Y.H. designed and conducted the experiments and data analysis. J.Z. and Y.W. carried out crystallographic studies. Y.T., H.H. and B.T. developed the software. J.W., Y.M.W. and J.P. participated in plasmid and strain construction. S.Z. contributed to AFM imaging experiments. Y.Y.W., J.Z. and B.A. assisted with TEM imaging experiments. N.L., C.P., Y.Y., S.D. and W.Z. helped perform the mass spectrometry analysis. C.Z. and Y.H. wrote the paper with help from all authors.

Corresponding authors

Correspondence to Jiahai Zhou, Yan Tan or Chao Zhong.

Ethics declarations

Competing interests

C.Z. and Y.H. are co-inventors on patent applications (no. PCT/CN2022/130033) filed by Shenzhen Institute of Advanced Technology, based on the Spa2-based work covered in this article. C.Z., Y.H., Y.T., H.H. and B.T. are co-inventors on patent applications (no. 202310091110.2) and software copyright applications (no. 2023SR0280226) filed by Shenzhen Institute of Advanced Technology and Shenzhen Xbiome Biotech Co. Ltd., based on the BBSniffer software reported in this article. The other authors declare no competing interests.

Peer review

Peer review information

Nature Chemical Biology thanks Anna Duraj-Thatte and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Extended data

Extended Data Fig. 1 The use of the BBSniffer software for mining biopolymer-producers of bacterial cellulose (BC), gas vesicle (GV) and bacterial microcompartment (BMC).

(a) For bacterial cellulose-producer mining, 7,109 strains were detected, of which 1,767 were classified as industrial microorganisms according to our internal bacterial database. BBSniffer scored the distance between the mined industrial microorganisms and the reference strain Komagataeibacter xylinus E25 using a distance-based phylogenetic tree. The top five mined industrial workhorses were provided for further engineering, along with recommended culture conditions and a CRISPRi gene-editing tool. (b) For gas vesicle-producer mining, 489 strains were detected, of which 191 were industrial microorganisms. BBSniffer scored the distance between the mined industrial microorganisms and the reference strain Halobacterium salinarum 91-R6 using a distance-based phylogenetic tree. The top five mined industrial workhorses were provided for further engineering of gas vesicles, along with recommended culture conditions and a CRISPR-Cas gene-editing tool. (c) For bacterial microcompartment mining, 4,241 strains were detected, of which 1,381 were industrial microorganisms. BBSniffer scored the distance between the mined industrial microorganisms and the reference strain, pathogenic Salmonella typhimurium LT2, using a distance-based phylogenetic tree. The top five mined industrial workhorses were provided for further engineering of bacterial microcompartments, along with recommended culture conditions and a CRISPR-Cas gene-editing tool.
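The candidate-selection step described in this caption can be illustrated with a minimal Python sketch. It is not the BBSniffer implementation: the function name `top_candidates`, the strain labels and the distance values are hypothetical, and the sketch assumes that a smaller tree distance to the reference strain ranks a candidate higher.

```python
from typing import Dict, List, Set, Tuple


def top_candidates(
    distances: Dict[str, float],
    industrial: Set[str],
    n: int = 5,
) -> List[Tuple[str, float]]:
    """Return the n industrial strains closest to the reference strain.

    distances: hypothetical tree distance of each mined strain to the reference.
    industrial: strains classified as industrial microorganisms.
    """
    # Keep only strains that are classified as industrial microorganisms.
    hits = [(strain, d) for strain, d in distances.items() if strain in industrial]
    # Assumption: a smaller distance to the reference ranks higher.
    return sorted(hits, key=lambda pair: pair[1])[:n]


# Hypothetical values, for illustration only (not data from the paper).
distances = {
    "strain_A": 0.42,
    "strain_B": 0.10,
    "strain_C": 0.77,
    "strain_D": 0.25,
    "strain_E": 0.31,
    "strain_F": 0.05,  # closest overall, but not an industrial strain
    "strain_G": 0.60,
}
industrial = {"strain_A", "strain_B", "strain_C", "strain_D", "strain_E", "strain_G"}

ranked = top_candidates(distances, industrial)
for strain, d in ranked:
    print(f"{strain}\t{d:.2f}")
```

In this toy input, `strain_F` is excluded despite having the smallest distance because it is not in the industrial set, which mirrors the two-stage filter (classification, then distance ranking) described in the caption.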

Extended Data Fig. 2 Quantitative analysis of the amount of ^CgCLP fiber for engineered C. glutamicum using the whole-cell filtration ELISA.

(a) ELISA quantification (detection by anti-Spa2 antibody) of cells defective for Spa1 (Δspa1 strain), Spa3 (Δspa3 strain) or both (Δspa1Δspa3 strain), and of cells overexpressing Spa2 (Spa2 strain). P values are indicated above the bars (not significant (NS) P > 0.05, **P < 0.01, ***P < 0.001; engineered C. glutamicum versus wild-type C. glutamicum using two-tailed t-test, from three biologically independent samples, mean ± s.d.). (b) ELISA quantification of engineered cells expressing Spa2 fused with various domains. (c) ELISA quantification of the engineered cells used to validate the co-assembly of split-Venus components into the ^CgCLP fibers. (d) ELISA quantification of the engineered cells TrEgl+SdBgl (in which free TrEgl and SdBgl enzymes were secreted simultaneously) and TrEgl-Spa2+SdBgl-Spa2 (in which the TrEgl and SdBgl enzymes were co-assembled into the ^CgCLP fibers). All samples were cultivated in M63 medium at 30 °C without shaking for 2 days, and the normalized CLP production of each engineered C. glutamicum (OD₆₀₀ = 0.1, 25 μL) was measured using the amount of CLP produced by wild-type C. glutamicum as a benchmark. n = 3 biologically independent experiments were performed in b, c and d; data are presented as mean ± s.d.
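One way to read the normalization in this caption is as a ratio to the wild-type signal. The Python sketch below is only an illustration under that assumption: each replicate reading is divided by the mean wild-type reading, and the result is reported as mean ± s.d. The function name and all readings are hypothetical and are not data from the paper.

```python
from statistics import mean, stdev
from typing import List, Tuple


def normalize_to_wild_type(
    sample: List[float], wild_type: List[float]
) -> Tuple[float, float]:
    """Scale replicate readings by the wild-type mean; return (mean, s.d.)."""
    wt_mean = mean(wild_type)
    scaled = [x / wt_mean for x in sample]
    # stdev() is the sample standard deviation (n - 1 in the denominator).
    return mean(scaled), stdev(scaled)


# Hypothetical ELISA readings from three biological replicates.
wild_type = [0.50, 0.52, 0.48]
mutant = [0.10, 0.12, 0.08]

m, sd = normalize_to_wild_type(mutant, wild_type)
print(f"normalized CLP production: {m:.2f} ± {sd:.2f}")
```

With this convention the wild-type strain has a normalized mean of 1, so values below 1 indicate reduced CLP production relative to the benchmark.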

Extended Data Fig. 3 Probing the structure and essential residues of Spa2 for CLP assembly in C. glutamicum.

(a) The X-ray crystal structure of Spa2 is arranged in three tandem Ig-like domains: N-domain (pink), M-domain (blue) and C-domain (green). Residues involved in the formation of three intramolecular isopeptide bonds (yellow) and two disulfide bonds (red) are shown as sticks. (b, c) Three intramolecular isopeptide bonds (b) and two disulfide bonds (c) in the Spa2 monomer, rendered from the Spa2 structure with a 2Fo-Fc electron-density map contoured at 1.0 σ. Hydrogen bonds are shown as black dashed lines. (d, e) Genetic manipulation in Δspa2 strains (harboring a plasmid expressing Spa2 or the Spa2 variants K194A, LPLTG_474LALAA478, E158A, D246A, E435A, D246A/E435A, C97A, C380A and C97A/C380A, respectively) to assess the key residues promoting the formation of inter- and intramolecular isopeptide bonds and disulfide bonds in Spa2, by TEM imaging (d) and by quantitative analysis of the amount of ^CgCLP fiber using whole-cell filtration ELISA (detection by anti-Spa2 antibody) (e). P values are indicated above the bars (not significant (NS) P > 0.05, **P < 0.01, ***P < 0.001, ****P < 0.0001; Spa2 mutant strains versus the Spa2 strain using two-tailed t-test, from three biologically independent samples, mean ± s.d.). Scale bars in the TEM images, 200 nm.

Extended Data Fig. 4 Identification of the disulfide bonds and intramolecular isopeptide bonds formation at appropriate sequence locations in Spa2 by LC–MS/MS analysis.

(a) The cartoon shows the critical features of Spa2, including three intramolecular isopeptide bonds in the individual domains, two disulfide bonds in the N-domain (C97-C128) and the C-domain (C380-C432), the YPKN pilin motif in the N-domain, and the LPLTG sortase cleavage sorting signal motif in the C-domain. (b) MS/MS spectrum of the peptide with m/z 1407.4⁴⁺ generated from a pepsin digest of Spa2, containing the disulfide bond between Cys97 and Cys128. (c) MS/MS spectrum of the peptide with m/z 1583.7²⁺ generated from a pepsin digest of Spa2, containing the disulfide bond between Cys380 and Cys432. (d) MS/MS spectrum of the peptide with m/z 1326.9⁴⁺ generated from a pepsin digest of Spa2, containing the internal isopeptide bond between Lys57 and Asn195. (e) MS/MS spectrum of the peptide with m/z 1324.6³⁺ generated from a pepsin digest of Spa2, containing the internal isopeptide bond between Lys203 and Asn318. (f) MS/MS spectrum of the peptide with m/z 754.6⁴⁺ generated from a pepsin digest of Spa2, containing the internal isopeptide bond between Lys355 and Asp466. For (b)–(f), predicted b- and y-type ions (not all included) are listed above and below the peptide sequence, respectively; the disulfide bonds and intramolecular isopeptide bonds are shown as red and yellow bars, respectively.

Extended Data Fig. 5 Characterization of the ^CgCLP fibers mineralized with CdS nanoparticles.

(a) Immunogold labeling and TEM images of CLP fibers. (b) TEM images of the CdS-mineralized CLP fibers. (c) HRTEM images of the nanofiber/CdS composites in the mineralized CLP fibers. The mineralized CLP fibers exhibited a clear lattice spacing of 0.36 nm for the CdS nanoparticles. Scale bars are shown in the panels.

Extended Data Fig. 6 Functional characterization of engineered ^CgCLP with various fusion domains.

(a) TEM images show that Ni-NTA-decorated AuNPs were anchored onto 6His-Spa2 fibers. (b) Confocal microscopic images show the green fluorescence emitted from SpyTag-Spa2 cells to which SpyCatcher-EGFP protein binding partners were covalently attached via the SpyTag-SpyCatcher interaction pair. (c) Confocal microscopic images show the green fluorescence emitted from SpyCatcher-Spa2 cells to which SpyTag-EGFP protein binding partners were covalently attached via the SpyTag-SpyCatcher interaction pair. (d) Fluorescent images and quantification of the immobilization ability of Mfp3Spep-Spa2 cells. Immobilized microspheres (left) on the substrates before (top) and after (bottom) challenge with water jetting at a constant discharge pressure of 5 psi. Quantification of the relative capabilities of different cells (right) to immobilize PS microspheres on the substrate. The number of immobilized microspheres before the mechanical challenge with water jetting was set to 100%. P values are indicated above the bars. Not significant (NS) P > 0.05, ***P < 0.001; Mfp3Spep-Spa2 strain versus Spa2 strain using two-tailed t-test, from three biologically independent samples, mean ± s.d. (e) Confocal microscopic images show the green fluorescence emitted from Venus-Spa2 cells. (f) Endo-1,4-β-glucanase activity of CcEgl-Spa2 cells. The P value of the CcEgl-Spa2 strain versus the Spa2 strain is P = 7.0E-10. ****P < 0.0001. Statistically significant differences were calculated using a two-tailed t-test, from three biologically independent samples, mean ± s.d. Scale bars, 200 nm in a, 2 μm in b, c and e, 100 μm in d.

Extended Data Fig. 7 Engineering C. glutamicum as living material for cellulose degradation and lycopene production.

(a) Construction of the Δspa2Δdec chassis, in which both the spa2 gene (whose deletion renders the cells defective for ^CgCLP formation) and the gene fragment between CEY17_RS03380 and CEY17_RS03560 (which functions in producing the precursor for lycopene production; its deletion changes the color of the cells from yellow to white) were deleted. Colony PCR confirmed markerless deletion of the spa2 gene and of the genes between CEY17_RS03380 and CEY17_RS03560. (b) Construction of the P1 plasmid for lycopene production, and the P2 and P3 plasmids for cellulose degradation. The P1 plasmid co-expressed the dxs gene and the crtEBI gene cluster under the control of an IPTG-inducible promoter; the P2 plasmid simultaneously expressed the Spa2 pilin fusion proteins TrEgl-Spa2 and SdBgl-Spa2 from two independent expression cassettes with the same transcription and translation elements; the P3 plasmid simultaneously expressed TrEgl and SdBgl using the same genetic parts as in the P2 plasmid. (c) The red color of the different engineered cells indicates lycopene accumulation. The C001 strain appeared white because deletion of the region between CEY17_RS03380 and CEY17_RS03560 abolished the synthesis of decaprenoxanthin. The red cells of C002, C003 and C004 indicate that lycopene was successfully accumulated. The C002 cells harbor the P1 plasmid; the C003 cells harbor both the P1 and P2 plasmids; the C004 cells harbor both the P1 and P3 plasmids. (d) TEM images show that C003 cells, which contain the P2 plasmid, enabled co-assembly of TrEgl and SdBgl into the ^CgCLP structure, whereas C001, C002 and C004 cells did not. ^CgCLP was labeled with 10 nm gold particles by immunogold labeling. Scale bars, 200 nm.

Supplementary information

Supplementary Information

Supplementary Text, Figs. 1–21 and Tables 1–6.

Reporting Summary

Supplementary Data 1

Files including detailed information for BBSniffer-detected CLP-BGC-containing strains. Files as follows: Sheet 1, strain classification for the BBSniffer-detected CLP-BGC-containing strains, indicated as ‘Strain classification’; Sheet 2, BBSniffer-detected industrial microorganisms with information on distance scores, growth conditions and genetic manipulation tools, indicated as ‘Candidate strains’; Sheet 3, annotation results of BBSniffer-predicted CLP-BGC in various strains, indicated as ‘BBSniffer predicted strains’; Sheet 4, annotation results of BBSniffer for the experimentally characterized CLP-BGC strains, indicated as ‘Identified strains’; Sheet 5, calculation of the success rate of the established workflow from the results detected by BBSniffer for predicted CLP-BGC in various strains and the experimentally characterized CLP-BGC strains, indicated as ‘Calculation of success rate’.

Supplementary Data 2

Files including detailed information for BBSniffer-detected BC-BGC-containing strains. Files as follows: Sheet 1, strain classification for the BBSniffer-detected BC-BGC-containing strains, indicated as ‘Strain classification’; Sheet 2, BBSniffer-detected industrial microorganisms with information on distance scores, growth conditions and genetic manipulation tools, indicated as ‘Candidate strains’; Sheet 3, annotation results of BBSniffer-predicted BC-BGC in various strains, indicated as ‘BBSniffer predicted strains’; Sheet 4, annotation results of BBSniffer for the experimentally characterized BC-BGC strains, indicated as ‘Identified strains’; Sheet 5, calculation of the success rate of the established workflow from the results detected by BBSniffer for predicted BC-BGC in various strains and the experimentally characterized BC-BGC strains, indicated as ‘Calculation of success rate’.

Supplementary Data 3

Files including detailed information for BBSniffer-detected GV-BGC-containing strains. Files as follows: Sheet 1, strain classification for the BBSniffer-detected GV-BGC-containing strains, indicated as ‘Strain classification’; Sheet 2, BBSniffer-detected industrial microorganisms with information on distance scores, growth conditions and genetic manipulation tools, indicated as ‘Candidate strains’; Sheet 3, annotation results of BBSniffer-predicted GV-BGC in various strains, indicated as ‘BBSniffer predicted strains’; Sheet 4, annotation results of BBSniffer for the experimentally characterized GV-BGC strains, indicated as ‘Identified strains’; Sheet 5, calculation of the success rate of the established workflow from the results detected by BBSniffer for predicted GV-BGC in various strains and the experimentally characterized GV-BGC strains, indicated as ‘Calculation of success rate’.

Supplementary Data 4

Files including detailed information for BBSniffer-detected BMC-BGC-containing strains. Files as follows: Sheet 1, strain classification for the BBSniffer-detected BMC-BGC-containing strains, indicated as ‘Strain classification’; Sheet 2, BBSniffer-detected industrial microorganisms with information on distance scores, growth conditions and genetic manipulation tools, indicated as ‘Candidate strains’; Sheet 3, annotation results of BBSniffer-predicted BMC-BGC in various strains, indicated as ‘BBSniffer predicted strains’; Sheet 4, annotation results of BBSniffer for the experimentally characterized BMC-BGC strains, indicated as ‘Identified strains’; Sheet 5, calculation of the success rate of the established workflow from the results detected by BBSniffer for predicted BMC-BGC in various strains and the experimentally characterized BMC-BGC strains, indicated as ‘Calculation of success rate’.

Source data

Source Data Fig. 4

Statistical source data.

Source Data Fig. 5

Statistical source data.

Source Data Extended Data Fig. 2

Statistical source data.

Source Data Extended Data Fig. 3

Statistical source data.

Source Data Extended Data Fig. 6

Statistical source data.

Source Data Extended Data Fig. 7

Unprocessed gel.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Huang, Y., Wu, Y., Hu, H. et al. Accelerating the design of pili-enabled living materials using an integrative technological workflow. Nat Chem Biol 20, 201–210 (2024). https://doi.org/10.1038/s41589-023-01489-x

Received: 28 September 2022
Accepted: 17 October 2023
Published: 27 November 2023
Issue Date: February 2024
DOI: https://doi.org/10.1038/s41589-023-01489-x
