Main

In total, we analysed 768 samples from 96 sample sites of 29 different organs from 4 germ-free and 4 colonized mice by liquid chromatography–tandem mass spectrometry (LC–MS/MS) and 16S rRNA gene sequencing (Supplementary Table 1). Mapping the first principal coordinate position of each sample from specific-pathogen-free (SPF) mice onto a three-dimensional (3D) mouse model13 enabled us to visualize the similarity of the microbiome and metabolome through all organs and organ systems (Fig. 1a, b; the 3D model is available as Supplementary Data). Different sections through the gastrointestinal tract had unique microbiome and metabolome profiles. There was a distinct difference between the similarity of the two data types in mouse faecal samples. The metabolome differed between faecal samples and the distal gastrointestinal tract, whereas the microbiome was more similar between faeces and colon or caecum samples.

Fig. 1: Global effect of the microbiome on the chemistry of an entire mammal.

a, Three-dimensional model of mouse organs mapped with the mean first principal coordinate (Extended Data Fig. 1) as a heat map (according to the colour scale), from the germ-free and SPF mice (n = 4 mice each). Ad, adrenal gland; bl, bladder; br, brain; caec, caecum; col, colon; cx, cervix; duo, duodenum; er, ear; f, faeces; ft, feet; hd, hand; jej, jejunum; kd, kidney; lg, lung; lv, liver; mo, mouth; oes, oesophagus; ov, ovary; sto, stomach; tr, trachea; ut, uterus; vg, vagina. b, Mean percentage and total number of unique spectra in each organ sampled from the two mouse groups. c, Relative abundance (normalized to total ion current (TIC)) of the 30 most differential metabolites between the guts of germ-free and SPF mice. The metabolites are coloured as secondary bile acids (blue), primary bile acids (red), soyasaponins (pink), peptides (yellow) and unknown (brown). Annotations are based on spectral matching or molecular network propagation (level two or three16). Stereochemistry of the annotated molecules cannot be discerned using these methods. d, Mean and 95% confidence interval of the Shannon–Wiener diversity of the metabolomic data in each sample from the gastrointestinal tracts of germ-free and SPF mice. Statistical significance between metabolome diversity in the same sample location between germ-free and SPF mice was tested with a two-sided Mann–Whitney U-test, n = 4. *P = 0.028, #P = 0.057. e, Results of meta-mass-shift chemical profiling17 showing the spectral counts of known mass differences between unique nodes in either germ-free or SPF mice. Each mass difference corresponds to the node-to-node gain or loss of a particular chemical group.

Molecular networking of mouse data

To characterize the chemical effect of the microbiome, we subjected the mass spectrometry data to molecular networking12. The algorithm identified 7,913 spectra, of which 14.7 ± 2.2% were observed only in colonized mice and 10.0 ± 0.7% were exclusive to germ-free mice (Fig. 1c, Extended Data Fig. 1). Although the overall profiles showed that the strongest differences between germ-free and SPF mice were in the gastrointestinal tract, molecular networking identified unique chemical signatures from the microbiome in all organs, ranging from 2% in the bladder to 44% in stools (Fig. 1b). The metabolome of the caecum, the main site of microbial fermentation of food, was most markedly affected by the microbiota. Spectral library searching enabled the annotation of 8.9% of nodes in the molecular network11,15 (level two or three, according to previously published standards16). Many of the changes attributed to the microbiome were location-specific, resulting from the metabolism of plant natural products from food and bile acids (Fig. 1c, Extended Data Figs. 2–4, Supplementary Data).

In the upper gastrointestinal tract, the Shannon diversity of the metabolomes of germ-free mice mirrored those of SPF mice; in both sets of mice, diversity was low in the oesophagus and higher in the stomach and duodenum. Upon transition to the caecum, however, the diversity of the two groups of mice began to separate (Fig. 1d). The molecular diversity in the caecum and colon of colonized mice was higher than that of germ-free mice, but this was not the case in the stool samples (Fig. 1d). In the duodenum (the location at which the gallbladder adds bile to the intestine), there was a contrast in microbiome and metabolome diversity: a high metabolome diversity corresponded to a low microbial diversity (Fig. 1d, Extended Data Fig. 1).
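The diversity comparison in Fig. 1d can be illustrated with a minimal sketch. It assumes a Shannon index computed with the natural logarithm over per-feature intensities (the log base and input normalization are assumptions, not stated here) and an exact, tie-free two-sided Mann–Whitney U-test; the function names and input values are hypothetical. With n = 4 mice per group there are only 70 possible rank assignments, so the smallest achievable two-sided P value is 2/70 and the next is 4/70, close to the two values quoted in the Fig. 1 caption.

```python
from itertools import combinations
from math import log


def shannon_index(intensities):
    """Shannon index H = -sum(p * ln p) over features with non-zero intensity."""
    total = sum(intensities)
    return -sum((x / total) * log(x / total) for x in intensities if x > 0)


def exact_mann_whitney_p(a, b):
    """Exact two-sided Mann-Whitney P value by enumerating all rank assignments.

    Assumes no tied values; intended only for very small groups.
    """
    pooled = sorted(a + b)
    if len(set(pooled)) != len(pooled):
        raise ValueError("this sketch does not handle ties")
    n_a, n_b = len(a), len(b)
    rank = {value: i + 1 for i, value in enumerate(pooled)}
    offset = n_a * (n_a + 1) / 2      # smallest possible rank sum for group a
    mean_u = n_a * n_b / 2            # expected U under the null hypothesis
    dev_obs = abs(sum(rank[v] for v in a) - offset - mean_u)
    hits = total = 0
    for combo in combinations(range(1, n_a + n_b + 1), n_a):
        total += 1
        if abs(sum(combo) - offset - mean_u) >= dev_obs:
            hits += 1
    return hits / total


# Hypothetical diversity values for four germ-free and four SPF mice.
print(round(exact_mann_whitney_p([1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0]), 4))  # 0.0286
print(round(exact_mann_whitney_p([1.0, 2.0, 3.0, 5.0], [4.0, 6.0, 7.0, 8.0]), 4))  # 0.0571
```

Complete separation of the two groups therefore yields the lowest P value that this design can produce, which is worth keeping in mind when reading the significance markers.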

Molecular networking enabled meta-mass-shift chemical profiling17 (an analysis of chemical transformations on the basis of parent mass shifts between related spectra without the requirement of knowing the molecular structures) of the gastrointestinal tracts of germ-free and SPF mice. In colonized mice, there was a signature for water loss in the duodenum and jejunum and the loss of H2, acetyl and methyl groups in later parts of the gastrointestinal tract (Fig. 1e). Of all the H2 shifts, 23.1% were associated with bile acids, which indicates that colonization resulted in the oxidation of bile acids (a known microbial transformation)18. Deacetylations were also prevalent in colonized mice, although the metabolites on which this occurred remain unidentified. Germ-free mice had mass gains that corresponded to saccharides in all regions of the gastrointestinal tract (Fig. 1e); these gains were primarily associated with plant natural products, such as soyasaponins and flavonoids. The absence of these sugars in SPF mice implicates the microbiome in their metabolism (Extended Data Figs. 2, 3). A unique mass gain of C4H8 was detected in the jejunum and ileum of SPF mice (Fig. 1e) and 18.2% of spectra with this mass gain were derived from an unknown molecule related to the conjugated bile acid glycocholic acid (GCA) (Fig. 2a). Overall, both germ-free and SPF mice had frequent and diverse mass losses between related molecules, but in colonized mice there were fewer molecules that gained a molecular group (Fig. 1e). This indicates that the microbiome contributed more to the catabolic breakdown of molecules, and less to anabolism. However, we found the addition of C4H8 to GCA to be a particularly interesting anabolic reaction that was dependent on the gut microbiome, and we sought to investigate this further.
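The idea behind mass-shift profiling can be sketched in a few lines: the difference between the parent masses of two connected nodes is compared against the monoisotopic masses of known chemical groups. This is an illustrative sketch, not the published implementation; the group list, the 0.005 Da tolerance and the function names are assumptions. The example uses the m/z 522.379 node reported in this section together with the [M+H]+ of GCA (about 466.316), which is calculated here from its formula C26H43NO6 rather than taken from the text.

```python
# Monoisotopic masses of the elements used below (Da).
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}

# Chemical groups named in the text, as elemental compositions (illustrative list).
KNOWN_SHIFTS = {
    "H2": {"H": 2},
    "H2O": {"H": 2, "O": 1},
    "CH2 (methyl)": {"C": 1, "H": 2},
    "C2H2O (acetyl)": {"C": 2, "H": 2, "O": 1},
    "C4H8": {"C": 4, "H": 8},
    "C6H10O5 (hexose)": {"C": 6, "H": 10, "O": 5},
}


def formula_mass(composition):
    """Monoisotopic mass of an elemental composition such as {"C": 4, "H": 8}."""
    return sum(MONOISOTOPIC[element] * count for element, count in composition.items())


def annotate_shift(mz_a, mz_b, tolerance=0.005):
    """Return the known groups whose mass matches |mz_a - mz_b| within tolerance."""
    delta = abs(mz_a - mz_b)
    return [
        name
        for name, composition in KNOWN_SHIFTS.items()
        if abs(formula_mass(composition) - delta) <= tolerance
    ]


print(annotate_shift(466.316, 522.379))  # ['C4H8']
```

Because only the mass difference is used, a shift can be assigned to a chemical group even when neither node has a structural annotation.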

Fig. 2: Newly identified microbial bile-acid conjugates.

a, Structures and molecular networks of newly identified microbiome-conjugated bile acids, with host-conjugated GCA shown for comparison. The molecular network is coloured by mapping to germ-free or SPF mice (according to the colour legend). Inset highlights the parent masses and mass differences between the newly discovered molecules and GCA. Each node represents a clustered tandem mass spectrum; connections between the nodes indicate relationships through the cosine score, with their width scaled by the cosine score (cut-off minimum of 0.7). Circular nodes are unknown molecules, and arrowheads are spectra with matches in the GNPS libraries. b, Dot plot of the area-under-the-curve abundance of the newly identified and host-synthesized bile-acid conjugates in each SPF mouse (n = 4), through the mouse gastrointestinal tract and its subsections.

Discovery of new conjugated bile acids

Glycine- and taurine-conjugated bile acids were detected in both germ-free and SPF mice. The glycine and taurine amino acids were removed as they passed through the gastrointestinal tract in SPF mice only, which is a known microbial transformation19 (Fig. 1b, Extended Data Fig. 4). The molecular network of conjugated bile acids had several modified forms of these compounds that were present only in colonized mice, including the C4H8 addition that was related to the tandem mass spectra of GCA (Fig. 2a). Our analysis of the tandem mass spectra of three of these SPF-mouse nodes (m/z 556.363, 572.358 and 522.379) showed the maintenance of the core cholic acid, but with a fragmentation pattern that was characteristic of the presence of phenylalanine, tyrosine or leucine through an amide bond at the conjugation site in place of glycine or taurine (Extended Data Fig. 5, Supplementary Table 2). This represents a set of unique amino acid amide conjugations to cholic acid that are mediated by the microbiome, which create the newly identified bile acids phenylalanocholic acid (Phe-chol), tyrosocholic acid (Tyr-chol) and leucocholic acid (Leu-chol). These structures were validated with synthesized standards by retention time and MS/MS matching on several instrument platforms including targeted mass spectrometry (level one matches16) (Extended Data Figs. 5, 6, Supplementary Tables 2, 5). These molecules were detected in the duodenum, jejunum and ileum of SPF mice only, with tenfold-lower levels found in the caecum and colon after targeted mass spectrometry analysis using isotopically labelled internal standards (Supplementary Table 4). The liver-synthesized glycine and taurine conjugates were not only found in these same gut locations, but were also observed in the gallbladder and liver (Fig. 2b, Extended Data Fig. 6). Phe-chol was the most abundant microbial conjugate, on average, across the gastrointestinal tract; it was present at 147.0 nmol g−1 tissue (s.d. ± 99.9) in the jejunum, 83.6 nmol g−1 tissue (s.d. ± 81.3) in the ileum, 4.7 nmol g−1 tissue (s.d. ± 3.4) in the caecum and 11.6 nmol g−1 tissue (s.d. ± 12.2) in the colon. Phe-chol was present at its highest concentration at 447.2 nmol g−1 tissue in a single sample from the jejunum (limit of detection (LOD) in Supplementary Tables 4, 6, 7).
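As a consistency check on the three reported node masses, the expected m/z of each proposed conjugate can be calculated from elemental compositions. The sketch below assumes that the nodes are protonated ions ([M+H]+), which is not stated explicitly here, and treats amide formation as cholic acid plus amino acid minus one water; the function names are hypothetical.

```python
# Monoisotopic masses (Da) and the mass of a proton.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}
PROTON = 1.00727646

CHOLIC_ACID = {"C": 24, "H": 40, "O": 5}
WATER = {"H": 2, "O": 1}
AMINO_ACIDS = {
    "Phe": {"C": 9, "H": 11, "N": 1, "O": 2},
    "Tyr": {"C": 9, "H": 11, "N": 1, "O": 3},
    "Leu": {"C": 6, "H": 13, "N": 1, "O": 2},
}
# Node masses reported in the text.
REPORTED_MZ = {"Phe": 556.363, "Tyr": 572.358, "Leu": 522.379}


def formula_mass(composition):
    """Monoisotopic mass of an elemental composition."""
    return sum(MONOISOTOPIC[element] * count for element, count in composition.items())


def conjugate_mz(amino_acid):
    """Expected [M+H]+ of the amide formed between cholic acid and an amino acid."""
    neutral = (
        formula_mass(CHOLIC_ACID)
        + formula_mass(AMINO_ACIDS[amino_acid])
        - formula_mass(WATER)  # one water is lost on amide-bond formation
    )
    return neutral + PROTON


for name, reported in REPORTED_MZ.items():
    print(name, round(conjugate_mz(name), 3), reported)
```

Under these assumptions the calculated values agree with the reported node masses to three decimal places, which is consistent with the proposed amide structures.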

The decreased abundance of these unique bile conjugates in the lower gastrointestinal tract prompted us to investigate whether there was reabsorption in the ileum or further metabolism by the microbiota. We collected portal and peripheral blood from an additional four SPF and six germ-free mice, and screened for the presence of conjugated bile acids. Both taurocholic acid and GCA were present in the portal and peripheral blood of colonized and sterile mice, but the newly identified amino acid amide conjugates were not detected (Extended Data Fig. 6). Furthermore, incubation of these molecules with an actively growing human faecal batch culture showed that the Tyr-, Phe- and Leu-conjugated bile acids were not deconjugated by the microbiota—even when deconjugation readily occurred on the host-synthesized GCA control, a well-known amidate hydrolase activity of bile acids that is mediated by the human microbiota20 (Extended Data Fig. 6). However, oxidation of the cholate core occurred on all three of the newly identified conjugates, which indicates that they could be modified by microbial enzymes even when no concurrent oxidation of GCA was observed (Extended Data Fig. 6).

In the extensive literature relating to bile acids (comprising more than 42,000 publication records in PubMed21,22,23,24,25,26,27), descriptions of unusual conjugations of bile acids are rare. Through 170 years of research into bile-acid chemistry, the accepted standard has been that mammalian bile acids are amide conjugated by a host liver enzyme (known as bile acid–CoA:amino acid N-acyltransferase (BAAT)) with either glycine or taurine. Here we report amide conjugations with phenylalanine, tyrosine and leucine associated with the microbiome in mice, and show that these compounds are common in humans.

Translation to humans

We performed a search using the Mass Spectrometry Search Tool (MASST) of 1,004 public datasets available in the Global Natural Products Social Molecular Networking (GNPS) database, which revealed spectral matches that correspond to Phe-chol, Tyr-chol and Leu-chol in 28 studies comprising samples from the gastrointestinal tract of both mice (3.2 to 59.4% of all samples) and humans11 (1.6 to 25.3% of all samples) (Extended Data Fig. 7). In data from faecal samples collected for the American Gut Project28, at least one of these unique bile acids was found in 1.6% of human faecal samples; Tyr-chol was the most prevalent (n = 490 samples) (Fig. 3a). These bile acids were found at higher frequency in samples from patients with inflammatory bowel disease or cystic fibrosis, or from infants, than in samples from the American Gut Project (Fig. 3a).

Fig. 3: Presence, synthesis and function of microbial bile-acid conjugates.

a, Percentage of samples that were positive for the newly identified bile acids from GNPS public datasets and from paediatric patients with cystic fibrosis (compared to controls without cystic fibrosis). AGP, American Gut Project28; CF, cystic fibrosis; IBD, inflammatory bowel disease; PS, pancreatic-sufficient; PI, pancreatic-insufficient. The colour coding of the bile acids applies to a–c. b, Abundance of the newly identified conjugates in the PRISM and iHMP (NIH Integrative Human Microbiome Project) datasets31. The statistical significance for the PRISM data was tested using Wald’s test (Crohn’s disease (CD), n = 68 individuals; ulcerative colitis (UC), n = 53 individuals; noninflammatory bowel disease, n = 34 individuals) and for the iHMP dataset with a linear two-sided mixed-effects model. The iHMP comparisons are separated by type of inflammatory bowel disease, and by dysbiotic or nondysbiotic state (for ulcerative colitis, n = 12 dysbiotic and 110 nondysbiotic metabolomes; for Crohn’s disease, n = 48 dysbiotic and 169 nondysbiotic metabolomes; for noninflammatory bowel disease, n = 15 dysbiotic and 107 nondysbiotic metabolomes). Significance is shown using Benjamini–Hochberg-corrected P values. Leu-chol, q = 0.031; Tyr-chol, q = 0.0074; Phe-chol, q = 0.0043. *q < 0.05, **q < 0.01. Boxes represent the interquartile range, notch is the 95% confidence interval of the mean, centre is the median and whiskers are 1.5× the interquartile range. c, Extracted ion chromatograms of Phe-chol from cultured isolates of C. bolteae compared to medium control at 0 h and 96 h (top). Experiment was performed twice. d, The ratio of 13C-Phe-chol:12C-Phe-chol in faecal samples of a mouse fed a high-fat diet with 13C-labelled phenylalanine (blue line) or unlabelled phenylalanine (black line) over time. Grey area indicates a three-day period during which a high-fat diet was fed; red indicates when the high-fat diet was supplemented with Phe. e, Quantitative PCR with reverse transcription data showing the mean and s.e.m. of the gene-expression ratio (ΔΔCt) of Fgf15, Shp, Cyp8b1 and Cyp7a1 to the 36B4 (also known as Rplp0) reference control in the ileum and/or liver of mice gavaged with different bile acids, compared to a mock control (corn oil) after 72 h. Statistical significance was tested against the mock control with a two-tailed t-test (n = 4 or 5 mice per group). CA, cholic acid.

We reanalysed data deposited in the GNPS/MassIVE repository from a previously published study of the mouse microbiome and liver cancer, which enabled us to compare the abundance of the newly identified bile acid conjugates in mice fed a high-fat diet in comparison to their abundance when the mice were fed a normal chow with or without antibiotics29 (Extended Data Fig. 7). The Phe, Tyr and Leu amino acid conjugates were undetectable upon exposure to antibiotics, whereas GCA remained, which supports the role of the microbiome in the newly identified conjugation. In the same study29, Phe-chol and Leu-chol were more abundant in mice fed a high-fat diet, with no change observed in the host-conjugated GCA (Extended Data Fig. 7). We further validated this association in data from a separate study in which atherosclerosis-prone mice fed a high-fat diet also had increased levels of the microbial conjugates, without a corresponding change in the host-produced taurocholic acid (Extended Data Fig. 7). Cystic fibrosis is known to result in insufficient production of pancreatic lipase, microbial dysbiosis and the build-up of fat in the gut30. By reanalysing public data from a cohort of paediatric patients, we found that these compounds were more prevalent in patients with cystic fibrosis (particularly in those with pancreatic insufficiency) than in healthy controls (Fig. 3a). Finally, detection of the newly identified conjugates in patients with inflammatory bowel disease led us to mine metabolome data from the second stage of the human microbiome project (HMP2)31, which focused on differences between controls and patients with inflammatory bowel disease, including patients with Crohn’s disease or ulcerative colitis, which are subtypes of inflammatory bowel disease31 (Fig. 3b, Supplementary Table 8). All three metabolites were significantly higher in the dysbiotic state associated with patients with Crohn’s disease, but not in patients with ulcerative colitis (Fig. 3b, Supplementary Data). Our MASST-based mining of public data from the GNPS database showed that these compounds are not only found in healthy humans but are also enriched in individuals with fatty guts and inflammatory bowel disease, which suggests that these compounds may have a role in (or be symptoms of) gut dysbiosis and human disease.

Microorganisms make the new bile acids

There was a strong positive correlation between the presence of a species of Clostridium and all three bile acids when mice were fed a high-fat diet (Pearson’s r for Phe-chol, r = 0.73; for Tyr-chol, r = 0.50; and for Leu-chol, r = 0.74) (Extended Data Fig. 7, Supplementary Table 3). The clostridia are known to oxidize, epimerize and deconjugate bile acids32,33. We therefore cultured 20 human gut microorganisms (with an emphasis on Clostridium species) in faecal culture medium34 that contained amino acids and cholic acid precursors to screen for production of the newly identified conjugates. The Clostridium bolteae strains WAL-14578 and CC43001B each synthesized both Phe-chol and Tyr-chol (Extended Data Fig. 8). The addition of labelled 13C-phenylalanine to the medium verified that WAL-14578 could synthesize Phe-chol from the amino acid and cholate precursors (Extended Data Fig. 8). Similarly, we fed mice a high-fat diet with 13C-phenylalanine and were able to detect labelled Phe-chol in their faeces, which demonstrates microbial synthesis in vivo and shows that the amino acid precursors could come from the diet (Fig. 3d). C. bolteae is a bile-resistant gut bacterium that is more common in children with autism spectrum disorder35, is associated with abdominal infections36 and, together with Blautia producta, prevented colonization by vancomycin-resistant Enterococcus species in mice37. The production of these bile acids by C. bolteae further verifies their association with the microbiota of the mouse gut, and implicates them as potentially important for intermicrobial interactions in the gut microbiome. However, addition of the newly identified conjugates to batch cultures of human faecal samples did not affect community structure (Extended Data Fig. 8), which led us to investigate how these compounds may affect gut physiology through host receptor signalling.

New bile acids and the farnesoid X receptor

The farnesoid X receptor (FXR) is a key receptor for bile acids that is expressed in the intestine, liver and other tissues. The most-potent naturally occurring agonistic ligand of FXR is chenodeoxycholic acid, whereas tauro-β-muricholic acid is an FXR antagonist38. To assess the ability of the newly identified bile acids to affect human FXR signalling, we established a luciferase reporter assay in human embryonic kidney (HEK)293 cells39. Phe-chol and Tyr-chol were strong human-FXR agonists (Extended Data Fig. 9, Supplementary Table 9). The phenylalanine conjugate (R² = 0.92, half maximal effective concentration (EC50) = 4.5 μM) was about twice as strong an agonist as chenodeoxycholic acid (R² = 0.88, EC50 = 9.7 μM), and the tyrosine conjugate was the most potent of the three (R² = 0.93, EC50 = 0.14 μM). Furthermore, gavage of mice with these compounds increased expression of the FXR effector genes Fgf15 and Shp (also known as Nr0b2) in the intestine (12.2- and 13.3-fold with Tyr-chol at 24 h, P = 0.029 and 0.009; 6.2- and 9.3-fold at 72 h, P = 0.009 and 0.019) (Fig. 3e, Extended Data Fig. 9). Although Shp expression did not change detectably in the liver at 24 h after gavage, levels were increased 2.3-fold after 72 h (P = 0.017) (Fig. 3e, Extended Data Fig. 9). Changes in expression of the bile-acid synthesis genes Cyp7a1 and Cyp8b1 also showed a time-dependent effect. Cyp7a1 was at 9% of control levels at 24 h (P = 0.001) and Cyp8b1 was at 69% (P = 0.004) (Extended Data Fig. 9). At 72 h (after 4 gavages), Cyp7a1 expression was at 8% of control levels (P = 0.004), and for Cyp8b1 the transcript was further reduced to 2% (P = 0.0002) (Fig. 3e). The strong time-dependent reduction in liver Cyp7a1 and Cyp8b1 transcripts indicates that, similar to the primary bile acid cholic acid, gavage of mice with the newly identified compounds reduced the expression of downstream FXR-target genes that are responsible for bile-acid synthesis in the liver. However, the possibility that this effect was due to FXR agonism through release of cholate from amide conjugate hydrolysis cannot be excluded.
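The potency comparison above can be made concrete with a short sketch. It assumes a standard Hill dose–response model with unit slope and a response normalized between 0 and 1; the curve-fitting model actually used for the reporter assay is not described here, so the model, parameter names and function names are illustrative assumptions. Only the EC50 values are taken from the text.

```python
# EC50 values (micromolar) reported in the text for the human FXR reporter assay.
EC50_UM = {
    "Phe-chol": 4.5,
    "Tyr-chol": 0.14,
    "chenodeoxycholic acid": 9.7,
}


def hill_response(conc_um, ec50_um, hill=1.0, bottom=0.0, top=1.0):
    """Response predicted by a Hill model (assumed here, not taken from the paper)."""
    if conc_um <= 0 or ec50_um <= 0:
        raise ValueError("concentrations must be positive")
    return bottom + (top - bottom) / (1.0 + (ec50_um / conc_um) ** hill)


def relative_potency(compound, reference="chenodeoxycholic acid"):
    """Ratio of EC50 values; values above 1 mean the compound is more potent."""
    return EC50_UM[reference] / EC50_UM[compound]


print(round(relative_potency("Phe-chol"), 1))  # 2.2
print(round(relative_potency("Tyr-chol")))     # 69
```

By definition, the modelled response at a concentration equal to the EC50 is half of the maximum, so a lower EC50 corresponds to a more potent agonist.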

Bile-acid metabolism by the microbiome was first described in the 1960s40. The four known mechanisms of microbial metabolism are dehydroxylation, dehydration and epimerization of the cholesterol backbone, and deconjugation of the amino acids glycine or taurine1,41,42. Here, we identify bile-acid transformation by the microbiome mediated by a fifth and completely different mechanism: amide conjugation of the cholate backbone with the amino acids phenylalanine, tyrosine and leucine. Although there are homologues of the human bile-acid-conjugation gene BAAT in clostridial genomes, the microbial enzyme in question remains unknown. Regardless of the mechanism of their synthesis, the newly identified conjugates stimulate human FXR in a cell-based system and, when administered to mice, reduce the expression of FXR-target genes that are responsible for bile-acid production in the liver. Additional studies are needed to understand the health implications of bile-acid reconjugation by the human microbiome and its potential effects on FXR-related diseases.

Conclusion

This study shows that the chemistry of all organ systems is affected by the presence of the microbiome. The strongest signatures come from the gut, particularly via the breakdown of plant natural products from food and the manipulation of bile acids. The microbiome is primarily a catabolic entity, breaking down compounds through the enzymatic removal of chemical groups. However, we found an anabolic reaction that represents a fifth mechanism of bile-acid metabolism by the microbiome, which operates through unique amino acid conjugations of cholic acid. As the connections between humans and our microbial symbionts become increasingly appreciated, a combination of globally untargeted approaches and the development of tools that interlink these datasets (such as the GNPS and MASST analysis infrastructure) will enable the more-efficient characterization of microbial molecules and efficient translation between model animals and human studies, leading to a better understanding of the deep connection between our microbiota, our metabolites and our health.

Reporting summary

Further information on research design is available in the Nature Research Reporting Summary linked to this paper.