Introduction

Photosynthesis is the process that harvests solar energy to synthesize organic compounds that can ultimately be utilized by all forms of life to drive cellular processes. Photosynthesis is found in organisms ranging from cyanobacteria to their descendants, including algae and vascular plants1. There are three different photosynthetic pathways in terrestrial plants for fixation of carbon dioxide (CO2): C3, C4, and CAM. C3 photosynthesis is employed by most vascular plants. C4 plants represent about 3% of vascular plants2, while CAM plants represent about 6%3. Both C4 and CAM are add-ons to the C3 pathway. C4 and CAM metabolism are biochemically similar, but the CO2-concentrating steps are separated spatially in C4 and temporally in CAM. C4 minimizes photorespiration by concentrating CO2 in bundle sheath cells, which relies in part on a unique cellular structure (Fig. 1a). Many C4 plants are agronomically important species, such as maize and sugarcane4. CAM plants have high water-use efficiency (WUE, expressed as mmol CO2 mol−1 H2O), which is a direct consequence of the fact that they open their stomata at night and keep them closed during the daytime5. WUE for carbon assimilation in CAM plants is much higher than in C3 or C4 plants: 2–10 times higher than that of C4 plants and 2.6–20 times higher than that of C3 plants6. One of the major differences between C4 and CAM photosynthesis centers on the temporal regulation of CO2 absorption and fixation (Fig. 1).

Fig. 1: Photosynthetic reactions in C4 and CAM plants.

a NADP-malic enzyme type of C4 pathway. b Carbon fixation in CAM plants.

CAM is found in over 400 genera across 36 families of vascular plants7 and has evolved multiple times independently from diverse ancestral C3 plants3. While we know that ecological factors such as drought conditions and CO2 concentration drive the evolution of CAM8,9,10, far less is known about the underlying genetics. Gene family duplication was previously proposed as the driver of CAM evolution through neofunctionalization of newly duplicated paralogous genes3, while others proposed that C4 and CAM photosynthesis may have arisen through the re-organization of metabolic processes already present in C3 plants11,12,13.

The past few years have seen rapid progress in genomics, transcriptomics, proteomics, and metabolomics for an increasing number of plant species, including CAM species. In 2015, the orchid Phalaenopsis equestris became the first CAM species with an assembled genome14. In the same year, the genome of fruit pineapple (Ananas comosus var. comosus) cultivar ‘F153’, which has been cultivated by Del Monte for 50 years, was sequenced, and the evolution of CAM photosynthesis was investigated15. In subsequent work, temporal and spatial transcriptomic profiles of CAM-performing mature leaves were also studied in fruit pineapple16. Recently, we sequenced the genome of the bracteatus pineapple (Ananas comosus var. bracteatus) accession CB5 and assembled it to the chromosomal level17. In 2017, the genome of Kalanchoë fedtschenkoi, a eudicot CAM species, also became available18. Furthermore, multi-dimensional omics data have become available for CAM species such as Agave americana19,20, orchids21,22, and Talinum triangulare23. These advances for species that evolved CAM independently provide an excellent resource for comparative analyses, which will help us better understand the evolution of CAM photosynthesis.

Cis-elements of stomatal movement-related genes

Stomata were first described as ‘pore-like’ structures on the surface of leaves over three centuries ago24. Since the earliest examples of stomata in the leaf fossil record, plants have been evolving in terms of stomatal size and density to maintain maximum leaf conductance as atmospheric CO2 changed25. Stomata play an essential role in controlling transpiration rate and water homeostasis in plants26. Stomatal movement can be stimulated by different environmental factors, such as light, abscisic acid (ABA), pathogens, CO2, and air humidity27. Among them, air humidity and ABA are directly related to water status in plants28. In CAM plants, the diel rhythms of stomatal conductance and transpiration are closely linked to the net CO2-uptake rhythm5.

CAM plants present a reverse stomatal conductance pattern: they assimilate CO2 during the night, when the temperature is low, resulting in a lower evapotranspiration rate compared to C3 and C4 plants29,30. This unique pattern of stomatal movement leads to the higher WUE in CAM plants31. The reverse stomatal rhythm has aroused curiosity and prompted investigation for centuries32. Understanding the regulation of stomatal movement-related genes in CAM species may provide promising opportunities for engineering crops with higher WUE32.

We identified 118 stomatal movement-related genes in A. comosus var. comosus, 95 in A. comosus var. bracteatus, 121 in P. equestris, 140 in Arabidopsis, 123 in rice, and 121 in sorghum (Supplementary Table S1). Based on the GO annotation, the stomatal movement-related genes were divided into three categories: genes involved in stomatal opening, stomatal closure, and regulation of stomatal movement. For genes involved in stomatal movement, the CIRCADIAN CLOCK ASSOCIATED 1 (CCA1)-binding site (CBS; AAAAATCT) and the G-box binding site (CACGTG) occurred at frequencies more than 10% higher than expected by random chance in A. comosus var. comosus (Table 1). The G-box element was enriched in genes from all three categories in A. comosus var. comosus (Table 1). The evening element (EE; AAAATATC) and CBS were enriched in the 123 stomatal movement-related genes in rice, whereas the morning element (MOE; CCACAC) was enriched only in the stomatal opening category in rice (Table 1). Compared with non-CAM species, the ERF73, ERF7, and ABR1 motifs were enriched in CAM species (Supplementary Table S2). Based on these in silico findings, we hypothesize that these different sets of cis-elements regulate stomatal opening during the day and closure during the night in the three C3 and C4 species. Interestingly, stomatal movement-related genes in A. comosus var. comosus have higher frequencies of circadian clock cis-regulatory elements than those in A. comosus var. bracteatus (Table 1).
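The comparison of observed and expected motif frequencies described above can be illustrated with a minimal Python sketch. It is not the pipeline used for Table 1: the null model (equal base frequencies of 0.25, edge effects ignored), the counting of overlapping hits on both strands, and all function names are assumptions made for illustration. Only the motif sequences and the 10% threshold come from the text.

```python
"""Sketch: observed vs. expected motif frequency (per kb) in promoter sequences."""

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def count_motif(seq: str, motif: str) -> int:
    """Count overlapping occurrences of a motif on both strands.

    Palindromic motifs (e.g. the G-box CACGTG) are counted once per site,
    because the set below then holds a single sequence.
    """
    seq = seq.upper()
    total = 0
    for m in {motif, reverse_complement(motif)}:
        start = seq.find(m)
        while start != -1:
            total += 1
            start = seq.find(m, start + 1)
    return total


def observed_per_kb(promoters: list[str], motif: str) -> float:
    """Motif hits per kb, pooled over all promoter sequences."""
    total_bp = sum(len(p) for p in promoters)
    hits = sum(count_motif(p, motif) for p in promoters)
    return 1000.0 * hits / total_bp


def expected_per_kb(motif: str) -> float:
    """Expected hits per kb, assuming each base has probability 0.25."""
    strands = 1 if motif == reverse_complement(motif) else 2
    return 1000.0 * strands * 0.25 ** len(motif)


def is_enriched(observed: float, expected: float, margin: float = 0.10) -> bool:
    """True if the observed frequency exceeds the expectation by more than `margin`."""
    return observed > (1.0 + margin) * expected


if __name__ == "__main__":
    # Hypothetical 2 kb promoter containing one G-box; not real data.
    promoters = ["CACGTG" + "A" * 1994]
    for name, motif in [("G-box", "CACGTG"), ("CBS", "AAAAATCT")]:
        obs = observed_per_kb(promoters, motif)
        exp = expected_per_kb(motif)
        print(name, obs, exp, is_enriched(obs, exp))
```

A background model estimated from the actual base composition of each genome would give a different, and probably more realistic, expectation than the uniform model assumed here.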

Table 1 Frequency of circadian clock-associated motifs (per kb) in 2 kb promoter regions of genes involved in stomatal movement.

In A. americana, the temporal re-programming of particular genes, including genes for CO2 and ABA signaling and for turgor pressure regulation, is essential to regulate stomatal movement19. Comparative transcriptomic analyses between the C3 and CAM Erycina species also showed that genes involved in light and ABA signaling are altered22. The numbers of genes that contain cis-elements involved in several key stomatal movement pathways, such as light, ABA, and stress, are summarized in Table 2. The ABA responsiveness-related motif (ABRE) appeared more frequently than motifs of other signaling pathways in three species with different photosynthetic pathways (Arabidopsis, P. equestris, and sorghum). Previous studies have shown that exogenous ABA can induce stomatal closure and the expression and activity of CAM33,34. Moreover, the stress-related motif (STRE) was the most frequent in the stomatal movement-related genes of rice and pineapple (Table 2). The stress-induced stomatal movement signaling pathway is closely related to the water status of the plant35. Further genomic and molecular analysis of potential stomatal movement genes will enable a comprehensive understanding of the stomatal biology of CAM plants, and might provide candidate genes for engineering crop plants with more sustainable production32,36.

Table 2 The number of genes, and their percentage of the total genes in each genome, that contain cis-elements involved in selected key stomatal movement pathways, annotated at promoter regions of orthologs in A. comosus var. comosus, A. comosus var. bracteatus, P. equestris, Arabidopsis, rice, and sorghum.
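As an illustration of the kind of tally summarized in Table 2, the following Python sketch counts genes whose promoter carries at least one copy of a cis-element and expresses that count as a percentage of a genome's total gene number. The gene identifiers and sequences are hypothetical, and the ABRE core sequence used in the example (ACGTG) is an assumption, since the motif sequences behind Table 2 are not listed here.

```python
"""Sketch: number and percentage of genes whose promoter contains a cis-element."""

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def genes_with_element(promoters: dict[str, str], motif: str) -> set[str]:
    """Return IDs of genes whose promoter has the motif on either strand."""
    motif = motif.upper()
    rc = reverse_complement(motif)
    return {
        gene
        for gene, seq in promoters.items()
        if motif in seq.upper() or rc in seq.upper()
    }


def count_and_percentage(
    promoters: dict[str, str], motif: str, total_genes: int
) -> tuple[int, float]:
    """Return (gene count, percentage of `total_genes`) for one motif."""
    n = len(genes_with_element(promoters, motif))
    return n, 100.0 * n / total_genes


if __name__ == "__main__":
    # Hypothetical promoters keyed by made-up gene IDs; not real data.
    promoters = {
        "geneA": "TTTACGTGTT",  # motif on the forward strand
        "geneB": "GGGGCCCC",    # no motif
        "geneC": "AACACGTAA",   # motif on the reverse strand (CACGT)
    }
    abre_core = "ACGTG"  # assumed ABRE core, for illustration only
    print(count_and_percentage(promoters, abre_core, total_genes=4))
```

Passing the total gene number as a parameter keeps the denominator explicit, since a percentage of the whole genome and a percentage of the stomatal movement-related gene set would differ substantially.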

Diurnal transcript abundance patterns of CAM pathway genes: pineapple as an example

The pineapple genome assembly also allowed the identification of full- and partial-length predicted amino-acid sequences of the key metabolic enzymes comprising the core carboxylation module of CAM responsible for nocturnal fixation of CO215,31,37. Carbonic anhydrase (CA), which catalyzes the conversion of CO2 into HCO3–, is responsible for the first step in CO2 assimilation in both C4 and CAM plants. Enzymes of all three CA subfamilies (α, β, and γ) were identified in the pineapple genome (Supplementary Table S3). Only βCA genes (AccβCA2–1 and AccβCA2–2) were implicated in CAM-specific roles on the basis of their mRNA abundance in green leaf tissue15, indicating that βCA may act as the enzyme that initiates CO2 fixation.

Three genes encoding PPC, the key enzyme responsible for nocturnal CO2 fixation, were identified in the comosus pineapple genome assembly (Supplementary Table S3, Supplementary Fig. S1), all of which are predicted to be localized to the cytosol, as expected15. Among these three PPC genes, AccPPC1 is the most abundant transcript (>3000 FPKM, fragments per kilobase of exon per million fragments mapped) and displayed its highest abundance at 6 pm (>5500 FPKM) in CAM-performing leaf tissues. In T. triangulare, a facultative CAM species, PPC was upregulated 25-fold (to 15,510 rpm, reads per million) at midnight on days 9 and 12 of water limitation, when activity indicative of CAM was observed23. Comparative transcriptomic analyses between the C3 and CAM Erycina species also demonstrated that the PPC gene displayed higher abundance in CAM Erycina than in C3 Erycina22. These results suggest that high levels of PPC transcripts are important for CAM.
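For readers unfamiliar with the units quoted above, the following Python sketch spells out the standard definitions of FPKM and reads per million, together with a simple fold-change ratio. The function names and the example counts are illustrative assumptions; they are not taken from the cited studies, whose normalization pipelines may differ in detail.

```python
"""Sketch: standard expression-normalization arithmetic (FPKM, RPM, fold change)."""


def fpkm(fragments: int, exon_length_bp: int, total_mapped_fragments: int) -> float:
    """Fragments per kilobase of exon per million fragments mapped."""
    return fragments * 1e9 / (exon_length_bp * total_mapped_fragments)


def rpm(reads: int, total_mapped_reads: int) -> float:
    """Reads per million mapped reads (no gene-length correction)."""
    return reads * 1e6 / total_mapped_reads


def fold_change(treated: float, control: float) -> float:
    """Ratio of normalized abundance between two conditions."""
    return treated / control


if __name__ == "__main__":
    # Hypothetical counts: a 1 kb transcript in a library of 1 million fragments.
    print(fpkm(fragments=5500, exon_length_bp=1000, total_mapped_fragments=1_000_000))
    print(rpm(reads=15510, total_mapped_reads=1_000_000))
    print(fold_change(treated=50.0, control=2.0))
```

Because FPKM corrects for transcript length and rpm does not, values in the two units are not directly comparable across studies.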

PPC undergoes reversible N-terminal phosphorylation by a circadian clock-controlled PPC kinase (PPCK), which reduces the sensitivity of the enzyme to allosteric inhibition by L-malate and increases its affinity for its substrate phosphoenolpyruvate (PEP)38,39. In A. americana, which is an obligate CAM plant, the PPCK1 gene displayed a diel transcript abundance pattern, suggesting an important role in the temporal re-programming of CAM20. In K. fedtschenkoi, PPCK1 is also essential for nocturnal CO2 fixation; moreover, knocking down the oscillations in PPCK1 transcript abundance alters the accumulation and periodicity of core circadian clock-related transcripts40. In pineapple, AccPPCK2 was found to exhibit greater mRNA abundance than AccPPCK1, and AccPPCK2 also displayed diel mRNA abundance with high levels at night, suggesting that it functions in CAM15.

In the final metabolic step of phase I, the OAA formed as a result of PEP carboxylation is reduced to malate by NAD(P)-dependent malate dehydrogenase (MDH). Fourteen genes in pineapple encode MDH: three genes (AccMDH4, AccMDH5, and AccMDH8) are predicted to be cytosolic and are strongly expressed in leaves, suggesting their potential to perform functional roles in CAM; four genes (AccMDH10, AccMDH11, AccMDH12, and AccMDH13) are tandemly duplicated and, with the exception of AccMDH13, are expressed at low levels15.

In Arabidopsis, malate is transported into the vacuole by an inward-rectifying, anion-selective ion channel belonging to the aluminium-activated malate transporter (ALMT) family41. In K. fedtschenkoi, a putative ALMT6 gene (Kaladp0062s0038) displays diel mRNA abundance in leaves18. There are eight candidate ALMT family genes in pineapple, including three ALMT9 genes (AccALMT9-1–3) and five ALMT1 genes (AccALMT1-1–5). Only two ALMT9 genes (AccALMT9-1 and AccALMT9-3) showed highly abundant transcripts in photosynthetic leaf tissues. In T. triangulare, ALMT1 has higher steady-state transcript levels only at midday on day 9 of water limitation23. The malate then undergoes protonation, with protons supplied by the tonoplast H+-ATPase and H+-PPiase, and is stored as malic acid. In the daytime, malic acid is effluxed out of the vacuole, possibly through a putative tonoplast dicarboxylate transporter (tDT)42. There are five DT genes (AccDT1–5) in the pineapple genome; AccDT2 and AccDT3 display specifically high transcript abundance during the daytime in photosynthetic leaf tissues, indicating that they may play a role in malic acid efflux in CAM.

Decarboxylation of malate during phase III of the CAM cycle occurs in pineapple primarily via PEP carboxykinase (PCK)30,43, which, following oxidation of malate to OAA by NAD(P)-dependent MDH, decarboxylates OAA to PEP. A single PCK gene (AccPCK1) is present in the pineapple genome, and it is predicted to encode a cytosolic enzyme15. It is an ortholog of AtPCK1 (AT4G37870.1), one of two PCK genes in Arabidopsis, which is expressed in guard cells and is implicated in stomatal closure44. Although extractable PCK activity from pineapple leaves is over 15 times higher than that of the malic enzymes (MEs)45, it remains possible that malate is also decarboxylated, in part, by ME in pineapple46. The comosus pineapple genome contains five ME genes encoding both NAD- and NADP-ME (Supplementary Table S3): two NADP-ME genes (AccNADP-ME1 and AccNADP-ME3) exhibit higher mRNA levels during the daytime in photosynthetic leaf tissues, and one additional NADP-ME gene (AccNADP-ME2) shows no detectable transcript in leaves; two NAD-ME genes (AccNAD-ME1 and AccNAD-ME2), encoding isoforms predicted to be localized to the mitochondria, exhibit moderately abundant mRNA, and AccNAD-ME2 also displayed higher mRNA levels during the daytime15.

The abundant transcripts of ME genes in pineapple suggest that malate decarboxylation also results in the formation of pyruvate, which must then be phosphorylated to PEP by pyruvate phosphate dikinase (PPDK). Consistent with this supposition, a single candidate PPDK1 gene (AccPPDK1) was identified in the pineapple genome15, providing the metabolic flexibility to allow gluconeogenesis via both the PCK and ME/PPDK routes47. AccPPDK1 displayed higher transcript abundance during the daytime. The AtPPDK1 gene encodes an enzyme predicted to be localized to the cytosol, but this enzyme might be localized to either the chloroplast or the cytosol depending upon the production of alternative transcripts arising from two different promoters48. More detailed examination of this locus in pineapple is needed to verify this possibility. Overall, the enzymes making up the carboxylation and decarboxylation pathways of the CAM cycle in pineapple are encoded by gene families that are generally smaller than those in the A. thaliana genome, because pineapple has undergone one fewer whole-genome duplication than has been reported for Arabidopsis and the grass family49.

Circadian clock-associated cis-elements in CAM genes

In most living organisms, internally synchronized circadian clocks make it possible to coordinate behavior and physiology with the 24 h light-dark cycle. CCA1 and LATE ELONGATED HYPOCOTYL (LHY), two single-MYB domain transcription factors, are central to the circadian oscillator of angiosperms50,51. CCA1 and LHY are morning-expressed genes, and they act as repressors of the genes whose promoters they bind. CCA1 and LHY are partially redundant, and they can directly bind to the promoter of TIMING OF CAB EXPRESSION 1 (TOC1), also known as PSEUDO-RESPONSE REGULATOR 1 (PRR1), to negatively regulate its expression52.

Circadian control of CAM has been implicated as a core component in the diel re-programming of metabolism in CAM plants20,53. A comprehensive spatial and temporal survey of gene co-expression clusters in pineapple leaf tissues revealed that CAM pathway genes are enriched with clock-associated cis-elements, suggesting circadian regulation of CAM15,16. At dawn, CCA1 and LHY repress evening-phased genes by binding to CBS and EE49. In addition to CBS and EE, the G-box is also enriched in CCA1 binding regions54,55. TOC1 can bind to MOE as a negative regulator56. In pineapple, all three βCA genes contain CBS in their promoter regions (Table 3), suggesting that this element may contribute to the nighttime and early-morning transcript abundance pattern of βCA genes in photosynthetic leaf tissues. All three copies of PPC genes also contain CBS in their promoter regions, along with MOE or G-box (Table 3). Interestingly, CAM pathway genes in A. comosus var. comosus contain more circadian clock cis-regulatory elements than those in A. comosus var. bracteatus (Table 3). Besides the core CAM genes, more than 40% of transcription factors and transcription co-regulators displayed diel rhythmic expression in pineapple, suggesting that it is a global adaptation57. Using comparative transcriptomic analyses between the C3 and CAM Erycina species, Heyduk and colleagues (2018) demonstrated that some canonical CAM genes were unaltered. However, 149 gene families, including genes involved in light and ABA signaling, had significant differences in network connectivity, indicating that changes in transcriptional cascades are critical for the transition from C3 to CAM in Erycina22.

Table 3 Cis-elements annotated at promoter regions of selected CAM photosynthetic genes in pineapple.

Evolution of CAM photosynthesis

C4 and CAM photosynthesis are innovations that evolved in response to decreasing atmospheric levels of CO2 and water-limiting environments2,9. Of the two, CAM has the higher incidence3, and mutation of CAM genes in CAM species is not lethal40,58. Both C4 and CAM have evolved independently multiple times during angiosperm evolution, even within individual families or genera59,60,61. For example, in the Neotropical family Bromeliaceae, to which pineapple belongs, CAM photosynthesis evolved independently at least four, and probably five, times59.

Recruitment of pre-existing mechanisms underlying C3 photosynthesis has been reported in Gynandropsis gynandra (previously referred to as Cleome gynandra), a C4 plant that is relatively closely related to Arabidopsis62. Furthermore, gene duplication also plays a profound role in the evolution of C4. For example, βCA genes are tandemly duplicated in sorghum63. After duplication, some C4 genes, such as C4 PPC genes, NADP-MDH genes, and PPDK genes, underwent adaptive evolution63.

Comparative analyses demonstrated signatures of convergence in protein sequence and in the re-scheduling of diel transcript abundance of genes involved in nocturnal CO2 fixation, stomatal movement, heat tolerance, the circadian clock, and carbohydrate metabolism14,18,21. Firstly, convergent evolution has been detected in diel cycles of gene transcript abundance18. PPCK is a key regulator of PPC, which it activates by phosphorylation. Both AccPPCK2 and KfPPCK2 showed diel expression patterns18. Secondly, a convergent amino-acid change in PPC2 is shared by K. fedtschenkoi and P. equestris. However, the PPC2 transcript in K. fedtschenkoi is much less abundant than that of the CAM-associated PPC1 gene, so the function of PPC2 has yet to be linked to CAM directly in either K. fedtschenkoi or P. equestris18.

These findings are consistent with the hypothesis that CAM photosynthesis evolved as a result of a re-organization of pre-existing metabolic pathways11,15. These different features were later coordinated to form functional CAM photosynthesis.

Concluding remarks

Genomic studies have led to a renaissance in CAM research. Recent genomic and transcriptomic information from CAM species has improved our understanding of the evolution of CAM photosynthesis14,15,16,17,18,19,20,21. The identified candidate genes provide initial targets for detailed functional studies of how the CAM genes have evolved, through regulation of gene expression, to gain the observed spatial and temporal expression patterns; loss of repressors is certainly involved. Genome editing may make it possible to verify the functions of candidate CAM genes. CRISPR/Cas9 technology will be a powerful tool for generating higher-order mutants of tandemly duplicated genes on the same chromosome, which cannot be generated by traditional mutagenesis methods.

Water loss from stomata in C3 plants can be very substantial under hot and dry conditions. Adjusting the temporal pattern of stomatal movement genes may be a key evolutionary step in switching stomatal opening from the light period to the dark period32. Enrichment of different sets of circadian clock regulatory cis-elements may have played a role in this dramatic shift in gene regulation in pineapple and P. equestris. CAM photosynthesis and its associated high WUE are key evolutionary innovations adapted to arid and/or low-CO2 environments, and this valuable trait is a direct consequence of stomatal closure throughout the hottest and driest part of the 24 h cycle, together with leaf succulence.