Introduction

Since the discovery of the first modified nucleoside, over 150 different RNA modifications have been identified in all kinds of RNA molecules.1,2 Among them, N6-methyladenosine (m6A) is the most common and abundant modification in eukaryotic messenger RNAs (mRNAs); it occurs within the RRACH motif (R = G or A; H = A, C, or U) and is distributed in 3′UTR regions near stop codons.3,4 m6A, mainly catalyzed by the METTL3/METTL14/WTAP methyltransferase core complex,5,6,7 is recognized by several specific readers such as YTH-domain containing family proteins,8,9,10,11,12 hnRNP proteins13,14 and IGF2BPs,15 and functions in various biological processes such as splicing, mRNA export, regulation of mRNA stability, mRNA degradation and translation.16 5-methylcytosine (m5C) and N1-methyladenosine (m1A) are two further modifications occurring along eukaryotic mRNAs; m5C is located in regions downstream of the translation initiation site (TIS), and m1A is enriched upstream of the first splice site around the start codon.17,18 mRNA m5C modification, catalyzed by NSUN2 and specifically bound by ALYREF and YBX1 proteins, has been shown to be necessary for mRNA stability, nuclear-cytoplasmic shuttling and translation, and also to function in the oxidative stress response.17,19,20,21,22 mRNA m1A, although its specific writers and functional readers remain unclear,23,24 has been suggested to facilitate translation initiation and to be a potentially important regulator in response to serum starvation and heat shock conditions.18

Besides the above RNA methylations, N7-methylguanosine (m7G) is one of the most prevalent modifications; it occurs in the transfer RNA (tRNA) variable loop,25 in eukaryotic 18S ribosomal RNA (rRNA)26 and at the cap position of mRNA molecules,27 and is conserved among prokaryotes, eukaryotes and archaea.

The tRNA subset harboring m7G has been primarily identified in yeast2 and recently in mouse.28 This modification is catalyzed by the Trm8/Trm82 complex in yeast and the METTL1/WDR4 complex in human,29 and is found at position 46 (m7G46) in the variable loop of several tRNAs, where it has been reported to stabilize the tRNA tertiary fold.30,31 Moreover, in Saccharomyces cerevisiae, m7G46 has been shown to be crucial for maintaining tRNA levels in the rapid tRNA decay pathway.32 Interestingly, in Thermus thermophilus, m7G46 acts as a tRNA precursor marker and increases the reaction rate of other tRNA modification enzymes.33 Furthermore, a recent study reported that knockout of the m7G46 tRNA methyltransferase complex in mouse embryonic stem cells impairs neural lineage differentiation and affects translation on a global scale.28

m7G is also conserved in eukaryotic 18S rRNA, at the position 1575 in yeast and 1639 in human.34,35 This modification is synthesized by Bud23/Trm112 complex in yeast and WBSCR22/TRMT112 complex in human.36,37 Surprisingly, these methyltransferases are involved in the pre-18S rRNA processing and required for efficient nuclear export of pre-40S ribosomal subunit without the necessity of their catalytic activity,35,36 suggesting that binding of these m7G methyltransferases to pre-18S rRNA serves as a conserved quality control mechanism during 40S ribosomal subunit maturation.

m7G cap modification was first identified on eukaryotic mRNAs27 and is evolutionarily conserved.38 This modification is catalyzed by the RNMT/RAM methyltransferase complex on the transcripts of nearly all RNA polymerase II (Pol II) target genes during the early stages of transcription.39,40 The m7G cap is a critical feature required for stability, splicing, efficient translation and several other aspects of mRNA regulation.41,42 Indeed, the m7G cap is recognized in the nucleus by a Cap Binding Complex (CBC) composed of CBP80 and CBP20, which mainly mediates m7G cap functions in mRNA export, stability, splicing, transcription, the pioneer round of translation and nonsense-mediated decay.43 Moreover, m7G-capped mRNAs, once exported into the cytoplasm, recruit a second m7G cap binding complex composed of eukaryotic translation initiation factors such as eIF4E for the steady-state rounds of translation.44

All these studies highlight the fundamental and significant role of the m7G in fates of mRNA, rRNA and tRNA in cells. Notably, one recent study identified the existence of m7G modification within internal mRNAs in higher eukaryotes, using a differential enzymatic digestion (S1 nuclease or phosphodiesterase I) combined with liquid chromatography-tandem mass spectrometry analysis.45 Zhang et al. also reported the distribution features of the internal mRNA m7G methylome in human cells using both antibody-based m7G-methylated RNA immunoprecipitation sequencing (MeRIP-seq) and chemical-assisted m7G-seq methods.46

In this study, we attempt to establish a high-throughput technique for detecting m7G through individual-nucleotide-resolution cross-linking and immunoprecipitation with sequencing (miCLIP-seq). Combining anti-m7G antibody immunoprecipitation enrichment with ultraviolet (UV) cross-linking, we have successfully profiled internal mRNA m7G modification with enrichment at AG-rich region in 5′UTR in mammalian cells and brain tissues. Strikingly, internal mRNA m7G tends to be dynamically regulated with features of substantially enhanced enrichment in the coding sequence (CDS) and 3′UTR regions versus a decrease in 5′UTR regions upon heat shock and oxidative stress. Moreover, internal m7G promotes mRNA translation, as demonstrated by translation efficiency analyses on m7G-modified mRNAs of both HeLa and 293T cells as well as endogenous PCNA and its exogenous minigene reporter. All together, these findings revealed the internal mRNA m7G methylome in higher eukaryotes and that m7G might serve as a novel epitranscriptomic marker potentially significant for translation and stress response.

Results

18S rRNA m7G1639 identification by a modified miCLIP-seq

To establish a sensitive and specific method for profiling m7G across the transcriptome, we drew on two antibody-dependent library strategies, MeRIP-seq4,47 and miCLIP-seq for m6A.48 We first used a primary mouse anti-m7G antibody to pull down m7G-containing RNAs and then constructed libraries for m7G MeRIP, m7G miCLIP and input control (Fig. 1a and Supplementary information, Fig. S1a, b). We then validated their reliability by testing the previously reported m7G modification at position 1639 (m7G1639) conserved in human 18S rRNA.35 As expected, the m7G MeRIP library showed enrichment around m7G1639 at ~200 nt resolution, while we did not observe any preference along the rRNA in the input library (Supplementary information, Fig. S1c). In addition, because ultraviolet (UV) cross-linking can induce covalent binding of antibody-RNA complexes leading to single-nucleotide mutation or truncation,48,49 fragments from the miCLIP library showed much higher specificity to m7G1639 and less background than those from the MeRIP library (Fig. 1b). Nevertheless, because the optimal UV conditions may differ between anti-m6A and anti-m7G antibodies, we tested 12 different UV conditions for constructing the m7G miCLIP-seq library, using different combinations of cross-linking times, wavelength and energy input, in order to obtain higher specificity (Fig. 1c, d). Intriguingly, increases in all three factors substantially improved the resolution of m7G miCLIP-seq. However, increased wavelength (UV5-8) and energy input (UV9-12) also induced more nonspecific background (Fig. 1c), suggesting that, of the three factors, increased cross-linking time best enhanced m7G miCLIP specificity.

Fig. 1

miCLIP-seq provides higher resolution than conventional MeRIP-seq. a Illustration of m7G miCLIP-seq. Fragmented RNAs are incubated with anti-m7G antibody. After UV cross-linking, covalently bound antibody-RNA complexes are recovered by protein A affinity purification. RNAs are then released by proteinase K digestion and reverse transcribed. During this step, peptide fragments that remain on RNAs lead to cDNA truncations or mutations. The cDNA library is then amplified by PCR and subjected to deep sequencing. b Integrative Genomics Viewer (IGV) tracks showing reads from three different libraries along 18S rRNA. The red arrow represents the m7G residue at position 1639 of 18S rRNA. c IGV tracks showing reads from miCLIP libraries under 12 different UV conditions along 18S rRNA. d Parameter settings of cross-linking times, wavelength and energy input for each UV condition. e IGV tracks showing reads from miCLIP libraries under 12 different UV conditions around m7G1639. The red bar along each track represents the number of truncation reads at each position.

Determination of 18S rRNA m7G1639 using cross-linking-induced truncation and mutation models

Considering that cross-linking-induced truncation or mutation around the binding site provides a way to identify the targets of RNA binding proteins or antibodies at single-nucleotide resolution,50 we examined both cross-linking-induced truncation sites (CITSs) and mutation sites (CIMSs) as in previous studies.51,52 Although truncations around m7G1639 were observed under all UV conditions, truncations from UV3 showed the highest specificity to m7G (Fig. 1e and Supplementary information, Fig. S1d), highlighting UV3 as the optimal UV condition. We also performed CIMS analysis to detect significant mutations, but most significant mutation sites along 18S rRNA proved to be false positives (Supplementary information, Fig. S1e). Unlike for m6A, the mutations around m7G1639 identified in the m7G miCLIP library showed a relatively low mutation ratio (less than 1%), and no dominant mutation type was observed under any UV condition (Supplementary information, Fig. S1f). Collectively, we chose UV3 and the truncation analysis model to perform m7G miCLIP-seq and subsequent bioinformatic analyses, through which we obtained accurate transcriptomic profiles of internal mRNA m7G.
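
As a rough illustration of the truncation model, the following Python sketch counts read 5′ ends per position and reports the position just upstream of the most frequent read start as a candidate cross-link site. The function names, the 1-based coordinates, the one-nucleotide upstream offset and the example read starts are all assumptions made for illustration; they are not taken from the analysis pipeline used in this study.

```python
from collections import Counter


def truncation_profile(read_starts):
    """Count how many reads begin (5' end) at each 1-based position."""
    return Counter(read_starts)


def candidate_site(read_starts, offset=1):
    """Return (position, supporting_reads) for the most frequent read start,
    shifted upstream by `offset` nt (an assumed iCLIP-style convention).
    Ties are broken in favour of the smaller coordinate."""
    profile = truncation_profile(read_starts)
    start, count = max(profile.items(), key=lambda kv: (kv[1], -kv[0]))
    return start - offset, count


# Hypothetical read start coordinates along 18S rRNA (not real data).
starts = [1640, 1640, 1640, 1640, 1502, 1640, 1711, 1640, 1641]
site, support = candidate_site(starts)
print(site, support)
```

A real CITS analysis would additionally test each position for significance against a background model; this sketch only shows the counting step.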

m7G landscape along human mRNA shows conserved enrichment in translation initiation site

The report of internal m7G along mRNA45 suggests a potential function for this modification and raises the challenge of accurately mapping the transcriptome-wide distribution of internal mRNA m7G. To this end, we performed m7G miCLIP-seq under the UV3 condition on 5′cap digested mRNAs from both HeLa and 293T cells (Fig. 2a and Supplementary information, Fig. S2a–c). After trimming m7G clusters near the 5′cap region, we identified 2896 internal m7G clusters within 1635 mRNAs in HeLa cells and 4522 clusters within 2318 mRNAs in 293T cells (Fig. 2b). Intriguingly, internal m7G from both cell lines displayed a similar preference for AG-rich regions (Fig. 2c), with the 3-mer AAG strongly enriched in the 2 nt downstream of the truncation site (Supplementary information, Fig. S2d), reminiscent of the conserved DRACH motif in which m6A modification has been identified.53,54 We also found that internal m7G is enriched in the 5′UTR in close proximity to the translation initiation site (Fig. 2d). Conservation scores showed that nucleotides around m7G are much more conserved than random background in both HeLa and 293T mRNAs (Supplementary information, Fig. S2e),55 suggesting that the internal m7G enrichment at the 5′UTR might also be consistent across different eukaryotic species. Whereas almost two m6A peaks can be found per mRNA,3 most mRNAs contained fewer than three m7G clusters in both human cell lines (Supplementary information, Fig. S2f). Additionally, almost 73.5% of m7G-modified mRNAs in HeLa cells were also detected in 293T cells (Fig. 2e), supporting a strong conservation of internal m7G modification. Gene ontology (GO) analysis also showed that, in both human cell lines, m7G-modified mRNAs are enriched in several fundamental functions, such as translation, regulation of stability and cell-cell communication (Fig. 2f and Supplementary information, Fig. S2g). We further compared the m7G distribution along three randomly selected genes (FKBP3, NDUFA1 and RPUSD3) and found it to be consistent between HeLa and 293T cells (Fig. 2g and Supplementary information, Fig. S2h). All these findings highlight the features of internal m7G, namely its specific enrichment at 5′UTR and AG-rich regions and its strong conservation among different human cell lines.
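
The enrichment of a 3-mer at a fixed distance from truncation sites can be tallied with a few lines of Python. The sketch below is only a minimal illustration: the function names, the 0-based site coordinates, the reading of "2 nt downstream" as an offset of 2, and the toy sequences are all assumptions, and the motif analysis reported here was performed with HOMER rather than with this code.

```python
from collections import Counter


def kmer_at_offset(sequence, site, offset=2, k=3):
    """Return the k-mer starting `offset` nt downstream of a 0-based site,
    or None when the window runs past either end of the sequence."""
    start = site + offset
    if start < 0 or start + k > len(sequence):
        return None
    return sequence[start:start + k]


def kmer_frequencies(records, offset=2, k=3):
    """Tally k-mers over (sequence, site) pairs; return relative frequencies."""
    counts = Counter()
    for sequence, site in records:
        kmer = kmer_at_offset(sequence.upper(), site, offset, k)
        if kmer is not None:
            counts[kmer] += 1
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()} if total else {}


# Hypothetical sequences with a truncation site at index 2 (not real data).
records = [
    ("CCGUAAGGA", 2),
    ("UUCAAAGCU", 2),
    ("GGACUCUGA", 2),
    ("ACAAG", 4),  # window falls off the end and is skipped
]
freqs = kmer_frequencies(records)
print(freqs)
```

Comparing such frequencies with those obtained from randomly chosen positions would give a simple measure of enrichment.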

Fig. 2

The transcriptome landscape of m7G reveals the enrichment of internal m7G near start codon in human mRNAs. a Workflow for mRNA m7G miCLIP-seq. b Pie chart showing percentage of mRNA m7G clusters in each non overlapping segment in HeLa (left panel) and 293T (right panel). Segments are annotated by Ensembl database (hg38, release 86) and the cap region is defined as 50 nt downstream from the 5′ terminus. c Motif analysis of internal mRNA m7G in HeLa (upper panel) and 293T (lower panel) mRNAs by HOMER. d Distribution of internal m7G across HeLa (blue) and 293T (red) mRNA segments. Each segment is normalized according to its average length from Ensembl database. e Venn plot showing the numbers of mRNAs displaying internal m7G in HeLa and 293T cells. f Bar plot chart showing the significant GO terms for HeLa mRNAs containing internal m7G. g Representative mRNAs displaying m7G clusters in both HeLa (upper panel) and 293T (lower panel) cells. m7G clusters are highlighted with red arrows along the transcripts. Transcript architecture is shown beneath, with thin parts corresponding to UTRs and thicker ones to CDS; exon-exon junctions are indicated by vertical black lines. h Boxplot chart showing increased translation efficiency (TE) for HeLa (upper panel) and 293T (lower panel) mRNAs displaying m7G within 5′UTR, CDS or 3′UTR compared to mRNAs without internal m7G (Mann–Whitney U test). Translation efficiency results are downloaded from GSE63591 and GSE65778

To determine whether the m7G enrichment at the 5′UTR resulted from a bias toward the 5′ terminus of mRNA fragments, we examined truncations in the input library using the same analysis pipeline. In contrast to the m7G enrichment at the 5′UTR for truncations from the miCLIP library, pseudo-truncations from the input library were randomly distributed along mRNAs (Supplementary information, Fig. S2i, j), supporting the robustness of the truncation model for detecting internal mRNA m7G. In addition, as a technical replicate, the MeRIP library for 5′cap digested mRNAs displayed a highly similar internal m7G distribution both across the whole transcriptome (Supplementary information, Fig. S2k) and on the randomly picked ALG5 mRNA (Supplementary information, Fig. S2l). These results strongly suggest that the enrichment of internal m7G at the 5′UTR of mRNAs detected by miCLIP-seq is authentic and does not result from systematic bias.

One recent report from Zhang et al. employed MeRIP-seq approach for m7G profiling and demonstrated that m7G peaks are distributed within 5′UTR and mainly accumulated at 3′UTR.46 However, we consistently observed m7G enrichment at 5′UTR. Through comparing the m7G clusters from our miCLIP-seq and the peaks from MeRIP-seq (Zhang et al.) in 293T cells, we found that almost 50% methylated mRNAs were identified by both methods (Supplementary information, Fig. S2m), while the overlapping degree between the clusters and peaks was relatively low possibly due to differences in experimental conditions and cell line heterogeneity among different laboratories.56 Indeed, even for the well-established m6A MeRIP-seq, a low overlapping rate of peaks was also observed between two labs (Supplementary information, Fig. S2n, MeRIP-seq data were downloaded from GSE10233657 and GSE37003,3 respectively).
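
The difference between overlap at the transcript level and overlap at the level of position can be made concrete with a small Python sketch. The function names, the half-open (transcript, start, end) interval convention and the coordinates below are hypothetical and chosen only for illustration; they do not reproduce the comparison performed in this study.

```python
def shared_fraction(items_a, items_b):
    """Fraction of distinct items in items_a that also occur in items_b."""
    items_a, items_b = set(items_a), set(items_b)
    return len(items_a & items_b) / len(items_a) if items_a else 0.0


def overlapping_fraction(clusters, peaks):
    """Fraction of clusters overlapping at least one peak on the same transcript.

    Intervals are (transcript, start, end) with half-open coordinates.
    """
    if not clusters:
        return 0.0
    hits = 0
    for tx, start, end in clusters:
        if any(tx == p_tx and start < p_end and p_start < end
               for p_tx, p_start, p_end in peaks):
            hits += 1
    return hits / len(clusters)


# Hypothetical coordinates, not taken from either data set.
clusters = [("TX_A", 100, 120), ("TX_B", 50, 70), ("TX_C", 10, 30), ("TX_D", 5, 25)]
peaks = [("TX_A", 400, 600), ("TX_B", 60, 260), ("TX_E", 0, 200)]

transcript_level = shared_fraction((c[0] for c in clusters), (p[0] for p in peaks))
location_level = overlapping_fraction(clusters, peaks)
print(transcript_level, location_level)
```

In this toy case half of the clustered transcripts also carry a peak, but only a quarter of the clusters overlap a peak, which mirrors the kind of discrepancy described above.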

As m7G modification might be unstable at high pH, we then tested the potential effect of the pH condition on m7G decapping activity using both Tobacco Decapping Enzyme (TAP) (Enzymax, pH 7.5) and RppH (NEB, pH 8.8); the m7G level was similar after treatment with either enzyme (Supplementary information, Fig. S2o). Additionally, using oligos containing either cap m7G (Oligo-cap-m7G) or internal m7G (Oligo-internal-m7G), we found that the RppH enzyme used for constructing the miCLIP library in our study specifically removed cap m7G but not internal m7G; an oligo without m7G (Oligo-G) was used as a control (Supplementary information, Fig. S2p). TAP enzyme-based miCLIP-seq in 293T cells also displayed an m7G landscape consistent with the RppH-based analysis (Supplementary information, Fig. S2q). Overall, these findings exclude a possible influence of the pH conditions on m7G decapping enzyme activity.

As cap m7G has been reported to be involved in translation initiation,42 we downloaded the published ribosome profiling data sets of HeLa9 and 293T58 cells to assess potential effects of internal m7G on translation. Intriguingly, we found that translation efficiency (TE) of internal m7G-modified mRNAs was significantly enhanced compared to that of the unmodified mRNAs in both human cell lines (Supplementary information, Fig. S2r). Nonetheless, the m7G abundance and translation efficiency did not show a positive correlation (Supplementary information, Fig. S2s). We also examined the translation efficiency of mRNAs with internal m7G modifications located in different regions (5′UTR, CDS and 3′UTR), and found that m7G in all regions contributed to mRNA translation efficiency, using unmethylated mRNAs as controls (Fig. 2h). Altogether, these results strongly support that internal m7G facilitates mRNA translation independent of its abundance and position in human cell lines.
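
The comparison of translation efficiency between modified and unmodified mRNAs can be sketched in Python as follows. The definition of TE as the ratio of ribosome footprint abundance to mRNA abundance, the pseudocount, the function names and the toy values are assumptions for illustration; the text does not specify how TE was computed in the downloaded data sets. The sketch computes only the Mann–Whitney U statistic, not a P value.

```python
def translation_efficiency(ribo, rna, pseudocount=1.0):
    """Ribosome footprint abundance divided by mRNA abundance
    (assumed definition), with a pseudocount to avoid division by zero."""
    return (ribo + pseudocount) / (rna + pseudocount)


def mann_whitney_u(x, y):
    """U statistic of sample x versus sample y: the number of (x, y) pairs
    in which the x value is larger, counting ties as 0.5."""
    return sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)


# Hypothetical (ribosome footprint, mRNA) abundances, not real data.
modified = [(30, 9), (45, 14), (19, 4)]
unmodified = [(10, 9), (14, 14), (7, 15)]

te_mod = [translation_efficiency(r, m) for r, m in modified]
te_unmod = [translation_efficiency(r, m) for r, m in unmodified]

u = mann_whitney_u(te_mod, te_unmod)
print(u, len(te_mod) * len(te_unmod))
```

A U statistic equal to the number of pairs means that every modified mRNA in the toy example has a higher TE than every unmodified one; a statistics library would be needed to turn U into a P value.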

m7G profile is evolutionarily conserved among mammalian mRNAs

The well conserved distribution of m7G clusters in the human transcriptome prompted us to explore if this modification is evolutionarily conserved in mammals. Thus, we performed the m7G miCLIP-seq on 5′cap digested mRNAs extracted from mouse embryonic stem cells (mESCs) (Fig. 3a). As expected, the internal m7G modifications in mESCs were similarly distributed along mRNAs and also displayed preference to AG-rich regions as in human mRNAs (Fig. 3b and Supplementary information, Fig. S3a, b). Additionally, most of the enriched KEGG pathways and GO terms for m7G-containing mRNAs in mESCs were also related to some fundamental pathways involving metabolism, DNA repair, cell cycle, etc. (Fig. 3c and Supplementary information, Fig. S3c). We then investigated the number of m7G-modified mRNAs shared among human cell lines and mESCs and found that 432 m7G-modified mRNAs exist in all cell lines (Supplementary information, Fig. S3d), most of which are related to translation, mRNA regulation and splicing (Supplementary information, Fig. S3e). We next compared the m7G profile between mESCs and human cell lines in the representative gene, ECHS1 (Supplementary information, Fig. S3f) and found that the m7G enrichment in 5′UTR was conserved among human and mouse cell lines. All together, these results indicate that both m7G profile and features in mRNAs are evolutionarily conserved among mouse and human cell lines. We next selected Nanog and Sox2, two well-known stem cell markers crucial for cell proliferation, renewal, pluripotency and the determination of cell fate,59,60,61 to investigate their m7G modification profiles (Fig. 3d). Intriguingly, internal m7G was not only enriched in 5′UTR, but also showed additional enrichment in CDS and 3′UTR in mRNAs of both genes, indicating a dynamic feature of internal m7G under different cell states.

Fig. 3

m7G methylome shows conservation between human and mouse. a Pie chart showing percentage of mRNA m7G in each non-overlapping segment in mESCs. Segments are annotated by Ensembl database (mm10, release 68). b Distribution of internal m7G across mRNA segments in mESCs. c KEGG pathway analysis of mRNAs displaying internal m7G. Pathways are filtered by P < 0.005 and FDR < 0.1 as default. d Two key embryonic mRNAs displaying m7G clusters in mESCs. e Pie chart showing percentage of mRNA m7G in each non overlapping segment in mouse brain. f Distribution of internal m7G across mRNA segments in mouse brain. g KEGG pathway analysis of mRNAs displaying internal m7G. Pathways are filtered by P < 0.005 and FDR < 0.1 as default. h Two brain specific mRNAs displaying m7G clusters in mouse brain

We then analyzed the mRNA m7G profiles from fresh mouse brain tissues. Unexpectedly, although internal mRNA m7G in mouse brain was consistently enriched in the 5′UTR and in AG-rich regions, as in human cell lines and mESCs (Fig. 3e and Supplementary information, Fig. S3g, h), we also observed m7G enrichment along the full 5′UTR region (Fig. 3f). Both KEGG pathway and GO analyses (Fig. 3g and Supplementary information, Fig. S3i) further revealed functional enrichment of these m7G-containing mRNAs in several neuronal function-related pathways, such as hormonal signaling pathways and those related to neurogenic disorders and nervous system development. Moreover, we noticed that proteins encoded by m7G-containing mRNAs are predominantly located in membrane and cytoplasmic compartments and participate in multiple transport pathways, such as protein- and vesicle-related transport (Supplementary information, Fig. S3i). These results strongly support a crucial role of internal mRNA m7G in multiple signaling pathways in the brain. We next selected four specific markers for different encephalic regions reported in previous studies (Thy1: hematopoietic stem cells; Mbp: oligodendrocytes; Aldoc: astrocytes; Gad1: interneurons)62 and found that all four transcripts displayed 5′UTR enrichment of m7G. Nevertheless, internal m7G was also enriched within the CDS and 3′UTR, at a relatively lower level than the clusters in the 5′UTR (Thy1, Mbp, Aldoc and Gad1) (Fig. 3h and Supplementary information, Fig. S3j).

All together, these results demonstrated that the internal mRNA m7G methylome and distributive features are highly conserved between human and mouse. Moreover, internal m7G in CDS and 3′UTR might also be involved in the regulation of biological pathways in mESCs and mouse brain.

Internal m7G is enriched in CDS and 3′UTR of mRNAs upon stress conditions

As m6A and m5C have been reported to synergistically enhance p21 expression in oxidative stress-induced cellular senescence,19 and m6A is involved in modulating HSP expression during the heat shock response,63 we explored whether internal m7G can serve as a novel epitranscriptomic regulator of mRNA metabolism under stress conditions. We performed m7G miCLIP-seq on 5′cap digested mRNAs extracted from H2O2- or heat shock-treated human 293T cells (Supplementary information, Fig. S4a–c). Surprisingly, we found an increased number of mRNAs displaying multiple m7G clusters (Supplementary information, Fig. S4d), indicating that both H2O2 and heat shock treatments enhance the abundance of internal m7G. Additionally, compared to the predominant 5′UTR enrichment under the control condition, internal m7G under both H2O2 and heat shock treatments displayed relatively increased enrichment in the CDS and 3′UTR (Supplementary information, Fig. S4c, e, f), indicating that stress induces an m7G modification profile distinct from the control profile. We further found that over 53% and 55% of m7G-containing mRNAs were specifically modified under oxidative stress and heat shock treatment, respectively (Supplementary information, Fig. S4g). Besides the shared fundamental pathways, m7G-containing mRNAs were enriched for DNA repair pathways under oxidative stress and for pathways of cellular response to stress under heat shock (Fig. 4a), indicating that stress-induced m7G modifications might be involved in the signaling pathways for specific stress responses.

Fig. 4

Stresses enhance enrichment of internal m7G in CDS and 3′UTR in 293T cells. a Barplot chart showing the GO terms for mRNAs displaying oxidative stress-enhanced internal m7G (left panel, n = 1924) and heat shock-enhanced internal m7G (right panel, n = 2009) in 293T cells. b Barplot chart showing percentage of mRNA m7G in each non-overlapping segment under oxidative stress (upper panel) and heat shock (lower panel) in 293T cells. c Numbers of sustained (blue) and heat/oxidation-triggered internal m7G (red) across mRNA segments under oxidative stress (upper panel) and heat shock (lower panel) in 293T cells. d Two mRNAs displaying oxidative stress-enhanced internal m7G modification in control (blue) and hydrogen peroxide-treated (red) 293T cells. e Two mRNAs displaying heat shock-enhanced internal m7G modification in control (blue) and heat shock-treated (red) 293T cells. f Western blot assay showing increased METTL1 protein expression after heat shock treatment. γ-actin serves as a protein loading control. g Quantification of three biological replicates of western blot assays showing increased METTL1 protein expression after heat shock treatment (n = 3). h Dot blot assay showing m7G levels of 293T mRNAs under control and heat shock conditions. Methylene blue staining indicates equal RNA loadings. i Quantification of three biological replicates of dot blot assays showing increased m7G levels of 293T mRNAs upon heat shock treatment. (n = 3). Data represent mean ± SEM. The P values were calculated by a two-tailed unpaired Student’s t test

To further unveil the regulatory role of m7G under each cellular stress, we defined m7G clusters present in both untreated and treated 293T cells as sustained m7G clusters, and m7G clusters present only in treated 293T cells as heat/oxidation-triggered m7G clusters. Intriguingly, compared to the sustained m7G, the heat/oxidation-triggered m7G displayed a dramatic increase in the CDS and 3′UTR (Fig. 4b, c), highlighting potential roles of internal m7G in the CDS and 3′UTR in regulating the stress response. Hence, we selected marker genes for DNA repair (GADD45A,64 BTG265 and XRCC266) and reactive oxygen species (ROS)-related pathways (PCNA,67 CBX468 and SOD269) and examined their mRNA m7G modification profiles upon oxidative stress. These transcripts displayed a strong internal m7G enrichment in the CDS and 3′UTR, validating the findings from the transcriptome-wide analysis under H2O2 treatment (Fig. 4d and Supplementary information, Fig. S4h, i). For heat treatment, we selected markers for the heat shock response (HSPH1,70 CHORDC171 and MAPKAPK272) and for SUMOylation associated with heat stress (BMI1,73 TOPORS74 and NUPL275) and found similar enrichment along mRNAs upon heat shock treatment (Fig. 4e and Supplementary information, Fig. S4j, k). Altogether, our findings provide strong evidence that the distribution patterns of internal m7G, in particular the enrichment in the CDS and 3′UTR, are dynamically regulated under stress conditions, which highlights the potential regulatory functions of m7G modification in response to stress.
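
The split into sustained and triggered clusters can be expressed as a simple interval comparison. In the Python sketch below, the rule that a treated cluster counts as "present in control" when it overlaps any control cluster on the same transcript is an assumption, as are the function names, the half-open coordinates, the region labels and the toy data; the exact matching criterion used in this study is not stated in the text.

```python
from collections import Counter


def overlaps(a, b):
    """True if two (transcript, start, end, ...) half-open intervals overlap."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def classify_clusters(treated, control):
    """Split treated clusters into sustained (also found in control)
    and triggered (found only after treatment)."""
    sustained, triggered = [], []
    for cluster in treated:
        if any(overlaps(cluster, c) for c in control):
            sustained.append(cluster)
        else:
            triggered.append(cluster)
    return sustained, triggered


# Hypothetical clusters: (transcript, start, end, region); not real data.
control = [("TX1", 40, 60, "5'UTR"), ("TX2", 10, 30, "5'UTR")]
treated = [
    ("TX1", 45, 65, "5'UTR"),
    ("TX1", 900, 920, "CDS"),
    ("TX2", 1500, 1520, "3'UTR"),
    ("TX3", 300, 320, "CDS"),
]

sustained, triggered = classify_clusters(treated, control)
triggered_by_region = Counter(c[3] for c in triggered)
print(len(sustained), dict(triggered_by_region))
```

Tallying the two classes by transcript segment, as in the last lines, yields the kind of per-segment counts that are compared between sustained and triggered clusters.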

As METTL1 has been revealed to be a potential m7G methyltransferase,46 we next explored whether METTL1 is involved in the induction of m7G formation upon stress conditions. After analyzing METTL1 protein expression under control and heat shock conditions, we found that METTL1 protein expression upon heat shock was increased (Fig. 4f, g). Moreover, m7G level was also increased upon heat treatment determined by dot blot (Fig. 4h, i). Taken together, these results highlight a correlation between METTL1 and m7G level changes upon heat shock condition, suggesting a potential role of METTL1 in regulating m7G formation under stress condition.

Internal m7G promotes mRNA translation

To further understand potential regulatory functions of internal m7G in mRNA translation, we examined PCNA gene transcript displaying a strong enrichment of internal m7G in CDS and 3′UTR upon oxidative stress (Fig. 5a). We assessed the m7G enrichment of PCNA mRNA before and after H2O2 treatment through m7G RIP-qPCR and found that m7G enrichment in PCNA mRNA was significantly increased upon oxidative stress (Fig. 5b). This trend was independent of transcription activity, since PCNA mRNA expression level was unchanged between treated and untreated conditions (Supplementary information, Fig. S5a). We next examined endogenous PCNA protein level and revealed an increased protein expression after H2O2 treatment (Fig. 5c, d). We additionally performed polysome fractionation assay to assess the percentage of PCNA mRNA in polysome fractions under both control and oxidative stress conditions, and results revealed that upon H2O2 treatment, the percentage of PCNA mRNA is higher than that of control condition (Fig. 5e), suggesting that stress-induced internal m7G enrichment enhances the translation efficiency of PCNA mRNA.

Fig. 5

Internal m7G promotes PCNA mRNA translation. a IGV tracks displaying miCLIP-seq read distribution along PCNA in control (blue) and hydrogen peroxide-treated (red) 293T cells. b qPCR assay showing m7G enrichment of endogenous PCNA mRNA upon H2O2 treatment (n = 3). c Western blot assay showing increased PCNA protein expression after H2O2 treatment. γ-actin serves as a protein loading control. d Quantification of western blot assays from five biological replicates showing increased PCNA protein expression after H2O2 treatment (n = 5). e qRT-PCR analysis of the enrichment of endogenous PCNA mRNAs from the polysome fraction upon H2O2 treatment (n = 3). f Reporter constructs of Luciferase-WT-PCNA and Luciferase-MUT-PCNA. A 100 nt fragment containing m7G site from 3′UTR of PCNA was inserted after Luciferase CDS. Luciferase-MUT-PCNA was identical to Luciferase-WT-PCNA, except that one guanine (bold red) within the m7G motif was mutated to adenosine (bold red). g m7G enrichment of exogenous Luciferase-WT-PCNA reporter RNA upon H2O2 treatment assayed by qPCR (n = 3). h Dual luciferase assay showing increased translation efficiency of Luciferase-WT-PCNA reporter upon H2O2 treatment (n = 3). i Dual luciferase assay showing decreased translation efficiency of Luciferase-MUT-PCNA reporter compared to Luciferase-WT-PCNA reporter in standard condition (n = 3). Data represent mean ± SEM. The P values were calculated by a two-tailed unpaired Student’s t test

To further validate the promoting effect of internal m7G at 3′UTR on translation, we constructed two minigene reporters by inserting a 100 nt m7G-containing fragment of 3′UTR region of PCNA mRNA (located from 1026 to 1125 bp) into the 3′UTR region of the Luciferase reporter gene (Fig. 5f). The guanine (G) site of the m7G motif in Luciferase-WT-PCNA was mutated to adenosine (A) in Luciferase-MUT-PCNA. We then determined the m7G enrichment of Luciferase-WT-PCNA mRNA before and after H2O2 treatment. Just as the endogenous PCNA mRNA, Luciferase-WT-PCNA showed an enhanced m7G enrichment upon oxidative stress compared to control (Fig. 5g), while the level of m7G enrichment on m7G spike-in RNA showed no significant difference between both conditions (Supplementary information, Fig. S5b, c). We next performed a dual luciferase assay to assess the translation efficiency of Luciferase-WT-PCNA upon oxidative stress (Fig. 5h), and found that the translation efficiency of Luciferase-WT-PCNA was significantly increased, while the mRNA level was unaltered after H2O2 treatment (Supplementary information, Fig. S5d). Moreover, we observed that the translation efficiency of Luciferase-MUT-PCNA was significantly decreased compared to that of Luciferase-WT-PCNA (Fig. 5i). All together, these results suggest an important role of the stress-induced internal m7G modification in promoting translation.

Discussion

Although m7G modification at the 5′cap position within mRNAs is well studied,27,42 its existence in internal mRNAs was only revealed recently.45 Here, we described a sensitive and specific technique to map m7G modification along mRNAs using a modified miCLIP protocol based on previously reported method for unraveling m6A transcriptome.48,53 Using this well-validated approach (m7G miCLIP-seq), we established the internal mRNA m7G transcriptomic profile and characterized the evolutionarily conserved distributive features of internal m7G clusters with an enrichment at 5′UTR among different cell lines and species.

The individual-nucleotide resolution UV cross-linking and immunoprecipitation (iCLIP) technique was first designed to identify the targets of RNA binding proteins. As UV cross-linking can induce covalent binding of antibody-RNA complexes leading to single-nucleotide mutation or truncation,76 iCLIP-based miCLIP has been developed to detect m6A sites at single-nucleotide resolution.53,54 In our study, we found that the well-conserved mRNA internal m7G motif, AAG, was strongly enriched within the 2 nt downstream of the truncation site, which illustrates that m7G modifications along mRNAs can be identified by conserved-motif filtering, an approach similar to that used for m6A detection. Recently, an AlkAniline-seq method has been shown to enable detection of m7G and 3-methylcytosine (m3C) in RNA at single-nucleotide resolution.77 This method builds on an RNA-seq library obtained from positively selected RNA fragments generated by chemical reactions that cleave RNAs one nucleotide before the modification site. Using an anti-m7G antibody-based MeRIP-seq approach, a recent report by Zhang et al. demonstrated that m7G peaks were enriched within 5′UTR and showed an enhanced accumulation at 3′UTR.46 In support, we also observed the enrichment of m7G in 5′UTR. Moreover, a stress-triggered distribution of m7G clusters in 3′UTR was also observed. It should be noted that, similar to m6A identified by MeRIP-seq from two labs (Supplementary information, Fig. S2n),3,57 the m7G overlap rate between miCLIP clusters and MeRIP peaks was relatively high at the transcript level but lower at the level of mRNA location (Supplementary information, Fig. S2m), which is probably attributable to differences in experimental conditions and cell line heterogeneity among different laboratories.56 The detailed reason for this low overlap remains unclear and needs future investigation. In addition, Zhang et al. also developed a more precise chemical-assisted m7G-seq approach that selectively converts internal m7G sites to abasic sites, which induces misincorporation at these sites during reverse transcription and allows single-nucleotide-resolution mapping of m7G modification.46 Since our m7G miCLIP-seq relies on direct identification of m7G residues using a specific antibody followed by enrichment of RNA fragments containing m7G modifications before library construction, a combination of miCLIP and chemical-assisted approaches might be needed to fully establish the transcriptomic profile of internal m7G in future studies.

The internal mRNA m7G architectural pattern is evolutionarily conserved, since its enrichment in 5′UTR is consistently observed among different human and mouse cell lines. The 5′UTR has been shown to contain secondary structures, including those involved in translation regulation,78 and m6A in this region can alter the local secondary structure, enabling the binding of RNA binding proteins, which leads to altered gene expression and RNA maturation.13,79 Furthermore, it has been reported that m6A modification at the mRNA 5′UTR promotes cap-independent translation through bypassing 5′cap-binding proteins80 and increases Hsp70 mRNA translation under heat stress response.81 Our findings suggest that internal m7G enriched at this position might act as a structural switch participating in mRNA translation regulation, just like m6A. This is certainly an important and interesting area worthy of future investigation.

Besides, the internal m7G within mRNA transcripts showed a conserved preference for AG-rich regions along mRNAs in both human and mouse. Interestingly, this AG-rich preference is strongly maintained among all m7G modifications occurring within tRNA, rRNA and mRNA. In support, one recent study reported that m7G is located in the variable loop of a subset of 22 tRNAs and within the RAGGU sequence motif,28 which also displays the AG preference. Furthermore, m7G modification within eukaryotic 18S rRNA was found at position 1575 in yeast and 1639 in human,34,35 which also exhibits a preference for the AG sequence motif. Hence, we highlight that all m7G modifications share a similar AG preference. In the case of m5C, which is mainly located within a CG context without a typical motif,17 accumulating evidence has shown that NSUN2 serves as the methyltransferase for m5C in both tRNAs and mRNAs.17,82,83 Recently, METTL1 has been revealed to regulate cell migration via its catalytic activity by acting as an m7G methyltransferase for miRNAs and mediating their formation.84 Moreover, according to Zhang et al.46 and our findings, METTL1 might also be involved in the formation of internal m7G in mRNAs, and further mediate translation and stress response. Together, these findings highlight the potential role of METTL1 in catalyzing m7G modifications on various RNA species. However, although the METTL1-WDR4 complex does install a subset of internal m7G, tRNA-like structures are required to mediate m7G methylation on mRNA. In addition, a wider motif variation is found for mRNA m7G modification than for tRNA m7G modification obtained by m7G-seq.46 These observations highlight the possibility that METTL1 is not the only internal mRNA m7G methyltransferase, or that it may rely on RNA binding partners other than WDR4 to recognize mRNAs. Thus, further investigation of whether the reported tRNA and rRNA m7G methyltransferases also catalyze the formation of internal mRNA m7G will be necessary to unravel the molecular mechanism responsible for the formation of m7G.

Finally, we revealed that, upon both heat shock and oxidative stress, internal m7G transcriptomic features are dynamically changed with a relatively lower enrichment in the 5′UTR and an enhanced enrichment within CDS and 3′UTR mRNA regions. Compared with the chemical-assisted method, the detection sensitivity of antibody-based method relies on the density and stoichiometry of the modified sites, and hence these inducible sites enriched in CDS and 3′UTR regions are feasibly detected under stress conditions. As 3′UTR is an important region for crucial regulation like gene expression and translation,85,86 the enrichment of internal m7G at 3′UTR might suggest a potential role of internal m7G in mRNA expression or translation under stress conditions. Our minigene reporter assay also demonstrated that the enhanced internal m7G upon stress indeed positively regulates translation efficiency of the reporter mRNA with m7G site in 3′UTR, whereas mutation of this site results in a significant decrease in translation efficiency, suggesting an important role of internal m7G enrichment in promoting translation.

In summary, our study highlights that internal m7G modification is evolutionarily conserved in mammalian mRNAs and dynamically regulated under stress conditions. Specifically, its enrichment positively regulates translation efficiency. Hence, internal mRNA m7G may be a novel epitranscriptomic regulator fundamentally significant in mRNA translation in cells coping with various challenges, which opens up a novel area for m7G modification in mRNA biology.

Materials and methods

Cell culture and treatment

The human cervical carcinoma cell line (HeLa) and the human embryonic kidney cell line (293T) were both obtained from Cell Resource Center, Chinese Academy of Medical Sciences and cultured in Dulbecco’s Modified Eagle Medium (Gibco, C11995500BT) supplemented with 10% Fetal Bovine Serum (Shanghai ExCell Biology Inc., FSS500) and 0.5% penicillin/streptomycin (Sigma, P4333) in humidified incubator at 37 °C under 5% CO2. Cells were routinely checked and treated with MycAway Treatment (1000×)-Mycoplasma Elimination Reagent (Yeasen, 40607ES03) to prevent mycoplasma contamination. The mESCs were cultured in DMEM supplemented with 20% knockout serum replacement (KOSR) (Thermo Scientific, 10828028), 0.1 mM β-mercaptoethanol (Sigma, 516732), 1000 units/ml leukemia inhibitory factor (Sigma, SRP9001), and 0.1 mM nonessential amino acids (Sigma, M7145). Mouse brain was isolated from wild-type mice (C57BL/6) purchased from Beijing Vital River Laboratory Animal Center according to the institution rules. Heat shock was induced on 293T cells by transferring plates from 37 °C incubator to 42 °C incubator and maintaining the plates at 42 °C for 1 h. Oxidative stress was induced on 293T cells by treating plates with 2.4 mM hydrogen peroxide (H2O2) for 1 h. Stress efficiency was assessed by western blot.

Western blot

To determine stress efficiency and protein expression, cells were lysed in RIPA buffer (0.5% NP-40, 50 mM Tris-HCl, pH 8, 150 mM NaCl, 1 mM EDTA) supplemented with proteinase and phosphatase inhibitors. Cell debris was removed from the lysate by centrifugation at 13000 g at 4 °C for 20 min. Protein concentration was measured using the Pierce Coomassie Plus (Bradford) assay kit (Thermo, 23236). 30 μg of protein lysate was loaded, separated by 8%, 10% or 15% SDS-PAGE and immunoblotted with the corresponding primary antibodies: anti-p21 (Abcam, ab109199), anti-HSP70 (Bioworld, BS6446), anti-PCNA (Santa Cruz, SC25280), anti-METTL1 (Proteintech, 14994-1-AP), anti-γ-ACTIN (Santa Cruz, SC65638), and HRP-conjugated goat secondary antibodies anti-mouse IgG (Beyotime, A0216) or anti-rabbit IgG (Beyotime, A0208). Protein levels were visualized by enhanced chemiluminescence (GE Healthcare, RPN2232) according to the manufacturer’s instructions. The membranes were finally imaged and quantified using ImageJ software (Version 2.0.0). The P value was calculated using GraphPad Software (Version 7.0a).

RNA preparation

Total RNA was extracted from different cell lines using TRIzol® (Invitrogen, 15596026) followed by poly (A)+ RNA enrichment using Poly(A)Purist™ MAG Kit (Thermo, AM1922). The enrichment procedure was performed twice according to the manufacturer’s instructions. Poly (A)+ RNAs were further purified after removing DNA contamination by TURBO™ DNase (Thermo, AM2238) treatment at 37 °C for 1 h following the manufacturer’s protocol. Purified poly (A)+ RNAs were either treated with RNase-free water or with RppH (NEB, M0356S) decapping enzyme as mentioned by the manufacturer’s protocol with some modifications. 1 µg RNA was incubated at 37 °C for 1 h in a 100 µL final reaction mixture composed of 1 × ThermoPol reaction buffer (NEB, B9004S), 50 U RppH enzyme, 10 U RNase Inhibitor (Beyotime, R0102). Alternatively, purified poly(A)+ RNAs were treated with Tobacco Decapping Plus 2 (TAP) enzyme (Enzymax, 94) as previously reported.46 3 µg fragmented poly (A)+ RNAs were incubated with 5 µL 10 × Decapping Reaction Buffer (100 mM Tris-HCl pH 7.5, 1 M NaCl, 20 mM MgCl2, 10 mM DTT), 1 µL 50 mM MnCl2, 2 µL SUPERase-In RNase Inhibitor (Invitrogen, AM2696) and 4 µL TAP decapping enzyme in a final volume of 50 µL. The reaction was incubated at 37 °C for 2 h. Decapped RNAs were isolated by proteinase K (Roche, 3115836001) digestion at 55 °C for 1 h with 1500 rpm shaking followed by phenol-chloroform extraction and overnight ethanol precipitation. RNA concentration was assessed by Qubit 4 Fluorometer (Invitrogen, Q33226). Decapping efficiency was assessed by both dot blot and HPLC. Before MeRIP or miCLIP, RNAs were fragmented into a size of around 100 nt using RNA fragmentation reagents (Invitrogen, AM8740).

Dot blot

To detect m7G levels in RNAs before and after decapping, upon stress or immunoprecipitation, equal amounts of RNA were loaded onto an Amersham Hybond-N+ positively charged nylon transfer membrane (GE Healthcare, RPN303B) using the Bio-Dot Apparatus (Bio-Rad). After blotting, the membrane was allowed to air-dry for 5 min and was UV cross-linked for 3 min at 254 nm in a CL-1000 Ultraviolet Cross-linker (UVP). The membrane was blocked with 5% non-fat dried milk in TBST (50 mM Tris, 150 mM NaCl, 0.1% Tween 20, pH 7.5) followed by incubation with the primary mouse anti-m7G antibody (MBL, RN017M) and the HRP-conjugated goat anti-mouse IgG (Beyotime, A0216) secondary antibody. RNA levels were visualized by enhanced chemiluminescence (GE Healthcare, RPN2232) according to the manufacturer’s instructions. As a loading control, the membrane was stained with methylene blue solution (0.02% methylene blue, 0.3 M sodium acetate, pH 5.2) for 10 min to 1 h and washed with distilled water. The membranes were finally imaged and quantified using ImageJ software (Version 2.0.0). The P value was calculated using GraphPad Software (Version 7.0a).

UHPLC-MRM-MS/MS analysis

To determine m7G levels in RNAs before and after decapping treatment or immunoprecipitation by UHPLC, a total of 200 ng RNAs were digested with 0.1 U Nuclease P1 (Sigma, N8630) and 1 U calf intestinal phosphatase (NEB, M0290S) in a 50 μL final reaction mixture and incubated at 37 °C for 5 h with 1500 rpm shaking. The digested RNA solutions were then filtered using 3 kDa MW cutoff ultrafiltration tubes (Pall, OD003C35) and subjected to UHPLC-MS/MS analysis for detection of m7G and rG. The UHPLC-MS/MS analysis was performed with an Agilent 1290 UHPLC system coupled with a 6410 triple quadrupole mass spectrometer (Agilent Technologies, Palo Alto, CA). A Zorbax Eclipse Plus C18 column (100 mm × 2.1 mm I.D., 1.8 μm particle size, Agilent Technologies) was used for UHPLC separation of mononucleotides. The UHPLC gradient was as follows: 0–2.0 min, 5.0% B; 2.0–4.0 min, 5–20.0% B; 4.0–6.0 min, 20% B; 6.1–8.0 min, 5.0% B. Solvent A was an aqueous solution of 0.1% formic acid, and solvent B was 100% methanol. The mass spectrometer was operated in the positive ion mode. A multiple reaction monitoring (MRM) mode was adopted: m/z 298→166 for m7G (collision energy, 10 eV), m/z 284→152 for rG (5 eV). The injection volume for each sample was 5 μL, and the amounts of m7G and rG were calibrated using standard curves. Nitrogen was used as the nebulizing and desolvation gas for MS detection. The nebulization gas was set at 40 psi, the flow rate of desolvation gas was 9 L/min, and the source temperature was set at 300 °C. Capillary voltage was set at 4000 V. High purity nitrogen (99.999%) was used as collision gas. Each sample was analyzed at least three times.
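The standard-curve calibration mentioned above can be illustrated with a minimal sketch. It assumes a linear detector response and an ordinary least-squares fit; neither is stated in the text, and the function names and the idea of reporting an m7G/G ratio are illustrative assumptions rather than the authors' actual procedure.

```python
def fit_line(amounts, areas):
    """Least-squares fit of peak area = slope * amount + intercept.

    Assumes a linear response over the calibrated range (an assumption).
    """
    n = len(amounts)
    mean_x = sum(amounts) / n
    mean_y = sum(areas) / n
    sxx = sum((x - mean_x) ** 2 for x in amounts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(amounts, areas))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def amount_from_area(area, slope, intercept):
    """Invert the standard curve to convert a peak area into an amount."""
    return (area - intercept) / slope


def m7g_to_g_ratio(m7g_area, g_area, m7g_curve, g_curve):
    """Ratio of calibrated m7G to calibrated rG for one sample.

    m7g_curve and g_curve are (slope, intercept) tuples from fit_line.
    """
    m7g = amount_from_area(m7g_area, *m7g_curve)
    g = amount_from_area(g_area, *g_curve)
    return m7g / g
```

Each nucleoside gets its own curve because the two MRM transitions (298→166 and 284→152) respond differently; the replicate injections would simply be averaged before or after this conversion.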

m7G MeRIP-seq

m7G MeRIP was performed as previously described with some modifications.47 Briefly, 2 μg of RNAs were mixed with 6 μg of anti-m7G antibody (MBL, RN017M) and 1 mg Dynabeads Protein A (Invitrogen, 1001D) in 500 μL MeRIP buffer (150 mM NaCl, 10 mM Tris-HCl, pH 7.4, 0.1% NP-40) supplemented with 10 U RNase Inhibitor and incubated at 4 °C for 4 h with rotation. After 3 washes with MeRIP buffer, beads were resuspended in 200 μL MeRIP buffer and m7G-containing RNA fragments were eluted and purified using TRIzol reagent. After RNA recovery, cDNA libraries were constructed using the KAPA Stranded mRNA-seq kit (KAPA, KK8401). All samples were sequenced on Illumina HiSeq X-Ten platform at Novogene Co., LTD.

m7G miCLIP-seq

m7G single-base resolution high-throughput sequencing was carried out using previously reported methods48,53 with some modifications. 2 μg of RNAs were mixed with 6 μg of anti-m7G antibody (MBL, RN017M) in 250 μL Immunoprecipitation buffer (50 mM Tris, pH 7.4, 100 mM NaCl, 0.05% NP-40) supplemented with 10 U RNase Inhibitor and incubated at 4 °C for 2 h with rotation. The mixture was then transferred to a clear flat-bottom 96-well plate (Corning, CLS3997) on ice and irradiated in a CL-1000 Ultraviolet Cross-linker (UVP) using different times (2.5, 3, 4, 5 times), wavelength (254 and 365 nm) and energy (0.15 and 0.25 J/cm2) for total RNA or irradiated four times with 0.15 J/cm2 at 254 nm for poly (A)+ RNAs. The mixture was then immunoprecipitated with 1 mg Dynabeads Protein A (Invitrogen, 1001D) resuspended in 250 μL immunoprecipitation buffer at 4 °C for 2 h with rotation. After extensive washing, on-bead end-repair and linker ligation, RNA fragments were eluted from the beads by proteinase K (Roche, 3115836001) digestion at 55 °C for 1 h with 1500 rpm shaking followed by phenol-chloroform extraction and overnight ethanol precipitation. As described previously,76,87 eluted RNAs were reverse transcribed using Superscript III reverse transcriptase (Invitrogen, 18080093). First-strand cDNAs were size-selected on a 6% TBE-Urea gel (Invitrogen, EC6865BOX), and both circularization and re-linearization of cDNAs were respectively performed using CircLigase II (Epicentre, CL9021K) and BamHI (NEB, R0136) enzymes. Resulting libraries were then PCR amplified using Accuprime Supermix 1 enzyme (Invitrogen, 12342010) for 15 cycles and further size-selected on an 8% TBE gel (Invitrogen, EC6215BOX). All samples were sequenced on Illumina HiSeq X-Ten platform at Novogene Co., LTD.

Plasmid construction and mutagenesis

To generate the minigene reporter construct, a 100 nt fragment located from 1026 to 1125 bp of the 3′UTR region containing the m7G site of the human PCNA gene was amplified by PCR from human 293T cDNA, purified using ISOLATE II PCR and Gel Kit (Bioline, BIO-52059) and cloned into the 3′UTR region of pGL4.23 vector (Promega) using Gibson Assembly® Cloning Kit (NEB, E5510S) according to the manufacturer’s instructions. The 3′UTR wild-type plasmid was thus named: Luciferase-WT-PCNA. The mutant plasmid was generated by mutating the guanine (G, methylated site) to adenosine (A) in the wild-type plasmid using QuikChange Lightning Site-Directed Mutagenesis Kit (Agilent Technologies, 210518). The 3′UTR mutant plasmid was thus named: Luciferase-MUT-PCNA. Both plasmids were purified using the Endo-Free Plasmid Midi Kit (CWBIO, CW2105S) and validated by DNA sequencing. All primers used for cloning and mutagenesis are listed in Supplementary information, Table S1.

Plasmid transfection

Wild-type and mutant plasmids were co-transfected with a control pRL-TK plasmid (Promega) using Lipofectamine™ 2000 transfection reagent (Invitrogen, 11668019) following a reverse transfection method. Lipofectamine™ 2000 and plasmid DNA were used at a ratio of 2:1 (v/m), and mixed in serum- and antibiotics-free DMEM separately. Briefly, a DNA mix was prepared by mixing experimental vector and control vector at a ratio of 30:1 (m/m) in 500 μL serum- and antibiotics-free DMEM and plated to a 6-cm dish. A Lipofectamine mix was simultaneously prepared by adding Lipofectamine™ 2000 transfection reagent in 500 μL serum- and antibiotics-free DMEM. Lipofectamine mix was incubated at room temperature for 5 min and plated within the 6-cm dish containing the DNA mixture. The dish was gently rocked and incubated at room temperature for 20 min to form transfection complexes. Subsequently after incubation, 1 million 293T cells were resuspended in 1 mL DMEM supplemented with 20% FBS without penicillin/streptomycin (P/S), plated to the dish containing DNA-Lipofectamine complexes and incubated at 37 °C under 5% CO2 for 8 h prior to the replacement of transfection medium with complete DMEM (with 10% FBS, 0.5% P/S). 24 h post transfection, cells were subjected to stress conditions as mentioned in Cell culture and treatment section followed by Dual luciferase and qPCR assays.

In vitro transcription

Three 60-mer oligos previously designed46 (5′-CCAATAAAATATTAACCACCAATAAAATATTAACCAAGATCCACCAATAAAATATTAACC-3′) were synthesized by in vitro transcription using MEGAscript™ SP6 Transcription Kit (Invitrogen, AM1330) following the manufacturer’s instructions with either GTP or m7GTP (Sigma Aldrich, M6133) as unique sources for the single G site for synthesizing Oligo-G and Oligo-internal-m7G, respectively. The in vitro transcription reaction was conducted with GTP and cap analog for the Oligo-cap-m7G. m7G incorporation efficiency was assessed by dot blot with Oligo-G as the negative control. The Oligo-internal-m7G was also used as spike-in for the exogenous PCNA mRNA m7G-IP-qPCR validation.

Dual luciferase assay

Luciferase assays were performed using the Dual-Luciferase® Reporter Assay System (Promega, E1910) following the manufacturer’s instructions. Briefly, cells were grown to exponential phase and were directly lysed using 500 μL of 1× passive lysis buffer in the culture plate pre-washed with PBS. After sufficient lysis for 15 min at room temperature with gentle rotation, a 20 μL sample aliquot was taken for luminescence measurements using a Modulus™ single tube multimode reader (Turner BioSystems) as follows: 100 μL of the firefly luciferase reagent (LARII) was added to the sample aliquot, and the sample was equilibrated for 5 s and measured. Then, 100 μL of the firefly quenching and renilla luciferase reagent (Stop&Glo) was added to the sample aliquot, and the sample was equilibrated for 5 s and measured. The relative fold change in firefly to renilla luciferase ratios (Fluc/Rluc) was calculated.
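The relative Fluc/Rluc fold change can be sketched as follows. The choice of reference sample (for example the untreated control, or Luciferase-WT-PCNA when comparing against the mutant) and the function names are assumptions made for illustration.

```python
def fluc_rluc_ratio(fluc, rluc):
    """Firefly signal normalized to the co-transfected Renilla control."""
    if rluc <= 0:
        raise ValueError("Renilla signal must be positive")
    return fluc / rluc


def relative_fold_change(sample, reference):
    """Fold change of the sample Fluc/Rluc ratio over a reference ratio.

    sample and reference are (fluc, rluc) tuples of luminescence readings.
    """
    return fluc_rluc_ratio(*sample) / fluc_rluc_ratio(*reference)
```

Dividing by the Renilla signal first corrects for differences in transfection efficiency and cell number, so that the remaining fold change reflects the firefly reporter itself.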

Quantitative polymerase chain reaction (qPCR)

To assess m7G enrichment and mRNA levels of endogenous PCNA mRNA and exogenous minigene reporters under control and stress conditions, cells were cultured and treated as described in the Cell culture and treatment section. mRNAs were prepared as described in the RNA preparation section, without decapping treatment or fragmentation. m7G immunoprecipitation was performed as described in the m7G MeRIP-seq section, and equal amounts of m7G-IPed and Input RNA were converted to cDNA using RevertAid™ First Strand cDNA Synthesis Kit (Thermo, K1621). All PCR reactions were carried out using ChamQ™ Universal SYBR qPCR Master Mix (Vazyme, Q711-02) according to the manufacturer’s protocol and quantified using an AriaMx Real-Time PCR system (Agilent Technologies). RNA levels were normalized to the expression of the internal control gene GAPDH. The relative fold changes in m7G IPed/Input ratios were calculated using the 2^(−ΔΔCt) method.88 The primer pairs used for qPCR are listed in Supplementary information, Table S1.
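As an illustration of the 2^(−ΔΔCt) calculation applied to IPed/Input ratios, the sketch below assumes that ΔCt is taken as Ct(IP) minus Ct(Input) within each condition and that ΔΔCt compares the stress condition with the control. The exact definition used in the study is not given in the text, so this is one plausible reading, and the function names are hypothetical.

```python
def delta_ct(ct_ip, ct_input):
    """Delta Ct of the IPed sample relative to its matched Input."""
    return ct_ip - ct_input


def enrichment_fold_change(ct_ip_treated, ct_input_treated,
                           ct_ip_control, ct_input_control):
    """Relative m7G IPed/Input enrichment (treated over control).

    Implements 2 ** (-ddCt), which assumes roughly 100% PCR efficiency.
    """
    ddct = (delta_ct(ct_ip_treated, ct_input_treated)
            - delta_ct(ct_ip_control, ct_input_control))
    return 2.0 ** (-ddct)
```

A lower Ct in the IPed sample after treatment, at unchanged Input Ct, therefore translates into a fold change greater than 1.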

Polysome fractionation

The polysome fractionation under control and oxidative stress conditions was carried out using previously reported methods8,89 with some modifications. Eight 10-cm plates of 293T cells were prepared as described in the Cell culture and treatment section (4 plates for the control condition and 4 for oxidative stress). 10 min before harvest, 100 µg/mL cycloheximide (CHX) (Sigma, C1988) was added to the media. The medium was discarded and cells were washed once with ice-cold PBS. The cells were then harvested with ice-cold PBS supplemented with 100 µg/mL CHX followed by centrifugation at 1000 g for 5 min. The cell pellet of each sample was resuspended in 500 μL lysis buffer (20 mM Tris pH 7.4, 100 mM KCl, 5 mM MgCl2, 100 µg/mL CHX, 1% Triton X-100, 1:100 protease inhibitor (Roche) and 40 U RNase Inhibitor) and incubated at 4 °C for 20 min with rotation. After incubation, cell debris was removed from the lysate by centrifugation at 13000 g at 4 °C for 20 min. The supernatant was collected and DNA contamination was removed by TURBO™ DNase treatment at 37 °C for 15 min following the manufacturer’s protocol. The treated lysate was centrifuged at 13000 g at 4 °C for 15 min, the supernatant was collected and the A260 absorbance of each sample was measured in order to adjust all samples to the same OD value.

A 10–50% w/v sucrose gradient was prepared for each sample in lysis buffer without Triton X-100. Adjusted lysates were loaded onto the sucrose gradient and centrifuged at 190,000 g at 4 °C for 90 min (Beckman, rotor SW41Ti). The gradients were then fractionated into 12 fractions of 1 mL using a fractionating system at 1 mL/min speed. The A254 absorbance of each fraction was measured by the optical unit of the fractionating system. Fractions corresponding to non-ribosome and polysome were collected, independently combined, and total RNAs were extracted using TRIzol® following the manufacturer’s protocol. Total RNAs were converted to cDNAs using RevertAid™ First Strand cDNA Synthesis Kit and mRNA percentages in the polysome fraction were obtained by qRT-PCR assay and analyzed following the previously reported method.90 GAPDH was used as an internal control.

Sequencing data analysis

For miCLIP-seq, read pre-processing was performed essentially as previously reported.48 Adapter sequences were trimmed by fastx_clipper from fastx_toolkit (http://hannonlab.cshl.edu/fastx_toolkit). Low-quality bases were filtered by fastq_filter.pl, a custom perl script from CLIP Tool Kit (CTK),91 and reads shorter than 24 nt were discarded. Following the previously reported criteria for processing paired-end data, the forward reads were demultiplexed based on 5′ barcodes for individual replicates and collapsed by fastq2collapse.pl to remove PCR-amplified duplicates, whereas the reverse reads were reverse-complemented and processed in the same way as their forward counterparts. Finally, the random barcodes of the remaining reads were stripped by stripBarcode.pl and moved to read headers for downstream processing by the CITS pipeline. The remaining reads were mapped to the human rRNA (derived from the UCSC database), human genome (version hg38) and mouse genome (version mm10), respectively, with BWA (v0.7.10), allowing an error rate of ≤0.06 (substitutions, insertions, or deletions) per read (bwa aln -n 0.06 -q 20).

For MeRIP-seq, adapter sequences were trimmed off for all raw reads using the Cutadapt software (version 1.2.1).92 Reads that were less than 35 nt in length or contained an ambiguous nucleotide were discarded by Trimmomatic (version 0.30).93 For library of total RNA, the remaining reads were aligned to human rRNA using TopHat (version 2.0.9).94 For library of decapped mRNA, the remaining reads were aligned to hg38 using HISAT2 (version 2.1.0).95 Only uniquely mapped reads with mapping quality score > 20 were kept for both libraries.

Identification of truncation sites and high-confidence m7G clusters

To identify the m7G clusters, truncation calling was performed as previously reported with minor modifications.52 The identified truncations were then filtered with a cutoff of P < 0.001/(aligned reads per million) to eliminate bias from sequencing depth. The remaining truncations were extended by 15 nt both upstream and downstream and defined as m7G clusters. Further, to obtain high-confidence m7G clusters, only clusters found in both replicates were used for downstream analysis. Each cluster was then annotated based on the Ensembl database (hg38, release 86 or mm10, release 68) by applying BEDTools’ intersectBed (version 2.16.2).96 We then defined the cap region as the first 50 nucleotides from the transcription start site (TSS) of each mRNA and obtained internal m7G clusters by removing the clusters located in the cap region. The pseudo-truncation sites were identified by the same pipeline from regular RNA-seq and filtered by requiring more than 10 reads. All m7G clusters for HeLa, 293T, mESCs, mouse brain and different stress conditions are presented in Supplementary information, Tables S2–S7.
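The depth-adjusted cutoff, the 15 nt extension and the removal of cap-region clusters can be summarized in a short sketch. Several details are assumptions made only for illustration: positions are 0-based and transcript-relative (distance from the TSS), a cluster is treated as lying in the cap region when its truncation site falls within the first 50 nt, and the `Truncation` record and function names are hypothetical. Replicate intersection and Ensembl annotation are not shown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Truncation:
    transcript: str
    pos: int         # 0-based offset from the TSS (assumed coordinate system)
    p_value: float   # P value reported by the truncation caller


def p_cutoff(aligned_reads, base=0.001):
    """Depth-adjusted threshold: 0.001 / (aligned reads per million)."""
    return base / (aligned_reads / 1_000_000)


def call_internal_clusters(sites, aligned_reads, flank=15, cap_len=50):
    """Filter truncations and extend them into internal m7G clusters.

    Returns (transcript, start, end) tuples as half-open intervals.
    """
    cutoff = p_cutoff(aligned_reads)
    clusters = []
    for site in sites:
        if site.p_value >= cutoff:
            continue  # not significant at this sequencing depth
        if site.pos < cap_len:
            continue  # truncation site lies in the assumed cap region
        start = max(0, site.pos - flank)
        end = site.pos + flank + 1
        clusters.append((site.transcript, start, end))
    return clusters
```

With 20 million aligned reads the threshold becomes 0.001/20 = 5e-5, so deeper libraries are held to a stricter P value, which is the intent of normalizing the cutoff by sequencing depth.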

Motif identification of m7G clusters

The motifs enriched in m7G clusters were analyzed by HOMER (v4.7).97 Motif length was restricted to 6 nucleotides. All clusters mapped to mRNAs were used as the target sequences and background sequences were constructed by randomly shuffling clusters from transcripts on genome using BEDTools’ shuffleBed (version 2.16.2).

Evolutionary conservation analysis

The evolutionary conservation for m7G was analyzed according to a previous study.55 Phylogenetic conservation score was directly downloaded from UCSC website with comparison for 100 vertebrates using phastCons (http://hgdownload.soe.ucsc.edu/goldenPath/hg38/phyloP100way/hg38.phyloP100way.bw) and then wig file was generated to obtain conservation score of each nucleotide in whole genome. The conservation score of each internal m7G cluster was identified as an average of scores of its own nucleotides. The background was constructed by randomly shuffling clusters upon total mRNAs transcribed by genome using BEDTools’ shuffleBed (version 2.16.2). The significant difference between two groups was evaluated using the Mann–Whitney U test by R software.
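A minimal sketch of the per-cluster conservation score and of the Mann–Whitney U statistic is given below. The original analysis was run in R on genome-wide score tracks; here the per-nucleotide scores are represented by a plain Python list, clusters are half-open intervals, and only the U statistic (not its P value) is computed, all of which are simplifying assumptions.

```python
def mean_cluster_score(scores, start, end):
    """Average per-nucleotide conservation score over [start, end)."""
    window = scores[start:end]
    if not window:
        raise ValueError("empty cluster interval")
    return sum(window) / len(window)


def mann_whitney_u(group_a, group_b):
    """U statistic for group_a versus group_b.

    Counts pairs where a > b, with ties contributing 0.5. This direct
    pair count is quadratic and intended only for small illustrations.
    """
    u = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u
```

In practice `group_a` would hold the mean scores of the internal m7G clusters and `group_b` those of the shuffled background clusters; a U value close to `len(group_a) * len(group_b)` indicates that the m7G clusters tend to be more conserved.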

mRNA translation efficiency analysis

Human ribosome profiling and mRNA input data were downloaded from the GEO database with accession numbers GSE63591 (ref. 9) and GSE65778 (ref. 58) for HeLa and 293T cells, respectively. Translation efficiency of each mRNA was calculated from ribosome-bound fragments and mRNA input fragments following the methods of a previous study.98
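Translation efficiency is commonly expressed as the ratio of ribosome-bound to input mRNA abundance. The sketch below assumes that both are given as normalized expression values per gene (for example RPKM) and that genes with very low input are excluded; the threshold and function name are hypothetical, and the exact procedure of the cited study may differ.

```python
def translation_efficiency(ribo, mrna, min_input=1.0):
    """Per-gene translation efficiency = ribosome-bound / input abundance.

    ribo and mrna map gene names to normalized abundances (assumed RPKM).
    Genes whose input abundance is below min_input are skipped to avoid
    unstable ratios (an assumed filter).
    """
    te = {}
    for gene, ribo_level in ribo.items():
        input_level = mrna.get(gene, 0.0)
        if input_level < min_input:
            continue
        te[gene] = ribo_level / input_level
    return te
```

The resulting per-gene values can then be compared between m7G-containing and non-m7G mRNAs.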

Gene Ontology (GO) and KEGG pathway analyses

GO and KEGG pathway analyses were performed using DAVID (https://david.ncifcrf.gov/) for each gene set. Enrichment maps were constructed using Cytoscape 3.2.0 (ref. 99) installed with the Enrichment Map plugin. Within the enrichment maps, each node represents an enriched pathway (p < 0.005 and FDR < 0.1) and the node size is proportional to the total number of genes in each pathway. Edge thickness represents the number of overlapping genes between nodes. Nodes in similar pathways are sorted into one cluster, marked with circles and labels. In addition, we performed the functional enrichment analysis for 432 conserved genes between human and mouse using Metascape (http://metascape.org).100

Data availability

The accession number for the MeRIP-seq and miCLIP-seq data in this paper is GSA: CRA001302. These data have been deposited in the Genome Sequence Archive under project PRJCA001172.