Fine-mapping analysis including over 254,000 East Asian and European descendants identifies 136 putative colorectal cancer susceptibility genes

Chen, Zhishan; Guo, Xingyi; Tao, Ran; Huyghe, Jeroen R.; Law, Philip J.; Fernandez-Rozadilla, Ceres; Ping, Jie; Jia, Guochong; Long, Jirong; Li, Chao; Shen, Quanhu; Xie, Yuhan; Timofeeva, Maria N.; Thomas, Minta; Schmit, Stephanie L.; Díez-Obrero, Virginia; Devall, Matthew; Moratalla-Navarro, Ferran; Fernandez-Tajes, Juan; Palles, Claire; Sherwood, Kitty; Briggs, Sarah E. W.; Svinti, Victoria; Donnelly, Kevin; Farrington, Susan M.; Blackmur, James; Vaughan-Shaw, Peter G.; Shu, Xiao-Ou; Lu, Yingchang; Broderick, Peter; Studd, James; Harrison, Tabitha A.; Conti, David V.; Schumacher, Fredrick R.; Melas, Marilena; Rennert, Gad; Obón-Santacana, Mireia; Martín-Sánchez, Vicente; Oh, Jae Hwan; Kim, Jeongseon; Jee, Sun Ha; Jung, Keum Ji; Kweon, Sun-Seog; Shin, Min-Ho; Shin, Aesun; Ahn, Yoon-Ok; Kim, Dong-Hyun; Oze, Isao; Wen, Wanqing; Matsuo, Keitaro; Matsuda, Koichi; Tanikawa, Chizu; Ren, Zefang; Gao, Yu-Tang; Jia, Wei-Hua; Hopper, John L.; Jenkins, Mark A.; Win, Aung Ko; Pai, Rish K.; Figueiredo, Jane C.; Haile, Robert W.; Gallinger, Steven; Woods, Michael O.; Newcomb, Polly A.; Duggan, David; Cheadle, Jeremy P.; Kaplan, Richard; Kerr, Rachel; Kerr, David; Kirac, Iva; Böhm, Jan; Mecklin, Jukka-Pekka; Jousilahti, Pekka; Knekt, Paul; Aaltonen, Lauri A.; Rissanen, Harri; Pukkala, Eero; Eriksson, Johan G.; Cajuso, Tatiana; Hänninen, Ulrika; Kondelin, Johanna; Palin, Kimmo; Tanskanen, Tomas; Renkonen-Sinisalo, Laura; Männistö, Satu; Albanes, Demetrius; Weinstein, Stephanie J.; Ruiz-Narvaez, Edward; Palmer, Julie R.; Buchanan, Daniel D.; Platz, Elizabeth A.; Visvanathan, Kala; Ulrich, Cornelia M.; Siegel, Erin; Brezina, Stefanie; Gsur, Andrea; Campbell, Peter T.; Chang-Claude, Jenny; Hoffmeister, Michael; Brenner, Hermann; Slattery, Martha L.; Potter, John D.; Tsilidis, Kostas K.; Schulze, Matthias B.; Gunter, Marc J.; Murphy, Neil; Castells, Antoni; Castellví-Bel, Sergi; Moreira, Leticia; Arndt, Volker; Shcherbina, Anna; Bishop, D. Timothy; Giles, Graham G.; Southey, Melissa C.; Idos, Gregory E.; McDonnell, Kevin J.; Abu-Ful, Zomoroda; Greenson, Joel K.; Shulman, Katerina; Lejbkowicz, Flavio; Offit, Kenneth; Su, Yu-Ru; Steinfelder, Robert; Keku, Temitope O.; van Guelpen, Bethany; Hudson, Thomas J.; Hampel, Heather; Pearlman, Rachel; Berndt, Sonja I.; Hayes, Richard B.; Martinez, Marie Elena; Thomas, Sushma S.; Pharoah, Paul D. P.; Larsson, Susanna C.; Yen, Yun; Lenz, Heinz-Josef; White, Emily; Li, Li; Doheny, Kimberly F.; Pugh, Elizabeth; Shelford, Tameka; Chan, Andrew T.; Cruz-Correa, Marcia; Lindblom, Annika; Hunter, David J.; Joshi, Amit D.; Schafmayer, Clemens; Scacheri, Peter C.; Kundaje, Anshul; Schoen, Robert E.; Hampe, Jochen; Stadler, Zsofia K.; Vodicka, Pavel; Vodickova, Ludmila; Vymetalkova, Veronika; Edlund, Christopher K.; Gauderman, W. James; Shibata, David; Toland, Amanda; Markowitz, Sanford; Kim, Andre; Chanock, Stephen J.; van Duijnhoven, Franzel; Feskens, Edith J. M.; Sakoda, Lori C.; Gago-Dominguez, Manuela; Wolk, Alicja; Pardini, Barbara; FitzGerald, Liesel M.; Lee, Soo Chin; Ogino, Shuji; Bien, Stephanie A.; Kooperberg, Charles; Li, Christopher I.; Lin, Yi; Prentice, Ross; Qu, Conghui; Bézieau, Stéphane; Yamaji, Taiki; Sawada, Norie; Iwasaki, Motoki; Le Marchand, Loic; Wu, Anna H.; Qu, Chenxu; McNeil, Caroline E.; Coetzee, Gerhard; Hayward, Caroline; Deary, Ian J.; Harris, Sarah E.; Theodoratou, Evropi; Reid, Stuart; Walker, Marion; Ooi, Li Yin; Lau, Ken S.; Zhao, Hongyu; Hsu, Li; Cai, Qiuyin; Dunlop, Malcolm G.; Gruber, Stephen B.; Houlston, Richard S.; Moreno, Victor; Casey, Graham; Peters, Ulrike; Tomlinson, Ian; Zheng, Wei

doi:10.1038/s41467-024-47399-x

Download PDF

Article
Open access
Published: 26 April 2024

Fine-mapping analysis including over 254,000 East Asian and European descendants identifies 136 putative colorectal cancer susceptibility genes

Nature Communications volume 15, Article number: 3557 (2024) Cite this article

1174 Accesses
5 Altmetric
Metrics details

Subjects

Abstract

Genome-wide association studies (GWAS) have identified more than 200 common genetic variants independently associated with colorectal cancer (CRC) risk, but the causal variants and target genes are mostly unknown. We sought to fine-map all known CRC risk loci using GWAS data from 100,204 cases and 154,587 controls of East Asian and European ancestry. Our stepwise conditional analyses revealed 238 independent association signals of CRC risk, each with a set of credible causal variants (CCVs), of which 28 signals had a single CCV. Our cis-eQTL/mQTL and colocalization analyses using colorectal tissue-specific transcriptome and methylome data separately from 1299 and 321 individuals, along with functional genomic investigation, uncovered 136 putative CRC susceptibility genes, including 56 genes not previously reported. Analyses of single-cell RNA-seq data from colorectal tissues revealed 17 putative CRC susceptibility genes with distinct expression patterns in specific cell types. Analyses of whole exome sequencing data provided additional support for several target genes identified in this study as CRC susceptibility genes. Enrichment analyses of the 136 genes uncover pathways not previously linked to CRC risk. Our study substantially expanded association signals for CRC and provided additional insight into the biological mechanisms underlying CRC development.

Association analyses identify 31 new risk loci for colorectal cancer susceptibility

Article Open access 14 May 2019

Genetic risk factors for colorectal cancer in multiethnic Indonesians

Article Open access 11 May 2021

Deciphering colorectal cancer genetics through multi-omic analysis of 100,204 cases and 154,587 controls of European and east Asian ancestries

Article 20 December 2022

Introduction

Colorectal cancer (CRC) is one of the most common malignancies worldwide¹. Inherited genetic factors play an important role in the development of CRC². Since 2007, genome-wide association studies (GWAS) have identified over 200 common genetic variants independently associated with CRC risk^3,4,5,6,7. These GWAS, however, typically only reported the most significantly associated variant (the lead variant) at each risk locus. Statistical fine-mapping analyses of known risk loci can identify additional association signals independent of the lead variant.

Approximately 90% of GWAS-identified risk variants for CRC are located in noncoding or intergenic regions, and target genes for most of these risk variants remain unknown. Well-powered fine-mapping analyses, particularly those using data from multi-ancestry populations, can facilitate the identification of credible causal variants (CCVs) in each region. Previous genetic studies have provided strong evidence that regulatory variants in linkage disequilibrium (LD) with GWAS-identified risk variants drive the associations of genetic variants with cancer risk by modulating the expression of susceptibility genes^8,9,10,11. Therefore, integrating functional genomic data to interrogate CCVs in each independent risk-associated signal could help to identify putative causal variants and target genes for CRC risk. Herein, we conducted large trans-ancestry fine-mapping analyses of all currently known CRC risk regions, using GWAS data from 100,204 CRC cases and 154,587 controls of East Asian and European ancestry, to identify independent association signals and their target genes for CRC risk.

Results

Identification of independent association signals with CRC risk

We conducted fine-mapping analyses using GWAS summary statistics from 100,204 CRC cases and 154,587 controls (73% European and 27% East Asian ancestry) (Fig. 1, Supplementary Data 1). In our recent trans-ancestry meta-analysis of GWAS, we identified 205 genetic variants independently associated with CRC risk⁷. We aggregated regions flagged by these variants into 143 risk regions, each containing at least a 1 Mb interval centered on the most significant association (Supplementary Data 2). Among them, 40 regions harbor at least two reported independent risk associations. All risk regions were autosomal, except the one at Xp22.2. For subsequent analyses, we focused on the 142 regions located on the autosomes.

**Fig. 1: Schematic diagram of the study design.**

We used forward stepwise conditional analyses to identify independent association signals in each region in each population, conditioning on the most significant association from the trans-ancestral summary statistics (Supplementary Fig. 1, Methods). We then meta-analyzed the conditioned data using the fixed-effects inverse variance weighted model. We considered the threshold of conditional P < 1 × 10⁻⁶ to determine independent significant associations to balance both Type 1 and 2 errors, as recommended by a previous fine-mapping study in breast cancer¹². At this threshold, we identified 171 independent association signals in 122 regions (Fig. 2, Supplementary Data 3). To identify possible ancestry-specific association signals, we conducted similar analyses using only summary statistics from each population, conditioning on the ancestry-specific most significant association. Using the same threshold, we identified 198 and 45 independent association signals in European and East Asian descendants, respectively (Supplementary Data 4 and 5). Of them, 60 signals in European and 7 in East Asian were not detected in the trans-ancestry analysis above, suggesting them as potential ancestry-specific risk signals (Fig. 2).

**Fig. 2: Independent association signals for colorectal cancer risk.**

In total, we identified 238 independent association signals either from trans-ancestry or ancestry-specific analysis at these 142 regions (Fig. 2). A total of 94 regions (66.2%) contained only a single association signal, while the remaining 48 regions (33.8%) consisted of multiple independent association signals. Among the 238 independent association signals, 191 signals had lead variants that were correlated with previously GWAS-reported risk variants⁷ (LD r² > 0.1 in either of East Asian or European-ancestry population). The remaining 47 independent signals (19.7%) have not been previously reported, including 18 from trans-ancestry, 28 from European-specific, and one from East Asian-specific analyses (Fig. 2, Table 1). Among these 47 signals, 31 demonstrated significant associations with conditional P < 1 × 10⁻⁷, including 28 signals reached genome-wide significance.

Table 1 Independent association signals uncovered at known CRC risk loci in conditional analyses (conditional P < 1 × 10⁻⁶)

Full size table

Identification of credible causal variants (CCVs) for independent association signals

To identify CCVs for each independent association signal, we conducted conditional analysis with adjustment of the lead variants for other signals in the same risk region. We conducted this analysis for trans-ancestral independent signals separately for each population to account for differences in the LD structure and then meta-analyzed conditioned results. Using a similar approach conducted in breast cancer¹², we defined variants as CCVs if they satisfied conditional P values within two orders of magnitude of the most significant association, conditioning on all other independent association signals. We identified a total of 5741 CCVs for the 238 signals, with the number of CCVs per signal ranging from 1 to 249 (median: 11 CCVs per signal) (Supplementary Data 6). For 28 risk signals, only a single CCV was identified, suggesting that these CCVs are likely to be the causal variants for these signals (Table 2).

Table 2 Independent association signals with a single CCV

Full size table

For the 138 independent association signals identified in both trans-ancestry and European-ancestry specific analyses (Supplementary Data 7), trans-ancestry analyses identified a smaller-sized set of CCVs (mean = 23.2, median = 8.5), compared with European-ancestry specific analysis (mean = 31.08, median = 15) (paired Wilcoxon test, P = 4.9 × 10⁻⁷). Interestingly, a single CCV was identified for 10 signals in trans-ancestry analysis, while multiple CCV for them in European-ancestry specific analysis, highlighting the value of using multi-ancestry data to reduce the number of CCVs in fine-mapping analysis. For instance, signal 1 in region_42 included 16 CCVs in the European set (lead variant: rs41302867), but only one variant in the trans-ancestry set (rs9379084). The variant rs9379084 is a predicted-deleterious missense variant (p.Asp1171Asn) of the RREB1 gene which plays a regulatory role in Ras/Raf-mediated cell differentiation¹³, a pathway well known to be implicated in CRC development.

Identification of target genes for CCVs

Of the 5741 CCVs identified in this study, 3716 (64.7%) are located in regions with at least one of six genomic features (open chromatin, transcribed regions of active genes, promoter, enhancer, repressed gene regulatory elements, and transcription factor (TF) binding sites) (Supplementary Data 6 and 8). To identify putative target genes of these CCVs, we used functional genomic data generated in CRC-related tissues/cells to conduct in-silico analyses with a modified INQUISIT pipeline¹² (Methods, Supplementary Data 9). We identified 72 putative target genes via CCVs located in distal enhancer elements (Supplementary Data 10), 48 genes via CCVs located in proximal promoter elements (Supplementary Data 11), and 19 genes that could be targeted by CCVs in coding regions (i.e., deleterious missense, stop_gained, and start_lost) (Supplementary Data 12). In total, we identified 128 genes associated with CCVs for 76 independent association signals, with a range from one to five putative target genes per signal. Of them, 52 independent association signals contain only a single putative target gene.

We also conducted cis-expression quantitative trait loci (cis-eQTL) analyses to identify target genes using four transcriptome datasets derived from either normal colon tissues or tumor-adjacent normal colon tissues from 1299 individuals from the Genotype-Tissue Expression (GTEx) project (n = 368 individuals predominantly of European ancestry), the BarcUVa-Seq project (n = 423 individuals of European ancestry), the Colonomics project (n = 144 individuals of European ancestry), and the Asia Colorectal Cancer Consortium (ACCC) (n = 364 individuals of East Asian ancestry) (Methods). At Bonferroni-corrected P < 0.05, we identified 153 genes associated with the lead variants, including 127 genes in 65 independent association signals and 30 in 15 signals identified from trans-ancestry and European-ancestry specific analyses, respectively. We also identified the PPP1R21 gene in a potential Asian-specific risk signal (lead variant rs77272589) (Supplementary Data 13). Out of the 153 genes, 37 had been previously identified by eQTL analysis^5,10,11. For independent association signals identified in European and trans-ancestry analyses, we further performed cis-methylation quantitative trait loci (cis-mQTL) analyses using two methylation datasets generated from 321 individuals from the GTEx project (n = 189 individuals predominantly of European ancestry) and the Colonomics project (n = 132 individuals of European ancestry). We found that DNA methylation levels at CpG sites for 84 genes were associated with 71 independent association signals, including 14 genes identified in previous mQTL analysis¹¹ (Supplementary Data 14).

We next conducted colocalization analyses for identified likely target genes in significant eQTL/mQTLs above using the Summary-data-based Mendelian Randomization (SMR) approach (Methods). Through the integration of eQTL/mQTL results and GWAS associations signals, we identified 205 genes at Bonferroni-corrected P_SMR < 0.05 (Supplementary Data 15–19), including 150 genes from the eQTL analysis and 84 genes from the mQTL analysis. Of these, 45 (21.9%) genes were also identified as targets of CCVs by in-silico analyses based on functional genomic data as described above, and 29 genes were identified in both mQTL and eQTL analyses. That is in line with previous observations in the overlap fraction between mQTL and eQTL¹⁴. We considered genes with evidence of only mQTL colocalization, as the enrichment of mQTLs in gene regulatory elements, as well as their implications in other molecular phenotypes, such as chromatin accessibility^14,15. Notably, of the 55 genes only identified in the mQTL analysis, seven genes were supported by the above in silico analyses with functional genomic data, and 22 genes showed association with CRC risk in previous TWAS and eQTL colocalization analysis^7,11,16,17.

In total, we identified 288 putative target genes for 140 independent association signals based on functional genomics data and/or colocalization analysis. For 35 of these signals, multiple target gene candidates were detected per signal, suggesting that some may be false positives (Supplementary Data 20). To minimize false positive findings, we further prioritized target gene candidates by analyzing associations of genes with CRC risk based on previous transcriptome-wide association studies (TWAS) and colocalizations between eQTL and CRC GWAS signals^7,11,16,17 (Methods). Finally, we obtained a credible set of 136 protein-coding genes for 124 independent association signals. Among them, 56 genes were not previously identified as potential targets for CRC risk associations, including nine genes in eight previously unreported association signals in this study (Table 3). The remaining 80 genes were previously reported as potential CRC susceptibility genes, and our study provided additional supporting evidence (Table 4)^7,11,16,17.

Table 3 The 56 CRC susceptibility gene candidates not previously reported

Full size table

Table 4 The 80 previously reported CRC susceptibility genes supported in this study

Full size table

Using scRNA-seq data to evaluate gene expression pattern by cell types

To investigate potential underlying cell types of putative susceptibility genes that contribute to CRC development, we analyzed single-cell RNA-seq (scRNA-seq) datasets from normal colon tissues obtained from 31 participants included in the Colorectal Molecular Atlas Project¹⁸ (Methods). Of the 136 identified genes, 17 genes exhibited significantly differential expression in specific cell types compared to the other cell types at |log2 fold change (FC)| > 1 and a nominal P < 0.05 (Supplementary Data 21). Nine of these genes (DIP2B, CIB1, HPGD, CDKN2B, TMEM258, MYL12A, MYL12B, CDKN1A, and TMBIM1) showed a distinct expression pattern in specific absorptive cells (ABS) cell, underscoring the relevance of this cell type underlying CRC development.

Using whole exome sequencing data to evaluate pathogenic variants in target genes with CRC risk

We used whole exome sequencing data from 3362 CRC cases and 133,742 controls of European ancestry in the UK Biobank (UKBB) to evaluate the association of CRC risk with putative candidate genes identified our study using burden tests by aggregating either loss of function (pLOF) or pLOF and deleterious missense variants (Dmis) jointly in each gene (Methods). Of these 136 genes, MLH1 was significantly associated with CRC risk with P = 1.35 × 10⁻⁷ when considering only pLOF in tests (at Bonferroni-corrected threshold, 0.05/136 testing). Additional nine genes (TNFSF18, LRP1, SMAD9, PDGFB, CIB1, STK39, IGFBP3, FUT2, and FUT3) showed nominal P < 0.05 significance considering only pLoF or combination of pLoF and Dmis, whereas no significance was detected for the remaining genes.

Biological significance of the target genes for CCVs

We utilized Enrichr^19,20,21 to analyze multiple pathway databases and identify enriched biological pathways among the 136 credible target genes (Methods). At a false-discovery rate (FDR) < 0.05, 126 pathways showed significant enrichment (Supplementary Data 22). Our findings were in line with our prior study¹⁸ and highlighted the enriched signaling pathways such as TGF-β, BMP, Wnt, Hippo, and TNF-α/NF-κB, which are known to play a crucial role in the development and progression of colorectal cancer^19,20. Of the 56 genes not previously reported, nine genes (TGIF1, CDKN2B, MYC, BMP7, WNT7B, PRICKLE2, LGR6, CEBPB, and IRS2) were mapped to these pathways (Table 5). Additionally, we identified several significant pathways, including those related to cancer, pluripotency of stem cells, epithelial–mesenchymal transition, extracellular matrix organization, adipogenesis, senescence, and autophagy in cancer. Interestingly, we also identified the glycolysis pathway, which provides energy support for cancer cells, as a significant pathway not previously reported. Four previously unreported genes, GOT1, IGFBP3, IRS2, and LCT, were mapped to glycolysis, supporting their association with CRC risk.

Table 5 Significant enrichment in biological pathways

Full size table

In addition, we performed functional annotation analysis on each credible target gene and assigned them to previously described cellular processes¹⁸ (Supplementary Fig. 2). Of the 56 genes not previously reported, 26 were found to be involved in these cellular processes. Specifically, five genes were related to stemness/differentiation, one gene was linked to adhesion/migration, and six genes were associated with proliferation. Interestingly, we also identified an additional cellular process, post-translation modifications (PTMs) of protein, which included three genes (DACF12, USP12, and SENP8). These findings suggest potential critical roles of PTMs in the development of CRC.

Discussion

Our study, including approximately 254,000 individuals of East Asian and European ancestry, represents the largest study conducted to fine-map CRC risk-associated genomic regions using GWAS data. We identified 238 independent association signals at conditional P value < 1 × 10⁻⁶, including 47 signals not reported previously. Furthermore, integrating functional genomic data and results from cis-eQTL/mQTL and colocalization analyses, we identified 136 putative CRC susceptibility genes, including 56 genes that had not been previously reported. Notably, these identified genes are significantly enriched in several major CRC signaling pathways and other cancer-related pathways. Our findings not only significantly expanded the number of associated signals for CRC, but also provide substantial data to advance our understanding of CRC biology.

The integration of comprehensive functional genomic data from relevant colon tissues and cell lines, as well as genetic associations data, can facilitate the identification of potential target genes for CRC risk. Our study significantly extends previous efforts^7,11,16,17 by identifying 56 target gene candidates not previously reported for CRC risk, over half of which (29/56, 51.8%) are involved in the enriched biological pathways. For instance, eight target genes (TGIF1, CDKN2B, LGR6, MYC, PRICKLE2, WNT7B, BMP7, and TBX3) identified in this study may regulate normal intestinal homeostasis as they play roles in signaling pathways (i.e., Wnt and BMP) and pluripotency of stem cells. LGR6, for instance, is part of a G-protein-coupled receptor family and marks stem cells in the epidermis²². It activates a novel β-catenin/TCF7L2/LGR6-positive feedback loop in LGR6^high cervical cancer stem cells (CSCs) to enhance the properties of cancer stem cells, including self-renewal, differentiation, and tumorigenicity²³. Silencing of LGR6 resulted in the inhibition of stemness by repressing Wnt/β-catenin signaling in ovarian cancer²⁴. TBX3, a transcriptional repressor, regulates stem cell maintenance by controlling stem cell self-renewal and differentiation, and reduced expression levels of TBX3 are associated with reduced pluripotency of stem cells^25,26. MYC and WNT7B are implicated in the signaling related to the self-renewal and differentiation of cancer stem cells²⁷. Here, we linked MYC and WNT7B with credible causal variants of CRC risk associations through functional genomic interaction. Our findings also indicated the relevance of glycolysis to CRC risk associations, a metabolic pathway critical in early CRC tumorigenesis by supporting the energetic and biosynthetic demands of CRC cells^28,29. It should be noted that future studies are needed to validate chromatin interactions between identified CCVs and their target genes in this study by employing chromatin conformation capture technology such as in situ Hi-C, Capture Hi-C (CHi-C), and HiChIP.

Additional evidence supports some of the candidate target genes identified in our study as possible CRC susceptibility genes. In our differential gene expression analysis among normal colon mucosa, adenoma, and adenocarcinoma using gene expression data from 135 normal colon mucosas, 218 colon adenomas, and 2760 colon adenocarcinomas, we observed that 26 genes showed significant differential expression between adenoma and normal colon tissues, while 31 genes showed significant differential expression between carcinoma and adenoma tissues (adjusted P < 0.05) (Supplementary Data 20). Interestingly, three stemness/differentiation-related genes, including LRRC34, CEBPB, and TBX3, showed significant changes in their expression levels in adenoma compared to normal colon mucosa. Additionally, 34 (60.7%) of not previously identified genes have been implicated in cancer-related functions in in vitro or in vivo functional experimental studies in CRC or other cancer types (Supplementary Data 20). These results provide further evidence supporting the potential involvement of these genes in CRC progression. Despite the above supportive evidence, it remains necessary to evaluate the functions of identified putative CRC susceptibility genes through both in vitro and in vivo assays in future investigations.

The trans-ancestry and ancestry-specific fine-mapping analyses conducted in this study not only enabled the discovery of independent association signals that are shared across populations of European and East Asian ancestry, but also revealed ancestry-specific signals. The larger sample size of the European-ancestry study enabled us to identify a larger number of independent association signals than the study conducted on Asians. However, there are some ancestry-specific signals identified in this study, which is most likely due to differences in LD structures and allele frequency between these two populations. Indeed, we observed distinct differences in the allele frequency for most ancestry-specific signals, as shown in Supplementary Data 4 and 5. For instance, the lead variant of 24 European ancestry-specific signals (40%, 24/60) is not detected among East Asian-ancestry populations. On the other hand, fine-mapping analyses capitalizing on ancestry differences in LD structure can substantially reduce the credible set size compared to European-ancestry specific analysis. This highlights the value of multi-ancestry fine-mapping over single-ancestry analysis. Our analysis is limited to two ancestry groups. Further studies should increase the diversity of genetic data, including those from other racial groups.

In summary, our large trans-ancestry fine-mapping analysis has identified large numbers of not previously reported independent association signals for CRC risk and refined the majority of the previously reported association signals. By leveraging data from two ancestries, we further defined putative causal variants underlying CRC risk signals. Our study has also uncovered a credible set of target genes. These findings offer a significant advancement in our understanding of the genetic and biological processes underlying CRC and provide a roadmap for further investigation of variants and genes identified in our study.

Methods

GWAS data and meta-analysis

The GWAS data used in this study comprised 100,204 CRC cases and 154,587 controls (Supplementary Data 1), which were grouped into 31 GWAS analytical units based on the study or genotyping platform as consistent with the original reports. Of them, 17 datasets were derived from populations of European descent and 14 were from populations of Asian descent. These 31 GWAS datasets were meta-analyzed under the fixed-effects inverse variance weighted model implemented in METAL³⁰. Further details regarding each analytical unit and meta-analysis were described in Supplementary Note.

Identifying independent association signals

A total of 205 independent genetic associations have been reported for CRC risk by GWAS⁷. To define fine-mapping regions for CRC, we aggregated these risk variants using bedtools. Specifically, we identified 1 megabase (Mb) intervals centered on the risk variants, and if there were regions of overlap, we combined them into a single interval over 1 Mb. In total, we determined 143 fine-mapping regions, including 142 on autosomes and one on chromosome X (Supplementary Data 2). Our fine-mapping analysis and downstream analyses focused on the 142 genomic risk regions on autosomes.

To identify distinct association signals within each risk region, we conducted a forward stepwise conditional analysis for summary statistics from the trans-ancestral meta-analysis, using GCTA-COJO^31,32. We included common variants (MAF > 0.01) with associations at P < 0.05 in both populations. To account for differences in the LD structure, we conducted conditional analysis in each population for each fine-mapping region, conditioning on the most significant association from the trans-ancestral summary statistics. We then meta-analyzed the conditioned results using the fixed-effects inverse variance weighted model with METAL. To identify potential ancestry-specific independent signals, we also performed conditional analysis in each population, conditioning on the ancestry-specific most significant association. Common variants (MAF > 0.01) with association at P < 1 × 10⁻⁴ in each population were included. For LD estimation, we used genotyping data from 6684 unrelated samples of Asian descent³³, and 503 European samples in the 1000 Genome project as the reference.

Following a previous study conducted for breast cancer¹², we applied the conditional P value < 1 × 10⁻⁶ to define the independent signal. For each region, we first adjusted for the most significant association and then added any additional variant that remained an independent signal at the conditional P value < 1 × 10⁻⁶ to the conditional set. We then repeated the conditional analysis until no more variants met the significance threshold. In regions with multiple independent signals, we determined the index variant for each signal through a process of conditional analysis, adjusting for the index variants of the other signals. This process was repeated until the set of index variants were stabilized. The variant with the strongest residual association was defined as the index for the signal.

For independent association signals identified in ancestry-specific analyses, we compared them with those from trans-ancestry analyses by assessing correlations between their lead variants within each risk region. If a signal was consistently found in both ancestry-specific and trans-ancestry analyses (i.e., the same lead variant or correlated lead variants with LD r² > 0.1 in each corresponding population), we considered it as a sharing signal between Asian and European-ancestry populations. Otherwise, they were defined as ancestry-specific signals.

Identifying a set of CCVs of each independent signal

To determine the CCVs of each independent signal, we used the approach described in a previous study for breast cancer¹². Specifically, variants that have a conditional P value within two orders of magnitude of the most significant association, conditioning on all other independent association signals, were defined as CCVs.

RNA-seq data analysis

We conducted mRNA sequencing on tumor-adjacent normal colon tissues obtained from 364 East Asians patients with colorectal cancer who participated in the ACCC. Furthermore, we included RNA-seq data from normal colon tissues from 423 individuals of European ancestry who participated in the BarcUVa-Seq project. Included subjects, library preparation and sequencing of colon tissue samples in the ACCC and the BarcUVa-Seq project have been presented in Supplementary Note.

The raw RNA-seq data were processed according to the pipeline of the GTEx Consortium. Sequencing reads were aligned to the reference genome GRCh37 (RNA-seq data from East Asians) or GRCh38 (RNA-seq data from the BarcUVa-Seq project) with STAR (v2.5.4)³⁴. Quality control of aligned samples was performed using RNA-SeQC (v2.3.5)³⁵. Samples that met any of the following criteria were removed: (1) <10 million mapped reads; (2) read mapping rate < 0.2; (3) intergenic mapping rate >0.4; (4) base mismatch rate >0.01 for read mate 1 or >0.02 for read mate 2; and (5) rRNA mapping rate >0.3. If the sample had replicated RNA-seq data, the one with the highest mapped reads was retained.

Gene-level expression quantification was performed using RNA-SeQC based on the GENCODE release 19 annotation (for RNA-seq data from East Asians) and the GENCODE release 26 annotation (for RNA-seq data from the BarcUVa-Seq project)³⁶. The read counts and TPM values of genes were calculated using aligned reads with the following criteria: (1) reads were uniquely mapped; (2) aligned reads were properly paired; (3) the read alignment distance was <6. The genes with expression thresholds of ≥0.1 TPM in ≥20% of samples and ≥6 reads (unnormalized) in ≥20% of samples were selected. Quantile normalization of the gene expression was performed. We further performed rank-based inverse normal transformation for the expressions of genes across samples.

Cis-expression/methylation quantitative loci (cis-eQTL/mQTL) analysis

To identify target genes, we performed cis-eQTL analysis based on a linear regression framework^10,11. Gene expression data from four expression datasets comprising a total of 1,299 individuals were used: 1) GTEx project of transverse colon tissues from 368 individuals predominantly of European ancestry, 2) Colonomics project of normal colon tissues or tumor-adjacent normal colon tissues from 144 individuals of European ancestry, 3) BarcUVa-Seq project of normal colon tissues from 423 individuals of European ancestry, and 4) ACCC of tumor-adjacent normal colon tissue from 364 CRC patients of East Asian ancestry. We obtained available cis-eQTL results for CCVs and their nearby genes (within 1 Mb to CCV) from the GTEx database (version 8) and the Colonomics project. Details for gene expression data and eQTL analysis in the Colonomics project are explained elsewhere³⁷. For the analyses using the remaining two datasets, we conducted a linear regression analysis to assess the associations between CCV and the normalized expression levels of nearby genes (within 1 Mb to CCV), adjusting for age, gender, and five top principal components.

We conducted cis-mQTL analysis for CCVs identified in European and trans-ancestry analyses. To do this, we included methylation data obtained from a total of 321 individuals. These datasets consisted of 189 transverse colon tissues predominantly of European ancestry from GTEx, as well as normal colon tissues or tumor-adjacent normal colon tissues of 132 individuals of European ancestry from the Colonomics project. We extracted cis-mQTL results for CCVs and their nearby CpG sites (within 1 Mb to CCV) from the GTEx database (version 8)¹⁴. In the Colonomics project, a linear regression analysis was used to evaluate the associations between CCV and the normalized methylation levels of CpG sites (within 1 Mb to CCV), with adjustments of age, gender, and colon sites (right/left). Further details about the cis-mQTL in the Colonomics project can be found in previous studies^37,38.

Meta-analysis of cis-eQTL/mQTL results

We performed a meta-analysis to integrate the summary cis-eQTL/mQTL results based on beta and p values from different datasets^10,11. In brief, we calculated the z score from function qnorm(p/2)*sign(beta) and further converted the standard z score derived from sum(z*sqrt(N))/sqrt(sum(N)) with a normalized weighted sampled size. Here, beta and p value were derived from eQTL/mQTL results and N referred to the sample size for each dataset. The meta p value was derived from the standard z score. For independent signals detected in both European and Asian populations, the eQTL results from both populations were combined.

We adjusted the combined p-values of eQTL/mQTL results with the Bonferroni procedure. The procedure was conducted for index variants of independent association signals. The Bonferroni-adjusted P < 0.05 was applied to identify potential target genes for each signal.

Colocalization analyses between GWAS association signals and eQTL/mQTL signals

To identify putative target genes, we employed the SMR method to conduct a colocalization analysis³⁹. We integrated GWAS summary statistics of CCVs and their associations with genes from eQTL/mQTL analysis described above. The results of meta-analyses on cis-eQTLs/mQTLs were used. Specifically, we have a statistic:

$${T}_{{SMR}}={b}_{{xy}}^{2}/{Var}\left({b}_{{xy}}\right)\, \approx \, \frac{{Z}_{{zy}}^{2}{Z}_{{zx}}^{2}}{{Z}_{{zy}}^{2}+{Z}_{{zx}}^{2}\,}$$

(1)

Here, Z_zx and Z_zy are the Z statistics for the GWAS summary statistics and the cis-eQTL/mQTL results, respectively. T_SMR is the χ2 statistic, which tests the significance of b_xy. The significant colocalized signals were determined based on the threshold of the Bonferroni-corrected P_SMR < 0.05 within each independent signal.

Functional annotation of CCVs

We investigated whether each potential causal variant was mapped to gene regulatory regions (i.e., promoter or enhancer) (Supplementary Data 8). We obtained 351 chromatin immunoprecipitation sequencing (ChIP-seq) peak files for histone modification marks and transcription factors, and 25 DNase I hypersensitive sites sequencing peak files for chromatin accessibility, generated in normal colorectal epithelium and CRC cell lines from the Cistrome database^40,41. Only peaks that met all six quality controls set recommended by Cistrome were analyzed. Additionally, we obtained available ChIP-seq data of histone modification marks from colon tissues, tumor tissues of CRC, and CRC cell lines from Gene Expression Omnibus (GEO), which included 16 from GSE133928⁴², 215 from GSE136889⁴³ and 233 from GSE156613⁴⁴. To generate coverage tracks Bigwig (bw) files for ChIP-seq data, we converted them to bedGraph files and then identified peaks with the subcommand bdgpeakcall from macs2⁴⁵. For each variant, we examined whether it was mapped to a peak region of histone modification marks, DNase I hypersensitive, or transcription factors binding sites using an in-house script.

In silico prediction of regulatory element-to-gene

Since the majority of the CCVs are located outside protein-coding regions, genes can potentially be regulated by CCVs located in distal enhancer elements and proximal promoter elements. Hence, we identified an extensive set of functional genomic data from normal colon tissues or tumor tissues of colorectal cancer or colorectal cancer cell lines (Supplementary Data 9). Subsequently, we conducted an in-silico analysis for each CCV-gene pair.

We used a variety of experimental and computational functional genomic data to identify target genes of CCVs in regulatory elements. Specifically, for distal regulatory elements, we utilized chromatin-chromatin interaction data from experiments or computational predictions. To do this, we downloaded 13 experimental chromatin-chromatin interaction datasets under accessions GSE133928⁴² and GSE136629⁴³ from GEO, as well as two promoter capture Hi-C datasets from the previous study⁴⁶. We combined this data with ChIP-seq data of the histone modification H3K27ac (an active enhancer mark) to identify enhancer-promoter loops. We defined these loops as interactions where one fragment overlapped an H3K27ac peak (enhancer-like) and the other fragment overlapped the promoter of a gene (the region from downstream 1 kb to upstream 100 bp around the transcription start site).

In addition to this, we downloaded experimentally confirmed enhancer-gene pairs from the ENdb database. We also obtained computational enhancer-promoter interactions from IM-PET⁴⁷, FANTOM5^48,49, EnhancerAtlas⁵⁰, and super-enhancer^51,52. To further refine our analysis, we included topologically associating domain (TAD) boundaries in three colorectal cancer cell lines (HT29, LoVo, and DLD1)^46,53. Finally, we examined the overlap between CCVs and enhancer elements. For proximal promoter elements, we analyzed CCVs located within gene promoter regions that intersected with ChIP-seq peaks of H3K4me3 (an activity promoter mark).

To identify potential loss-of-function variants and their corresponding targeted genes, we conducted variant annotation of CCVs using the Variant Effect Predictor (VEP) tool⁵⁴. To predict the consequence of missense coding variants, we utilized PolyPhen-2 and SIFT. Furthermore, to evaluate splicing effects, MaxEntScan was used.

We scored CCV target genes using different criteria (Supplementary Data 9). For the potential target gene of CCV in distal enhancer elements, the gene was awarded two points or one point if there was evidence from experimental chromatin-chromatin interaction or computed interaction. The score was unweighted to three if both experimental and computational interaction were detected for the gene-CCV pair. If CCV interacted with genomic features (open chromatin, activity enhancer, and TF binding sites), the corresponding gene was further unweighted by one point. An additional point was awarded if there are at least two interactions for the CCV. If the gene were colorectal cancer or pan-cancer drivers⁵⁵, they were up-weighted by an additional point. The score was down-weighted for the gene if the CCV-gene pair was separated by TAD or a lack of expression in colon tissues. Distal scores eventually ranged from 0 to 6. For the potential target gene of CCV in proximal promoter elements, the gene was awarded one point if CCV overlapped with binding sites of transcription factors. If genes were colorectal cancer or pan-cancer drivers, they were up-weighted with an additional point. A lack of its expression resulted in down-weighting to 10% as target genes. Proximal scores eventually ranged from 0 to 2. Genes predicted to be regulated targets of coding CCVs were awarded points based on the annotation as either of missense, nonsense, and predicted splicing alterations. The consequences of missense variants which probably are damaging or deleterious resulted in the addition of one point to the target gene. Further points were awarded to such a gene if it was colorectal cancer or pan-cancer drivers. A lack of expression reduced the score (the score was down to 10%). Coding scores ranged from 0 to 2. For the set of confident target genes, we defined such genes if it has a distal score >4 or a proximal score >1, or a coding score >1.

Credible set of susceptibility genes

To determine a set of credible genes for CRC susceptibility, we combined information on gene-CRC risk associations through TWAS and colocalization of eQTL signal with GWAS risk signals for genes that were present in both our study and previous investigations. We used three sets of previously identified genes below: (A) 155 effector genes identified through GWAS, TWAS, TIsWAS, and MWAS⁷; (B) 136, 26, and 48 genes identified through TWASs^7,16,17; (C) 73 genes identified through colocalization analysis between eQTL and GWAS signals¹¹ or genes associated with CRC risk at nominal P < 0.05 in the previous TWAS¹⁷. We considered the prioritization order as A > B > C for these three gene sets and focused on protein-coding genes outside the MHC region. For the independent association signals with multiple target gene candidates, we kept either genes with higher prioritization or all genes if there was no evidence from these three gene sets. For the independent association signals with a single gene, we kept it regardless of evidence from the gene sets.

Single-cell RNA-sequencing data analysis

We included single-cell RNA-sequencing datasets from colon tissues of 31 individuals who participated in the Colorectal Molecular Atlas Project (COLON MAP)¹⁸. We analyzed gene expression dataset for each individual’s cell and combined these datasets into a count matrix. We normalized the number of unique molecular identifiers (UMIs) per cell and converted it to transcripts per 10,000 transcripts (TP10K). Next, we applied a logarithmic transformation to the normalized values and got the log₂(TP10K + 1) expression matrix for the downstream analyses. Further, we determined the 2000 most highly variable genes within the entire dataset and performed a principal component analysis (PCA). The top 30 and 40 principal components (PCs) were identified. Subsequently, we performed batch correction removal and utilized the top 40 batch-corrected components to construct a k-nearest neighbors graph of cell profiles with k = 9. We visualized the individual single-cell profiles using the Uniform Manifold Approximation and Projection (UMAP) and constructed the neighborhood graph using the Leiden graph-clustering method. Nine cell types were defined, including well-known major cell types such as absorptive cells (ABS), crypt top colonocytes (CT), enteroendocrine cells (EE), goblet cells (GOB), stem cells (STM), and others. We identified differentially expressed genes (DEGs) by comparing each cell type with all other cell types and calculated a P-value for each gene using Wilcoxon’s rank-sum test. The criteria |log2 fold change (FC)| > 1 and P < 0.05 were applied to determine genes with significantly differential expression between cell types.

Burden test for credible susceptibility genes

We annotated all variants in the UKBB WES 200 K cohort with functional annotations from ANNOVAR⁵⁶ based on the reference genome GRCh38. We only included rare loss-of-function (LoF) and deleterious missense (Dmis) variants with MAF < 0.01 in our gene-based test. LoF variants were those predicted as frameshift insertion/deletion, splice-site alteration, stop gain, and stop loss by ANNOVAR, and deleterious missense (Dmis) variants were those predicted as deleterious by MetaSVM⁵⁷. We considered both LoF sets and damaging sets (LoF+ Dmis) within a gene for testing. For a given set, we collapsed rare variants within a gene as a single combined ‘mask’ and tested the association between the ‘mask’ genotype and the CRC phenotype using logistic regression after adjusting for sex, age, the interaction of sex and age, and the top four principal components.

Pathway analysis of credible susceptibility genes

To explore the potential biological roles of the identified CRC susceptibility genes, we analyzed their functional enrichment using the enrichR^19,20,21 and various pathway databases, including WikiPathway, KEGG, MSigDB, and Reactome. The biological pathways (adjusted P < 0.05) were considered and presented.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Data availability

The GWAS summary statistics are available at the GWAS catalog under accession number GCST90129505. The RNA-seq data and genotype data of subjects of East Asian ancestry from the ACCC is being deposited to NCBI database of Genotypes and Phenotypes (dbGaP, accession number phs002813.v1.p1). All requests to access these data could also be made by contacting Drs. Wei Zheng (wei.zheng@vanderbilt.edu) and Xingyi Guo (xingyi.guo@vumc.org). The data from the Genotype-Tissue Expression (GTEx, version 8) project used in this study are publicly available at the dbGaP under accession number phs000424.v8.p2 (https://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs000424.v8.p2). The transcriptome and genotype data as well as the sample covariates from the BarcUVa-Seq project can be accessed at the dbGaP under accession number phs003338.v1.p1 (https://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs003338.v1.p1). The access to data from the Colonomics project could be requested by submission of an inquiry to Dr. Victor Moreno (v.moreno@iconcologia.net). The CRC-relevant epigenome and functional genomic data were obtained from the NCBI’s Gene Expression Omnibus database (GEO) under accession numbers: GSE133928, GSE136889, and GSE156613. Enhancer-promoter interaction data were obtained from the ENdb database (https://bio.liclab.net/ENdb/), 4Dgenome (https://4dgenome.research.chop.edu/), FANTOM5 (https://fantom.gsc.riken.jp/5/), EnhancerAtlas 2.0 (http://www.enhanceratlas.org/) and Super-enhancers (https://bio.liclab.net/sedb/ and https://www.cell.com/fulltext/S0092-8674(13)01227-0#supplementaryMaterial). Single-cell RNA-sequencing datasets from colon tissues of 31 individuals were obtained from the Colorectal Molecular Atlas Project (COLON MAP). Whole exome sequencing data from 137,104 individuals of European ancestry were obtained from the UK Biobank (https://www.ukbiobank.ac.uk/).

Code availability

The code used in this study is available at the GitHub repository https://github.com/zhishanchen/CRC_Finemapping⁵⁸.

References

Sung, H. et al. Global cancer statistics 2020: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries. CA Cancer J. Clin. 71, 209–249 (2021).
Article PubMed Google Scholar
Jiao, S. et al. Estimating the heritability of colorectal cancer. Hum. Mol. Genet. 23, 3898–3905 (2014).
Article CAS PubMed PubMed Central Google Scholar
Lu, Y. et al. Large-scale genome-wide association study of east asians identifies loci associated with risk for colorectal cancer. Gastroenterology 156, 1455–1466 (2019).
Article PubMed Google Scholar
Huyghe, J. R. et al. Discovery of common and rare genetic risk variants for colorectal cancer. Nat. Genet. 51, 76–87 (2019).
Article CAS PubMed Google Scholar
Law, P. J. et al. Association analyses identify 31 new risk loci for colorectal cancer susceptibility. Nat. Commun. 10, 2154 (2019).
Article ADS PubMed PubMed Central Google Scholar
Lu, Y. et al. Identification of Novel Loci and New Risk Variant in Known Loci for Colorectal Cancer Risk in East Asians. Cancer Epidemiol. Biomark. Prev. 29, 477–486 (2020).
Article CAS Google Scholar
Fernandez-Rozadilla, C. et al. Deciphering colorectal cancer genetics through multi-omic analysis of 100,204 cases and 154,587 controls of European and east Asian ancestries. Nat. Genet. 55, 89–99 (2023).
Article CAS PubMed Google Scholar
Zeng, C. et al. Identification of independent association signals and putative functional variants for breast cancer risk through fine-scale mapping of the 12p11 locus. Breast Cancer Res. 18 (2016).
Guo, X. et al. A comprehensive cis-eQTL analysis revealed target genes in breast cancer susceptibility loci identified in genome-wide association studies. Am. J. Hum. Genet. 102, 890–903 (2018).
Article CAS PubMed PubMed Central Google Scholar
Chen, Z. et al. Identifying putative susceptibility genes and evaluating their associations with somatic mutations in human cancers. Am. J. Hum. Genet. 105, 477–492 (2019).
Article CAS PubMed PubMed Central Google Scholar
Yuan, Y. et al. Multi-omics analysis to identify susceptibility genes for colorectal cancer. Hum. Mol. Genet. 30, 321–330 (2021).
Article CAS PubMed PubMed Central Google Scholar
Fachal, L. et al. Fine-mapping of 150 breast cancer risk regions identifies 191 likely target genes. Nat. Genet. 52, 56–73 (2020).
Article CAS PubMed PubMed Central Google Scholar
Thiagalingam, A. et al. RREB-1, a novel zinc finger protein, is involved in the differentiation response to Ras in human medullary thyroid carcinomas. Mol. Cell. Biol. 16, 5335–5345 (1996).
Article CAS PubMed PubMed Central Google Scholar
Oliva, M. et al. DNA methylation QTL mapping across diverse human tissues provides molecular links between genetic variation and complex traits. Nat. Genet. 55, 112–122 (2023).
Article CAS PubMed Google Scholar
Wu, Y. et al. Joint analysis of GWAS and multi-omics QTL summary statistics reveals a large fraction of GWAS signals shared with molecular phenotypes. Cell Genom. 3, 100344 (2023).
Article CAS PubMed PubMed Central Google Scholar
Guo, X. et al. Identifying novel susceptibility genes for colorectal cancer risk from a transcriptome-wide association study of 125,478 subjects. Gastroenterology 160, 1164–1178.e6 (2021).
Article CAS PubMed Google Scholar
Chen, Z. et al. Novel insights into genetic susceptibility for colorectal cancer from transcriptome-wide association and functional investigation. J. Natl Cancer Inst. https://doi.org/10.1093/jnci/djad178 (2023).
Article PubMed PubMed Central Google Scholar
Chen, B. et al. Differential pre-malignant programs and microenvironment chart distinct paths to malignancy in human colorectal polyps. Cell 184, 6262–6280.e26 (2021).
Article CAS PubMed PubMed Central Google Scholar
Chen, E. Y. et al. Enrichr: interactive and collaborative HTML5 gene list enrichment analysis tool. BMC Bioinform. 14, 128 (2013).
Article Google Scholar
Kuleshov, M. V. et al. Enrichr: a comprehensive gene set enrichment analysis web server 2016 update. Nucleic Acids Res. 44, W90–W97 (2016).
Article CAS PubMed PubMed Central Google Scholar
Xie, Z. et al. Gene set knowledge discovery with Enrichr. Curr. Protoc. 1, e90 (2021).
Article CAS PubMed PubMed Central Google Scholar
Jung, B., Staudacher, J. J. & Beauchamp, D. Transforming growth factor β superfamily signaling in development of colorectal cancer. Gastroenterology 152, 36–52 (2017).
Article CAS PubMed Google Scholar
Feng, Q., Li, S., Ma, H.-M., Yang, W.-T. & Zheng, P.-S. LGR6 activates the Wnt/β-catenin signaling pathway and forms a β-catenin/TCF7L2/LGR6 feedback loop in LGR6high cervical cancer stem cells. Oncogene 40, 6103–6114 (2021).
Article CAS PubMed PubMed Central Google Scholar
Ruan, X. et al. Silencing LGR6 attenuates stemness and chemoresistance via inhibiting Wnt/β-catenin signaling in ovarian cancer. Mol. Ther. Oncolytics 14, 94–106 (2019).
Article CAS PubMed PubMed Central Google Scholar
Dong, L., Lyu, X., Faleti, O. D. & He, M.-L. The special stemness functions of Tbx3 in stem cells and cancer development. Semin. Cancer Biol. 57, 105–110 (2019).
Article CAS PubMed Google Scholar
Russell, R. et al. A dynamic role of TBX3 in the pluripotency circuitry. Stem Cell Rep. 5, 1155–1170 (2015).
Article CAS Google Scholar
Elbadawy, M., Usui, T., Yamawaki, H. & Sasaki, K. Emerging roles of C-Myc in cancer stem cell-related signaling and resistance to cancer chemotherapy: A potential therapeutic target against colorectal cancer. Int. J. Mol. Sci. 20, 2340 (2019).
Article CAS PubMed PubMed Central Google Scholar
Satoh, K. et al. Global metabolic reprogramming of colorectal cancer occurs at adenoma stage and is induced by MYC. Proc. Natl Acad. Sci. USA 114, E7697–E7706 (2017).
Article CAS PubMed PubMed Central Google Scholar
Ong, E. S. et al. Metabolic profiling in colorectal cancer reveals signature metabolic shifts during tumorigenesis. Mol. Cell. Proteom. https://doi.org/10.1074/mcp.M900551-MCP200 (2010).
Article Google Scholar
Willer, C. J., Li, Y. & Abecasis, G. R. METAL: fast and efficient meta-analysis of genomewide association scans. Bioinformatics 26, 2190–2191 (2010).
Article CAS PubMed PubMed Central Google Scholar
Yang, J. et al. Conditional and joint multiple-SNP analysis of GWAS summary statistics identifies additional variants influencing complex traits. Nat. Genet. 44, 369–375 (2012).
Article CAS PubMed PubMed Central Google Scholar
Yang, J., Lee, S. H., Goddard, M. E. & Visscher, P. M. GCTA: a tool for genome-wide complex trait analysis. Am. J. Hum. Genet. 88, 76–82 (2011).
Article CAS PubMed PubMed Central Google Scholar
Zhang, B. et al. Large-scale genetic study in East Asians identifies six new loci associated with colorectal cancer risk. Nat. Genet. 46, 533–542 (2014).
Article CAS PubMed PubMed Central Google Scholar
Dobin, A. et al. STAR: ultrafast universal RNA-seq aligner. Bioinformatics 29, 15–21 (2013).
Article CAS PubMed Google Scholar
Graubert, A., Aguet, F., Ravi, A., Ardlie, K. G. & Getz, G. RNA-SeQC 2: Efficient RNA-seq quality control and quantification for large cohorts. Bioinformatics 37, 3048–3050 (2021).
Article CAS PubMed PubMed Central Google Scholar
Frankish, A. et al. GENCODE: reference annotation for the human and mouse genomes in 2023. Nucleic Acids Res. 51, D942–D949 (2023).
Article CAS PubMed Google Scholar
Díez-Villanueva, A. et al. Identifying causal models between genetically regulated methylation patterns and gene expression in healthy colon tissue. Clin. Epigenetics 13, 162 (2021).
Article PubMed PubMed Central Google Scholar
Díez-Villanueva, A. et al. DNA methylation events in transcription factors and gene expression changes in colon cancer. Epigenomics 12, 1593–1610 (2020).
Article PubMed Google Scholar
Zhu, Z. et al. Integration of summary data from GWAS and eQTL studies predicts complex trait gene targets. Nat. Genet. 48, 481–487 (2016).
Article CAS PubMed Google Scholar
Zheng, R. et al. Cistrome Data Browser: expanded datasets and new tools for gene regulatory analysis. Nucleic Acids Res. 47, D729–D735 (2019).
Article ADS CAS PubMed Google Scholar
Mei, S. et al. Cistrome Data Browser: a data portal for ChIP-Seq and chromatin accessibility data in human and mouse. Nucleic Acids Res. 45, D658–D662 (2017).
Article CAS PubMed Google Scholar
Johnstone, S. E. et al. Large-scale topological changes restrain malignant progression in colorectal cancer. Cell 182, 1474–1489.e23 (2020).
Article CAS PubMed PubMed Central Google Scholar
Orouji, E. et al. Chromatin state dynamics confers specific therapeutic strategies in enhancer subtypes of colorectal cancer. Gut 71, 938–949 (2022).
Article CAS PubMed Google Scholar
Li, Q.-L. et al. Genome-wide profiling in colorectal cancer identifies PHF19 and TBC1D16 as oncogenic super enhancers. Nat. Commun. 12, 6407 (2021).
Article ADS CAS PubMed PubMed Central Google Scholar
Zhang, Y. et al. Model-based analysis of ChIP-Seq (MACS). Genome Biol. 9, R137 (2008).
Article PubMed PubMed Central Google Scholar
Orlando, G. et al. Promoter capture Hi-C-based identification of recurrent noncoding mutations in colorectal cancer. Nat. Genet. 50, 1375–1380 (2018).
Article CAS PubMed PubMed Central Google Scholar
He, B., Chen, C., Teng, L. & Tan, K. Global view of enhancer-promoter interactome in human cells. Proc. Natl Acad. Sci. USA 111, E2191–E2199 (2014).
Article ADS CAS PubMed PubMed Central Google Scholar
Lizio, M. et al. Gateways to the FANTOM5 promoter level mammalian expression atlas. Genome Biol. 16, 22 (2015).
Article CAS PubMed PubMed Central Google Scholar
Andersson, R. et al. An atlas of active enhancers across human cell types and tissues. Nature 507, 455–461 (2014).
Article ADS CAS PubMed PubMed Central Google Scholar
Gao, T. & Qian, J. EnhancerAtlas 2.0: an updated resource with enhancer annotation in 586 tissue/cell types across nine species. Nucleic Acids Res. 48, D58–D64 (2020).
CAS PubMed Google Scholar
Hnisz, D. et al. Super-enhancers in the control of cell identity and disease. Cell 155, 934–947 (2013).
Article CAS PubMed Google Scholar
Jiang, Y. et al. SEdb: a comprehensive human super-enhancer database. Nucleic Acids Res. 47, D235–D243 (2019).
Article CAS PubMed Google Scholar
Lajoie, B. R., Dekker, J. & Kaplan, N. The Hitchhiker’s guide to Hi-C analysis: practical guidelines. Methods 72, 65–75 (2015).
Article CAS PubMed Google Scholar
McLaren, W. et al. The ensembl variant effect predictor. Genome Biol. 17 (2016).
Bailey, M. H. et al. Comprehensive characterization of cancer driver genes and mutations. Cell 173, 371–385.e18 (2018).
Article CAS PubMed PubMed Central Google Scholar
Yang, H. & Wang, K. Genomic variant annotation and prioritization with ANNOVAR and wANNOVAR. Nat. Protoc. 10, 1556–1566 (2015).
Article CAS PubMed PubMed Central Google Scholar
Kim, S., Jhong, J.-H., Lee, J. & Koo, J.-Y. Meta-analytic support vector machine for integrating multiple omics data. BioData Min. 10, 2 (2017).
Article PubMed PubMed Central Google Scholar
zhishanchen. zhishanchen/CRC_Finemapping: crc_finemapping. (Zenodo, 2024). https://doi.org/10.5281/ZENODO.10645372

Download references

Acknowledgements

This research was supported primarily by US National Institutes of Health (NIH) grant R01CA188214 (to W.Z.), Anne Potter Wilson Chair endowment from the Vanderbilt University School of Medicine (to W.Z.), and NIH grant R37CA227130 and R01CA269589 (to X.G.). Sample preparation and genotyping assays at Vanderbilt University were conducted at the Survey and Biospecimen Shared Resources and Vanderbilt Microarray Shared Resource, supported in part by the Vanderbilt-Ingram Cancer Center (grant P30CA068485). Data analyses were performed on servers maintained by the Advanced Computing Center for Research and Education (ACCRE) at Vanderbilt University.

Author information

Authors and Affiliations

Division of Epidemiology, Department of Medicine, Vanderbilt-Ingram Cancer Center, Vanderbilt Epidemiology Center, Vanderbilt University Medical Center, Nashville, TN, USA
Zhishan Chen, Xingyi Guo, Jie Ping, Guochong Jia, Jirong Long, Chao Li, Xiao-Ou Shu, Yingchang Lu, Wanqing Wen, Qiuyin Cai & Wei Zheng
Department of Biomedical Informatics, Vanderbilt University School of Medicine, Nashville, TN, USA
Xingyi Guo
Department of Biostatistics, Vanderbilt University Medical Center, Nashville, TN, USA
Ran Tao & Quanhu Shen
Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, 37232, TN, USA
Ran Tao
Public Health Sciences Division, Fred Hutchinson Cancer Center, Seattle, WA, USA
Jeroen R. Huyghe, Minta Thomas, Tabitha A. Harrison, Polly A. Newcomb, John D. Potter, Robert Steinfelder, Emily White, Lori C. Sakoda, Stephanie A. Bien, Charles Kooperberg, Christopher I. Li, Yi Lin, Ross Prentice, Conghui Qu, Li Hsu & Ulrike Peters
Division of Genetics and Epidemiology, Institute of Cancer Research, London, UK
Philip J. Law, Peter Broderick, James Studd & Richard S. Houlston
Edinburgh Cancer Research Centre, Institute of Genomics and Cancer, University of Edinburgh, Edinburgh, UK
Ceres Fernandez-Rozadilla, Juan Fernandez-Tajes, Kitty Sherwood & Ian Tomlinson
Genomic Medicine Group, Instituto de Investigacion Sanitaria de Santiago, Santiago de Compostela, Spain
Ceres Fernandez-Rozadilla
Department of Biostatistics, Yale School of Public Health, New Haven, CT, USA
Yuhan Xie & Hongyu Zhao
Colon Cancer Genetics Group, Medical Research Council Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, UK
Maria N. Timofeeva, Victoria Svinti, Kevin Donnelly, Susan M. Farrington, James Blackmur, Peter G. Vaughan-Shaw, Stuart Reid, Marion Walker, Li Yin Ooi & Malcolm G. Dunlop
Danish Institute for Advanced Study, Department of Public Health, University of Southern Denmark, Odense, Denmark
Maria N. Timofeeva
Genomic Medicine Institute, Cleveland Clinic, Cleveland, OH, USA
Stephanie L. Schmit
Population and Cancer Prevention Program, Case Comprehensive Cancer Center, Cleveland, OH, USA
Stephanie L. Schmit
Colorectal Cancer Group, ONCOBELL Program, Bellvitge Biomedical Research Institute, Barcelona, Spain
Virginia Díez-Obrero, Ferran Moratalla-Navarro, Mireia Obón-Santacana & Victor Moreno
Consortium for Biomedical Research in Epidemiology and Public Health, Madrid, Spain
Virginia Díez-Obrero, Ferran Moratalla-Navarro, Mireia Obón-Santacana, Vicente Martín-Sánchez & Victor Moreno
Department of Clinical Sciences, Faculty of Medicine, University of Barcelona, Barcelona, Spain
Virginia Díez-Obrero, Ferran Moratalla-Navarro & Victor Moreno
Oncology Data Analytics Program, Catalan Institute of Oncology, Barcelona, Spain
Virginia Díez-Obrero, Ferran Moratalla-Navarro, Mireia Obón-Santacana & Victor Moreno
Center for Public Health Genomics, Department of Public Health Sciences, University of Virginia, Charlottesville, VA, USA
Matthew Devall & Graham Casey
Institute of Cancer and Genomic Sciences, College of Medical and Dental Sciences, University of Birmingham, Birmingham, UK
Claire Palles & Ian Tomlinson
Department of Public Health, Richard Doll Building, University of Oxford, Oxford, UK
Sarah E. W. Briggs
Department of Preventive Medicine, USC Norris Comprehensive Cancer Center, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
David V. Conti, Jane C. Figueiredo, Christopher K. Edlund, W. James Gauderman & Andre Kim
Case Comprehensive Cancer Center, Case Western Reserve University, Cleveland, OH, USA
Fredrick R. Schumacher & Li Li
Department of Population and Quantitative Health Sciences, Case Western Reserve University, Cleveland, OH, USA
Fredrick R. Schumacher
The Steve and Cindy Rasmussen Institute for Genomic Medicine, Nationwide Children’s Hospital, Columbus, OH, USA
Marilena Melas
Clalit National Cancer Control Center, Haifa, Israel
Gad Rennert, Kevin J. McDonnell, Joel K. Greenson & Flavio Lejbkowicz
Department of Community Medicine and Epidemiology, Lady Davis Carmel Medical Center, Haifa, Israel
Gad Rennert, Zomoroda Abu-Ful, Katerina Shulman & Flavio Lejbkowicz
Ruth and Bruce Rappaport Faculty of Medicine, Technion-Israel Institute of Technology, Haifa, Israel
Gad Rennert, Kevin J. McDonnell & Joel K. Greenson
Biomedicine Institute, University of León, León, Spain
Vicente Martín-Sánchez
Center for Colorectal Cancer, National Cancer Center Hospital, National Cancer Center, Gyeonggi-do, South Korea
Jae Hwan Oh
Department of Cancer Biomedical Science, Graduate School of Cancer Science and Policy, National Cancer Center, Gyeonggi-do, South Korea
Jeongseon Kim
Department of Epidemiology and Health Promotion, Graduate School of Public Health, Yonsei University, Seoul, South Korea
Sun Ha Jee & Keum Ji Jung
Department of Preventive Medicine, Chonnam National University Medical School, Gwangju, South Korea
Sun-Seog Kweon & Min-Ho Shin
Cancer Research Institute, Seoul National University, Seoul, South Korea
Aesun Shin
Department of Preventive Medicine, Seoul National University College of Medicine, Seoul, South Korea
Aesun Shin & Yoon-Ok Ahn
Department of Social and Preventive Medicine, Hallym University College of Medicine, Okcheon-dong, South Korea
Dong-Hyun Kim
Division of Cancer Epidemiology and Prevention, Aichi Cancer Center Research Institute, Nagoya, Japan
Isao Oze
Department of Epidemiology, Nagoya University Graduate School of Medicine, Nagoya, Japan
Keitaro Matsuo
Division of Molecular and Clinical Epidemiology, Aichi Cancer Center Research Institute, Nagoya, Japan
Keitaro Matsuo
Laboratory of Clinical Genome Sequencing, Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, University of Tokyo, Tokyo, Japan
Koichi Matsuda
Laboratory of Genome Technology, Human Genome Center, Institute of Medical Science, University of Tokyo, Tokyo, Japan
Chizu Tanikawa
School of Public Health, Sun Yat-sen University, Guangzhou, China
Zefang Ren
State Key Laboratory of Oncogenes and Related Genes and Department of Epidemiology, Shanghai Cancer Institute, Renji Hospital, Shanghai Jiaotong University School of Medicine, Shanghai, China
Yu-Tang Gao
State Key Laboratory of Oncology in South China, Cancer Center, Sun Yat-sen University, Guangzhou, China
Wei-Hua Jia
Centre for Epidemiology and Biostatistics, Melbourne School of Population and Global Health, University of Melbourne, Melbourne, VIC, Australia
John L. Hopper, Mark A. Jenkins, Aung Ko Win & Graham G. Giles
Department of Epidemiology, School of Public Health and Institute of Health and Environment, Seoul National University, Seoul, South Korea
John L. Hopper
Department of Laboratory Medicine and Pathology, Mayo Clinic Arizona, Scottsdale, AZ, USA
Rish K. Pai
Department of Medicine, Samuel Oschin Comprehensive Cancer Institute, Cedars-Sinai Medical Center, Los Angeles, CA, USA
Jane C. Figueiredo
Division of Oncology, Department of Medicine, Cedars-Sinai Cancer Research Center for Health Equity, Los Angeles, CA, USA
Robert W. Haile
Lunenfeld Tanenbaum Research Institute, Mount Sinai Hospital, University of Toronto, Toronto, ON, Canada
Steven Gallinger
Division of Biomedical Sciences, Memorial University of Newfoundland, St. John, ON, Canada
Michael O. Woods
School of Public Health, University of Washington, Seattle, WA, USA
Polly A. Newcomb
City of Hope National Medical Center, Translational Genomics Research Institute, Phoenix, AZ, USA
David Duggan
Institute of Medical Genetics, Cardiff University, Cardiff, UK
Jeremy P. Cheadle
MRC Clinical Trials Unit, Medical Research Council, Cardiff, UK
Richard Kaplan
Department of Oncology, University of Oxford, Oxford, UK
Rachel Kerr
Radcliffe Department of Medicine, University of Oxford, Oxford, UK
David Kerr
Department of Surgical Oncology, University Hospital for Tumors, Sestre milosrdnice University Hospital Center, Zagreb, Croatia
Iva Kirac
Department of Pathology, Central Finland Health Care District, Jyväskylä, Finland
Jan Böhm
Central Finland Health Care District, Jyväskylä, Finland
Jukka-Pekka Mecklin
Department of Health and Welfare, Finnish Institute for Health and Welfare, Helsinki, Finland
Pekka Jousilahti & Paul Knekt
Department of Medical and Clinical Genetics, University of Helsinki, Helsinki, Finland
Lauri A. Aaltonen, Tatiana Cajuso, Ulrika Hänninen, Johanna Kondelin, Kimmo Palin & Tomas Tanskanen
Genome-Scale Biology Research Program, University of Helsinki, Helsinki, Finland
Lauri A. Aaltonen, Tatiana Cajuso, Ulrika Hänninen, Johanna Kondelin, Kimmo Palin & Tomas Tanskanen
Department of Public Health and Welfare, Finnish Institute for Health and Welfare, Helsinki, Finland
Harri Rissanen & Satu Männistö
Faculty of Social Sciences, Tampere University, Tampere, Finland
Eero Pukkala
Finnish Cancer Registry, Institute for Statistical and Epidemiological Cancer Research, Helsinki, Finland
Eero Pukkala
Folkhälsan Research Centre, University of Helsinki, Helsinki, Finland
Johan G. Eriksson
Human Potential Translational Research Programme, National University of Singapore, Singapore, Singapore
Johan G. Eriksson
Unit of General Practice and Primary Health Care, University of Helsinki and Helsinki University Hospital, Helsinki, Finland
Johan G. Eriksson
Department of Surgery, Abdominal Centre, Helsinki University Hospital, Helsinki, Finland
Laura Renkonen-Sinisalo
Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, MD, USA
Demetrius Albanes, Stephanie J. Weinstein, Sonja I. Berndt & Stephen J. Chanock
Department of Nutritional Sciences, School of Public Health, University of Michigan, Ann Arbor, MI, USA
Edward Ruiz-Narvaez
Department of Medicine, Boston University School of Medicine, Boston, MA, USA
Julie R. Palmer
Slone Epidemiology Center at Boston University, Boston, MA, USA
Julie R. Palmer
Colorectal Oncogenomics Group, Department of Clinical Pathology, University of Melbourne, Parkville, VIC, Australia
Daniel D. Buchanan
Genomic Medicine and Family Cancer Clinic, Royal Melbourne Hospital, Parkville, VIC, Australia
Daniel D. Buchanan
University of Melbourne Centre for Cancer Research, Victorian Comprehensive Cancer Centre, Parkville, VIC, Australia
Daniel D. Buchanan
Department of Epidemiology, Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA
Elizabeth A. Platz & Kala Visvanathan
Huntsman Cancer Institute and Department of Population Health Sciences, University of Utah, Salt Lake City, UT, USA
Cornelia M. Ulrich
Cancer Epidemiology Program, H. Lee Moffitt Cancer Center and Research Institute, Tampa, FL, USA
Erin Siegel
Institute of Cancer Research, Department of Medicine I, Medical University Vienna, Vienna, Austria
Stefanie Brezina & Andrea Gsur
Department of Epidemiology and Population Health, Albert Einstein College of Medicine, New York, NY, USA
Peter T. Campbell
Division of Cancer Epidemiology, German Cancer Research Center, Heidelberg, Germany
Jenny Chang-Claude
University Medical Centre Hamburg-Eppendorf, University Cancer Centre Hamburg, Hamburg, Germany
Jenny Chang-Claude
Division of Clinical Epidemiology and Aging Research, German Cancer Research Center, Heidelberg, Germany
Michael Hoffmeister, Hermann Brenner & Volker Arndt
Division of Preventive Oncology, German Cancer Research Center and National Center for Tumor Diseases, Heidelberg, Germany
Hermann Brenner
German Cancer Consortium, German Cancer Research Center, Heidelberg, Germany
Hermann Brenner
Department of Internal Medicine, University of Utah, Salt Lake City, UT, USA
Martha L. Slattery
Research Centre for Hauora and Health, Massey University, Wellington, New Zealand
John D. Potter
Department of Epidemiology and Biostatistics, School of Public Health, Imperial College London, London, UK
Kostas K. Tsilidis
Department of Hygiene and Epidemiology, University of Ioannina School of Medicine, Ioannina, Greece
Kostas K. Tsilidis
Department of Molecular Epidemiology, German Institute of Human Nutrition Potsdam-Rehbruecke, Nuthetal, Germany
Matthias B. Schulze
Institute of Nutritional Science, University of Potsdam, Potsdam, Germany
Matthias B. Schulze
Nutrition and Metabolism Branch, International Agency for Research on Cancer, World Health Organization, Lyon, France
Marc J. Gunter & Neil Murphy
Gastroenterology Department, Hospital Clínic, Institut d’Investigacions Biomèdiques August Pi i Sunyer, Centro de Investigación Biomédica en Red de Enfermedades Hepáticas y Digestivas, University of Barcelona, Barcelona, Spain
Antoni Castells, Sergi Castellví-Bel & Leticia Moreira
Department of Genetics, Stanford University, Stanford, CA, USA
Anna Shcherbina & Anshul Kundaje
Leeds Institute of Medical Research at St James’s, University of Leeds, Leeds, UK
D. Timothy Bishop
Cancer Epidemiology Division, Cancer Council Victoria, Melbourne, VIC, Australia
Graham G. Giles, Melissa C. Southey & Liesel M. FitzGerald
Precision Medicine, School of Clinical Sciences at Monash Health, Monash University, Clayton, VIC, Australia
Graham G. Giles & Melissa C. Southey
Department of Clinical Pathology, University of Melbourne, Melbourne, VIC, Australia
Melissa C. Southey
Department of Medical Oncology and Center For Precision Medicine, City of Hope National Medical Center, Duarte, CA, USA
Gregory E. Idos, Kevin J. McDonnell & Stephen B. Gruber
Department of Pathology, University of Michigan, Ann Arbor, MI, USA
Joel K. Greenson
Clalit Health Services, Personalized Genomic Service, Lady Davis Carmel Medical Center, Haifa, Israel
Flavio Lejbkowicz
Clinical Genetics Service, Department of Medicine, Memorial Sloan-Kettering Cancer Center, New York, NY, USA
Kenneth Offit
Department of Medicine, Weill Cornell Medical College, New York, NY, USA
Kenneth Offit & Zsofia K. Stadler
Kaiser Permanente Washington Health Research Institute, Seattle, WA, USA
Yu-Ru Su
Center for Gastrointestinal Biology and Disease, University of North Carolina, Chapel Hill, NC, USA
Temitope O. Keku
Department of Radiation Sciences, Oncology Unit, Umeå University, Umeå, Sweden
Bethany van Guelpen
Wallenberg Centre for Molecular Medicine, Umeå University, Umeå, Sweden
Bethany van Guelpen
Ontario Institute for Cancer Research, Toronto, ON, Canada
Thomas J. Hudson
Division of Human Genetics, Department of Internal Medicine, Ohio State University Comprehensive Cancer Center, Columbus, OH, USA
Heather Hampel & Rachel Pearlman
Division of Epidemiology, Department of Population Health, New York University School of Medicine, New York, NY, USA
Richard B. Hayes
Department of Family Medicine and Public Health, University of California San Diego, La Jolla, CA, USA
Marie Elena Martinez
Population Sciences, Disparities and Community Engagement, University of California San Diego Moores Cancer Center, La Jolla, CA, USA
Marie Elena Martinez
Fred Hutchinson Cancer Center, Seattle, WA, USA
Sushma S. Thomas
Department of Public Health and Primary Care, University of Cambridge, Cambridge, UK
Paul D. P. Pharoah
Institute of Environmental Medicine, Karolinska Institutet, Stockholm, Sweden
Susanna C. Larsson & Alicja Wolk
Taipei Medical University, Taipei, Taiwan
Yun Yen
Department of Medicine, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
Heinz-Josef Lenz
Department of Epidemiology, University of Washington School of Public Health, Seattle, WA, USA
Emily White
Center for Inherited Disease Research, Department of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, USA
Kimberly F. Doheny, Elizabeth Pugh & Tameka Shelford
Broad Institute of Harvard and MIT, Cambridge, MA, USA
Andrew T. Chan
Channing Division of Network Medicine, Brigham and Women’s Hospital and Harvard Medical School, Boston, MA, USA
Andrew T. Chan
Clinical and Translational Epidemiology Unit, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
Andrew T. Chan & Amit D. Joshi
Department of Epidemiology, Harvard T.H. Chan School of Public Health, Harvard University, Boston, MA, USA
Andrew T. Chan, David J. Hunter, Amit D. Joshi & Shuji Ogino
Department of Immunology and Infectious Diseases, Harvard T.H. Chan School of Public Health, Harvard University, Boston, MA, USA
Andrew T. Chan
Division of Gastroenterology, Massachusetts General Hospital and Harvard Medical School, Boston, MA, USA
Andrew T. Chan
Comprehensive Cancer Center, University of Puerto Rico, San Juan, Puerto Rico
Marcia Cruz-Correa
Department of Clinical Genetics, Karolinska University Hospital, Stockholm, Sweden
Annika Lindblom
Department of Molecular Medicine and Surgery, Karolinska Institutet, Stockholm, Sweden
Annika Lindblom
Nuffield Department of Population Health, University of Oxford, Oxford, UK
David J. Hunter
Department of General Surgery, University Hospital Rostock, Rostock, Germany
Clemens Schafmayer
Department of Genetics and Genome Sciences, Case Western Reserve University, Cleveland, OH, USA
Peter C. Scacheri
Department of Computer Science, Stanford University, Stanford, CA, USA
Anshul Kundaje
Department of Medicine and Epidemiology, University of Pittsburgh Medical Center, Pittsburgh, PA, USA
Robert E. Schoen
Department of Medicine I, University Hospital Dresden, Technische Universität Dresden, Dresden, Germany
Jochen Hampe
Department of Medicine, Memorial Sloan-Kettering Cancer Center, New York, NY, USA
Zsofia K. Stadler
Department of Molecular Biology of Cancer, Institute of Experimental Medicine of the Czech Academy of Sciences, Prague, Czech Republic
Pavel Vodicka, Ludmila Vodickova & Veronika Vymetalkova
Faculty of Medicine and Biomedical Center in Pilsen, Charles University, Pilsen, Czech Republic
Pavel Vodicka, Ludmila Vodickova & Veronika Vymetalkova
Institute of Biology and Medical Genetics, First Faculty of Medicine, Charles University, Prague, Czech Republic
Pavel Vodicka, Ludmila Vodickova & Veronika Vymetalkova
Department of Surgery, University of Tennessee Health Science Center, Memphis, TN, USA
David Shibata
Departments of Cancer Biology and Genetics and Internal Medicine, Comprehensive Cancer Center, Ohio State University, Columbus, OH, USA
Amanda Toland
Departments of Medicine and Genetics, Case Comprehensive Cancer Center, Case Western Reserve University and University Hospitals of Cleveland, Cleveland, OH, USA
Sanford Markowitz
Division of Human Nutrition and Health, Wageningen University and Research, Wageningen, The Netherlands
Franzel van Duijnhoven
Division of Human Nutrition, Wageningen University and Research, Wageningen, The Netherlands
Edith J. M. Feskens
Division of Research, Kaiser Permanente Northern California, Oakland, CA, USA
Lori C. Sakoda
Genomic Medicine Group, Galician Public Foundation of Genomic Medicine, Servicio Galego de Saude, Santiago de Compostela, Spain
Manuela Gago-Dominguez
Instituto de Investigación Sanitaria de Santiago de Compostela, Santiago de Compostela, Spain
Manuela Gago-Dominguez
Candiolo Cancer Institute FPO-IRCCS, Candiolo, (TO), Italy
Barbara Pardini
Italian Institute for Genomic Medicine, Candiolo Cancer Institute FPO-IRCCS, Candiolo, (TO), Italy
Barbara Pardini
Menzies Institute for Medical Research, University of Tasmania, Hobart, TAS, Australia
Liesel M. FitzGerald
National University Cancer Institute, Singapore, Cancer Science Institute of Singapore, National University of Singapore, Singapore, Singapore
Soo Chin Lee
Broad Institute of MIT and Harvard, Cambridge, MA, USA
Shuji Ogino
Cancer Immunology Program, Dana-Farber Harvard Cancer Center, Boston, MA, USA
Shuji Ogino
Department of Pathology, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, USA
Shuji Ogino
Department of Biostatistics, University of Washington, Seattle, WA, USA
Ross Prentice
Service de Génétique Médicale, Centre Hospitalier Universitaire Nantes, Nantes, France
Stéphane Bézieau
Division of Epidemiology, National Cancer Center Institute for Cancer Control, National Cancer Center, Tokyo, Japan
Taiki Yamaji & Motoki Iwasaki
Division of Cohort Research, National Cancer Center Institute for Cancer Control, National Cancer Center, Tokyo, Japan
Norie Sawada & Motoki Iwasaki
Cancer Center, University of Hawaii, Honolulu, HI, USA
Loic Le Marchand
Preventative Medicine, University of Southern California, Los Angeles, CA, USA
Anna H. Wu
USC Norris Comprehensive Cancer Center, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
Chenxu Qu & Caroline E. McNeil
Van Andel Research Institute, Grand Rapids, MI, USA
Gerhard Coetzee
MRC Human Genetics Unit, Institute of Genomics and Cancer, University of Edinburgh, Edinburgh, UK
Caroline Hayward
Lothian Birth Cohorts group, Department of Psychology, University of Edinburgh, Edinburgh, UK
Ian J. Deary & Sarah E. Harris
Centre for Global Health, Usher Institute, University of Edinburgh, Edinburgh, UK
Evropi Theodoratou
Department of Pathology, National University Hospital, National University Health System, Singapore, Singapore
Li Yin Ooi
Epithelial Biology Center and Department of Cell and Developmental Biology, Vanderbilt University School of Medicine, Nashville, TN, USA
Ken S. Lau
Department of Genetics, Yale School of Medicine, New Haven, CT, USA
Hongyu Zhao
Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA
Hongyu Zhao
Department of Biostatistics, School of Public Health, University of Washington, Seattle, WA, USA
Li Hsu
Department of Epidemiology, University of Washington, Seattle, WA, USA
Ulrike Peters

Authors

Zhishan Chen
View author publications
You can also search for this author in PubMed Google Scholar
Xingyi Guo
View author publications
You can also search for this author in PubMed Google Scholar
Ran Tao
View author publications
You can also search for this author in PubMed Google Scholar
Jeroen R. Huyghe
View author publications
You can also search for this author in PubMed Google Scholar
Philip J. Law
View author publications
You can also search for this author in PubMed Google Scholar
Ceres Fernandez-Rozadilla
View author publications
You can also search for this author in PubMed Google Scholar
Jie Ping
View author publications
You can also search for this author in PubMed Google Scholar
Guochong Jia
View author publications
You can also search for this author in PubMed Google Scholar
Jirong Long
View author publications
You can also search for this author in PubMed Google Scholar
Chao Li
View author publications
You can also search for this author in PubMed Google Scholar
Quanhu Shen
View author publications
You can also search for this author in PubMed Google Scholar
Yuhan Xie
View author publications
You can also search for this author in PubMed Google Scholar
Maria N. Timofeeva
View author publications
You can also search for this author in PubMed Google Scholar
Minta Thomas
View author publications
You can also search for this author in PubMed Google Scholar
Stephanie L. Schmit
View author publications
You can also search for this author in PubMed Google Scholar
Virginia Díez-Obrero
View author publications
You can also search for this author in PubMed Google Scholar
Matthew Devall
View author publications
You can also search for this author in PubMed Google Scholar
Ferran Moratalla-Navarro
View author publications
You can also search for this author in PubMed Google Scholar
Juan Fernandez-Tajes
View author publications
You can also search for this author in PubMed Google Scholar
Claire Palles
View author publications
You can also search for this author in PubMed Google Scholar
Kitty Sherwood
View author publications
You can also search for this author in PubMed Google Scholar
Sarah E. W. Briggs
View author publications
You can also search for this author in PubMed Google Scholar
Victoria Svinti
View author publications
You can also search for this author in PubMed Google Scholar
Kevin Donnelly
View author publications
You can also search for this author in PubMed Google Scholar
Susan M. Farrington
View author publications
You can also search for this author in PubMed Google Scholar
James Blackmur
View author publications
You can also search for this author in PubMed Google Scholar
Peter G. Vaughan-Shaw
View author publications
You can also search for this author in PubMed Google Scholar
Xiao-Ou Shu
View author publications
You can also search for this author in PubMed Google Scholar
Yingchang Lu
View author publications
You can also search for this author in PubMed Google Scholar
Peter Broderick
View author publications
You can also search for this author in PubMed Google Scholar
James Studd
View author publications
You can also search for this author in PubMed Google Scholar
Tabitha A. Harrison
View author publications
You can also search for this author in PubMed Google Scholar
David V. Conti
View author publications
You can also search for this author in PubMed Google Scholar
Fredrick R. Schumacher
View author publications
You can also search for this author in PubMed Google Scholar
Marilena Melas
View author publications
You can also search for this author in PubMed Google Scholar
Gad Rennert
View author publications
You can also search for this author in PubMed Google Scholar
Mireia Obón-Santacana
View author publications
You can also search for this author in PubMed Google Scholar
Vicente Martín-Sánchez
View author publications
You can also search for this author in PubMed Google Scholar
Jae Hwan Oh
View author publications
You can also search for this author in PubMed Google Scholar
Jeongseon Kim
View author publications
You can also search for this author in PubMed Google Scholar
Sun Ha Jee
View author publications
You can also search for this author in PubMed Google Scholar
Keum Ji Jung
View author publications
You can also search for this author in PubMed Google Scholar
Sun-Seog Kweon
View author publications
You can also search for this author in PubMed Google Scholar
Min-Ho Shin
View author publications
You can also search for this author in PubMed Google Scholar
Aesun Shin
View author publications
You can also search for this author in PubMed Google Scholar
Yoon-Ok Ahn
View author publications
You can also search for this author in PubMed Google Scholar
Dong-Hyun Kim
View author publications
You can also search for this author in PubMed Google Scholar
Isao Oze
View author publications
You can also search for this author in PubMed Google Scholar
Wanqing Wen
View author publications
You can also search for this author in PubMed Google Scholar
Keitaro Matsuo
View author publications
You can also search for this author in PubMed Google Scholar
Koichi Matsuda
View author publications
You can also search for this author in PubMed Google Scholar
Chizu Tanikawa
View author publications
You can also search for this author in PubMed Google Scholar
Zefang Ren
View author publications
You can also search for this author in PubMed Google Scholar
Yu-Tang Gao
View author publications
You can also search for this author in PubMed Google Scholar
Wei-Hua Jia
View author publications
You can also search for this author in PubMed Google Scholar
John L. Hopper
View author publications
You can also search for this author in PubMed Google Scholar
Mark A. Jenkins
View author publications
You can also search for this author in PubMed Google Scholar
Aung Ko Win
View author publications
You can also search for this author in PubMed Google Scholar
Rish K. Pai
View author publications
You can also search for this author in PubMed Google Scholar
Jane C. Figueiredo
View author publications
You can also search for this author in PubMed Google Scholar
Robert W. Haile
View author publications
You can also search for this author in PubMed Google Scholar
Steven Gallinger
View author publications
You can also search for this author in PubMed Google Scholar
Michael O. Woods
View author publications
You can also search for this author in PubMed Google Scholar
Polly A. Newcomb
View author publications
You can also search for this author in PubMed Google Scholar
David Duggan
View author publications
You can also search for this author in PubMed Google Scholar
Jeremy P. Cheadle
View author publications
You can also search for this author in PubMed Google Scholar
Richard Kaplan
View author publications
You can also search for this author in PubMed Google Scholar
Rachel Kerr
View author publications
You can also search for this author in PubMed Google Scholar
David Kerr
View author publications
You can also search for this author in PubMed Google Scholar
Iva Kirac
View author publications
You can also search for this author in PubMed Google Scholar
Jan Böhm
View author publications
You can also search for this author in PubMed Google Scholar
Jukka-Pekka Mecklin
View author publications
You can also search for this author in PubMed Google Scholar
Pekka Jousilahti
View author publications
You can also search for this author in PubMed Google Scholar
Paul Knekt
View author publications
You can also search for this author in PubMed Google Scholar
Lauri A. Aaltonen
View author publications
You can also search for this author in PubMed Google Scholar
Harri Rissanen
View author publications
You can also search for this author in PubMed Google Scholar
Eero Pukkala
View author publications
You can also search for this author in PubMed Google Scholar
Johan G. Eriksson
View author publications
You can also search for this author in PubMed Google Scholar
Tatiana Cajuso
View author publications
You can also search for this author in PubMed Google Scholar
Ulrika Hänninen
View author publications
You can also search for this author in PubMed Google Scholar
Johanna Kondelin
View author publications
You can also search for this author in PubMed Google Scholar
Kimmo Palin
View author publications
You can also search for this author in PubMed Google Scholar
Tomas Tanskanen
View author publications
You can also search for this author in PubMed Google Scholar
Laura Renkonen-Sinisalo
View author publications
You can also search for this author in PubMed Google Scholar
Satu Männistö
View author publications
You can also search for this author in PubMed Google Scholar
Demetrius Albanes
View author publications
You can also search for this author in PubMed Google Scholar
Stephanie J. Weinstein
View author publications
You can also search for this author in PubMed Google Scholar
Edward Ruiz-Narvaez
View author publications
You can also search for this author in PubMed Google Scholar
Julie R. Palmer
View author publications
You can also search for this author in PubMed Google Scholar
Daniel D. Buchanan
View author publications
You can also search for this author in PubMed Google Scholar
Elizabeth A. Platz
View author publications
You can also search for this author in PubMed Google Scholar
Kala Visvanathan
View author publications
You can also search for this author in PubMed Google Scholar
Cornelia M. Ulrich
View author publications
You can also search for this author in PubMed Google Scholar
Erin Siegel
View author publications
You can also search for this author in PubMed Google Scholar
Stefanie Brezina
View author publications
You can also search for this author in PubMed Google Scholar
Andrea Gsur
View author publications
You can also search for this author in PubMed Google Scholar
Peter T. Campbell
View author publications
You can also search for this author in PubMed Google Scholar
Jenny Chang-Claude
View author publications
You can also search for this author in PubMed Google Scholar
Michael Hoffmeister
View author publications
You can also search for this author in PubMed Google Scholar
Hermann Brenner
View author publications
You can also search for this author in PubMed Google Scholar
Martha L. Slattery
View author publications
You can also search for this author in PubMed Google Scholar
John D. Potter
View author publications
You can also search for this author in PubMed Google Scholar
Kostas K. Tsilidis
View author publications
You can also search for this author in PubMed Google Scholar
Matthias B. Schulze
View author publications
You can also search for this author in PubMed Google Scholar
Marc J. Gunter
View author publications
You can also search for this author in PubMed Google Scholar
Neil Murphy
View author publications
You can also search for this author in PubMed Google Scholar
Antoni Castells
View author publications
You can also search for this author in PubMed Google Scholar
Sergi Castellví-Bel
View author publications
You can also search for this author in PubMed Google Scholar
Leticia Moreira
View author publications
You can also search for this author in PubMed Google Scholar
Volker Arndt
View author publications
You can also search for this author in PubMed Google Scholar
Anna Shcherbina
View author publications
You can also search for this author in PubMed Google Scholar
D. Timothy Bishop
View author publications
You can also search for this author in PubMed Google Scholar
Graham G. Giles
View author publications
You can also search for this author in PubMed Google Scholar
Melissa C. Southey
View author publications
You can also search for this author in PubMed Google Scholar
Gregory E. Idos
View author publications
You can also search for this author in PubMed Google Scholar
Kevin J. McDonnell
View author publications
You can also search for this author in PubMed Google Scholar
Zomoroda Abu-Ful
View author publications
You can also search for this author in PubMed Google Scholar
Joel K. Greenson
View author publications
You can also search for this author in PubMed Google Scholar
Katerina Shulman
View author publications
You can also search for this author in PubMed Google Scholar
Flavio Lejbkowicz
View author publications
You can also search for this author in PubMed Google Scholar
Kenneth Offit
View author publications
You can also search for this author in PubMed Google Scholar
Yu-Ru Su
View author publications
You can also search for this author in PubMed Google Scholar
Robert Steinfelder
View author publications
You can also search for this author in PubMed Google Scholar
Temitope O. Keku
View author publications
You can also search for this author in PubMed Google Scholar
Bethany van Guelpen
View author publications
You can also search for this author in PubMed Google Scholar
Thomas J. Hudson
View author publications
You can also search for this author in PubMed Google Scholar
Heather Hampel
View author publications
You can also search for this author in PubMed Google Scholar
Rachel Pearlman
View author publications
You can also search for this author in PubMed Google Scholar
Sonja I. Berndt
View author publications
You can also search for this author in PubMed Google Scholar
Richard B. Hayes
View author publications
You can also search for this author in PubMed Google Scholar
Marie Elena Martinez
View author publications
You can also search for this author in PubMed Google Scholar
Sushma S. Thomas
View author publications
You can also search for this author in PubMed Google Scholar
Paul D. P. Pharoah
View author publications
You can also search for this author in PubMed Google Scholar
Susanna C. Larsson
View author publications
You can also search for this author in PubMed Google Scholar
Yun Yen
View author publications
You can also search for this author in PubMed Google Scholar
Heinz-Josef Lenz
View author publications
You can also search for this author in PubMed Google Scholar
Emily White
View author publications
You can also search for this author in PubMed Google Scholar
Li Li
View author publications
You can also search for this author in PubMed Google Scholar
Kimberly F. Doheny
View author publications
You can also search for this author in PubMed Google Scholar
Elizabeth Pugh
View author publications
You can also search for this author in PubMed Google Scholar
Tameka Shelford
View author publications
You can also search for this author in PubMed Google Scholar
Andrew T. Chan
View author publications
You can also search for this author in PubMed Google Scholar
Marcia Cruz-Correa
View author publications
You can also search for this author in PubMed Google Scholar
Annika Lindblom
View author publications
You can also search for this author in PubMed Google Scholar
David J. Hunter
View author publications
You can also search for this author in PubMed Google Scholar
Amit D. Joshi
View author publications
You can also search for this author in PubMed Google Scholar
Clemens Schafmayer
View author publications
You can also search for this author in PubMed Google Scholar
Peter C. Scacheri
View author publications
You can also search for this author in PubMed Google Scholar
Anshul Kundaje
View author publications
You can also search for this author in PubMed Google Scholar
Robert E. Schoen
View author publications
You can also search for this author in PubMed Google Scholar
Jochen Hampe
View author publications
You can also search for this author in PubMed Google Scholar
Zsofia K. Stadler
View author publications
You can also search for this author in PubMed Google Scholar
Pavel Vodicka
View author publications
You can also search for this author in PubMed Google Scholar
Ludmila Vodickova
View author publications
You can also search for this author in PubMed Google Scholar
Veronika Vymetalkova
View author publications
You can also search for this author in PubMed Google Scholar
Christopher K. Edlund
View author publications
You can also search for this author in PubMed Google Scholar
W. James Gauderman
View author publications
You can also search for this author in PubMed Google Scholar
David Shibata
View author publications
You can also search for this author in PubMed Google Scholar
Amanda Toland
View author publications
You can also search for this author in PubMed Google Scholar
Sanford Markowitz
View author publications
You can also search for this author in PubMed Google Scholar
Andre Kim
View author publications
You can also search for this author in PubMed Google Scholar
Stephen J. Chanock
View author publications
You can also search for this author in PubMed Google Scholar
Franzel van Duijnhoven
View author publications
You can also search for this author in PubMed Google Scholar
Edith J. M. Feskens
View author publications
You can also search for this author in PubMed Google Scholar
Lori C. Sakoda
View author publications
You can also search for this author in PubMed Google Scholar
Manuela Gago-Dominguez
View author publications
You can also search for this author in PubMed Google Scholar
Alicja Wolk
View author publications
You can also search for this author in PubMed Google Scholar
Barbara Pardini
View author publications
You can also search for this author in PubMed Google Scholar
Liesel M. FitzGerald
View author publications
You can also search for this author in PubMed Google Scholar
Soo Chin Lee
View author publications
You can also search for this author in PubMed Google Scholar
Shuji Ogino
View author publications
You can also search for this author in PubMed Google Scholar
Stephanie A. Bien
View author publications
You can also search for this author in PubMed Google Scholar
Charles Kooperberg
View author publications
You can also search for this author in PubMed Google Scholar
Christopher I. Li
View author publications
You can also search for this author in PubMed Google Scholar
Yi Lin
View author publications
You can also search for this author in PubMed Google Scholar
Ross Prentice
View author publications
You can also search for this author in PubMed Google Scholar
Conghui Qu
View author publications
You can also search for this author in PubMed Google Scholar
Stéphane Bézieau
View author publications
You can also search for this author in PubMed Google Scholar
Taiki Yamaji
View author publications
You can also search for this author in PubMed Google Scholar
Norie Sawada
View author publications
You can also search for this author in PubMed Google Scholar
Motoki Iwasaki
View author publications
You can also search for this author in PubMed Google Scholar
Loic Le Marchand
View author publications
You can also search for this author in PubMed Google Scholar
Anna H. Wu
View author publications
You can also search for this author in PubMed Google Scholar
Chenxu Qu
View author publications
You can also search for this author in PubMed Google Scholar
Caroline E. McNeil
View author publications
You can also search for this author in PubMed Google Scholar
Gerhard Coetzee
View author publications
You can also search for this author in PubMed Google Scholar
Caroline Hayward
View author publications
You can also search for this author in PubMed Google Scholar
Ian J. Deary
View author publications
You can also search for this author in PubMed Google Scholar
Sarah E. Harris
View author publications
You can also search for this author in PubMed Google Scholar
Evropi Theodoratou
View author publications
You can also search for this author in PubMed Google Scholar
Stuart Reid
View author publications
You can also search for this author in PubMed Google Scholar
Marion Walker
View author publications
You can also search for this author in PubMed Google Scholar
Li Yin Ooi
View author publications
You can also search for this author in PubMed Google Scholar
Ken S. Lau
View author publications
You can also search for this author in PubMed Google Scholar
Hongyu Zhao
View author publications
You can also search for this author in PubMed Google Scholar
Li Hsu
View author publications
You can also search for this author in PubMed Google Scholar
Qiuyin Cai
View author publications
You can also search for this author in PubMed Google Scholar
Malcolm G. Dunlop
View author publications
You can also search for this author in PubMed Google Scholar
Stephen B. Gruber
View author publications
You can also search for this author in PubMed Google Scholar
Richard S. Houlston
View author publications
You can also search for this author in PubMed Google Scholar
Victor Moreno
View author publications
You can also search for this author in PubMed Google Scholar
Graham Casey
View author publications
You can also search for this author in PubMed Google Scholar
Ulrike Peters
View author publications
You can also search for this author in PubMed Google Scholar
Ian Tomlinson
View author publications
You can also search for this author in PubMed Google Scholar
Wei Zheng
View author publications
You can also search for this author in PubMed Google Scholar

Contributions

Wei Zheng and Xingyi Guo conceived and supervised the study, and acquired funding. Wei Zheng, Xingyi Guo, and Zhishan Chen designed the study with significant contribution from Ran Tao. Zhishan Chen carried out the main analysis. Chao Li, Quanhu Shen, and Ken S Lau contributed to single-cell RNA-seq analysis. Yuhan Xie and Hongyu Zhao contributed to whole exome sequencing analysis. Zhishan Chen, Xingyi Guo, and Wei Zheng interpreted results with help from other authors. Zhishan Chen, Xingyi Guo, Jeroen R Huyghe, Philip J Law, Ceres Fernandez-Rozadilla, Jie Ping, Guochong Jia, Maria N Timofeeva, Minta Thomas, Stephanie L Schmit, Virginia Díez-Obrero, Matthew Devall, Ferran Moratalla-Navarro, Juan Fernandez-Tajes, Sarah E W Briggs, Victoria Svinti, Kevin Donnelly, Yingchang Lu, Fredrick R Schumacher, Stephanie J Weinstein, Kala Visvanathan, Kostas K Tsilidis, Yu-Ru Su, Robert Steinfelder, Sonja I Berndt, Sushma S Thomas, Kimberly F Doheny, Tameka Shelford, Amit D Joshi, Anshul Kundaje, Christopher K Edlund, Andre Kim, Lori C Sakoda, Stephanie A Bien, Yi Lin, Conghui Qu, Chenxu Qu, Stuart Reid, and Li Hsu analyzed the data. Xingyi Guo, Ceres Fernandez-Rozadilla, Jirong Long, Matthew Devall, Claire Palles, Kitty Sherwood, Susan M Farrington, James Blackmur, Peter G. Vaughan-Shaw, Xiao-Ou Shu, Peter Broderick, James Studd, Tabitha A Harrison, David V Conti, Marilena Melas, Gad Rennert, Mireia Obón-Santacana, Vicente Martín-Sánchez, Jae Hwan Oh, Jeongseon Kim, Sun Ha Jee, Keum Ji Jung, Sun-Seog Kweon, Min-Ho Shin, Aesun Shin, Yoon-Ok Ahn, Dong-Hyun Kim, Isao Oze, Wanqing Wen, Keitaro Matsuo, Koichi Matsuda, Chizu Tanikawa, Zefang Ren, Yu-Tang Gao, Wei-Hua Jia, John L Hopper, Mark A Jenkins, Aung Ko Win, Rish K Pai, Jane C Figueiredo, Robert W Haile, Steven Gallinger, Michael O Woods, Polly A Newcomb, David Duggan, Jeremy P. Cheadle, Richard Kaplan, Rachel Kerr, David Kerr, Iva Kirac, Jan Böhm, Jukka-Pekka Mecklin, Pekka Jousilahti, Paul Knekt, Lauri A. Aaltonen, Harri Rissanen, Eero Pukkala, Johan G Eriksson, Tatiana Cajuso, Ulrika Hänninen, Johanna Kondelin, Kimmo Palin, Tomas Tanskanen, Laura Renkonen-Sinisalo, Satu Männistö, Demetrius Albanes, Edward Ruiz-Narvaez, Julie R Palmer, Daniel D Buchanan, Elizabeth A Platz, Cornelia M Ulrich, Erin Siegel, Stefanie Brezina, Andrea Gsur, Peter T Campbell, Jenny Chang-Claude, Michael Hoffmeister, Hermann Brenner, Martha L Slattery, John D Potter, Matthias B Schulze, Marc J Gunter, Neil Murphy, Antoni Castells, Sergi Castellví-Bel, Leticia Moreira, Volker Arndt, Anna Shcherbina, D. Timothy Bishop, Graham G Giles, Melissa C. Southey, Gregory E Idos, Kevin J McDonnell, Zomoroda Abu-Ful, Joel K Greenson, Katerina Shulman, Flavio Lejbkowicz, Kenneth Offit, Temitope O Keku, Bethany van Guelpen, Thomas J Hudson, Heather Hampel, Rachel Pearlman, Richard B Hayes, Marie Elena Martinez, Paul D. P. Pharoah, Susanna C Larsson, Yun Yen, Heinz-Josef Lenz, Emily White, Li Li, Elizabeth Pugh, Andrew T Chan, Marcia Cruz-Correa, Annika Lindblom, David J Hunter, Clemens Schafmayer, Peter C Scacheri, Robert E Schoen, Jochen Hampe, Zsofia K Stadler, Pavel Vodicka, Ludmila Vodickova, Veronika Vymetalkova, W. James Gauderman, David Shibata, Amanda Toland, Sanford Markowitz, Stephen J Chanock, Franzel van Duijnhoven, Edith JM Feskens, Manuela Gago-Dominguez, Alicja Wolk, Barbara Pardini, Liesel M FitzGerald, Soo Chin Lee, Shuji Ogino, Charles Kooperberg, Christopher I Li, Ross Prentice, Stéphane Bézieau, Taiki Yamaji, Norie Sawada, Motoki Iwasaki, Loic Le Marchand, Anna H Wu, Caroline E McNeil, Gerhard Coetzee, Caroline Hayward, Ian J Deary, Sarah E Harris, Evropi Theodoratou, Marion Walker, Li Yin Ooi, Qiuyin Cai, Malcolm G Dunlop, Stephen B Gruber, Richard S Houlston, Victor Moreno, Graham Casey, Ulrike Peters, Ian Tomlinson, and Wei Zheng recruited patients and collected samples. Zhishan Chen, Xingyi Guo, and Wei Zheng wrote the manuscript with substantial contributions from Ceres Fernandez-Rozadilla, Jie Ping, Guochong Jia, Jirong Long, Xiao-Ou Shu, Richard S Houlston, and Ian Tomlinson. All authors have reviewed and approved the final manuscript.

Corresponding author

Correspondence to Wei Zheng.

Ethics declarations

Competing interests

Antoni Castells is a consultant to Bayer Pharma AG, Boehringer Ingelheim and Pfizer Inc. for work unrelated to this manuscript. Anna Shcherbina is an employee at Insitro, including consulting fees from BMS. Heather Hampel is SAB for Invitae Genetics, Promega and Genome Medical, Stock/Stock options for Genome Medical and GI OnDemand. Rish K Pai collaborates with Eli Lilly, AbbVie, Allergan, Verily and Alimentiv, which includes consulting fees (outside the submitted work). Stephanie A Bien has a financial interest in Adaptive Biotechnologies. Stephen B Gruber is co-founder, Brogent International LLC. One of Zsofia K Stadler’s immediate family members serves as a consultant in ophthalmology for Alcon, Adverum, Gyroscope Therapeutics Limited, Neurogene and RegenexBio (outside the submitted work). Victor Moreno has research projects and owns stocks of Aniling. The remaining authors declare no competing interests.

Peer review

Peer review information

Nature Communications thanks Jyotsna Batra and Juliet French for their contribution to the peer review of this work. A peer review file is available.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information

Peer Review File

Description of Additional Supplementary Files

Supplementary Data 1-22

Reporting Summary

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Reprints and permissions

About this article

Cite this article

Chen, Z., Guo, X., Tao, R. et al. Fine-mapping analysis including over 254,000 East Asian and European descendants identifies 136 putative colorectal cancer susceptibility genes. Nat Commun 15, 3557 (2024). https://doi.org/10.1038/s41467-024-47399-x

Download citation

Received: 16 October 2023
Accepted: 26 March 2024
Published: 26 April 2024
DOI: https://doi.org/10.1038/s41467-024-47399-x

Comments

By submitting a comment you agree to abide by our Terms and Community Guidelines. If you find something abusive or that does not comply with our terms or guidelines please flag it as inappropriate.

Subjects

Abstract

Similar content being viewed by others

Introduction

Results

Identification of independent association signals with CRC risk

Identification of credible causal variants (CCVs) for independent association signals

Identification of target genes for CCVs

Using scRNA-seq data to evaluate gene expression pattern by cell types

Using whole exome sequencing data to evaluate pathogenic variants in target genes with CRC risk

Biological significance of the target genes for CCVs

Discussion

Methods

GWAS data and meta-analysis

Identifying independent association signals

Identifying a set of CCVs of each independent signal

RNA-seq data analysis

Cis-expression/methylation quantitative loci (cis-eQTL/mQTL) analysis

Meta-analysis of cis-eQTL/mQTL results

Colocalization analyses between GWAS association signals and eQTL/mQTL signals

Functional annotation of CCVs

In silico prediction of regulatory element-to-gene

Credible set of susceptibility genes

Single-cell RNA-sequencing data analysis

Burden test for credible susceptibility genes

Pathway analysis of credible susceptibility genes

Reporting summary

Data availability

Code availability

References

Acknowledgements

Author information

Authors and Affiliations

Contributions

Corresponding author

Ethics declarations

Competing interests

Peer review

Peer review information

Additional information

Supplementary information

Rights and permissions

About this article

Cite this article

Share this article

Comments

Search

Quick links