Introduction

Sex differences in opioid dependence (OD) have a strong biological basis1 with a genetic component2,3. Heritability estimates of heroin use are around 0.52. Although there is no report for sex-specific heritability estimates of OD specifically, heritability estimates of drug dependence are higher in men than women4. In animal models, genetic effects related to OD have differed by sex5,6,7. For example, a mouse model of the human OPRM1 (A118G) polymorphism found genotype-by-sex-specific reductions in the rewarding properties of morphine5. This variant does not, however, appear to affect OD risk in humans8. To our knowledge, the only systematic genome-wide search for the genes responsible for sex differences in human OD was a linkage study3.

Systematic search for the OD genetic risk variants (regardless of sex differences) using the genome-wide association study (GWAS) design has been reported. We published the first GWAS with genomewide significant findings9. The most compelling results in that study were the identification of genes involved in potassium signaling pathways (i.e., KCNC1 (Potassium Voltage-Gated Channel Subfamily C Member 1) and KCNG2 (Potassium Voltage-Gated Channel Modifier Subfamily G Member 2)) in the African American (AA) population, and genes involved in calcium signaling and long-term potentiation. Another GWAS was conducted in an Australian cohort, in which genetic data from opioid-dependent daily injectors were compared with that from opioid misusers who never progressed to daily injection, and identified several genomewide significant variants in CNIH3 (Cornichon Family AMPA Receptor Auxiliary Protein 3)10. In a recent study, we found that a variant on chromosome 15, rs12442183, near RGMA (Repulsive Guidance Molecule A), was genome-wide significantly associated with OD in the European American (EA) population11. RGMA encodes a central nervous system axon guidance protein called repulsive guidance molecule A. Risk allele rs12442183*T was related with higher expression of a specific RGMA transcript variant in frontal cortex. After chronic morphine injection, the homologous mouse gene, Rgma, was upregulated in the striatum of C57BL/6 J mice11.

We aimed to identify genetic variants exerting a sex difference in susceptibility to OD in a GWAS framework using the cohort of substance use disorder we have collected; characterize and annotate the identified genetic variants using publicly available databases of co-expressed genes and enrichment analysis; and use transcriptome analysis to identify biological mechanisms consistent with sex-specific effects on OD risk for the variants identified.

Materials and Methods

Subjects and the semi-structured assessment

This study analyzed data obtained from African-American (AA) and European-American (EA) participants in the Yale-Penn genetics of substance dependence cohort (total N = 8,387). The 4,944 AA and 3,443 EA samples were recruited from 2000 to 2013, as previously described9,12. These subjects represent two phases of collection: Yale-Penn-1 (N = 4,970) and Yale-Penn-2 (N = 3,417), which differed in their period of recruitment and the genotyping platforms used. The demographic characteristics of the study cohort are presented in Table 1. We assessed all subjects using the Semi-Structured Assessment for Drug Dependence and Alcoholism (SSADDA)13 and obtained certificates of confidentiality for all subjects from the National Institute on Drug Abuse (NIDA) and the National Institute on Alcohol Abuse and Alcoholism (NIAAA). The study protocol was approved by the Yale Institutional Review Board and the study was performed in accordance with the relevant guidelines and regulations. All subjects provided written informed consent

Table 1 Demographic characteristics of the study samples.

Genotype quality control, population stratification, and imputation

We genotyped the Yale-Penn-1 sample with the Illumina HumanOmni1-Quad array of approximately 988,000 single nucleotide polymorphisms (SNPs), and the Yale-Penn-2 sample with the Illumina HumanCore Exome array of approximately 266,000 exonic SNPs and 240,000 tagging SNPs. We excluded SNPs with a genotype call rate <98% or minor allele frequency (MAF) <1%.

We used the following measures to differentiate AA or EA subjects and control for population stratification. First, we conducted principal component (PC) analysis14 for the SNPs common among the genetic data for the Yale-Penn-1, Yale-Penn-2, and the 1000 Genomes phase 3 reference panel, which contains African, admixed American, European, East Asian and South Asian populations15. We then trimmed SNPs in LD (r2 > 0.2) using PLINK16. Using the remaining SNPs, we clustered the Yale-Penn subjects into different groups compared to the reference populations by the first three PCs in Euclidean space. Subjects were removed from the subsequent analyses if they were not clustered with African or European populations. Finally, we conducted a second PC analysis within each group to remove outliers greater than three standard deviations from the mean Euclidean distances. The resultant first 10 PCs were covariates in all subsequent association analyses to adjust for residual population stratification.

We imputed GWAS data to the 1000 Genomes phase 3 reference panel15, using Minimac3 implemented in the Michigan Imputation Server17. Post-imputation SNP exclusion metrics included: Hardy-Weinberg equilibrium p < 10−6, imputation accuracy < 0.8, or MAF < 5%. The final SNP counts for the subsequent association analyses were: Yale-Penn-1 sample, 8,775,706 SNPs in AAs and 6,417,418 in EAs; Yale-Penn-2 sample, 6,702,161 SNPs in AAs and 5,205,763 in EAs.

Genomewide gene-by-sex interaction scan for sex differences in OD

We performed genetic association tests for the DSM-IV diagnosis OD using a linear mixed model implemented in the program GEMMA18. A standard linear mixed-effect model was chosen to control for the relatedness among participants as our cohort contained a mixture of individuals ascertained using both an unrelated case-control design and a family design (11.9% of AAs and 7.9% of EAs came from the family design). We chose the standard linear mixed effect model implemented in GEMMA, where the effect estimates asymptotically converge to estimates of logistic regression for a binary outcome (such as OD affected versus OD unaffected here) when the sample size is large. We tested the gene-by-sex interaction effect and included the main effects of sex and SNP in addition to controlling for age, four other substance dependence diagnoses (alcohol, cocaine, nicotine, cannabis) and the first 10 PCs, to examine sex differences in genetic risk for OD. This model was examined separately for each of four datasets, i.e., both AA and EA subjects in the Yale-Penn-1 and Yale-Penn-2 samples. We then meta-analyzed the interactive effects of SNP-by-sex from Yale-Penn-1 and Yale-Penn-2 within each population using the inverse variance method that converted all the effects into a signed Z-score and tested for association using the Z-test, which was implemented in the program METAL19.

Co-expression, sex differences in gene expression, disease enrichment, and functional annotation

For the enrichment analysis, we first queried the target gene, i.e., the gene harboring the genomewide significant variants detected, to identify co-expressed genes using COXPRESdb20, a coexpression database of DNA-microarray and RNAseq-based expression data. After obtaining the top 100 co-expressed genes of the targeted gene (or genes) identified from the association analyses, we assessed functional enrichment using the WEB-based GEne SeT AnaLysis Toolkit (WebGestalt)21, a functional enrichment analysis web tool (see supplementary material for the setting of parameters). WebGestalt 2017 supports 12 organisms and 150,937 functional categories from public databases and computational analyses21. The enrichment analysis method ‘Disease Association Analysis’ of WebGestalt was used to test for enrichment of disease-associated genes among the top co-expressed genes with the targeted gene.

We studied sex differences in the expression of the targeted gene, i.e., ADGRV1, to investigate the functional roles of the top variants and the disease-enriched genes to provide insight into the molecular mechanism of sex differences in OD. To do this, we used the Genotype-Tissue Expression (GTEx) data and the tissue-specific gene expression database, the Brain eQTL Almanac, Braineac22. The RNA-Seq data of GTEx Analysis V7 (phs000424.v7.p2), included in the file, ‘GTEx_Analysis_2016-01-15_v7_RNASeQCv1.1.8_gene_tpm.gct.gz’, were downloaded from the GTEx Portal datasets23, and information about tissues and sex was retrieved from the files, ‘GTEx_v7_Annotations_SampleAttributesDS.txt’ and ‘GTEx_v7_Annotations_SubjectPhenotypesDS.txt’, respectively. We downloaded the gene expression (RNA-seq) data and the related phenotypes and analyzed the data using the general linear model, implemented in the SAS procedure “ proc glm,” where sex was modeled as the main effect in association with the levels of gene expression.

Results

Genomewide gene-by-sex interaction scan

We summarized the result for each dataset and the meta-analysis within each population in the Manhattan plots (Supplementary Figure S1). The QQ plots show no systematic bias (Figure S1). A peak with the lead SNP rs2366929 and another eight SNPs in strong LD (r2 > 0.9) with the lead SNP (r2 > 0.8) reached genomewide significance (all p’s < 5 × 10−8) in the gene-by-sex interaction scans for OD in the meta-analyzed AA samples (Fig. 1a). For lead SNP rs2366929, each subsample also showed dose-response effects of nominal significance with p = 2.19 × 10−5 and p = 1.23 × 10−5 for Yale-Penn 1 and Yale-Penn 2, respectively. This peak of genomewide significance is located at ADGRV1 (Adhesion G protein-coupled receptor V1) on chromosome 5q14.3. The gene-by-sex interaction effects exhibit a gene dose-response relationship for the OD risk in AA men but not women (Fig. 1b). That is, the proportions of OD-affected to the total subjects (OD affected plus controls), within each of the three genotypes consistently show a gene dose-response trend relationship in men but not women (for lead SNP rs2366929*(C/T), chi-square test for trend in proportions, p = 1.1 × 10−14 in men and p = 0.15 in women). Figure 1b displays the increasing trend of the OD-affected proportions for men (CC > CT > TT) compared with women for the lead SNP rs2366929. This increasing trend in men versus women was observed for all nine SNPs identified in the association analyses (Supplementary Table S1 shows the trend tests for all nine SNPs). The MAF for the lead SNP increased 10.5% for the OD-affected versus the OD-unaffected men (0.353 vs. 0.248; comparison of two proportions, p < 0.0001); however, this increase was not present in women (0.241 vs. 0.265; p = 0.15). Table 2 displays the characteristics of the nine genome-wide significant SNPs.

Figure 1
figure 1

Genomewide gene-by-sex interaction scans on opioid dependence (OD). (a) Regional association plot at the ADGRV1 (aka. GPR98) locus for the meta-analysis of the two phases of the African American (AA) sample. (b) Proportions of the OD affected for the AA sample for each genotype for the lead SNP rs2366929.

Table 2 Genomewide significant SNPs located at the ADGRV1 locus with differential effects on susceptibility to opioid dependence between male and female for the African American sample.

We identified no genomewide significant signals in the EA sample (Supplementary Figure S2) and the trans-ethnic meta-analysis of the AA and EA samples (Supplementary Figure S3).

Sex difference in ADGRV1 gene expression

We used GTEx data to identify tissues with differential ADGRV1 gene expression by sex for further annotating ADGRV1. We detected significant sex-differential ADGRV1 gene expression in breast (Bonferroni correction p < 0.002) and in heart (p < 0.0125) based on GTEx transcriptome data (dbGaP Accession phs000424.v7.p2) (Table 3). The ADGRV1 expression in heart supports a previous study of sexually dimorphic ADGRV1 expression in patients with non-ischemic human heart failure (Supplementary Table S2), in which ADGRV1 expression was lower in men than women, the same effect direction to what we found in the GTEx transcriptome. We identified four additional tissues with sex-differential ADGRV1 gene expression that were nominally significant, including brain (p = 0.018), thyroid (p = 0.021), lung (p = 0.038), and stomach (p = 0.045) (Table 3).

Table 3 ADGRV1 gene expression: Tests of sexually dimorphic expression using Genotype-Tissue Expression (GTEx) project data.

ADGRV1 co-expression and disease enrichment analysis

The top 100 genes co-expressed with ADGRV1 (Supplementary Table S3) identified by COEXPRESdb were subject to disease enrichment analysis. We found that ADGRV1 and 13 ADGRV1 co-expressed genes were significantly enriched within 10 diseases (all raw p’s < 0.01 and adjusted p’s < 0.05, Supplementary Table S4). The 13 genes are ABCC12, CFTR, CTAGE1, DAOA-AS1, EFHC2, FHL5, GRIK2, NAV3, SPO11, SYCP2, TAAR9, TPH2, and ZNF157. Among the 10 diseases, two are male-specific (male infertility and non-syndromic oligospermia), four affect males more than females (X-linked mental retardation, generalized epilepsy, Sezary syndrome, cutaneous T-cell lymphoma), three affect females more than males (chronic fatigue syndrome, panic disorder, and pseudoxanthoma elasticum), and one displays distinct disease characteristics between males and females (personality disorders).

We also correlated the expression of ADGRV1 with the expression of the 12 co-expressed disease risk genes (DAOA-AS1 expression was not available) across the 10 human brain tissues in the Braineac database. The correlation pattern of gene expression across the 10 brain tissues is displayed in Fig. 2. Robust correlations of gene expression between ADGRV1 and two genes, GRIK2 (Glutamate ionotropic receptor kainate type subunit 2) and EFHC2 (EF-hand domain containing 2), are ubiquitous across various human brain tissues. The highest correlations between the gene expressions of ADGRV1 and GRIK2 are in the substantia nigra and medulla inferior olivary nucleus and between ADGRV1 and EFHC2 in the frontal cortex and putamen (Fig. 2).

Figure 2
figure 2

The mRNA expression of ADGRV1 was associated with 12 co-expressed genes across 10 human brain tissues in the Braineac database (Ramasamy et al.22). Pairwise Pearson correlation coefficients (r) for ADGRV1 and the 12 genes are labeled in each square of the heatmap. The color bar indicates the extent of significance, \(-{\log }10(p-{values})\), for the correlation coefficients.

Discussion

To our knowledge, this is the first systematic search for genetic variants contributing to sex differences in OD risk using GWAS. We identified a male-specific effect of ADGRV1 on risk of OD in the AA sample using genome-wide gene-by-sex interaction scans. Co-expressed genes, the enrichment analysis, and the transcriptome analysis provided mechanistic support for the finding that ADGRV1 exerts a sex-specific effect on OD risk.

ADGRV1 spans approximately 605.5 kb on chromosome 5q14.3 and encodes a member of the G-protein coupled receptor superfamily. In addition to the N- and C-terminal domains of ADGRV1, the encoded protein contains 7 putative Na( + )/Ca(2 + ) exchangers (defining the cation binding domain) and 21 N-linked glycosylation sites24. Its extracellular repeat domains bind Ca(2 + ) and are involved in signal transduction24. Members of the ADGR family play key roles in regulating cortical patterning, dendrite and synapse formation, and myelination25. Using GTEx transcriptome data and the Human Protein Atlas26, we found that ADGRV1 is also highly expressed in three endocrine glands (adrenal gland, thyroid, pituitary), lung, and multiple brain regions (Supplementary Figure S4). Opioid use disrupts hypothalamic-pituitary-adrenal (HPA) dynamics at the level of the pituitary or adrenal27. Chronic opioid use can cause pervasive endocrine dysfunction, which leads to hypogonadism, infertility, fatigue, anxiety, depression, menstrual irregularities, and so on28,29. Many of these disorders are among what we have identified for this study in the top ten diseases significantly enriched by the ADGRV1 co-expressed genes, such as fatigue, infertility, anxiety or panic disorder (Supplemental Table S4).

In the co-expressed genes and enrichment analyses, all of the top 10 diseases identified demonstrate sexually dimorphic risks or manifestations. We identified ADGRV1 as a male-specific risk factor in OD. The top 10 diseases also include two that are male-specific (i.e., male infertility and non-syndromic oligospermia) or four that predominantly affect males more than females (i.e., X-linked mental retardation, generalized epilepsy, Sezary syndrome, or cutaneous T-cell lymphoma). On the other hand, three of the 10 diseases affect more females than males (chronic fatigue syndrome, panic disorder, and pseudoxanthoma elasticum)30,31. Collectively, the top 10 diseases for ADGRV1 co-expressed genes are strikingly sexually dimorphic.

In disease-association studies, ADGRV1 has been well examined. Mutations in ADGRV1 are associated with familial febrile seizures32, and also contribute to focal and generalized epilepsy33,34, and epilepsy with myoclonic seizures35. Besides, ADGRV1 is associated with cardiac conduction disorder33 and ADGRV1 variants segregated in families with epilepsy co-occurring sudden death (due to cardiac conduction disorder), showing shared ADGRV1 risk variants between epilepsy and cardiac conduction disorder33. ADGRV1 expression in the heart has been reported for patients with non-ischemic heart failure (Supplementary Table S2), where expression was lower in men than women36. This sex difference in ADGRV1 expression in the heart was replicated in the GTEx transcriptome data (Table 3), although disease status was not controlled in this analysis because of the disease heterogeneity of GTEx donors. ADGRV1 has also been implicated in adverse metabolic effects of antipsychotic drugs37. Taken together, our novel discovery of ADGRV1 for contributing to OD risk in men add into this scientific literature on the disease-gene association of the strong candidate gene ADGRV1.

In the GTEx data, individual-level data do not include the diagnosis of OD. Thus our investigation into the association between sex differences in ADGRV1 expression and those in OD is not possible in that dataset. However, ADGRV1 is highly expressed in endocrine tissues (Supplementary Figure S4); opioid use could act on the tissues in which ADGRV1 is highly expressed (i.e., endocrine and brain), disrupting the normal G-protein coupled receptor signaling and hormone production and causing pathogenic cellular processes. The differences in ADGRV1 expression could affect stress responses or produce hormonal or behavioral effects that differ by sex in opioid users.

Regarding the co-expression patterns for the enriched genes expressed in brain, the identified brain regions show supportive evidence for neuropathology. The robust correlations in gene expression between ADGRV1 and GRIK2 and EFHC2 are ubiquitous in human brain. The top correlations in gene expressions between ADGRV1 and GRIK2 are in the substantia nigra and medulla inferior olivary nucleus (Fig. 2). The substantia nigra is enriched in dopaminergic neurons and plays an important role in reward38, while the medulla inferior olivary nucleus is implicated in motor learning and control39. Another set of top correlations in the expression are between ADGRV1 and EFHC2 in frontal cortex and putamen (Fig. 2). The frontal cortex contains most of the dopamine-sensitive neurons in the cerebrum and is associated with reward, attention, short-term memory, planning, and motivation40. The putamen is involved in reinforcement learning and various movements41. These results support a role for ADGRV1 in networks of co-expressed genes that regulate the neural activities involved in addiction-related functions, including reward, memory, and learning.

As for the other co-expressed genes involved in the top correlated brain regions, TPH2 encodes tryptophan hydroxylase 2, which catalyzes the rate-limiting step in the synthesis of serotonin. Chronic morphine and cocaine exert common effects on tyrosine hydroxylase in dopaminergic brain reward regions42. Mutations in TPH2 were associated with quality of life of patients in methadone maintenance treatment for heroin use disorder43, responses to cocaine treatment44, and a spectrum of psychiatric disorders (cf. a review and meta-analysis45). GRIK2 encodes a member of the kainate family of glutamate receptor subunits. In a mouse model, Grik2 mRNA levels were decreased after prolonged morphine treatment46. These results again support a role for ADGRV1 in networks of co-expressed genes that regulate the neural activities involved in opioid or drug addiction.

The variants at ADGRV1 reaching genomewide significance for a sex-difference in OD risk were identified in AAs, but not EAs. The minor allele “C” of lead SNP rs2366929 in the International Genome Sample Resource (IGSR)15 has MAF = 0.47 for Europeans, and MAF = 0.22 for Africans, consistent with the MAFs (0.248 male; 0.265 female) we observed in our admixed AA population (Table 2). Local or population-specific variation is important in mapping disease risk47,48. Although the current information is not enough to determine why the genetic effects in ADGRV1 were not seen in EAs, there are several plausible explanations. For example, effect sizes might be smaller in EA ancestries such that larger sample size may be necessary for discovery. Further studies with larger sample sizes and more power are warranted to investigate sex-differences in OD risk.

The GEMMA model we used for the current study includes 17 main effects (i.e., age, sex, SNP, and four other substance dependence on alcohol, cocaine, marijuana, and nicotine, and ten PCs) and one two-way interaction (i.e., SNP-by-sex). In theory, a saturated model that incorporates all of the two-way interaction effects would be ideal but would involve a total of 153 terms in the model, comprised of the 17 main effects and 136 pairwise interaction effects. We chose not to apply the saturated model because in previous simulations (unpublished data) we obtained effect estimates for the targeted interactions that were generally close to those estimated by the saturated model. However, the power was greatly reduced by the inclusion of a large number of interaction terms. We opted to use the more limited model.

The strengths of the present study include a moderately large cohort of substance dependence subjects who were carefully phenotyped and whose data were rigorously analyzed. However, the study findings are limited by the absence of a suitable cohort of AAs with OD in which to replicate the findings after we made extensive efforts to locate an existing dataset sufficiently powered for replication. Despite this limitation, we believe that the findings from this first study of sex- differences in genetic risk for OD can aid in understanding the underlying differences and facilitate better sex-specific prevention and treatment efforts for OD.

In summary, we identified ADGRV1 as a risk locus contributing to increased risk of OD in AA men by examining genetic variants systematically on a genome-wide scale. Functional annotation of this finding corroborated a substantial role for ADGRV1 in increasing OD risk, especially the potential pathogenic effects of variation in AGDRV1 on cardio-cerebral mechanisms, which could contribute to the risk of fatal opioid overdose or respiratory depression that has been observed following high-dose opioid exposure. Further study of this finding is warranted.