Introduction

Oxytocin is a neuropeptide secreted from the central nervous system and has similar functions with the antidiuretic hormone vasopressin1. In addition to its functions in the physiology of uterus and milk secretion, oxytocin has been shown to affect carcinogenesis1. A former in vitro study has demonstrated the mitogenic effects of oxytocin on MCF7 cells indicating the possible role of this neuropeptide in the growth of breast cancer cells2. Yet, another study in MCF7 and T47D breast cancer cells has shown the inhibitory effect of oxytocin on estrogen-associated cell growth. This neuropeptide has also been shown to promote the suppressive impact of tamoxifen on cell proliferation. Moreover, expression of oxytocin receptor has been detected in these cell lines and MDA-MB-231 cells3. Subsequent investigations have verified anti-proliferative effects of oxytocin and have demonstrated the role of cyclic adenosine monophosphate protein kinase A in the mediation of these effects4. Further experiments in animal models of breast cancer have also verified such effects5. As a G protein-coupled receptor, oxytocin receptor exemplifies a fascinating target for cancer treatment since it partakes in the development of in breast cancer and is expressed by numerous breast cancer cell lines6. Yet, the underlying mechanisms of involvement of oxytocin receptor and its related pathways are not completely understood. In the present project, we have identified mRNA coding genes and long non-coding RNAs (lncRNAs) which are associated with this pathway through an in-silico strategy, then measured their expression in a cohort of Iranian females affected with this type of malignancy7. We hypothesized that oxytocin-related lncRNAs are involved in the pathogenesis of different histopathological types of breast cancer.

Materials and methods

Bioinformatics methods

GSE54002 dataset was downloaded from Gene Expression Omnibus database and preprocessed in R version 3.6.1 using limma package version 3.40.6. This dataset was selected as it contains expression data of an appropriate number of clinical samples prepared by laser capture microdissection (417 patients with breast cancer and 16 non-tumor tissues) (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE54002). Gene expression matrix was obtained using the log2 values. Then, data was normalized using limma package. Differentially expressed genes (DEGs) between tumoral and normal tissues were assessed using Bayes methods and limma package. Raw P values were corrected using Benjamini and Hochberg methods. Cut-off criteria for identification of DEGs were P < 0.05 and logFC > 2 for up-regulated genes and logFC < -2 for down-regulated genes. Pathway Enrichment Analyses of DEGs were performed using https://amp.pharm.mssm.edu/Enrichr and KEGG database. PPI network was depicted and hub genes were recognized using STRING (https://string-db.org) and Cytoscape v3.8.1. Then, from the down-regulated genes, those being associated with oxytocin pathway were selected. Finally, lncRNAs associated with these genes were chosen based on the results of Khalil et al. study (GSE16226)8.

Enrolled individuals

Expression of oxytocin-related genes were assessed in 69 pairs of breast cancer specimens and their matched nearby tissues. Samples were gathered from Farmanieh and Sina hospitals during 2017–2020, Tehran, Iran. The study protocol was approved by the ethical committee of Shahid Beheshti University of Medical Science and the study protocol was performed in accordance with the relevant guidelines (IR.SBMU.MSP.REC.1398.1010). Patients’ samples were excised before any chemotherapy or radiotherapy. Medical records were gathered to obtain histopathological and clinical data. Informed written consent forms were obtained from study participants.

Expression assays

All tissue sections were subjected to RNA extraction using the RiboEx kit (GeneAll, Seoul, South Korea). Afterwards, 70–100 ng of RNA was used for production of cDNA using the ExcelRT Reverse Transcription Kit II (SMOBIO, Taiwan). Expressions of genes in breast cancer samples and nearby non-cancerous tissues were measured in the ABI step one plus PCR machine. Expression levels were normalized to transcripts of GAPDH. RealQ Plus 2 × PCR Master Mix (Ampliqon, Odense, Denmark) was used for making the reactions. Primers and amplicons characteristics are shown in Table 1.

Table 1 Primers and amplicons characteristics.

Statistical analyses

Statistical analyses were executed in the R environment. Transcript quantities of oxytocin-related genes were measured in relation to the HPRT1 reference gene using the equation: \(\frac{am{p}_{gene}^{-C{T}_{gene}}}{am{p}_{House keeping}^{-C{T}_{House keeping}}}\) . Afterwards, the acquired values were log2 transformed and utilized for subsequent analysis.

A comparison was made between non-cancerous and tumor tissues of patients, and the significance of the difference between mean values was appraised using the paired t-test. Correlations between expression levels of oxytocin-related genes were appraised through the calculation of Spearman correlation coefficients. In order to appraise of the diagnostic power of genes, receiver operating characteristic (ROC) curves were depicted. ROC curves were depicted using the methods described previously9, 10. For this purpose, Bayesian Generalized Linear Model (BayesGLM), Generalized Linear Model (GLM), and Linear Discriminant Analysis (LDA) were used to compute the sensitivity and specificity of each model. GLM is a generalization of linear regression with no constraint on the distribution models of response variables. BayesGLM is an approach to GLM using Bayesian inference, and LDA aims to find a linear combination of features that separates two or more classes of objects or events. Log 2 values of transcript quantities of all genes were used as the predictive features to train three machine learning models with tenfold cross validation to avoid overfitting. Area under curve (AUC) metric was computed to pick the best model. Finally, BayesGLM model was selected based on the previous test, and the model was trained for each gene separately to test the distinguishing power of specific genes. Chi-square test was used to assess the association between demographic/clinical data and transcript levels of oxytocin-associated genes. Genes with log2FC ≥ 1 (tumor tissues vs. non-cancerous tissues) were considered as up-regulated and those with log2FC ≤ − 1 were considered as down-regulated. The level of significance was set at P value < 0.05.

Results

Bioinformatics step

The in-silico method has led to identification of a number of down-regulated genes in cancerous tissues compared with non-cancerous tissues (Fig. 1).

Figure 1
figure 1

The interaction networks for down-regulated genes as predicted by STRING and visualized in Cytoscape. Hub genes and their neighbors are shown.

KEGG pathways analysis revealed oxytocin signaling pathway as the most significant enriched pathway of the down-regulated genes (Fig. 2).

Figure 2
figure 2

KEGG pathways of the enriched pathways of the down-regulated genes11.

Expression assays

Figure 3 depicts the relative expression levels of oxytocin related genes in breast cancer samples and nearby non-cancerous tissues.

Figure 3
figure 3

Relative expression levels of oxytocin related genes in breast cancer samples and nearby non-cancerous tissues. Minimum, maximum and interquartile range are shown. Purple dots represent expression of each gene in each sample. Outliers are shown as black dots.

Expression levels of OXTR, FOS, ITPR1, RCAN1, CAMK2D, CACNA2D and lnc_ZFP161 were significantly down-regulated in the breast cancer tissues compared with nearby non-cancerous tissues. On the other hand, expression of lnc_MTX2 was higher in breast cancer tissues compared with controls. Expressions of lnc_TNS1 and lnc_FOXF1 were not different between these two kinds of samples (Table 2).

Table 2 Detailed parameters of expression analysis of oxytocin-related genes in breast cancer samples compared with nearby non-cancerous tissues.

Association between expression of genes and clinical data

Then, we appraised the association between expression levels of oxytocin-associated genes and a number of clinical and demographic data such as cancer stage and grade, age, mitotic rate, tumor size and hormone receptor status. Expression of CACNA2D was associated with mitotic rate and PR status (P values = 3.02E−02 and 2.53E−02, respectively). Expression of other oxytocin-related genes was not associated with these parameters (Table 3).

Table 3 Association between expression of oxytocin-related genes and clinical/demographic data.

ROC curves

Figure 4 demonstrates the efficacy of three predictive models in predicting the diagnostic power of oxytocin-related genes and the obtained AUC values for each gene. ROC curves were depicted using Log 2 values of transcript quantities of all genes as the predictive features to train three machine learning models (LDA, BayesGLM and GLM) with tenfold cross validation. AUC metric was computed to pick the best model. Finally, BayesGLM model was selected based on the previous test, and the model was trained for each gene separately to test the distinguishing power of specific genes.

Figure 4
figure 4

Efficacy of three predictive models in predicting the diagnostic power of oxytocin-related genes (A) and the obtained AUC values for each gene (B).

FOS and ITPR1 had the highest AUC value among the oxytocin-related genes. Combination of expression profile of all oxytocin-related genes increased the AUC value to 0.75. However, the combinatorial sensitivity and specificity values were lower than some individual genes (Table 4).

Table 4 Detailed characteristics of ROC curves.

In breast cancer tissues, the most robust correlations have been detected between lnc_ZFP161/ lnc_FOXF1, CAMK2D/ lnc_ZFP161 and CAMK2D / lnc_FOXF1 (r = 0.86, 0.71 and 0.64 respectively) (Fig. 5A). In the non-cancerous tissues, the strongest correlation was detected between lnc_FOXF1/lnc_MTX2 and lnc_ZFP161/CAMK2D respectively (r = 0.78 and 0.65). (Fig. 5B).

Figure 5
figure 5

Correlations between expression levels of oxytocin-associated genes.

Discussion

Breast cancer is a complex disorder in which several molecular mechanisms are involved. Immunology regulations may also affect breast cancer development and immunodeficiency may promote adaptive alterations of host gut- and tissue-based microbiome12. LncRNAs can affect several aspects in this regard. Several lines of evidence such as the structural and genomic relation to vasopressin, co-expression of oxytocin and vasopressin, and the mitogenic effects of these hormones connected oxytocin to carcinogenesis1. Moreover, breastfeeding has been shown to decrease the risk of a number of cancers and particularly breast cancer, with elongated periods of breastfeeding being associated with a progressive reduction in the risk of this cancer13, 14. Meanwhile, oxytocin has been shown to affect immune regulation15, thus possibly influencing breast cancer pathogenesis via different routes. Despite these observations, the underlying mechanisms involving oxytocin in breast carcinogenesis are not elucidated. Based on the importance of lncRNAs in the regulation of cancer-related pathways, we aimed at identifying the oxytocin-related lncRNAs through an in silico approach and assessed expression of a number of oxytocin-related mRNAs and lncRNAs in breast cancer samples. We identified down-regulation of OXTR, FOS, ITPR1, RCAN1, CAMK2D, CACNA2D and lnc_ZFP161, and up-regulation of lnc_MTX2 in the breast cancer tissues compared with nearby non-cancerous tissues. In line with our observation, Ariana et al. have reported lower expression of OXTR in breast cancer tissues compared with para-cancerous tissues. They also detected high expression of oxytocin in breast cancer patients16. Based on the results of in vitro and in vivo investigations regarding the role of oxytocin as a peptide with bivariate biological functions, Imanieh et al. have hypothesized that oxytocin acts as either an activator or inhibitor of growth through activating OXTR in tumoral cells17. The observed down-regulation of FOS in breast cancer samples is in line with the study of Fisler, which reported association between higher FOS expression and better survival of patients with breast cancer. Moreover, higher levels of FOS target apoptosis-effector gene have been associated with improved survival of these patients. Based on these results, authors have suggested that FOS is a pro-apoptotic protein18. In addition to the functional association with oxytocin-related pathways, ITPR1 has a regulatory role on autophagy and sensitivity to chemotherapeutic agents in cancer cells19. Therefore, its down-regulation in breast cancer cells might influence several aspects of breast carcinogenesis. RCAN1 has been suggested as a super-enhancer-driven tumor suppressor whose down-regulation enhances the malignant features of breast cancer cells20. CAMK2D is a kinase that regulates several cellular processes, such as proliferation, differentiation and apoptosis. Chi et al. have reported higher levels of CAMK2D expression and phosphorylation in breast cancer samples compared with non-cancerous samples21. This finding is in contrast with the reported expression pattern of CAMK2D mRNA in the current study. Further assessment of expression levels of this gene at both mRNA and protein levels is necessary for solving this controversy. We also detected down-regulation of the calcium channel coding gene CACNA2D in breast cancer samples and its association with mitotic rate and PR status. Former studies have reported that breast cancer cells can attain a selective growth advantage through modulating ion channel expression or function. These channels have also been shown to participate in the prominent features of this cancer22. However, the specific role of CACNA2D has not been elucidated. Future functional studies are required to clarify this point.

We also assessed the diagnostic value of oxytocin-related genes in breast cancer. FOS and ITPR1 had the highest AUC value among the oxytocin-related genes. Combination of expression profile of all oxytocin-related genes increased the AUC value to 0.75. However, the combinatorial sensitivity and specificity values were lower than some individual genes. We recommend appraisal of expression of these genes in the peripheral blood of patients with breast cancer to unravel their diagnostic potential.

Finally, appraisal of correlation between expression levels of oxytocin-related genes has led to identification of specific patterns in cancerous and non-cancerous tissues. In breast cancer tissues, the most robust correlations have been detected between lnc_ZFP161/lnc_FOXF1, CAMK2D/lnc_ZFP161 and CAMK2D/lnc_FOXF1. In the non-cancerous tissues, the strongest correlation was detected between lnc_FOXF1/Lnc_MTX2 followed by lnc_ZFP161 and CAMK2D. Taken together, oxytocin-associated genes have been dysregulated in breast cancer tissues. Moreover, the correlation between these genes is influenced by the presence of cancer, as correlation coefficients between gene pairs were different in tumoral and non-tumoral tissues.

The current study used a combination of bioinformatics and gene expression methods. Bioinformatics methods have been extensively used to find appropriate targets for experimental assessment of gene expressions. A common strategy is to collect all related public expression-profiling of microarray and RNA-sequencing data using appropriate criteria and to combine them to construct co-expression network to identify hub mRNA/lncRNAs along with using PPI network analysis23, 24. However, in the current study, we only selected one dataset. Although selection of this dataset was based on the appropriateness of included samples and methods, additional datasets could also be used for this purpose. So, we proposed future assessment of the results of this study using these datasets.

Although deep learning method is a very promising way to predict prognosis for cancer based on biomarkers, an important prerequisite for efficient deep learning models is the large number of samples in proportion of the number of parameters in the model. Here, in the statistical part of the study, we aimed to validate the selected markers in a case–control study with 69 specimens. So, some simpler machine learning methods were used to examine the efficacy of markers. Finally, the potential causal effects behind the association of the oxytocin-related lncRNA biomarkers with breast cancer should be verified using a statistical approach named Mendelian Randomization25.

Taken together, our study demonstrates abnormal expression levels of oxytocin-related genes in breast cancer tissues versus non-cancerous tissues and influence of cancer on the correlation network between these genes, potentiating these genes as biomarkers for breast cancer.