Introduction

Homocysteine, a thiol-containing amino acid, is formed intracellularly during methionine metabolism. It is a key branch-point intermediate in the ubiquitous methionine cycle, the function of which is to generate one-carbon methyl groups for transmethylation reactions that are essential to all life forms. During the past decade, homocysteine has received increasing attention, as elevated levels of homocysteine have been implicated as an independent risk factor for cardiovascular disease (Robinson et al. 1995) and have also been associated with various other diseases and/or clinical conditions, including Alzheimer’s disease (McCaddon et al. 1998), neural tube defects (Mills et al. 1995; de la Calle et al. 2003), schizophrenia (Applebaum et al. 2004), end-stage renal disease (van Guldener and Stehouwer 2003), osteoporosis (Villadsen et al. 2005), and non-insulin-dependent diabetes (de Luis et al. 2005; Buysschaert et al. 2004). Although elevated levels of homocysteine have been associated with several diseases, the mechanism underlying the deleterious effects of homocysteine has not yet been completely elucidated. It is believed that elevated levels of homocysteine might result in the accumulation of reactive oxygen species (Starkebaum and Harlan 1986), which have been implicated as a major factor in the pathogenesis of several diseases. It has also been proposed that homocysteine can potentially disrupt critical protein–disulfide bonds, thereby altering the structure and/or function of proteins (Sengupta et al. 2001; Majors et al. 2002; Lim et al. 2003). Furthermore, an increase in the concentration of homocysteine leads to elevated levels of S-adenosyl homocysteine, an inhibitor of many methyltransferases, resulting in altered methylation of genes that subsequently leads to modulation of gene expression (Ping et al. 2000).

In healthy, well-nourished individuals, homocysteine metabolism is well regulated, and the concentration of plasma total homocysteine (tHcy) ranges between 5 and 12 μM. However, deficiencies of the enzymes and/or the cofactors involved in the metabolism of homocysteine can lead to its aberrant intracellular processing, thereby elevating its level. Among the genes involved in the metabolism of homocysteine, methylenetetrahydrofolate reductase (MTHFR) plays a key role. MTHFR converts methylenetetrahydrofolate to methyltetrahydrofolate, the primary methyl donor in the transmethylation reaction in which homocysteine is converted to methionine. Defects in the MTHFR gene can thus be expected to result in elevated levels of plasma homocysteine. Two common polymorphisms in the MTHFR gene, C677T (in exon 4; A222V) and A1298C (in exon 7; E429A), have been reported to decrease the activity of MTHFR, resulting in increased levels of homocysteine (Weisberg et al. 1998). In addition to the genetic factors, environmental factors, including diet, also influence the levels of tHcy (Mann et al. 1999). Folate and vitamin B12 are the two critical nutritional factors, deficiency of which results in hyperhomocysteinemia. Interestingly, a majority of the Indian population is known to be deficient in vitamin B12, presumably due to adherence to a strict vegetarian diet (Refsum et al. 2001). Thus, these two polymorphisms may be expected to have a greater impact in relation to hyperhomocysteinemia in the Indian population. Despite the susceptibility of the Indian population to elevated homocysteine levels (presumably due to dietary habits), there are no reports to date in which the effect of the two MTHFR polymorphisms on homocysteine levels was studied in the same cohort.

Herein we report the effect of MTHFR gene polymorphism (C677T; A1298C) on the homocysteine and cysteine levels of 200 individuals recruited at the All India Institute of Medical Sciences, New Delhi, India. In addition, we report the frequencies of these two polymorphisms in 19 Indian populations selected on the basis of their linguistic lineage and geographical location.

Materials and methods

Subjects

From September 2004 to April 2005, 203 subjects were recruited at the All India Institute of Medical Sciences as part of a project to evaluate the role of homocysteine in coronary artery disease. The ethics committees of both the All India Institute of Medical Sciences and the Institute of Genomics and Integrative Biology approved this study. Written consent was obtained from all the participants, and the study was carried out in accordance with the principles of the Helsinki Declaration. All the subjects except six belonged to the Indo-European linguistic group; these six belonged to the Dravidian linguistic group. A detailed questionnaire was filled out for each subject, which included information about the subject’s smoking habits, height and weight measurements, etc. Since a majority of the Indian population adheres to a strict vegetarian diet, information about diet was also obtained. People who do not consume any animal product other than milk products were classified as vegetarians.

Apart from these, samples collected as a part of the Indian Genome Variation project (The Indian Genome Variation Consortium 2005) were also genotyped to obtain the frequencies of MTHFR polymorphism in the general population. In the Indian Genome Variation project, samples are being collected from various populations in the country. In general, the Indian population can be, to a large extent, sub-structured on the basis of their ethnic origin as well as their linguistic lineages. Linguistically, the Indian population can be classified into four major families: Indo-European, Dravidian, Tibeto-Burman, and Austro-Asiatic (The Indian Genome Variation Consortium 2005). Populations are selected based on certain criteria that include linguistic lineages, ethnicity, geographical location, etc. For the present study, samples were randomly selected from these populations. Samples were collected from individuals unrelated at least to the first cousin level. This study obtained prior ethical clearance from the Institutional Bioethics Committee for the collection of samples, following the guidelines of Indian Council of Medical Research. Prior to sample collection, a uniform bar-coded detailed questionnaire was developed containing information pertaining to ethnicity, family history of diseases, and other phenotypic traits of the sample donor.

Blood sample collection

Blood samples were collected from volunteers into tubes containing anticoagulant, and plasma was separated from the blood samples within an hour of collection. The plasma samples were then stored at −80°C until further analysis. For the polymorphism studies, DNA was isolated from the blood samples after removal of the plasma. For the population-based studies, blood samples were kept at 4°C until DNA was isolated. DNA was isolated from whole blood samples within 3–4 days of collection using the modified salting-out method described below.

DNA isolation and PCR

Genomic DNA was isolated from blood samples using the modified salting-out method as described by Miller et al. (1988), and stored at −20°C until further analysis. In brief, 10 ml of blood was treated with 40 ml RBC lysis buffer and mixed gently for 10 min. It was then centrifuged for 10 min at 2,500 rpm. Supernatant was discarded and 12 ml of Nuclei lysis buffer was added, followed by the addition of 0.8 ml of 10% sodium dodecyl sulfate and 50 μl of Proteinase K (20 mg/ml). This mix was incubated at 65°C for 2 h. Four milliliters of 6 M sodium chloride was added to this mix, followed by vigorous shaking for 15 s. This was then immediately centrifuged at 2,500 rpm for 25 min. Supernatant was carefully taken without disturbing the pellet. Two volumes of absolute ethanol kept at room temperature were added to this supernatant, and DNA was obtained by inverting the mix 10–15 times slowly. DNA was washed with 70% ethanol by centrifuging at 13,000 rpm for 10 min. Supernatant was removed carefully without disturbing the pellet. This pellet was resuspended in Tris–EDTA buffer (pH 8.0) and stored at −20°C until further analysis. Polymerase chain reaction (PCR) was carried out using GeneAmp PCR system 9700 (Applied Biosystems, Foster City, CA, USA), in a total volume of 15 μl with 1.5 mM MgCl2, 0.1 mM of each dNTP (Amersham Biosciences, NJ, USA), 30 ng of genomic DNA, 0.1 pmol/μl of each forward and reverse primer (Supplementary Table 1), 0.03 U/μl of Taq DNA polymerase (Bangalore Genie, India) and the buffer recommended by the supplier. PCR products were purified using ExoSAP, in which 1 U each of shrimp alkaline phosphatase (SAP) (Amersham Biosciences, NJ, USA) and Exonuclease I (Amersham Biosciences, NJ, USA) was used. PCR products were incubated at 37°C for 2 h and then 85°C for 15 min. 
For genotyping using the SNaPshot method, 2 μl of this purified PCR amplified product and 2 pmol of primer (Supplementary Table 1) were added to the SNaPshot Multiplex Ready Reaction mix (ABI, Foster City, CA, USA). After extension, the product was treated with SAP as before to remove the excess fluorescent dye terminator. An aliquot (1 μl) of this product was added to 9 μl of Hi-Di formamide (ABI, Foster City, CA) and the mixture was analyzed by electrophoresis in a DNA analyzer (ABI Prism 3700). The software GeneScan analysis (ABI, Foster City, CA, USA) was used to analyze the results.
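
The final concentrations quoted for the PCR can be converted into per-reaction amounts by multiplying by the 15 μl reaction volume. The short sketch below does only that arithmetic (roughly 22.5 nmol MgCl2, 1.5 nmol of each dNTP, 1.5 pmol of each primer, and 0.45 U of Taq polymerase per reaction); the constant and function names are illustrative and do not come from the original protocol.

```python
# Per-reaction amounts implied by the final concentrations stated in the text.
# All names here are illustrative; only the numeric values come from the text.
REACTION_VOLUME_UL = 15.0  # total PCR volume

# 1 mM equals 1 nmol/ul, so mM x ul gives nmol.
MGCL2_MM = 1.5
DNTP_EACH_MM = 0.1
PRIMER_PMOL_PER_UL = 0.1
TAQ_U_PER_UL = 0.03


def amount_in_reaction(concentration_per_ul, volume_ul=REACTION_VOLUME_UL):
    """Amount of one component in a single reaction (concentration x volume)."""
    return concentration_per_ul * volume_ul


mgcl2_nmol = amount_in_reaction(MGCL2_MM)
dntp_each_nmol = amount_in_reaction(DNTP_EACH_MM)
primer_each_pmol = amount_in_reaction(PRIMER_PMOL_PER_UL)
taq_units = amount_in_reaction(TAQ_U_PER_UL)

print(f"MgCl2: {mgcl2_nmol:.2f} nmol")
print(f"each dNTP: {dntp_each_nmol:.2f} nmol")
print(f"each primer: {primer_each_pmol:.2f} pmol")
print(f"Taq polymerase: {taq_units:.2f} U")
```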

Table 1 Characteristics of the study group subjects

HPLC analysis

Plasma levels of homocysteine and cysteine were determined using high performance liquid chromatography (HPLC) with fluorescence detection as described earlier (Ji et al. 1995). Briefly, 0.1 ml of plasma was treated with 0.035 ml of 1.43 M sodium borohydride in 0.10 M sodium hydroxide (to reduce oxidized thiols), followed by the addition of 0.035 ml of 1.0 M HCl. To this, 0.05 ml of 7 mM monobromobimane in 5 mM sodium EDTA (pH 7.0) was added (to conjugate the reduced thiols with the fluorophore) and the solution was incubated at 42°C for 12–15 min. Plasma proteins were then precipitated by the addition of 0.050 ml of 1.5 M perchloric acid, followed by centrifugation at 12,000 rpm for 10 min. The supernatant was then transferred to injector vials for automated HPLC analysis. HPLC measurements were performed on an Agilent 1100 system with a reverse-phase C18 column (5 μm bead size; 4.6 mm×150 mm; Phenomenex Inc., Torrance, CA, USA). Standard curves were generated with known amounts of homocysteine and cysteine to calculate the concentration of these thiols in the plasma.
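
The standard-curve step can be illustrated with a minimal sketch: a straight line is fitted to peak area versus known thiol concentration, and the fit is inverted to read off the concentration of an unknown sample. The calibration points below are invented for illustration, and the sketch assumes the standards are carried through the same derivatization steps as the plasma so that no separate dilution correction is needed; neither assumption is taken from the original method.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def concentration_from_area(area, slope, intercept):
    """Invert the calibration line to get concentration from a peak area."""
    return (area - intercept) / slope


# Hypothetical calibration data (concentration in uM, peak area in
# arbitrary units); these values are NOT from the study.
standard_conc_um = [0.0, 5.0, 10.0, 20.0, 40.0]
standard_area = [2.0, 52.0, 102.0, 202.0, 402.0]

slope, intercept = fit_line(standard_conc_um, standard_area)

# Hypothetical peak area measured for an unknown plasma sample.
sample_conc_um = concentration_from_area(122.0, slope, intercept)
print(f"slope={slope:.3f}, intercept={intercept:.3f}, sample={sample_conc_um:.2f} uM")
```

In practice a separate curve would be fitted for each thiol (homocysteine and cysteine), since their fluorescence responses need not be the same.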

Statistical analysis

The Kruskal–Wallis test was used to compute the statistical significance of differences in quantitative parameters (tHcy level, etc.) between multiple groups. Comparison between two groups was performed by the Mann–Whitney U-test. The effect of the interaction between diet and genotype, or between the genotypes of the MTHFR C677T and A1298C polymorphisms, on tHcy, tCys, or total thiol (tHcy + tCys) levels was analyzed by univariate analysis of variance under a general linear model. Depending on the number of samples in each group, genotypic associations were tested under additive as well as dominant and recessive models. Haplotypes were constructed from the genotypes of the two polymorphic markers by using PHASE (http://linkage.rockefeller.edu). Genotypes were checked for conformance to Hardy–Weinberg equilibrium.
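
As an illustration of the Hardy–Weinberg check, the sketch below compares observed genotype counts with the counts expected from the estimated allele frequencies using a chi-square statistic with one degree of freedom. This is one common way to perform the check and is not necessarily the procedure used in the study; the genotype counts are hypothetical.

```python
import math


def hwe_chi_square(n_major_hom, n_het, n_minor_hom):
    """Chi-square test (1 df) for deviation from Hardy-Weinberg equilibrium.

    Returns (chi2, p_value). Assumes the locus is polymorphic in the sample.
    """
    n = n_major_hom + n_het + n_minor_hom
    p = (2 * n_major_hom + n_het) / (2 * n)  # major allele frequency
    q = 1.0 - p                              # minor allele frequency
    if p == 0.0 or q == 0.0:
        raise ValueError("locus is monomorphic; the test is undefined")
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_major_hom, n_het, n_minor_hom)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical genotype counts (NOT from the study): an excess of homozygotes.
chi2, p_value = hwe_chi_square(60, 80, 60)
print(f"chi2={chi2:.2f}, p={p_value:.4f}")
```

With small expected counts, as for a rare homozygous genotype, an exact test is generally preferred over the chi-square approximation.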

Results

The main objective of the present study was to ascertain the correlation of plasma tHcy and cysteine levels with MTHFR polymorphisms and/or dietary habits.

The subjects were divided into two groups based on their dietary habits: (a) vegetarians (those adhering to strict vegetarian diet) and (b) non-vegetarians. The characteristics of the subjects studied in these two groups are shown in Table 1. The concentration of plasma homocysteine was significantly higher in vegetarians than in non-vegetarians. However, none of the other factors, like BMI, waist–hip ratio, or plasma cysteine concentrations differed significantly between the two groups.

To check the allelic frequencies of these two common MTHFR polymorphisms (C677T and A1298C) in the study population, and to evaluate whether they are associated with homocysteine and/or cysteine levels, we genotyped these individuals for the two polymorphisms. The distribution of genotypes of the two polymorphisms along with the plasma homocysteine and cysteine levels is given in Table 2. The minor allele frequency (MAF) of MTHFR C677T was found to be 0.15. Furthermore, only six of the 202 samples (2.9%) had TT genotype. Although the TT genotype of MTHFR C677T had a higher median homocysteine level than the CC genotype, the difference was not statistically significant. The MAF of MTHFR A1298C was found to be 0.43, with 18.2% of individuals having homozygous CC genotype. In contrast to MTHFR C677T, significant association was found between MTHFR A1298C polymorphism and plasma tHcy and cysteine concentrations under the additive model (Table 2). Under the assumption of a recessive model, this polymorphism was found to be associated with plasma homocysteine (P=0.005) but not plasma cysteine levels. Significant association was found between the sum of homocysteine and cysteine levels and MTHFR A1298C under this model (P=0.026). Analysis under the recessive model could not be done for MTHFR C677T due to the small number of individuals (six) with TT genotype. Since diet and MTHFR A1298C polymorphism are both associated with homocysteine levels, we checked if this polymorphism was associated with homocysteine levels irrespective of the dietary factor. For this, we analyzed the genotypic association independently in the two dietary groups, and found that this polymorphism was significantly associated with homocysteine levels for individuals consuming a non-vegetarian diet under both additive (P=0.003, Table 3) and recessive models (P=0.001). However, no association was observed in individuals adhering to a vegetarian diet (Table 3).
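
The relation between minor allele frequency and homozygote frequency can be made concrete with a short sketch. Under Hardy–Weinberg proportions, the expected fraction of minor-allele homozygotes is the square of the MAF, so the reported MAFs of 0.15 and 0.43 correspond to about 2.3% and 18.5%, close to the observed 2.9% and 18.2%. The function names and the genotype counts in the last example are illustrative only.

```python
def maf_from_counts(n_major_hom, n_het, n_minor_hom):
    """Minor allele frequency from genotype counts."""
    n = n_major_hom + n_het + n_minor_hom
    return (2 * n_minor_hom + n_het) / (2 * n)


def expected_minor_homozygote_fraction(maf):
    """Expected minor-homozygote fraction under Hardy-Weinberg proportions."""
    return maf ** 2


# MAF and observed homozygote fraction as reported in the text.
reported = {
    "C677T": (0.15, 0.029),
    "A1298C": (0.43, 0.182),
}

for name, (maf, observed) in reported.items():
    expected = expected_minor_homozygote_fraction(maf)
    print(f"{name}: expected {expected:.4f}, observed {observed:.4f}")

# Hypothetical genotype counts (NOT from the study) to show the MAF formula.
example_maf = maf_from_counts(50, 40, 10)
print(f"example MAF: {example_maf:.2f}")
```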

Table 2 Distribution of genotypes along with plasma thiol concentration [median (range)] for MTHFR polymorphisms
Table 3 Distribution of genotypes with plasma thiol concentration [median (range)] for MTHFR C677T and A1298C polymorphisms in vegetarian and non-vegetarian subjects

From the results mentioned thus far it is clear that homocysteine levels were associated with both a vegetarian diet and the MTHFR A1298C polymorphism. We then examined the effect of diet–genotype and genotype–genotype interactions on plasma homocysteine levels. The genotype–diet interaction did not significantly influence plasma homocysteine or cysteine levels (data not shown). Furthermore, we did not find any significant influence of genotype–genotype interaction on homocysteine levels (Table 2). Linkage disequilibrium between these two markers was found to be very low in the study population (D′=0.56; r²=0.04). Haplotype construction using PHASE revealed that CA was the most frequent haplotype (0.45), followed by CC (0.40), TA (0.12), and TC (0.03).
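
The linkage disequilibrium measures can be recomputed from the four haplotype frequencies using the standard definitions of D, D′, and r². The sketch below applies them to the rounded frequencies quoted above and gives D′ of about 0.53 and r² of about 0.04; the small difference from the reported D′ of 0.56 may simply reflect rounding of the haplotype frequencies. The function and argument names are illustrative.

```python
def ld_stats(f11, f12, f21, f22):
    """Return (D, D_prime, r_squared) for two biallelic loci.

    f11..f22 are haplotype frequencies: the first index is the allele at
    locus 1 and the second index is the allele at locus 2.
    """
    p1 = f11 + f12  # frequency of allele 1 at locus 1
    q1 = f11 + f21  # frequency of allele 1 at locus 2
    d = f11 - p1 * q1
    if d >= 0:
        d_max = min(p1 * (1 - q1), (1 - p1) * q1)
    else:
        d_max = min(p1 * q1, (1 - p1) * (1 - q1))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    r_squared = d * d / (p1 * (1 - p1) * q1 * (1 - q1))
    return d, d_prime, r_squared


# Haplotype frequencies from the text, written as (677 allele, 1298 allele):
# CA = 0.45, CC = 0.40, TA = 0.12, TC = 0.03.
d, d_prime, r_squared = ld_stats(f11=0.45, f12=0.40, f21=0.12, f22=0.03)
print(f"D={d:.4f}, D'={d_prime:.2f}, r2={r_squared:.2f}")
```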

We also assessed the frequencies of MTHFR C677T and A1298C in the Indian population as a whole, to check whether the genotypic frequencies obtained using hospital-based samples were a true reflection of the frequencies of these polymorphisms in the Indian population. Keeping in view the linguistic and geographical diversity in India, we selected 19 populations belonging to the four linguistic lineages from different geographical locations encompassing the entire country (Fig. 1). Samples were genotyped for the two MTHFR SNPs C677T (834 alleles) and A1298C (894 alleles), and this included a minimum of 20 alleles from each population (Table 4). The MAF of C677T was less than 0.1 in most of the populations. Region-wise, the highest MAF was observed in the North Indian population, where three of the four populations studied had a MAF of more than 0.1. This is in agreement with our hospital-based study (MAF C677T 0.15), where most of the individuals were from Northern India. Two populations in the North Indian group, one belonging to the Tibeto-Burman linguistic group and the other to the Indo-European group, showed MAFs comparable to those of the Caucasian and other Asian populations (Table 4). Likewise, another population belonging to the Indo-European linguistic group from the Western region showed a MAF of >0.2.

Fig. 1 Map of India showing the populations (red circles) that were included for this study. DR Dravidian, AA Austro-Asiatic, IE Indo-European, TB Tibeto-Burman, C central, E eastern, W western, N northern, S southern, NE northeastern, IP isolated population, LP large population, SP special population

Table 4 Distribution of two MTHFR polymorphisms in different ethnic groups of India

In contrast to MTHFR C677T, the MAF of the A1298C polymorphism in most of the populations studied was comparable to that in other populations (Table 4). In fact, in some of the populations the MAF was higher than that reported for the Caucasian population. Furthermore, the CC genotype of A1298C is more prevalent (19.46%) in the Indian population than in other populations. Interestingly, in at least four of the 19 populations studied the “C” was the major allele. This is notable, as the first published sequence of human MTHFR cDNA also carried the C nucleotide (Weisberg et al. 1998).

Discussion

Hyperhomocysteinemia has been associated with various complex disorders. The levels of plasma homocysteine can potentially be elevated by environmental (including dietary) and/or genetic factors. Among the dietary factors, folate and vitamin B12 play a critical role in the metabolism of homocysteine. Since a majority of the Indian population is deficient in vitamin B12 (Refsum et al. 2001), presumably due to adherence to a strict vegetarian diet, we hypothesized that polymorphisms in the genes responsible for the metabolism of homocysteine would have a greater impact in relation to hyperhomocysteinemia in the Indian population.

The two common polymorphisms in the MTHFR gene, C677T and A1298C, decrease the enzyme activity, thereby elevating homocysteine levels. Weisberg et al. reported that in heterozygotes for the C677T polymorphism the MTHFR activity is decreased by approximately 30%, whilst in homozygous individuals the activity decreases by about 60% (Weisberg et al. 1998). In contrast to C677T, heterozygotes for the A1298C polymorphism show only a 10% reduction in the activity of MTHFR, whilst homozygotes for this polymorphism show a reduction of 35–45% in the activity of the enzyme. Thus, both these polymorphisms can lead to a decrease in enzyme activity, and as a consequence can potentially elevate the levels of homocysteine. The C677T polymorphism lies in the catalytic domain of the enzyme, whilst the A1298C polymorphism is in the C-terminal regulatory domain. Since C677T lies in the catalytic domain, it is assumed that this polymorphism has a more dramatic effect on the activity of the enzyme. However, the A1298C polymorphism can affect the S-adenosyl methionine mediated regulation of the enzyme (Weisberg et al. 1998).

We thus wanted to check whether these two polymorphisms, in conjunction with diet, are associated with the levels of plasma homocysteine and cysteine. Cysteine levels were determined because intracellular homocysteine can be easily converted to cysteine via the transsulfuration pathway, and excess cysteine can be transported into the circulation through various amino acid transporters. Thus, it might be important to determine the effect of a polymorphism on the levels of both homocysteine and cysteine. We divided the subjects into two categories based on their diet: vegetarian and non-vegetarian. This was done with the assumption that people consuming vegetarian diets will have lower vitamin B12 levels than those consuming non-vegetarian diets. Refsum et al. reported that about 75% of the subjects studied (from a hospital in Pune, India) had metabolic signs of cobalamin deficiency, which was at least partly due to the vegetarian diet (Refsum et al. 2001). We indeed found a significant association between the consumption of a vegetarian diet and plasma homocysteine levels. This is in agreement with the study by Cappuccio et al. (2002), who found that vegetarians had higher levels of homocysteine than non-vegetarians. No correlation was found between plasma homocysteine levels and other demographic parameters such as BMI and waist circumference.

Several studies in different populations have reported that the MTHFR 677TT genotype is associated with high levels of homocysteine (Frosst et al. 1995; Christensen et al. 1997; Herrmann et al. 2003; Amouzou et al. 2004; Mager et al. 2005). The MTHFR C677T homozygous genotype is associated with premature coronary artery disease and other cardiovascular disorders (Inbal et al. 1999). It has also been reported to be associated with neural tube defects, preeclampsia, and other complications of pregnancy. The TT genotype is present at a frequency of about 9% in the Caucasian population and 15–16% in the Chinese and Japanese populations. However, in the Indian population this genotype is present at a much lower frequency (2.9%). Furthermore, although homocysteine levels in plasma were found to be higher in subjects with the TT genotype of the MTHFR C677T polymorphism in the Indian population, this difference was not statistically significant. This is probably due to the small number of subjects with the TT genotype in this population. Thus, the impact of this genotype can be expected to be low in the Indian population. Our findings are in agreement with Chambers and Kooner (2001), who reported that although plasma homocysteine levels were elevated in Indians living in the UK, the TT genotype was not associated with homocysteine concentrations.

In contrast to the MTHFR C677T polymorphism, individuals homozygous for the A1298C polymorphism had significantly higher homocysteine levels. Only a few studies have examined MTHFR polymorphisms in relation to homocysteine levels in the Indian population, and to the best of our knowledge there are only two reports on the A1298C polymorphism in individuals of Indian origin. However, in both these studies homocysteine levels were not determined. In one of the studies, Angeline et al. (2004) suggested a possible role of MTHFR A1298C in the pathogenesis of heart disease. The study population included 72 individuals from Tamil Nadu (South India), of whom 52 were acute myocardial infarction patients and 20 were controls. Another case–control study included individuals of Indian origin residing in South Africa and suffering from myocardial infarction. However, in this study, no difference was observed between cases and controls for either the A1298C or C677T polymorphisms (Ranjith et al. 2003). Thus, this is the first study in which the effect of MTHFR polymorphisms (C677T and A1298C) on homocysteine levels has been studied in the Indian population. Interestingly, the percentage of the homozygous CC genotype of A1298C obtained in the two studies mentioned above (approximately 15%) is similar to that obtained in this study. The frequency of the CC genotype is much higher than the frequency reported in Caucasian (9.4%), Chinese (3.3%), or Japanese (1.6%) populations.

India comprises ethnically, geographically, and genetically diverse populations, with 4,693 communities with several thousand endogamous groups, 325 functioning languages, and 25 scripts (Singh 2002). In general, the Indian population can be, to a large extent, sub-structured on the basis of ethnic origin as well as linguistic lineage. Linguistically, the Indian population can be classified into four major families: Indo-European, Dravidian, Tibeto-Burman, and Austro-Asiatic. The Indo-European and Dravidian languages are spoken in the northern and southern parts of the subcontinent, respectively (Gadgil et al. 1998). The Tibeto-Burman speakers, concentrated in the northern and northeastern parts of the country, are thought to have immigrated to India from Burma (now Myanmar) and Tibet (Guha 1935). Austro-Asiatic speakers are exclusively tribals, and are dispersed mostly in the central and eastern parts of the country.

Because of this diversity among the populations inhabiting different regions of India, we extended our study to determine the frequencies of these two polymorphisms in 19 populations selected on the basis of their linguistic lineage and geographical location. The overall MAF for C677T was found to be 0.129 in these populations. Interestingly, in 12 of the 19 populations screened the MAF of this polymorphism was less than 0.1; in 8 of these populations the MAF was less than or equal to 0.05, which in effect means that MTHFR C677T is not polymorphic in these populations. Furthermore, there were only three populations in which the homozygous genotype was found. This also agrees with the data obtained from the hospital samples, where the MAF of MTHFR C677T was found to be 0.15, with 2.9% of the samples having the TT genotype. Thus, these data suggest that this polymorphism may be of limited relevance in the Indian population. In contrast to the MTHFR C677T polymorphism, the MAF for the MTHFR A1298C polymorphism was found to be 0.38 in the Indian population, which is similar to that found in several other populations. However, the percentage of the homozygous mutant genotype was considerably higher in the Indian population than in other populations (approximately 15–20% vs 6–10%).

Thus, from our study we conclude that a strict vegetarian diet elevates the level of homocysteine. We also found that the frequency of the MTHFR C677T polymorphism is low in the Indian population, and that it is not associated with plasma homocysteine or cysteine levels. Plasma homocysteine and cysteine levels are significantly associated with the MTHFR A1298C polymorphism. However, when the dietary factor was taken into account, this polymorphism showed significant association only in individuals consuming a non-vegetarian diet. Thus, irrespective of the MTHFR genotype, individuals adhering to a vegetarian diet have high homocysteine levels. Our observation that the MTHFR A1298C polymorphism is significantly associated with homocysteine levels under the recessive model, along with the fact that the CC genotype is present at a higher frequency in the Indian population, makes it highly relevant in terms of its potential impact on hyperhomocysteinemia. However, MTHFR polymorphisms and diet alone cannot entirely explain the elevated levels of homocysteine in every individual of our cohort. Thus, it may be important to study different dietary, environmental, and lifestyle factors along with polymorphisms of other genes involved in the metabolism of homocysteine.