Introduction

Glycophorin A (GPA) is a sialoglycoprotein expressed on erythrocyte membranes that carries the MN blood group antigens M and N. The polymorphic differences at the first and fifth amino acids from the amino terminus of GPA (M, Ser1/Gly5; N, Leu1/Glu5) arise from three nucleotide changes in exon 2 of the GPA gene. We have previously classified standard M and N alleles into several allele types based on differences in the nucleotide sequence of the GPA gene (Akane et al. 1997, 2000; Mizukami et al. 2002). The M allele was first divided into MG and MT alleles (Akane et al. 1997). Further study allowed us to classify the MG, MT, and N alleles as M10X, M20X, and N10X, where the first letter (M/N), the second digit (1/2), and the third and fourth digits (0X) represent antigenicity, major variation, and minor variation, respectively (Akane et al. 2000; Mizukami et al. 2002): M101 is the standard allele expressing M antigen, and alleles M102 and M201 are minor and major variants of M101, respectively. To date, we have found ten alleles, including four new variants, that can express normal M or N antigen (Shows et al. 1987). Sequencing many more samples from various races will allow us to identify more classes of alleles. A systematic classification of GPA gene alleles is thus required.
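The naming scheme described above can be illustrated with a minimal Python sketch. The function and type names (`parse_allele`, `cluster`, `AlleleName`) are hypothetical and introduced only for illustration; the sketch assumes that every allele name consists of one letter (M or N) followed by exactly three digits, as in the examples given in the text.

```python
import re
from typing import NamedTuple


class AlleleName(NamedTuple):
    antigen: str  # first letter: antigenicity, "M" or "N"
    major: int    # second digit: major variation (cluster)
    minor: int    # third and fourth digits: minor variation


# One letter (M or N), one non-zero digit, then two digits, e.g. "M101".
_ALLELE_PATTERN = re.compile(r"^([MN])([1-9])(\d{2})$")


def parse_allele(name: str) -> AlleleName:
    """Split an allele name such as 'M101' into its three components."""
    match = _ALLELE_PATTERN.match(name)
    if match is None:
        raise ValueError(f"not a valid allele name: {name!r}")
    return AlleleName(match.group(1), int(match.group(2)), int(match.group(3)))


def cluster(name: str) -> str:
    """Return the cluster label (e.g. 'M10X') that an allele belongs to."""
    allele = parse_allele(name)
    return f"{allele.antigen}{allele.major}0X"
```

Under these assumptions, `parse_allele("M201")` yields antigen M, major variation 2, and minor variation 1, and `cluster("M202")` returns `"M20X"`. Temporary names such as N2 are rejected, since they do not follow the four-character pattern.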

Materials and methods

Genomic DNA samples were prepared by phenol/chloroform extraction from peripheral blood leukocytes obtained following informed consent from volunteer Japanese donors. MN serological phenotyping and genotyping were performed as reported previously (Nakayashiki and Sasaki 1996; Akane et al. 1997; Li et al. 1998; Sasaki et al. 2000). Two new N alleles had been named temporarily N2 and NV (Sasaki et al. 2000) and are named N201 and N104, respectively, in this paper. These N variants and a new M allele, M103, were analyzed using three M101/N2, two M201/N2, one N101/N2, one M103/NV, and one N101/NV heterozygous samples. The six samples containing the N2 allele were from unrelated individuals, and the M103/NV and N101/NV samples were from a mother and her child, respectively. Another N variant, named N103, was also sequenced using a sample genotyped as N101/N103.

Allele fragments from each DNA sample were amplified with Z-Taq polymerase (Takara, Otsu, Japan) and sequenced directly by the dye-terminator method on an ABI 310 automated DNA sequencer (Applied Biosystems, Foster City, CA). GPA gene-specific and allele-specific primers were either designed in our previous studies (Akane et al. 2000; Mizukami et al. 2002) or newly designed in this study.

Phylogenetic analysis was performed to clarify whether the alleles are major or minor variations of the standard alleles. The maximum likelihood method was used to delineate the relationships among all alleles. Using the computer program PHYLIP 3.5 (Mizukami et al. 2002), the distances between alleles were calculated using 10,704 bp of sequence from exon 1 to exon 7, excluding most of intron 1. The results were used to construct a phylogenetic tree.
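To make the notion of a distance between alleles concrete, the following Python sketch computes the proportion of differing sites (p-distance) for every pair of aligned sequences. This is a deliberately simpler measure than the maximum likelihood procedure in PHYLIP and is not the calculation used in this study. The function names are hypothetical, and the sketch assumes that the sequences are already aligned to equal length with gaps written as `-`.

```python
from itertools import combinations
from typing import Dict, Tuple


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap ('-') are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for base_a, base_b in zip(seq_a.upper(), seq_b.upper()):
        if base_a == "-" or base_b == "-":
            continue  # ignore gapped sites
        compared += 1
        if base_a != base_b:
            differences += 1
    if compared == 0:
        raise ValueError("no ungapped sites to compare")
    return differences / compared


def distance_matrix(seqs: Dict[str, str]) -> Dict[Tuple[str, str], float]:
    """Pairwise p-distances for all pairs of named sequences."""
    return {
        (name_a, name_b): p_distance(seqs[name_a], seqs[name_b])
        for name_a, name_b in combinations(sorted(seqs), 2)
    }
```

A matrix of this kind could serve as input to a distance-based tree-building method, whereas a maximum likelihood analysis evaluates candidate trees directly under a substitution model.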

Results

Table 1 shows the sequencing results of the ten alleles. M or N antigenicity was determined according to the 22nd, 34th and 35th nucleotides in exon 2. Sequences from exon 4 to exon 7 of M102 and N102 were the same as those of N103 and M101, respectively. The M103 allele was found in one M/NV sample: the M allele in this sample was initially regarded as M101, but a G/A substitution was detected at position 55 in intron 3. The sequences of the M20X alleles (M201 and M202) were similar in part to those of N10X.

Table 1 Nucleotide substitutions in alleles relative to M101. E Exon, I intron. Numbers indicate nucleotide positions from the 5′ end of each exon or intron. Negative numbers in I1 (intron 1) indicate positions from the 3′ end. − deletion of nucleotide

N103 matched N101 over the whole sequence except for three successive mutations at positions 1373–1375 in intron 5. Therefore, the sequence downstream from exon 4 of the N103 allele is identical to that of M102. The N104 (NV) allele differed from N101 by only one substitution at position 23 in intron 2 (A→G). In exon 2 of N201 (N2), two base changes were found at positions 1 and 56, with the former nucleotide substitution, C→A, resulting in an amino acid substitution from alanine to glutamic acid. Moreover, N201 had 12 M101-type and 7 N101-type mutations.

The phylogenetic tree (Fig. 1) demonstrates that the M and N alleles first diverged from their ancestral allele; M10X and M20X then branched from the M ancestor, and N10X and N20X branched from the N ancestor. This result is in agreement with the nomenclature of MN alleles that we propose.

Fig. 1 Phylogenetic tree of glycophorin A (GPA) MN alleles. There is a possibility that the effects of recombination or some mutually incompatible sites were disregarded

Discussion

In our previous studies, M and N alleles were classified into six variations, provisionally called MN*M101, M102, M201, M202, N101 and N102, from the sequencing results of a total of 12,576 bp from the 5′-flanking region to exon 7 of the GPA gene, with the exception of 30 kb of intron 1 (Akane et al. 1997, 2000; Mizukami et al. 2002). These studies also revealed sequence data for new alleles M103, N103, N104 (NV) and N201 (N2). In these ten alleles, the first letter (M/N), the second digit, and the third and fourth digits express antigenicity (the three-nucleotide polymorphism in exon 2 of the GPA gene), major variation, and minor variation of the gene sequence, respectively.

The region around exon 3 (from intron 2 to 3) was reported to include a crossing-over hot spot responsible for some variant glycophorins (Vignal et al. 1989; Kudo et al. 1990; Huang et al. 2000; Storry et al. 2000). In a previous study (Sasaki et al. 2000), the M102 and N102 alleles were considered to have been generated via recombination between the 5′-region of M and the 3′-region of N alleles around the hot spot. This study revealed that N103 and M102 share the same sequence downstream of the region 3′ of intron 3. N103 was classified as a minor variation of N101 by phylogenetic analysis (Fig. 1). Based on these results, M102 might have been generated by recombination between M101 and N103 after generation of N103 from N101 via three successive point mutations in intron 5. Alternatively, the three mutations might have occurred in M102 and N103 in parallel and individually, because the mutations were also found in M201 and M202. These hypotheses should be examined in further studies.

The N104 (NV) allele was reported to differ from N101 by only one substitution at position 23 in intron 2 (Sasaki et al. 2000). In the present study no other mutations were detected. As shown by the phylogenetic analysis (Fig. 1), the N104 allele is a minor variation of N101.

In exon 2 of the N201 (N2) allele, two substitutions were found, at positions 1 and 56. The C→A substitution at the first nucleotide results in an amino acid substitution from alanine to glutamic acid, which is, however, located in the leader peptide region. The other base change, C→T at position 56, is a silent mutation in a threonine codon. Therefore, neither of these two mutations affects the antigenicity of the mature GPA. N201 possessed many mutations observed in the M101, N101 and M20X alleles, as well as four N201-specific mutations. Phylogenetic analysis revealed that N201 arose from N101 after separation of M101 and N101 (Fig. 1). The M101-type and M20X-type mutations found in N201 are likely to have occurred in parallel with those in the M101 or M20X alleles.
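The distinction between the two exon 2 substitutions can be checked against the standard genetic code, as in the Python sketch below. The codons used in the usage note are illustrative assumptions, because the text gives only the base changes and the amino acids involved: an alanine codon GCA is assumed for the C→A change (GCG would behave the same way), and a threonine codon ACC for the C→T change. The function name `classify_substitution` is hypothetical.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    first + second + third: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, first in enumerate(BASES)
    for j, second in enumerate(BASES)
    for k, third in enumerate(BASES)
}


def classify_substitution(codon: str, position: int, new_base: str):
    """Apply a single-base change to a codon and classify its effect.

    position is 0-based within the codon. Returns a tuple of
    (original amino acid, new amino acid, kind), with one-letter codes,
    where kind is 'silent', 'missense' or 'nonsense'.
    """
    codon = codon.upper()
    mutated = codon[:position] + new_base.upper() + codon[position + 1:]
    before = CODON_TABLE[codon]
    after = CODON_TABLE[mutated]
    if before == after:
        kind = "silent"
    elif after == "*":
        kind = "nonsense"
    else:
        kind = "missense"
    return before, after, kind
```

With the assumed codons, `classify_substitution("GCA", 1, "A")` gives an alanine to glutamic acid (A to E) missense change, and `classify_substitution("ACC", 2, "T")` gives a silent change that leaves threonine unchanged, in line with the description above.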

In our previous study (Mizukami et al. 2002), M20X alleles possessed M101-type, N101-type and M20X-specific mutations, and were presumed to have arisen not by recombination but via the accumulation of point mutations. These mutations were also likely to have occurred in parallel with those in N101 alleles. Although parallel mutations are often observed among the alleles of blood group genes such as ABO (Ogasawara et al. 1996; Saitou and Yamamoto 1997), the same point mutation at the same site would occur only rarely. Most of the same mutations in the different alleles are thought to result from genetic events, but further studies are needed to understand the mechanisms leading to these mutations.

The frequencies of the allele clusters MG (M10X), MT (M20X), N1 (N10X) and N2 (N20X) in a northern Japanese population were 0.4450, 0.0978, 0.4303 and 0.0269, respectively (Sasaki et al. 2000). The heterozygosity of the cluster system was 0.607. Determination of minor variations requires time-consuming analysis of entire sequences. Major cluster analysis can save labor and time while still remaining informative for genetic screening. Many more alleles are likely to be found when gene analysis is performed with a large number of samples from various races. These alleles should be grouped, and phylogenetic analysis is useful for evidence-based grouping of them.
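The heterozygosity reported above can be reproduced from the four cluster frequencies. The following Python sketch assumes that the value is the expected heterozygosity under random mating, H = 1 − Σp², with no correction for sample size; the function name `expected_heterozygosity` is hypothetical.

```python
from typing import Sequence


def expected_heterozygosity(frequencies: Sequence[float]) -> float:
    """Expected heterozygosity H = 1 - sum(p_i ** 2).

    Assumes the frequencies cover all alleles (or clusters) at the locus
    and therefore sum to 1 within rounding error.
    """
    total = sum(frequencies)
    if abs(total - 1.0) > 1e-3:
        raise ValueError(f"frequencies sum to {total}, expected 1")
    return 1.0 - sum(p * p for p in frequencies)


# Cluster frequencies quoted in the text: M10X, M20X, N10X, N20X.
cluster_frequencies = [0.4450, 0.0978, 0.4303, 0.0269]
print(round(expected_heterozygosity(cluster_frequencies), 3))  # 0.607
```

The sum of the squared frequencies is approximately 0.3935, so H is approximately 0.6065, which rounds to the reported value of 0.607.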