Main

From a historical point of view, the discovery of the covalent addition of a phosphate group to proteins occurred in the late nineteenth and early twentieth centuries.1, 2, 3 However, the first protein phosphotransferase activity and the notion of protein kinases as a new class of enzyme emerged only in the 1950s.4, 5, 6 The early interest in protein phosphorylation was justified by the subsequent emergence of phosphorylation-based signaling pathways (signal transduction, regulation of gene expression, etc.), and the identification of diverse biochemical mechanisms for regulation of proteins by phosphorylation (structural conformation, enzymatic activity, stability and degradation, binding partner interaction, subcellular localization, etc.). For their discovery that protein phosphorylation regulates enzyme activity, the biochemists Fischer and Krebs were awarded the Nobel Prize in Physiology or Medicine in 1992, underscoring the importance of this ubiquitous post-translational modification (PTM).

Although DNA is the basis of all life forms, proteins and their PTMs ultimately carry out the mechanistic functions of life. Although genomics and proteomics are intrinsically connected in a complementary way, the sites and function of phosphorylation cannot yet be accurately predicted from DNA sequence and epigenetic events. As shown by one of the earliest mass spectrometry (MS)-based global phosphoproteomic analyses, specific primary sequence phosphorylation motifs can be defined and used to predict phosphorylation sites. However, proper regulation of development, involving temporal and spatial variables as well as environmental conditions, cannot be fully genetically coded.7 That is why global MS-based phosphoproteomic analysis is vital and has flourished since 2000.8, 9 However, phosphoproteome analysis faces inherent difficulties because of the chemical lability of the phosphate linkage, the low abundance of phosphopeptides resulting from the substoichiometric nature of phosphorylation, and a lack of absolute protease specificity. Moreover, if a peptide sequence is too short or very hydrophilic, its binding efficiency to the standard C18 resin used as a reverse-phase column in-line with the MS analysis is very poor, and very long peptides also pose problems because of aggregation and poor elution, in both cases compromising peptide identification by MS. In addition, the hydrophilic nature of phosphopeptides often makes their recovery by standard peptide purification procedures inefficient, and decreases their yield compared with non-phosphorylated peptides.
To compensate for this, phosphopeptide enrichment methods have been developed,10, 11 including affinity chromatography (hydroxyapatite, IMAC, TiO2, superbinder SH2 domains, etc.), immunoaffinity purification using phosphospecific antibodies (eg, pTyr antibodies), or substrate trapping using, for example, a catalytically dead phosphatase mutant.12, 13 Accurate identification of the phosphorylation site within a peptide is also a challenge as mis-assignment can occur when there are multiple phosphorylatable residues close to each other within a peptide. The use of powerful scoring algorithms and the development of fragmentation strategies such as multistage activation (MSA), which invokes MS3, considerably decreases this risk, even if it requires longer data acquisition time.14, 15 More recently, the quantification of specific proteins in biological samples based on selected reaction monitoring is emerging as a targeted phosphoproteomic approach and this will help us define the relevance of site-specific phosphorylation events.16

Despite the huge progress made in this field, particularly with regard to the efficiency and accuracy of large-scale MS, our knowledge of the phosphoproteome is still mainly restricted to phosphorylation of the three hydroxyamino acids, Ser, Thr and Tyr, particularly in metazoan organisms. And yet, evidence for phosphorylation of other amino acids, and in particular phosphorylation of His was first reported in 1962, in a mitochondrial protein isolated from bovine liver.17 However, studies of His phosphorylation have been challenging because of the lack of methods and reagents required to study this unstable modification. Forty years after the discovery of His phosphorylation, the design of stable pHis analogs, through the work of Muir et al and Lilley et al among others, opened up the possibility of successful development of pHis antibodies.18, 19 Subsequently, Muir’s and other groups made use of these analogs to generate the first polyclonal pan-pHis antibodies in 2013, followed by the development of isomer-specific monoclonal 1- and 3-pHis antibodies in 2015 20, 21 (Figure 1). During the past 55 years, only a relative handful of studies have reported His phosphorylation and described the His kinases and pHis phosphatases responsible for reversible His phosphorylation from various organisms. Compared with all other phosphoamino acids, pHis is unique in forming two distinct phospho-isomers, 1- or 3-pHis, resulting from the covalent attachment of phosphate to the 1-N or 3-N position of the imidazole ring, depending on the specificity of the kinase. Studies of chemically phosphorylated His show that 1-pHis is slightly less stable than 3-pHis, with the possibility of transfer of phosphate from the 1-N to the 3-N position, and also reveal that diphosphorylated histidine can exist, which adds to the difficulty of studying His phosphorylation.22, 23, 24

Figure 1

Chronology of phosphorylation discovery. Black circles with upper legends correspond to major events in histidine phosphorylation. References by date: 1883–1906 (refs 1, 2); 1951–1956 (refs 5, 6); 1962 (ref. 17); 1980 (ref. 56); 1981 (refs 26, 144); 1995 (ref. 29); 2000 (ref. 9); 2010 (ref. 18); 2015 (ref. 20).

Phosphoproteomics remains a rapidly evolving field. The tremendous global effort since 2000 to better define the Ser-Thr-Tyr O-linked phosphoproteome has been highly successful, but the current human O-phosphoproteome could be significantly expanded by addition of the N-phosphoproteome (including His-Lys-Arg phosphorylation; see Figure 2), which until now has been largely unexplored. A variety of techniques can be used to detect, enrich and analyze pHis as well as isolate specific types of proteins or 'subproteomes', but the standard phosphoproteomic methods developed over the last 50 years were not designed to preserve the pHis modification (discussed further below). That is the major reason why an entire part of the phosphoproteome is still missing.

Figure 2

Overall phosphorylation and phosphate linkages. Phosphorylases and kinases are distinguished by the source of donor phosphate. Phosphorylases add a phosphate group (OP) to an acceptor (A), breaking an A-B bond using inorganic phosphate (H-OP), for example, glycogen phosphorylase, whereas kinases transfer a phosphate group (P) from a donor (B) (usually ATP) to an acceptor (A). SONAtes stands for the S-, O-, N- and A-phosphate categories: S-phosphate (thiophosphate), O-phosphate (phosphate-ester), N-phosphate (phosphoramidate) and A-phosphate (acyl-phosphate group).

The phosphoramidate linkage is acid labile, meaning that pHis, pArg and pLys are all sensitive to acid. Acid-labile histone phosphates were described in 1973, corresponding to pLys on histone H1 and pHis on histone H4.25 H18 and H75 were subsequently confirmed by NMR as His phosphorylation sites in chemically phosphorylated histone H4,26 and a histone H4 histidine kinase (HHK) activity was linked to cancer as an oncodevelopmental marker in liver.27 However, although a yeast HHK has been characterized, the mammalian equivalent remains unidentified.28 That is why the identification of NM23 as the first mammalian His kinase provided renewed interest in the His phosphorylation field.29 NM23 was originally identified as a nucleoside diphosphate kinase (NDPK), which catalyzes the transfer of the γ-phosphate from nucleoside triphosphates (NTPs), such as GTP and ATP, to nucleoside diphosphates (NDPs), including ribo- and deoxyribo-GDP, ADP, UDP and CDP, with second-order rate constants for phosphorylation by natural NTPs varying between 0.7 and 13 × 10⁶ M⁻¹ s⁻¹.30 The NDP kinases, NDPK-A (or NM23-1) and NDPK-B (or NM23-2), are ubiquitous housekeeping enzymes that have an important role in the appropriate balance of intracellular NTP and dNTP pools to maintain essential cellular processes such as RNA synthesis, translation and high-fidelity DNA replication.
Subsequently, Nm23 was discovered by differential screening as an RNA that was expressed at much lower levels in metastatic cancer cells, and therefore proposed to be a metastasis suppressor (named also NME for non-metastatic).31 How altered cellular NTP/dNTP levels resulting from decreased Nm23 would promote metastasis was unclear, but the subsequent discovery that NM23 could act as a protein–histidine kinase by transfer of phosphate from autophosphorylated H118 onto protein targets began to suggest a possible mechanism.31, 32 The identification of key protein targets for NM23 has been limited by the lack of tools to detect phosphorylated His residues, and only a few His kinase substrates have been defined: NM23, TRPV5, SK4 channel, GNB1, KCa3.1, ACLY, with also potentially KSR1 and Annexin I, etc.29, 33, 34, 35, 36, 37, 38

Increased awareness of the missing pHis phosphoproteome is needed in order to encourage further research efforts in this area. This review discusses how phosphoproteomic analysis can be expanded to shed light on His phosphorylation data collected over the last 60 years, why our progress in uncovering this missing phosphoproteome has been delayed until now, and highlights the potential importance of the NM23 His kinase as an entrée into the world of histidine phosphorylation.

FROM PHOSPHORYLATION TO A PHOSPHOPROTEOME IN EXPANSION

By analogy with epigenetic modification and regulation of DNA function, PTMs of proteins provide a means to diversify the function(s) and regulation of proteins, and at the same time PTMs greatly expand the complexity of the proteome. One of the most important PTMs is phosphorylation, and according to the last update of the PhosphoSitePlus™ database,39 in total >283 000 non-redundant Ser/Thr/Tyr phosphorylation sites have been reported in human, mouse and rat cells from ~20 000 non-redundant proteins, comprising ~60% pSer sites, ~25% pThr sites and ~15% pTyr sites. All of these phosphorylation events are catalyzed by phosphotransferases known as protein kinases, which fall into several related families: TKL (tyrosine kinase like), AGC (protein kinase A, G and C), CMGC (CDK/MAPK/GSK/CLK), TK (tyrosine kinase), STE (sterile serine–threonine kinase), CK1 (casein kinase 1), CAMK (calcium/calmodulin-dependent kinase), RGC (receptor guanylate cyclase kinases), PKL (protein kinase like) and HK (histidine kinases). Collectively, they are known as the kinome (see Table 1).40, 41

Table 1 Mammalian kinome and phosphoresidues

The phosphoproteome is larger than the currently known kinome can predict and is not restricted only to Ser, Thr and Tyr residues. One reason is that a subset of phosphotransferases use a phosphoenzyme intermediate that can selectively transfer their phosphate to another type of amino acid, as in the case of the pHis/pAsp phosphorelay used in the bacterial two-component systems (TCSs) discussed below. A second reason is the probable existence of as yet unidentified kinases whose catalytic activity is targeted to non-Ser/Thr/Tyr residues. Here, one might consider the poly-specificity of substrates for lipid, carbohydrate and nucleoside kinases, whose targets can be switched according to cellular conditions (nutrient deprivation, Warburg effect, etc.). For example, PI3K is a well-known and important lipid kinase, but it has also been reported to possess protein kinase activity under some circumstances, phosphorylating Ser585 in the cytoplasmic domain of the interleukin 3 (IL-3) receptor.42 Likewise, NM23 illustrates the existence of a nucleoside kinase with a protein kinase activity acting on a non-standard phosphoamino acid.

Nine out of the 20 amino acids can be phosphorylated, and the resultant phosphoamino acids can be grouped into four categories based on the nature of the phosphate-bond. These are termed SONAtes, indicating phosphate linkages to the S-, O-, N- elements, as well as the A-phosphate referring to Acyl-phosphate groups (Figure 2). Sonate is the French word for Sonata, which reflects the possibilities of phosphorylation in living forms. As is the case for musical notes, different phosphorylations can be combined simultaneously or successively in order to accurately and harmoniously regulate life thanks to the variety of factors (enzymes) and instruments (substrates).

According to a phosphoproteomic study of nine different mouse tissues (brain, liver, brown fat, heart, lung, kidney, spleen, testis and pancreas), where 36 000 sites of phosphorylation were defined on 6300 non-redundant proteins, the experimental proportions of Ser, Thr and Tyr phosphoresidues are similar from tissue to tissue, with 83%, 15% and 2% of pSer, pThr and pTyr, respectively.43 However, only 3% of the phosphoproteome was common to all of the tissues, although overall proteomic analysis revealed that this is not because of protein expression specificity, suggesting a high level of tissue-dependent phosphorylation events. This means that even if a His kinase like NM23 is expressed ubiquitously, its targets may be different in different tissues.

It is also important to consider temporal aspects of phosphorylation, either in response to different stimuli or in different cell cycle states. For instance, quantification of 20 443 unique sites of phosphorylation on 6027 proteins in different cell cycle phases 44 revealed that the major part of the phosphoproteome is observed only during mitosis, with fewer sites being found during S phase. In this regard, it is interesting to note that 3-pHis-containing proteins are localized to specific mitotic structures.20 This emphasizes that the dynamic aspects of cell physiology and consequently protein phosphorylation need to be borne in mind.

In the current computational era, there is interest in developing computational kinome profiling to predict signaling networks, for example, the 'phosphorylation networks for mass spectrometry' method.45, 46 One major issue for the future is that all these computational analyses are based on an incomplete phosphoproteome, because they do not integrate information about His kinases and His phosphorylation.

HISTIDINE PHOSPHORYLATION FROM PROKARYOTES TO EUKARYOTES

Protein–histidine phosphorylation is well known in prokaryotes, particularly in bacteria, because of the large family of signaling systems called two-component regulatory systems (TCS). These systems are composed of a sensor protein (receptor) and a response regulator protein (effector). The activation of the His kinase domain in the sensor occurs generally in response to ligand binding to the receptor domain, and the HK domain then catalyzes phosphate transfer from ATP to the active site His. This phosphate is then transferred from the receptor to an Asp in the effector protein to activate the response regulator, or the phosphate can also be re-transferred from the pAsp onto a His in a secondary regulator in a His-Asp-His relay system (Figure 3).

Figure 3

TCS for autophosphorylation of a transmembrane histidine kinase (HK). A sensor perceives an extracellular stimulus. The substrate for the His kinase is the response regulator, which becomes phosphorylated on a specific Asp residue. In bacteria, the response regulator generally has a common domain with at least two Asp and one Lys residue.51 HK receptors are homodimeric proteins, and both trans and cis phosphorylations have been reported.

The widespread existence of archaeal and bacterial TCSs is why His phosphorylation was considered to be a 'primitive' signaling circuitry, but this system is not restricted to prokaryotes, and homologous systems can also be found in some simple eukaryotes, like fungi, and in plants, where they are generally used to transmit and respond to stress stimuli from the environment. Although TCSs are not found in higher eukaryotes, the finding that mammalian cells use a different type of His kinase demonstrates that His phosphorylation is important in all kingdoms of life.

Two-Component Systems

Prokaryotes

Histidine phosphorylation was first demonstrated in TCSs in bacteria in 1980.47, 48 A single bacterial species can have a large number of TCS (see Table 2 for examples), but TCS are also common in Archaea, although less extensively studied. CheA and CheY are well-characterized sensor and response regulator proteins in Halobacterium salinarum.49, 50

Table 2 List of some TCS bacterial histidine kinases and their main functions

In contrast, at least 50 TCSs and their cognate His kinases can operate in a single bacterium to monitor osmolarity, temperature, pH, nutrient concentration and cell density, etc.51 Some of these bacterial His kinases reveal a minimum kinase domain, with five regions of similarity involved in autophosphorylation: four conserved blocks of amino acids (N, G1, F and G2 boxes), whose Gly-rich domains correspond to the nucleotide binding and phosphatase activities, and an H-box containing at least one His, which is phosphorylated.52 A non-exhaustive list of bacterial sensor His kinases that are part of a TCS is shown in Table 2. Interestingly, NM23 kinase was reported to be able to transfer a phosphate to His residues in the TCS bacterial His kinases EnvZ, CheA and Taz1 (a Tar/EnvZ chimera) in Escherichia coli.53 However, it is possible that this result was due to contaminating ADP, as suggested by Levit et al,54 which could result in [γ-³²P]ATP generation, and potentially induce the autophosphorylation of the bacterial His kinases.

Eukaryotes

The level of pHis in eukaryotic cells was originally estimated from analysis of a basic nuclear extract of the slime mold Physarum polycephalum to represent 6% of the global phosphoamino acid, with <1% for pArg and pAsp, with the remaining 93% being pSer, pThr and pTyr;55 this compares with the estimated 0.3% level of pTyr in chick cells when it was first discovered.56 It has recently been shown that TCS His kinase genes are highly represented in the Physarum polycephalum genome (51 HK genes), and the high level of pHis in this organism would imply that His phosphorylation is involved in eukaryotic signaling pathways. Furthermore, eukaryotic homologs of the 'primitive' bacterial TCSs, Sln1 and ETR1, are found in budding yeast and plants, respectively. Sln1 is a yeast (fungal) protein similar to bacterial TCS regulators and acts as an osmosensor His/Asp kinase,57, 58 whereas the ethylene response (ETR) pathway is involved in plant differentiation. The Arabidopsis thaliana ETR1 protein has a strong domain homology with the 'primitive' TCSs, except that the substrate-Asp domain is also part of the receptor unlike the bacterial configuration shown in Figure 3. Several different His kinases are known to act in TCSs and multistep phosphorelay in plants, for example, AHK, a sensor His kinase involved in the cytokinin signal transduction pathway in Arabidopsis,59 and HK1, an osmosensor His-Asp kinase defined in Populus trichocarpa.60

NM23: A Histidine Kinase to Define Histidine Signaling Pathway

True histidine kinase or NDP kinase artifact?

Owing to the scarcity of TCS family His kinase genes in eukaryotes compared with prokaryotes, it has been suggested that eukaryotic His kinases may have originated from a single prokaryotic source by lateral gene transfer from bacteria, leading to the notion of coevolution between His kinases and response regulators but also the possibility of hybrid kinases.61, 62 These hybrid His kinases possess both a His kinase domain and an Asp response regulator domain, encoded by a single gene in contrast to the prokaryotic TCSs. However, the fact that there are fewer TCS family His kinase genes in eukaryotes could also be because of selection and loss of this kinase family, as illustrated by kinome comparison between primitive and more evolved metazoans.63 One possible reason why TCS family His kinases were lost during evolution of metazoans is that the signals generated by the His-Asp relay system are too unstable for efficient transfer of signals to the nucleus of eukaryotic cells, which requires significantly more time than signal transfer to the bacterial chromosome, which lies close to the plasma membrane. Alternatively, this could be due to the development of receptor Tyr kinases or G-protein coupled receptor (GPCR) systems, which can couple to Ser/Thr/Tyr phosphorylation, as new types of cell surface sensor systems, instead of the TCS.

NDPKs are housekeeping enzymes highly conserved from E. coli to humans (45% identity), but for microbial Ndks, the conserved function corresponds to NTPase and NTP-generating activities.64 Initially, it was suggested that the phosphotransferase activity of NM23 might be an artifact in bacteria, because the regulation of gene expression in E. coli could be explained by the ability of NM23 as an NDP kinase to bind and cleave DNA,54 but this idea is no longer in vogue.65 In addition, the protein phosphotransferase activity of NDP kinase has been reported only in the presence of ADP, mediating the phosphotransfer from the phospho-NDP kinase to the target enzymes in catalytic amounts (1 nM), suggesting that contaminating ADP would be sufficient to mimic phosphotransferase activity.54 Moreover, it has been found that the E. coli Ndk gene is not essential, and that deletion mutants are capable of normal growth, due in part to compensation by pyruvate kinase and succinyl CoA synthetase.64 Furthermore, there is no correlation between NDP kinase activity and binding to DNA or PuF transcriptional activity.66 Similarly, it seems that there is no clear correlation between NDP kinase activity and His autophosphorylation or metastasis-suppressor activities.67, 68 This is consistent with the notion that the primary function of NM23 is not as a housekeeping NDP kinase, and suggests that NM23 is a multifunctional protein. This is illustrated by the fact that purified NM23 preparations from human, Drosophila, Dictyostelium and yeast exhibit a protein phosphotransferase activity when incubated with colon carcinoma cell lysates, although a co-purified protein co-factor may be involved.69 For example, GAPDH was suggested as a binding partner to activate the phosphotransferase activity of NM23-1.70 In this context, an important unanswered question is how this enzyme can accommodate different NDP substrates, as well as specific His residues in individual protein substrates.
NDP and NTP appear to bind the same pocket based on comparison of the Arabidopsis thaliana NDK2+GTP and the human NM23-1+ADP crystal structures with the autocatalytic site H197 and H118, respectively,71, 72 raising the question of how a single NTP/NDP binding pocket can also accommodate a positively charged His residue in a peptide backbone and orient it properly for transfer of phosphate from pHis118 to the N1 or N3 position.

The multiple personalities of a kinase

Orthologues of the mammalian His kinase NM23 (or NME) are present in all eukaryotes, with 10 NM23 family genes identified in humans encoding full-length, tandemly repeated NM23 domains, or in one case a truncated NM23 domain.73 These genes are divided into two main groups: NM23 Group I (NM23-1 to NM23-4) and NM23 Group II (NM23-5 to NM23-9), with NM23-10 (=XRP2) having a different evolutionary history from Groups I and II.74 Isoforms contained in Group I are well conserved, with 58–88% overall identity, and are catalytically active, whereas Group II isoforms are more divergent, with only NM23-6 shown so far to be active. More recently, Fuhs et al 20 showed that NM23-5 and NM23-7 could be autophosphorylated in vitro using GST-NME fusion proteins in an E. coli lysate. It appears that these genes expanded late in evolution, because the complexity of the NM23 family is mainly a eukaryotic innovation, with the exception that NM23-8 is likely to be a choanoflagellate/metazoan innovation.75 Comparisons of the domain structure of each NM23 family member in basal metazoans have been reviewed previously.76 Although NM23 family members are mainly considered to be cytosolic/nuclear, some members have specific subcellular localizations. NM23-4 is a mitochondrial isoform, whereas NM23-7 is associated with the centrosome.77, 78 NM23-3 is found at DNA damage sites, NM23-5 is observed in nuclear bodies, and NM23-6 is associated with vesicles.79, 80 Even the ubiquitous NM23-1 and NM23-2 were shown to be expressed at the cell surface in specific cell lineages and differentiation stages.81

In addition to their arborizing evolution, NM23 orthologues have a mutator gene activity in E. coli,64 as well as multiple functions ranging from NDP kinase to protein kinase (His phosphotransferase) and transcription factor activity 66 in multiple eukaryotic organisms, and also lipid bilayer binding through electrostatic interaction with anionic phospholipids, which might also suggest a potential lipid kinase activity.82, 83 In this case, protein–lipid complexes inhibit NDP kinase activity, as shown for the mitochondrial NM23-4 protein, but are necessary for selective intermembrane lipid transfer.84 In addition, NM23-1/2-dependent endocytosis mechanisms seem to be conserved and have been associated with its GTPase activity, providing GTP to dynamins to power membrane fission.85, 86 All these personalities make it difficult to know exactly how NM23 acts and why it is conserved.

This complexity is also reflected by the numerous names given to these proteins. The catalytic His of NM23, H118, which is primed for transfer by NTP-driven autophosphorylation, is highly conserved and is required for both NDP and protein phosphorylation. Histidine phosphorylation has been linked with homologs of the NM23 kinase in several eukaryotic species (Table 3), which have been given various names depending on the organism and the cellular localization: for example, the nuclear isoform NDPK-In in Arabidopsis thaliana,87 NmeGp1Sd in the marine sponge Suberites domuncula, considered as one of the closest common ancestors to animals,74 Gip17 and Guk or Ndkm for mitochondrial Ndpk in the slime mold Dictyostelium discoideum,88, 89 YNK1 in the yeast Saccharomyces cerevisiae,90, 91 NDK in the nematode Caenorhabditis elegans,92 AWD or abnormal wing disc in the insect Drosophila melanogaster,93, 94 and the original NDPK-A/B NDPKs in mammalian cells, also known in humans as NME1/2 or, in more standardized nomenclature, NM23-H1/H2 (H for human).

Table 3 List of different names for the NM23 homologous His kinases in eukaryotes and the organisms concerned

The Nm23-M1/2 (M for mouse) double knockout is lethal in mice, but the different members of this His kinase family could be separately 'non-essential' because they compensate for each other in single knockout mice, which are viable and fertile, with a growth defect (hypotrophy) but a normal lifespan. Nevertheless, together NM23-M1 and 2 could be responsible for 80% of all cellular NM23 activity, which could explain why the double knockout is lethal. Similarly, in Drosophila a null mutation in Awd, the single orthologue of NM23-1/2 recognized in Drosophila, induces lethality because of impaired differentiation in the prepupal stage. The same thing is observed with killer of prune (kpn), a conditional dominant Ser97Pro mutation in the AWD protein, which is lethal in individuals that do not have a functional prune gene.95, 96 This mutation does not affect catalytic activity but the mechanism of action is still related to NDPK activity because it alters the NDPK substrate binding.97, 98 In the nematode, 50% of ndk-1(ok314) homozygous C. elegans mutants, where the ORF is removed, die as embryos, but the remainder develop into sterile adults, which display a protruding vulva phenotype that becomes fully penetrant in an ndk-1(–) ksr-2(–) double mutant background, suggesting the importance of kinase suppressor of Ras (KSR), which is known to interact with Ndk.86, 92 Ynk1 ablation in yeast does not appear to be lethal but induces a delay in DNA repair.99

Although His kinases and His phosphorylation are conserved, it is not yet possible to define global pHis signaling pathways. In the last few years, the notion of kinotype has emerged, which suggests that signaling is more dependent on cell type than on species or even individuals of the same species.45, 100 A different pattern of phosphorylation could be defined for each NM23 orthologue depending on the tissue, consistent with tissue specificity of the Ser/Thr/Tyr phosphoproteome.43 Although NDP kinase (NDPK) activity is intrinsic to many of the Nm23 gene products, as demonstrated for the first four members of the family (NM23-1 to NM23-4), this activity is apparently not responsible for all of their biological functions, which suggests that their His kinase activity contributes to these functions.101 In principle, many of the pHis-containing proteins identified in mammalian cells could be phosphorylated by one or several members of the NM23 family of His kinases, particularly considering that the different NM23 family members can have distinct intracellular localizations; for example, NM23-4 is localized to mitochondria via an N-terminal targeting sequence,78 and NM23-7 is associated with the centrosome.77

The high conservation of His kinases, and particularly NM23, throughout evolution supports the idea that His phosphorylation has an important role in basic cellular processes. A better knowledge of the proteins that are partners or substrates of His kinases is needed to define pathways related to His phosphorylation. For example, the scaffold protein KSR1, which is involved in the ERK MAP kinase pathway, is a mammalian homolog of CTR1, which is known to be involved in TCSs in Arabidopsis thaliana because of its interaction with ETR1 and ERS. Interestingly, NM23-H1 interacts with and phosphorylates KSR on S392 but this phosphorylation is dependent on NM23 His phosphorylation.35 H348 and H381 have been suggested as His phosphorylation sites on KSR1,20, 102, 103 but these sites have not been biochemically identified. A previous review proposed that NM23-H1 directly modulates Ras-ERK MAP kinase signaling through interaction with KSR1 and phosphorylation of S392,104 but the possible existence of His phosphorylation sites in KSR1 and the fact that the ndk-1(–) ksr-2(–) double mutant in C. elegans is lethal suggest the existence of a signaling pathway related to His phosphorylation.

The somatically acquired S120G NM23 mutation, associated with advanced neuroblastoma, was also shown to abrogate the motility inhibitory effect of exogenously expressed NM23 in breast carcinoma cells, without significantly influencing its NDPK or His kinase activities. S120 lies in the active site pocket two residues downstream of H118, and has also been reported to be phosphorylated, potentially by autophosphorylation. The S120G mutation could simply interfere with the structure of the catalytic pocket or more directly with the autophosphorylation motif surrounding the catalytic site H118. Alternatively, the S120G mutation could selectively affect His phosphorylation of specific protein substrates either because of the Ser to Gly change itself, or because the Ser can no longer be phosphorylated, and the resultant alteration in target protein phosphorylation might explain the inability of S120G NM23 to inhibit neuroblastoma cell motility.

Currently, only a few direct substrates for NM23 have been identified:102 the histidine kinase NM23 itself (H118; 1-pHis), the calcium channel TRPV5, which regulates Ca2+ reabsorption (H711; 3-pHis), Gβ (GNB1) for GPCR signal transduction (H266; 3-pHis), the calcium-activated potassium channel KCa3.1 or SK4 (H358; 3-pHis), whose histidine phosphorylation activates KCa3.1 by antagonizing copper-mediated inhibition of the channel,103 and the ATP-citrate synthase ACLY (H760; 3-pHis).102 Preferential 1- and 3-pHis isomers have been inferred from specific 1- and 3-pHis enrichment or detection using antibodies, except for NM23, which was already unambiguously defined. Other NM23 substrates could exist, particularly given that Fuhs et al identified 786 different proteins by immunoaffinity purification using 1- and 3-pHis monoclonal antibodies. Of these proteins, 280 and 156 were found to be exclusive to 1-pHis and 3-pHis antibody purifications, respectively. Although the His phosphorylation sites in these proteins were not identified in these experiments, these data imply the existence of a large number of pHis-containing proteins. We still need to define what mechanisms regulate these His phosphorylations and determine whether there is real 1- or 3-pHis target specificity or whether isomer specificity simply results from steric hindrance and the balance between stability and interconversion of these two isomers. His kinases other than NM23s could also be responsible for these phosphorylations. Another question is whether there is a common primary consensus sequence for these His phosphorylations and how many are substrates for NM23 kinase family members. Based on the canonical and isoform sequences in the human proteome (>42 000 protein isoforms), we estimate that 625 730 His residues are potentially available for phosphorylation.
Given that there are >2 million Ser, 1.3 million Thr and 637 000 Tyr residues available, of which 8%, 5% and 7% are, respectively, known to be phosphorylated, can we expect to find a similar proportion for histidine residues?
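As a rough illustration of the question raised above, the following Python sketch applies the phosphorylated fractions quoted for Ser, Thr and Tyr to the estimated 625 730 His residues. It assumes, purely for the sake of argument, that His could be phosphorylated in the same proportion as one of the other residues; the function name and the projected counts are hypothetical and are not measured values.

```python
# Figures quoted in the text: estimated His residues in the human proteome
# and the percentage of Ser/Thr/Tyr residues known to be phosphorylated.
HIS_RESIDUES = 625_730
KNOWN_PERCENT = {"Ser": 8, "Thr": 5, "Tyr": 7}


def projected_phis_sites(his_residues: int, percent: int) -> int:
    """Project a pHis site count, ASSUMING His matches the given percentage.

    Integer arithmetic is used so the result is a whole number of sites.
    """
    return his_residues * percent // 100


# Hypothetical projections, one per reference residue.
projections = {
    residue: projected_phis_sites(HIS_RESIDUES, pct)
    for residue, pct in KNOWN_PERCENT.items()
}

for residue, sites in projections.items():
    print(f"If His matched {residue} ({KNOWN_PERCENT[residue]}%): ~{sites:,} pHis sites")
```

Under this assumption the projected pHis phosphoproteome would span roughly 31 000 to 50 000 sites, which only indicates the order of magnitude at stake.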

Histidine Phosphatases

From prokaryotes to eukaryotes, our knowledge of how His phosphorylation is regulated remains poor, but the existence of pHis phosphatases suggests that it is well organized. For example, a bacterial phosphatase, signal inhibitory factor-X (SixA), is specific for pHis residues and associated with TCS regulation. SixA phosphatase activity requires H8 in the RHG signature motif, which presumably functions as a nucleophilic acceptor in the attack of the target pHis.105, 106 In addition to pHis phosphatases associated with TCS systems in bacteria, several mammalian pHis phosphatases have been identified, including PHPT1, LHPP, PGAM5, PP1/2A/2C and T-cell ubiquitin ligand-2 (TULA-2), although in most cases their phosphatase activities are not restricted to pHis residues.

The protein–histidine phosphatase PHPT1 efficiently dephosphorylates pH358 on KCa3.1, which negatively regulates TCR signaling.107 For lysine–histidine–pyrophosphate phosphatase (LHPP), tests in vitro reveal phosphatase activity on pLys and 3-pHis but LHPP protein substrates have not yet been identified in vivo.108 The mitochondrial protein phosphoglycerate mutase family member 5 (PGAM5), which was previously shown to be an unconventional pSer/pThr protein phosphatase, utilizing its active phosphoacceptor site H105 to attack the pHis target site,109 was recently established as a new pHis phosphatase, associating specifically with NM23-H2 and dephosphorylating pH118, as revealed by the use of 1- and 3-pHis monoclonal antibodies.110

In addition, PP1, PP2A and PP2C display activity toward pHis, as well as pSer/pThr, and the PP2A family member PP4 might even prefer pHis substrates according to the lower Km observed for pHis dephosphorylation compared with pSer.55, 111 It is possible that many of the cellular pSer and pThr phosphatases could also act as pHis phosphatases.112 This duality was also defined for TULA-2/STS-1, a His/Tyr phosphatase in the PGAM family that negatively regulates bone differentiation and controls platelet glycoprotein VI signaling.113, 114 Bearing this in mind, it would be worth investigating the use of standard pSer/pThr and pTyr phosphatase inhibitors, like okadaic acid and orthovanadate, respectively, for histidine phosphorylation studies.

WHY DO STANDARD METHODS NOT ALLOW DETECTION OF PHOSPHOHISTIDINE?

The catalytic activities of Ser/Thr or Tyr kinases are not so different from those of autophosphorylating His kinases, but their chemistry and the stabilities of their phosphoamino acid products differ.115 The pHis imidazole phosphoramidate bonds (P-N1 and P-N3) have a ~2-fold higher ΔG° of hydrolysis than the phosphoester bonds (P-O) on Ser, Thr and Tyr residues. This high free energy of hydrolysis makes pHis easier to hydrolyze, in a pH-dependent (acid-labile) and thermosensitive manner; for instance, in 1 M HCl at 49 °C, 1-pHis and 3-pHis have half-lives of 18 and 25 s, respectively, whereas in the presence of 1 M HCl at 100 °C, the half-lives of free pSer and pThr are about 18 h, and that of pTyr about 5 h.116 Furthermore, exposure to certain primary amines can also cause dephosphorylation of pHis. However, the positive charge on the primary α-NH2 group, which is responsible for the relative instability of 1-pHis compared with 3-pHis, would be lacking in the context of a protein because the primary α-NH2 group is lost as a result of formation of the peptide bond. Nevertheless, by monitoring with 1- and 3-pHis monoclonal antibodies, Fuhs et al20 recently showed that even moderate heat and acid (pH 6 at 60 °C for 30 min) are sufficient to significantly reduce pHis signals in many proteins. In contrast, 24-h acid hydrolysis at 110 °C in 6 M HCl is required to completely remove phosphate from pSer and pThr (pTyr is completely hydrolyzed at shorter times under these conditions). Complete dephosphorylation of pSer and pThr is also observed under strong alkaline conditions, for example with 1 M NaOH at 37 °C for 18–20 h, whereas pHis is totally stable under these conditions.117
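To make the half-lives quoted above more concrete, the following Python sketch estimates how much of each phosphoresidue would survive a given exposure. It assumes simple first-order (exponential) decay, which is an assumption of this sketch and is not stated in the text; the function name is illustrative, and the two groups of half-lives were measured at different temperatures, so the comparison is only indicative.

```python
def fraction_remaining(elapsed_s: float, half_life_s: float) -> float:
    """Fraction of a phosphoresidue left after elapsed_s seconds.

    ASSUMES first-order decay: N(t) / N(0) = 0.5 ** (t / t_half).
    """
    return 0.5 ** (elapsed_s / half_life_s)


# Half-lives quoted in the text, converted to seconds.
# Note the different temperatures: 49 C for pHis, 100 C for pSer/pThr/pTyr.
HALF_LIVES_S = {
    "1-pHis (1 M HCl, 49 C)": 18.0,
    "3-pHis (1 M HCl, 49 C)": 25.0,
    "pSer/pThr (1 M HCl, 100 C)": 18 * 3600.0,
    "pTyr (1 M HCl, 100 C)": 5 * 3600.0,
}

EXPOSURE_S = 300  # an illustrative 5 min acid exposure

for label, t_half in HALF_LIVES_S.items():
    left = fraction_remaining(EXPOSURE_S, t_half)
    print(f"{label}: {left:.3g} of the signal left after 5 min")
```

Under this assumption, a 5 min exposure would leave essentially no pHis while pSer, pThr and pTyr would remain almost intact, even though the latter values refer to a much higher temperature.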

In order to detect specific His phosphorylation sites, pHis-containing proteins and tryptic peptides need to be purified because of the low abundance and the substoichiometric nature of phosphorylation. But it is still a challenge, because phosphopeptide enrichment methods generally utilize acidic conditions to improve binding, and the succeeding phosphoproteomic analyses in positive ion mode MS use a low pH condition to obtain efficient ionization. Furthermore, autonomous and intermolecular phosphate transfer can occur during MS analysis.118

However, some success has been reported in identification of pHis by MS, although the phosphate on pHis is subject to elimination by neutral loss during the peptide fragmentation step. For instance, analysis of a tryptic digest of autophosphorylated recombinant protein NM23-H1 permitted detection of His phosphorylation by conventional LC-MS/MS, using pHis-compatible conditions at pH 5.119 Nevertheless, the use of a higher pH, such as pH 5, is already a concern for complex samples because of the decreased ionization efficiency in positive ion mode, which makes it more difficult to detect low-abundance PTMs. Differentiating the site of phosphorylation in peptides that have several phosphorylatable residues is also a challenge, even with the development of the MSA fragmentation strategy. But the pHis neutral loss fingerprint has some unique features that can be used for identification, referred to as a triplet signature. Indeed, it was shown that during CID fragmentation, the dominant base peak, due to loss of phosphoric acid from pHis peptides via a side chain carboxylate, corresponds to a neutral loss of 98 Da from the parent ion, whereas ions of lower intensity are derived from losses of 80 Da, corresponding to the phosphate alone, and 116 Da, corresponding to the phosphate together with two molecules of water.120 From this, the TRIPLET software was developed by Oslund et al120 to detect triplet signatures in MS data, and according to their results this parameter may be specific enough to distinguish pHis neutral loss. It should be noted, however, that the TRIPLET software provides only a list of peptides and no spectrum file with which to check each peptide, meaning that other non-pHis phosphopeptides undergoing neutral phosphate loss with the same mass might generate a false-positive identification, and direct identification of the pHis site is required to be certain.
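The triplet signature described above can be illustrated with a short Python sketch that computes where the three neutral-loss peaks would be expected for a given precursor and checks a peak list for them. This is a simplified illustration and not the algorithm of the TRIPLET software; the function names, the mass tolerance, the intensity rule and the toy spectrum are all assumptions made here. The monoisotopic masses are those of HPO3, H3PO4 and HPO3 plus two H2O, corresponding to the nominal 80, 98 and 116 Da losses.

```python
# Monoisotopic masses (Da) of the three neutral losses named in the text.
NEUTRAL_LOSSES = {
    "80": 79.96633,    # HPO3 (phosphate alone)
    "98": 97.97690,    # H3PO4 (phosphoric acid)
    "116": 115.98746,  # HPO3 + 2 H2O
}


def expected_loss_mz(precursor_mz: float, charge: int) -> dict:
    """Expected m/z of each neutral-loss ion; the charge state is unchanged."""
    return {name: precursor_mz - mass / charge
            for name, mass in NEUTRAL_LOSSES.items()}


def has_triplet(peaks, precursor_mz: float, charge: int, tol: float = 0.02) -> bool:
    """Hypothetical check: all three losses present and the 98 Da loss dominant.

    peaks is a list of (mz, intensity) tuples; tol is an assumed m/z tolerance.
    """
    found = {}
    for name, target in expected_loss_mz(precursor_mz, charge).items():
        matches = [inten for mz, inten in peaks if abs(mz - target) <= tol]
        if not matches:
            return False  # one member of the triplet is missing
        found[name] = max(matches)
    return found["98"] == max(found.values())


# Toy spectrum (made-up values) for a doubly charged precursor at m/z 600.
toy_peaks = [(560.017, 30.0), (551.012, 100.0), (542.006, 20.0), (450.250, 55.0)]
print(has_triplet(toy_peaks, precursor_mz=600.0, charge=2))
```

A match of this kind only flags a candidate; as noted above, the spectrum itself still has to be inspected to localize the pHis site.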

Another possible MS approach to identify pHis peptides and sites is the use of electron transfer dissociation fragmentation, which is also run in positive ion mode.121 An interesting new method to fragment phosphopeptides in the gas phase by negative electron transfer dissociation in negative ion mode, using alkaline buffers, has been developed,122 and such alkaline conditions should preserve the pHis. Another option would be the use of ultraviolet photodissociation to further improve the detection of His phosphorylation sites.123 In contrast to the polar residues Ser, Thr and Tyr, the basic His residue is electrically charged because of the protonated imidazole ring at physiological pH, which decreases ionization efficiency but can also create electrostatic interactions. The addition of a phosphate on His modifies the charge from +1 to –1.5, which can in principle modify the binding energy by electrostatic affinity with protein or nucleic acid elements.102

Clearly, conditions that stabilize His phosphorylation and methods to enrich pHis-containing tryptic peptides without loss need to be developed. Up till now, the methods developed to study the phosphoproteome (liquid chromatography, IMAC, MOAC, etc.) have been restricted to pSer/pThr/pTyr-containing peptides and use several steps requiring acidic conditions that are not compatible with pHis stability, which explains in part why it has been so difficult to detect His phosphorylation sites.124 But the availability of new alternatives compatible with pHis stability should facilitate the detection of specific histidine phosphorylation sites. These include the use of hydroxyapatite (HAP) for global phosphopeptide enrichment, anti-pHis mAbs for specific pHis peptide enrichment (K Adam, unpublished observations), and an alternative chromatographic separation using strong anion-exchange chromatography (SAX), which underlies the unbiased phosphopeptide enrichment strategy based on SAX (UPAX) method (C Eyers, personal communication).

PHYSIOLOGICAL EVIDENCE AND COMPATIBILITY WITH HIS PHOSPHORYLATION

The phosphoproteome differs from one tissue to another. Immunofluorescence staining with 1- and 3-pHis mAbs revealed pHis signals in cultured human cells with specific subcellular localizations, around the phagosome and at spindle poles in mitosis, respectively.20 This suggests that specific cellular compartments can be preferential sites for an individual type of protein phosphorylation either because of the localization of a kinase/substrates, or because of a local regulation/signaling pathway (stem cells vs differentiated cells, for example). In this regard, given the lability of pHis under mild acid conditions, intracellular pH status might be used to regulate His phosphorylation. Human cells in different tissues can be in contact with various pHs, acidic in the stomach or slightly more alkaline than the rest of the body in the brain compartment. Extracellular pH does not necessarily directly affect the intracellular pH, but it could modify the cellular microenvironment, which would influence intracellular signaling through secreted proteins, endocytic mechanisms and membrane receptors. NM23 is involved in endocytosis85 and can furthermore be expressed extracellularly, being present at the cell surface in tumor cells, with inverse correlation to differentiation.81, 125 NM23 is even found in the plasma of patients with acute myeloid leukemia, suggesting a potential extracellular mechanism for this His kinase, as already suggested by secreted bacterial Ndk in Pseudomonas aeruginosa.126 Other cellular conditions might be involved, such as hypoxia and oxidative stress (ROS) or glucose starvation, which locally modify the intracellular pH.127 pH variations are associated with different human diseases, like acidosis in tumor development,128, 129, 130 and also in ischemic stroke, neurodegenerative disease, seizures and respiratory control.131, 132 Indeed, pH modification can be envisaged as a therapeutic approach for these pathologies; for instance, non-invasive photodynamic therapy has been shown to cause focal intracellular acidification in cancer cells.133, 134

In bacterial TCSs, His kinase autophosphorylation is regulated by growth temperature, as demonstrated by the kinetic experiments of Greenswag et al135 showing that TCS trans-autophosphorylation in Thermotoga maritima is temperature dependent. Regulation of His kinase autophosphorylation could be important for other His kinases, like the mammalian NM23 kinase. Until now there has been no way of monitoring this autophosphorylation, but the phospho-NM23-1/2 (phospho-H118) rabbit polyclonal antibodies we have developed (K Adam, unpublished) demonstrate that in vivo autophosphorylated H118 is more stable when lysates of two human cancer cell lines, HeLa and ALVA-31, are treated at pH 10, 4 °C than at pH 3, 90 °C for 5 min (Figure 4). These antibodies can now be used to test whether NM23 autophosphorylation changes under specific growth conditions, which could in turn affect its His kinase activity toward protein substrates.

Figure 4

Temperature/pH compatibility as a function of phosphoresidue. (a) Temperature is inversely correlated to pH gradient for pHis stability. The converse is observed for pSer/Thr residues, whereas pTyr residues are highly stable over the whole pH range. This representation is designed to indicate the recommended experimental conditions to conserve different types of phosphorylation but needs to be adapted according to the experimental buffer solution used. In addition, stability can also be impacted by salt conditions, tertiary structure and neighboring amino acids. Black stars on pH gradient represent different human physiological states. (b) A dot blot of recombinant pNM23-1/2 H118 is shown in the upper right hand panel, using purified NM23-H1 (according to Fuhs et al20), either unphosphorylated, or autophosphorylated in the presence of ATP, or autophosphorylated in the presence of ATP followed by boiling, followed by incubation with pre-immune serum (PI 95) or antibodies purified on immobilized antigen peptide from serum raised against a peptide representing residues 114–122 of NM23-1/2 containing 1-pTza in place of His118. The non-hydrolyzable 1-pTza NM23-1/2 peptide sequence analog for H118 is used as a control. (c) Immunoblots of SDS gel-fractionated lysates of HeLa cells or ALVA-31 prostate cancer cells with anti-pNM23-1/2 antibodies (top), a mixture of anti-1-pHis and 3-pHis monoclonal antibodies (middle) in the lower right hand panel, comparing conditions where pHis is stable (pH 10, 4 °C) or where it is unstable (pH 3 (0.1% formic acid CH2O2), 90 °C for 5 min, restored to pH 8.8 with 1 M Tris). The parenthesis indicates the pNM23-1/2 bands, and the arrow indicates a heat-insensitive band that is apparently recognized non-specifically by the purified antibodies. The levels of NM23-H1/2 and actin in each sample were determined by immunoblotting with specific antibodies (lower panels).

Current methods to study the phosphoproteome utilize acidic pH because pSer/pThr and pTyr are stable under these conditions, but paradoxically, physiological conditions are slightly alkaline. If, as generally believed, life evolved as a result of UV exposure in a primordial soup close to sea water conditions, it is not surprising that our most important primitive signaling pathways conserve the same chemical environment.136 For this reason, it will be important to develop methods that conserve this alkaline pH for the study of labile phosphate linkages like pHis, pArg and pLys.

THE NEED TO CONSIDER NEW PARAMETERS FOR PHOSPHOPROTEOME 2.0

The technical and biochemical difficulties in detection of His phosphorylation have been a barrier to understanding this important PTM, but with the recent development of new tools for the study of pHis we can expect to see a significant expansion of the phosphoproteome by addition of unconventional phosphorylation events.

In the future, the links between the kinome and phosphoproteome need to be better refined. Kinome analysis and kinase families are useful to delineate protein kinase specificities and search for common substrates, as defined by specific phosphorylation motifs, where positionally conserved residues are required for efficient phosphotransfer. Most Ser/Thr/Tyr kinases recognize their target phosphorylation sites based on the identities of the amino acids from –5 to +5 positions around the phosphoacceptor site, and this together with primary sequence relationships in the catalytic domain allows them to be grouped by similarity. A comparative genomic/phosphoproteomic analysis showed that some phosphorylation sites (pSer) have evolved to replace acidic amino acids (Asp and Glu) in order to allow phosphorylation-dependent regulation,137 raising the possibility of the same being true during the evolution of other types of phosphorylation. In this connection, however, it should be noted that it is not known whether His kinases, such as NM23, select substrate His targets based on a primary sequence motif or through some other mechanism.
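Because it remains unknown whether His kinases select their targets through a primary sequence motif, one simple starting point is to collect the –5 to +5 window around each candidate His residue for later motif comparison. The Python sketch below shows this under stated assumptions: the function name, the padding character and the example sequence are illustrative only, and the sequence is made up rather than taken from a real protein.

```python
def flanking_windows(sequence: str, residue: str = "H", flank: int = 5, pad: str = "_"):
    """Return (1-based position, window) for every occurrence of residue.

    Each window spans -flank to +flank around the residue; positions beyond
    the protein termini are filled with the pad character so that the
    candidate phosphoacceptor always sits at the center of the window.
    """
    padded = pad * flank + sequence + pad * flank
    windows = []
    for index, amino_acid in enumerate(sequence):
        if amino_acid == residue:
            # index in sequence corresponds to index + flank in padded,
            # so the slice below is centered on the residue.
            windows.append((index + 1, padded[index:index + 2 * flank + 1]))
    return windows


# Made-up peptide sequence, used only to illustrate the output format.
example = "MKTAYHAKQRLEH"
for position, window in flanking_windows(example):
    print(position, window)
```

Windows collected in this way from confirmed pHis sites could then be aligned to look for positionally conserved residues, in the same manner as for Ser/Thr/Tyr kinase motifs.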

The size of the pHis phosphoproteome is not insignificant, and with this realization it becomes increasingly important that phosphoproteome analysis involves every possible SONAte phosphoacceptor site within the proteome, so as to provide unbiased information about phosphorylation independently of the sequence, function or effect. For this reason, experimental conditions need to be adapted for the type of phosphorylation group being studied. Finally, in the same way that not all proteins can be phosphorylated by one kinase, not all phosphorylations can be detected using one method. For this reason, the complementary use of several methods and approaches is essential.138 The interrogation of multidimensional phosphoproteomic data sets, encompassing the whole range of phosphorylation profiles, might also help to define new driver kinases that could be therapeutic targets.139

Conclusion

Studies of the NM23 His kinase have helped illuminate a missing part of the phosphoproteome. NM23 is a useful model to define His phosphorylation mechanisms in eukaryotes. Based on the high-energy pH118 bond, NM23 could also be a good model for phosphorylation of other amino acids such as Asp, by pHis to Asp phosphotransferase activity, as originally shown for aldolase C,140 and of further amino acids, as suggested by the phosphorylation of KSR1 S392.

Although not all the substrates and functions of NM23 His kinases have been fully defined, the available data reveal a tumor suppressive nature (promote differentiation, metastasis-suppressor activity, direct interaction with the p53 tumor suppressor, etc.96, 141, 142, 143). This opens up therapeutic possibilities either for pharmaceutical (mimetic molecules) or diagnostic (detection or downstream signaling pathways) approaches. One challenge will be to devise methods to conserve, stabilize and detect His phosphorylation in order to identify the key NM23 targets.

No doubt the availability of the pHis antibodies, which can still be improved for increased affinity, and the development of sequence-specific phospho-NM23-1/2 polyclonal antibodies that recognize the catalytic autophosphorylation site H118, as well as any other site-specific pHis antibodies, will facilitate the study of His phosphorylation in the future. In parallel, further improvements in the detection of His phosphorylation by MS are needed to expand the pHis phosphoproteome and allow functional analysis of individual His phosphorylation sites of interest.119

Phosphorylation of His has aroused a lot of recent interest, and the number of studies concerning this PTM is growing. The rapid progress in this area reveals that we still have a lot to learn about our expanding phosphoproteome.