Introduction

Plant-based systems are gaining acceptance as alternative production platforms for recombinant biopharmaceuticals (reviewed in1), with the first product (Elelyso by Protalix) released to the market. With regard to the differences in posttranslational modifications between humans and plants, considerable progress has been achieved in the humanization of asparagine (N)-linked glycosylation of plant-made pharmaceuticals (PMPs). The attachment of immunogenic plant-specific β1,2-xylose and α1,3-fucose residues to the core N-glycan was abolished in different plant systems2,3,4,5. In addition, plant-produced recombinant human EPO (rhEPO) devoid of Lewis A epitopes on N-glycans was reported recently6. Lewis A is a trisaccharide structure which occurs only rarely on glycoproteins of healthy adult humans but is widespread in plants. Further humanization of the N-glycosylation on plant proteins was achieved by expression of the human β1,4 galactosyltransferase7,8 and additional heterologous enzymes necessary for engineering sialylation9,10. Despite this progress in engineering N-glycosylation, O-glycosylation, i.e. the attachment of glycans to the hydroxyl group of amino acids, can still affect product quality. Plant O-glycosylation differs markedly from the typical human mucin-type O-glycosylation (reviewed by11) and induces antibody formation in mammals12,13. Immunogenicity of biopharmaceuticals may result in reduced product efficacy and is a potential risk for the patients14,15. Such adverse effects hamper the broad use of plants as production hosts for biopharmaceuticals. In plants, the main anchor for O-glycosylation is 4-trans-hydroxyproline (Hyp) (reviewed in16,17), while no further modification of Hyp occurs in mammals18. Although Hyp is always synthesized post-translationally by prolyl-4-hydroxylases (P4Hs) via hydroxylation of the γ carbon of proline, recognition sequences on the target proteins differ between mammals and plants18.
The action of both mammalian and plant P4Hs leads to Hyp, while its diastereomer 4-cis-hydroxyproline has not yet been found in a natural protein19. Hyp is an important structural component of plant cell walls and of the extracellular matrix of animals. In animals, Hyp plays a key role in stabilizing the structure of collagen, one of the most abundant proteins in mammals, in which the second proline of the tripeptide PPG is usually hydroxylated by collagen P4Hs. In plants, Hyp residues are the attachment sites for O-glycosylation of hydroxyproline-rich glycoproteins (HRGPs), the most abundant proteins in the plant extracellular matrix and cell wall. HRGPs include extensins, proline-rich glycoproteins and arabinogalactan proteins16,20,21. Prolyl-hydroxylation and subsequent glycosylation of plant cell wall proteins are of major importance for growth, differentiation, development and stress adaptation22,23.

The target motifs for Hyp-anchored O-glycosylation in plants, so-called glycomodules, were defined and validated20,21. From these, the consensus motif [A/S/T/V]-P(1,4)–X(0,10)–[A/S/T/V]-P(1,4) (where X can be any amino acid) was derived for predicting prolyl-hydroxylation in plants11. According to in silico analysis of the human proteome, approximately 30% of all proteins contain this motif, making them candidates for non-human prolyl-hydroxylation and subsequent O-glycosylation when expressed in plant systems11. Indeed, undesired plant-typical prolyl-hydroxylation24,25,26 and in some cases subsequent arabinosylation of biopharmaceuticals were reported27,28,29. On the other hand, the artificial introduction of Hyp-O-glycosylation motifs was suggested as an alternative to PEGylation (the attachment of polyethylene glycol oligomers to proteins or peptide drugs) to increase the serum half-life of biopharmaceuticals30,31. However, non-human prolyl-hydroxylation not only alters the native sequence of the protein, but also serves as an anchor for O-glycans, which in turn may be immunogenic. Thus, the elimination of the anchor Hyp is the only safe way to avoid adverse O-glycosylation in PMPs.
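The consensus motif above can be read as a pattern over one-letter amino acid codes. The following is a minimal sketch of such a scan, assuming the motif is translated literally into a regular expression; it is not the prediction tool used in the cited work, and the function name and the `first_residue` parameter are hypothetical helpers introduced only for illustration.

```python
import re

# Literal translation of [A/S/T/V]-P(1,4)-X(0,10)-[A/S/T/V]-P(1,4):
# one of A/S/T/V, then 1-4 prolines, then 0-10 arbitrary residues,
# then again one of A/S/T/V followed by 1-4 prolines.
GLYCOMODULE = re.compile(r"[ASTV]P{1,4}.{0,10}[ASTV]P{1,4}")


def find_glycomodules(sequence, first_residue=1):
    """Return (position, matched_segment) for non-overlapping motif hits.

    `first_residue` is the protein position of sequence[0], so that
    positions of a peptide can be reported in protein numbering.
    """
    return [
        (match.start() + first_residue, match.group())
        for match in GLYCOMODULE.finditer(sequence)
    ]


if __name__ == "__main__":
    # Tryptic rhEPO peptide 144-158 and the rhEPO N-terminal peptide.
    print(find_glycomodules("EAISPPDAASAAPLR", first_residue=144))
    print(find_glycomodules("APPRLICDSRVL"))
```

Under this literal reading, the rhEPO peptide EAISPPDAASAAPLR (144–158) contains a hit starting at Ser147, whereas the N-terminal peptide APPRLICDSRVL contains none, because it lacks a second [A/S/T/V]-P unit.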

Among plants, the moss Physcomitrella patens offers the unique possibility for precise targeted genetic engineering via homologous recombination (e.g.3,32). Further, several recombinant proteins have been produced in the moss bioreactor, including rhEPO33, one of the top-ten biopharmaceuticals world-wide34. EPO is a highly glycosylated peptide hormone stimulating erythropoiesis. Recombinant hEPO produced in CHO (Chinese hamster ovary) cells is used for prevention or treatment of anaemia in nephrology and oncology patients and can be abused for illegal doping activities. A glyco-engineered version of EPO (asialo-EPO) has no hematopoietic activity but can serve as a safe drug with neuro- and tissue-protective functions after stroke and additional hypoxia stress35,36. Production of correctly N-glycosylated asialo-EPO in the moss bioreactor was reported recently6. However, plant-derived rhEPO from moss and Nicotiana benthamiana was shown to be hydroxylated within the motif SPP (amino acids 147–149)24,25 which may have adverse effects on patients.

In this study, we aimed to identify and destroy the genes responsible for undesired non-human prolyl-hydroxylation of rhEPO produced in the moss bioreactor.

Results

As the proline hydroxylation on moss-produced rhEPO occurred within a consensus motif recognized by plant P4Hs, we searched for homologues of these enzymes in the moss genome. Via BLAST (basic local alignment search tool) searches in the Cosmoss database (www.cosmoss.org) we were able to identify six sequences from the P. patens genome with homology to P4H enzymes: Pp1s8_114V6.1 (P4H1), Pp1s192_51V6.1 (P4H2), Pp1s19_322V6.1 (P4H3), Pp1s172_91V6.1 (P4H4), Pp1s12_247V6.1 (P4H5) and Pp1s328_29V6.1 (P4H6). As sequence information was not complete for P4H2, 3 and 6 mRNA, 5′ RACE-PCR was employed to obtain full length sequences (Supplementary Fig. S1 online). Two different cDNAs were amplified for the P4H6 gene, corresponding to alternative splice forms of the mRNA, from which two protein variants with different N-termini could be predicted (P4H6a and P4H6b, with P4H6a containing an N-terminal extension not present in P4H6b). All deduced protein sequences had a prolyl-4-hydroxylase alpha subunit catalytic domain (SMART 0702). N-terminal transmembrane domains were predicted for all homologues except P4H2 (TMHMM server v.2.0, http://www.cbs.dtu.dk/services/TMHMM/). As none of the sequences possessed CXXXC motifs typical for prolyl-3-hydroxylases37, we exclude 3-hydroxylation activity of the predicted proteins.

In order to gain more information about the predicted P4H enzymes, the deduced amino acid sequences were aligned with sequences of already characterized P4Hs from human, Arabidopsis thaliana and Nicotiana tabacum. The catalytic domain in the C-terminal end of the protein is highly conserved in all seven P. patens homologues (Supplementary Fig. S2 online). The seven putative P4Hs share 16–24% identity with the human catalytic α (I) subunit and 30–63% identity with AtP4H1. Among the moss sequences the degree of identity is between 30 and 81%. All sequences contain the motif HXD and a distal histidine, which are necessary to bind the cofactor Fe2+. Further, they contain the basic residue lysine which binds the C-5 carboxyl group of 2-oxoglutarate (Supplementary Fig. S2 online). These residues are indispensable for the activity of collagen P4Hs38 and of P4H1 from A. thaliana39, indicating that all seven sequences from P. patens are functional prolyl-4-hydroxylases.
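Percent identity values such as those above depend on how identity is counted over an alignment. As a minimal sketch, assuming identity is computed over the alignment columns in which both sequences carry a residue (one common convention; the denominator used for the values reported here is not stated), a pairwise calculation could look as follows. The function name and the toy sequences are illustrative only and are not taken from the P4H alignment.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity of two sequences taken from the same alignment.

    Columns in which either sequence has a gap are ignored (assumed
    convention); the remaining columns form the denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    compared = 0
    identical = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip gapped columns
        compared += 1
        if res_a == res_b:
            identical += 1

    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


if __name__ == "__main__":
    # Toy alignment: 5 comparable columns, 4 of them identical.
    print(percent_identity("ACD-EF", "ACDGEY"))
```

Other conventions (for example, dividing by the full alignment length or by the length of the shorter sequence) would give lower values for the same alignment.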

Non-human prolyl-hydroxylation occurred on moss-derived rhEPO which has been secreted to the medium of the moss bioreactor culture. Therefore, we concluded that the P4H enzyme responsible for post-translational rhEPO modification is located in the secretory compartments, i.e. the endoplasmic reticulum (ER) or the Golgi apparatus. All sequences display a hydrophobic segment near the N-terminus following a positively charged residue, suggesting a localization of the proteins in the secretory compartments40. To test this, we first examined the subcellular localization of the seven moss P4Hs in silico with four different programs based on different algorithms. As no consistent prediction was obtained by this approach (Supplementary Table S1 online), we subsequently studied the in vivo intracellular localization of each of the seven moss P4Hs by expressing them as GFP (green fluorescent protein) fusions (P4H-GFP) in moss cells. Transfected cells were analysed via Confocal Laser Scanning Microscopy. In optical sections recorded 3–14 days after transfection, GFP signals from all seven different P4H fusion proteins were predominantly detected as defined circular structures around the nucleus, indicating labelling of the nuclear membrane (Fig. 1). As the nuclear membrane is part of the endomembrane continuum of eukaryotic cells, these signals reveal that all seven moss P4Hs were targeted to the secretory compartments. An ER-targeted GFP version (ASP-GFP-KDEL41), as well as GFP without any signal peptide, which displays GFP fluorescence in the cytoplasm as well as the nucleus41, served as controls. Thus, these experiments provided no clear indication of a specific P4H responsible for generation of Hyp on secreted rhEPO but left all of them as candidates.

Figure 1

Subcellular localization of P. patens P4H homologues.

Fluorescence of P4H-GFP fusion proteins in P. patens protoplasts was observed by confocal microscopy 3 to 14 days after transfection. The images obtained for P4H1-GFP, P4H3-GFP and P4H4-GFP are shown as examples of the fluorescence pattern which was observed for all homologues. (a–c) P4H1-GFP, (d–f) P4H3-GFP, (g–i) P4H4-GFP, (j–l) ASP-GFP-KDEL as control for ER localization, (m–o) GFP without any signal peptide as control for cytosolic localization. (a, d, g, j and m) single optical sections emitting GFP fluorescence (494–558 nm), (b, e, h, k and n) merge of chlorophyll autofluorescence (601–719 nm) and GFP fluorescence, (c, f, i, l and o) transmitted light images. The arrows indicate the nuclear membrane.

In order to definitively identify those homologues responsible for plant-typical prolyl-hydroxylation of moss-produced rhEPO, we aimed to ablate the gene functions of each of the moss P4Hs. Accordingly, gene targeting constructs (Supplementary Fig. S3 online) were designed for the six P4H genes and transferred to the rhEPO-producing moss line 174.16 (described in24) to generate specific deletion (knockout, KO) lines by allele replacement via homologous recombination for each of the P4H genes. After antibiotic selection, surviving plants were screened for homologous integration of the KO construct into the correct genomic locus. Loss of the respective transcript was proven by RT-PCR (Supplementary Fig. S4 online), confirming successful gene ablation. Even if truncated N-terminal P4H fragments might exist, we can exclude any residual enzymatic activity as critical catalytic residues are located in the C-terminus of the P4H family (Supplementary Fig. S2 online). One line for each genetic modification was chosen for further analysis and stored in the International Moss Stock Center (http://www.moss-stock-center.org; Supplementary Table S3 online).

To investigate the effect of each of the P4H ablations on the prolyl-hydroxylation observed for moss-produced rhEPO, the recombinant protein from each of the KO lines (ΔP4H) was analysed via mass spectrometry. For this purpose, total soluble proteins were precipitated from the culture supernatant of the parental plant and one knockout line for each P4H homologue and separated by SDS-PAGE. Subsequently, the main rhEPO-containing band was cut from the Coomassie-stained gel, digested with trypsin and subjected to mass spectrometry for an analysis of the tryptic peptide EAISPPDAASAAPLR (144–158). In the parental plant 174.16, almost half of the rhEPO was hydroxylated (Fig. 2), mainly at the second proline of the SPP motif, as shown by MS/MS (Supplementary Fig. S5 online). The stereochemistry of the observed hydroxyproline, presumably 4-trans-hydroxyproline, was not experimentally verified.

Figure 2

Mass spectrometric analysis of the hydroxylation of moss-produced rhEPO.

(a) Reversed-phase liquid chromatogram of tryptic peptides showing peaks of oxidized and non-oxidized peptide EAISPPDAASAAPLR (144–158) derived from rhEPO produced in moss lines 174.16 (control parental plant), ΔP4H1 #192, ΔP4H2 #6, ΔP4H3 #21, ΔP4H4 #95, ΔP4H5 #29 and ΔP4H6 #8. Selected ion chromatograms for the doubly charged ions of non-oxidized (m/z = 733.4) and oxidized peptide (m/z = 741.4) are shown. (b) Broad band sum spectra for peptide 144–158 showing the absence of prolyl-hydroxylation (Pro) in the line ΔP4H1 #192 and the presence of hydroxylated peptide (Hyp) in the line ΔP4H4 #95, as an example. The singly charged peak between “Pro” and “Hyp” is caused by the incidentally co-eluting peptide YLLEAK. Retention time deviations are technical artefacts.
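The m/z values of the selected ions can be checked from the peptide sequence. The sketch below assumes standard monoisotopic residue masses and treats prolyl-hydroxylation as the addition of one oxygen atom (+15.99491 Da); the function name and the `n_hydroxyl` parameter are introduced here for illustration and are not part of the original analysis.

```python
# Monoisotopic residue masses (Da) for the amino acids in the peptide.
RESIDUE_MASS = {
    "A": 71.03711, "D": 115.02694, "E": 129.04259, "I": 113.08406,
    "L": 113.08406, "P": 97.05276, "R": 156.10111, "S": 87.03203,
}
WATER = 18.01056    # H2O added for the free N- and C-termini
PROTON = 1.00728    # mass of one charge-carrying proton
OXYGEN = 15.99491   # mass shift of one hydroxylation (Pro -> Hyp)


def peptide_mz(sequence, charge=2, n_hydroxyl=0):
    """m/z of the [M + charge*H] ion, with n_hydroxyl hydroxylated prolines."""
    neutral_mass = (
        sum(RESIDUE_MASS[residue] for residue in sequence)
        + WATER
        + n_hydroxyl * OXYGEN
    )
    return (neutral_mass + charge * PROTON) / charge


if __name__ == "__main__":
    peptide = "EAISPPDAASAAPLR"  # rhEPO tryptic peptide 144-158
    for n in (0, 1):
        print(n, round(peptide_mz(peptide, charge=2, n_hydroxyl=n), 1))
```

For the doubly charged ion this gives approximately 733.4 for the unmodified peptide and 741.4 for the singly hydroxylated peptide, in agreement with the values used for the selected ion chromatograms.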

Surprisingly, while rhEPO produced in moss lines with ablated P4H2, P4H3, P4H4, P4H5 or P4H6, respectively, was hydroxylated at levels similar to those found in the parental plant, the ablation of the P4H1 gene alone was sufficient to completely abolish the prolyl-hydroxylation of the SPP motif (Fig. 2). Growth rate, rhEPO productivity and secretion of the protein to the culture medium were not impaired in these knockout plants compared to the parental line (data not shown).

We showed the complete lack of Hyp on rhEPO produced by the ΔP4H1 lines. To verify P4H1 enzymatic activity in prolyl-hydroxylation we ectopically expressed this gene in the ΔP4H1 knockout line #192. Strong overexpression of the P4H1 transcript was confirmed in the resulting lines via semi-quantitative RT-PCR (Supplementary Fig. S4 online). Five P4H1 overexpression lines (P4H1OE) were analysed for rhEPO-Pro-hydroxylation. LC-ESI-MS measurements revealed that P4H1 overexpression restored prolyl-hydroxylation of the moss-produced rhEPO (Fig. 3). The proportion of hydroxylated rhEPO, as well as the hydroxylation pattern, was altered by the elevated expression levels of the gene. In the parental plant 174.16, with native P4H1 activity, approximately half of rhEPO displayed Hyp (Fig. 2), whereas nearly all rhEPO was oxidized in the P4H1 overexpressors (Fig. 3). Furthermore, in the overexpressors not only Pro149 in the peptide EAISPPDAASAAPLR (144–158) was hydroxylated as also seen in the parental plant, but a second proline of this peptide was converted to Hyp (Fig. 3).

Figure 3

Effect of overexpression of the prolyl-hydroxylase gene P4H1.

Comparison of reversed-phase chromatograms showing the retention time for the moss-produced rhEPO peptide EAISPPDAASAAPLR (144–158) and its hydroxylated versions in the knockout moss line ΔP4H1 #192 (upper panel) and in the overexpressing line P4H1OE #32 (lower panel). The spectra of each peak are shown below the chromatograms. In the overexpressing line, the doubly hydroxylated peptide and two singly hydroxylated isomers, one coeluting with the parent peptide, were found.

As hydroxylation and arabinosylation of the human epithelial mucin MUC1 at the sequence APP was reported upon expression in N. benthamiana28, we analysed the rhEPO N-terminal peptide APPRLICDSRVL for prolyl-hydroxylation in moss. After chymotryptic digestion of rhEPO derived from the parental plant 174.16, the knockout plant ΔP4H1 #192 and the overexpressor P4H1OE-45, LC-ESI-MS analysis revealed that this peptide was not hydroxylated in any of these cases (Supplementary Fig. S6 online). The other Pro-containing peptides, the two EPO glycopeptides, have been analysed by mass spectrometry before in the parental plant 174.16 and no Hyp residues were detected (e.g.6,24). Thus, we can exclude any additional prolyl-hydroxylation on moss-produced rhEPO.

Discussion

One of the most common posttranslational modifications in higher eukaryotes is the prolyl-4-hydroxylase (P4H)-catalysed formation of hydroxyproline (Hyp) residues, although sequence recognition sites on target proteins differ between animals and plants. Moreover, Hyp in plants is the main anchor for O-glycosylation, which again diverges from mammalian O-glycosylation. The engineering of the human O-glycosylation machinery in plants was tackled recently25,26,29,42, leading to plant proteins with so-called mucin-type O-glycosylation. Nevertheless, the absence of plant-specific prolyl-hydroxylation and subsequent O-glycosylation should be guaranteed for the production of safe biopharmaceuticals26. Several human proteins expressed in plants were shown to be hydroxylated24,25,26,27,28,29. Recombinant hEPO produced both in moss and in N. benthamiana was shown to be hydroxylated in the consensus sequence SPP24,25, but O-glycosylation was not observed. The Hyp proportion in the moss line 174.16 was higher than that found on rhEPO from N. benthamiana25, which might be due to the different production systems: transient expression in N. benthamiana vs. stable production in moss. It was reported before that less Hyp formation occurs in transient systems than in stable production29. This post-translational modification could be partially reduced by P4H inhibitors29,43; however, the complete elimination of the anchor Hyp on the biopharmaceutical by abolishment of P4H activity is the only reliable way to avoid this adverse modification. Possibly because of the importance of P4Hs in plant growth and development22,23, or because of the high number of isoforms of this enzyme in plants44, the engineering of plant genomes for mutation or deletion of P4H genes has been suggested26,42 but, to our knowledge, not yet conducted.

In this work, we identified 6 P4H homologue genes in P. patens, from which 7 protein sequences were deduced containing the essential motifs for functionality. All of them were shown to be localized in the secretory compartments. By means of precise gene targeting via homologous recombination, knockout lines for each of these genes were generated in order to identify the P4H homologues involved in the prolyl-hydroxylation on secreted rhEPO in P. patens. As proven by MS analysis, the ablation of exclusively P4H1 leads to moss-produced rhEPO free from non-human Hyp. Thus, we demonstrated that the expression of P4H1 is essential and sufficient for the prolyl-hydroxylation of the moss-produced rhEPO.

By overexpressing the enzyme, we could also demonstrate that a higher expression level of P4H1 influences its enzyme activity, not only in the proportion of hydroxylated protein molecules but also in the pattern of hydroxylation. As opposed to moss lines with native P4H1 activity, which hydroxylate only one proline (mainly Pro149) in the peptide EAISPPDAASAAPLR (144–158) of rhEPO, moss lines overexpressing P4H1 produce rhEPO with a second proline hydroxylated in this motif. We demonstrated that of all the moss P4H proteins only P4H1 was active on rhEPO. Of the biopharmaceuticals expressed in P. patens so far, rhEPO was the only one on which Hyp formation was detected. However, considering that different P4H homologues may possess distinct substrate specificities45, it is possible that recombinant proteins bearing a hydroxylation sequence different from the one presented here could be substrates for other P4H homologues. In that case, to completely exclude non-human Hyp formation on moss-produced biopharmaceuticals, multiple knockouts would have to be performed in parallel. Owing to the high sequence identities among the P4H homologues, knockdown strategies, which are feasible in both moss and higher plants46,47, might also be conceivable to remove unwanted Hyp formation in both systems.

The consecutive prolines in the N-terminal sequence APPRLICDSRVL of rhEPO could be assumed to be a putative hydroxylation site. Therefore, we analysed the rhEPO N-terminal peptide, both in moss plants with endogenous activity of P4H1 and also in plants overexpressing this enzyme. No prolyl-hydroxylation of this peptide could be detected, indicating that the mere presence of contiguous proline residues preceded by an alanine in the protein of interest is not sufficient to be recognized by moss prolyl-hydroxylases. Thus we confirmed that in moss-produced rhEPO only the SPP motif was hydroxylated.

Analysing the effects of plant-typical O-glycosylated biopharmaceuticals in the human body would require cost-intensive clinical trials. Furthermore, even slight differences between PMPs and their native counterparts will hamper the approval of a drug by the relevant authorities. Thus, in our opinion the straightforward approach is to precisely eliminate the attachment sites for plant-specific O-glycosylation, the hydroxylated proline residues, on the recombinant protein. As demonstrated here for the production of rhEPO, this can be achieved in the moss production system by the ablation, and most likely also by a down-regulation, of a single P4H gene, thus paving the way to a further humanization of plant-made biopharmaceuticals in the moss bioreactor.

Methods

Identification of prolyl-4-hydroxylases in P. patens

For the identification of prolyl-4-hydroxylase homologues in P. patens, the amino acid sequence of the Arabidopsis thaliana P4H1 (AT2G43080.1) was used to perform a BLAST (basic local alignment search tool) search against the gene models in the Physcomitrella patens resource (www.cosmoss.org). The 5′ complete sequences of the P4H2, 3 and 6 cDNAs were obtained via RACE (rapid amplification of cDNA-ends)-PCR (GeneRacer™, Invitrogen, Karlsruhe, Germany) according to the manufacturer's protocol.

Protein sequence alignments were performed with the program CLUSTAL W48 (www.ebi.ac.uk/Tools/msa/clustalw2/) and visualized with Jalview (www.jalview.org/).

In silico prediction of intracellular localization

The in silico predictions for intracellular localization of P. patens P4H homologues were performed with four different programs: TargetP (http://www.cbs.dtu.dk/services/TargetP/), MultiLoc (https://abi.inf.uni-tuebingen.de/Services/MultiLoc), SherLoc (https://abi.inf.uni-tuebingen.de/Services/SherLoc2) and Wolf PSORT (http://wolfpsort.org/).

Plant material and transformation procedure

Physcomitrella patens (Hedw.) Bruch & Schimp was cultivated as described previously49. Moss-produced rhEPO was shown to be hydroxylated at the prolyl-hydroxylation consensus motif SPP (amino acids 147–149); therefore, the rhEPO-producing P. patens line 174.16 (described in24) was used as the parental line for the P4H knockout generation and the line ΔP4H1 #192 was used for the generation of P4H1 overexpression lines. In these moss lines the α1,3 fucosyltransferase and the β1,2 xylosyltransferase genes are disrupted3. Wild-type moss was used for the subcellular localization experiments with P4H-GFP.

Protoplast isolation and PEG-mediated transfection were performed as described previously49,50. Mutant selection was performed with Zeocin™ (Invitrogen) or sulfadiazine (Sigma) as described before6.

For rhEPO production, P. patens was cultivated as described before6.

Generation of plasmid constructs

The cDNAs corresponding to the seven P4H homologues identified in Physcomitrella patens were amplified using the primers listed in Supplementary Table S2 online and cloned into pJET 1.2 (CloneJET™ PCR CloningKit, Fermentas, St Leon-Rot, Germany). Subsequently, the P4H coding sequences including a portion of the 5′ UTR were cloned into the plasmid mAV4mcs41 using the XhoI and BglII sites giving rise to N-terminal fusion P4H-GFP proteins under the control of the cauliflower mosaic virus (CaMV) 35S promoter. Unmodified mAV4mcs was used as a control for cytoplasmic and nuclear localization. As positive control for ER localization, pASP-GFP-KDEL was taken41.

To generate the P4H knockout constructs, P. patens genomic DNA fragments corresponding to the prolyl-4-hydroxylases were amplified using the primers listed in Supplementary Table S2 online and cloned either into pCR®4-TOPO® (Invitrogen, Karlsruhe, Germany) or into pETBlue-1 AccepTor™ (Novagen, Merck KGaA, Darmstadt, Germany). The pTOPO_P4H1 genomic fragment was first linearized using BstBI and SacI, thus deleting a 273 bp fragment, and recircularized by ligating a double-stranded oligonucleotide containing restriction sites for BamHI and HindIII. These sites were used for the insertion of a zeomycin resistance cassette (zeo-cassette). The zeo-cassette was obtained from pUC-zeo6 by digestion with HindIII and BamHI. For the P4H5 KO construct, a 1487 bp fragment was cut out from the pTOPO_P4H5 using SalI and BglII sites and replaced by a double-stranded oligonucleotide containing restriction sites for BamHI and HindIII. These restriction sites were used for the insertion of the zeo-cassette obtained from the pUC-Zeo plasmid. The P4H2 KO construct was cloned into the pETBlue-1 AccepTor™ and the zeo-cassette replaced a 270 bp genomic fragment deleted by digestion with KpnI and HindIII. The zeo-cassette obtained from pRT101-zeo6 by HindIII digestion was inserted into the pET_P4H3 and the pTOPO_P4H4 KO constructs digested with the same enzyme, replacing a 990 bp and a 1183 bp genomic fragment, respectively. For the P4H6 KO construct, the zeo-cassette was obtained from the pUC-zeo via digestion with HindIII and SacI and inserted into pTOPO_P4H6, replacing a 1326 bp genomic fragment. In all KO constructs the regions homologous to the target gene had approximately the same size at both ends of the selection cassette, comprising between 500 and 1,000 bp.

For the overexpression construct, the P4H1 coding sequence and 79 bp of the 5′UTR were amplified from moss WT cDNA with the primers listed in Supplementary Table S2 online and cloned under the control of the 35S promoter and the nos terminator into the mAV4mcs vector41. For this purpose the GFP gene was deleted from the vector by digestion with Ecl136II and SmaI and subsequent religation of the vector. The P4H1 cDNA was inserted into the vector via XhoI and BglII restriction sites. The P4H1 overexpression construct was linearized via digestion with EcoRI and PstI and transferred into the line ΔP4H1 #192 together with pUC 18 sul6 for sulfadiazine selection.

Screening of transformed plants

Screening of stably transformed plants was performed via direct PCR51 with genomic DNA extracted as described before6. From these extracts, 2 μl each were used as template for PCR, using the primers listed in Supplementary Table S2 online to check the 5′ and 3′ integration of the knockout construct in the correct genomic locus and to check the integration of the overexpression construct into the moss genome, respectively. Plants which showed the expected PCR products were considered as putative knockouts or overexpression lines, respectively, and subsequently analysed. The absence of the P4H transcripts in the KO lines was analysed via RT-PCR as described before6 using the primers listed in Supplementary Table S2 online. Expression of P4H1 in the overexpression lines was analysed via semi-quantitative RT-PCR. For this purpose, cDNA equivalent to 150 ng RNA was amplified with 24, 26 and 28 cycles using the P4H1 primers listed in Supplementary Table S2 online. The primers for the constitutively expressed TATA box-binding protein, TBP fwd and TBP rev (Supplementary Table S2 online), were used as controls.

Protein analysis

Total soluble proteins were recovered from 160 ml of a 16-day-old culture supernatant by precipitation with 10% (w/v) trichloroacetic acid (TCA, Sigma-Aldrich, Deisenhofen, Germany) as described52. The pellet was resuspended in Laemmli sample loading buffer (BioRad, Munich, Germany) and electrophoretic separation of proteins was carried out in 12% SDS-polyacrylamide gels (Ready Gel Tris-HCl, BioRad) at 150 V for 1 h under non-reducing conditions.

For peptide analysis, the proteins in the gels were stained with PageBlue® Protein Staining Solution (Fermentas) and the bands corresponding to 25 kDa were cut out, S-alkylated and digested with trypsin or chymotrypsin53. Analysis by reversed-phase liquid chromatography coupled to electrospray ionization mass spectrometry on a Q-TOF instrument (LC-ESI-MS and MS/MS) was performed as described previously53.

Quantification of the moss-produced rhEPO was performed using a hEPO Quantikine IVD ELISA kit (cat. no DEP00, R&D Systems) according to the manufacturer's protocol.

Subcellular localization of P4H homologues

Subcellular localization of the seven different P4H-GFP fusion proteins was analysed 3 to 14 days after transfection by Confocal Laser Scanning Microscopy (510 META; Carl Zeiss MicroImaging, Jena, Germany) and the corresponding software (version 3.5). Excitation at 488 nm was achieved with an argon laser and emission was measured with a META detector at 494–558 nm for GFP and at 601–719 nm for chlorophyll. Cells were examined with a C-Apochromat 63x/1.2 W corr water immersion objective (Carl Zeiss MicroImaging). Confocal planes were exported from the ZEN2010 software (Carl Zeiss MicroImaging).