An in-depth characterization of maize-derived trypsin revealed an unusual nonconsensus N-linked glycosylation.
Recombinant bovine trypsin has been produced in transgenic maize as a pathogen-free alternative to the animal-derived reagent. Biological reagents intended for pharmaceutical purposes require comprehensive analysis, including detailed information about sequence and posttranslational modifications. In this study, several techniques were applied, including mass spectrometry (MS) analysis of the intact protein, peptide mapping, and MS analysis of released glycans. Together they gave an in-depth characterization of maize-derived trypsin and revealed an unusual nonconsensus N-linked glycosylation.
The protease enzyme trypsin is produced in the pancreas and has many uses in the pharmaceutical sector. As well as being used to digest proteins for laboratory analysis, where it specifically cleaves at arginine and lysine residues, it also has applications in pharmaceutical production, such as producing insulin from pro-insulin and making vaccines. For any pharmaceutical process, it is essential that the enzyme is free from pathogens, which may be present in animal-derived products.
The increasing desire of industry to avoid reagents from animal sources has inspired many attempts to express bovine trypsin in alternative platforms. For that reason, the expression of trypsin in maize for large-scale industrial and pharmaceutical applications was developed and optimized by Woodard et al. (1). The task was accomplished by expressing the enzyme in an inactive zymogen form that accumulates in the endosperm of the maize seeds. The zymogen gene was inserted into maize plants, which were then cultivated in open fields. The purified enzyme is currently commercialized by Sigma-Aldrich under the trade name TrypZean. While more expensive than animal-derived trypsin, its cost is more than offset by the elimination of regulatory costs associated with viral-clearance studies that are needed when using the animal product.
The biophysical and chemical properties of TrypZean and native bovine trypsin are compared in Table I. TrypZean has the same amino-acid sequence as the bovine-sourced product. Functionally, its activity appears to be identical to that of the native bovine protein. However, pancreatic trypsin is not glycosylated, whereas characterization of the maize-derived trypsin has shown that corn glycosylates the enzyme. It was important to pinpoint precisely where the protein was glycosylated, so that the exact structure could be known for pharmaceutical applications, and to fully understand the biological processes. To provide this full characterization, a novel way of preparing the samples was developed.
Table I: Physicochemical properties of native bovine trypsin and TrypZean.
Glycosylation is a common post-translational modification, with around half of all human proteins bearing some form of sugar functionality. There are two major forms of glycosylation, O- and N-linked. For O-glycosylation, the sugar can be attached to the hydroxyl group of a serine or threonine residue anywhere in the protein. For N-glycosylation, a well-defined rule states where in a protein sequence the modification can occur: the glycan is attached to the amide group of an asparagine residue that is followed first by any amino acid other than proline, and then by either serine or threonine. This so-called consensus sequence does not occur in trypsin, which implied that TrypZean was O-glycosylated.
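The consensus rule described above can be expressed as a short pattern search. The following Python sketch is illustrative only and is not part of the authors' workflow. The function name `find_sequons` and the toy sequence are assumptions introduced here; the 20-residue sequence is the tryptic peptide Ser70 to Lys89 reported later in this article.

```python
import re

# N-glycosylation consensus: Asn, then any residue except Pro, then Ser or Thr.
# A lookahead is used so that overlapping motifs are all reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(sequence, first_residue=1):
    """Return (residue number, three-letter motif) for each consensus site."""
    return [
        (m.start() + first_residue, sequence[m.start():m.start() + 3])
        for m in SEQUON.finditer(sequence)
    ]


# Tryptic peptide Ser70-Lys89 quoted in the article.
GLYCOPEPTIDE = "SIVHPSYNSNTLNNDIMLIK"
# Hypothetical toy sequence: one valid motif (NGT) and one blocked by Pro (NPS).
TOY = "AANGTAANPSA"

print(find_sequons(GLYCOPEPTIDE, first_residue=70))  # no consensus motif found
print(find_sequons(TOY))                             # only the NGT motif
```

None of the four asparagines in the peptide is followed by the required residues, which is why a search based on the consensus rule alone would not flag this region.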
The following are the three standard methods for the analysis of glycosylated proteins using mass spectrometry (MS):
1. Liberation of the glycans from the protein by chemical or enzymatic means, followed by derivatization of the glycan before MS analysis
2. MS analysis of the intact protein with no pretreatment (i.e., top-down strategy)
3. Peptide mapping, with proteolytic digestion of the glycosylated protein followed by MS analysis of the resulting digest (i.e., bottom-up strategy).
The choice of technique depends on the amount of information required. Although the first, more conventional method of releasing the glycans for MS analysis is useful in assessing the overall glycan population, it requires more sample, time, and manipulations than the other two methods. MS analysis of the intact protein can give information about the protein's molecular weight and glycan masses, and in combination with tandem MS, can confirm the combination of glycan building blocks in the observed masses. The bottom-up strategy has advantages when there is more than one glycosylation site, or when more sequence information is needed, including information about where the glycan is attached. These data can usually be generated in a single liquid chromatography–tandem mass spectrometry (LC–MS/MS) experiment, but the data analysis is more complex and time-consuming.
All three strategies were applied to TrypZean. First, the accurate molecular weight was established using a high-resolution electrospray ionization time-of-flight (ESI–TOF) mass spectrometer on the intact protein. The deconvoluted spectrum of recombinant bovine trypsin (see Figure 1a) shows a peak with a mass of 23,294 Da, which is consistent with the theoretical molecular weight of native bovine trypsin. In addition, seven glycoforms, labelled with the numbers 1 to 7, were observed (2).
Figure 1: (a) Deconvoluted mass spectrometry (MS) spectrum of intact TrypZean. The delta masses above the nonglycosylated base peak correspond to glycan species; (b) matrix-assisted laser desorption/ionization mass spectrum of released permethylated glycans from TrypZean; (c) electrospray ionization–MS spectrum of triply charged glycopeptide Ser70–Lys89. Peaks 1–7 in all spectra correspond to the same glycan compositions.
Previous attempts to release the glycans on TrypZean by N-glycosidase F and N-glycosidase A had not been successful (1). The investigators speculated that the protein was O-glycosylated, because the normal procedures to release N-linked glycans did not work. The standard O-glycosylation release method of β-elimination also failed to work. The glycans were ultimately released using a more aggressive β-elimination procedure and subsequently permethylated to enhance detection during matrix-assisted laser desorption/ionization mass spectrometry (MALDI–MS) analysis. Analysis of the released and permethylated glycans provided additional evidence of the presence of the seven glycoforms (see Figure 1b). The release conditions used were so harsh, however, that it remained unclear whether the glycosylation was N- or O-linked.
To identify the site of glycosylation, LC–MS/MS analysis of glycopeptides was carried out. First, a sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS–PAGE) separation was performed, as shown in Figure 2. The gel showed the presence of two distinct forms of the protein. The lower band is consistent with the migration of the 23 kDa native bovine trypsin, and the upper band corresponds to the glycosylated form. These two bands were excised, and peptide mapping was carried out for each using LC–MS/MS. Data from each band correlated well with the sequence of native bovine trypsin, giving sequence coverages greater than 85%. One 20-residue tryptic peptide was absent from the peptide map of the upper, glycosylated protein band, implying that the glycosylation was likely contained within the sequence of this tryptic peptide, 70SIVHPSYNSNTLNNDIMLIK89 (2).
Figure 2: (a) Sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS–PAGE) gel bands excised for electrospray ionization liquid chromatography–tandem mass spectrometry analysis; (b) sequence coverage of glycosylated protein band (upper) and nonglycosylated band (lower).
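The reasoning behind the missing peptide can be illustrated with a small in silico digest. The Python sketch below is a simplified illustration and does not reproduce the authors' software or data. It assumes the common rule that trypsin cleaves after lysine or arginine except before proline, and it uses a hypothetical toy protein in which invented flanking residues (`AAGGR` and `TTEEK`) surround the real 20-residue peptide.

```python
import re


def tryptic_peptides(sequence):
    """Split after K or R, except when the next residue is P (simplified rule)."""
    return [p for p in re.split(r"(?<=[KR])(?!P)", sequence) if p]


def coverage(sequence, identified):
    """Fraction of residues covered by the identified peptides."""
    covered = [False] * len(sequence)
    for peptide in identified:
        start = sequence.find(peptide)
        while start != -1:
            for i in range(start, start + len(peptide)):
                covered[i] = True
            start = sequence.find(peptide, start + 1)
    return sum(covered) / len(sequence)


# Hypothetical toy protein: invented flanks around the peptide from the article.
GLYCOPEPTIDE = "SIVHPSYNSNTLNNDIMLIK"
TOY_PROTEIN = "AAGGR" + GLYCOPEPTIDE + "TTEEK"

expected = tryptic_peptides(TOY_PROTEIN)
# Assume a database search matched every unmodified peptide but not the
# glycosylated one, whose mass no longer fits the unmodified sequence.
identified = [p for p in expected if p != GLYCOPEPTIDE]
missing = [p for p in expected if p not in identified]

print(expected)
print(missing)
print(round(coverage(TOY_PROTEIN, identified), 3))
```

A peptide that is predicted by the digest but never matched is a natural place to look for a modification, which is the logic applied to the upper gel band.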
To confirm this hypothesis, the LC–MS/MS data from the tryptic digest of the glycosylated band were manually searched for the presence of glycopeptides. The glycan moieties of the seven observed glycopeptides all appeared to be attached to the same tryptic peptide (see Figure 1c), and had masses similar to those found in the intact molecular weight and released glycan analyses, as shown in Table II.
Table II: Summary of identified glycans found in TrypZean.
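Matching a glycan composition to an observed mass shift is simple arithmetic over the residue masses of the monosaccharide building blocks. The Python sketch below is a minimal illustration, not the authors' procedure, and its output is not taken from Table II. It uses standard monoisotopic residue masses and the one composition named explicitly in this article, Hex3HexNAc2Xyl1Fuc1; the function names are assumptions introduced here.

```python
import re

# Standard monoisotopic residue masses (monosaccharide minus water), in Da.
RESIDUE_MASS = {
    "Hex": 162.05282,     # hexose
    "HexNAc": 203.07937,  # N-acetylhexosamine
    "Fuc": 146.05791,     # fucose (deoxyhexose)
    "Xyl": 132.04226,     # xylose (pentose)
}


def parse_composition(text):
    """Turn a string such as 'Hex3HexNAc2Xyl1Fuc1' into a dict of counts."""
    # HexNAc is listed before Hex so the longer name is matched first.
    pairs = re.findall(r"(HexNAc|Hex|Fuc|Xyl)(\d+)", text)
    return {name: int(count) for name, count in pairs}


def glycan_delta_mass(text):
    """Mass added to a peptide or protein by the glycan, in Da (monoisotopic)."""
    counts = parse_composition(text)
    return sum(RESIDUE_MASS[name] * n for name, n in counts.items())


composition = "Hex3HexNAc2Xyl1Fuc1"
print(parse_composition(composition))
print(round(glycan_delta_mass(composition), 4))  # about 1170.42 Da
```

For an intact-protein spectrum, the comparison would normally be made against average rather than monoisotopic masses, so the values differ slightly from those computed here.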
The 20-amino acid tryptic peptide identified as the location of the glycosylation contained four asparagine residues, three serines, and one threonine, any of which might have been glycosylated. Despite the best efforts using these standard characterization techniques, the authors were unable to determine the precise nature of the glycosylation. Standard tandem MS spectra of glycopeptides tend to yield fragments resulting from cleavage of the glycan rather than the peptide backbone, and this proved to be the case here (see Figure 3). Electron-transfer dissociation (ETD), a newer technique that fragments the peptide backbone without losing post-translational modifications such as glycosylation, was then applied. The ETD data suggested that the glycosylation was at Asn-77. This could not be definitively confirmed, however, because the fragment ion series obtained was incomplete (3).
Figure 3: Product-ion (CID) spectrum of TrypZean glycopeptide Ser70–Lys89 (peptide + Hex3HexNAc2Xyl1Fuc1). The triply charged ion of m/z 1154.5253 was the precursor ion. Blue square is N-acetylglucosamine or HexNAc; red triangle is fucose or Fuc; green circle is hexose or Hex; yellow star is xylose or Xyl.
In collaboration with Michael Gross's group at Washington University in St. Louis, MO, the authors developed a novel method for preparing the sample that allowed the enzyme to be analyzed more precisely. The normal procedure for digesting a protein for characterization involves using porcine trypsin to generate peptides. Instead, a novel sample preparation was investigated that uses a different enzyme: pepsin. Unlike trypsin, which cuts at arginine and lysine residues only, pepsin has limited specificity and produces smaller fragments that are more amenable to ionization and MS analysis.
The pepsin was used in immobilized form attached to agarose beads, and the tryptic glycopeptide was exposed to it for varying amounts of time. The theory behind this strategy was that by exposing the glycopeptide briefly, a peptide with only the first amino acid cleaved would be generated. Then, taking a sample a little later in the digestion, a peptide with the first two residues gone could be generated, a longer digestion still would yield a peptide with the third residue cleaved, and so on. By nibbling at the end of the peptide in this way and taking mass spectral data at each point, it would become clear when the amino acid bearing the sugar was removed.
Figure 4: Pepsin cleavage chart of TrypZean glycopeptide Ser70–Lys89; the distribution of modifications, including glycosylation and oxidation, on various peptic peptides is listed.
By working down the amino acid sequence of the glycopeptide in this way, a series of 12 peptic fragments was generated (see Figure 4). MS3 analysis of the various peptide fragments showed definitively that the glycan was attached to Asn-77 (see Figure 5). This result occurred despite the fact that the sequence was asparagine–serine–asparagine, which, according to the accepted consensus sequence rules, should have precluded N-glycosylation.
Figure 5: Product-ion (MS3) spectrum of TrypZean glycopeptide Ser70–Asn77 (peptide + Hex3HexNAc2Xyl1Fuc1).
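The ladder logic can be sketched as a process of elimination over the candidate residues. The Python example below is an illustration under stated assumptions and does not reproduce the data in Figure 4. It assumes a single glycosylation site, and only the Ser70 to Lys89 and Ser70 to Asn77 glycopeptides come from this article; the two fragments marked hypothetical were invented to show how the candidates narrow.

```python
PEPTIDE = "SIVHPSYNSNTLNNDIMLIK"  # tryptic glycopeptide Ser70-Lys89
FIRST = 70                        # residue number of the first amino acid


def localize_site(fragments, residues="NST"):
    """Narrow down the glycosylation site, assuming exactly one site.

    fragments: iterable of (first_residue, last_residue, carries_glycan).
    A glycosylated fragment must contain the site; a fragment without the
    glycan cannot contain it.
    """
    candidates = {pos for pos, aa in enumerate(PEPTIDE, FIRST) if aa in residues}
    for first, last, carries_glycan in fragments:
        span = set(range(first, last + 1))
        candidates = candidates & span if carries_glycan else candidates - span
    return sorted(candidates)


observations = [
    (70, 89, True),   # intact tryptic glycopeptide (from the article)
    (70, 77, True),   # Ser70-Asn77 glycopeptide (from the article, Figure 5)
    (70, 76, False),  # hypothetical fragment observed without the glycan
    (78, 89, False),  # hypothetical fragment observed without the glycan
]

site = localize_site(observations)
print([(PEPTIDE[pos - FIRST], pos) for pos in site])
```

With only the intact glycopeptide as input, all eight asparagine, serine, and threonine residues remain candidates; each additional peptic fragment removes some of them until a single residue is left.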
The small fragments created using this technique make it easy to identify with confidence the exact site of glycosylation in cases such as this one where several possibilities exist. In straightforward cases, standard porcine trypsin digestion remains adequate, but additional pepsin digestion should prove useful where multiple post-translational modifications or modification sites occur. The less specific nature of pepsin digestion leads to a ladder series of fragments that can help definitively identify the site of modification.
The combination of techniques used in this investigation allowed the authors to show definitively the location of the glycosylation in TrypZean. As far as the authors are aware, this work is the first definitive experimental proof that a nonconsensus N-glycosylation occurs in maize-derived bovine trypsin. Small amounts of glycosylation may occur at other sites, but it is evident that glycosylation at the Asn-77 residue is by far the most abundant.
Kevin Ray, PhD,* is a manager of analytical R&D and Pegah R. Jalili, PhD, is a senior R&D scientist in analytical R&D, both at SAFC. *To whom correspondence should be addressed.
PEER REVIEWED
Article submitted: Jul. 06, 2011.
Article accepted: Aug. 25, 2011.
1. S.L. Woodard et al., Biotechnol. Appl. Biochem. 38, 123–130 (2003).
2. P.R. Jalili et al., Proceedings of the 56th ASMS Conference on Mass Spectrometry and Allied Topics (Denver, CO, 2008) pp. 1–5.
3. H. Zhang et al., Analytical Chem. 82 (24), 10095–10101 (2010).
4. E.E. Hood et al., NABC Report 17: Agricultural Biotechnology: Beyond Food and Energy to Health and the Environment (National Agricultural Biotechnology Council, Ithaca, NY, 2005) pp. 147–158.