A review of protein and peptide tags used in protein expression and protein purification, and a summary of protein/peptide tags cited in formal publications.
Protein and peptide (epitope) tags are widely used in protein purification (Figure 1) and protein detection. This article comprehensively discusses major protein/peptide tags and their applications (Figure 2). In this article, protein tags refer to those with more than a dozen amino acids, for example, green fluorescent protein; peptide tags and epitope tags are used interchangeably, referring to the likes of FLAG, Myc epitope, and polyhistidine.
Almost all protein preparations nowadays are done through tagged expression vectors, except for the preparations from native sources. These tags enable easy purification of expressed proteins. In addition, the tags affect the solubility/insolubility of expressed proteins and their activities (Figure 3). Another important application of tags is to enable the detection of expressed proteins. The localization and expression of proteins, against which no antibodies are available yet, can be readily detected through tags. Labome surveys the literature for instruments and reagents. Table 1 lists the number of publications for some of the protein/peptide tags in formal publications that Labome has surveyed.
tag | num |
---|---|
FLAG | 236 |
Myc | 218 |
GFP | 166* |
HA | 164 |
HIS | 133 |
GST | 125 |
V5 | 44 |
There are a few basic, yet essential considerations during the design of proper expression vectors. For example, the sequence of any tag must be in-frame with the sequence of the protein of interest. Codon usage must be considered when different hosts are utilized. Linker sequences can be added to enable, for example, easy cleavage of the tags, or non-interference of the activity of any expressed enzyme.
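The in-frame requirement can be checked mechanically before ordering a construct. The following is a minimal sketch, assuming an N-terminal tag followed by a linker and the target ORF; the function names and the short DNA sequences are hypothetical and chosen only for illustration, and codon usage for a given host is not evaluated here.

```python
# Hypothetical helpers; the DNA sequences used below are illustrative only.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codons(dna):
    """Split a DNA string into consecutive triplets (trailing bases are dropped)."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return [dna[i:i + 3] for i in range(0, usable, 3)]


def check_fusion(tag_dna, linker_dna, target_dna):
    """Return a list of problems found in an N-terminal tag-linker-target design."""
    problems = []
    # A tag or linker whose length is not a multiple of 3 shifts the target out of frame.
    for name, part in (("tag", tag_dna), ("linker", linker_dna)):
        if len(part) % 3 != 0:
            problems.append(f"{name} length {len(part)} is not a multiple of 3 (frameshift)")
    orf = (tag_dna + linker_dna + target_dna).upper()
    if not orf.startswith("ATG"):
        problems.append("fusion does not begin with ATG")
    # Any stop codon before the final triplet would truncate the fusion protein.
    for index, codon in enumerate(codons(orf)[:-1]):
        if codon in STOP_CODONS:
            problems.append(f"premature stop codon {codon} at codon {index + 1}")
    return problems


his_tag = "ATGCATCACCATCACCATCAC"   # Met + 6 x His
linker = "GGTGGCAGC"                # Gly-Gly-Ser
target = "GCTAAAGAATGA"             # toy ORF: Ala-Lys-Glu-stop

print(check_fusion(his_tag, linker, target))          # in-frame design
print(check_fusion(his_tag, linker[:-1], target))     # linker missing one base
```

A check of this kind only catches reading-frame and premature-stop errors; it says nothing about whether the linker leaves the tag accessible or the enzyme active.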
It is often not necessary to remove tags after expression, especially peptide tags, which are small, or when the tag is needed for a specific application. Sometimes a protein tag can increase the solubility of target proteins.
Tags can be at either end of the target protein. Some epitope tags, such as FLAG, are often used in tandem to increase their desired features, or in combination with another tag, such as in the construct of His-Myc, His-V5. Tags can also be inserted into one or more tolerant folds of a target protein, singularly or in tandem, as in the case of "spaghetti monster" [3].
Table 2 lists the major tags cited in the publications and their important parameters.
name | amino acids | detection | purification | suppliers |
---|---|---|---|---|
FLAG | DYKDDDDK | antibody | FLAG peptide | antibody: MilliporeSigma [4, 5] ; ATCC HB-9259 (clone 4E11) [6] ; MBL [7, 8] ; CST 14793 [9] vector:Thermo Fisher |
GFP | ~220 aa | antibody or fluorescence | not used | vector: Thermo Fisher, OriGene [10], Lonza [11] antibody: see here |
GST | 218 | antibody | glutathione | antibody: SCBT, Abcam, Thermo Fisher purification: GE Healthcare GS4B [12] ; Thermo Fisher Scientific 16101 [13] ; CST (11847S) [14] vector:MilliporeSigma |
HA | YPYDVPDYA | antibody | HA peptide | antibody: Biolegend [4], Abcam [9, 15] |
poly-His | HHHHHH | antibody | nickel, imidazole | antibody: Novus Biologicals [16], Miltenyi Biotech [17] purification: Cytiva Ni Sepharose [18] ; Bio-Rad Nuvia IMAC resin [19] ; Thermo Fisher HisPur; GE Healthcare HisTrap HP [20] vector:Thermo Fisher |
Myc | EQKLISEEDL | antibody | Myc peptide | antibody: SCBT sc-40 [21-23], Abcam ab32 [24], ab24609 [21], Thermo Fisher [25], Cell Signaling [4], ICL [17] vector: Thermo Fisher |
V5 | GKPIPNPLLGLDST | antibody | V5 peptide | antibody: Thermo Fisher ( R960-25) [26], Cell Signaling Technology [27] vector:Thermo Fisher |
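The size that a peptide tag adds to a fusion protein can be estimated from its sequence. The sketch below sums standard average residue masses and adds one water for the free termini; the function name is hypothetical, and post-translational modifications and counter-ions are ignored, so the values are approximations only.

```python
# Average residue masses in daltons (standard values for the 20 common amino acids).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.0153  # added once for the free N- and C-termini


def peptide_mass(sequence):
    """Approximate average molecular mass (Da) of an unmodified linear peptide."""
    return sum(RESIDUE_MASS[aa] for aa in sequence.upper()) + WATER


# Peptide tag sequences as listed in Table 2 (Myc as given in the Myc section).
TAGS = {
    "FLAG": "DYKDDDDK",
    "HA": "YPYDVPDYA",
    "poly-His": "HHHHHH",
    "Myc": "EQKLISEEDL",
    "V5": "GKPIPNPLLGLDST",
}

for name, seq in TAGS.items():
    print(f"{name:8s} {len(seq):2d} aa  {peptide_mass(seq) / 1000:.2f} kDa")
```

All of these peptide tags come out at roughly 1 to 1.5 kDa, which is small next to the 218-residue GST or the roughly 220-residue GFP listed in the same table.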
Tags are deployed for different purposes (Figure 2). Almost all tags can be used to enable protein detection through Western blot, ELISA, ChIP (Figure 4), immunocytochemistry, immunohistochemistry, and fluorescence measurement; most tags can be utilized for protein purification. Ranawakage DC et al. systematically evaluated the affinity of tags with their cognate antibodies under immunoprecipitation or ChIP (Figure 5) [2]. Several tags can be explored to solve significant protein expression problems, for example, extending protein half-lives and expressing lethal proteins [28]. Tags like thioredoxin [29], poly(NANP), MBP [14, 30], and GST can increase protein solubility, while others can help localize a target protein to a desired cellular compartment.
Six or eight [31] tandem histidine residues form a nickel-binding structure. Proteins tagged with poly-His can be easily purified by affinity to nickel or cobalt columns and eluted with imidazole. Poly-His tagging is the most commonly used method for protein purification and preparation and can be used in almost all expression systems: bacterial, yeast, mammalian, and insect systems. The elution agent imidazole may affect downstream studies like NMR, competition studies, and X-ray crystallography, and may induce protein aggregation. Useful recommendations for the use of His-tags can be found in [32].
A prevalent practice is to directly clone PCR products into expression vectors like Thermo Fisher or MilliporeSigma Tag expression vectors.
Thermo Fisher His tag expression vectors have been cited: pMT/V5His [33], pAc5.1/V5-His A [34], pAc5.1/V5-His-C [35], pBAD/mycHisA [36], pTricHisB vector [37], and pBAD/His B [38].

His-tag proteins can be readily detected through anti-His tag antibodies in Western blots, discussed in an article dedicated to His tag antibodies.
The resin is the typical carrier for His-tag protein purification.
Thermo Fisher HisPur Cobalt resin was used to study the role of cMyBP-C in the process of cardiac muscle contraction [39]. GE Healthcare HisTrap columns or Ni-NTA resins are very popular and have been used to study biological condensates [40], and others [14, 20]. MilliporeSigma HIS-Select [20], Bio-Rad Nuvia IMAC resin [19], QIAGEN Ni-NTA agarose [41], Cube Biotech Ni-NTA resin [30], Clontech TALON metal affinity resin [41, 42], and Promega MagneHis Protein Purification System [43] were cited as well.
Myc tag, containing a ten amino acid segment of human proto-oncogene Myc (EQKLISEEDL), is a widely used detection system in Western blotting, immunofluorescence and immunoprecipitation experiments, due to the availability of highly specific anti-Myc monoclonal antibodies. It is, however, rarely utilized for protein purification.
Expression vectors from Thermo Fisher and Clontech were preferred choices. Thermo Fisher pBAD/mycHisA was used to investigate the structural features of Yersinia injectisome [36]. Several companies provide Myc-tagged protein expression plasmids. Kulkarni S et al, for example, expressed a plasmid encoding Raly cDNA with a Myc-tag (RC210723) from Origene to study the role of lncRNA CCR5AS in mRNA decay [25]. AddGene pLPC-N MYC (12540) is another choice [44].
Myc antibody is discussed in a dedicated article.
MilliporeSigma Myc immunoaffinity resin was used to purify Myc-tag proteins [45].
GST-tagged proteins are purified with immobilized glutathione and eluted through reduced glutathione (10 mM). The large GST tag tends to increase the solubility of a target protein. A GST-fusion protein is usually treated with thrombin or factor Xa to cleave the tag before other applications.
GST tag can be used in almost all expression systems: bacteria, yeast, mammalian cells, and baculovirus-based insect cells. GST-tagged proteins are detected through anti-GST antibodies.
Santa Cruz anti-GST antibody was used to carry out Western blot analyses to study RNA polymerase I transcription [46], and optineurin [47]. Abcam anti-GST tag antibody was used in a peptide array to study Ire1-unfolded protein interaction [48]. Anti-GST antibodies from Rockland [49], MilliporeSigma [50], and GenScript [51] were cited as well. Anti-GST sensors are available from Pall Fortebio, which enable direct and rapid detection and quantitation of GST-tagged biomolecules and their kinetic binding measurements with other entities [52].
Thermo Fisher immobilized glutathione affinity beads were used to perform immunoprecipitation to study vacuolar H+-ATPase [53], and its B-PER GST purification kit was used to purify proteins to investigate the regulatory effect of HILDA complex on VEGF-A expression [43]. GE Healthcare/Amersham glutathione resins, for example, Glutathione Sepharose 4B GST-tagged protein purification resin [12], are among the most popular choices. They have been used to study p75 signaling [54], synaptic vesicle fusion [55] and IKK2 activation [56]. Magnetic-conjugated mouse anti-GST antibody beads from Cell Signaling Technology (11847S) [14] and from others [57] were cited.
MilliporeSigma GST-pErk2 expression vector was used to investigate the function of SpvC [58]. GE Healthcare/Pharmacia pGEX vectors with GST were used in protein labeling experiments [14, 59].
Similar to the Myc tag, HA tag is predominantly used to enable protein detection when the detection agent for the target protein is not available. It is a short segment YPYDVPDYA from human influenza hemagglutinin protein. HA tags were utilized in Western blotting, immunofluorescence and immunoprecipitation experiments, and occasionally, in ELISA and ChIP assays.
Laflamme C et al used anti-HA magnetic beads from Thermo Fisher Scientific (cat# 88837) to isolate lysosomes from HEK-293 cells transfected with Tmem192-3xHA (Addgene #102930) and to isolate mitochondria from those transfected with 3xHA-eGFP-OMP25 (Addgene #83356) [60]. Thermo Fisher anti-HA magnetic beads were used to perform immunoprecipitation to study FHL1 and FHL2 isoforms [61]. A comprehensive review on HA antibody is available.
V5 epitope tag, GKPIPNPLLGLDST, is from the P/V proteins of paramyxovirus SV5. Paramyxovirus, a negative-sense RNA virus, causes some common human diseases, such as mumps, measles, bronchiolitis, pneumonia, and croup. Mammalian and insect cell expression systems are commonly used for V5 tag. The tag tends to be useful to enable protein detection, such as in Western blotting, immunofluorescence and immunoprecipitation experiments, although it also finds applications for protein purification. The CCSB-Broad Lentiviral Expression Library, generated by researchers in Dana-Farber Cancer Institute and The Broad Institute, is a genome-wide expression-ready lentiviral system for over 15,000 human ORFs with a C-terminal V5 tag.
Thermo Fisher anti-V5 antibody was used to perform Western blots (R96025) [41, 62], immunoprecipitation [63-66], immunocytochemistry [67, 68], and immunohistochemistry [63]. MilliporeSigma anti-V5 rabbit antibody was used to perform immunoprecipitation to investigate the involvement of Fam20C in the phosphorylation of extracellular proteins [68]. Cell Signaling Technology anti-V5 antibody (D3H8Q) was used as well [27].
Thermo Fisher pcDNA 3.1 His-V5 expression vectors (with or without TOPO or other motifs) were commonly used [69]. Also cited were pLENTI6/V5 DEST [70, 71], pMT/V5His [33], pAc5.1/V5-His A [34] and C [35].
MilliporeSigma V5-agarose was used for protein purification [72].
GFP, green fluorescent protein, and its variants like EGFP and Clover [8, 73] are the most popular reporter proteins; they have also been used as tags, for example, as a reporter for TP53 MITE-seq screen [74] or when linked with LC3, as a reporter for autophagy [75]. “Multifunctional GFP” (mfGFP) variants that include multiple affinity tags within an internal loop of the EGFP were engineered to be used as both reporters and tags [76]. They are much larger than typical epitope tags. To minimize the size of the tag, a 16 aa GFP segment can be fused to a target protein while the complementary GFP fragment is expressed ubiquitously or tagged with another protein (the split-GFP strategy), for example, [77]. Tagged proteins, of course, can be detected by fluorescence, and by anti-GFP antibodies. GFP and its derivatives are sometimes co-transfected with other vectors to ease the selection of the transfected cells through fluorescence. For example, Saito T et al co-transfected HepG2 cells with CRISPR vector pX330-U6-Chimeric_BB-CBh-hSpCas9 and pEGFP-C1 from Clontech Laboratories and then sorted GFP-positive cells before expanding the GFP-positive cells and identifying knockout cells [78].
Thermo Fisher pcDNA 6.2-GW/EmGFP-miR was used to study the roles of miR-206 during the regeneration of neuromuscular synapses after acute nerve injury [79]. Open Biosystems pLOC-GFP was used to perform lentiviral production [80]. Origene pLenti-C-mGFP [10], pCMV6-AN-tGFP [81] and pCMV6-Hes5-GFP [82] were used.
Clontech is the major supplier of expression vectors for EGFP and its variants. EGFP-C1 [78, 83], EGFP-N1 [84], H2B-GFP-N1 [85], pIRES-eGFP [65], pIRES2-AcGfp1 [86], pTRE-EGFP [87], and pCMS-EGFP [87] have been cited. Expression vectors from other suppliers were used as well: Lonza pMAXGFP [11], Oligoengine pSUPER-GFP [88], OriGene pCMV6-AN-tGFP [81] and pCMV6-Hes5-gfp [82], Evrogen pTurboGFP-B [89], BD Bioscience pIRES2-EGFP [90], and, through Addgene from individual investigators, the GFP/EGFP expression vectors GFP-LC3 [71] and CBFRE-GFP [91].
Transgenic animals such as fly strain FRT82B Ubi-GFP [92], UAS-GFP [92] from Bloomington Drosophila Stock Center, and The Jackson Laboratory Gt(ROSA)26Sortm4(ACTB-tdTomato,-EGFP)Luo/J mice [86] also used GFP tags.
GFP antibodies are reviewed in a separate article.
FLAG epitope tag, N-DYKDDDDK-C, is the first epitope tag designed for fusion proteins and is the only patented tag. The multiple polyanionic amino acids in the FLAG tag are less likely to affect the activity of a target protein. FLAG-tagged proteins can be recognized or recovered with specific antibodies, M1, M3, M5. FLAG has been used in bacteria, yeast, and mammalian cell systems. 3 X FLAG can improve the detection of the FLAG tag and immunoprecipitation (Figure 4). BioLegend L5 clone has the higher affinity under immunoprecipitation/ChIP (Figure 4). N-terminal FLAG tags can be removed by enterokinase treatment.
Another 3xFLAG tag, DYKDHDGDYKDHDIDYKDDDDK, also designed by MilliporeSigma, has a stronger net negative charge and a lower pI, and was exploited by Kabayama H et al to generate ultra-stable intrabodies [93].
Abcam anti-FLAG antibody, for example, ab18230 [94], or those conjugated with allophycocyanin, was used in flow cytometry to investigate the virus-induced evolution of transferrin receptor [95] and in immunoprecipitation or western blot [66]. MilliporeSigma anti-FLAG antibodies were used in immunoprecipitation [27, 61], dot blots [1], Western blots [6, 15], in immunohistochemistry/immunocytochemistry [96, 97], and in ChIP assay [98]. For example, Yang J et al detected FLAG tag in live and fixed HEK293 cells with MilliporeSigma anti-FLAG antibody F1804 [97]. Herb M et al used the same antibody to visualize FLAG-tagged proteins in fixed cultured macrophages [99]. Saito T et al used anti-FLAG antibody from MBL (185-3L) in Western blots for HepG2 cell lysates and precipitates [78].
Thermo Fisher pcDNA3-based FLAG vectors were used to study the role of RACK1 and protein kinase C alpha in the mammalian circadian clock [100]. Invitrogen pDest-flag plasmid was used to investigate the role of FBW7 in the Notch signaling pathway [82]. FLAG-related constructs provided through Addgene were cited in publications as well [101]. MilliporeSigma 3X-FLAG CMV vectors are the most popular choice for FLAG expression, especially CMV-7 [102-104]. Other FLAG vectors from MilliporeSigma have been cited as well [45, 105-108].
MilliporeSigma anti-FLAG beads or affinity gels were commonly used to precipitate or purify FLAG-tagged proteins [15, 94]. 3 X FLAG peptides are used during immunoprecipitation, purification, or pulse-chase experiments involving FLAG-tagged proteins. MilliporeSigma 3xFLAG peptide is the common selection [45, 109].
Tag | Size (kDa) | Binding Partner | Characteristic | Sample References |
---|---|---|---|---|
ABDz1-tag | 5.2 | Albumin; Protein A | Enables orthogonal affinity purification | [110] |
Adenylate kinase (AK-tag) | 23.5 | Blue-sepharose (adenyl containing factors) | Increases solubility; single step purification; quantitation by activity assay; confers dual affinity when combined with other tags (e.g., His-tag) | [111] |
BC2-tag (PDRKAAVSHWQQ) | | bivBC2-Nb, a corresponding high-affinity bivalent nanobody | Enables high-quality dSTORM imaging | [112] |
Calmodulin-binding peptide | 5.3 | Ca2+ | Protein purity comparable with His-tagged protein; Increased stability | [113] |
CusF | 10 | Cu+ or Ag+ | Affinity purification; increased solubility | [114-116] |
Fc | 25 | Protein A, G, L | Simple detection of protein expression by ELISA kits; Simple affinity purification; Increased protein expression yield | [31, 117] |
Fh8 | 8 | Hydrophobic matrix | Highly increased solubility | [118, 119] |
HaloTag | 33 | Chloroalkane | Covalent binding to HaloLink resin coupled with proteolytic cleavage; increased purity; can bind to indicators such as chromophores | [120, 121] |
Heparin-binding peptide (HB-tag) | 4 | Heparin | One step purification; cost-effective | [122] |
Ketosteroid isomerase (KSI) | 14 | Hydrophobic resin | Effective at targeting inclusion bodies | [123] |
MBP | 42.5 | Cross-linked amylose | Increased solubility | [14, 30, 124] |
Thioredoxin | 12 | None | Elevated thermal stability; assists in the proper folding of proteins | [125, 126] |
PA(NZ-1) | 1.15 | NZ-1 monoclonal antibody | Efficient one-step purification | [127] |
Poly-Arg | 0.80 | Cation-exchange resin | Small tag; cost-effective purification | [128] |
Poly-Lys | 0.69 | Cation-exchange resin | Small tag; cost-effective purification | [129] |
S-tag | 1.74 | S-protein | Elution under stringent conditions | [130] |
SBP / Streptavidin-Binding Peptide | | Streptavidin | 38-amino acid tag | [30] |
SNAP | 20 | Benzylguanine derivatives | | [131] |
Strep-II (Twin-Strep) | 0.88 | Streptavidin | Competition with biotin and derivatives | [132, 133] |
SUMO / SUMO2 | 14 | Requires an affinity tag | Increases solubility | [134, 135] |
Many fusion systems have been devised. C-terminal sortase ligation consensus sequence (LPETG) was used to ligate a phosphopeptide [6, 29] or fluorescent probes [132, 136]. SpyTag, a peptide tag of 13 amino acids, and its protein partner, SpyCatcher, with 138 amino acids, both of which were derived from a domain of Streptococcus pyogenes fibronectin-binding protein FbaB, form an amide/isopeptide bond to irreversibly link two proteins [137]. Akiba H et al used the SpyTag/SpyCatcher system to generate a biparatopic antibody from two scFv fragments [138]. Nakagawa T expressed CNIH3, a transmembrane protein, with a rho 1D4 tag [139]. Li Z et al expressed HTT proteins with an N-terminal protein A tag in HEK293 cells and purified them with IgG monoclonal antibody-agarose [14]. Götzke H et al designed ALFA-tag and its cognate nanobodies for tag detection and protein purification [1]. A lipid-permeable transactivator of transcription (TATk) peptide tag enabled the fluorescent mCherry protein secreted from 4T1 breast cancer cells to label neighboring parenchymal cells [140]. Dai H et al cloned mouse PIRA1-6 into plasmid pFUSE-mIgG1-Fc from InvivoGen and expressed them in P3X63Ag8.653 cells [117]. Tao Y et al expressed and purified an Fc-tagged FZD6 extracellular domain from Expi293 cells using the pFUSE-hIgG1-Fc2 vector from Invitrogen [141]. Pastushok L et al linked an antibody fragment with an iPTD tag, which enabled the entry of the antibody fragment into HEK293T cells [142]. Avi-tag GLNDIFEAQKIEWHE or C-terminal biotin acceptor peptide (BAP, LNDIFEAQKIEWHE) enables the specific biotinylation of lysine (K) within the tag sequence in expressed proteins through E. coli biotin ligase BirA [143, 144]. Small ubiquitin-related modifier (SUMO) can improve the folding and solubility of the target proteins. SUMO protease, for example, ULP1 protease [40], specifically targets SUMO tag based on 3-dimensional structure (rather than sequence motif).
In addition, the protease can cleave urea-solubilized fusion proteins. Horn V et al, for example, expressed RNF168-RING domain constructs in the petNKI-His-SUMO2-kan vector and purified using a His-SUMO tag and cleaved the tag using His-tagged SENP2 protease [145]. ickey TH et al expressed RIG-1 protein with a pET SUMO expression vector, with an additional serine residue between the SUMO tag and RIG-1 to improve tag cleavage [135]. MBP, maltose-binding protein from E. coli K12, enables the fusion protein purification by one-step amylose affinity chromatography and elution with 10 mM maltose [29]. Table 3 lists other tags like thioredoxin, poly(NANP), poly-Arg, Strep-tag, S-tag, calmodulin-binding peptide, the PurF fragment, ketosteroid isomerase, PaP3.30, and TAF12 histone fold domain. Some tags require stringent elution conditions and they are not conducive to the preparation of active proteins. For example, proteins with S-tag are purified through beads conjugated with anti-S-tag antibody and eluted with 0.2 M citrate buffer at pH 2. PA tag (based on monoclonal antibody NZ-1 and its cognate dodecapeptide epitope from human podoplanin) [127], Inntags [146], and BC2-tag [112] are thought to convey certain advantages over existing, more commonly used tags. Inntags, unlike all other tags, are within the tagged proteins (instead of at the N- or C-terminus). BC2-tag and its cognate nanobody bivBC2-Nb enable dSTORM imaging of both fixed and live cells [112]. Ty1 tag, for instance, 3x Ty1 tag, is preferable for crosslinking ChIP experiments, since FLAG and Myc tags contain lysine, the major target of formaldehyde crosslinking. T7 tag, an 11 amino acid peptide encoded in T7 bacteriophage major capsid protein, exploits the efficient T7 RNA polymerase expression system [93].
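The lysine argument for crosslinking ChIP can be checked directly against the tag sequences quoted in this article. The snippet below is a small sketch that counts lysine residues per tag; the function name is hypothetical, and the Ty1 sequence is not included because it is not given in the text.

```python
# Tag sequences as quoted elsewhere in this article.
TAG_SEQUENCES = {
    "FLAG": "DYKDDDDK",
    "3xFLAG": "DYKDHDGDYKDHDIDYKDDDDK",
    "Myc": "EQKLISEEDL",
    "HA": "YPYDVPDYA",
    "V5": "GKPIPNPLLGLDST",
    "poly-His": "HHHHHH",
}


def lysine_count(sequence):
    """Number of lysine (K) residues, the main formaldehyde-reactive side chain."""
    return sequence.upper().count("K")


for name, seq in TAG_SEQUENCES.items():
    n = lysine_count(seq)
    note = "contains lysine" if n else "lysine-free"
    print(f"{name:8s} K = {n}  ({note})")
```

Lysine content is only one factor; whether crosslinking actually masks an epitope also depends on the antibody used.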
Aggregating / oligomerization tags were developed in the last few years, such as the pH-responsive CspB50 tag [147], and trimerization tag foldon (T4 fibritin trimerization motif) [132, 148]. By using such a tag, proteins can be purified with a column-free procedure, which is cost effective and efficient. Purification of proteins fused to these tags is based on their separation from soluble fractions and release by cleaving at specific cleavage sites. In vivo aggregation can lead to either functional or inactive protein aggregates that can be reactivated following purification. Some tags are able to maintain the solubility of the expressed fusion protein in vivo, but induce their aggregation in vitro. A detailed discussion of the advantages, disadvantages and further possible improvements of these tags, as well as required steps for optimizing the purification of aggregating-tags fusion proteins, was published recently [149].
The key selection criterion is that the protease must be very specific to a tag or a linker. In addition, the target protein must not contain any recognition site for the protease. The coagulation cascade includes some proteases with stringent sequence motif requirements, and they cleave with high precision. Both thrombin and Factor Xa are from the coagulation cascade.
Thrombin is one of the most commonly used tag cleavage proteases. Its consensus cleavage site, Leu-Val-Pro-Arg-Gly-Ser, is often included in the linker region, and upon cleavage, there are residual amino acids in the fusion partner protein. It is sensitive to reducing agents. Thermo Fisher Scientific thrombin was used in protein purification to investigate the next-generation drugs against Plasmodium [150]. Thrombin from MilliporeSigma was used in the purification and characterization of yeast NCR1 and 2 proteins [151], and a recombinant murine Wnt3a [152].
Another coagulation cascade protein, Factor Xa, is also commonly used to remove tags from fusion proteins. It cleaves on the C-terminal side of the consensus site, I-E/D-G-R, and thus can completely remove a tag from the N-terminus of a target protein. The cleavage temperature ranges from 4°C to 25°C. It is sensitive to reducing agents. It also binds calcium ions and should not be used in the presence of chelating agents such as EDTA or EGTA.
Tobacco etch virus is a positive-sense, single-stranded RNA virus. Its protease has a specific recognition site and cleaves at high precision. It cleaves between Gln and (Gly/Ser) in the consensus site Glu-Asn-Leu-Tyr-Phe-Gln-(Gly/Ser) (ENLYFQ(G/S)). Its activity is not inhibited by a low concentration of urea, which can prevent protein aggregation, and increase protein solubility. Sigma TEV protease (T4455) is cited among the surveyed literature [14]. Life Technologies / Invitrogen TEV protease was used to remove a GST-tag to perform structural studies of the dengue virus protein 4A [153]. TEV protease removed His-tags from some proteins, but not others such as Streptococcus pyogenes quinolinate phosphoribosyltransferase and NH3-dependent NAD+ synthetase [154].
Enterokinase recognizes the motif D-D-D-D-K and cleaves at the carboxyl side of the lysine. The FLAG tag, DYKDDDDK, contains such a motif. The expression and purification of a functional glucagon-like peptide-1 (KGLP-1), after removal of its GST-tag, is one example of the use of enterokinase to cleave a fusion junction during protein purification [155]. Enterokinase is calcium-dependent, so CaCl2 is essential for its activity. Maun HR et al, for example, added an enterokinase cleavage site to the N terminus of alpha- or beta-tryptase and a C-terminal Flag-tag to express and purify tryptase with enterokinase cleavage [156].
PreScission Protease recognizes the sequence Leu-Phe-Gln-Gly-Pro and cleaves between Gln and Gly. It has a GST tag attached to it for easier removal. PreScission Protease (27-084301) from GE Healthcare is a common choice [7, 157]. 3C protease cleavage site was used in a vector in a study of NLRP1B inflammasome [124] and GST-tagged 3C PreScission Protease from GE Healthcare was used in the study of the functions of secreted amyloid-beta precursor protein [158].
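The requirement that the target protein itself carries no recognition site can be screened in silico before a protease is chosen. The following is a minimal sketch using the consensus motifs given above, written as regular expressions; the function name and the toy sequences are hypothetical, the thrombin cut position (between Arg and Gly) is an added assumption not stated in the text, and real proteases also tolerate non-consensus sites that a simple pattern match will miss.

```python
import re

# motif (regular expression), number of motif residues N-terminal to the cut
PROTEASE_SITES = {
    "thrombin": (r"LVPRGS", 4),      # assumed cut between Arg and Gly
    "factor Xa": (r"I[ED]GR", 4),    # cuts after the motif
    "TEV": (r"ENLYFQ[GS]", 6),       # cuts between Gln and Gly/Ser
    "enterokinase": (r"DDDDK", 5),   # cuts after Lys
    "PreScission": (r"LFQGP", 3),    # cuts between Gln and Gly
}


def find_sites(protein):
    """Return (protease, cut_position) pairs sorted by position.

    cut_position is the number of residues that precede the scissile bond,
    so protein[:cut_position] is the N-terminal cleavage product.
    """
    hits = []
    for protease, (motif, offset) in PROTEASE_SITES.items():
        for match in re.finditer(motif, protein.upper()):
            hits.append((protease, match.start() + offset))
    return sorted(hits, key=lambda hit: hit[1])


# Toy construct: Met + 6 x His, a TEV site, then a hypothetical target sequence.
construct = "MHHHHHH" + "ENLYFQG" + "MKTAYIAKQR"
for protease, cut in find_sites(construct):
    print(protease, "->", construct[:cut], "|", construct[cut:])
```

Running `find_sites` on the target sequence alone, before any tag or linker is added, flags proteases that would digest the protein of interest.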
Tag removal by proteases is a simple and straightforward method, but it has several drawbacks: expensive highly specific proteases are needed and need to be purified out in subsequent steps, digestion of the target protein has to be avoided, and proteases do not generate a native terminus upon cleaving. For these reasons, self-cleaving purification tags are desirable.
Inteins form an important class of auto-cleaving proteases. They contain a splicing domain able to excise itself from mature proteins. Two conserved residues, a Cys at its N-terminus and an Asn at its C-terminus, are required for self-cleavage. Mutations at either of these two positions allow for selecting the end to be cleaved, while the identity of the replacing residue dictates the modality to induce the intein cleavage reaction. Currently, inteins of two types (thiol-induced and pH-induced) are used in multiple applications [159]. The use of the IMPACT™-System (intein-mediated purification with an affinity chitin-binding tag) from New England Biolabs is exemplified here [160]. For example, Kaya-Okur HS et al purified the pA-Tn5 protein used in CUT&TAG experiments with chitin slurry resin from NEB (S6651S) and eluted with a buffer containing 100 mM DTT [161].
Other examples of auto-cleaving tags are the 2A peptides where the peptide bond between the proline and glycine in C-terminal of 2A peptide is broken [162, 163], for example, P2A ribosomal skip element [164] ; the FrpC protein, which undergoes cleaving upon the addition of calcium ions [165] ; the sortase A transpeptidase (SrtAc) for which the fusion of the catalytic core with its amino acid recognition sequence generates a self-cleaving section activated by the addition of calcium ion and triglycine [166] ; the human rhinovirus 3C (HRV3C) protease whose temperature-regulated expression permits the intracellular removal of the fusion tag [31, 132, 167] ; the protein VIC_001052 of the coral pathogen Vibrio coralliilyticus ATCC BAA-450 that contains a metal ion-inducible autocatalytic cleavage (MIIA) domain and activates upon addition of calcium or manganese(II) ions [168] ; or the Vibrio cholerae MARTX toxin cysteine protease domain (CPD), an autoprocessing enzyme activated by inositol hexaphosphate (InsP(6)) [169, 170].
More elaborate fusion systems have been devised. For example, the Small Molecule-Assisted Shutoff (SMASh) system contains a drug-inhibitable internal protease linked with a degron to control the degradation of an expressed protein [171, 172]; the auxin-inducible degron (AID) tag is another example [173].
The addition of protein and epitope tags comes with the risk that activity or function may be affected and/or lost [174, 175]. For example, both N-terminal and C-terminal tags affect PIK3CA function: N-terminal tags increase kinase activity; C-terminal tags interfere with its membrane binding [176]. G Magupalli et al reported that the GFP family of tags led to "nonphysiological localization" of ASC [177]. The location of the tag is a very important consideration. Although the standard design adds tags to the N- or C-terminus of the protein of interest, it is recommended to compare both alternatives for possible interference with function. Also, determining the presence of domains at the site where the tag is to be added is crucial to minimize potentially perturbing interactions. The addition of spacer sequences between the protein and the tag is an excellent alternative to avoid functional interference. However, these spacers should be long enough to ensure that the tag is flexible and able to reach out into solution away from the protein surface while maximizing the interaction between the tag and the matrix during purification.
An enzyme cleavage site can also be included between the tag and the protein allowing the removal of the tag after protein purification. Frequently used proteases for cleaving tags are discussed above.
Since neither the addition nor the removal of a particular tag can assure that the function of a protein is not disrupted, determining the effect of a tag on protein stability and functionality is highly recommended. Several methods can be used to analyze protein stability including differential scanning fluorimetry (DSF) [154, 178], nanoDSF [179], dynamic light scattering (DLS) [180], differential scanning calorimetry (DSC) [181] and circular dichroism [182]. Booth WT et al. evaluated the effect of an N-terminal polyhistidine tag on the thermal stability of ten proteins through DSF and found the presence of His-tag mostly decreased protein thermal stability [154]. Functional assays, on the other hand, depend on the particular biological activity of the protein under study and include enzymatic assays, chemotaxis or cell proliferation assays and functional ELISA among many others. MS Reid et al, for example, compared the activity of N- and C-terminally GFP-tagged mouse KCC4 and found no significant difference between them, contradicting an earlier report for KCC2 [183].
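A DSF comparison of tagged and untagged protein usually comes down to the shift in melting temperature (Tm). The sketch below estimates Tm as the temperature at which the first derivative of the fluorescence curve is largest and reports the difference between two curves. The data are simulated with a simple sigmoid purely for illustration, the function names are hypothetical, and this is not the analysis used in the cited studies; real melt curves are noisier and are often fitted with a Boltzmann model instead.

```python
import math


def melting_temperature(temps, fluorescence):
    """Estimate Tm as the temperature where dF/dT (central difference) is largest."""
    best_temp, best_slope = None, float("-inf")
    for i in range(1, len(temps) - 1):
        slope = (fluorescence[i + 1] - fluorescence[i - 1]) / (temps[i + 1] - temps[i - 1])
        if slope > best_slope:
            best_temp, best_slope = temps[i], slope
    return best_temp


def simulated_curve(temps, tm, width=2.0):
    """Idealized two-state unfolding curve (illustrative, not measured data)."""
    return [1.0 / (1.0 + math.exp((tm - t) / width)) for t in temps]


temps = list(range(25, 96))                 # 25 to 95 degrees C in 1 degree steps
untagged = simulated_curve(temps, tm=58)    # hypothetical untagged protein
tagged = simulated_curve(temps, tm=55)      # hypothetical His-tagged protein

delta_tm = melting_temperature(temps, tagged) - melting_temperature(temps, untagged)
print(f"delta Tm (tagged - untagged): {delta_tm} degrees C")
```

A negative shift in this convention means the tagged protein unfolds at a lower temperature than the untagged one.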
Dr. Macarena Fritz contributed to the section on The Interference of Tags on Protein Function in Oct 2018.
- Egorov M, Tigerström A, Pestov N, Korneenko T, Kostina M, Shakhparonov M, et al. Purification of a recombinant membrane protein tagged with a calmodulin-binding domain: properties of chimeras of the Escherichia coli nicotinamide nucleotide transhydrogenase and the C-terminus of human plasma membrane Ca2+ -ATPase. Protein Expr Purif. 2004;36:31-9 pubmed
- Franke S, Grass G, Rensing C, Nies D. Molecular analysis of the copper-transporting efflux system CusCFBA of Escherichia coli. J Bacteriol. 2003;185:3804-12 pubmed
- Lavallie E, DiBlasio E, Kovacic S, Grant K, Schendel P, McCoy J. A thioredoxin gene fusion expression system that circumvents inclusion body formation in the E. coli cytoplasm. Biotechnology (N Y). 1993;11:187-93 pubmed
- Levin G, Mendive F, Targovnik H, Cascone O, Miranda M. Genetically engineered horseradish peroxidase for facilitated purification from baculovirus cultures by cation-exchange chromatography. J Biotechnol. 2005;118:363-9 pubmed
- Hackbarth J, Lee S, Meng X, Vroman B, Kaufmann S, Karnitz L. S-peptide epitope tagging for protein purification, expression monitoring, and localization in mammalian cells. Biotechniques. 2004;37:835-9 pubmed
- Meier S, Güthe S, Kiefhaber T, Grzesiek S. Foldon, the natural trimerization domain of T4 fibritin, dissociates into a monomeric A-state form containing a stable beta-hairpin: atomic details of trimer dissociation and local beta-hairpin stability from residual dipolar couplings. J Mol Biol. 2004;344:1051-69 pubmed
- Mao H. A self-cleavable sortase fusion for one-step purification of free recombinant proteins. Protein Expr Purif. 2004;37:253-63 pubmed
- Panek A, Pietrow O, Filipkowski P, Synowiecki J. Effects of the polyhistidine tag on kinetics and other properties of trehalose synthase from Deinococcus geothermalis. Acta Biochim Pol. 2013;60:163-6 pubmed
- Niesen F, Berglund H, Vedadi M. The use of differential scanning fluorimetry to detect ligand interactions that promote protein stability. Nat Protoc. 2007;2:2212-21 pubmed
- Mehmood S, Corradi V, Choudhury H, Hussain R, Becker P, Axford D, et al. Structural and Functional Basis for Lipid Synergy on the Activity of the Antibacterial Peptide ABC Transporter McjD. J Biol Chem. 2016;291:21656-21668 pubmed
- Shiba K, Niidome T, Katoh E, Xiang H, Han L, Mori T, et al. Polydispersity as a parameter for indicating the thermal stability of proteins by dynamic light scattering. Anal Sci. 2010;26:659-63 pubmed
- Greenfield N. Using circular dichroism spectra to estimate protein secondary structure. Nat Protoc. 2006;1:2876-90 pubmed
- Materials and Methods [ISSN : 2329-5139] is a unique online journal with regularly updated review articles on laboratory materials and methods. If you are interested in contributing a manuscript or suggesting a topic, please leave us feedback.