Inteins (INTervening protEINS) are in-frame intervening polypeptides that excise themselves post-translationally from a precursor protein via a protein-splicing mechanism analogous to mRNA splicing [1-3]. The flanking protein fragments (exteins) are ligated back together into a functional protein [1]. The link is a peptide bond formed without exogenous cofactors or additional energy from high-energy molecules such as ATP or GTP [4]. Thus, two or more stable proteins, the intein(s) and the spliced extein product, are produced from one precursor gene.
The first intein was discovered within the VMA1 gene of Saccharomyces cerevisiae, which encodes an ATPase [1]. Inteins are now known to be widely dispersed in nature: they are encoded in the genomes of microorganisms from all three domains of life (bacteria, archaea and eukaryotes) and of viruses [5] (see also Table 1). A search for “intein” in the UniProtKB database retrieved almost 35,000 computationally analyzed entries as of December 2019. A much smaller number of intein entries, 188, were registered in the same database as manually curated at the time of the search (Table 1). Inteins occur in proteins with various functions (Table 1), but proteins involved in DNA metabolism, such as polymerases, helicases, recombinases, topoisomerases and ribonucleotide reductases, appear to be the most common hosts [6]. The size of the excised polypeptides also varies: they can be as short as tens of amino acids, like the Arbitrium peptide (~41 amino acids) cleaved from AimP (YopL) in Bacillus phage SPbeta, or as long as 2300 amino acids, like the Hwa polC 1 intein cleaved from the DNA polymerase II large subunit of the archaeon Haloquadratum walsbyi.
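A database search of this kind can be sketched as a query URL. The snippet below is a minimal illustration only: it assumes the current UniProt REST endpoint (`rest.uniprot.org/uniprotkb/search`), the `reviewed:true`/`reviewed:false` query filter, and the return-field names shown. None of these come from the text, and the helper name `build_intein_query` is hypothetical. The original search settings are not stated, so a query run today should not be expected to reproduce the December 2019 counts.

```python
from urllib.parse import urlencode

# Assumed base URL of the UniProtKB search endpoint (not taken from the text).
UNIPROT_SEARCH = "https://rest.uniprot.org/uniprotkb/search"


def build_intein_query(reviewed=None, size=25):
    """Build a search URL for the keyword 'intein'.

    reviewed=True  restricts to manually curated entries,
    reviewed=False restricts to computationally analyzed entries,
    reviewed=None  applies no restriction.
    """
    query = "intein"
    if reviewed is not None:
        query += f" AND reviewed:{'true' if reviewed else 'false'}"
    params = {
        "query": query,
        "format": "tsv",
        # Assumed field names mirroring the columns of Table 1.
        "fields": "accession,protein_name,gene_names,organism_name",
        "size": size,
    }
    return f"{UNIPROT_SEARCH}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_intein_query(reviewed=True))
```

The function only constructs the URL; fetching it (for example with `urllib.request.urlopen`) and paging through the results are left out to keep the sketch independent of network access.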
Entry | Protein names | Gene names | Organism | Refs. | |
---|---|---|---|---|---|
O59245 | tRNA-splicing ligase RtcB (EC 6.5.1.-) [Cleaved into: Pho hyp2 intein (EC 3.1.-.-)] | rtcB PH1602 | Pyrococcus horikoshii (strain ATCC 700860 / DSM 12428 / JCM 9974 / NBRC 100139 / OT-3) | [7, 8] | |
P9WHJ3 | Protein RecA (Recombinase A) [Cleaved into: Endonuclease PI-MtuI (EC 3.1.-.-) (Mtu RecA intein)] | recA Rv2737c MTV002.02c | Mycobacterium tuberculosis (strain ATCC 25618 / H37Rv) | [9, 10] | |
P17255 | V-type proton ATPase catalytic subunit A (V-ATPase subunit A) (EC 7.1.2.2) (Vacuolar proton pump subunit A) [Cleaved into: Endonuclease PI-SceI (EC 3.1.-.-) (Sce VMA intein) (VMA1-derived endonuclease) (VDE)] | VMA1 CLS8 TFP1 YDL185W D1286 | Saccharomyces cerevisiae (strain ATCC 204508 / S288c) (Baker's yeast) | [11] | |
Q58907 | Reverse gyrase [Cleaved into: Mja r-Gyr intein] [Includes: Helicase (EC 3.6.4.12); Topoisomerase (EC 5.6.2.2)] | rgy MJ1512 | Methanocaldococcus jannaschii (strain ATCC 43067 / DSM 2661 / JAL-1 / JCM 10045 / NBRC 100440) (Methanococcus jannaschii) | ||
Q6F598 | Reverse gyrase [Cleaved into: Pko r-Gyr intein] [Includes: Helicase (EC 3.6.4.12); Topoisomerase (EC 5.6.2.2)] | rgy TK0470 | Thermococcus kodakarensis (strain ATCC BAA-918 / JCM 12380 / KOD1) (Pyrococcus kodakaraensis (strain KOD1)) | [12, 13] | |
O58530 | Reverse gyrase [Cleaved into: Pho r-Gyr intein] [Includes: Helicase (EC 3.6.4.12); Topoisomerase (EC 5.6.2.2)] | rgy PH0800 | Pyrococcus horikoshii (strain ATCC 700860 / DSM 12428 / JCM 9974 / NBRC 100139 / OT-3) | ||
E7FHX6 | Vitamin B12-dependent ribonucleoside-diphosphate reductase (B12-dependent RNR) (EC 1.17.4.1) (Ribonucleotide reductase) [Cleaved into: Endonuclease PI-PfuI (EC 3.1.-.-) (Pfu rnr-1 intein); Pfu rnr-2 intein (EC 3.1.-.-)] | rnr PF0440 | Pyrococcus furiosus (strain ATCC 43587 / DSM 3638 / JCM 8422 / Vc1) | [14-16] | |
Q57532 | DNA gyrase subunit A (EC 5.6.2.2) [Cleaved into: Mle GyrA intein] | gyrA ML0006 | Mycobacterium leprae (strain TN) | [17-20] | |
P77933 | DNA polymerase (EC 2.7.7.7) [Cleaved into: Endonuclease PI-PkoI (EC 3.1.-.-) (IVS-A) (Pko pol-1 intein); Endonuclease PI-PkoII (EC 3.1.-.-) (IVS-B) (Pko pol-2 intein)] | pol TK0001 | Thermococcus kodakarensis (strain ATCC BAA-918 / JCM 12380 / KOD1) (Pyrococcus kodakaraensis (strain KOD1)) | [13, 21] | |
P30317 | DNA polymerase (EC 2.7.7.7) (Vent DNA polymerase) [Cleaved into: Endonuclease PI-TliII (EC 3.1.-.-) (IVPS2) (Tli pol-1 intein); Endonuclease PI-TliI (EC 3.1.-.-) (IVPS1) (Tli pol-2 intein)] | pol | Thermococcus litoralis | [22, 23] | |
P38078 | V-type proton ATPase catalytic subunit A (V-ATPase subunit A) (EC 7.1.2.2) (Vacuolar proton pump subunit A) [Cleaved into: Endonuclease PI-CtrI (EC 3.1.-.-) (Ctr VMA intein) (VMA1-derived endonuclease) (VDE)] | VMA1 | Candida tropicalis (Yeast) | ||
O73954 | DNA topoisomerase 1 (EC 5.6.2.1) (DNA topoisomerase I) (Omega-protein) (Relaxing enzyme) (Swivelase) (Untwisting enzyme) [Cleaved into: Endonuclease PI-PfuI (EC 3.1.-.-) (Pfu topA intein)] | topA PF0494 | Pyrococcus furiosus (strain ATCC 43587 / DSM 3638 / JCM 8422 / Vc1) | [15, 24] | |
P74918 | DNA polymerase (EC 2.7.7.7) (Pol Tfu) [Cleaved into: Endonuclease PI-TfuI (EC 3.1.-.-) (Tfu pol-1 intein); Endonuclease PI-TfuII (EC 3.1.-.-) (Tfu pol-2 intein)] | pol | Thermococcus fumicolans | ||
Q8U4J3 | Replication factor C small subunit (RFC small subunit) (Clamp loader small subunit) (PfuRFC small subunit) [Cleaved into: Pfu RFC intein] | rfcS PF0093 | Pyrococcus furiosus (strain ATCC 43587 / DSM 3638 / JCM 8422 / Vc1) | [15, 25-29] | |
P9WMR3 | Replicative DNA helicase (EC 3.6.4.12) [Cleaved into: Endonuclease PI-MtuHIP (EC 3.1.-.-) (Mtu dnaB intein)] | dnaB Rv0058 MTCY21D4.21 | Mycobacterium tuberculosis (strain ATCC 25618 / H37Rv) | [30, 31] | |
Q18ER3 | DNA polymerase II large subunit (Pol II) (EC 2.7.7.7) (Exodeoxyribonuclease large subunit) (EC 3.1.11.1) [Cleaved into: Hwa polC 1 intein (Hwa pol II 1 intein); Hwa polC 2 intein (Hwa pol II 2 intein)] | polC polA2 HQ_3461A | Haloquadratum walsbyi (strain DSM 16790 / HBSQ001) | ||
O31875 | Ribonucleoside-diphosphate reductase NrdEB subunit alpha (EC 1.17.4.1) (Ribonucleotide reductase large subunit) [Cleaved into: Bsu nrdEB intein] | nrdEB yojP yosN BSU20060 | Bacillus subtilis (strain 168) | [32, 33] | |
Q58815 | Glutamine--fructose-6-phosphate aminotransferase [34] (EC 2.6.1.16) (D-fructose-6-phosphate amidotransferase) (GFAT) (Glucosamine-6-phosphate synthase) (Hexosephosphate aminotransferase) (L-glutamine--D-fructose-6-phosphate amidotransferase) [Cleaved into: Mja gf6p intein] | glmS MJ1420 | Methanocaldococcus jannaschii (strain ATCC 43067 / DSM 2661 / JAL-1 / JCM 10045 / NBRC 100440) (Methanococcus jannaschii) | ||
Q51334 | DNA polymerase (EC 2.7.7.7) (Deep vent DNA polymerase) [Cleaved into: Endonuclease PI-PspI (EC 3.1.-.-) (Psp-GDB pol intein)] | pol | Pyrococcus sp. (strain GB-D) | ||
O57861 | DNA polymerase II large subunit (Pol II) (EC 2.7.7.7) (Exodeoxyribonuclease large subunit) (EC 3.1.11.1) [Cleaved into: Pho polC intein (Pho pol II intein)] | polC PH0121 | Pyrococcus horikoshii (strain ATCC 700860 / DSM 12428 / JCM 9974 / NBRC 100139 / OT-3) | ||
Q8TUS2 | tRNA-splicing ligase RtcB (EC 6.5.1.-) [Cleaved into: Mka hyp2 intein (EC 3.1.-.-)] | rtcB MK1682 | Methanopyrus kandleri (strain AV19 / DSM 6324 / JCM 9639 / NBRC 100938) | [35, 36] | |
Q5JGV6 | ATP-dependent DNA helicase Hel308 (EC 3.6.4.12) (ATP-dependent Holliday junction unwindase Hjm) [Cleaved into: Endonuclease PI-PkoHel (EC 3.1.-.-) (Pko Hel intein)] | hel308 TK1332 | Thermococcus kodakarensis (strain ATCC BAA-918 / JCM 12380 / KOD1) (Pyrococcus kodakaraensis (strain KOD1)) | [13, 37] | |
Q58524 | ATP-dependent DNA helicase Hel308 (EC 3.6.4.12) [Cleaved into: Endonuclease PI-MjaHel (EC 3.1.-.-) (Mja Hel intein) (Mja Pep3 intein)] | hel308 MJ1124 | Methanocaldococcus jannaschii (strain ATCC 43067 / DSM 2661 / JAL-1 / JCM 10045 / NBRC 100440) (Methanococcus jannaschii) | ||
Q54IZ9 | DNA-directed RNA polymerase III subunit rpc2 (RNA polymerase III subunit C2) (EC 2.7.7.6) (DNA-directed RNA polymerase III subunit B) [Cleaved into: Ddi rpc2 intein] | polr3b rpc2 DDB_G0288449 | Dictyostelium discoideum (Slime mold) | [38, 39] | |
Q58192 | Transcription initiation factor IIB (TFIIB) [Cleaved into: Endonuclease Mja Tfb (EC 3.1.-.-) (Mja TFIIB intein) (Mja Tfb intein)] | tfb MJ0782 | Methanocaldococcus jannaschii (strain ATCC 43067 / DSM 2661 / JAL-1 / JCM 10045 / NBRC 100440) (Methanococcus jannaschii) | ||
O67475 | Ribonucleoside-diphosphate reductase subunit beta (EC 1.17.4.1) (Ribonucleotide reductase small subunit) [Cleaved into: Aae NrdB intein (Aae RIR2 intein)] | nrdB aq_1505 | Aquifex aeolicus (strain VF5) | ||
O33845 | DNA polymerase (EC 2.7.7.7) (Pol Tfu) [Cleaved into: Tag pol-1 intein (Intein I) (Tsp-TY pol-1); Tag pol-2 intein (Intein II) (Tsp-TY pol-2); Tag pol-3 intein (Intein III) (Tsp-TY pol-3)] | pol | Thermococcus aggregans | ||
Q55418 | Replicative DNA helicase (EC 3.6.4.12) [Cleaved into: Ssp dnaB intein] | dnaB slr0833 | Synechocystis sp. (strain PCC 6803 / Kazusa) | [40-43] | |
B3LV44 | Protein hedgehog [Cleaved into: Protein hedgehog N-product; Protein hedgehog C-product] | hh GF18029 | Drosophila ananassae (Fruit fly) | ||
B4R1D8 | Protein hedgehog [Cleaved into: Protein hedgehog N-product; Protein hedgehog C-product] | hh GD18403 | Drosophila simulans (Fruit fly) | ||
B4NJP3 | Protein hedgehog [Cleaved into: Protein hedgehog N-product; Protein hedgehog C-product] | hh GK12833 | Drosophila willistoni (Fruit fly) | ||
Q9UYC6 | Archaeal Lon protease (EC 3.4.21.-) (ATP-dependent protease La homolog) [Cleaved into: Pab lon intein] | lon PYRAB15820 PAB1313 | Pyrococcus abyssi (strain GE5 / Orsay) | [44, 45] | |
Q1XDF3 | Probable replicative DNA helicase (EC 3.6.4.12) [Cleaved into: Pye dnaB intein] | dnaB | Pyropia yezoensis (Susabi-nori) (Porphyra yezoensis) | ||
P42379 | ATP-dependent Clp protease proteolytic subunit (EC 3.4.21.92) (Endopeptidase Clp) [Cleaved into: Ceu clpP intein (Insertion IS2)] | clpP | Chlamydomonas moewusii (Chlamydomonas eugametos) | [46, 47] | |
Q2FSF9 | DNA polymerase II large subunit (Pol II) (EC 2.7.7.7) (Exodeoxyribonuclease large subunit) (EC 3.1.11.1) [Cleaved into: Mhu polC intein (Mhu pol II intein)] | polC Mhun_2435 | Methanospirillum hungatei JF-1 (strain ATCC 27890 / DSM 864 / NBRC 100397 / JF-1) | ||
O64146 | DNA-directed DNA polymerase (EC 2.7.7.7) | | Bacillus phage SPbeta (Bacillus phage SPBc2) (Bacteriophage SP-beta) | ||
O64095 | Protein AimP (YopL protein) [Cleaved into: Arbitrium peptide] | aimP yopL | Bacillus phage SPbeta (Bacillus phage SPBc2) (Bacteriophage SP-beta) | ||
O58822 | Probable translation initiation factor IF-2 [Cleaved into: Pho infB intein (Pho IF2 intein)] | infB PH1095 | Pyrococcus horikoshii (strain ATCC 700860 / DSM 12428 / JCM 9974 / NBRC 100139 / OT-3) | ||
O33149 | DNA gyrase subunit A (EC 5.6.2.2) [Cleaved into: Mma GyrA intein] (Fragment) | gyrA | Mycobacterium malmoense | ||
P97812 | Indian hedgehog protein (IHH) (HHG-2) [Cleaved into: Indian hedgehog protein N-product; Indian hedgehog protein C-product] | Ihh | Mus musculus (Mouse) | [48] | |
O58221 | Archaeal Lon protease (EC 3.4.21.-) (ATP-dependent protease La homolog) [Cleaved into: Pho lon intein] | PH0452 | Pyrococcus horikoshii (strain ATCC 700860 / DSM 12428 / JCM 9974 / NBRC 100139 / OT-3) | ||
P09932 | Homothallic switching endonuclease (Ho endonuclease) | HO YDL227C | Saccharomyces cerevisiae (strain ATCC 204508 / S288c) (Baker's yeast) | [11] | |
O58001 | DNA repair and recombination protein RadA [Cleaved into: Pho RadA intein] | radA PH0263 | Pyrococcus horikoshii (strain ATCC 700860 / DSM 12428 / JCM 9974 / NBRC 100139 / OT-3) | ||
Q9F417 | Protein RecA (Recombinase A) [Cleaved into: Mch RecA intein] (Fragment) | recA | Mycolicibacterium chitae (Mycobacterium chitae) | ||
Q57962 | Probable phosphoenolpyruvate synthase (PEP synthase) (EC 2.7.9.2) (Pyruvate, water dikinase) [Cleaved into: Mja pep intein (Mja pepA intein)] | ppsA MJ0542 | Methanocaldococcus jannaschii (strain ATCC 43067 / DSM 2661 / JAL-1 / JCM 10045 / NBRC 100440) (Methanococcus jannaschii) | ||
B4PN49 | Protein hedgehog [Cleaved into: Protein hedgehog N-product; Protein hedgehog C-product] | hh GE23980 | Drosophila yakuba (Fruit fly) | ||
O64173 | Ribonucleoside-diphosphate reductase nrdEB subunit alpha (EC 1.17.4.1) (Ribonucleotide reductase large subunit) [Cleaved into: SPBc2 bnrdE intein] | bnrdE | Bacillus phage SPbeta (Bacillus phage SPBc2) (Bacteriophage SP-beta) | ||
Q9V2F4 | DNA polymerase II large subunit (Pol II) (EC 2.7.7.7) (Exodeoxyribonuclease large subunit) (EC 3.1.11.1) [Cleaved into: Pab polC intein (Pab pol II intein)] | polC PYRAB01200 PAB2404 | Pyrococcus abyssi (strain GE5 / Orsay) | [44, 45] | |
Q9UZK7 | Probable translation initiation factor IF-2 [Cleaved into: Pab infB intein (Pab IF2 intein)] | infB PYRAB11390 PAB0755 | Pyrococcus abyssi (strain GE5 / Orsay) | [45] | |
Q8U1R8 | Probable translation initiation factor IF-2 [Cleaved into: Pfu infB intein (Pfu IF2 intein)] | infB PF1137 | Pyrococcus furiosus (strain ATCC 43587 / DSM 3638 / JCM 8422 / Vc1) | ||
P72065 | DNA gyrase subunit A (EC 5.6.2.2) [Cleaved into: Mxe GyrA intein] (Fragment) | gyrA | Mycobacterium xenopi | [49, 50] | |
Q5JGR9 | Probable translation initiation factor IF-2 [Cleaved into: Pko infB intein (Pko IF2 intein)] | infB TK1305 | Thermococcus kodakarensis (strain ATCC BAA-918 / JCM 12380 / KOD1) (Pyrococcus kodakaraensis (strain KOD1)) | ||
Q49608 | DNA gyrase subunit A (EC 5.6.2.2) [Cleaved into: Mka GyrA intein] (Fragment) | gyrA | Mycobacterium kansasii | ||
Q49166 | DNA gyrase subunit A (EC 5.6.2.2) [Cleaved into: Mfl GyrA intein] (Fragment) | gyrA | Mycolicibacterium flavescens (Mycobacterium flavescens) | ||
A3CXE7 | DNA polymerase II large subunit (Pol II) (EC 2.7.7.7) (Exodeoxyribonuclease large subunit) (EC 3.1.11.1) [Cleaved into: Memar polC intein (Memar pol II intein)] | polC Memar_2124 | Methanoculleus marisnigri (strain ATCC 35101 / DSM 1498 / JR1) | ||
O59610 | DNA polymerase (EC 2.7.7.7) [Cleaved into: Pho pol intein (Pho Pol I intein)] | pol PH1947 PHBT047 | Pyrococcus horikoshii (strain ATCC 700860 / DSM 12428 / JCM 9974 / NBRC 100139 / OT-3) | ||
Q9HMX8 | DNA polymerase II large subunit (Pol II) (EC 2.7.7.7) (Exodeoxyribonuclease large subunit) (EC 3.1.11.1) [Cleaved into: Hsp-NRC1 polC intein (Hsp-NRC1 pol2 intein)] | polC polA2 VNG_2338G | Halobacterium salinarum (strain ATCC 700922 / JCM 11081 / NRC-1) (Halobacterium halobium) | ||
O64094 | AimR transcriptional regulator (Arbitrium communication peptide receptor) (YopK protein) | aimR yopK | Bacillus phage SPbeta (Bacillus phage SPBc2) (Bacteriophage SP-beta) | ||
Q49467 | DNA gyrase subunit A (EC 5.6.2.2) [Cleaved into: Mgo GyrA intein] (Fragment) | gyrA | Mycobacterium gordonae | ||
A7U6F1 | DNA polymerase (EC 2.7.7.7) [Cleaved into: CeV01 dpo intein] | dpo | Chrysochromulina ericina virus (CeV01) | ||
Q9HH84 | DNA polymerase (EC 2.7.7.7) [Cleaved into: Endonuclease PI-TspGE8I (EC 3.1.-.-) (Tsp-GE8 pol-1 intein); Endonuclease PI-TspGE8II (EC 3.1.-.-) (Tsp-GE8 pol-2 intein)] | pol pol-1 | Thermococcus sp. (strain GE8) | ||
P61969 | LIM domain transcription factor LMO4 (Breast tumor autoantigen) (LIM domain only protein 4) (LMO-4) | Lmo4 | Mus musculus (Mouse) | [51-59] | |
P51333 | Probable replicative DNA helicase (EC 3.6.4.12) [Cleaved into: Ppu dnaB intein] | dnaB | Porphyra purpurea (Red seaweed) (Ulva purpurea) | ||
Q5JET0 | DNA polymerase II large subunit (Pol II) (EC 2.7.7.7) (Exodeoxyribonuclease large subunit) (EC 3.1.11.1) [Cleaved into: Pko polC intein (Pko pol II intein)] | polC TK1903 | Thermococcus kodakarensis (strain ATCC BAA-918 / JCM 12380 / KOD1) (Pyrococcus kodakaraensis (strain KOD1)) | ||
B3P7F8 | Protein hedgehog [Cleaved into: Protein hedgehog N-product; Protein hedgehog C-product] | hh GG12458 | Drosophila erecta (Fruit fly) | ||
B4K4M0 | Protein hedgehog [Cleaved into: Protein hedgehog N-product; Protein hedgehog C-product] | hh GI24573 | Drosophila mojavensis (Fruit fly) | ||
B4HFB7 | Protein hedgehog [Cleaved into: Protein hedgehog N-product; Protein hedgehog C-product] | hh GM23589 | Drosophila sechellia (Fruit fly) | ||
P56674 | Protein hedgehog [Cleaved into: Protein hedgehog N-product; Protein hedgehog C-product] | hh | Drosophila hydei (Fruit fly) | ||
Q8TXJ4 | Elongation factor 2 (EF-2) [Cleaved into: Mka FusA intein] | fusA MK0679 | Methanopyrus kandleri (strain AV19 / DSM 6324 / JCM 9639 / NBRC 100938) | ||
Q02936 | Protein hedgehog [Cleaved into: Protein hedgehog N-product (Hh-Np) (N-Hh); Protein hedgehog C-product (Hh-Cp) (C-Hh)] | hh CG4637 | Drosophila melanogaster (Fruit fly) | [60, 61] | |
Q29AA9 | Protein hedgehog [Cleaved into: Protein hedgehog N-product; Protein hedgehog C-product] | hh-1 GA18321; hh-2 GA29124 | Drosophila pseudoobscura pseudoobscura (Fruit fly) | ||
B4G2I8 | Protein hedgehog [Cleaved into: Protein hedgehog N-product; Protein hedgehog C-product] | hh GL23598 | Drosophila persimilis (Fruit fly) | ||
B4LZT9 | Protein hedgehog [Cleaved into: Protein hedgehog N-product; Protein hedgehog C-product] | hh GJ22641 | Drosophila virilis (Fruit fly) | ||
P21505 | Homing endonuclease I-DmoI (EC 3.1.-.-) | | Desulfurococcus mobilis | [62, 63] | |
O30477 | Replicative DNA helicase (EC 3.6.4.12) [Cleaved into: Endonuclease PI-Rma43812IP (EC 3.1.-.-) (Rma dnaB intein)] | dnaB | Rhodothermus marinus (Rhodothermus obamensis) | [64] | |
Q57710 | Probable translation initiation factor IF-2 [Cleaved into: Mja infB intein (Mja IF2 intein)] | infB MJ0262 | Methanocaldococcus jannaschii (strain ATCC 43067 / DSM 2661 / JAL-1 / JCM 10045 / NBRC 100440) (Methanococcus jannaschii) | ||
O78411 | Probable replicative DNA helicase (EC 3.6.4.12) [Cleaved into: Gth dnaB intein] | dnaB | Guillardia theta (Cryptophyte) (Cryptomonas phi) | ||
Q91610 | Desert hedgehog protein A (Cephalic hedgehog protein) (Desert hedgehog protein 1) (DHH-1) (X-CHH) [Cleaved into: Desert hedgehog protein A N-product; Desert hedgehog protein A C-product] | dhh-a chh | Xenopus laevis (African clawed frog) | ||
Q61488 | Desert hedgehog protein (DHH) (HHG-3) [Cleaved into: Desert hedgehog protein N-product; Desert hedgehog protein C-product] | Dhh | Mus musculus (Mouse) | [65, 66] | |
Q9F5P4 | Replicative DNA helicase (EC 3.6.4.12) [Cleaved into: Min DnaB intein] (Fragment) | dnaB | Mycobacterium intracellulare | ||
Q8YZA1 | Replicative DNA helicase (EC 3.6.4.12) [Cleaved into: Endonuclease PI-AspHIP (EC 3.1.-.-) (Asp dnaB intein)] | dnaB all0578 | Nostoc sp. (strain PCC 7120 / SAG 25.82 / UTEX 2576) | ||
P59966 | Replicative DNA helicase (EC 3.6.4.12) [Cleaved into: Endonuclease PI-MboHIP (EC 3.1.-.-) (Mbo dnaB intein)] | dnaB BQ2027_MB0059 | Mycobacterium bovis (strain ATCC BAA-935 / AF2122/97) | [67, 68] | |
P9WMR2 | Replicative DNA helicase (EC 3.6.4.12) [Cleaved into: Endonuclease PI-MtuHIP (EC 3.1.-.-) (Mtu dnaB intein)] | dnaB MT0064 | Mycobacterium tuberculosis (strain CDC 1551 / Oshkosh) | ||
Q14623 | Indian hedgehog protein (IHH) (HHG-2) [Cleaved into: Indian hedgehog protein N-product; Indian hedgehog protein C-product] | IHH | Homo sapiens (Human) | [54, 66, 69-76] | |
Q9T1Q3 | Probable DNA polymerase (EC 2.7.7.7) (EC 3.1.11.-) (P45) | | Acyrthosiphon pisum secondary endosymbiont phage 1 (Bacteriophage APSE-1) | ||
Q58295 | DNA polymerase (EC 2.7.7.7) [Cleaved into: Mja pol-1 intein; Mja pol-2 intein] | pol MJ0885 | Methanocaldococcus jannaschii (strain ATCC 43067 / DSM 2661 / JAL-1 / JCM 10045 / NBRC 100440) (Methanococcus jannaschii) | ||
Q98862 | Indian hedgehog B protein (IHHB) (Echidna hedgehog protein) (EHH) [Cleaved into: Indian hedgehog B protein N-product; Indian hedgehog B protein C-product] | ihhb ehh ihh | Danio rerio (Zebrafish) (Brachydanio rerio) | [77, 78] | |
P46394 | Replicative DNA helicase (EC 3.6.4.12) [Cleaved into: Mle dnaB intein] | dnaB ML2680 MLCB1913.16c | Mycobacterium leprae (strain TN) | [18, 79] | |
Q60348 | Uncharacterized protein MJ0043 [Cleaved into: Mja hyp1 intein] | MJ0043 | Methanocaldococcus jannaschii (strain ATCC 43067 / DSM 2661 / JAL-1 / JCM 10045 / NBRC 100440) (Methanococcus jannaschii) | ||
O34537 | SPbeta prophage-derived uncharacterized protein YosV | yosV yojX BSU19990 | Bacillus subtilis (strain 168) | [33, 80] | |
Q8TUT0 | V-type ATP synthase beta chain (V-ATPase subunit B) [Cleaved into: Mka AtpB intein] | atpB MK1673 | Methanopyrus kandleri (strain AV19 / DSM 6324 / JCM 9639 / NBRC 100938) | ||
O64076 | DNA-directed RNA polymerase YonO (EC 2.7.7.6) (DNA-dependent RNA polymerase YonO) | yonO | Bacillus phage SPbeta (Bacillus phage SPBc2) (Bacteriophage SP-beta) | [81, 82] | |
Q74ZN0 | Protein HIR1 | HIR1 AGR168W | Ashbya gossypii (strain ATCC 10895 / CBS 109.51 / FGSC 9923 / NRRL Y-1056) (Yeast) (Eremothecium gossypii) | [83, 84] | |
B4JTF5 | Protein hedgehog [Cleaved into: Protein hedgehog N-product; Protein hedgehog C-product] | hh GH23852 | Drosophila grimshawi (Hawaiian fruit fly) (Idiomyia grimshawi) | ||
Q92008 | Sonic hedgehog protein A (SHHA) (Shh unprocessed N-terminal signaling and C-terminal autoprocessing domains) (ShhNC) (VHH-1) [Cleaved into: Sonic hedgehog protein A N-product (Shh N-terminal processed signaling domains) (ShhNp) (Sonic hedgehog protein N-product) (ShhN)] | shha shh vhh1 | Danio rerio (Zebrafish) (Brachydanio rerio) | [78, 85-88] | |
O64046 | Probable tape measure protein (TMP) (Transglycosylase) (EC 4.2.2.n1) | yomI | Bacillus phage SPbeta (Bacillus phage SPBc2) (Bacteriophage SP-beta) | [81, 89] | |
G5EC21 | Protein qua-1 | qua-1 T05C12.10 | Caenorhabditis elegans | [90, 91] | |
Q9F416 | Protein RecA (Recombinase A) [Cleaved into: Mfa RecA intein] (Fragment) | recA | Mycolicibacterium fallax (Mycobacterium fallax) | ||
Q18E75 | Replication factor C small subunit (RFC small subunit) (Clamp loader small subunit) [Cleaved into: Hwa RFC intein] | rfcS HQ_3671A | Haloquadratum walsbyi (strain DSM 16790 / HBSQ001) | ||
Q8U0H4 | tRNA-splicing ligase RtcB (EC 6.5.1.-) [Cleaved into: Pfu hyp2 intein (EC 3.1.-.-)] | rtcB PF1615 | Pyrococcus furiosus (strain ATCC 43587 / DSM 3638 / JCM 8422 / Vc1) | ||
Q58817 | Replication factor C small subunit (RFC small subunit) (Clamp loader small subunit) [Cleaved into: Mja RFC-1 intein; Mja RFC-2 intein; Mja RFC-3 intein] | rfcS MJ1422 | Methanocaldococcus jannaschii (strain ATCC 43067 / DSM 2661 / JAL-1 / JCM 10045 / NBRC 100440) (Methanococcus jannaschii) | ||
Q94129 | Warthog protein 4 (Protein M75) [Cleaved into: Warthog protein 4 N-product; Warthog protein 4 C-product] | wrt-4 ZK678.5 | Caenorhabditis elegans | [90, 92] | |
Q57762 | Uncharacterized protein MJ0314 | MJ0314 | Methanocaldococcus jannaschii (strain ATCC 43067 / DSM 2661 / JAL-1 / JCM 10045 / NBRC 100440) (Methanococcus jannaschii) | ||
Q9F414 | Protein RecA (Recombinase A) [Cleaved into: Mga RecA intein] (Fragment) | recA | Mycobacterium gastri | ||
A5CYP3 | Probable cell division protein WhiA | whiA PTH_2727 | Pelotomaculum thermopropionicum (strain DSM 13744 / JCM 10971 / SI) | ||
A7FYW9 | Probable cell division protein WhiA | whiA CLB_3431 | Clostridium botulinum (strain ATCC 19397 / Type A) | ||
A5I7A0 | Probable cell division protein WhiA | whiA CBO3375 CLC_3318 | Clostridium botulinum (strain Hall / ATCC 3502 / NCTC 13319 / Type A) | [93, 94] | |
B1IFW1 | Probable cell division protein WhiA | whiA CLD_1133 | Clostridium botulinum (strain Okra / Type B1) | ||
Q9CGX9 | Probable cell division protein WhiA | whiA LL0963 L190464 | Lactococcus lactis subsp. lactis (strain IL1403) (Streptococcus lactis) | ||
Q02ZN6 | Probable cell division protein WhiA | whiA LACR_1049 | Lactococcus lactis subsp. cremoris (strain SK11) | ||
P9WFP6 | UPF0051 protein MT1508 [Cleaved into: Endonuclease PI-MtuHIIP (EC 3.1.-.-) (Mtu pps1 intein)] | MT1508 | Mycobacterium tuberculosis (strain CDC 1551 / Oshkosh) | ||
O64032 | Sublancin immunity protein sunI | sunI yolF | Bacillus phage SPbeta (Bacillus phage SPBc2) (Bacteriophage SP-beta) | ||
P0A5U5 | Protein RecA (Recombinase A) [Cleaved into: Endonuclease PI-MboI (EC 3.1.-.-) (Mbo RecA intein)] | recA BQ2027_MB2756C | Mycobacterium bovis (strain ATCC BAA-935 / AF2122/97) | [67, 68] | |
Q62226 | Sonic hedgehog protein (SHH) (HHG-1) (Shh unprocessed N-terminal signaling and C-terminal autoprocessing domains) (ShhNC) [Cleaved into: Sonic hedgehog protein N-product (ShhN) (Shh N-terminal processed signaling domains) (ShhNp) (Sonic hedgehog protein 19 kDa product)] | Shh Hhg1 | Mus musculus (Mouse) | [95-97] | |
O30601 | SPbeta prophage-derived ribonucleoside-diphosphate reductase subunit beta (EC 1.17.4.1) (Ribonucleotide reductase small subunit) | yosP yojQ/yojS BSU20040 | Bacillus subtilis (strain 168) | [32, 80] | |
Q58095 | tRNA-splicing ligase RtcB (EC 6.5.1.-) [Cleaved into: Mja hyp2 intein (EC 3.1.-.-)] | rtcB MJ0682 | Methanocaldococcus jannaschii (strain ATCC 43067 / DSM 2661 / JAL-1 / JCM 10045 / NBRC 100440) (Methanococcus jannaschii) | ||
Q5JET4 | DNA repair and recombination protein RadA [Cleaved into: Pko RadA intein] | radA TK1899 | Thermococcus kodakarensis (strain ATCC BAA-918 / JCM 12380 / KOD1) (Pyrococcus kodakaraensis (strain KOD1)) | ||
Q9F410 | Protein RecA (Recombinase A) [Cleaved into: Msh RecA intein] (Fragment) | recA | Mycobacterium shimoidei | ||
Q8TZC4 | Replication factor C small subunit (RFC small subunit) (Clamp loader small subunit) [Cleaved into: Mkn RFC intein] | rfcS MK0006 | Methanopyrus kandleri (strain AV19 / DSM 6324 / JCM 9639 / NBRC 100938) | ||
Q9F415 | Protein RecA (Recombinase A) [Cleaved into: Mfl RecA intein] (Fragment) | recA | Mycolicibacterium flavescens (Mycobacterium flavescens) | ||
P35901 | Protein RecA (Recombinase A) [Cleaved into: Mle RecA intein] | recA ML0987 | Mycobacterium leprae (strain TN) | [18, 98] | |
Q15465 | Sonic hedgehog protein (SHH) (HHG-1) (Shh unprocessed N-terminal signaling and C-terminal autoprocessing domains) (ShhNC) [Cleaved into: Sonic hedgehog protein N-product (ShhN) (Shh N-terminal processed signaling domains) (ShhNp)] | SHH | Homo sapiens (Human) | [97, 99, 100] | |
O55716 | Ribonucleoside-diphosphate reductase large subunit (EC 1.17.4.1) (Ribonucleotide reductase large subunit) [Cleaved into: IIV-6 RIR1 intein] | IIV6-085L | Invertebrate iridescent virus 6 (IIV-6) (Chilo iridescent virus) | [101, 102] | |
Q90385 | Sonic hedgehog protein (SHH) [Cleaved into: Sonic hedgehog protein N-product; Sonic hedgehog protein C-product] | SHH | Cynops pyrrhogaster (Japanese fire-bellied newt) | ||
O30602 | Uncharacterized protein YojW | yojW BSU19999 | Bacillus subtilis (strain 168) | [33, 80] | |
Q9P997 | V-type ATP synthase alpha chain (EC 7.1.2.2) (V-ATPase subunit A) [Cleaved into: Tac AtpA intein (Tac VMA intein)] | atpA Ta0004 | Thermoplasma acidophilum (strain ATCC 25905 / DSM 1728 / JCM 9062 / NBRC 15155 / AMRC-C165) | [103, 104] | |
Q59KI0 | UTP--glucose-1-phosphate uridylyltransferase (EC 2.7.7.9) (UDP-glucose pyrophosphorylase) (UDPGP) (UGPase) | UGP1 CAALFM_CR04660CA CaO19.1738 CaO19.9305 | Candida albicans (strain SC5314 / ATCC MYA-2876) (Yeast) | [105, 106] | |
Q5JHP2 | Replication factor C small subunit (RFC small subunit) (Clamp loader small subunit) [Cleaved into: Pko RFC intein] | rfcS TK2218 | Thermococcus kodakarensis (strain ATCC BAA-918 / JCM 12380 / KOD1) (Pyrococcus kodakaraensis (strain KOD1)) | [13, 107] | |
Q38Y99 | Probable cell division protein WhiA | whiA LCA_0528 | Lactobacillus sakei subsp. sakei (strain 23K) | ||
Q8U4A6 | V-type ATP synthase alpha chain (EC 7.1.2.2) (V-ATPase subunit A) [Cleaved into: Endonuclease PI-Pfu2 (EC 3.1.-.-) (Pfu AtpA intein) (Pfu VMA intein)] | atpA PF0182 | Pyrococcus furiosus (strain ATCC 43587 / DSM 3638 / JCM 8422 / Vc1) | ||
A6M2Y1 | Probable cell division protein WhiA | whiA Cbei_4855 | Clostridium beijerinckii (strain ATCC 51743 / NCIMB 8052) (Clostridium acetobutylicum) | ||
P02958 | Small, acid-soluble spore protein C (SASP) | sspC BSU19950 | Bacillus subtilis (strain 168) | [33, 80, 108] | |
Q91035 | Sonic hedgehog protein (SHH) (Shh unprocessed N-terminal signaling and C-terminal autoprocessing domains) (ShhNC) [Cleaved into: Sonic hedgehog protein N-product (ShhN) (Shh N-terminal processed signaling domains) (ShhNp)] | SHH | Gallus gallus (Chicken) | [109-111] | |
O34479 | SPbeta prophage-derived putative HNH homing endonuclease YosQ (EC 3.1.-.-) | yosQ yojR BSU20050 | Bacillus subtilis (strain 168) | [33, 80] | |
A7GIX1 | Probable cell division protein WhiA | whiA CLI_3559 | Clostridium botulinum (strain Langeland / NCTC 10281 / Type F) | ||
B2TQR9 | Probable cell division protein WhiA | whiA CLL_A3340 | Clostridium botulinum (strain Eklund 17B / Type B) | ||
B8CYG6 | Probable cell division protein WhiA | whiA Hore_15860 | Halothermothrix orenii (strain H 168 / OCM 544 / DSM 9562) | ||
O34342 | SPbeta prophage-derived thioredoxin-like protein YosR | yosR yojT BSU20030 | Bacillus subtilis (strain 168) | [33, 80] | |
A0PYB7 | Probable cell division protein WhiA | whiA NT01CX_1286 | Clostridium novyi (strain NT) | ||
Q0TU86 | Probable cell division protein WhiA | whiA CPF_0345 | Clostridium perfringens (strain ATCC 13124 / DSM 756 / JCM 1290 / NCIMB 6125 / NCTC 8237 / Type A) | ||
Q0SW34 | Probable cell division protein WhiA | whiA CPR_0337 | Clostridium perfringens (strain SM101 / Type A) | ||
B1MXG8 | Probable cell division protein WhiA | whiA LCK_00387 | Leuconostoc citreum (strain KM20) | ||
O64175 | Putative HNH homing endonuclease yosQ (EC 3.1.-.-) | yosQ | Bacillus phage SPbeta (Bacillus phage SPBc2) (Bacteriophage SP-beta) | ||
P91573 | Warthog protein 6 [Cleaved into: Warthog protein 6 N-product; Warthog protein 6 C-product] | wrt-6 ZK377.1 | Caenorhabditis elegans | [90, 112, 113] | |
Q63673 | Sonic hedgehog protein (SHH) (Shh unprocessed N-terminal signaling and C-terminal autoprocessing domains) (ShhNC) [Cleaved into: Sonic hedgehog protein N-product (ShhN) (Shh N-terminal processed signaling domains) (ShhNp)] | Shh Vhh-1 | Rattus norvegicus (Rat) | [85, 114] | |
P77966 | DNA gyrase subunit B (EC 5.6.2.2) [Cleaved into: Ssp GyrB intein] | gyrB sll2005 | Synechocystis sp. (strain PCC 6803 / Kazusa) | ||
Q92000 | Sonic hedgehog protein (Shh unprocessed N-terminal signaling and C-terminal autoprocessing domains) (ShhNC) (VHH-1) (X-SHH) [Cleaved into: Sonic hedgehog protein N-product (ShhN) (Shh N-terminal processed signaling domains) (ShhNp)] | shh | Xenopus laevis (African clawed frog) | [115-117] | |
Q9F407 | Protein RecA (Recombinase A) [Cleaved into: Mth RecA intein] (Fragment) | recA | Mycolicibacterium thermoresistibile (strain ATCC 19527 / DSM 44167 / CIP 105390 / JCM 6362 / NCTC 10409 / 316) (Mycobacterium thermoresistibile) | ||
Q59560 | Protein RecA (Recombinase A) | recA MSMEG_2723 MSMEI_2656 | Mycolicibacterium smegmatis (strain ATCC 700084 / mc(2)155) (Mycobacterium smegmatis) | [118-126] | |
B2UZY1 | Probable cell division protein WhiA | whiA CLH_3090 | Clostridium botulinum (strain Alaska E43 / Type E3) | ||
A2RLG3 | Probable cell division protein WhiA | whiA llmg_1555 | Lactococcus lactis subsp. cremoris (strain MG1363) | ||
Q94130 | Warthog protein 8 (Protein M89) [Cleaved into: Warthog protein 8 N-product; Warthog protein 8 C-product] | wrt-8 C29F3.2 | Caenorhabditis elegans | [90, 92] | |
Q9UXU7 | V-type ATP synthase alpha chain (EC 7.1.2.2) (V-ATPase subunit A) [Cleaved into: Pab AtpA intein (Pab VMA intein)] | atpA PYRAB17610 PAB2378 | Pyrococcus abyssi (strain GE5 / Orsay) | [44, 45] | |
O34775 | SPbeta prophage-derived putative transcriptional regulator YosT | yosT yojV BSU20010 | Bacillus subtilis (strain 168) | [33, 80] | |
Q58191 | Uncharacterized protein MJ0781 [Cleaved into: Mja klbA intein] | MJ0781 | Methanocaldococcus jannaschii (strain ATCC 43067 / DSM 2661 / JAL-1 / JCM 10045 / NBRC 100440) (Methanococcus jannaschii) | ||
Q97LP1 | Probable cell division protein WhiA | whiA CA_C0513 | Clostridium acetobutylicum (strain ATCC 824 / DSM 792 / JCM 1419 / LMG 5710 / VKM B-1787) | ||
B1L268 | Probable cell division protein WhiA | whiA CLK_2807 | Clostridium botulinum (strain Loch Maree / Type A3) | ||
Q180P5 | Probable cell division protein WhiA | whiA CD630_33970 | Clostridioides difficile (strain 630) (Peptoclostridium difficile) | ||
Q890Y9 | Probable cell division protein WhiA | whiA CTC_02493 | Clostridium tetani (strain Massachusetts / E88) | ||
Q67T19 | Probable cell division protein WhiA | whiA STH189 | Symbiobacterium thermophilum (strain T / IAM 14863) | ||
B0TGK8 | Probable cell division protein WhiA | whiA Helmi_06440 HM1_1308 | Heliobacterium modesticaldum (strain ATCC 51547 / Ice1) | ||
Q9V168 | tRNA-splicing ligase RtcB (EC 6.5.1.-) [Cleaved into: Pab hyp2 intein (EC 3.1.-.-)] | rtcB PYRAB05600 PAB0383 | Pyrococcus abyssi (strain GE5 / Orsay) | [44, 45] | |
Q58445 | DNA-directed RNA polymerase subunit A' (EC 2.7.7.6) [Cleaved into: Mja rpoA1 intein (Mja rpol A' intein)] | rpoA1 MJ1042 | Methanocaldococcus jannaschii (strain ATCC 43067 / DSM 2661 / JAL-1 / JCM 10045 / NBRC 100440) (Methanococcus jannaschii) | ||
P67126 | UPF0051 protein Mb1496 [Cleaved into: Endonuclease PI-MtuHIIP (EC 3.1.-.-) (Mtu pps1 intein)] | | Mycobacterium bovis (strain ATCC BAA-935 / AF2122/97) | [67, 68] | |
P9WFP7 | UPF0051 protein Rv1461 [Cleaved into: Endonuclease PI-MtuHIIP (EC 3.1.-.-) (Mtu pps1 intein)] | Rv1461 MTV007.08 | Mycobacterium tuberculosis (strain ATCC 25618 / H37Rv) | [10, 30, 31] | |
P68583 | SPbeta prophage-derived uncharacterized protein YosX | yosX yojZ BSU19970 | Bacillus subtilis (strain 168) | [33, 80] | |
O34919 | SPbeta prophage-derived deoxyuridine 5'-triphosphate nucleotidohydrolase YosS (dUTPase) (EC 3.6.1.23) (dUTP pyrophosphatase) | yosS yojU BSU20020 | Bacillus subtilis (strain 168) | [33, 80, 127-129] | |
Q8XNH7 | Probable cell division protein WhiA | whiA CPE0356 | Clostridium perfringens (strain 13 / Type A) | ||
Q03Z55 | Probable cell division protein WhiA | whiA LEUM_0395 | Leuconostoc mesenteroides subsp. mesenteroides (strain ATCC 8293 / NCDO 523) | ||
Q9HH05 | DNA polymerase (EC 2.7.7.7) [Cleaved into: Endonuclease PI-ThyII (EC 3.1.-.-) (Thy pol-1 intein); Endonuclease PI-ThyI (EC 3.1.-.-) (Thy pol-2 intein)] (Fragment) | pol | Thermococcus hydrothermalis | ||
Q58454 | Uncharacterized protein MJ1054 (EC 1.1.1.-) [Cleaved into: Mja UDPGD intein] | MJ1054 | Methanocaldococcus jannaschii (strain ATCC 43067 / DSM 2661 / JAL-1 / JCM 10045 / NBRC 100440) (Methanococcus jannaschii) | ||
Q94128 | Warthog protein 1 [Cleaved into: Warthog protein 1 N-product; Warthog protein 1 C-product] | wrt-1 ZK1290.12 | Caenorhabditis elegans | [90, 92] | |
P9WHJ2 | Protein RecA (Recombinase A) [Cleaved into: Endonuclease PI-MtuI (EC 3.1.-.-) (Mtu RecA intein)] | recA MT2806 | Mycobacterium tuberculosis (strain CDC 1551 / Oshkosh) | ||
Q58446 | DNA-directed RNA polymerase subunit A'' (EC 2.7.7.6) [Cleaved into: Mja rpoA2 intein (Mja rpol A'' intein)] | rpoA2 MJ1043 | Methanocaldococcus jannaschii (strain ATCC 43067 / DSM 2661 / JAL-1 / JCM 10045 / NBRC 100440) (Methanococcus jannaschii) | ||
Q9V2G4 | Replication factor C small subunit (RFC small subunit) (Clamp loader small subunit) (PabRFC small subunit) [Cleaved into: Pab RFC-1 intein; Pab RFC-2 intein] | rfcS PYRAB01100 PAB0068 | Pyrococcus abyssi (strain GE5 / Orsay) | [44, 45, 130] | |
O57852 | Replication factor C small subunit (RFC small subunit) (Clamp loader small subunit) [Cleaved into: Pho RFC intein] | rfcS PH0112 | Pyrococcus horikoshii (strain ATCC 700860 / DSM 12428 / JCM 9974 / NBRC 100139 / OT-3) | ||
P46946 | DNA endonuclease SAE2 (EC 3.1.-.-) (Completion of meiotic recombination protein 1) (Sporulation in the absence of SPO11 protein 2) | SAE2 COM1 YGL175C G1639 | Saccharomyces cerevisiae (strain ATCC 204508 / S288c) (Baker's yeast) | [131, 132] | |
O64174 | Ribonucleoside-diphosphate reductase subunit beta (EC 1.17.4.1) (Ribonucleotide reductase small subunit) | bnrdF yosP | Bacillus phage SPbeta (Bacillus phage SPBc2) (Bacteriophage SP-beta) | ||
O57728 | V-type ATP synthase alpha chain (EC 7.1.2.2) (V-ATPase subunit A) [Cleaved into: Endonuclease PI-Pho2 (EC 3.1.-.-) (Pho AtpA intein) (Pho VMA intein)] | atpA PH1975 | Pyrococcus horikoshii (strain ATCC 700860 / DSM 12428 / JCM 9974 / NBRC 100139 / OT-3) | ||
Q5UQR0 | DNA polymerase (EC 2.7.7.7) [Cleaved into: Mimv polB intein] | POLB MIMI_R322 | Acanthamoeba polyphaga mimivirus (APMV) | [133, 134] | |
Q5UZ40 | DNA polymerase II large subunit (Pol II) (EC 2.7.7.7) (Exodeoxyribonuclease large subunit) (EC 3.1.11.1) [Cleaved into: Hma polC intein (Hma pol II intein)] | polC polA2 rrnAC2691 | Haloarcula marismortui (strain ATCC 43049 / DSM 3752 / JCM 8966 / VKM B-1809) (Halobacterium marismortui) | ||
P74750 | DNA polymerase III subunit alpha (EC 2.7.7.7) [Cleaved into: Ssp dnaE intein] (Fragments) | dnaE-N slr0603; dnaE-C sll1572 | Synechocystis sp. (strain PCC 6803 / Kazusa) | [135, 136] | |
A2BGR3 | DNA excision repair protein ERCC-6-like (EC 3.6.4.12) (ATP-dependent helicase ERCC6-like) | ercc6l si:ch211-278b8.3 | Danio rerio (Zebrafish) (Brachydanio rerio) | [137, 138] | |
Q97CQ0 | V-type ATP synthase alpha chain (EC 7.1.2.2) (V-ATPase subunit A) [Cleaved into: Tvo AtpA intein (Tvo VMA intein)] | atpA TV0051 TVG0054274 | Thermoplasma volcanium (strain ATCC 51530 / DSM 4299 / JCM 9571 / NBRC 15438 / GSS1) | ||
Q50362 | Uncharacterized protein MG315 homolog | MPN_450 H08_orf314 MP391 | Mycoplasma pneumoniae (strain ATCC 29342 / M129) | [139, 140] | |
Q57841 | Uncharacterized protein MJ0398 | MJ0398 | Methanocaldococcus jannaschii (strain ATCC 43067 / DSM 2661 / JAL-1 / JCM 10045 / NBRC 100440) (Methanococcus jannaschii) | ||
Q49689 | UPF0051 protein ML0593 [Cleaved into: Mle pps1 intein] | ML0593 B1496_C2_189 MLCL536.28c | Mycobacterium leprae (strain TN) | ||
Q58242 | Uncharacterized protein MJ0832 [Cleaved into: Mja rnr-1 intein; Mja rnr-2 intein] | MJ0832 | Methanocaldococcus jannaschii (strain ATCC 43067 / DSM 2661 / JAL-1 / JCM 10045 / NBRC 100440) (Methanococcus jannaschii) |

Based on their domain structure, inteins are categorized into four classes: full-length inteins, mini-inteins, split-inteins and alanine-inteins (Figure 1). Most inteins are full-length inteins expressed within a single polypeptide chain (cis-splicing inteins). They are bifunctional proteins that include two structural domains - the intein domain responsible for protein splicing out of the precursor polypeptide chain, and a homing endonuclease (HE) domain with a role in DNA-cutting and insertion of the associated mobile genetic element into the precursor protein-coding gene [141-144]. The HE domain splits the splicing domain into N- and C-terminal splicing domains. Intein sequence alignments revealed important motifs and conserved regions that mediate splicing [145]. The splicing domain consists of conserved blocks A, B, F and G, while blocks C, D and E are present in the HE domain. Blocks C and E contain conserved endonuclease active sites with catalytic residues Asp, Glu or Lys [146, 147]. Several inteins have mutations in these endonuclease active-site residues and therefore may not be active endonucleases, although the remainder of the motif is present. Blocks A and B localize near the N-terminus of the intein, while blocks F and G are located near the C-terminus. Conserved amino acids important for the splicing process (Cys, Ser or Thr) are present both at the intein N-terminus and on the C-extein fragment near the intein-extein junction (Figure 2) [145]. In addition, a dipeptide His-Asn or His-Gln is present at the intein C-terminus in most full-length inteins. Split-inteins consist of two short intein segments - the N-terminal intein (IN) and the C-terminal intein (IC). The two fragments reassemble into a complete intein structure, similar to that of full-length inteins, and splice in trans [148-150].
Based on their conserved residues and mechanism of action, most full-length and split-inteins are classified as class 1 inteins (see below).
Alanine-inteins have an alanine instead of a cysteine or a serine at the splicing junction (Figure 1A). Based on their splicing mechanism, most alanine-inteins belong to either class 2 or class 3 inteins (see below).
Mini-inteins lack the HE domain and have a continuous splicing domain formed by the typical N- and C-terminal splicing domains; they are therefore also cis-splicing inteins. In contrast, split-inteins are mini-inteins whose N- and C-terminal splicing domains are transcribed and translated separately, each with a different extein, and are referred to as trans-splicing inteins. In this case, the fragments associate through a zipper-like interface prior to protein splicing [151].

Inteins use the same strategies as classical enzymes to perform catalysis [152]. They are single turnover enzymes that splice out during the maturation of their host proteins. Splicing occurs when nucleophilic residues and residues that assist catalysis belonging to the host-intein polypeptide chain are properly aligned as a result of intrachain protein folding [153]. Splicing occurs spontaneously, without the help of any known cofactor, chaperone, or energy source [154].
Standard intein splicing, also known as the splicing mechanism of class 1 inteins, involves breaking the peptide bonds at the intein-extein junctions and forming a new peptide bond that connects the separated extein fragments. Class 1 inteins are characterized by motifs and conserved regions that mediate splicing [145]. The two conserved motifs near the N- and C-termini of the intein, as well as the C-extein fragment near the intein-extein junction, contain amino acids important for the splicing process (Figure 2) [145].
The four-step protein splicing reaction starts with the formation of a (thio)ester bond by an N-O or N-S acyl shift at the N-terminal splice junction, when the side chain of the first intein residue (a Cys, a Ser or a Thr) nucleophilically attacks the peptide bond to the immediately upstream residue, which is the last residue of the N-extein. In the second step, the newly formed (thio)ester is attacked by the side chain of the first residue of the C-extein, which frees the N-terminal end of the intein in a trans-(thio)esterification event. This results in a branched intermediate in which the N-extein and C-extein are attached, albeit not through a peptide bond. The branch is resolved in the third step, in which the amide nitrogen of the last intein residue, an asparagine, attacks and cleaves the peptide bond between the intein and the C-extein. The result is an intein segment with a C-terminal cyclic imide (succinimide) that is subsequently opened by hydrolysis. In the final step, the linked N- and C-exteins undergo an acyl rearrangement in which the (thio)ester bond is broken by an O-N or S-N shift and a new peptide bond is created [154, 155] (Figure 3).
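The sequence-level outcome of the class 1 reaction described above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not a simulation of the chemistry: the function name `splice_class1` and the example sequences are hypothetical, and only the residue requirements named in the text are checked (Cys, Ser or Thr at the first intein position and at the first C-extein position, and Asn as the last intein residue).

```python
from typing import Tuple

# One-letter codes of the side-chain nucleophiles named in the text.
NUCLEOPHILES = {"C", "S", "T"}  # Cys, Ser, Thr


def splice_class1(n_extein: str, intein: str, c_extein: str) -> Tuple[str, str]:
    """Return (ligated exteins, excised intein) for a class 1 precursor.

    Hypothetical helper: it models only which residues must be present
    for each step and what the final products are, not the kinetics.
    """
    if not intein or not c_extein:
        raise ValueError("intein and C-extein must be non-empty")
    # Step 1: N-O / N-S acyl shift needs Cys, Ser or Thr at intein position 1.
    if intein[0] not in NUCLEOPHILES:
        raise ValueError("no nucleophile at the first intein residue")
    # Step 2: trans-(thio)esterification needs a nucleophile at C-extein +1.
    if c_extein[0] not in NUCLEOPHILES:
        raise ValueError("no nucleophile at the first C-extein residue")
    # Step 3: branch resolution by cyclization of the terminal Asn.
    # (Simplification: inteins ending in Gln are not handled here.)
    if intein[-1] != "N":
        raise ValueError("last intein residue is not Asn")
    # Step 4: O-N / S-N acyl shift leaves a native peptide bond.
    return n_extein + c_extein, intein


# Made-up sequences, for illustration only.
mature, excised = splice_class1("MKV", "CLSFGTEHN", "SGY")
print(mature, excised)
```

With these made-up inputs the exteins are joined into one chain and the intein is returned unchanged; an intein starting with Ala is rejected, which is the case handled by the class 2 and class 3 mechanisms.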
Ser, Thr, Cys and Asn are essential residues that act as nucleophiles in the splicing mechanism of most inteins. However, some inteins show variations in the conserved amino acids involved in splicing. Some have an Ala at their N-terminus and cannot follow the standard splicing steps, yet are still able to splice. Examples are the KlbA and the DnaB families of Ala1 inteins [156, 157], which splice by different mechanisms. KlbA inteins lack the conserved Ser or Cys at the intein N-terminus, and the conserved penultimate His has been replaced by a Ser. Under these conditions, the C-extein nucleophile attacks a peptide bond at the N-terminal splice junction rather than a (thio)ester bond, eliminating the need to form the initial (thio)ester at the N-terminal splice junction. After this first step, the splicing reaction follows the standard steps: branch resolution by Asn cyclization and acyl rearrangement to form a native peptide bond between the ligated exteins [156]. Inteins that work by this mechanism are classified as class 2 inteins (Figure 3).
DnaB and other inteins classified as class 3 inteins use a modified mechanism that includes new conserved residues and a second branched intermediate [158-160]. The sequence signature of this class of inteins is a noncontiguous Trp-Cys-Thr (WCT) motif and the absence of the standard class 1 N-terminal Cys or Ser nucleophile. A conserved Cys at position F:4 directly attacks the peptide bond at the N-terminal splice site, which links the N-extein to this Cys through a thioester and forms the class-specific branched intermediate. Next, the N-extein is transferred to the side chain of the Ser, Thr or Cys at the C-terminal splice junction to form the standard branched intermediate [158] (Figure 3).
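The distinction between the three mechanistic classes can be summarized as a simple decision rule. The sketch below is only a rough heuristic based on the two features named in the text; the function name `splicing_class` is hypothetical, and the presence of the noncontiguous WCT motif is assumed to have been determined beforehand (for example from a sequence alignment) and is passed in as a flag.

```python
def splicing_class(first_intein_residue: str, has_wct_motif: bool) -> int:
    """Guess the mechanistic class (1, 2 or 3) of an intein.

    Heuristic only: real assignments rely on experimental evidence and
    on further conserved positions such as the block F Cys (F:4).
    """
    # Class 1: a Cys, Ser or Thr at position 1 starts the standard mechanism.
    if first_intein_residue in {"C", "S", "T"}:
        return 1
    # Class 3: no N-terminal nucleophile, but the WCT motif is present.
    if has_wct_motif:
        return 3
    # Class 2: no N-terminal nucleophile and no WCT motif (KlbA-like);
    # the C-extein nucleophile attacks the N-terminal junction directly.
    return 2


print(splicing_class("C", False))
print(splicing_class("A", False))
print(splicing_class("A", True))
```

The order of the checks mirrors the text: the N-terminal nucleophile is tested first, and the WCT motif is consulted only when that nucleophile is absent.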
The involvement of Cys residues in the mechanism of action of inteins requires intein activity to be carried out in a reducing environment. This limits the applicability of split-inteins, as many fused proteins cannot be exposed to reducing agents without structural or functional consequences. Most recently, Bhagawati et al (2019) engineered a cysteine-less split-intein (CL intein) that is active at ambient temperatures and in the absence of reducing agents, without requiring a denaturation step [161]. The group developed a strategy for N- and C-terminal labeling of proteins in which a noncatalytic cysteine in a cysteine tag is added to either the IN or the IC fragment of the intein, depending on the desired end-labeling. This tag is bioconjugated with a thiol-reactive label and subsequently ligated, via protein trans-splicing, to the target protein pre-fused to the complementary intein fragment. Thus, the target protein is not exposed to reducing conditions or to a thiol-reactive reagent, which preserves the disulfide bonds as well as any free cysteines in the protein (Figure 4).
Both cis-splicing and trans-splicing inteins have been used in various biotechnological applications. Natural inteins have been isolated from cells and engineered to create self-splicing proteins for specific functions. For example, the splicing domain has been exploited for expression and purification of recombinant proteins [162, 163], cyclization [164], site-specific modification [165, 166] or labeling of proteins [167, 168], post-translational processing, production of selenoproteins, protein regulation by conditional protein splicing, biosensors, and expression of transgenes [169-171]. In addition, the HE domain has proved to be a versatile biotechnological tool, as it has been used in various applications that involve genetic manipulation, including gene therapy for monogenic diseases, insect vector control and development of transgenic crops [172, 173]. Inteins have also been used for in vivo studies, such as detecting protein-protein interactions, monitoring protein translocation through cellular organelles [174, 175], and even site-specific conjugation of quantum dots [176].
Epitope tags are used in the process of protein purification to simplify the process and reduce the production costs [177, 178]. Tags are engineered at either the N- or the C-terminal end of the protein and have high affinity for specific matrices. By exploiting this selective affinity, they facilitate the separation of the tagged proteins from any protein mixture. Affinity-based purification protocols generally require only a single step to obtain high yields of highly purified proteins and can easily be adapted for large scales [177-180]. Among the most commonly used tags are hexa-histidine tags [181, 182], maltose-binding protein (MBP) tags [183], TAP tags [184], GST-tags [185], various combinations of two or multiple tags [186] and other tags discussed elsewhere [178-180, 187]. Tags are generally removed at the end of the purification procedure via proteolytic cleavage by endopeptidases [188, 189], a step that often leads to lower overall yield of purified protein. Several methods to use inteins as efficient protein purification tags have been developed to avoid these drawbacks.

Development of the self-cleaving affinity tag was the first major application of inteins. Self-cleaving tags generally contain a modified intein fused to an affinity tag and to the protein to be purified. Modified inteins are engineered inteins whose amino acid sequence has been changed such that an alanine replaces one of the conserved terminal residues - the cysteine at the N-terminus or the asparagine at the C-terminus. These mutant alanine-inteins cleave at either their carboxyl (C) or amino (N) terminus, thus allowing the intein to be used in purification tags (Figure 5).
The intein activity is induced following the affinity purification of the target protein, thus cleaving the purified protein from both the intein and the tag. The splicing reaction can be induced by the addition of a small molecule or by a change in the reaction conditions (pH or temperature) (Figure 6), an approach known as conditional protein splicing (CPS). The identity of the intein amino acid that is mutated to produce the cleaving mutant determines how the intein cleaving reaction is controlled. Inteins that retain a cysteine at the N-terminus have the target protein fused at their N-terminus and are induced by thiol-based activating reagents such as sodium 2-mercaptoethanesulfonate (MESNA), thiophenol, β-mercaptoethanol or 1,4-dithiothreitol (DTT) [43, 136, 162, 190, 191]. Those that retain the C-terminal asparagine and have the target protein located at the C-terminal end of the construct are induced by changes in the reaction conditions: pH, salt or temperature [43, 192]. Other triggers include light [193-196], addition of non-reducing small molecules [197, 198], changes in redox state [199] and proteases [200]. For example, Wong et al (2015) engineered a synthetic photoactivatable intein (named LOVInC) by using the light-sensitive LOV2 domain from Avena sativa as a switch to modulate the splicing activity of the split DnaE intein from N. punctiforme [201], while the group of Muir created “zymogens”, inteins that are activated by proteases [200].
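The relationship between the mutated residue, the side on which the target protein is fused, and the inducing conditions can be written down as a small lookup. This is an illustrative sketch only: the class `CleavingMutant` and the function `describe_cleaving_mutant` are hypothetical names, and the table encodes just the two cases described in the text, not the light-, redox- or protease-controlled variants.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class CleavingMutant:
    retained_residue: str      # conserved terminal residue left intact
    cleavage_site: str         # intein terminus at which cleavage occurs
    target_fused_at: str       # intein terminus carrying the target protein
    triggers: Tuple[str, ...]  # conditions that induce cleavage


def describe_cleaving_mutant(mutated_to_ala: str) -> CleavingMutant:
    """Map the residue replaced by Ala ('Asn' or 'Cys') to its control mode."""
    if mutated_to_ala == "Asn":
        # C-terminal Asn -> Ala: the N-terminal Cys is retained,
        # cleavage occurs at the N-terminus and is induced by thiols.
        return CleavingMutant("N-terminal Cys", "N-terminus", "N-terminus",
                              ("MESNA", "thiophenol",
                               "beta-mercaptoethanol", "DTT"))
    if mutated_to_ala == "Cys":
        # N-terminal Cys -> Ala: the C-terminal Asn is retained,
        # cleavage occurs at the C-terminus and responds to conditions.
        return CleavingMutant("C-terminal Asn", "C-terminus", "C-terminus",
                              ("pH", "salt", "temperature"))
    raise ValueError("expected 'Asn' or 'Cys'")


print(describe_cleaving_mutant("Asn").triggers)
print(describe_cleaving_mutant("Cys").triggers)
```

Keeping the two cases in one place makes the trade-off explicit: the thiol-induced mutant gives tight control at the cost of reducing agents, while the condition-induced mutant avoids them.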
Both types of inteins have advantages and limitations. Thiol-induced inteins allow tight control of the cleaving reaction. As cleaving in the absence of a strong reducing agent is a very slow process, intein-target protein fusion products can be purified by using regular purification procedures. Separation of the target protein from the intein tag is then induced, when desired, by the addition of thiols. Being able to control the cleavage reaction is a definite advantage; however, the use of thiol-induced inteins also has significant disadvantages. The potent reducing agents used to initiate the cleavage reaction are generally toxic and cannot be used to purify proteins that contain disulfide bonds. Disulfide bonds play an important role in stabilizing many recombinant proteins, and the addition of reducing agents can lead to their destabilization, unfolding and precipitation [202, 203].
The major advantage of using the conditionally controlled inteins (CPS-inteins) is the simplicity of their control mechanism. A simple shift to lower pH or higher temperature promotes intein cleavage, without the need for addition of any harsh chemicals. Thus, there is no inherent limitation to the types of proteins that can be purified using these intein systems and the protein purification schemes are generally very simple. The primary limitation of inteins induced by reaction conditions is that the cleaving reaction cannot be tightly controlled. Such inteins can prematurely cleave out during the protein purification process leading to lower purity and yields. Special measures need to be taken to reduce the premature cleavage. For example, lowering the temperature for recombinant protein expression in bacteria leads to higher target protein yields [204].
Several intein-based protein purification systems were developed, starting with the ΔI-cleaving mutant (ΔI-CM) mini-intein system [205, 206] and followed by several of its derivatives - the elastin-like polypeptide (ELP)-based system [207-211], the affinity chitin-binding domain (CBD) tag system [212] and the polyhydroxybutyrate (PHB)-based system [213]. The ΔI-CM intein was derived from the M. tuberculosis (Mtu) RecA intein by deleting the endonuclease domain. After random mutagenesis and selection for increased splicing activity, active mini-inteins were isolated and used for purification and tag removal of various proteins expressed in E. coli [143, 205, 206, 210, 214, 215]. Insertion of the intein between phasin, a polyhydroxybutyrate (PHB)-binding protein, and the target protein led to the development of a PHB-intein-based protein purification system [216, 217]. In this system, the phasin acts as an affinity tag for the target protein by binding specifically to intracellular PHB granules. Addition of the intein confers self-cleaving abilities on the fusion protein, thus eliminating the need for protease treatment to obtain the native target protein [213]. The other protein tags that have been used in conjunction with the ΔI-CM intein produced similar protease-independent affinity purification systems. Because the splicing reaction is initiated by a simple pH shift, rather than by the addition of thiols [206], ΔI-CM inteins are a valuable tool for the purification of disulfide bond-containing proteins, including antibody fragments [218]. The tool becomes even more powerful when combined with commercially available rapid cloning systems like Invitrogen’s Gateway or the TOPO cloning technology [219-221].
Other intein-based protein purification systems, like the IMPACT and pTWIN systems developed by New England Biolabs, are commercially available [135, 162, 190, 222]. The intein-mediated purification with affinity chitin-binding tag (IMPACT) system is based on a modified S. cerevisiae vacuolar ATPase subunit A intein (Sce VMA intein) [162] (Figure 7) in which the C-terminal reactive Asn was mutated to Ala. This mutation prevents C-terminal cleavage by blocking the splicing reaction at the N-S acyl shift stage, leading to the accumulation of an unspliced precursor protein that can be affinity purified via its chitin-binding tag. Once N-terminal cleavage is initiated by the addition of thiols such as DTT, β-mercaptoethanol or cysteine, the target protein fused at its C-terminus to the intein-CBD domain is released from the matrix-bound intein-CBD fragment and is eluted. The original IMPACT system has been enhanced to form the second-generation IMPACT-CN system, which allows the fusion of an intein-tag fragment to either the C-terminus or the N-terminus of the target protein and the purification of native target proteins without any construct-derived amino acids (e.g. vector-derived linker amino acids). In addition, the system allows the use of two independent inteins in combination to purify a single protein by sequential induction of the two splicing reactions [223]. New England Biolabs expanded the IMPACT system to pTWIN, a similar system that also allows the use of two CBD-bound inteins. One of the inteins is a mini-intein derived from the dnaE gene of Synechocystis sp. and modified to perform pH- and temperature-controlled self-cleavage at its C-terminus [135]. The second intein is derived either from the M. xenopi gyrA gene [190] (pTWIN1) or from the M. thermoautotrophicum rir1 gene [222] (pTWIN2), two mini-inteins engineered to perform thiol-controlled self-cleavage at their N-termini.
Thus, these dual intein systems allow the stepwise release and purification of target proteins.
In order to minimize premature cleaving of intein tags in vivo during protein expression, several groups have developed intein purification systems based on the reassembly of trans-cleaving split-inteins. Efficient in vivo protein trans-splicing was observed when the unlinked N- and C-terminal regions of the Ssp DnaB [42, 224] or those of the M. tuberculosis RecA [225] inteins were used together to form a functional protein splicing domain independent of the endonuclease domains of the inteins. Purified N- and C-terminal segments of the M. tuberculosis RecA intein fused to appropriate exteins were also reconstituted into a functional protein splicing element, proving that the N- and C-terminal protein-splicing domains can interact and work together in trans [225, 226]. These engineered trans-acting inteins were also adapted for tag-affinity protein purification by expression of two ELP-tagged segments [211, 227, 228]. They are efficient; however, the separate end fragments cannot perform protein cleavage when alone, and expression, purification and assembly of the complete system can be inefficient [227]. The discovery and use of natural split-inteins [136], and of trans-cleaving systems derived from them, diminished these problems [135].
An example of a split-intein purification system based on a naturally trans-splicing intein is the protein cleaving system derived from the N. punctiforme DnaE intein, Npu. This intein with natural trans-splicing activity has been engineered, by the introduction of the single point mutation (Asp118Gly) characteristic of ΔI-CM inteins, to rapidly and efficiently cleave at its C-terminus only, once reassembled [229]. Furthermore, the mutant intein was engineered for tag-affinity protein purification by utilizing the elastin-like polypeptide and the chitin-binding protein tags, respectively [230]. More specifically, the N-terminal fragment of the intein was fused to the affinity tag and immobilized, while the C-terminal segment was fused to the target protein. The two fusions are at the C-termini of the intein fragments (Figure 8). Protein purification is initiated through the association of the intein fragments, a process controlled by the presence of zinc ions, while cleavage of the target protein is activated by addition of thiols [230]. Miller et al used this split-intein technology to generate SpCas9 variants compatible with non-G PAMs [231].
In addition to being used in affinity chromatography purification protocols, intein-based systems have been developed for non-chromatographic purifications based on selective aggregating tags. Such procedures, combined with the advantages of protein self-cleaving, reduce cost and sample preparation time and lead to reasonable yields of purified material.
One such system is the ELP-intein based system [210, 211]. This method is based on the selective and reversible aggregation of ELP-intein-tagged proteins. The aggregation conditions vary with temperature, protein concentration and size, but precipitation is generally induced by high salt concentrations [207, 208, 232]. Once purified, the intein-containing precursor construct self-cleaves, and the released ELP tag is precipitated again, thus allowing the rapid separation of the pure target protein. In addition to large protein purification, this approach proved efficient and convenient for generating small antimicrobial peptides [233].
The method was recently adapted to the use of split-inteins. Fan et al have developed a variant of the non-chromatographic tag purification strategy using the ELP and other tags in combination with an engineered Npu split-intein active at high pH. This high pH-dependent cleavage minimizes sample loss during precipitation and provides rapid tag removal [228].
The second example of a non-chromatographic purification approach is the multiple phasin tag-intein system. Phasins are the major bacterial polyhydroxyalkanoate (PHA) granule-associated proteins, which bind specifically to polyhydroxybutyrate (PHB) granules produced in vivo in E. coli [234, 235]. Banki et al (2005) combined the production of PHB granules in E. coli, phasin’s affinity for these granules, and an engineered pH- and temperature-dependent self-cleaving intein to create a self-contained protein expression and purification system [213]. The phasin-intein acts as a self-cleaving purification tag with affinity for the PHB granules. The PHB-bound target protein is purified from crude cell extracts by repeated washing, centrifugation and resuspension of the granules. The native target protein is then released from the bound tag through a pH-induced intein-mediated self-cleavage reaction, and the granule-bound tag is removed by centrifugation. The system permits the easy separation of significant amounts of target protein, is compatible with large-scale, robotic applications, and can be applied to other expression systems [216, 236].
The intein-mediated protein purification methods are simple and cost-effective. They involve few purification steps and have low reagent requirements, and are thus suitable for large-scale, industrial applications. Wood et al designed a large-scale affinity separation method based on the DTT-inducible IMPACT system. They determined the cost-optimal reaction conditions and concluded that using the Tris-HCl reaction buffer and thiol-based activators increases the cost of the purification process, whereas using a phosphate-based buffer system and inteins activated by environmental changes in pH or temperature makes the system appropriate for large-scale protein purification. Moreover, the use of non-chromatographic affinity tags eliminates the need for expensive column chromatography procedures. The combination of Invitrogen's Gateway cloning technology with self-cleaving purification tags by Gilles et al led to a new system for rapid production of recombinant proteins suitable for large-scale production [219]. The system allows for the addition of any tag or promoter, increasing the number of expression vectors that can be used. When compared with classical affinity chromatography followed by protease-mediated tag removal, intein-based self-cleaving tag removal proved much more cost-effective, but the procedure requires improvement as the yield is considerably lower [204].
By rearranging specific internal covalent and/or peptide bonds, proteins can be post-translationally modified. Inteins have been used as a tool to insert post-translational structural modifications into proteins. These methods are powerful techniques as they facilitate protein ligation and protein circularization [237, 238], the incorporation of unnatural amino acids [239, 240] and biophysical probes into polypeptide chains [241, 242], the unnatural post-translational modification of proteins when desired [222, 243, 244], protein immobilization on solid-supports [245, 246], as well as the study of protein structures or protein-protein interactions [247]. For each of these applications, a modified splicing reaction is performed by a mutant engineered intein resulting in a covalent modification in the target protein.

Cyclic proteins are exceptionally stable to chemical, thermal, or enzymatic degradation. Their increased stability is due to the inability of exopeptidases to digest closed, circular molecules. In addition, many circular polypeptides have higher specific activities than their open counterparts. Their biological activities include anti-bacterial, uterotonic, hemolytic, and cytotoxic activity [248]. Expressed protein ligation (EPL) and protein trans-splicing (PTS) are two intein-based approaches that permit the engineered cyclization of polypeptide chains as well as the assembly of proteins from smaller fragments, either in vitro or in vivo [237, 238]. Intein-mediated EPL is a technique in which a recombinant protein with an N-terminal cysteine reacts with a recombinant protein thioester generated by an engineered intein fusion protein, resulting in a new peptide bond formed via native chemical ligation (Figure 9) [222].
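At the level of sequence bookkeeping, the ligation and cyclization described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function names `native_chemical_ligation` and `same_cyclic_product` and the peptide sequences are hypothetical, and the only chemical requirement modeled is the N-terminal Cys on the fragment that reacts with the thioester.

```python
def native_chemical_ligation(thioester_fragment: str, cys_fragment: str) -> str:
    """Join a C-terminal thioester fragment to an N-terminal Cys fragment.

    Sequence-level model only: the thioester itself is not represented,
    the caller is assumed to supply a fragment that carries one.
    """
    if not cys_fragment.startswith("C"):
        raise ValueError("the second fragment needs an N-terminal Cys")
    # The product carries a native peptide bond at the ligation site.
    return thioester_fragment + cys_fragment


def same_cyclic_product(a: str, b: str) -> bool:
    """True if two linear sequences describe the same backbone-cyclized chain.

    A circular chain has no free termini, so any rotation of the
    sequence denotes the same molecule.
    """
    return len(a) == len(b) and b in a + a


# Made-up sequences, for illustration only.
print(native_chemical_ligation("MKT", "CGY"))
print(same_cyclic_product("CGRA", "RACG"))
print(same_cyclic_product("CGRA", "CGAR"))
```

Treating rotations as equivalent reflects the point made in the text: once the backbone is closed there is no terminus left for an exopeptidase to act on.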
Several approaches have been used to introduce N-terminal Cys residues in polypeptide chains. In some approaches, a Cys residue is engineered immediately downstream of the initiating methionine, which is naturally excised by methionyl-aminopeptidases [249], or downstream of a highly specific protease recognition sequence (e.g. Factor Xa [250] or tobacco etch virus (TEV) cysteine protease [251]) that allows the N-terminal Cys-containing protein to be released by the specific protease. N-terminal Cys proteins were also obtained by mutations introduced in expression vectors [252], by in vivo synthesis and removal of leader peptides [253] or by intein splicing at the C-terminal splice junction [190, 222]. Recombinant protein α-thioesters can be obtained by using a mutant intein in which the conserved Asn residue (Figure 3) was replaced by Ala. This change stops the splicing process (Figure 4) and generates an α-thioester linkage [254].
Protein trans-splicing (PTS) is a protein ligation reaction carried out by naturally split inteins that facilitates the production and study of the structure and/or function of heterologous proteins [255]. Naturally split inteins are characterized by a large N-terminal fragment and a shorter C-terminal fragment (Figure 1), and they splice with rapid kinetics, making them a powerful tool for protein engineering [170]. The disadvantage is the tendency of the larger, N-terminal intein fragment to rearrange structurally prior to its association with the C-terminal intein fragment, often leading to protein aggregation [256, 257]. Recently, Gramespacher et al (2017, 2018) addressed this problem by using an engineering strategy to stabilize the N-terminal intein structure, control the activity, and improve the efficiency of PTS under a variety of reaction conditions [200, 258]. In this approach, the split-intein was embedded within a protein sequence designed to stabilize either the intein fragment itself or the joined extein [258]. Similarly, to produce an on/off switch for intein activity, the authors fused each split-intein fragment to a shorter fragment of its cognate partner [200]. In this caged state the PTS reaction is blocked, and it can be restarted rapidly by proteolytic release of the caging fragments [200]. This caging strategy led to significant improvement in PTS for some inteins (e.g. IMPDH‐1 and NrdJ‐1 when fused to the mAb‐HC), but no gains in soluble protein expression for others (e.g. Cfa and IMPDH‐1 in E. coli), making further improvements necessary [258].
Several groups have used EPL for the backbone cyclization of proteins or protein domains. For example, Camarero and Muir (1999) created a circular Src homology 3 (SH3) domain from the murine c-Crk protein that was fully active [259] and continued their work by successfully producing other circular small protein domains [260, 261] and applying the method to in vivo protein circularization [262]. Other examples of functional circular proteins produced by similar approaches are β-lactamase [263] and the green fluorescent protein (GFP) [264]. In addition, several cyclic peptides have been created in E. coli by EPL: a library of mutant cyclic SFTI-1 (sunflower trypsin inhibitor 1) peptides [265]; a library of MCoTI-I cyclotides; and libraries of antimicrobial peptides, the defensins [266, 267].
The SICLOPPS (split-intein circular ligation of proteins and peptides) technique is based on PTS and uses the naturally split Ssp DnaE intein. It has been successfully used to create circular polypeptides by fusing the N- and C-intein fragments to the C and N termini, respectively, of the polypeptide to be cyclized [238, 268]. This method has been used to produce libraries of both small and large backbone-cyclized polypeptides in bacteria [268-271] and in human cells [272]. Moreover, a split-intein based bacterial system for the production and evolution of cyclic peptides was designed to evolve an inhibitor of HIV protease by using an expanded genetic code and selection based on cellular viability in the presence of the peptides [273].
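The inverted fragment order used in SICLOPPS can be sketched with placeholder strings. The example below is illustrative only: `[IntC]` and `[IntN]` are hypothetical labels standing in for the C- and N-intein fragments (not real sequences), the target peptide is invented, and `canonical_cycle` is simply one assumed way to compare backbone-cyclized products, which have no defined start or end.

```python
def siclopps_precursor(target: str, int_n: str = "[IntN]", int_c: str = "[IntC]") -> str:
    """C-intein fused to the target's N-terminus, N-intein to its C-terminus."""
    return f"{int_c}{target}{int_n}"


def canonical_cycle(peptide: str) -> str:
    """Pick one rotation as the representative of a backbone-cyclized peptide."""
    rotations = [peptide[i:] + peptide[:i] for i in range(len(peptide))]
    return min(rotations)


target = "SGWLYT"  # hypothetical peptide to be cyclized
print(siclopps_precursor(target))

# After splicing, the termini are joined, so any rotation of the linear
# sequence describes the same circular product.
print(canonical_cycle("SGWLYT") == canonical_cycle("LYTSGW"))
```

Treating every rotation as equivalent is a convenient way to deduplicate members of a cyclic-peptide library in such a model.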
Cyclotides containing an unnatural amino acid, p-azido-phenylalanine, were produced by combining the PTS-cyclization technique with the use of nonsense-suppressing tRNAs [274]. Moreover, the use of EPL and/or PTS methods for the production of circular polypeptides has also made possible the introduction of NMR-active isotopes like 15N or 13C into the polypeptide backbone, thus allowing the use of NMR for the study of structure-function relationships in proteins [275] and circular polypeptides like MCoTI-I [276].
The TAIL method for intein-mediated polypeptide ligation was recently developed by Thompson et al. The method, named TAIL for transpeptidase-assisted intein ligation, is a chemoenzymatic ligation method in which synthetic polypeptides can be added to an intein fragment prior to its assembly into an active intein and the initiation of a trans-splicing reaction with a recombinant protein-intein fusion [277]. The approach, illustrated in Figure 10, consists of two independent steps. During the first step, a functional split-intein fragment is generated through enzymatic transpeptidation of a truncated intein and a synthetic peptide bearing a short polypeptide overhang that completes the intein sequence. In the second step, the newly generated split-intein fragment undergoes protein trans-splicing with a recombinant protein fused to the cognate split-intein fragment. Thus, any type of post-translational modification can be added to a polypeptide via semisynthesis, provided that the modified residue is included in the synthetic peptide.
Thompson et al (2019) used this approach to introduce chemical modifications and biochemical probes into several proteins including Cas9 nuclease and the transcriptional regulator MeCP2 [277].
Intein-based approaches used to perform protein ligation reactions, like EPL and PTS, are not only used for backbone circularization, but also for site-specific modification of proteins. Fluorescent dyes, biotin and radioisotopes are among the most used protein labels. Such labels can be applied to any sidechain including the N and C-terminal residues.
Kurpiers and Mootz attached a short cysteine-containing peptide to the C-terminus of a protein by PTS using either the Ssp DnaB [278] or the M. xenopi GyrA intein (Mxe GyrA) [279]. The C-intein fragments, which contained a Ser or a Thr as their C-terminal reactive nucleophilic residue, were fused to an extein sequence containing a single Cys residue and were labeled with probes such as fluorescein-iodoacetamide and polyethylene glycol (PEG)5000-maleimide, without loss of protein function. Mootz and co-workers (2009) were also able to introduce modifications into the N-terminal fragment of the E. coli porin OmpF [280] by using the split Psp-GBD Pol intein, which allowed the reconstitution of an active porin with altered conductance properties. This study paved the way for the site-specific modification of other membrane proteins like the human transferrin receptor [281].
Split-inteins were used to add fluorescein and biotin labels to the N- or C-termini of other proteins: the red fluorescent protein [282] or the maltose binding protein (MBP) [283], among others [284]. For example, biotin was added to the C-terminus of MBP: the biotin was first linked to the Lys residue of the native Ssp DnaE C-extein sequence (CFNK) during chemical synthesis of the C-intein-containing peptide, and the biotinylated peptide was then incubated with a recombinant MBP fused to the N-intein, allowing transfer of the biotin to the C-terminus of MBP by PTS. This protein modification method formed the basis for the selective immobilization of proteins on PTS-based microarray platforms [283] and was followed by others with similar potential [281].
Expressed protein ligation (EPL) was used to incorporate fluorescent labels into proteins for ligand binding studies in which either changes in the fluorescence of single probes or the fluorescence resonance energy transfer (FRET) between two probes are measured. For example, Szewczuk et al (2008) used EPL and FRET spectroscopy to study the circadian rhythm enzyme serotonin N-acetyltransferase [285], while Xie et al (2009) used EPL to fluorescently label histone acetyltransferase (HAT) proteins and FRET to identify HAT-specific inhibitors [286].
The covalent conjugation of quantum dots (QDs) to protein termini was also achieved by using split-inteins. For example, Charalambous and coworkers described an intein-based method to site-specifically conjugate QDs to target proteins in vivo and demonstrated it by labeling the C-terminus of a pleckstrin homology domain [287].
The PTS method was also used to insert radioisotopes into proteins or to introduce unlabeled protein tags into isotopically labeled proteins for NMR studies [275, 288]. Individual protein domains are commonly studied by NMR spectroscopy in order to overcome the problem of signal overlap characteristic of large proteins. Thus, isotope labeling of protein domains can be very useful. Segmental protein isotope labeling can be achieved by PTS when one of the split-intein-protein fusions is expressed in isotope-containing culture medium, while the other split-intein fragment is expressed separately in media without isotopes [288]. This approach proved successful in the elucidation of the working mechanism of the F1-ATPase, a large molecular motor protein [289].
Selenocysteine (Sec), known as the 21st amino acid, is encoded by a UGA codon in both prokaryotes and eukaryotes and is incorporated into the active site of several proteins during translation in a process known as recoding [290]. The UGA codon is generally read as a stop codon, and a specialized mRNA sequence, the Selenocysteine Insertion Sequence (SECIS), is required for Sec incorporation [291-293]. Because eukaryotes and prokaryotes have different recoding machineries, it is often challenging to produce selenoproteins in host systems. Thus, incorporation of Sec residues is achieved by chemical conversion of reactive Ser residues, by native chemical ligation (NCL) or by intein-mediated protein-ligation reactions (IPL). In IPL the Sec-containing moiety is produced synthetically, while the remaining protein fragment is expressed as a recombinant protein and purified by intein-mediated protein purification in a process in which intein activity leads to the formation of a link between the target protein and the Sec-containing moiety [294].
An improved method for intein-mediated Sec incorporation into proteins has been developed by Arner et al [295]. This method, named SECTEIN, exploits the bacterial host translation machinery to produce selenoproteins. Because in bacteria, the UGA codon is translated into Sec only if a SECIS element is located downstream from the UGA codon, in SECTEIN, a SECIS element was introduced at the N-terminus of the splicing P. chrysogenum PRP8 intein fused with the N-terminus of the selenoprotein that acts as an N-extein and contains the UGA Sec codon (Figure 11). This method does not require the chemical production of a Sec-protein moiety and permits the insertion of Sec residues anywhere along the protein sequence since the SECIS element naturally directs to the UGA codon. Through protein splicing, the Sec-containing N-extein is fused to the C-extein to form a mature selenoprotein, while the SECIS element is excised with the intein.
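The recoding rule that SECTEIN relies on (UGA is read as Sec only when a SECIS element is available, and as a stop codon otherwise) can be sketched with a toy translator. This is a simplified illustration under stated assumptions: the codon table is deliberately incomplete, the example mRNA is hypothetical, and the presence of a downstream SECIS element is reduced to a boolean flag rather than modeled as an RNA structure.

```python
# Deliberately minimal codon table; "*" marks a stop codon.
CODON_TABLE = {"AUG": "M", "GCU": "A", "GGU": "G", "UGA": "*", "UAA": "*", "UAG": "*"}


def translate(mrna: str, secis_present: bool) -> str:
    """Translate codon by codon, recoding UGA as Sec ('U') if SECIS is present."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon == "UGA" and secis_present:
            protein.append("U")  # one-letter code for selenocysteine
            continue
        residue = CODON_TABLE[codon]
        if residue == "*":
            break  # ordinary stop codon terminates translation
        protein.append(residue)
    return "".join(protein)


mrna = "AUGGCUUGAGGUUAA"  # hypothetical message with an in-frame UGA
print(translate(mrna, secis_present=True))
print(translate(mrna, secis_present=False))
```

Without the SECIS flag the same message yields a truncated product, which is the failure mode that placing the SECIS element inside the intein is meant to avoid.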
CZ Chung et al designed a Sec-dependent intein system to serve as a reporter for Sec incorporation [296].
EPL has been used for the incorporation of non-natural amino acids into proteins for structure-function studies. For example, Valiyaveetil et al (2002) used a non-natural D-amino acid to study the ion selectivity function of the bacterial membrane channel KcsA [239]. KcsA is a potassium channel that permits the rapid and selective conduction of potassium ions across cellular membranes. Its selectivity filter includes a glycine residue (Gly77), which exists in a left-handed helical conformation and was considered essential for maintaining the correct helical conformation of the structure. Valiyaveetil et al replaced Gly77 with D-Ala without affecting the structure and the conductive properties of the filter [239, 297].
Expressed protein ligation was also used in conjunction with other labeling methods to insert protein domains that include non-canonical amino acids into the full-length molecule. An example can be found in the work of Muralidharan et al. In this work, the authors used EPL to ligate two Src homology domains from the c-Crk-I protein, one of which was prelabeled with the Trp analogue 7-azatryptophan (7AW), to generate a larger protein with a domain specific label. Since 7AW acts as a fluorescent probe, the combination of protein ligation and labeling techniques allowed for the study of the ligand-binding properties of the conserved Src homology domains [241].
Structural studies of proteins based on x-ray crystallography or NMR spectroscopy require ultrapure, highly homogeneous protein samples [298-300]. E. coli is an excellent protein expression system that can provide significant yields of purified protein at a low cost; however, the addition of correct post-translational modifications remains a problem, as natural post-translational modifications in E. coli are rare and differ from those in eukaryotes. Some engineering methods have been used to introduce eukaryotic glycosylation in proteins expressed in E. coli [301], but other modifications are still difficult to insert in host systems. Modifications include glycosylation and lipoglycosylation, phosphorylation, acetylation, biotinylation, ubiquitination, isotope labeling, and others [302, 303].
Hackenberger et al combined NCL and the intein-based IMPACT system to produce glycosylated immunity protein Im7 and address the role of glycosylation in protein folding [304]. Becker and co-workers used a combination of NCL and PTS to attach lipid molecules to specific sites in the mouse prion protein PrP. In brief, they produced the murine PrP protein in E. coli as a fusion with the N-terminal fragment of the DnaE split-intein (DnaEN) and synthesized the C-terminal fragment of DnaE linked to a GPI anchor-mimicking peptide. The two DnaE fragments associate to form a functional intein, which acts to generate the desired modified rPrP protein [305]. A less common but interesting modification, found in many eukaryotic small GTPases, is prenylation (also known as isoprenylation or lipidation). A method that combines recombinant protein production, chemical synthesis of lipidated peptides and peptide-to-protein ligation has been developed [306] and used for the detailed understanding of the mechanism of the prenylation reaction [307] and the delivery of Ras or Rab GTPases to target membranes [308-310].
Another common post-translational modification in eukaryotes is the phosphorylation of Ser, Thr or Tyr residues. EPL was used to phosphorylate the Csk kinase at one of the C-terminal Tyr residues [311], to create a tetra-phosphorylated transforming growth factor-beta (TGF-β) receptor [312] and other phosphorylated components of the TGF-β signaling pathways, like Smad proteins [313], or to prepare phosphorylated histones [314]. The use of this technique was essential for obtaining high-resolution crystal structures of the phosphorylated proteins that helped decipher the role of these post-translational modifications in signaling activity [315, 316]. Similarly, the preparation of phosphorylated [314], acetylated [317, 318], methylated [317] or ubiquitylated [319] histones led the way to deciphering the role of nucleosomal histone modifications in chromatin structure and function.
Protein microarrays, also known as protein chips, have emerged as important tools in biochemistry and molecular biology. They are miniaturized, parallel assay systems that contain small amounts of purified proteins immobilized on slides [320-322] and can be used in a variety of analytical and functional applications including screening protein-protein interactions, proteomics research, drug discovery, and diagnostics [323-325]. Inteins have been used to immobilize biologically active proteins on such solid supports via protein trans-splicing, without prior purification of the protein to be attached [283]. Because the method requires only very dilute samples (≈1 µM), it can be used for the immobilization of proteins from complex mixtures such as cellular lysates or cell-free expression systems. Girish et al reported a bacterium-based, intein-mediated strategy to generate N-terminal cysteine-containing proteins, which are immobilized onto a glass slide to generate the corresponding protein microarray [245]. In addition, they provided preliminary data for a yeast-based intein-mediated protein-immobilization technique [245]. Lesaicherre et al used an intein-mediated expression system to generate biotinylated proteins suitable for immobilization onto avidin-functionalized glass slides [326]; this system has been further improved and adapted for future use in high-throughput proteomics [327, 328].
In addition to solid supports, inteins have been used to immobilize functional recombinantly expressed proteins on membranes such as liposomes or lipid-coated nanoparticles. Chu et al developed a method to anchor recombinant proteins into membrane structures. They used a double-palmitoylated peptide and protein trans-splicing to immobilize proteins fused to split-intein segments into functional vesicles and membrane-coated silica nanoparticles [326].
The most significant advantage of the PTS protein-ligation technique for protein semi-synthesis is that it can be readily applied in vivo. A library of split inteins that enable modular multi-peptide assemblies in vivo has been evaluated [329]. PTS can be used to insert site-specific labels in proteins in vivo. For example, Giriat and Muir described a method that allows ligation of synthetic molecules to target proteins in an intracellular environment [330]. In brief, they tagged a cellular protein with one-half of a split-intein and linked the complementary half in vitro to a synthetic probe. Association of the intein halves after the two constructs were delivered to the cytoplasm triggered protein trans-splicing, resulting in the formation of a new peptide bond and the ligation of the probe to the target protein. The authors showed that the process was specific and applicable to both cytosolic and membrane proteins [330]. In addition, Borra et al developed a technique for in-cell protein labeling and tracking based on the use of fluorescence resonance energy transfer (FRET)-quenched DnaE split-inteins. They used this approach to label the DNA binding domain (DBD) of the transcription factor YY1 in human cell lines and to demonstrate that this method of protein modification permits the monitoring of protein localization and biological activity [331]. Volkman and Liu used PTS with the Ssp GyrB split-intein to label the C terminus of the human transferrin receptor with 5-carboxy-fluorescein on the surface of Chinese hamster ovary (CHO) cells [281]. Ando et al added a biotin tag to the N-terminus of the monomeric red fluorescent protein (mRFP) via PTS on the surface of the same type of cells [282], while Dhar and Mootz used the naturally split Npu DnaE intein to ligate an exogenous polypeptide to membrane proteins on living mammalian cells [284].
The C-terminal intein fragment of the non-canonical Ssp DnaB S1 split-intein has been used as a site-specific protease for in vivo protein cleavage both in bacterial and eukaryotic cells [332]. In brief, the protease recognition sequence containing the 11-residue IN fragment and five native N-extein residues was co-expressed with the recognizing intein-derived protease (included in the IC fragment). Upon in trans interaction, the two fragments form a fully active splicing intein that cleaves the target protein. The intein-derived protease proved to be highly specific, demonstrating extremely low activity toward other cellular proteins not containing the recognition sequence [332].
Inteins have been used in vivo as genetic markers [333]. Ramsden et al provided a method to interrupt an intein with a selectable marker. They incorporated specific markers into the Pch PRP8 intein in place of the endonuclease without affecting splicing, thus providing genetic selection for the intein, and coupled the marked intein with GFP as the N-terminal extein. In this way, they created a cassette that can introduce a GFP label within any targeted protein in a single step [333].
Besides being a great tool for in vivo protein labeling, monitoring and manipulation, split-inteins have been used for a variety of other applications in complex biological systems. One application is gene therapy, a therapeutic procedure in which large inactive viral vectors are used to deliver therapeutic genes [334]. Li et al used protein trans-splicing to circumvent the packaging size limit of gene therapy vectors. They demonstrated that a large therapeutic gene can be split into fragments, fused to split-intein genes and delivered as two smaller packages within two different AAV vectors. Following delivery, the two fragments can be reassembled in vivo by PTS to generate the full therapeutic gene product [335]. More recently, a large coding sequence for the full-length dystrophin protein was delivered into mouse muscles and neurons via triple adeno-associated virus (AAV) vector-mediated trans-splicing [336]. Moreover, multiple AAV vectors, each encoding one fragment of a target protein flanked by short split-inteins, were delivered to the retina of mice and pigs or to human retinal organoids, where they were used to reconstitute the full-length protein via protein trans-splicing [336]. This type of large protein reconstitution led to an improved outcome in two mouse models of retinal diseases [337]. JM Levy et al used a split-intein strategy to deliver base editors to mouse tissues via adeno-associated viruses [338].
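The planning step behind such multi-vector delivery, deciding how many fragments a coding sequence must be split into so that each fits one vector, can be sketched as simple arithmetic. All numbers below are illustrative assumptions rather than values from the cited studies: the packaging capacity, the per-vector overhead (promoter, split-intein coding sequence and other elements) and the coding-sequence length are hypothetical, and the sketch ignores the choice of biologically suitable split sites.

```python
import math
from typing import List


def plan_fragments(cds_length: int, capacity: int, overhead: int) -> List[int]:
    """Split a coding sequence into the fewest near-equal fragments that fit."""
    usable = capacity - overhead  # room left for the gene fragment itself
    if usable <= 0:
        raise ValueError("overhead leaves no room for a gene fragment")
    n_vectors = math.ceil(cds_length / usable)
    base, extra = divmod(cds_length, n_vectors)
    # Distribute the remainder so fragment lengths differ by at most one.
    return [base + 1] * extra + [base] * (n_vectors - extra)


# Hypothetical figures: 11,000 nt coding sequence, 4,700 nt capacity,
# 800 nt of regulatory and intein sequence per vector.
fragments = plan_fragments(cds_length=11_000, capacity=4_700, overhead=800)
print(len(fragments), fragments)
```

With these assumed figures the sequence no longer fits two vectors and a three-vector design is required, which is the situation the triple-vector approach described above addresses.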
Inteins, Cre recombinase and PTS were used in a strategy that allows the genetic modification of specific eukaryotic cell types [339]. In this work, the authors engineered two split-intein-fused inactive fragments of Cre recombinase under the control of different conserved human enhancer elements and incorporated this engineered DNA into mice. They showed that the Cre recombinase was only assembled and active in cells where both enhancers activated gene transcription, and they succeeded in specifically modifying certain cell types in transgenic mice. Similarly, the recombination activity of split-Cre constructs was used for transgene excision in transgenic Arabidopsis plants [340]. More recently, Wang et al used a similar split-intein strategy to create a new method to control transgene expression in C. elegans [341]. In this work, the DNA binding domain and transcriptional activation domain of the transcription activator cGAL were split and fused to the N terminus of the gp41-1 N-intein and the C terminus of the gp41-1 C-intein, respectively. When both halves of cGAL are expressed, a functional cGAL is reconstituted via intein-mediated protein splicing. Thus, genetic access to each individual cell type can be achieved and controlled.
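The intersectional logic of these split-protein designs amounts to a Boolean AND gate: the reconstituted protein is active only in cells where both halves are expressed. The minimal sketch below makes this explicit; the function name and the enhancer labels are hypothetical and the model ignores expression levels and splicing efficiency.

```python
from itertools import product


def reconstituted_active(enhancer_a_on: bool, enhancer_b_on: bool) -> bool:
    """Each enhancer drives one split-intein-fused half; both are required."""
    n_half_expressed = enhancer_a_on
    c_half_expressed = enhancer_b_on
    return n_half_expressed and c_half_expressed


# Truth table over the four possible cell types.
for a_on, b_on in product([False, True], repeat=2):
    print(a_on, b_on, reconstituted_active(a_on, b_on))
```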
Conditional protein splicing (CPS) refers to the activation or inhibition of protein splicing by an external factor, such as a small molecule, light, temperature, pH or a change in redox state [197, 342]. The ability to regulate inteins, and therefore protein splicing, makes it possible to regulate the activity of target proteins in vivo.
One way of regulating inteins is by small-molecule ligand-induced PTS. Mootz and Muir engineered a split S. cerevisiae vacuolar ATPase subunit (VMA) intein, whose in trans splicing activity between two polypeptides was triggered by the small molecule rapamycin [343], and showed that their conditional protein splicing system can be used in mammalian cells [197, 344]. Further application of the system led to the control of the enzymatic function of the firefly luciferase by chemically induced dimerization [345] and the identification of tobacco etch virus (TEV) protease variants that are conditionally activated by rapamycin [346]. Skretas and Wood designed an intein-based protein switch whose splicing activity is conditionally triggered in vivo by the presence of thyroid hormone or synthetic analogs and showed that several E. coli proteins could be inactivated by intein insertion and reactivated by ligand-induced splicing upon addition of thyroid hormone [347]. Buskirk et al used directed evolution in S. cerevisiae to select active ligand-dependent inteins from inteins whose natural splicing activity had been blocked by the insertion of a specific ligand-binding domain [348]. The concept was further extended to bacteria and mammalian cells [349, 350]. Moreover, an evolved 4-hydroxytamoxifen-responsive intein was inserted in the Cas9 nuclease to allow for the conditional modification of specific genomic sites [351], while an estrogen-sensitive VMA intein was created by replacing the endonuclease region of VMA with the estrogen binding region of the human estrogen receptor to create an efficient estrogen screening tool [352].
Protein splicing can also be modulated by temperature, light or pH. Several inteins, like the intein encoded in the DNA polymerase I gene of the hyperthermophile T. litoralis [353], the intein in the DNA polymerase gene from Pyrococcus [354], or the yeast VMA intein [355], were shown to be naturally controlled by temperature. Other temperature-sensitive inteins were developed in the laboratory. Zeidler et al inserted the naturally temperature-sensitive VMA intein from S. cerevisiae into a galactose regulator protein to obtain a temperature-dependent regulator of the Gal4/upstream activation sequence (UAS) in Drosophila melanogaster [355].
Light has also been used to regulate the activity of inteins [342]. This was achieved either by the fusion of a photodimerization domain to an intein or by photocaging. For example, Tyszkiewicz and Muir fused a photodimerization system from A. thaliana to an artificially split S. cerevisiae intein, leading to rapid activation of protein splicing and release of a new protein product [194]. Berrade et al introduced two photocleavable protecting groups onto the backbone of the C-intein polypeptide of the Ssp DnaE split-intein to control its trans-splicing activity [195]. Vila-Perrello et al inserted an O-acyl linkage at the Ser35 side chain of the C-terminal fragment of the DnaE intein to abolish protein splicing, which could be recovered either by proteolytic or by photochemical removal of specific protecting groups [193]. Moreover, Ninschik et al used a photocaged N-terminal intein fragment and a C-terminal intein fused protein, staphylocoagulase, that activates prothrombin when light is applied [196].
Protein splicing side reactions may also be controlled by changes in pH. Such inteins have utility in affinity protein chromatography [356] and can provide insights into the structural and functional roles of some conserved residues in protein splicing.
Because most inteins initiate N-terminal cleavage by an N-S acyl shift that involves a free sulfhydryl group at the intein N-terminus, blocking and releasing this sulfhydryl group of Cys1 allows conditional protein splicing controlled by changes in redox conditions.
For example, the intein that interrupts the large subunit of DNA polymerase II from M. marisnigri (Mma) displays lower protein splicing activity under nonreducing conditions because of the formation of a disulfide bond between two internal intein Cys residues [357]. Zhu et al improved the protein trans-splicing activity in a dual-vector factor VIII (FVIII) gene delivery system by replacing key amino acids with Cys and controlling the redox environment [358]. In vivo, this redox sensitivity can indicate differential activity in different strains or in different cell compartments which might have physiological and therapeutic effects.
In vitro, unspliced precursors can be isolated when split-intein fragments interact under oxidizing conditions and activity is induced by the addition of reducing agents. Mills et al purified the N and C-terminal segments of the M. tuberculosis RecA intein and reconstituted the splicing elements in an inactive disulfide-linked complex of the two fragments. The complex was activated by addition of reducing agents and reduction of the disulfide bonds [226]. Redox controlled inteins can be used in biotechnology applications. Cys residues can be introduced at specific sites in intein or extein segments in order to control the splicing activity as it was shown by the development of the IMPACT protein purification system [359] or in mechanistic studies [191]. Callahan et al engineered inteins that function as redox-responsive switches in bacteria by inserting a disulfide bond between the intein's catalytic cysteine and a cysteine in the extein sequence [360].
Split-inteins have also been conditioned to function after the addition of proteases whose role is to release a protein or peptide that inhibits the intein. As described above, Vila-Perrello et al inserted an O-acyl linkage at the Ser35 side chain of the C-terminal fragment of the DnaE intein to abolish protein splicing, which could be recovered either by proteolytic or by photochemical removal of specific protecting groups [193]. Moreover, Gramespacher et al created intein “zymogens”, inteins that splice in the presence of a protease [200], and applied these constructs to create protein sensors responsive to various stimuli.
The characterization and application of conditional splicing reactions led to the development of intein-based biosensors. Such biosensors are composed of three independent protein domains that have specific functions: sensing, signal transduction and output signal release. The sensing module recognizes a signal of interest and induces a change in the splicing activity of the intein module, which acts as a signal transducer. The change in intein activity leads to a change in the activity of the target protein, which acts as a reporter. Since the three modules are independent proteins, they can be chosen such that a large variety of sensors with various sensor and reporter modules can be built for diverse applications: detection of protein-protein interactions, changes in DNA methylation patterns, small molecules, protease activity and the redox state of the cell.
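The modular architecture described above can be expressed as a composition of three interchangeable functions. The sketch below is a conceptual illustration only: `build_biosensor`, the ligand threshold and the on/off reporter strings are hypothetical, and real sensors respond in a graded rather than a strictly binary manner.

```python
from typing import Callable


def build_biosensor(
    sense: Callable[[float], bool],
    transduce: Callable[[bool], bool],
    report: Callable[[bool], str],
) -> Callable[[float], str]:
    """Chain sensing, intein (transducer) and reporter modules into one sensor."""
    def biosensor(signal: float) -> str:
        triggered = sense(signal)       # sensing module recognizes the signal
        spliced = transduce(triggered)  # intein module changes splicing activity
        return report(spliced)          # reporter activity follows splicing
    return biosensor


# Hypothetical modules chosen for illustration.
def ligand_sensor(concentration: float) -> bool:
    return concentration >= 1.0  # arbitrary threshold


def splices_when_triggered(triggered: bool) -> bool:
    return triggered


def on_off_reporter(spliced: bool) -> str:
    return "reporter ON" if spliced else "reporter OFF"


sensor = build_biosensor(ligand_sensor, splices_when_triggered, on_off_reporter)
print(sensor(0.2), "|", sensor(5.0))
```

Because each module is passed in as an argument, swapping the sensing function (for example, for one that responds to redox state) yields a different sensor without changing the other two parts, which is the design property the text emphasizes.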
Biosensors to detect protein-protein interactions consist of two fusion proteins, each formed of a split-intein fragment and an interacting reporter protein fragment. When the two protein fragments interact, they bring the split-intein fragments together, leading to their activation [361]. Several reporter gene assays were developed based on split-intein systems. A known example is the development of a firefly luciferase reporter gene assay for detecting Ras-Raf-1 interactions, based on protein splicing of transcription factors with DnaE inteins, by Kanno et al (2006, 2009) [361, 362]. A similar biosensor was developed by Huang et al for reporting changes in DNA methylation in living cells. This biosensor consisted of two zinc finger domains fused to half inteins and to split-luciferase domains that can interact and emit luminescence after binding of two adjacent DNA fragments. The sensor was used to report changes in DNA silencing in human cells [363]. Sensors for small ligand detection by nuclear receptors were obtained from small molecule-controlled inteins (see the section on conditional protein splicing above). Wood and coworkers used inteins as allosteric transmitters between receptors and reporter molecules [347, 364]. The intein-based protein switch consists of a four-domain fusion protein in which a nuclear receptor inserted in the non-splicing M. tuberculosis RecA intein is fused to the E. coli maltose binding protein and a thymidylate synthase reporter from bacteriophage T4 [365]; it was used for the detection of diverse receptor ligands [364, 366-369]. A different allosterically regulated splicing-dependent biosensor used a modified estrogen-sensitive S. cerevisiae VMA intein (VMA(ER)), obtained by replacement of the endonuclease region with the human estrogen receptor α and inserted in the lacZ gene. The resulting intein was activated by estrogenic ligands and produced an active β-galactosidase reporter [348, 370].
In vivo intein-induced cyclization of luciferase has been used to generate a biosensor for caspase-3 protease activity [371]. The firefly luciferase gene was fused to a caspase-3 recognition sequence and cyclized by the inverted DnaE split-intein. When the protease is inactive, the cyclized luciferase has low activity because of steric hindrance, but its activity is restored upon caspase-dependent cleavage. This biosensor enabled the real-time quantitative sensing of caspase-3 activity in mice [371].
The ability to control disulfide bonding in split-inteins led to the construction of bacterial redox sensors. For example, the splicing activity of the yeast DnaE intein could be controlled with an engineered disulfide trap that rendered it inactive under oxidizing conditions and active in a reducing environment [199, 360]. This sensor was fused to a FRET reporter and used to identify hyperoxic E. coli mutants [199].
Application | System | Advantage | Disadvantage | References |
---|---|---|---|---|
Protein purification | ΔI-CM mini-intein systems | Self-cleaving affinity tags; no need for purified proteases. Inducible cleavage controlled by pH, temperature, light, ligand, protease or reducing agents. No denaturation with non-thiol inducer-based systems. Not thiol-dependent | Reducing agents can lead to protein destabilization, unfolding and precipitation | [171, 372] |
Protein purification | Elastin-like polypeptide (ELP)-tag system | Great for purification of peptides | Involves protein aggregation and precipitation | [205, 206] |
Protein purification | Chitin-binding domain (CBD) tag systems: IMPACT, IMPACT-CN, pTWIN, pTWIN2 | Thiol and temperature or pH controlled; second-generation systems use two CBD-bound inteins and stepwise protein cleavage; adaptable to large-scale purification | | [212, 223] |
Protein purification | Polyhydroxybutyrate (PHB)-tag system | Not thiol-dependent; permits purification of large amounts of protein | | [213, 216] |
Protein modification | Expressed protein ligation (EPL) and protein trans-splicing (PTS) | Performed post-translationally in vitro or in vivo | | |
Protein modification | Expressed protein ligation (EPL) and protein trans-splicing (PTS) | Permit protein ligation and cyclization, leading to increased stability and specific activity | | [222, 237, 238] |
Protein modification | Expressed protein ligation (EPL) and protein trans-splicing (PTS) | Insertion of non-natural amino acids, active isotopes or other chemical modifications or probes | | [274, 277] |
Protein modification | Expressed protein ligation (EPL) and protein trans-splicing (PTS) | Allows the immobilization of proteins on microarray platforms or on lipid structures | | [281] |
Protein modification | Intein-mediated protein ligation (ITL) | Production of selenoproteins | | [294, 295] |
In vivo protein cleavage and regulation of protein function | | Permits control of protein activity in vivo | | [373, 374] |
In vivo protein cleavage and regulation of protein function | | Highly specific protease activity | Requires engineering of specific recognition sequences | [332] |
Genetic material carriers | | Good tools for genetic protein labeling | | [333] |
Genetic material carriers | | Allow delivery of large DNA fragments and the in-cell reconstruction of genetic material | | [336] |
DNA recombination control | | Permits the control of assembly and activity of recombinases in vivo, thus facilitating the control of genetic recombination events | | [339, 341] |
Biosensors | | Large variety of applications; relatively easy to engineer | | [373] |
- Hirata R, Ohsumi Y, Nakano A, Kawasaki H, Suzuki K, Anraku Y. Molecular structure of a gene, VMA1, encoding the catalytic subunit of H(+)-translocating adenosine triphosphatase from vacuolar membranes of Saccharomyces cerevisiae. J Biol Chem. 1990;265:6726-33 pubmed
- Perler F, Davis E, Dean G, Gimble F, Jack W, Neff N, et al. Protein splicing elements: inteins and exteins--a definition of terms and recommended nomenclature. Nucleic Acids Res. 1994;22:1125-7 pubmed
- Anraku Y, Mizutani R, Satow Y. Protein splicing: its discovery and structural insight into novel chemical mechanisms. IUBMB Life. 2005;57:563-74 pubmed
- Noren C, Wang J, Perler F. Dissecting the Chemistry of Protein Splicing and Its Applications. Angew Chem Int Ed Engl. 2000;39:450-466 pubmed
- Perler F. InBase: the Intein Database. Nucleic Acids Res. 2002;30:383-4 pubmed
- Fukui T, Atomi H, Kanai T, Matsumi R, Fujiwara S, Imanaka T. Complete genome sequence of the hyperthermophilic archaeon Thermococcus kodakaraensis KOD1 and comparison with Pyrococcus genomes. Genome Res. 2005;15:352-63 pubmed
- Riera J, Robb F, Weiss R, Fontecave M. Ribonucleotide reductase in the archaeon Pyrococcus furiosus: a critical enzyme in the evolution of DNA genomes?. Proc Natl Acad Sci U S A. 1997;94:475-8 pubmed
- Maeder D, Weiss R, Dunn D, Cherry J, Gonzalez J, DiRuggiero J, et al. Divergence of the hyperthermophilic archaea Pyrococcus furiosus and P. horikoshii inferred from complete genomic sequences. Genetics. 1999;152:1299-305 pubmed
- Fsihi H, Vincent V, Cole S. Homing events in the gyrA gene of some mycobacteria. Proc Natl Acad Sci U S A. 1996;93:3410-5 pubmed
- Cole S, Eiglmeier K, Parkhill J, James K, Thomson N, Wheeler P, et al. Massive gene decay in the leprosy bacillus. Nature. 2001;409:1007-11 pubmed
- Guillemin I, Cambau E, Jarlier V. Sequences of conserved region in the A subunit of DNA gyrase from nine species of the genus Mycobacterium: phylogenetic analysis and implication for intrinsic susceptibility to quinolones. Antimicrob Agents Chemother. 1995;39:2145-9 pubmed
- Matrat S, Petrella S, Cambau E, Sougakoff W, Jarlier V, Aubry A. Expression and purification of an active form of the Mycobacterium leprae DNA gyrase and its inhibition by quinolones. Antimicrob Agents Chemother. 2007;51:1643-8 pubmed
- Perler F, Comb D, Jack W, Moran L, Qiang B, Kucera R, et al. Intervening sequences in an Archaea DNA polymerase gene. Proc Natl Acad Sci U S A. 1992;89:5577-81 pubmed
- Hodges R, Perler F, Noren C, Jack W. Protein splicing removes intervening sequences in an archaea DNA polymerase. Nucleic Acids Res. 1992;20:6153-7 pubmed
- Chute I, Hu Z, Liu X. A topA intein in Pyrococcus furiosus and its relatedness to the r-gyr intein of Methanococcus jannaschii. Gene. 1998;210:85-92 pubmed
- Matsumiya S, Ishino S, Ishino Y, Morikawa K. Physical interaction between proliferating cell nuclear antigen and replication factor C from Pyrococcus furiosus. Genes Cells. 2002;7:911-22 pubmed
- Ishino S, Oyama T, Yuasa M, Morikawa K, Ishino Y. Mutational analysis of Pyrococcus furiosus replication factor C based on the three-dimensional structure. Extremophiles. 2003;7:169-75 pubmed
- Oyama T, Ishino Y, Cann I, Ishino S, Morikawa K. Atomic structure of the clamp loader small subunit from Pyrococcus furiosus. Mol Cell. 2001;8:455-63 pubmed
- Cole S, Brosch R, Parkhill J, Garnier T, Churcher C, Harris D, et al. Deciphering the biology of Mycobacterium tuberculosis from the complete genome sequence. Nature. 1998;393:537-44 pubmed
- Ghim S, Choi S, Shin B, Park S. An 8 kb nucleotide sequence at the 3' flanking region of the sspC gene (184 degrees) on the Bacillus subtilis 168 chromosome containing an intein and an intron. DNA Res. 1998;5:121-6 pubmed
- isomerizing.
- Slesarev A, Mezhevaya K, Makarova K, Polushin N, Shcherbinina O, Shakhova V, et al. The complete genome of hyperthermophile Methanopyrus kandleri AV19 and monophyly of archaeal methanogens. Proc Natl Acad Sci U S A. 2002;99:4644-9 pubmed
- Fujikane R, Ishino S, Ishino Y, Forterre P. Genetic analysis of DNA repair in the hyperthermophilic archaeon, Thermococcus kodakaraensis. Genes Genet Syst. 2010;85:243-57 pubmed
- Eichinger L, Pachebat J, Glockner G, Rajandream M, Sucgang R, Berriman M, et al. The genome of the social amoeba Dictyostelium discoideum. Nature. 2005;435:43-57 pubmed
- Kaneko T, Tanaka A, Sato S, Kotani H, Sazuka T, Miyajima N, et al. Sequence analysis of the genome of the unicellular cyanobacterium Synechocystis sp. strain PCC6803. I. Sequence features in the 1 Mb region from map positions 64% to 92% of the genome. DNA Res. 1995;2:153-66, 191-8 pubmed
- Kaneko T, Sato S, Kotani H, Tanaka A, Asamizu E, Nakamura Y, et al. Sequence analysis of the genome of the unicellular cyanobacterium Synechocystis sp. strain PCC6803. II. Sequence determination of the entire genome and assignment of potential protein-coding regions. DNA Res. 1996;3:109-36 pubmed
- Wu H, Xu M, Liu X. Protein trans-splicing and functional mini-inteins of a cyanobacterial dnaB intein. Biochim Biophys Acta. 1998;1387:422-32 pubmed
- Mathys S, Evans T, Chute I, Wu H, Chong S, Benner J, et al. Characterization of a self-splicing mini-intein and its conversion into autocatalytic N- and C-terminal cleavage elements: facile production of protein building blocks for protein ligation. Gene. 1999;231:1-13 pubmed
- Cohen G, Barbe V, Flament D, Galperin M, Heilig R, Lecompte O, et al. An integrated analysis of the genome of the hyperthermophilic archaeon Pyrococcus abyssi. Mol Microbiol. 2003;47:1495-512 pubmed
- Huang C, Wang S, Chen L, Lemieux C, Otis C, Turmel M, et al. The Chlamydomonas chloroplast clpP gene contains translated large insertion sequences and is essential for cell growth. Mol Gen Genet. 1994;244:151-9 pubmed
- Wang S, Liu X. Identification of an unusual intein in chloroplast ClpP protease of Chlamydomonas eugametos. J Biol Chem. 1997;272:11869-73 pubmed
- Telenti A, Southworth M, Alcaide F, Daugelat S, Jacobs W, Perler F. The Mycobacterium xenopi GyrA protein splicing element: characterization of a minimal intein. J Bacteriol. 1997;179:6378-82 pubmed
- Klabunde T, Sharma S, Telenti A, Jacobs W, Sacchettini J. Crystal structure of GyrA intein from Mycobacterium xenopi reveals structural basis of protein splicing. Nat Struct Biol. 1998;5:31-6 pubmed
- Kenny D, Jurata L, Saga Y, Gill G. Identification and characterization of LMO4, an LMO gene with a novel pattern of expression during embryogenesis. Proc Natl Acad Sci U S A. 1998;95:11257-62 pubmed
- Sugihara T, Bach I, Kioussi C, Rosenfeld M, Andersen B. Mouse deformed epidermal autoregulatory factor 1 recruits a LIM domain factor, LMO-4, and CLIM coregulators. Proc Natl Acad Sci U S A. 1998;95:15418-23 pubmed
- Grutz G, Forster A, Rabbitts T. Identification of the LMO4 gene encoding an interaction partner of the LIM-binding protein LDB1/NLI1: a candidate for displacement by LMO proteins in T cell acute leukaemia. Oncogene. 1998;17:2799-803 pubmed
- Gerhard D, Wagner L, Feingold E, Shenmen C, Grouse L, Schuler G, et al. The status, quality, and expansion of the NIH full-length cDNA project: the Mammalian Gene Collection (MGC). Genome Res. 2004;14:2121-7 pubmed
- Sum E, Peng B, Yu X, Chen J, Byrne J, Lindeman G, et al. The LIM domain protein LMO4 interacts with the cofactor CtIP and the tumor suppressor BRCA1 and inhibits BRCA1 activity. J Biol Chem. 2002;277:7849-56 pubmed
- Deane J, Mackay J, Kwan A, Sum E, Visvader J, Matthews J. Structural basis for the recognition of ldb1 by the N-terminal LIM domains of LMO2 and LMO4. EMBO J. 2003;22:2224-33 pubmed
- Deane J, Ryan D, Sunde M, Maher M, Guss J, Visvader J, et al. Tandem LIM domains provide synergistic binding in the LMO4:Ldb1 complex. EMBO J. 2004;23:3589-98 pubmed
- Jeffries C, Graham S, Stokes P, Collyer C, Guss J, Matthews J. Stabilization of a binary protein complex by intein-mediated cyclization. Protein Sci. 2006;15:2612-8 pubmed
- Aagaard C, Awayez M, Garrett R. Profile of the DNA recognition site of the archaeal homing endonuclease I-DmoI. Nucleic Acids Res. 1997;25:1523-30 pubmed
- Silva G, Dalgaard J, Belfort M, Van Roey P. Crystal structure of the thermostable archaeal intron-encoded endonuclease I-DmoI. J Mol Biol. 1999;286:1123-36 pubmed
- Liu X, Hu Z. A DnaB intein in Rhodothermus marinus: indication of recent intein homing across remotely related organisms. Proc Natl Acad Sci U S A. 1997;94:7851-6 pubmed
- Echelard Y, Epstein D, St Jacques B, Shen L, Mohler J, McMahon J, et al. Sonic hedgehog, a member of a family of putative signaling molecules, is implicated in the regulation of CNS polarity. Cell. 1993;75:1417-30 pubmed
- Chang D, López A, von Kessler D, Chiang C, Simandl B, Zhao R, et al. Products, genetic linkage and limb patterning activity of a murine hedgehog gene. Development. 1994;120:3339-53 pubmed
- Garnier T, Eiglmeier K, Camus J, Medina N, Mansoor H, Pryor M, et al. The complete genome sequence of Mycobacterium bovis. Proc Natl Acad Sci U S A. 2003;100:7877-82 pubmed
- Marigo V, Roberts D, Lee S, Tsukurov O, Levi T, Gastier J, et al. Cloning, expression, and chromosomal location of SHH and IHH: two human homologues of the Drosophila segment polarity gene hedgehog. Genomics. 1995;28:44-51 pubmed
- Pepinsky R, Zeng C, Wen D, Rayhorn P, Baker D, Williams K, et al. Identification of a palmitic acid-modified form of human Sonic hedgehog. J Biol Chem. 1998;273:14037-45 pubmed
- Liu T, Qian W, Gritsenko M, Camp D, Monroe M, Moore R, et al. Human plasma N-glycoproteome analysis by immunoaffinity subtraction, hydrazide chemistry, and mass spectrometry. J Proteome Res. 2005;4:2070-80 pubmed
- Gao B, Guo J, She C, Shu A, Yang M, Tan Z, et al. Mutations in IHH, encoding Indian hedgehog, cause brachydactyly type A-1. Nat Genet. 2001;28:386-8 pubmed
- McCready M, Sweeney E, Fryer A, Donnai D, Baig A, Racacho L, et al. A novel mutation in the IHH gene causes brachydactyly type A1: a 95-year-old mystery resolved. Hum Genet. 2002;111:368-75 pubmed
- Hellemans J, Coucke P, Giedion A, De Paepe A, Kramer P, Beemer F, et al. Homozygous mutations in IHH cause acrocapitofemoral dysplasia, an autosomal recessive disorder with cone-shaped epiphyses in hands and hips. Am J Hum Genet. 2003;72:1040-6 pubmed
- Currie P, Ingham P. Induction of a specific muscle cell type by a hedgehog-like protein in zebrafish. Nature. 1996;382:452-5 pubmed
- Zardoya R, Abouheif E, Meyer A. Evolutionary analyses of hedgehog and Hoxd-10 genes in fish species closely related to the zebrafish. Proc Natl Acad Sci U S A. 1996;93:13036-41 pubmed
- Fsihi H, De Rossi E, Salazar L, Cantoni R, Labò M, Riccardi G, et al. Gene arrangement and organization in a approximately 76 kb fragment encompassing the oriC region of the chromosome of Mycobacterium leprae. Microbiology. 1996;142 ( Pt 11):3147-61 pubmed
- Kunst F, Ogasawara N, Moszer I, Albertini A, Alloni G, Azevedo V, et al. The complete genome sequence of the gram-positive bacterium Bacillus subtilis. Nature. 1997;390:249-56 pubmed
- Lazarevic V, Soldo B, Düsterhöft A, Hilbert H, Mauël C, Karamata D. Introns and intein coding sequence in the ribonucleotide reductase genes of Bacillus subtilis temperate bacteriophage SPbeta. Proc Natl Acad Sci U S A. 1998;95:1692-7 pubmed
- Dietrich F, Voegeli S, Brachat S, Lerch A, Gates K, Steiner S, et al. The Ashbya gossypii genome as a tool for mapping the ancient Saccharomyces cerevisiae genome. Science. 2004;304:304-7 pubmed
- Roelink H, Augsburger A, Heemskerk J, Korzh V, Norlin S, Ruiz i Altaba A, et al. Floor plate and motor neuron induction by vhh-1, a vertebrate homolog of hedgehog expressed by the notochord. Cell. 1994;76:761-75 pubmed
- Ekker S, Ungar A, Greenstein P, von Kessler D, Porter J, Moon R, et al. Patterning activities of vertebrate hedgehog proteins in the developing eye and brain. Curr Biol. 1995;5:944-55 pubmed
- Fietz M, Concordet J, Barbosa R, Johnson R, Krauss S, McMahon A, et al. The hedgehog gene family in Drosophila and vertebrate development. Dev Suppl. 1994;:43-51 pubmed
- Muller F, Chang B, Albert S, Fischer N, Tora L, Strahle U. Intronic enhancers control expression of zebrafish sonic hedgehog in floor plate and notochord. Development. 1999;126:2103-16 pubmed
- Moak M, Molineux I. Peptidoglycan hydrolytic activities associated with bacteriophage virions. Mol Microbiol. 2004;51:1169-83 pubmed
- C. elegans Sequencing Consortium. Genome sequence of the nematode C. elegans: a platform for investigating biology. Science. 1998;282:2012-8 pubmed
- Hao L, Mukherjee K, Liegeois S, Baillie D, Labouesse M, Burglin T. The hedgehog-related gene qua-1 is required for molting in Caenorhabditis elegans. Dev Dyn. 2006;235:1469-81 pubmed
- Porter J, Ekker S, Park W, von Kessler D, Young K, Chen C, et al. Hedgehog patterning activity: role of a lipophilic modification mediated by the carboxy-terminal autoprocessing domain. Cell. 1996;86:21-34 pubmed
- Sebaihia M, Peck M, Minton N, Thomson N, Holden M, Mitchell W, et al. Genome sequence of a proteolytic (Group I) Clostridium botulinum strain Hall A and comparative analysis of the clostridial genomes. Genome Res. 2007;17:1082-92 pubmed
- Smith T, Hill K, Foley B, Detter J, Munk A, Bruce D, et al. Analysis of the neurotoxin complex genes in Clostridium botulinum A1-A4 and B1 strains: BoNT/A3, /Ba4 and /B1 clusters are located within plasmids. PLoS ONE. 2007;2:e1271 pubmed
- Davis E, Thangaraj H, Brooks P, Colston M. Evidence of selection for protein introns in the recAs of pathogenic mycobacteria. EMBO J. 1994;13:699-703 pubmed
- Jakob N, Muller K, Bahr U, Darai G. Analysis of the first complete DNA sequence of an invertebrate iridovirus: coding strategy of the genome of Chilo iridescent virus. Virology. 2001;286:182-96 pubmed
- Eaton H, Metcalf J, Penny E, Tcherepanov V, Upton C, Brunetti C. Comparative genomic analysis of the family Iridoviridae: re-annotating and defining the core set of iridovirus genes. Virol J. 2007;4:11 pubmed
- Senejani A, Hilario E, Gogarten J. The intein of the Thermoplasma A-ATPase A subunit: structure, evolution and expression in E. coli. BMC Biochem. 2001;2:13 pubmed
- Ruepp A, Graml W, Santos Martinez M, Koretke K, Volker C, Mewes H, et al. The genome sequence of the thermoacidophilic scavenger Thermoplasma acidophilum. Nature. 2000;407:508-13 pubmed
- Kitabayashi M, Nishiya Y, Esaka M, Itakura M, Imanaka T. Gene cloning and function analysis of replication factor C from Thermococcus kodakaraensis KOD1. Biosci Biotechnol Biochem. 2003;67:2373-80 pubmed
- Connors M, Setlow P. Cloning of a small, acid-soluble spore protein gene from Bacillus subtilis and determination of its complete nucleotide sequence. J Bacteriol. 1985;161:333-9 pubmed
- Riddle R, Johnson R, Laufer E, Tabin C. Sonic hedgehog mediates the polarizing activity of the ZPA. Cell. 1993;75:1401-16 pubmed
- Roelink H, Porter J, Chiang C, Tanabe Y, Chang D, Beachy P, et al. Floor plate and motor neuron induction by different concentrations of the amino-terminal cleavage product of sonic hedgehog autoproteolysis. Cell. 1995;81:445-55 pubmed
- Bumcrot D, Takada R, McMahon A. Proteolytic processing yields two secreted forms of sonic hedgehog. Mol Cell Biol. 1995;15:2294-303 pubmed
- Aspöck G, Kagoshima H, Niklaus G, Burglin T. Caenorhabditis elegans has scores of hedgehog-related genes: sequence and expression analysis. Genome Res. 1999;9:909-23 pubmed
- Kaji H, Saito H, Yamauchi Y, Shinkawa T, Taoka M, Hirabayashi J, et al. Lectin affinity capture, isotope-coded tagging and mass spectrometry to identify N-linked glycoproteins. Nat Biotechnol. 2003;21:667-72 pubmed
- Zeng X, Goetz J, Suber L, Scott W, Schreiner C, Robbins D. A freely diffusible form of Sonic hedgehog mediates long-range signalling. Nature. 2001;411:716-20 pubmed
- Stolow M, Shi Y. Xenopus sonic hedgehog as a potential morphogen during embryogenesis and thyroid hormone-dependent metamorphosis. Nucleic Acids Res. 1995;23:2555-62 pubmed
- Ekker S, McGrew L, Lai C, Lee J, von Kessler D, Moon R, et al. Distinct expression and shared activities of members of the hedgehog gene family of Xenopus laevis. Development. 1995;121:2337-47 pubmed
- Ruiz i Altaba A, Jessell T, Roelink H. Restrictions to floor plate induction by hedgehog and winged-helix genes in the neural tube of frog embryos. Mol Cell Neurosci. 1995;6:106-21 pubmed
- Papavinasasundaram K, Colston M, Davis E. Construction and complementation of a recA deletion mutant of Mycobacterium smegmatis reveals that the intein in Mycobacterium tuberculosis recA does not affect RecA function. Mol Microbiol. 1998;30:525-34 pubmed
- Pitcher R, Green A, Brzostek A, Korycka Machala M, Dziadek J, Doherty A. NHEJ protects mycobacteria in stationary phase against the harmful effects of desiccation. DNA Repair (Amst). 2007;6:1271-6 pubmed
- Stephanou N, Gao F, Bongiorno P, Ehrt S, Schnappinger D, Shuman S, et al. Mycobacterial nonhomologous end joining mediates mutagenic repair of chromosomal double-strand DNA breaks. J Bacteriol. 2007;189:5237-46 pubmed
- Datta S, Krishna R, Ganesh N, Chandra N, Muniyappa K, Vijayan M. Crystal structures of Mycobacterium smegmatis RecA and its nucleotide complexes. J Bacteriol. 2003;185:4280-4 pubmed
- Krishna R, Prabu J, Manjunath G, Datta S, Chandra N, Muniyappa K, et al. Snapshots of RecA protein involving movement of the C-domain and different conformations of the DNA-binding loops: crystallographic and comparative analysis of 11 structures of Mycobacterium smegmatis RecA. J Mol Biol. 2007;367:1130-44 pubmed
- Persson R, McGeehan J, Wilson K. Cloning, expression, purification, and characterisation of the dUTPase encoded by the integrated Bacillus subtilis temperate bacteriophage SPbeta. Protein Expr Purif. 2005;42:92-9 pubmed
- Persson R, Harkiolaki M, McGeehan J, Wilson K. Crystallization and preliminary crystallographic analysis of deoxyuridine 5'-triphosphate nucleotidohydrolase from Bacillus subtilis. Acta Crystallogr D Biol Crystallogr. 2001;57:876-8 pubmed
- Henneke G, Gueguen Y, Flament D, Azam P, Querellou J, Dietrich J, et al. Replication factor C from the hyperthermophilic archaeon Pyrococcus abyssi does not need ATP hydrolysis for clamp-loading and contains a functionally conserved RFC PCNA-binding domain. J Mol Biol. 2002;323:795-810 pubmed
- Ogata H, Raoult D, Claverie J. A new example of viral intein in Mimivirus. Virol J. 2005;2:8 pubmed
- Raoult D, Audic S, Robert C, Abergel C, Renesto P, Ogata H, et al. The 1.2-megabase genome sequence of Mimivirus. Science. 2004;306:1344-50 pubmed
- Evans T, Martin D, Kolly R, Panne D, Sun L, Ghosh I, et al. Protein trans-splicing and cyclization by a naturally split intein from the dnaE gene of Synechocystis species PCC6803. J Biol Chem. 2000;275:9091-4 pubmed
- Martin D, Xu M, Evans T. Characterization of a naturally occurring trans-splicing intein from Synechocystis sp. PCC6803. Biochemistry. 2001;40:1393-402 pubmed
- Dirksen L, Proft T, Hilbert H, Plagens H, Herrmann R, Krause D. Sequence analysis and characterization of the hmw gene cluster of Mycoplasma pneumoniae. Gene. 1996;171:19-25 pubmed
- Himmelreich R, Hilbert H, Plagens H, Pirkl E, Li B, Herrmann R. Complete sequence analysis of the genome of the bacterium Mycoplasma pneumoniae. Nucleic Acids Res. 1996;24:4420-49 pubmed
- Pietrokovski S. Conserved sequence features of inteins (protein introns) and their use in identifying new inteins and related proteins. Protein Sci. 1994;3:2340-50 pubmed
- Perler F, Olsen G, Adam E. Compilation and analysis of intein sequences. Nucleic Acids Res. 1997;25:1087-93 pubmed
- Derbyshire V, Wood D, Wu W, Dansereau J, Dalgaard J, Belfort M. Genetic definition of a protein-splicing domain: functional mini-inteins support structure predictions and a model for intein evolution. Proc Natl Acad Sci U S A. 1997;94:11466-71 pubmed
- Gogarten J, Hilario E. Inteins, introns, and homing endonucleases: recent revelations about the life cycle of parasitic genetic elements. BMC Evol Biol. 2006;6:94 pubmed
- Galburt E, Stoddard B. Catalytic mechanisms of restriction and homing endonucleases. Biochemistry. 2002;41:13851-60 pubmed
- Duan X, Gimble F, Quiocho F. Crystal structure of PI-SceI, a homing endonuclease with protein splicing activity. Cell. 1997;89:555-64 pubmed
- Wu H, Hu Z, Liu X. Protein trans-splicing by a split intein encoded in a split DnaE gene of Synechocystis sp. PCC6803. Proc Natl Acad Sci U S A. 1998;95:9226-31 pubmed
- Paulus H. Inteins as enzymes. Bioorg Chem. 2001;29:119-29 pubmed
- Evans T, Xu M. Intein-mediated protein ligation: harnessing nature's escape artists. Biopolymers. 1999;51:333-42 pubmed
- Kubis M. [Relation of water hardness to the occurrence of acute myocardial infarct]. Acta Univ Palacki Olomuc Fac Med. 1985;111:321-4 pubmed
- Southworth M, Benner J, Perler F. An alternative protein splicing mechanism for inteins lacking an N-terminal nucleophile. EMBO J. 2000;19:5019-26 pubmed
- Yamamoto K, Low B, Rutherford S, Rajagopalan M, Madiraju M. The Mycobacterium avium-intracellulare complex dnaB locus and protein intein splicing. Biochem Biophys Res Commun. 2001;280:898-903 pubmed
- Chong S, Mersha F, Comb D, Scott M, Landry D, Vence L, et al. Single-column purification of free recombinant proteins using a self-cleavable affinity tag derived from a protein splicing element. Gene. 1997;192:271-81 pubmed
- Bentz W, Davis A. Perceptions of emotional disorders among children as viewed by leaders, teachers, and the general public. Am J Public Health. 1975;65:129-32 pubmed
- Evans T, Benner J, Xu M. The cyclization and polymerization of bacterially expressed proteins using modified self-splicing inteins. J Biol Chem. 1999;274:18359-63 pubmed
- Seyedsayamdost M, Yee C, Stubbe J. Site-specific incorporation of fluorotyrosines into the R2 subunit of E. coli ribonucleotide reductase by expressed protein ligation. Nat Protoc. 2007;2:1225-35 pubmed
- Otomo T, Teruya K, Uegaki K, Yamazaki T, Kyogoku Y. Improved segmental isotope labeling of proteins and application to a larger protein. J Biomol NMR. 1999;14:105-14 pubmed
- Cowburn D, Muir T. Segmental isotopic labeling using expressed protein ligation. Methods Enzymol. 2001;339:41-54 pubmed
- Shah N, Muir T. Inteins: Nature's Gift to Protein Chemists. Chem Sci. 2014;5:446-461 pubmed
- Ozawa T, Sako Y, Sato M, Kitamura T, Umezawa Y. A genetic approach to identifying mitochondrial proteins. Nat Biotechnol. 2003;21:287-93 pubmed
- Ozawa T, Nishitani K, Sako Y, Umezawa Y. A high-throughput screening of genes that encode proteins transported into the endoplasmic reticulum in mammalian cells. Nucleic Acids Res. 2005;33:e34 pubmed
- Amarasinghe C, Jin J. The Use of Affinity Tags to Overcome Obstacles in Recombinant Protein Expression and Purification. Protein Pept Lett. 2015;22:885-92 pubmed
- Bornhorst J, Falke J. Purification of proteins using polyhistidine affinity tags. Methods Enzymol. 2000;326:245-54 pubmed
- Bornhorst B, Falke J. Reprint of: Purification of Proteins Using Polyhistidine Affinity Tags. Protein Expr Purif. 2011;: pubmed
- Lebendiker M, Danieli T. Purification of Proteins Fused to Maltose-Binding Protein. Methods Mol Biol. 2017;1485:257-273 pubmed
- Suárez Richards M. [Chronic treatment of schizophrenia with injectable bromperidol decanoate]. Acta Psiquiatr Psicol Am Lat. 1985;31:222-8 pubmed
- Southworth M, Amaya K, Evans T, Xu M, Perler F. Purification of proteins fused to either the amino or carboxy terminus of the Mycobacterium xenopi gyrase A intein. Biotechniques. 1999;27:110-4, 116, 118-20 pubmed
- Mills K, Manning J, Garcia A, Wuerdeman L. Protein splicing of a Pyrococcus abyssi intein with a C-terminal glutamine. J Biol Chem. 2004;279:20685-91 pubmed
- Mootz H, Blum E, Tyszkiewicz A, Muir T. Conditional protein splicing: a new tool to control protein structure and function in vitro and in vivo. J Am Chem Soc. 2003;125:10561-9 pubmed
- Brenzel S, Mootz H. Design of an intein that can be inhibited with a small molecule ligand. J Am Chem Soc. 2005;127:4176-7 pubmed
- Singh S, Panda A. Solubilization and refolding of bacterial inclusion body proteins. J Biosci Bioeng. 2005;99:303-10 pubmed
- Wood D, Wu W, Belfort G, Derbyshire V, Belfort M. A genetic system yields self-cleaving inteins for bioseparations. Nat Biotechnol. 1999;17:889-92 pubmed
- Wood D, Derbyshire V, Wu W, Chartrain M, Belfort M, Belfort G. Optimized single-step affinity purification with a self-cleaving intein applied to human acidic fibroblast growth factor. Biotechnol Prog. 2000;16:1055-63 pubmed
- Banki M, Feng L, Wood D. Simple bioseparations using self-cleaving elastin-like polypeptide tags. Nat Methods. 2005;2:659-61 pubmed
- Shi C, Han T, Wood D. Purification of Microbially Expressed Recombinant Proteins via a Dual ELP Split Intein System. Methods Mol Biol. 2017;1495:13-25 pubmed
- Banki M, Gerngross T, Wood D. Novel and economical purification of recombinant proteins: intein-mediated protein purification using in vivo polyhydroxybutyrate (PHB) matrix association. Protein Sci. 2005;14:1387-95 pubmed
- Wu W, Wood D, Belfort G, Derbyshire V, Belfort M. Intein-mediated purification of cytotoxic endonuclease I-TevI by insertional inactivation and pH-controllable splicing. Nucleic Acids Res. 2002;30:4864-71 pubmed
- Evans T, Benner J, Xu M. The in vitro ligation of bacterially expressed proteins using an intein from Methanobacterium thermoautotrophicum. J Biol Chem. 1999;274:3923-6 pubmed
- Zhao Z, Lu W, Dun B, Jin D, Ping S, Zhang W, et al. Purification of green fluorescent protein using a two-intein system. Appl Microbiol Biotechnol. 2008;77:1175-80 pubmed
- Shingledecker K, Jiang S, Paulus H. Molecular dissection of the Mycobacterium tuberculosis RecA intein: design of a minimal intein and of a trans-splicing system involving two intein fragments. Gene. 1998;207:187-95 pubmed
- Mills K, Lew B, Jiang S, Paulus H. Protein splicing in trans by purified N- and C-terminal fragments of the Mycobacterium tuberculosis RecA intein. Proc Natl Acad Sci U S A. 1998;95:3543-8 pubmed
- Handrick R, Reinhardt S, Schultheiss D, Reichart T, Schüler D, Jendrossek V, et al. Unraveling the function of the Rhodospirillum rubrum activator of polyhydroxybutyrate (PHB) degradation: the activator is a PHB-granule-bound protein (phasin). J Bacteriol. 2004;186:2466-75 pubmed
- Georgiou G, Jeong K. Proteins from PHB granules. Protein Sci. 2005;14:1385-6 pubmed
- Kimura R, Camarero J. Expressed protein ligation: a new tool for the biosynthesis of cyclic polypeptides. Protein Pept Lett. 2005;12:789-94 pubmed
- Tavassoli A, Benkovic S. Split-intein mediated circular ligation used in the synthesis of cyclic peptide libraries in E. coli. Nat Protoc. 2007;2:1126-33 pubmed
- Valiyaveetil F, MacKinnon R, Muir T. Semisynthesis and folding of the potassium channel KcsA. J Am Chem Soc. 2002;124:9113-20 pubmed
- Valiyaveetil F, Leonetti M, Muir T, MacKinnon R. Ion selectivity in a semisynthetic K+ channel locked in the conductive conformation. Science. 2006;314:1004-7 pubmed
- Muralidharan V, Cho J, Trester Zedlitz M, Kowalik L, Chait B, Raleigh D, et al. Domain-specific incorporation of noninvasive optical probes into recombinant proteins. J Am Chem Soc. 2004;126:14004-12 pubmed
- Muralidharan V, Muir T. Protein ligation: an enabling technology for the biophysical analysis of proteins. Nat Methods. 2006;3:429-38 pubmed
- Grindl W, Wende W, Pingoud V, Pingoud A. The protein splicing domain of the homing endonuclease PI-sceI is responsible for specific DNA binding. Nucleic Acids Res. 1998;26:1857-62 pubmed
- Evans T, Benner J, Xu M. Semisynthesis of cytotoxic proteins using a modified protein splicing element. Protein Sci. 1998;7:2256-64 pubmed
- Girish A, Sun H, Yeo D, Chen G, Chua T, Yao S. Site-specific immobilization of proteins in a microarray using intein-mediated protein splicing. Bioorg Med Chem Lett. 2005;15:2447-51 pubmed
- Severinov K, Muir T. Expressed protein ligation, a novel method for studying protein-protein interactions in transcription. J Biol Chem. 1998;273:16205-9 pubmed
- Craik D. Chemistry. Seamless proteins tie up their loose ends. Science. 2006;311:1563-4 pubmed
- Hirel P, Schmitter M, Dessen P, Fayat G, Blanquet S. Extent of N-terminal methionine excision from Escherichia coli proteins is governed by the side-chain length of the penultimate amino acid. Proc Natl Acad Sci U S A. 1989;86:8247-51 pubmed
- Erlanson D, Chytil M, Verdine G. The leucine zipper domain controls the orientation of AP-1 in the NFAT.AP-1.DNA complex. Chem Biol. 1996;3:981-91 pubmed
- Tolbert T, Wong C. New methods for proteomic research: preparation of proteins with N-terminal cysteines for labeling and conjugation. Angew Chem Int Ed Engl. 2002;41:2171-4 pubmed
- Hauser P, Ryan R. Expressed protein ligation using an N-terminal cysteine containing fragment generated in vivo from a pelB fusion protein. Protein Expr Purif. 2007;54:227-33 pubmed
- Xu M, Perler F. The mechanism of protein splicing and its modulation by mutation. EMBO J. 1996;15:5146-53 pubmed
- Camarero J, Fushman D, Sato S, Giriat I, Cowburn D, Raleigh D, et al. Rescuing a destabilized protein fold through backbone cyclization. J Mol Biol. 2001;308:1045-62 pubmed
- Camarero J, Cotton G, Adeva A, Muir T. Chemical ligation of unprotected peptides directly from a solid support. J Pept Res. 1998;51:303-16 pubmed
- Camarero J, Fushman D, Cowburn D, Muir T. Peptide chemical ligation inside living cells: in vivo generation of a circular protein domain. Bioorg Med Chem. 2001;9:2479-84 pubmed
- Iwai H, Plückthun A. Circular beta-lactamase: stability enhancement by cyclizing the backbone. FEBS Lett. 1999;459:166-72 pubmed
- Iwai H, Lingel A, Plückthun A. Cyclic green fluorescent protein produced in vivo using an artificially split PI-PfuI intein from Pyrococcus furiosus. J Biol Chem. 2001;276:16548-54 pubmed
- Scott C, Abel Santos E, Wall M, Wahnon D, Benkovic S. Production of cyclic peptides and proteins in vivo. Proc Natl Acad Sci U S A. 1999;96:13638-43 pubmed
- Abel Santos E, Scott C, Benkovic S. Use of inteins for the in vivo production of stable cyclic peptide libraries in E. coli. Methods Mol Biol. 2003;205:281-94 pubmed
- Scott C, Abel Santos E, Jones A, Benkovic S. Structural requirements for the biosynthesis of backbone cyclic peptide libraries. Chem Biol. 2001;8:801-15 pubmed
- Cenac P, Dupoirieux L. [Complete avulsion of the scalp. Apropos of a case]. Rev Stomatol Chir Maxillofac. 1991;92:120-3 pubmed
- Kinsella T, Ohashi C, Harder A, Yam G, Li W, Peelle B, et al. Retrovirally delivered random cyclic peptide libraries yield inhibitors of interleukin-4 signaling in human B cells. J Biol Chem. 2002;277:37512-8 pubmed
- Iwai H, Züger S. Protein ligation: applications in NMR studies of proteins. Biotechnol Genet Eng Rev. 2007;24:129-45 pubmed
- Kurpiers T, Mootz H. Regioselective cysteine bioconjugation by appending a labeled cystein tag to a protein by using protein splicing in trans. Angew Chem Int Ed Engl. 2007;46:5234-7 pubmed
- Ando T, Tsukiji S, Tanaka T, Nagamune T. Construction of a small-molecule-integrated semisynthetic split intein for in vivo protein ligation. Chem Commun (Camb). 2007;:4995-7 pubmed
- Kwon Y, Coleman M, Camarero J. Selective immobilization of proteins onto solid supports through split-intein-mediated protein trans-splicing. Angew Chem Int Ed Engl. 2006;45:1726-9 pubmed
- Züger S, Iwai H. Intein-based biosynthetic incorporation of unlabeled protein tags into isotopically labeled proteins for NMR studies. Nat Biotechnol. 2005;23:736-40 pubmed
- Yagi H, Tsujimoto T, Yamazaki T, Yoshida M, Akutsu H. Conformational change of H+-ATPase beta monomer revealed on segmental isotope labeling NMR spectroscopy. J Am Chem Soc. 2004;126:16632-8 pubmed
- Driscoll D, Copeland P. Mechanism and regulation of selenoprotein synthesis. Annu Rev Nutr. 2003;23:17-40 pubmed
- Böck A, Forchhammer K, Heider J, Baron C. Selenoprotein synthesis: an expansion of the genetic code. Trends Biochem Sci. 1991;16:463-7 pubmed
- Berry M, Banu L, Harney J, Larsen P. Functional characterization of the eukaryotic SECIS elements which direct selenocysteine insertion at UGA codons. EMBO J. 1993;12:3315-22 pubmed
- Ge S, Xu B. 38 cases of optic atrophy treated by needling qiuhou point. J Tradit Chin Med. 1989;9:171-2 pubmed
- Arnér E, Sarioglu H, Lottspeich F, Holmgren A, Böck A. High-level expression in Escherichia coli of selenocysteine-containing rat thioredoxin reductase utilizing gene fusions with engineered bacterial-type SECIS elements and co-expression with the selA, selB and selC genes. J Mol Biol. 1999;292:1003-16 pubmed
- Brown W, Bissey L, Logan K, Pedersen N, Elder J, Collisson E. Feline immunodeficiency virus infects both CD4+ and CD8+ T lymphocytes. J Virol. 1991;65:3359-64 pubmed
- Edwards A, Arrowsmith C, Christendat D, Dharamsi A, Friesen J, Greenblatt J, et al. Protein production: feeding the crystallographers and NMR spectroscopists. Nat Struct Biol. 2000;7 Suppl:970-2 pubmed
- Hackenberger C, Friel C, Radford S, Imperiali B. Semisynthesis of a glycosylated Im7 analogue for protein folding studies. J Am Chem Soc. 2005;127:12882-9 pubmed
- Olschewski D, Seidel R, Miesbauer M, Rambold A, Oesterhelt D, Winklhofer K, et al. Semisynthetic murine prion protein equipped with a GPI anchor mimic incorporates into cellular membranes. Chem Biol. 2007;14:994-1006 pubmed
- Goody R, Durek T, Waldmann H, Brunsveld L, Alexandrov K. Application of protein semisynthesis for the construction of functionalized posttranslationally modified rab GTPases. Methods Enzymol. 2005;403:29-42 pubmed
- Brunsveld L, Watzke A, Durek T, Alexandrov K, Goody R, Waldmann H. Synthesis of functionalized rab GTPases by a combination of solution- or solid-phase lipopeptide synthesis with expressed protein ligation. Chemistry. 2005;11:2756-72 pubmed
- Brunsveld L, Kuhlmann J, Alexandrov K, Wittinghofer A, Goody R, Waldmann H. Lipidated ras and rab peptides and proteins--synthesis, structure, and function. Angew Chem Int Ed Engl. 2006;45:6622-46 pubmed
- Pylypenko O, Rak A, Durek T, Kushnir S, Dursina B, Thomae N, et al. Structure of doubly prenylated Ypt1:GDI complex and the mechanism of GDI-mediated Rab recycling. EMBO J. 2006;25:13-23 pubmed
- Muir T, Sondhi D, Cole P. Expressed protein ligation: a general method for protein engineering. Proc Natl Acad Sci U S A. 1998;95:6705-10 pubmed
- Flavell R, Huse M, Goger M, Trester Zedlitz M, Kuriyan J, Muir T. Efficient semisynthesis of a tetraphosphorylated analogue of the Type I TGFbeta receptor. Org Lett. 2002;4:165-8 pubmed
- Ottesen J, Huse M, Sekedat M, Muir T. Semisynthesis of phosphovariants of Smad2 reveals a substrate preference of the activated T beta RI kinase. Biochemistry. 2004;43:5698-706 pubmed
- Shogren Knaak M, Fry C, Peterson C. A native peptide ligation strategy for deciphering nucleosomal histone modifications. J Biol Chem. 2003;278:15744-8 pubmed
- Qin B, Lam S, Correia J, Lin K. Smad3 allostery links TGF-beta receptor kinase activation to transcriptional control. Genes Dev. 2002;16:1950-63 pubmed
- Chacko B, Qin B, Tiwari A, Shi G, Lam S, Hayward L, et al. Structural basis of heteromeric smad protein assembly in TGF-beta signaling. Mol Cell. 2004;15:813-23 pubmed
- He S, Bauman D, Davis J, Loyola A, Nishioka K, Gronlund J, et al. Facile synthesis of site-specifically acetylated and methylated histone proteins: reagents for evaluation of the histone code hypothesis. Proc Natl Acad Sci U S A. 2003;100:12033-8 pubmed
- Shogren Knaak M, Ishii H, Sun J, Pazin M, Davie J, Peterson C. Histone H4-K16 acetylation controls chromatin structure and protein interactions. Science. 2006;311:844-7 pubmed
- Zhu H, Bilgin M, Bangham R, Hall D, Casamayor A, Bertone P, et al. Global analysis of protein activities using proteome chips. Science. 2001;293:2101-5 pubmed
- MacBeath G, Schreiber S. Printing proteins as microarrays for high-throughput function determination. Science. 2000;289:1760-3 pubmed
- Camarero J. Recent developments in the site-specific immobilization of proteins onto solid supports. Biopolymers. 2008;90:450-8 pubmed
- Lesaicherre M, Lue R, Chen G, Zhu Q, Yao S. Intein-mediated biotinylation of proteins and its application in a protein microarray. J Am Chem Soc. 2002;124:8768-9 pubmed
- Holland Nell K, Beck Sickinger A. Specifically immobilised aldo/keto reductase AKR1A1 shows a dramatic increase in activity relative to the randomly immobilised enzyme. Chembiochem. 2007;8:1071-6 pubmed
- Lue R, Chen G, Hu Y, Zhu Q, Yao S. Versatile protein biotinylation strategies for potential high-throughput proteomics. J Am Chem Soc. 2004;126:1055-62 pubmed
- Giriat I, Muir T. Protein semi-synthesis in living cells. J Am Chem Soc. 2003;125:7180-1 pubmed
- Mootz H, Muir T. Protein splicing triggered by a small molecule. J Am Chem Soc. 2002;124:9044-5 pubmed
- Schwartz E, Saez L, Young M, Muir T. Post-translational enzyme activation in an animal via optimized conditional protein splicing. Nat Chem Biol. 2007;3:50-4 pubmed
- Skretas G, Wood D. Regulation of protein activity with small-molecule-controlled inteins. Protein Sci. 2005;14:523-32 pubmed
- Buskirk A, Ong Y, Gartner Z, Liu D. Directed evolution of ligand dependence: small-molecule-activated protein splicing. Proc Natl Acad Sci U S A. 2004;101:10505-10 pubmed
- Yuen C, Rodda S, Vokes S, McMahon A, Liu D. Control of transcription factor activity and osteoblast differentiation in mammalian cells using an evolved small-molecule-dependent intein. J Am Chem Soc. 2006;128:8939-46 pubmed
- Cambon Bonavita M, Schmitt P, Zieger M, Flaman J, Lesongeur F, Raguenes G, et al. Cloning, expression, and characterization of DNA polymerase I from the hyperthermophilic archaea Thermococcus fumicolans. Extremophiles. 2000;4:215-25 pubmed
- Xu M, Southworth M, Mersha F, Hornstra L, Perler F. In vitro protein splicing of purified precursor and the identification of a branched intermediate. Cell. 1993;75:1371-7 pubmed
- Zeidler M, Tan C, Bellaiche Y, Cherry S, Häder S, Gayko U, et al. Temperature-sensitive control of protein activity by conditionally splicing inteins. Nat Biotechnol. 2004;22:871-6 pubmed
- Hackenberger C, Chen M, Imperiali B. Expression of N-terminal Cys-protein fragments using an intein refolding strategy. Bioorg Med Chem. 2006;14:5043-8 pubmed
- Cui C, Zhao W, Chen J, Wang J, Li Q. Elimination of in vivo cleavage between target protein and intein in the intein-mediated protein purification systems. Protein Expr Purif. 2006;50:74-81 pubmed
- Kanno A, Ozawa T, Umezawa Y. Intein-mediated reporter gene assay for detecting protein-protein interactions in living mammalian cells. Anal Chem. 2006;78:556-60 pubmed
- Skretas G, Wood D. A bacterial biosensor of endocrine modulators. J Mol Biol. 2005;349:464-74 pubmed
- Omar A, Adham F, Solimnn F, Khedre A. Laboratory rearing of Wohlfahrtia nuba Wiedemann (Diptera-Sarcophagidae) in Egypt. J Egypt Soc Parasitol. 1992;22:271-8 pubmed
- Skretas G, Meligova A, Villalonga Barber C, Mitsiou D, Alexis M, Micha Screttas M, et al. Engineered chimeric enzymes as tools for drug discovery: generating reliable bacterial screens for the detection, discovery, and assessment of estrogen receptor modulators. J Am Chem Soc. 2007;129:8443-57 pubmed
- Kanno A, Yamanaka Y, Hirano H, Umezawa Y, Ozawa T. Cyclic luciferase for real-time sensing of caspase-3 activities in living mammals. Angew Chem Int Ed Engl. 2007;46:7595-9 pubmed

Materials and Methods [ISSN: 2329-5139] is an online journal with regularly updated review articles on laboratory materials and methods.