Full-length Protein Sequencing Based on Continuous Digestion Using Non-specific Proteases

doi:10.6023/A21010025

Acta Chimica Sinica, 2021, Vol. 79, Issue 5: 663-669. DOI: 10.6023/A21010025

Article

Chao Yang^a,b^, Yi-Chu Shan^a,*^, Wei-Jie Zhang^a,b^, Zhong-Peng Dai^a^, Li-Hua Zhang^a,*^, Yu-Kui Zhang^a^

a CAS Key Laboratory of Separation Science for Analytical Chemistry, Dalian Institute of Chemical Physics, Chinese Academy of Sciences, Dalian 116023, China
b University of Chinese Academy of Sciences, Beijing 100049, China

Received: 2021-01-27 Published: 2021-03-30
Corresponding authors: Yi-Chu Shan, Li-Hua Zhang
*E-mail: shanyichu@dicp.ac.cn; lihuazhang@dicp.ac.cn
Supported by:
National Key Research and Development Program of China, Ministry of Science and Technology of China (2017YFF0205404, 2017YFA0505004); National Natural Science Foundation of China (21675153, 21725506)

Abstract

Determining the complete sequence of a protein helps to elucidate its structure and reveal its biological function. In the traditional "bottom-up" proteomic strategy, database searching is used to identify the sequences of peptides and proteins analyzed by liquid chromatography-tandem mass spectrometry (LC-MS/MS). Database searching cannot identify proteins whose sequences are unknown, so de novo sequencing is essential for characterizing them. To increase the accuracy and coverage of protein sequencing, a de novo protein sequencing method based on continuous digestion with several non-specific proteases was developed. A continuous digestion device was constructed, and a variety of non-specific proteases were used to digest the protein continuously. Because non-specific proteases cleave at non-specific sites, and because the peptides produced at different digestion times and by different proteases are complementary, both the diversity and the degree of overlap of the digested peptides were improved. After continuous digestion, the sequence coverage of the peptides reached 100% for each protease. A sequence assembly algorithm was then developed to assemble the peptides obtained by de novo sequencing. First, the candidate peptide sequences were split into sequence tags of 7 amino acids, and the most frequently occurring tag was chosen as the seed sequence. The seed sequence was then extended, automatically or manually, toward the N-terminus and the C-terminus according to the scores of the sequence tags, until the complete protein sequence was assembled. The method was applied to the de novo sequencing of bovine serum albumin (BSA) and the monoclonal antibody Herceptin. Excluding leucine and isoleucine, which have identical masses, full-length de novo sequencing was achieved with 100% accuracy for BSA and the Herceptin light chain. The accuracy for the Herceptin heavy chain was 99.7%.
The de novo sequencing strategy based on continuous digestion with non-specific proteases can be applied to the sequencing of proteins with unknown sequences and to the quality control of monoclonal antibody drugs.
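
The assembly steps described in the abstract (split peptides into 7-residue tags, pick the most frequent tag as the seed, extend toward both termini) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names are hypothetical, the tag "score" is assumed here to be the plain occurrence count, and extension is a simple greedy walk over tags that overlap by 6 residues. The toy peptides are made up for illustration.

```python
from collections import Counter

TAG_LEN = 7  # tag length stated in the abstract


def split_into_tags(peptides, k=TAG_LEN):
    """Count every k-residue window across all candidate peptides."""
    counts = Counter()
    for pep in peptides:
        for i in range(len(pep) - k + 1):
            counts[pep[i:i + k]] += 1
    return counts


def extend(seq, counts, direction, k=TAG_LEN):
    """Greedily grow seq by one residue at a time toward one terminus.

    Assumption: the score of a tag is its occurrence count.
    """
    used = set()  # stop if a tag is reused, to avoid looping on repeats
    while True:
        if direction == "C":
            overlap = seq[-(k - 1):]
            candidates = [t for t in counts if t[:-1] == overlap]
        else:
            overlap = seq[:k - 1]
            candidates = [t for t in counts if t[1:] == overlap]
        if not candidates:
            return seq
        best = max(candidates, key=lambda t: counts[t])
        if best in used:
            return seq
        used.add(best)
        seq = seq + best[-1] if direction == "C" else best[0] + seq


def assemble(peptides, k=TAG_LEN):
    """Seed with the most frequent tag, then extend to both termini."""
    counts = split_into_tags(peptides, k)
    if not counts:
        raise ValueError("no peptide is long enough to yield a tag")
    seed = counts.most_common(1)[0][0]
    seq = extend(seed, counts, "C", k)
    return extend(seq, counts, "N", k)


# Toy example: overlapping fragments of a made-up 20-residue sequence.
toy_peptides = [
    "ACDEFGHIKL",
    "EFGHIKLMNPQ",
    "IKLMNPQRSTV",
    "NPQRSTVWY",
    "FGHIKLMNP",
]
assembled = assemble(toy_peptides)
```

A greedy walk like this only succeeds when consecutive tags overlap without gaps, which is why the high peptide overlap obtained from continuous digestion with several proteases matters for the assembly step.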

Key words: non-specific protease, continuous digestion, sequence assembly, full-length sequencing

Cite this article

Chao Yang, Yi-Chu Shan, Wei-Jie Zhang, Zhong-Peng Dai, Li-Hua Zhang, Yu-Kui Zhang. Full-length Protein Sequencing Based on Continuous Digestion Using Non-specific Proteases[J]. Acta Chimica Sinica, 2021, 79(5): 663-669.

Fig. & Tab. 8

Protease	Time 1	Time 2	Time 3	Time 4	Time 5
Bromelain	20	90	180	270	360
Chymotrypsin	20	90	150	300	420
Elastase	20	120	180	240	420
Proteinase K	20	90	180	270	360
Pronase	20	40	120	150	180
Thermolysin	5	90	210	300	360

Table 1. Digestion times (min) used for the digestion of BSA by different proteases

Figure 1. Number of BSA peptides produced using different proteases for different digestion times. (a) bromelain. (b) chymotrypsin. (c) elastase. (d) proteinase K. (e) pronase. (f) thermolysin

Figure 2. The frequencies and numbers of identified peptides produced at different digestion times

Figure 3. Number of peptides produced by one-time and continuous digestion

Digestion mode	Bromelain	Chymotrypsin	Elastase	Proteinase K	Pronase	Thermolysin
One-time digestion	100	94.34	98.63	96.23	96.91	97.77
Continuous digestion	100	100	100	100	100	100

Table 2. Sequence coverage of continuous digestion and one-time digestion (%)
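
For readers who want to reproduce a figure like those in Table 2, sequence coverage is commonly computed as the percentage of protein residues that fall inside at least one identified peptide. The sketch below assumes that definition and exact string matching; the function name and the toy data are hypothetical, and the source does not state how coverage was calculated.

```python
def sequence_coverage(protein, peptides):
    """Percentage of residues in protein covered by at least one peptide."""
    if not protein:
        raise ValueError("protein sequence must not be empty")
    covered = [False] * len(protein)
    for pep in peptides:
        # Mark every occurrence of the peptide, including overlapping ones.
        start = protein.find(pep)
        while start != -1:
            for i in range(start, start + len(pep)):
                covered[i] = True
            start = protein.find(pep, start + 1)
    return 100.0 * sum(covered) / len(protein)


# Toy example: two peptides cover 8 of 10 residues.
toy_coverage = sequence_coverage("ACDEFGHIKL", ["ACDE", "GHIK"])
```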

Figure 4. Sequence coverage of peptides obtained by de novo sequencing

Figure 5. Schematic diagram of sequence assembly

Figure 6. Sequence assembly algorithm
