Review

Research Progress of Drug Target Interaction Prediction Based on Machine Learning*
LIU Hao-miao, YANG Zhi-wei**, WANG Li-zhuo, ZHOU Yan-zhang, LONG Jian-gang
Center of Mitochondrial Biology and Medicine, Key Laboratory of Biomedical Information Engineering, Ministry of Education, School of Life Science and Technology, Xi’an Jiaotong University, Xi’an 710049, China
Cite this article:
LIU Hao-miao, YANG Zhi-wei, WANG Li-zhuo, ZHOU Yan-zhang, LONG Jian-gang. Research Progress of Drug Target Interaction Prediction Based on Machine Learning. China Biotechnology, 2022, 42(4): 40-48.
Link to this article:
https://manu60.magtech.com.cn/biotech/CN/10.13523/j.cb.2111037
or
https://manu60.magtech.com.cn/biotech/CN/Y2022/V42/I4/40