
中国生物工程杂志

China Biotechnology
China Biotechnology  2021, Vol. 41 Issue (2/3): 14-29    DOI: 10.13523/j.cb.2010040
    
Using Data Mining Technology to Carry Out Bioinformatics Study for Human Rabies in Hubei Province of China
ZHANG Qiao-zhen1, WU Wen-ting1, LI Zi-xuan1, ZHAO Xin-bo1, HU Bing2, LIU Cong2, SUI Zheng-wei3, LIU Hong-tu4, ZHANG Le1,5,**
1 College of Computer Science, Sichuan University, Chengdu 610065, China
2 Hubei Provincial Center for Disease Control and Prevention, Wuhan 430079, China
3 China Center for Resources Satellite Data and Application, Beijing 100094, China
4 National Institute for Viral Disease Control and Prevention, Chinese Center for Disease Control and Prevention, Beijing 102206, China
5 Biomedical Big Data Center, West China Hospital, Sichuan University, Chengdu 610065, China

Abstract  

Rabies is an acute zoonotic disease with a fatality rate of nearly 100%. Hubei province is one of the high-incidence areas in central China, so investigating its rabies epidemic helps to assess the current risk of rabies in China and to provide suggestions for local and national prevention and control. First, descriptive analysis showed that rabies cases in Hubei province are decreasing while cases of exposure to rabies are increasing. Rabies cases in the central and western regions of Hubei are slightly more numerous than in the east, cases among men are significantly more numerous than among women, and cases among middle-aged and elderly people are more numerous than among the young. Second, time series analysis confirmed that both the rabies case series and the rabies exposure series show seasonal periodicity and are correlated with each other. The prediction model indicates that rabies cases will remain stable, whereas exposed cases will increase over the next three years. Third, panel regression showed that rabies cases in Hubei province are jointly affected by population, GDP, air temperature, and precipitation, and significance tests showed statistically significant differences between the low- and high-elevation groups and between the low- and high-slope groups. Finally, the limitations of the study and directions for future research are discussed.



Key words: Rabies; Descriptive analysis; Time series analysis; Panel regression analysis; Significance test
Received: 29 October 2020      Published: 08 April 2021
ZTFLH:  Q143  
Corresponding Authors: Le ZHANG     E-mail: zhangle06@scu.edu.cn
Cite this article:

ZHANG Qiao-zhen,WU Wen-ting,LI Zi-xuan,ZHAO Xin-bo,HU Bing,LIU Cong,SUI Zheng-wei,LIU Hong-tu,ZHANG Le. Using Data Mining Technology to Carry Out Bioinformatics Study for Human Rabies in Hubei Province of China. China Biotechnology, 2021, 41(2/3): 14-29.

URL:

https://manu60.magtech.com.cn/biotech/10.13523/j.cb.2010040     OR     https://manu60.magtech.com.cn/biotech/Y2021/V41/I2/3/14

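Data set | Description | Original data source
Incidence data set | Number of rabies cases in each county (district) of Hubei province, January 2004 to December 2018 | Hubei Provincial Center for Disease Control and Prevention
Exposure data set | Number of people exposed to rabies in each city (prefecture) of Hubei province, first quarter of 2010 to fourth quarter of 2018 | Hubei Provincial Center for Disease Control and Prevention
Case data set | Detailed individual case records of human rabies in Hubei province, January 2005 to December 2018 | Hubei Provincial Center for Disease Control and Prevention
GDP data set | Annual total GDP of each city (prefecture) of Hubei province, 2011-2018 | Public data of the Hubei Provincial Bureau of Statistics (http://tjj.hubei.gov.cn/tjsj/)
Population data set | Annual resident population of each city (prefecture) of Hubei province, 2011-2018 | Public data of the Hubei Provincial Bureau of Statistics (http://tjj.hubei.gov.cn/tjsj/)
Temperature data set | Annual mean air temperature of each city (prefecture) of Hubei province, 2011-2018 | Domestic weather website (http://lishi.tianqi.com)
Precipitation data set | Annual number of precipitation days in each city (prefecture) of Hubei province, 2011-2018 | Domestic weather website (http://lishi.tianqi.com)
Remote sensing data set | Elevation, slope, and mean vegetation coverage of each county (district) of Hubei province | China Center for Resources Satellite Data and Application; Google Earth Engine (https://earthengine.google.com)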
Table 1 Related data sets for rabies
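Term | Definition
Rabies case | Clinical symptoms of rabies appear after a person is infected with the rabies virus [3]
Rabies exposure | A person is bitten or scratched by, or has mucosa or broken skin licked by, a rabid animal, a suspected rabid animal, or a rabies host animal whose health cannot be confirmed; or an open wound or mucosa comes into contact with the saliva or tissue of an animal that may be infected with the rabies virus [16]
Low elevation | Elevation < 200 m [17]
High elevation | Elevation ≥ 200 m [17]
Low slope | Slope < 5° (http://zrzyt.hubei.gov.cn/bmdt/ztzl/hbsdycqgdlgqpcgb/gbfb/201910/t20191031_209613.shtml)
High slope | Slope ≥ 5° (http://zrzyt.hubei.gov.cn/bmdt/ztzl/hbsdycqgdlgqpcgb/gbfb/201910/t20191031_209613.shtml)
Low vegetation coverage | Vegetation coverage ≤ 0.75 [18]
High vegetation coverage | Vegetation coverage > 0.75 [18]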
Table 2 Key terms for rabies
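The grouping thresholds in Table 2 can be written as a small classification helper. This is a minimal sketch only: the function name, argument names, and the "low"/"high" labels are illustrative assumptions and do not come from the article.

```python
def classify_terrain(elevation_m, slope_deg, vegetation_cover):
    """Label one county by the Table 2 thresholds (hypothetical helper)."""
    return {
        # Elevation: low if < 200 m, high if >= 200 m
        "elevation": "low" if elevation_m < 200 else "high",
        # Slope: low if < 5 degrees, high if >= 5 degrees
        "slope": "low" if slope_deg < 5 else "high",
        # Vegetation coverage: low if <= 0.75, high if > 0.75
        "vegetation": "low" if vegetation_cover <= 0.75 else "high",
    }


# Made-up county values, for illustration only
print(classify_terrain(150.0, 6.2, 0.80))
```

Note that the boundary values fall on different sides: 200 m and 5° belong to the high groups, while a vegetation coverage of exactly 0.75 belongs to the low group.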
Fig.1 Workflow
Fig.2 Trend and seasonality for rabies cases and exposed cases of rabies in Hubei province
Fig.3 Spatiotemporal distribution for rabies cases in Hubei province from 2004 to 2011
Fig.4 Spatiotemporal distribution for rabies cases in Hubei province from 2012 to 2018
Fig.5 Population distribution for rabies cases in Hubei province
Fig.6 Workflow of the time series analysis module
Fig.7 The spectral density of the time series (a) The spectral density of the time series data for rabies cases (b) The spectral density of the time series data for exposed cases of rabies
Fig.8 Cross-correlation function of the time series data for rabies cases and exposed cases of rabies
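For readers unfamiliar with the cross-correlation function named in Fig.8, the following is a minimal sketch of one common sample estimator. The lag convention (x at time t + lag paired with y at time t), the function name, and the synthetic quarterly series are assumptions for illustration; the article's own data and software settings are not reproduced here.

```python
import math


def cross_correlation(x, y, lag):
    """Sample cross-correlation between x[t + lag] and y[t].

    Both series must have the same length; the full-series means and
    sums of squares are used for normalisation.
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sx = math.sqrt(sum((v - mx) ** 2 for v in x))
    sy = math.sqrt(sum((v - my) ** 2 for v in y))
    if lag >= 0:
        pairs = zip(x[lag:], y[: n - lag])
    else:
        pairs = zip(x[: n + lag], y[-lag:])
    return sum((a - mx) * (b - my) for a, b in pairs) / (sx * sy)


# Synthetic series with a period of four quarters (not the article's data)
series = [10.0, 14.0, 18.0, 12.0] * 3
for k in range(0, 5):
    print(k, round(cross_correlation(series, series, k), 3))
```

With a series that repeats every four quarters, the estimate peaks again at lag 4, which is the kind of pattern a seasonal series produces.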
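Series | Optimal orders | Model | Expression
Rabies cases | p=2, d=0, q=0; P=1, D=1, Q=0 | SARIMA(2,0,0)(1,1,0)_4 | $(1-0.3775L-0.3267L^{2})(1+0.5954L^{4})(1-L^{4})X_t=\varepsilon_t$
Exposed cases | p=0, d=1, q=1; P=0, D=1, Q=0 | SARIMA(0,1,1)(0,1,0)_4 | $(1-L)(1-L^{4})X_t=(1-0.4555L)\varepsilon_t$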
Table 3 Time series models for rabies cases and exposed cases of rabies
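The lag-operator expressions in Table 3 can be expanded into ordinary recursions by multiplying the polynomials. The sketch below does this in plain Python; the helper names are hypothetical, and the one-step function sets the current noise term to zero and ignores the moving-average term, so it only illustrates the autoregressive and differencing structure rather than a full forecast.

```python
def poly_mul(a, b):
    """Multiply two polynomials in the lag operator L (list index = power of L)."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


# Rabies cases: (1 - 0.3775L - 0.3267L^2)(1 + 0.5954L^4)(1 - L^4) X_t = e_t
ar = [1.0, -0.3775, -0.3267]
seasonal_ar = [1.0, 0.0, 0.0, 0.0, 0.5954]
seasonal_diff = [1.0, 0.0, 0.0, 0.0, -1.0]
cases_poly = poly_mul(poly_mul(ar, seasonal_ar), seasonal_diff)

# Exposed cases: left-hand side (1 - L)(1 - L^4) of the Table 3 expression
exposure_poly = poly_mul([1.0, -1.0], seasonal_diff)


def one_step_mean(history, poly):
    """Value of X_t implied by past values when the noise terms are set to zero.

    history[-1] is X_{t-1}; history must hold at least len(poly) - 1 values.
    """
    return -sum(c * history[-k] for k, c in enumerate(poly) if k >= 1)


print([round(c, 4) for c in cases_poly])
print(exposure_poly)
```

For the exposed-case model, the expansion gives $X_t = X_{t-1} + X_{t-4} - X_{t-5} + \varepsilon_t - 0.4555\varepsilon_{t-1}$. The coefficients of the rabies-case polynomial sum to zero because the seasonal difference $(1-L^{4})$ has a root at $L=1$.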
Fig.9 Prediction of the rabies and exposed cases (a) Prediction of rabies cases (b) Prediction of exposed cases of rabies. The black and blue lines are the original and predicted time series, respectively. The dark and light gray areas represent the 80% and 95% confidence intervals, respectively
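Variable | Coefficient | Standard error
Annual resident population (10,000 persons) | 0.0132*** | 0.0030
Annual total GDP (100 million yuan) | -0.0010*** | 0.0002
Annual mean air temperature (℃) | -0.8275* | 0.3930
Annual precipitation days (days) | -0.0212* | 0.0100
Intercept | 16.9579* | 7.4281
p | 0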
Table 4 Panel regression with random effects
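To make the signs and magnitudes in Table 4 concrete, the fitted coefficients can be combined into a linear predictor. This is a sketch under stated assumptions: it evaluates only the fixed part of the model (the city-level random effect is omitted), the response is assumed to be the rabies case count used in the panel regression, and the input values are invented for illustration.

```python
# Coefficients copied from Table 4
INTERCEPT = 16.9579
COEF = {
    "population": 0.0132,    # annual resident population, 10,000 persons
    "gdp": -0.0010,          # annual total GDP, 100 million yuan
    "temperature": -0.8275,  # annual mean air temperature, degrees Celsius
    "rain_days": -0.0212,    # annual number of precipitation days
}


def fixed_part(population, gdp, temperature, rain_days):
    """Linear predictor without the random effect (hypothetical helper)."""
    return (
        INTERCEPT
        + COEF["population"] * population
        + COEF["gdp"] * gdp
        + COEF["temperature"] * temperature
        + COEF["rain_days"] * rain_days
    )


# Invented city-year: 5 million residents, GDP of 200 billion yuan,
# 17 degrees Celsius mean temperature, 130 precipitation days
print(round(fixed_part(500, 2000, 17, 130), 4))
```

Read this way, a larger resident population raises the predicted value, while higher GDP, higher mean temperature, and more precipitation days lower it.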
Variable | X1 | X2 | X3 | X4
VIF | 1.9059 | 1.8483 | 1.3705 | 1.3211
Table 5 Variance inflation factor for each independent variable
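The variance inflation factor of a variable is defined as $1/(1-R_j^{2})$, where $R_j^{2}$ comes from regressing that variable on the other independent variables. The short sketch below inverts this relation for the values in Table 5. The function name is hypothetical, and the cutoff of 5 is a commonly used rule of thumb, not a threshold stated in this excerpt.

```python
# VIF values copied from Table 5
VIF = {"X1": 1.9059, "X2": 1.8483, "X3": 1.3705, "X4": 1.3211}


def implied_r_squared(vif):
    """R^2 of regressing one predictor on the others, from VIF = 1 / (1 - R^2)."""
    return 1.0 - 1.0 / vif


for name, value in VIF.items():
    print(name, round(implied_r_squared(value), 3), value < 5)
```

All four values are well below 5, which is consistent with weak multicollinearity among the independent variables.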
Fig.10 Q-Q plot (a) Low altitude group (b) High altitude group (c) Low slope group (d) High slope group (e) Low vegetation coverage group (f) High vegetation coverage group
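Comparison group | Elevation | Slope | Vegetation coverage
Group | Low elevation | High elevation | Low slope | High slope | Low vegetation coverage | High vegetation coverage
Homogeneity of variance | | |
p | 0.0337 | 0.0195 | 0.2033
Significant difference | | |
Mean | 15.37 | 11.11 | 16.22 | 11.29 | 14.63 | 12.04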
Table 6 Homogeneity of variance test and corrected t-test for each comparison group
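Table 6 refers to a corrected t-test used alongside a homogeneity of variance test. One widely used correction for unequal variances is Welch's t-test; whether the article used exactly this form is an assumption. The sketch below computes the Welch statistic and the Welch-Satterthwaite degrees of freedom on invented samples, with a hypothetical function name.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic and approximate degrees of freedom for two samples."""
    va = variance(a) / len(a)  # sample variance of a divided by its size
    vb = variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Invented samples, not the article's data
low_group = [1.0, 2.0, 3.0, 4.0, 5.0]
high_group = [2.0, 4.0, 6.0, 8.0, 10.0]
t_stat, dof = welch_t(low_group, high_group)
print(round(t_stat, 4), round(dof, 4))
```

The p-value follows from comparing the statistic with a t distribution with the computed degrees of freedom, which requires a statistics library or table and is left out of this sketch.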
[1]   Chen J, Zou L, Jin Z, et al. Modeling the geographic spread of rabies in China. PLoS Neglected Tropical Diseases, 2015,9(5):e0003772.
doi: 10.1371/journal.pntd.0003772 pmid: 26020234
[2]   Ruan S G. Modeling the transmission dynamics and control of rabies in China. Mathematical Biosciences, 2017,286:65-93.
doi: 10.1016/j.mbs.2017.02.005 pmid: 28188732
[3]   WHO. WHO expert consultation on rabies. Second report. World Health Organ Tech Rep Ser, 2013,982:1-139.
[4]   Baghi H B, Bazmani A, Aghazadeh M. The fight against rabies: the Middle East needs to step up its game. The Lancet, 2016,388(10054):1880.
[5]   Guo D, Yin W, Yu H, et al. The role of socioeconomic and climatic factors in the spatio-temporal variation of human rabies in China. BMC Infectious Diseases, 2018,18(1):526.
doi: 10.1186/s12879-018-3427-8 pmid: 30348094
[6]   Zhou H, Vong S, Liu K, et al. Human rabies in China, 1960-2014: a descriptive epidemiological study. PLoS Neglected Tropical Diseases, 2016,10(8):e0004874.
doi: 10.1371/journal.pntd.0004874 pmid: 27500957
[7]   Montgomery J P, Zhang Y, Wells E V, et al. Human rabies in Tianjin, China. Journal of Public Health, 2012,34(4):505-511.
pmid: 22653884
[8]   Li G W, Chen Q G, Qu Z Y, et al. Epidemiological characteristics of human rabies in Henan province in China from 2005 to 2013. Journal of Venomous Animals and Toxins Including Tropical Diseases, 2015,21:34.
[9]   Ren J P, Gong Z Y, Chen E F, et al. Human rabies in Zhejiang province, China. International Journal of Infectious Diseases, 2015,38:77-82.
doi: 10.1016/j.ijid.2015.07.013 pmid: 26216767
[10]   Qi L, Su K, Shen T, et al. Epidemiological characteristics and post-exposure prophylaxis of human rabies in Chongqing, China, 2007-2016. BMC Infectious Diseases, 2018,18(1):1-7.
doi: 10.1186/s12879-017-2892-9 pmid: 29291713
[11]   沈洪兵, 齐秀英. 流行病学.第八版. 北京:人民卫生出版社, 2013: 37.
[11]   Shen H B, Qi X Y. Epidemiology.8th ed. Beijing: People’s Medical Publishing House, 2013: 37.
[12]   Song M, Tang Q, Wang D M, et al. Epidemiological investigations of human rabies in China. BMC Infectious Diseases, 2009,9:210.
doi: 10.1186/1471-2334-9-210 pmid: 20025742
[13]   Zhang J M, Zhang Z S, Deng Y Q, et al. Incidence of human rabies and characterization of rabies virus nucleoprotein gene in dogs in Fujian province, southeast China, 2002-2012. BMC Infectious Diseases, 2017,17(1):1-8.
pmid: 28049444
[14]   Yao H W, Yang Y, Liu K, et al. The spatiotemporal expansion of human rabies and its probable explanation in mainland China, 2004-2013. PLoS Neglected Tropical Diseases, 2015,9(2):e0003502.
[15]   Yan Q, Guo D, Cui W, et al. How to use open source data to assess infection disease risk: a framework and applications//IEEE, 2015 23rd International Conference on Geoinformatics. Wuhan: IEEE, 2015: 1-5.
[16]   王军. 狂犬病暴露预防处置工作规范(2009年版). 中国工作犬业, 2010, (2):60-61.
[16]   Wang J. Guidelines for prevention and treatment of rabies exposure(2009 edition). China Working Dog, 2010,(2):60-61.
[17]   朱建达. 小城镇空间形态发展规律:未来规划设计的新理念、新方法. 南京: 东南大学出版社, 2013: 228.
[17]   Zhu J D. Development law of small town spatial form: new concept and new method of future planning and design. Nanjing: Southeast University Press, 2013: 228.
[18]   杨胜天, 刘昌明, 杨志峰, 等. 南水北调西线调水工程区的自然生态环境评价. 地理学报, 2002,57(1):11-18.
[18]   Yang S T, Liu C M, Yang Z F, et al. Natural eco-environmental evaluation of West route area of interbasin water transfer project. Acta Geographica Sinica, 2002,57(1):11-18.
[19]   姜庆五, 陈启明, 周艺彪. 流行病学模型. 上海: 复旦大学出版社, 2012: 399-401.
[19]   Jiang Q W, Chen Q M, Zhou Y B. Epidemiological models. Shanghai:Fudan University Press, 2012: 399-401.
[20]   Han J, Kamber M. Data mining:concepts and techniques. 2nd ed. Beijing: China Machine Press, 2006: 490-512.
[21]   Nobre F F, Monteiro A B S, Telles P R, et al. Dynamic linear model and SARIMA: a comparison of their forecasting performance in epidemiology. Statistics in Medicine, 2001,20:3051-3069.
doi: 10.1002/sim.963 pmid: 11590632
[22]   白仲林. 面板数据的计量经济分析. 天津: 南开大学出版社, 2008: 11-14.
[22]   Bai Z L. Econometric analysis of panel data. Tianjin:Nankai University Press, 2008: 11-14.
[23]   刘安芳, 伍莲. 生物统计学. 重庆:西南师范大学出版社, 2013: 84-89.
[23]   Liu A F, Wu L. Biostatistics. Chongqing: Southwest China Normal University Press, 2013: 84-89.
[24]   阿德勒. R语言核心技术手册. 第二版. 刘思喆,李舰, 陈钢, 等译. 北京:电子工业出版社, 2014.
[24]   Adler J. R in a nutshell. 2nd ed. Liu S Z, Li J, Chen G, et al, trans. Beijing: Publishing House of Electronics Industry, 2014.
[25]   牟乃夏, 刘文宝, 王海银, 等. ArcGIS 10地理信息系统教程: 从初学到精通. 北京:测绘出版社, 2012.
[25]   Mu N X, Liu W B, Wang H Y, et al. ArcGIS 10 geographic information system course:from beginning to mastering. Beijing: Surveying and Mapping Press, 2012.
[26]   杨克诚. GIS软件实验指导书: 基于ArcGIS Desktop. 昆明: 云南大学出版社, 2009.
[26]   Yang K C. GIS software experimental instructions based on ArcGIS Desktop. Kunming:Yunnan University Press, 2009.
[27]   李苗苗. 植被覆盖度的遥感估算方法研究. 北京: 中国科学院遥感应用研究所, 2003.
[27]   Li M M. The method of vegetation fraction estimation by remote sensing. Beijing: Institute of Remote Sensing Applications, Chinese Academy of Sciences, 2003.
[28]   陈建军, 于志强, 朱昀. 数据可视化技术及其应用. 红外与激光工程, 2001,30(5):339-342.
[28]   Chen J J, Yu Z Q, Zhu Y. Data visualization and its applications. Infrared and Laser Engineering, 2001,30(5):339-342.
[29]   Paula M L, Beatriz V M M, Claudia T C, et al. Time series analysis of dengue incidence in Rio de Janeiro, Brazil. American Journal of Tropical Medicine & Hygiene, 2008,79(6):933-939.
[30]   Martinez, Zangiacomi E, Silva, et al. A SARIMA forecasting model to predict the number of cases of dengue in Campinas, State of São Paulo, Brazil. Revista Da Sociedade Brasileira De Medicina Tropical, 2011,44(4):436-440.
doi: 10.1590/s0037-86822011000400007 pmid: 21860888
[31]   Noh J W, Kwon Y D, Park J, et al. Relationship between physical disability and depression by gender: a panel regression model. PLoS One, 2016,11(11):e0166238.
doi: 10.1371/journal.pone.0166238 pmid: 27902709
[32]   曹玲玲, 何春艳. 沪港通能否有效实现AH股溢价回归:基于固定效应面板模型的分析. 金融理论探索, 2016, ( 1):41-45.
[32]   Cao L L, He C Y. Shanghai-Hong Kong stock connect can effectively realize AH premium return:based on the fixed effect panel analysis of the model. Exploration of Financial Theory, 2016, ( 1):41-45.
[33]   王福彦. 医学计量资料统计方法. 北京:人民军医出版社, 2011: 28-30.
[33]   Wang F Y. Statistical methods of medical metrology data. Beijing: People’s Military Medical Press, 2011: 28-30.
[34]   胡利琴. 金融时间序列分析实验教程. 武汉:武汉大学出版社, 2012: 35.
[34]   Hu L Q. Experimental course of financial time series analysis. Wuhan: Wuhan University Press, 2012: 35.
[35]   杜勇宏, 王健. 季节时间序列理论与应用. 天津:南开大学出版社, 2008: 45.
[35]   Du Y H, Wang J. Theory and application of seasonal time series. Tianjin:Nankai University Press, 2008:45.
[36]   闫中晓, 贾永飞. 基于谱分析的中国科技创新与经济增长周期波动关系. 科技管理研究, 2016,36(9):13-16,50.
[36]   Yan Z X, Jia Y F. Chinese scientific and technological innovation and economic growth cycle fluctuations based on spectral analysis. Science and Technology Management Research, 2016,36(9):13-16,50.
[37]   于俊年. 计量经济学. 第二版. 北京: 对外经济贸易大学出版社, 2007: 185.
[37]   Yu J N. Econometrics. 2nd ed. Beijing:University of International Business and Economics Press, 2007: 185.
[38]   梁静, 张文俊, 田津晶, 等. 2014年南昌市狂犬病暴露者流行病学特征分析. 医学动物防制, 2016,32(5):576-578,580.
[38]   Liang J, Zhang W J, Tian J J, et al. Rabies exposure population epidemiological characteristics analysis of Nanchang in 2014. Journal of Medical Pest Control, 2016,32(5):576-578,580.
[39]   郑东方, 施达, 郭凤芝. 金华市狂犬病暴露者流行病学特征分析. 预防医学, 2017,29(12):1243-1244,1247.
[39]   Zheng D F, Shi D, Guo F Z. Analysis of epidemiological characteristics of rabies exposure patients in Jinhua city. Preventive Medicine, 2017,29(12):1243-1244,1247.
[40]   Yu J, Xiao H, Yang W H, et al. The impact of anthropogenic and environmental factors on human rabies cases in China. Transboundary and Emerging Diseases, 2020,67(6):2544-2553.
doi: 10.1111/tbed.13600 pmid: 32348020
[41]   潘铁骊, 张菲, 张守峰, 等. 几种理化因子对狂犬病病毒分离株感染力影响的再检测. 中国生物制品学杂志, 2011,24(1):45-47.
[41]   Pan T L, Zhang F, Zhang S F, et al. Restudy on effect of various physical and chemical factors on infectivity of a rabies virus isolate. Chinese Journal of Biologicals, 2011,24(1):45-47.