CHINA BIOTECHNOLOGY
2017, Vol. 37, Issue (3): 124-132    DOI: 10.13523/j.cb.20170317
Special Topic
Integrating Distributed Heterogeneous Food Microorganism Data by Semantic Web Technology
WU Lin-huan (吴林寰), LU Zhen-ming (陆震鸣), GONG Jin-song (龚劲松), SHI Jin-song (史劲松), XU Zheng-hong (许正宏)
Key Laboratory of Industrial Biotechnology of Ministry of Education, School of Pharmaceutical Science, Jiangnan University, Wuxi 214122, China
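Abstract (translated from the Chinese version):

With the rapid development of high-throughput sequencing and the steady deepening of food microbiology research, large volumes of data and knowledge have been produced, distributed across many databases in different data formats. To better support food microbiology research, it is especially important to extract and transform these distributed, heterogeneous data and knowledge and to assemble them into an integrated data platform. The FoodMicrobes database uses semantic web technology to build such an integrated platform for food microorganisms. From a range of open public databases, the platform extracts information related to food microorganisms, including genes, genomes, gene functions, protein sequences and structures, metabolic pathways, literature, and patents. It converts the data using RDF and establishes links among the data to achieve integration. It is currently the first database in the food microbiology field built with semantic web technology. The platform connects macro-level information on species and strains with micro-level information on genomes, proteins, metabolism, and function, and its user-friendly search interface gives users an important tool for food microbiology research.

Key words: Semantic web    Linked data    Food microorganisms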
Abstract:

With the rapid development of next-generation sequencing technology and of research on the fermentation mechanisms of food microorganisms, the data and knowledge on food microorganisms have grown enormously, including genomic, metagenomic, metabolic, and phylogenetic information. These data are distributed across different resources in various data formats. An integrated data platform is needed to extract biological knowledge from such growing heterogeneous data. We therefore constructed a food microorganism database using semantic web technology. We describe information on genes, genome sequences, gene ontology, protein sequences and structures, pathways, and enzymes in the form of Resource Description Framework (RDF), drawn from a wide range of open data resources. In this database, physiological information on microbes from culture collections is linked to genomic information and, in turn, to metabolic information, which allows flexible queries across different domains. The database's user-friendly interfaces make it possible to answer a number of research questions about food microorganisms based on the linked data.

Key words: Food microorganisms    Linked data    Semantic web
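
The linking described in the abstract, where a strain record connects to genomic information and then to metabolic information, can be illustrated with a small in-memory set of subject-predicate-object triples. This is only a sketch of the general linked-data idea, not the database's actual schema or query interface: every identifier, prefix, and predicate name below (`strain:demo-001`, `ex:hasGenome`, and so on) is a hypothetical placeholder, and a real deployment would more likely use an RDF store queried with SPARQL.

```python
from collections import defaultdict

# Hypothetical placeholder triples (subject, predicate, object).
# None of these identifiers or predicates come from the article.
TRIPLES = [
    ("strain:demo-001", "ex:hasGenome", "genome:demo-G1"),
    ("genome:demo-G1", "ex:encodesGene", "gene:demo-A"),
    ("gene:demo-A", "ex:participatesIn", "pathway:demo-X"),
    ("strain:demo-002", "ex:hasGenome", "genome:demo-G2"),
    ("genome:demo-G2", "ex:encodesGene", "gene:demo-B"),
    ("gene:demo-B", "ex:participatesIn", "pathway:demo-Y"),
]


def build_index(triples):
    """Index objects by (subject, predicate) for fast link traversal."""
    index = defaultdict(set)
    for subject, predicate, obj in triples:
        index[(subject, predicate)].add(obj)
    return index


def follow(index, starts, predicate):
    """Return every object reachable from any start node via one predicate."""
    reached = set()
    for node in starts:
        reached |= index.get((node, predicate), set())
    return reached


def pathways_for_strain(index, strain):
    """Cross-domain query: strain -> genome -> gene -> pathway."""
    genomes = follow(index, {strain}, "ex:hasGenome")
    genes = follow(index, genomes, "ex:encodesGene")
    return follow(index, genes, "ex:participatesIn")


if __name__ == "__main__":
    idx = build_index(TRIPLES)
    print(sorted(pathways_for_strain(idx, "strain:demo-001")))
```

Because each hop is just another triple, adding a new domain (for example literature or patents) would mean adding triples with a new predicate rather than changing a table schema, which is the property that makes queries across domains flexible.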
Received: 2016-09-19    Published: 2017-03-25
ZTFLH (Chinese Library Classification):  Q811.4
Funding:

Supported by the National Natural Science Foundation of China (31271922) and the National "863" Program (2012AA021301, 2014AA021501, 2013AA102106)

Corresponding author: XU Zheng-hong (许正宏)     E-mail: zhenghxu@jiangnan.edu.cn
Cite this article:

WU Lin-huan, LU Zhen-ming, GONG Jin-song, SHI Jin-song, XU Zheng-hong. Integrating Distributed Heterogeneous Food Microorganism Data by Semantic Web Technology. China Biotechnology, 2017, 37(3): 124-132.

Link to this article:

https://manu60.magtech.com.cn/biotech/CN/10.13523/j.cb.20170317
https://manu60.magtech.com.cn/biotech/CN/Y2017/V37/I3/124

