
China Biotechnology (中国生物工程杂志)
2011, Vol. 31 Issue (7): 45-53
    
De Novo Assembly of Allotetraploid Arabidopsis suecica Transcriptome using Short Reads for Gene Discovery and Marker Identification
LIU Xin-xing, CHEN Chao
Resources and Bioengineering School at Central South University, Changsha 410083, China

Abstract  

To facilitate research on Arabidopsis suecica (A. suecica), a method is presented for de novo assembly of the A. suecica transcriptome from short reads produced on the Illumina sequencing platform. A total of 23 million sequencing reads were assembled into 125 953 unique sequences with an N50 length of 550 bp and a mean length of 331 bp. At the protein level, 96 057 (76.3%) A. suecica transcripts showed significant similarity to proteins from other plants in the Nr database. Functional categorization revealed the conservation of genes involved in various biological processes in A. suecica. In addition, simple sequence repeat (SSR) motifs were identified in the A. suecica transcriptome. The data provide a comprehensive sequence resource for A. suecica research and demonstrate that short paired-end read sequencing allows de novo transcriptome assembly in an allotetraploid species lacking genome information. Next-generation sequencing (NGS) technologies are expected to significantly accelerate transcriptome research in both model and non-model organisms, and the de novo assembly strategy presented here should be helpful in similar transcriptome studies.
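
For readers unfamiliar with the assembly statistics quoted in the abstract, the following is a minimal sketch of how an N50 length and a mean length can be computed from a list of assembled sequence lengths. It is an illustration only, not the authors' pipeline: the function names `n50` and `mean_length` and the toy lengths are assumptions introduced here.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    cover at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("no sequence lengths given")
    ordered = sorted(lengths, reverse=True)   # longest first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


def mean_length(lengths):
    """Return the arithmetic mean of the sequence lengths."""
    if not lengths:
        raise ValueError("no sequence lengths given")
    return sum(lengths) / len(lengths)


# Hypothetical toy lengths (bp), not data from the study.
toy_lengths = [100, 200, 300, 400, 500]
print("N50:", n50(toy_lengths))
print("Mean length:", mean_length(toy_lengths))
```

An N50 that is clearly larger than the mean, as in the reported 550 bp versus 331 bp, indicates a length distribution skewed toward many short sequences, which is common for short-read transcriptome assemblies.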



Key words: Arabidopsis suecica      Transcriptome assembly      SOAPdenovo      NGS (next generation sequencing)
Received: 14 March 2011      Published: 25 July 2011
ZTFLH:  Q75  
Cite this article:

LIU Xin-xing, CHEN Chao. De Novo Assembly of Allotetraploid Arabidopsis suecica Transcriptome using Short Reads for Gene Discovery and Marker Identification. China Biotechnology, 2011, 31(7): 45-53.

URL:

https://manu60.magtech.com.cn/biotech/     OR     https://manu60.magtech.com.cn/biotech/Y2011/V31/I7/45


