The development of high-throughput technologies has given rise to a wealth of system-level
information, including genome, epigenome, transcriptome, proteome and metabolome data. However, it
remains a major challenge to digest these massive amounts of information and use them in an
intelligent and comprehensive manner. To address this challenge, Dr. Fei's group has focused on
developing computational tools and resources to analyze and integrate large-scale 'omics' datasets,
which help researchers understand how genes work together to form functioning cells and organisms.
Development of online databases to facilitate data distribution, analysis, mining and integration
Development of computational tools for omics data analysis
- Plant MetGenMAP - a web-based tool for comprehensive mining and integration of gene expression and metabolite changes in the context of biochemical pathways.
- iAssembler - a de novo assembly package for transcriptome sequences generated using 454 or Sanger platforms.
- iTAK - a package to identify and classify plant transcription factors and protein kinases.
- VirusDetect - an automated pipeline for efficient virus discovery using deep sequencing of small RNAs.
Application of NGS technologies and bioinformatics in crop improvement
During the past several years, significant progress has been made in DNA sequencing
technologies. As a result, several next-generation sequencing (NGS) platforms, such as Illumina
HiSeq, have been widely adopted due to their high throughput and low cost. We are interested in
using NGS technologies to investigate the genomes, epigenomes and transcriptomes of several
economically important crops, including tomato, cucurbits, sweetpotato, and fruit tree crops, to
facilitate the understanding of the evolution and regulatory networks of important agronomic traits.
We are also using NGS technologies to perform large-scale virus surveys for crops such as sweetpotato
and tomato, in an effort to understand global virus diversity, distribution and evolution in
important food crops.
Inferring gene regulatory networks
Living cells are the product of gene expression programs involving regulated transcription of
thousands of genes. How a collection of transcriptional regulatory factors associates with genes
during specific biological processes or under specific environmental conditions can be described
as a gene regulatory network. We are interested in developing new algorithms to infer gene regulatory
networks by integrating datasets from various sources, including gene expression data,
metabolomics data, promoter sequences, and microRNA information.
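As a rough illustration of the simplest form of network inference from gene expression data alone, the sketch below links a regulator to a candidate target when their expression profiles are strongly correlated across conditions. It is not the group's algorithm: the gene names, expression values, the 0.9 threshold and the helper functions `pearson` and `infer_edges` are all hypothetical, and a realistic method would also draw on the other data types listed above.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant profile carries no co-expression signal
    return cov / sqrt(var_x * var_y)


def infer_edges(expression, regulators, threshold=0.9):
    """Return (regulator, target, r) for every pair with |r| >= threshold."""
    edges = []
    for reg in regulators:
        for gene, profile in expression.items():
            if gene == reg:
                continue  # skip self-edges
            r = pearson(expression[reg], profile)
            if abs(r) >= threshold:
                edges.append((reg, gene, round(r, 3)))
    return edges


# Hypothetical toy data: expression levels across five conditions.
toy_expression = {
    "TF1":   [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneA": [2.1, 3.9, 6.2, 8.0, 9.9],  # rises with TF1
    "geneB": [5.0, 4.1, 2.9, 2.0, 1.1],  # falls as TF1 rises
    "geneC": [3.0, 1.0, 4.0, 1.0, 5.0],  # no clear relationship
}

edges = infer_edges(toy_expression, regulators=["TF1"])
for regulator, target, r in edges:
    kind = "activation-like" if r > 0 else "repression-like"
    print(f"{regulator} -> {target}: r = {r} ({kind})")
```

Correlation alone cannot separate direct regulation from indirect effects or shared upstream causes, which is one reason to bring in additional evidence such as promoter sequences and microRNA information.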