Introduction
You get a bacterial isolate. You sequence it. You manage to get some contigs after mucking around with some de novo assembly software. Now what? Annotation, of course! Your FASTA file is teeming with lifeless chunks of bacterial DNA yearning to be adorned with insightfully labelled features, so it can get some more attention from you, and maybe even be reunited with some old friends in GenBank/ENA. If this sounds familiar, then this blog post is for you.
What is genome annotation?
Genome annotation is the process of identifying features of interest on a genome sequence. Some of the features relevant to bacterial genomes are protein-coding genes, non-coding RNAs, and operons. Features can have all sorts of useful information associated with them in addition to their genomic location and feature type. For example, a protein-coding gene annotation could include items such as the predicted protein product, whether it has a signal peptide, a gene abbreviation and an enzyme classification number. The accuracy and richness of a genome annotation are important, and sometimes critical, to downstream biological interpretation.
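To make that concrete, here is a minimal sketch of how one annotated feature might be held in memory. The `Feature` class, its field names and every value in the example are invented for illustration; they are not taken from any of the tools listed in this post.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Feature:
    """One annotated feature on a contig (hypothetical layout)."""
    contig: str            # name of the contig the feature sits on
    start: int             # 1-based, inclusive
    end: int               # 1-based, inclusive
    strand: str            # "+" or "-"
    feature_type: str      # e.g. "CDS", "tRNA", "rRNA"
    qualifiers: Dict[str, object] = field(default_factory=dict)

    def length(self) -> int:
        """Length of the feature in base pairs."""
        return self.end - self.start + 1


# A made-up protein-coding gene carrying the extra items mentioned above.
example = Feature(
    contig="contig_1",
    start=1201,
    end=2100,
    strand="+",
    feature_type="CDS",
    qualifiers={
        "gene": "exaA",                      # invented gene abbreviation
        "product": "example dehydrogenase",  # invented protein product
        "EC_number": "1.1.1.1",              # placeholder value only
        "signal_peptide": False,
    },
)

print(example.feature_type, example.length(), example.qualifiers["gene"])
```

The location and feature type are fixed fields, while everything else lives in a free-form `qualifiers` dictionary, which loosely mirrors the key/value qualifiers used in GenBank-style feature tables.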
In the old days, a basic ORF finder would be run over the contigs. Then the truly dedicated curators would comb over the ORFs, trim back to good-looking start codon sites, delete spurious-looking ORFs, and so on. Later, gene prediction software and BLASTX helped bootstrap this process further. Now there are various "automatic annotation" systems which do a reasonably good job. Manual refinement of the automatic annotation can then be done using curation applications.
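For readers who have never seen one, a "basic ORF finder" can be sketched in a few lines of Python. This is a deliberately naive illustration, not the method used by any tool mentioned in this post: the function names and the `min_codons` parameter are my own, it assumes ATG is the only start codon (real bacteria also use alternatives such as GTG and TTG), and it ignores ambiguous bases and ORFs running off the end of a contig.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
START_CODONS = {"ATG"}  # simplifying assumption: ATG only

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase A/C/G/T string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=30):
    """Scan all six reading frames for ORFs.

    An ORF here runs from the first ATG in a frame to the next in-frame
    stop codon (inclusive). Results are (start, end, strand) tuples in
    0-based, half-open coordinates on the forward strand. min_codons
    counts the start and stop codons as part of the ORF.
    """
    seq = seq.upper()
    n = len(seq)
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, n - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon in START_CODONS:
                    start = i  # remember the first start codon seen
                elif start is not None and codon in STOP_CODONS:
                    end = i + 3
                    if (end - start) // 3 >= min_codons:
                        if strand == "+":
                            orfs.append((start, end, "+"))
                        else:
                            # map back to forward-strand coordinates
                            orfs.append((n - end, n - start, "-"))
                    start = None  # look for the next ORF in this frame
    return orfs


# Toy contig: 2 bp of padding, ATG, three AAA codons, a TAA stop, padding.
toy = "CC" + "ATG" + "AAA" * 3 + "TAA" + "GG"
print(find_orfs(toy, min_codons=3))
```

Everything the curators used to do by hand (choosing a better start codon, throwing out spurious ORFs) is exactly what this sketch does not attempt, which is why dedicated gene predictors took over.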
Below I list the tools I am aware of for performing and curating bacterial genome annotation. If I've missed any, please let me know and I will add them.
Web submission systems
- RAST - Rapid Annotation using Subsystem Technology
- BaSYS - Bacterial Annotation System
- xBASE Bacterial Genome Annotation Service
- JCVI Annotation Service
- IGS Annotation Engine
- JGI/DOE IMG annotation service
- PGAAP - NCBI Prokaryotic Genome Automatic Annotation Pipeline
- MAKER Web Annotation Service
- Prokka Web Annotation Server (disclaimer - this is our software - will be public soon)
Standalone systems
- BG7 bacterial genome annotation system
- AGeS - Annotation/Analysis of Genome Sequences
- MAKER
- Prokka - prokaryotic annotation (disclaimer - this is our software)
Curation systems
Conclusion
Beware of systems claiming to do "microbial" annotation. Most of them are only designed for annotating bacteria. They will perform poorly on viruses, fungi and other microbes.
Without having any personal experience with it, RATT, Rapid Annotation Transfer Tool, from the Sanger Institute, may be useful: http://ratt.sourceforge.net/
There is a new version of Apollo that is entirely web-based
http://www.gmod.org/wiki/WebApollo
It's also well-integrated with MAKER
Thanks Chris, I had forgotten about WebApollo! I spoke with Scott Cain about it at BOSC last year, and he said it was progressing well. I need a replacement for my aging in-house Wasabi system (bacteria only) and I will go back and look at WebApollo. Thanks for commenting.
Thanks Lex for reminding me of RATT. I also never tried it, but it's something I need to put on my TO-DO list. Three of my colleagues are coming to Oslo this year, but I'm not able to come. Perhaps we will meet one day.
In Briefings in Bioinformatics, a survey on this topic - "The automatic annotation of bacterial genomes" - was published recently
http://bib.oxfordjournals.org/content/14/1/1.full
Thanks for the link to Mick and Emily's paper, Igor.
(I intended to include it in the post but forgot.)
Hi Torsten, what about GenDB annotation?
I'm embarrassed to say I have never come across GenDB. I went to the website, and it looks interesting. However, I cannot download any software, I have to email someone to get access, and the Demo database link does not work. Is it being actively maintained? (The publication was 11 years ago.)
Hi Torsten, I tried RAST, BaSYS and MAKER, but I think only RAST is working. Do you have any idea about the others?
I definitely love this blog.
Thank you for your kind words.
Thank you
I have a question: can I do structural annotation in RAST or IMG? My impression is that these web servers are for functional analysis.
Thanks for the info and the future help.
Impressive stuff