Saturday, 30 March 2013

Bacterial genome annotation systems

Introduction

You get a bacterial isolate. You sequence it. You manage to get some contigs after mucking around with some de novo assembly software. Now what? Annotation, of course! Your FASTA file is teeming with lifeless chunks of bacterial DNA yearning to be adorned with insightfully labelled features, so they can get some more attention from you, and maybe even be reunited with some old friends in GenBank/ENA. If this sounds familiar, then this blog post is for you.

What is genome annotation?

Genome annotation is the process of identifying features of interest on a genome sequence. Some of the features relevant to bacterial genomes are protein-coding genes, non-coding RNAs, and operons. Features can have all sorts of useful information associated with them in addition to their genomic location and feature type. For example, a protein-coding gene annotation could include items such as the predicted protein product, whether it has a signal peptide, a gene abbreviation and an enzyme classification number. The accuracy and richness of a genome annotation are important, and sometimes critical, to downstream biological interpretation.
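To make the idea of "a feature plus its associated information" concrete, here is a minimal Python sketch of how one annotated feature might be held in memory. The Feature class, its field names and the contig name and coordinates are all invented for illustration; they are not the data model of any particular annotation system. The qualifier keys loosely follow the style used in GenBank/ENA feature tables.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Feature:
    """One annotated feature on a contig (hypothetical structure, for illustration)."""
    contig: str          # name of the contig the feature sits on
    start: int           # 1-based, inclusive
    end: int             # 1-based, inclusive
    strand: str          # "+" or "-"
    feature_type: str    # e.g. "CDS", "tRNA", "rRNA"
    qualifiers: Dict[str, str] = field(default_factory=dict)

    def length(self) -> int:
        """Length of the feature in base pairs."""
        return self.end - self.start + 1

    def summary(self) -> str:
        """One-line, human-readable description of the feature."""
        gene = self.qualifiers.get("gene", "?")
        product = self.qualifiers.get("product", "hypothetical protein")
        return (f"{self.contig}:{self.start}-{self.end}({self.strand}) "
                f"{self.feature_type} {gene} {product}")


# Example: a protein-coding gene with the kinds of extra information
# mentioned above. The contig name and coordinates are made up.
adk = Feature(
    contig="contig_1",
    start=1201,
    end=1845,
    strand="+",
    feature_type="CDS",
    qualifiers={
        "gene": "adk",
        "product": "adenylate kinase",
        "EC_number": "2.7.4.3",
        "signal_peptide": "no",
    },
)

print(adk.summary())
print(adk.length(), "bp")
```

Real annotation files (GenBank, EMBL, GFF3) carry the same sort of information, just in their own text layouts.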

In the old days, a basic ORF finder would be run over the contigs. Then the truly dedicated curators would comb over the ORFs, trim them back to good-looking start codon sites, delete spurious-looking ORFs, and so on. Later, gene prediction software and BLASTX helped bootstrap this process further. Now there are various "automatic annotation" systems which do a reasonably good job. Manual refinement of the automatic annotation can then be done using curation applications.
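For readers who have never seen one, a "basic ORF finder" of the kind described above can be sketched in a few lines of Python. This is a deliberately naive illustration, not the method used by any of the tools listed in this post: it assumes ATG is the only start codon, uses the three standard stop codons, takes the first ATG after the previous stop in each frame, and ignores ambiguous bases. The function names and the min_codons parameter are my own inventions.

```python
from typing import List, Tuple

START = "ATG"                      # assumption: ATG is the only start codon
STOPS = {"TAA", "TAG", "TGA"}      # the three standard stop codons
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def forward_orfs(seq: str, min_codons: int) -> List[Tuple[int, int]]:
    """ORFs on the given strand as 0-based, half-open (start, end) pairs.

    An ORF runs from the first ATG after the previous stop codon to the
    next in-frame stop codon. Its length in codons includes the stop.
    """
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == START:
                start = i
            elif start is not None and codon in STOPS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None
    return orfs


def find_orfs(seq: str, min_codons: int = 30) -> List[Tuple[int, int, str]]:
    """Six-frame ORF scan; coordinates are always given on the forward strand."""
    seq = seq.upper()
    n = len(seq)
    hits = [(s, e, "+") for s, e in forward_orfs(seq, min_codons)]
    for s, e in forward_orfs(reverse_complement(seq), min_codons):
        hits.append((n - e, n - s, "-"))   # map back to forward coordinates
    return sorted(hits)


# Tiny example: one three-codon ORF (ATG AAA TAG) on the forward strand.
print(find_orfs("CCATGAAATAGGG", min_codons=3))
```

Everything such a scan returns is only a candidate. Choosing the right start codon and throwing out the spurious ORFs is exactly the manual work described above, and it is what gene prediction software later took over.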

Below I list the tools I am aware of for performing and curating bacterial genome annotation. If I've missed any, please let me know and I will add them.

Web submission systems


Standalone systems


Curation systems

  • Manatee (web interface + database backend)
  • Artemis (local app, can be combined with Chado SQL backend)
  • Apollo (local app, can be combined with Chado SQL backend)
  • Wasabi (my old non-public awful CGI/Perl/make mess that we still use internally)

Conclusion

Beware of systems claiming to do "microbial" annotation. Most of them are only designed for annotating bacteria. They will perform poorly on viruses, fungi and other microbes.




18 comments:

  1. Without having any personal experience with it, RATT, Rapid Annotation Transfer Tool, from the Sanger Institute, may be useful: http://ratt.sourceforge.net/

  2. There is a new version of Apollo that is entirely web-based
    http://www.gmod.org/wiki/WebApollo

    It's also well-integrated with MAKER

  3. Thanks Chris, I had forgotten about WebApollo! I spoke with Scott Cain about it at BOSC last year, and he said it was progressing well. I need a replacement for my aging in-house Wasabi system (bacteria only) and I will go back and look at WebApollo. Thanks for commenting.

  4. Thanks Lex for reminding me of RATT. I also never tried it, but it's something I need to put on my TO-DO list. Three of my colleagues are coming to Oslo this year, but I'm not able to come. Perhaps we will meet one day.

  5. In Briefings in Bioinformatics, a survey on this topic - "The automatic annotation of bacterial genomes" - was published recently

    http://bib.oxfordjournals.org/content/14/1/1.full

  6. Thanks for the link to Mick and Emily's paper, Igor.

    (I intended to include it in the post but forgot.)

  7. Hi Torsten, what about GenDB annotation?

    1. I'm embarrassed to say I have never come across GenDB. I went to the website, and it looks interesting. However, I cannot download any software, I have to email someone to get access, and the Demo database link does not work. Is it being actively maintained? (The publication was 11 years ago.)

  8. Hi Torsten, I tried RAST, BASys and MAKER, but I think only RAST is working. Do you have any idea about the others?

  9. This comment has been removed by a blog administrator.

  10. I have a question: can I do structural annotation in RAST or IMG? My impression is that these web servers are for functional analysis.

    Thanks for the info and the future help
