Its main design considerations were:
- multi-threading support
- hierarchical search databases
- simple to use
- no compulsory parameters
- bundled databases
- standards-compliant output files
- pipeline-friendly interface
- detection of tRNA, rRNA, CDS, sig_peptide, tandem repeats, ncRNA
- /gene and /EC_number included where possible, not just /product
- traceable annotation sources via /inference tags
- output files close to ready for submission to GenBank
- complete log file
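
To make the /gene, /EC_number, /product and /inference items concrete, here is a rough Python sketch of how the qualifiers for a single CDS might be assembled. It is an illustration only, not code from the tool itself: the function name `format_qualifiers`, the accession `P00000` and the program version are made-up placeholders, and the /inference strings simply follow the general INSDC "type:evidence" pattern.

```python
def format_qualifiers(product, inference, gene=None, ec_number=None):
    """Return GenBank-style qualifier lines for one CDS feature.

    Hypothetical helper for illustration; not part of any real tool.
    /gene and /EC_number are emitted only when a value is known,
    while /product is always present.
    """
    lines = []
    if gene:
        lines.append(f'/gene="{gene}"')
    if ec_number:
        lines.append(f'/EC_number="{ec_number}"')
    lines.append(f'/product="{product}"')
    # One /inference tag per piece of evidence keeps the annotation traceable.
    for source in inference:
        lines.append(f'/inference="{source}"')
    return lines


if __name__ == "__main__":
    qualifiers = format_qualifiers(
        product="adenylate kinase",
        gene="adk",
        ec_number="2.7.4.3",
        inference=[
            "ab initio prediction:Prodigal:2.6",        # placeholder version
            "similar to AA sequence:UniProtKB:P00000",  # placeholder accession
        ],
    )
    for line in qualifiers:
        print(line)
```

The point of the sketch is the ordering of concerns: a feature with no database hit still gets a /product and an /inference for the gene prediction step, and richer tags are added only when the evidence supports them.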
The first release is a monolithic but readable Perl script. It uses only core Perl modules, but has quite a few external tool dependencies, some of which I can't bundle due to licence restrictions. Eventually I hope to have a public web-server version, and a version in the Galaxy Tool Shed.
It currently takes about 10 minutes on a quad-core Intel i7 to annotate a typical 4 Mbp genome.