GenDBWiki/DeveloperDocumentation/ToolIntegration

From BRF-Software

Integrating a new Tool into GenDB

/!\ Incorporating a new tool into the GenDB system is a complex task that requires a good understanding of its concepts and implementation details. Before attempting it, you should have read and understood the system design and be familiar with the GenDB API and core modules. Because we lack the manpower, we cannot assist you with such tasks. If you have a working new tool, we are of course willing to check whether it can be added to the next release. /!\

Although the GenDB system already contains a number of standard bioinformatics tools for genome annotation, you may want to include your own tool (for predicting either regions or functions) or one of the many existing tools that are not yet part of the current GenDB distribution. All standard tools such as Glimmer or BLAST are implemented in their own GENDB::DB::Tool subclasses (see the GenDB datamodel and API for details), so incorporating a new tool would normally require modifying the GenDB datamodel. Such changes usually affect all active project databases, because new classes with their own attributes are stored in new SQL tables. To avoid this, we have introduced the class GENDB::DB::Tool::Generic, which can be used to implement new tools without modifying the datamodel.

With this generic class, an additional tool is implemented not in a GENDB::DB::DB_Server module but in a Perl module, usually located in the GENDB/Tools/ directory. As a starting point, copy the GenericToolTemplate and add your own code. To implement a new tool, you basically need to write a run method that executes the tool, parses its output, and creates new observations reflecting the tool result. Additionally, you can use the auto-annotate method to perform automatic annotation: a region prediction tool like Glimmer would create regions in this method, and a function prediction tool like BLAST would usually write a new functional annotation. For details, please look at the tools already implemented within GenDB, which can be found in the GENDB::DB::DB_Server section.
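The division of labour described above (a run method that executes the tool, parses its output and records observations, plus an optional automatic annotation step) might be sketched roughly as follows. This is a self-contained illustration only and does not use the GenDB API: the package name ExampleTool, the tab-separated output format, the hash-based "observations", the min_score option and the method name auto_annotate are all assumptions made for this example. A real tool would start from the GenericToolTemplate and create GenDB observation objects instead.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for a generic tool module; none of these names
# come from GenDB itself.
package ExampleTool;

sub new {
    my ($class, %options) = @_;
    return bless { options => {%options}, observations => [] }, $class;
}

# Parse one line of (assumed) tab-separated tool output: start, stop, score.
sub parse_line {
    my ($self, $line) = @_;
    chomp $line;
    return if $line =~ /^\s*(#|$)/;      # skip comment and blank lines
    my ($start, $stop, $score) = split /\t/, $line;
    return unless defined $score;        # skip incomplete lines
    return { start => $start, stop => $stop, score => $score };
}

# run: a real tool would execute the external program here. In this sketch
# the "tool output" is passed in as a list of lines so it runs anywhere.
# Returns the total number of observations stored so far.
sub run {
    my ($self, @output_lines) = @_;
    for my $line (@output_lines) {
        my $obs = $self->parse_line($line) or next;
        push @{ $self->{observations} }, $obs;
    }
    return scalar @{ $self->{observations} };
}

# auto_annotate: return the observations that reach the score threshold.
# A real tool would create regions or annotations from them.
sub auto_annotate {
    my ($self) = @_;
    my $min = $self->{options}{min_score} // 0;
    return grep { $_->{score} >= $min } @{ $self->{observations} };
}

package main;

my $tool    = ExampleTool->new(min_score => 50);
my $count   = $tool->run("# start\tstop\tscore", "100\t140\t87", "900\t930\t12");
my @regions = $tool->auto_annotate;
printf "%d observations, %d above threshold\n", $count, scalar @regions;
```

Keeping the parsing in its own method makes it possible to test the output handling without having the external tool installed.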

Before you can run and test your newly implemented generic tool, you have to create an object instance of the tool within your GenDB project database. This is done with a simple script that creates a new GENDB::DB::Tool::Generic object; you can copy the create_generic_tool_template.pl script as a template and modify it according to your needs. All options required by your new tool can be specified and stored persistently in a simple hash (attribute tool_options). After successfully creating the instance, you have to create a job for each tool/region combination. This can be done easily with the submit_job.pl script.
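As an illustration of how such a creation script might collect the options hash, the following sketch builds a tool_options hash from defaults and command-line overrides using the standard Getopt::Long module. It does not touch a GenDB database. The -n and --min-score options, the min_score key and the default tool name ExampleTool are hypothetical; only the -p project option mirrors the usage shown elsewhere on this page. The actual create_generic_tool_template.pl script may work differently.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Build the settings for a hypothetical generic tool from defaults plus
# command-line overrides. All option names except -p are invented here.
sub build_tool_options {
    my @args = @_;
    my %cli;
    GetOptionsFromArray(\@args, \%cli, 'p=s', 'n=s', 'min-score=i')
        or die "invalid options\n";
    die "a project name (-p) is required\n" unless defined $cli{p};

    # Defaults for the tool-specific options (assumed values).
    my %tool_options = ( min_score => 50 );
    $tool_options{min_score} = $cli{'min-score'} if defined $cli{'min-score'};

    return {
        project      => $cli{p},
        tool_name    => $cli{n} // 'ExampleTool',
        tool_options => \%tool_options,   # the hash to be stored with the tool
    };
}

my $setup = build_tool_options('-p', 'GenDB_DEMO-2.2', '--min-score', 70);
print "$setup->{tool_name} in $setup->{project}: ",
      "min_score=$setup->{tool_options}{min_score}\n";
```

Separating defaults from overrides keeps the stored hash complete even when the script is called with only the mandatory project option.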

Before you can use submit_job.pl, you have to start a dispatcher (dispatcher.pl, located in the GENDB/bin directory).

Example:   />dispatcher.pl -l /tmp/dispatcher.log 



Currently implemented generic tools

TransTerm (contributed by D. Wetter)

The TransTerm tool finds rho-independent transcription terminators in bacterial genomes. It creates observations, terminator regions (GENDB::Region::Signal::Terminator) and automatic annotations.

Requirements

  • edit GlobalConfig.pm (located in the GENDB/Common/ directory) and add the following lines to your GenDB configuration:


our @EXPORT_OK=qw(... GENDB_TRANSTERM GENDB_TRANSTERM_LIB ...)

...

=item * GENDB_TRANSTERM, GENDB_TRANSTERM_LIB

pathname to TransTerm script and directory containing binaries and scripts from TransTerm package

=cut 

use constant GENDB_TRANSTERM => '/PATH/TO/YOUR/INSTALLATION/OF/TransTerm';
use constant GENDB_TRANSTERM_LIB => '/LIBRARYPATH/TO/YOUR/INSTALLATION/OF/TransTerm';
 


  • TransTerm.pm (copy to GENDB/Tools directory)
  • create_generic_tool_TransTerm.pl (copy to GENDB/exec directory)

How to use

Create a generic tool instance of TransTerm with the script create_generic_tool_TransTerm.pl. The only mandatory parameter is a GenDB project; for optional parameters, see the script's usage. (The default tool name is "TransTerm".)

Example:   />gendb_start create_generic_tool_TransTerm.pl -p GenDB_DEMO-2.2

You can then run TransTerm with submit_job.pl (located in the GENDB/share/exec directory). You have to specify a GenDB project, a contig and a tool name; use the tool name that was set with create_generic_tool_TransTerm.pl. Make sure a dispatcher is running!

Example:   />gendb_start submit_job.pl -p GenDB_DEMO-2.2 -c pSymB -t TransTerm -a
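When several contigs need to be processed, the command above has to be repeated once per contig. A small wrapper along the following lines could generate those commands. This is only a sketch: the flags are copied unchanged from the example above, the helper name job_commands and the second contig name are placeholders, and the system call is left commented out so that the script only prints what it would run.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Build one submit_job.pl argument list per contig, mirroring the example
# command shown above (flags copied as-is from that example).
sub job_commands {
    my ($project, $tool, @contigs) = @_;
    return map {
        [ 'gendb_start', 'submit_job.pl',
          '-p', $project, '-c', $_, '-t', $tool, '-a' ]
    } @contigs;
}

# 'another_contig' is a placeholder, not a contig of the demo project.
my @commands = job_commands('GenDB_DEMO-2.2', 'TransTerm',
                            'pSymB', 'another_contig');

for my $cmd (@commands) {
    print join(' ', @$cmd), "\n";    # dry run: only print the command
    # To submit the jobs for real, uncomment the next line:
    # system(@$cmd) == 0 or warn "job submission failed: @$cmd\n";
}
```

Passing the command to system as a list avoids shell quoting problems with unusual project or contig names.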

References

  • http://www.tigr.org/software/transterm.html
  • Maria D. Ermolaeva, Hanif G. Khalak, Owen White, Hamilton O. Smith and Steven L. Salzberg. Prediction of Transcription Terminators in Bacterial Genomes. J Mol Biol 301(1), 27-33 (2000)