GPMSWiki/CoreDocumentation/Introduction
"Software" Documentation - Web Interface
Introduction
In times of rapidly increasing demands for high-performance computing methods and expanding storage requirements, the majority of software applications have to store and access external data sources. For example, a number of software systems are currently being developed at the CeBiTec to help organize a flood of genomic and post-genomic data. All information that is acquired from wet lab experiments or from manual analysis of the obtained results requires persistent storage and reliable backup capabilities in order to ensure data integrity. While several relational or object-oriented database management systems (e.g. MySQL, PostgreSQL, DB2, Oracle) already provide well-suited and stable solutions for storing and maintaining large datasets, there are still some additional issues that have to be addressed for real-world applications. Figure 1 illustrates a typical access procedure that is implemented in many software systems.
File:GPMSWiki$$CoreDocumentation$$Introduction$ProjectAcsess.png
Basically, the initial step for accessing data requires a connection to a database management system. Connections can be established through command-line interfaces, APIs, or even graphical user frontends. Although most systems provide several convenient ways of accessing their data, details of the access protocol and possibly even the type of the data source (e.g. flat file or RDBMS) should be hidden from the user in the frontend application (e.g. web frontend). In our opinion it is also important to provide transparent and consistent access to all data within the same scope (e.g. all information that has been acquired in a transcriptome project). This also includes the use of standard access routines that should be available independently of the chosen access method. The data within such a scope and all related information can be collected and organized in projects.
Once a user has established a connection to a data source, the level of access is often defined by special permissions or privileges. While some database systems allow very fine-grained access control, the administration of such permissions is usually a laborious task for the maintainers of the data repository. Additional work is often required when access to projects containing sensitive data has to be restricted to specific users. In such cases different roles can be identified, each of which defines a level of access by assigning appropriate privileges. An individual user can thus act in different roles for different projects (e.g. with read-only access as a guest user in one project and read/write permissions as a developer in another).
File:GPMSWiki$$CoreDocumentation$$Introduction$Roles2.png
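The relationship between users, projects, and roles described above can be illustrated with a minimal Python sketch. It is not taken from the system itself: the role names, the privilege names, and the `ProjectRegistry` class are hypothetical and only show how one user might hold different roles in different projects.

```python
# Hypothetical roles, each bundling a set of privileges.
# The names are illustrative assumptions, not part of the documented system.
ROLE_PRIVILEGES = {
    "guest": frozenset({"read"}),
    "developer": frozenset({"read", "write"}),
    "maintainer": frozenset({"read", "write", "manage_users"}),
}


class ProjectRegistry:
    """Tracks which role a user holds in which project."""

    def __init__(self):
        # Maps (user, project) to a role name.
        self._memberships = {}

    def assign(self, user, project, role):
        """Give a user exactly one role within a single project."""
        if role not in ROLE_PRIVILEGES:
            raise ValueError(f"unknown role: {role}")
        self._memberships[(user, project)] = role

    def can(self, user, project, privilege):
        """Return True if the user's role in this project grants the privilege."""
        role = self._memberships.get((user, project))
        return role is not None and privilege in ROLE_PRIVILEGES[role]


registry = ProjectRegistry()
registry.assign("alice", "transcriptome", "guest")
registry.assign("alice", "genome", "developer")
```

In this sketch the same user has read-only access to one project and read/write access to another, and a user without any membership has no access at all. Keeping privileges attached to roles rather than to individual users is one way to keep the administration overhead small.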
Obviously, it is desirable to keep the administration overhead as small as possible; the maintainers of (large) database management systems should be supported with an easy-to-use interface that helps to keep an overview of all users and the projects they are involved in. Such a system could also provide some kind of user management interface that allows maintainers of a project to grant dedicated access to (parts of) the information to specific users without involving a database administrator.
In addition to the data itself, almost every modern application stores a number of individual settings for each user. A project management system should therefore also be usable for storing these settings separately for each project, independently of the frontend that is employed by the user (e.g. web frontend or GUI). Extending the scope beyond organizing data in separate projects, applications often reference related data from different sources. Therefore, it is essential to provide a simple means for accessing and (cross-)referencing this information. Again, these references should be hidden from the user, but they should allow asking questions across different data sources. Thus the application has to know where to find the requested information and how to access it.
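As a rough illustration of per-project user settings that do not depend on a particular frontend, the following Python sketch keys settings by user and project and serializes them to JSON. The `ProjectSettings` class and all setting names are assumptions made for this example, not part of the documented system.

```python
import json


class ProjectSettings:
    """Hypothetical store for user settings, kept separately per project."""

    def __init__(self):
        # Maps (user, project) to a dictionary of settings.
        self._store = {}

    def set(self, user, project, key, value):
        self._store.setdefault((user, project), {})[key] = value

    def get(self, user, project, key, default=None):
        return self._store.get((user, project), {}).get(key, default)

    def dump(self):
        """Serialize all settings to JSON so that any frontend can read them."""
        nested = {}
        for (user, project), settings in self._store.items():
            # JSON keys must be strings, so the (user, project) pair is nested.
            nested.setdefault(user, {})[project] = settings
        return json.dumps(nested, sort_keys=True)


settings = ProjectSettings()
settings.set("alice", "genome", "rows_per_page", 50)
settings.set("alice", "transcriptome", "rows_per_page", 20)
```

Because the settings live in the store rather than in a frontend, a web frontend and a GUI would both see the same values for a given user and project.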
Author: Lutz Krause