ProDBWiki/DeveloperDocumentation/MzDataImportSpecification: Difference between revisions
__NOTOC__
= mzData Import Specification =
== Introduction ==
mzData is an XML exchange format for mass-spectrometry data, created and maintained by the [http://psidev.sourceforge.net Proteomics Standards Initiative]. The format is defined by an XML schema and an ontology. The ontology defines the legal values for cvParam elements, which keeps the format flexible with respect to new mass spectrometers and their components.
An mzData importer for ProDB has to solve two mapping problems. The first is relatively simple: XML elements and attributes need to be mapped onto ProDB object attributes. The second is more difficult: ontology-controlled cvParam elements need to be mapped onto ProDB objects and their attributes.
ProDB models the kind of ionisation directly as classes, for example DB::Ionisation::Electrospray or DB::Ionisation::Maldi. In mzData, however, this information is stored in a list of cvParam elements in mzData/description/instrument/source, for example:
<pre><nowiki>
<mzData version="1.04" accessionNumber="psi-ms:12345" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <description>
    <admin>
      ...
    </admin>
    <instrument>
      <instrumentName>LCQ Deca XP</instrumentName>
      <source>
        <cvParam cvLabel="psi" accession="PSI:1000008" name="IonizationType" value="ESI"/>
      </source>
      <analyzerList count="1">
        <analyzer>
          <cvParam cvLabel="psi" accession="PSI:1000010" name="AnalyzerType" value="PaulIonTrap"/>
          <cvParam cvLabel="psi" accession="PSI:1000011" name="Resolution" value="2000"/>
          <cvParam cvLabel="psi" accession="PSI:1000013" name="Accuracy" value="0.2"/>
        </analyzer>
      </analyzerList>
      <detector>
        <cvParam cvLabel="psi" accession="PSI:1000021" name="DetectorType" value="ElectronMultiplier"/>
      </detector>
    </instrument>
    ...
  </description>
  ...
</mzData>
</nowiki></pre>
The importer needs access to the ontology and to some mapping information that states which parameter type triggers the creation of an object (e.g. name="IonizationType") and how the other parameters are mapped onto that object's attributes.
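The mapping information described here could take the form of two lookup tables. The following Python sketch is only an illustration under stated assumptions, not ProDB's importer: the table names, the attribute names, the "MALDI" value and the function map_cv_params are hypothetical. Only the class names and the "ESI" value come from the example above.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping configuration: which cvParam name triggers object
# creation, and which ProDB class each value selects. "ESI" comes from the
# example above; the "MALDI" value is an assumption.
OBJECT_TRIGGERS = {
    "IonizationType": {
        "ESI": "DB::Ionisation::Electrospray",
        "MALDI": "DB::Ionisation::Maldi",
    },
}

# Hypothetical mapping of other cvParam names onto object attribute names.
ATTRIBUTE_MAP = {
    "Resolution": "resolution",
    "Accuracy": "accuracy",
}


def map_cv_params(parent):
    """Return (class_name, attributes) for the cvParam children of parent."""
    class_name = None
    attributes = {}
    for param in parent.findall("cvParam"):
        name = param.get("name")
        value = param.get("value")
        if name in OBJECT_TRIGGERS:
            # Unknown values yield None so the caller can report them.
            class_name = OBJECT_TRIGGERS[name].get(value)
        elif name in ATTRIBUTE_MAP:
            attributes[ATTRIBUTE_MAP[name]] = value
        # cvParams listed in neither table are ignored in this sketch.
    return class_name, attributes


SOURCE_XML = """
<source>
  <cvParam cvLabel="psi" accession="PSI:1000008" name="IonizationType" value="ESI"/>
</source>
"""

source = ET.fromstring(SOURCE_XML)
print(map_cv_params(source))
```

Keeping both tables as plain data means that new ontology terms could be supported by editing the configuration rather than the importer code.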
The ontology doesn't exist as an OWL file yet, only as a tab-delimited list at [http://psidev.sourceforge.net/ontology/PSI_ontology_mzData_v0.0.htm]. As such, it cannot define parameter dependencies (e.g. if <cvParam name="AnalyzerType" .../> exists, there must also be a <cvParam name="Resolution" .../> element). Hopefully, the first proper release of the ontology as an OWL file will contain this kind of information. Otherwise we have to include it in our mapping configuration(s).
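If the dependencies do end up in the mapping configuration, a check along the following lines might be enough. This is a minimal Python sketch under assumptions: the table REQUIRED_WITH and the function missing_dependencies are hypothetical names, and the only rule shown is the AnalyzerType/Resolution example from the text.

```python
import xml.etree.ElementTree as ET

# Hypothetical dependency table: if the key is present among an element's
# cvParam names, every name in the value list must be present as well.
# The AnalyzerType -> Resolution rule is the example from the text.
REQUIRED_WITH = {
    "AnalyzerType": ["Resolution"],
}


def missing_dependencies(parent):
    """List (trigger, missing_name) pairs for the cvParam children of parent."""
    present = {p.get("name") for p in parent.findall("cvParam")}
    missing = []
    for trigger, required in REQUIRED_WITH.items():
        if trigger in present:
            for name in required:
                if name not in present:
                    missing.append((trigger, name))
    return missing


# An analyzer that declares its type but omits the resolution.
analyzer = ET.fromstring(
    '<analyzer>'
    '<cvParam cvLabel="psi" accession="PSI:1000010" name="AnalyzerType" value="PaulIonTrap"/>'
    '</analyzer>'
)
print(missing_dependencies(analyzer))
```

An importer could run such a check per element before creating any objects and reject or flag files that report missing parameters.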
== Further reading ==
* [http://psidev.sourceforge.net/ms/ PSI-MS: Mass Spectrometry Standards Working Group]
* [http://psidev.sourceforge.net/ms/xml/mzdata/mzdata.html mzData Schema Documentation]
* [http://psidev.sourceforge.net/ontology/index.html The PSI Ontology]
* [http://psidev.sourceforge.net/ontology/PSI_ontology_mzData_v0.0.htm PSI ontology mzData v0.0]
* [http://sourceforge.net/project/showfiles.php?group_id=65472 PSI Download Section (XML Schema and XML Examples)]
Latest revision as of 07:16, 26 October 2011