EMMAWiki/AdministratorDocumentation/ArraylayoutGuide: Difference between revisions

From BRF-Software
imported>MichaelDondrup
No edit summary

Revision as of 11:39, 8 November 2006

Arraylayout Guidelines

An arraylayout is essential for your microarray experiment: all further processing and analysis is based on it, so a wrong layout leads to wrong calculations. To provide MIAME-compliant annotations, the mandatory fields of the ADF format for EMMA are the same as those required by ArrayExpress.

Please consult the ADF Guidelines at the EBI for detailed descriptions, and the ADF Checklist at the EBI for a quick overview, before you upload your ADF file.

If you have a layout in the GAL format, it can easily be converted to the ADF format using the spotter file to ADF converter at the EBI. Do not expect this conversion alone to produce a valid ADF file: you will need to add further mandatory columns.

Use the ADF checker provided by the EBI to iteratively improve your file.

You need the role Chief or Maintainer within your project to upload ADF files to EMMA. If you do not have one of these roles, check with the Maintainer of your project to see how to upload your ADF.

Hints

  • The ADF file needs to be in plain text format. In Excel, export the sheet as tab-delimited text rather than comma-separated CSV.
  • Use only tab-delimited files
  • Quotes ("" or '') around fields will not work with the ADF checker.
  • Reporter identifiers must consist only of the characters A-Z, a-z, 0-9, _, -, + and :. In particular, reporter identifiers may not contain ( ) [ ] ; or tab characters.
  • Do not try to conceal information on your array design (e.g. reporter sequences, annotation). You will need it later anyway. The information you try to hide from the world is relevant only for yourself and your workgroup, and it is a mistake to assume that somebody in your project is going to steal your reporter sequences.
  • Double check that Reporter Identifiers are correct. If you have the same reporter spotted in replicates, it needs to have the identical Reporter Identifier at that position. Reporter names or other details are irrelevant for this.
  • Use Reporter [[BioSequence]] Database Entry columns whenever possible. The entries will be shown as hyperlinks in the interface if you provide a proper accession. You can provide multiple DB entry columns. Make sure the accession keys work when you query the database manually with your web browser. Leave the field empty only if there is no DB accession for this sequence.
  • If you have a list of BRIDGE links that provide an external reference to a sequence in GenDB or SAMS (also known as o2xr:// links), you may use Reporter [[BioSequence]] Database Entry [GenDB] or Reporter [[BioSequence]] Database Entry [SAMS] to denote them. Instead of creating a regular database entry, a BRIDGE reference for the sequence is made. Include the full BRIDGE URI, not only the ID number.
  • Empty locations may not be omitted in the layout. Use the Reporter Identifier Empty to mark those locations. Set Reporter Group [role] to Control and ControlType to control_empty.
  • Use the Reporter Comment field to add additional annotation to your Reporter. This will appear as a Reporter description in EMMA.
  • You can leave uncommon fields in your file as a reference; they will be appended to each Reporter description in the form [[FieldName]]=Value.
  • Remember, an ADF file has to be uploaded only once for each ArrayLayout (or array type), not for each individual array.
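
Several of the hints above (tab-delimited fields, no quotes around fields, the allowed characters in reporter identifiers) can be checked locally before you run the ADF checker at the EBI. The following Python sketch is only an illustration and does not replace the EBI checker. The function name precheck_adf is hypothetical, and the sketch assumes that the first line of the file is the header row and that the identifier column is headed "Reporter Identifier"; adjust both to your file.

```python
import re

# Allowed characters, taken from the hints above: A-Z, a-z, 0-9, _ - + :
VALID_ID = re.compile(r"[A-Za-z0-9_+:-]+")


def precheck_adf(text, id_column="Reporter Identifier"):
    """Return a list of (line_number, message) problems found in ADF text.

    Assumption: the first line is the header row and contains id_column.
    """
    lines = text.splitlines()
    if not lines:
        return [(0, "file is empty")]

    header = lines[0].split("\t")
    if id_column not in header:
        return [(1, "column %r not found; is the file tab-delimited?" % id_column)]
    id_index = header.index(id_column)

    problems = []
    for line_number, line in enumerate(lines[1:], start=2):
        fields = line.split("\t")
        if len(fields) != len(header):
            problems.append(
                (line_number, "expected %d fields, found %d" % (len(header), len(fields)))
            )
            continue
        # Fields wrapped in "" or '' are reported, since the hints forbid them.
        for field in fields:
            if len(field) >= 2 and field[0] in "\"'" and field[-1] == field[0]:
                problems.append((line_number, "quoted field: %s" % field))
        # The identifier must consist only of the allowed characters.
        identifier = fields[id_index]
        if not VALID_ID.fullmatch(identifier):
            problems.append((line_number, "invalid reporter identifier: %r" % identifier))
    return problems
```

To use it, read your file as text and print whatever is reported, for example by passing the contents of open("my_layout.adf").read() (a hypothetical file name) to precheck_adf. An empty result only means that these few rules were not violated; mandatory columns and all other requirements still need the ADF checker.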