EMMAWiki/TermsAndConcepts/ForUsers/Datasets: Difference between revisions
imported>MichaelDondrup No edit summary |
Latest revision as of 07:15, 26 October 2011
Terms and Concepts: Dataset
A dataset is just what the name says: a set of data, which in the special case of EMMA is a matrix of numbers. The data might be the output of external image analysis software such as ImaGene or GenePix, or a .cel file from the Affymetrix software. This type of dataset is called raw data, as it normally needs further processing to be useful. For a spotted cDNA microarray, a raw dataset must contain at least the foreground and background intensities for each channel. Information on the exact spot position is also often required, for instance to draw a false colour image of the microarray in which the individual spots can be clicked.
The rows of a raw dataset correspond to individual spots on the array, while the columns correspond to different types of measurements such as foreground, background and location values. These measurement types are called QuantitationTypes. You can think of them simply as the column headers of the matrix.
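The row and column layout described above can be sketched in a few lines of Python. This is only an illustration: the column names (`fg_ch1`, `bg_ch1` and so on), the values and the helper `column` are hypothetical and are not taken from EMMA.

```python
# Hypothetical QuantitationTypes, used as column headers of the matrix.
# EMMA's actual names may differ.
quantitation_types = ["fg_ch1", "bg_ch1", "fg_ch2", "bg_ch2", "row", "col"]

# Each row is one spot on the array; the values are made up for illustration.
raw_dataset = [
    [1200.0, 100.0, 800.0, 90.0, 0, 0],
    [300.0, 110.0, 950.0, 95.0, 0, 1],
    [640.0, 105.0, 610.0, 100.0, 1, 0],
]


def column(dataset, headers, name):
    """Return all values of one QuantitationType, one entry per spot."""
    idx = headers.index(name)
    return [row[idx] for row in dataset]


fg_ch1 = column(raw_dataset, quantitation_types, "fg_ch1")
```

Here `fg_ch1` holds the channel 1 foreground intensity of every spot, in row order.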
Several analysis methods, for instance Normalization, take raw datasets and produce new datasets by computation. These new datasets are called Transformed Datasets. In a Transformed Dataset the rows often no longer correspond to individual spots but to genes. Transformed Datasets have QuantitationTypes of their own; which ones are present depends on the chosen analysis method.
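As a rough sketch of how a computation can turn a raw dataset with one row per spot into a transformed dataset with one row per gene, the example below computes a background-corrected log2 ratio per spot and averages replicate spots of the same gene. This is a deliberately simple stand-in and not EMMA's Normalization method; the field names, the gene labels and the functions `log_ratio` and `transform` are assumptions made for illustration.

```python
import math
from collections import defaultdict

# Hypothetical raw data: one entry per spot, with the gene each spot represents.
raw_spots = [
    {"gene": "geneA", "fg_ch1": 1100.0, "bg_ch1": 100.0, "fg_ch2": 600.0, "bg_ch2": 100.0},
    {"gene": "geneA", "fg_ch1": 900.0, "bg_ch1": 100.0, "fg_ch2": 300.0, "bg_ch2": 100.0},
    {"gene": "geneB", "fg_ch1": 300.0, "bg_ch1": 100.0, "fg_ch2": 900.0, "bg_ch2": 100.0},
]


def log_ratio(spot):
    """Background-corrected log2 ratio of channel 1 over channel 2.

    Assumes both corrected intensities are positive; real data would
    need a guard for spots where background exceeds foreground.
    """
    ch1 = spot["fg_ch1"] - spot["bg_ch1"]
    ch2 = spot["fg_ch2"] - spot["bg_ch2"]
    return math.log2(ch1 / ch2)


def transform(spots):
    """Average the per-spot log ratios of all spots belonging to one gene."""
    per_gene = defaultdict(list)
    for spot in spots:
        per_gene[spot["gene"]].append(log_ratio(spot))
    return {gene: sum(vals) / len(vals) for gene, vals in per_gene.items()}


transformed = transform(raw_spots)
```

The result has a single new QuantitationType (the mean log ratio) and one row per gene instead of one row per spot.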
Each Dataset has to be assigned to at least one Experiment. Raw Datasets can also belong to several Experiments. A Raw Dataset is added to an Experiment automatically when its corresponding Arrays are assigned to that Experiment.
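The assignment rule can be modelled roughly as follows. The class `Experiment`, its method `assign_array` and the lookup table `datasets_by_array` are hypothetical names chosen for this sketch and do not reflect EMMA's internal data model.

```python
class Experiment:
    """Minimal stand-in for an experiment that collects arrays and raw datasets."""

    def __init__(self, name):
        self.name = name
        self.arrays = set()
        self.raw_datasets = set()

    def assign_array(self, array_name, datasets_by_array):
        """Assign an array; its raw datasets are added automatically."""
        self.arrays.add(array_name)
        self.raw_datasets.update(datasets_by_array.get(array_name, ()))


# Hypothetical lookup: which raw datasets were measured on which array.
datasets_by_array = {"array1": {"raw1"}, "array2": {"raw2"}}

exp_a = Experiment("A")
exp_b = Experiment("B")
exp_a.assign_array("array1", datasets_by_array)
exp_b.assign_array("array1", datasets_by_array)  # the same raw dataset now belongs to two experiments
exp_b.assign_array("array2", datasets_by_array)
```

After these calls `raw1` belongs to both experiments, while `raw2` belongs only to experiment B.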