EMMAWiki/TermsAndConcepts/ForUsers/Datasets: Difference between revisions

From BRF-Software
imported>MichaelDondrup
Revision as of 11:30, 19 April 2005

Terms and Concepts: Dataset

A dataset is just what the name says: a set of data, in this case a matrix of numbers. These data might be the output of external image analysis software like ImaGene or GenePix, or a .cel file from the Affy software. Datasets of this type are called raw data, as they normally need further processing to be useful. In the case of a spotted cDNA microarray, these datasets must contain at least the foreground and background intensities for each channel. Location information on the exact spot position is also often required, for instance to draw a false colour image of the microarray in which the individual spots can be clicked on.

While the rows of a raw dataset correspond to individual spots on the array, the columns correspond to different types of measurements, such as foreground, background and location values. These are called QuantitationTypes. You can think of them as simply being the column headers of the matrix.
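The row and column layout described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how EMMA stores its data: the quantitation type names, the spot identifiers, the numbers and the helper `column` are all made up for the example.

```python
# Hypothetical quantitation types: these act as the column headers of the matrix.
QUANTITATION_TYPES = [
    "ch1_foreground", "ch1_background",
    "ch2_foreground", "ch2_background",
    "x", "y",
]

# Hypothetical raw dataset: one row per spot, one value per quantitation type.
raw_dataset = {
    "spot_001": [1100.0, 100.0, 600.0, 100.0, 10, 10],
    "spot_002": [900.0, 100.0, 300.0, 100.0, 30, 10],
    "spot_003": [300.0, 100.0, 900.0, 100.0, 50, 10],
}


def column(dataset, quantitation_types, name):
    """Return the values of one quantitation type, keyed by spot."""
    idx = quantitation_types.index(name)
    return {spot: row[idx] for spot, row in dataset.items()}


ch1_background = column(raw_dataset, QUANTITATION_TYPES, "ch1_background")
```

Looking up a column by its quantitation type name, rather than by position, is what makes the "column header" view convenient: the same code works for any raw dataset that provides the required types.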

Several analysis methods, for instance Normalization, take raw datasets and produce new datasets by computation. These datasets are called Transformed Datasets. In Transformed Datasets, rows often no longer correspond to individual spots but to genes. Of course, such derived datasets have QuantitationTypes of their own. Which QuantitationTypes are actually present depends on the chosen analysis method.
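As a rough sketch of how a raw dataset might become a transformed one, the following Python example subtracts the background, computes a log2 ratio per spot and then averages replicate spots per gene. It is a deliberately simple stand-in and not the Normalization method used in EMMA; the spot-to-gene mapping, the numbers, the function `transform` and the resulting quantitation types `mean_log2_ratio` and `n_spots` are assumptions made for the example.

```python
import math
from collections import defaultdict

# Hypothetical raw rows:
# (spot, gene, ch1 foreground, ch1 background, ch2 foreground, ch2 background)
raw_rows = [
    ("spot_001", "geneA", 1100.0, 100.0, 600.0, 100.0),
    ("spot_002", "geneA", 900.0, 100.0, 300.0, 100.0),
    ("spot_003", "geneB", 300.0, 100.0, 900.0, 100.0),
]


def transform(rows):
    """Turn per-spot raw rows into per-gene rows with new quantitation types."""
    per_gene = defaultdict(list)
    for _spot, gene, fg1, bg1, fg2, bg2 in rows:
        signal1 = fg1 - bg1  # background-corrected intensity, channel 1
        signal2 = fg2 - bg2  # background-corrected intensity, channel 2
        if signal1 <= 0 or signal2 <= 0:
            continue  # skip spots where the log ratio is undefined
        per_gene[gene].append(math.log2(signal1 / signal2))
    # Rows are now genes; columns are the quantitation types of the new dataset.
    return {
        gene: {"mean_log2_ratio": sum(values) / len(values), "n_spots": len(values)}
        for gene, values in per_gene.items()
    }


transformed = transform(raw_rows)
```

Note how the result has different rows (genes instead of spots) and different columns than the input, which is the point made in the paragraph above.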