Monday, January 28, 2008

Flexible Informatics for Linking Experimental Data to Mathematical Models Via DataRail

Link to Bioinformatics Paper

ABSTRACT
Motivation: Linking experimental data to mathematical models in
biology is impeded by the lack of suitable software to manage and
transform data. Model calibration would be facilitated and models
would increase in value were it possible to preserve links to training
data along with a record of all normalization, scaling, and fusion
routines used to assemble the training data from primary results.
Results: We describe the implementation of DataRail, an open
source MATLAB-based toolbox that stores experimental data in
flexible multi-dimensional arrays, transforms arrays so as to maximize
information content, and then constructs models using internal
or external tools. Data integrity is maintained via a containment hierarchy
for arrays, imposition of a metadata standard based on a
newly proposed MIDAS format, assignment of semantically typed
universal identifiers, and implementation of a procedure for storing
the history of all transformations with the array. We illustrate the
utility of DataRail by processing a newly collected set of ~22,000
measurements of protein activities obtained from cytokine-stimulated
primary and transformed human liver cells.
Availability: DataRail is distributed under the GNU General Public
License and available at http://code.google.com/p/sbpipeline/
Contact: sbpipeline@hms.harvard.edu
Supplementary information: accompanies this paper.
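
The Results paragraph above says that each array carries a universal identifier and the history of all transformations applied to it. As a rough illustration of that idea only, here is a minimal Python sketch. It is not DataRail's code (DataRail is a MATLAB toolbox), and every name in it (`DataArray`, `transform`, `scale_to_row_max`, the dimension labels, and the numbers) is a hypothetical placeholder chosen for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional
import uuid


@dataclass
class DataArray:
    """Hypothetical stand-in for a data array; NOT DataRail's actual structure."""
    name: str
    values: List[List[float]]            # rows = treatments, columns = signals
    dim_labels: Dict[str, List[str]]     # labels for each dimension
    history: List[str] = field(default_factory=list)   # transformations so far
    source_id: Optional[str] = None      # identifier of the array this came from
    uid: str = field(default_factory=lambda: uuid.uuid4().hex)

    def transform(self, label: str,
                  func: Callable[[List[float]], List[float]]) -> "DataArray":
        """Apply func to every row and return a NEW array.

        The original array is left untouched, and the new array records
        which array it was derived from and which step produced it.
        """
        new_values = [func(list(row)) for row in self.values]
        return DataArray(
            name=f"{self.name}/{label}",
            values=new_values,
            dim_labels=self.dim_labels,
            history=self.history + [label],
            source_id=self.uid,
        )


def scale_to_row_max(row: List[float]) -> List[float]:
    """Divide each value by the largest value in its row (made-up example step)."""
    top = max(row)
    return [v / top for v in row]


# Made-up numbers, purely for illustration.
raw = DataArray(
    name="raw",
    values=[[2.0, 4.0], [1.0, 5.0]],
    dim_labels={"treatment": ["treatment_A", "treatment_B"],
                "signal": ["signal_1", "signal_2"]},
)
scaled = raw.transform("scale_to_row_max", scale_to_row_max)
```

Because `transform` returns a new object instead of modifying the old one, the primary data stay intact and any derived array can be traced back through `source_id` and `history`.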
