Monday, January 28, 2008
Flexible Informatics for Linking Experimental Data to Mathematical Models Via DataRail
ABSTRACT
Motivation: Linking experimental data to mathematical models in
biology is impeded by the lack of suitable software to manage and
transform data. Model calibration would be facilitated and models
would increase in value were it possible to preserve links to training
data along with a record of all normalization, scaling, and fusion
routines used to assemble the training data from primary results.
Results: We describe the implementation of DataRail, an open
source MATLAB-based toolbox that stores experimental data in
flexible multi-dimensional arrays, transforms arrays so as to maximize
information content, and then constructs models using internal
or external tools. Data integrity is maintained via a containment hierarchy
for arrays, imposition of a metadata standard based on a
newly proposed MIDAS format, assignment of semantically typed
universal identifiers, and implementation of a procedure for storing
the history of all transformations with the array. We illustrate the
utility of DataRail by processing a newly collected set of ~22,000
measurements of protein activities obtained from cytokine-stimulated
primary and transformed human liver cells.
Availability: DataRail is distributed under the GNU General Public
License and available at http://code.google.com/p/sbpipeline/
Contact: sbpipeline@hms.harvard.edu
Supplementary information: accompanies this paper.
Thursday, January 24, 2008
Integrative Bioinformatics 5
January 24 - hands-on session developing a client for the UCSC Genome Browser (everybody):
- Quick introduction to REST vs SOAP, plus a revisit of their precedents (CORBA, email, port channelling, etc.)
- Quick introduction to the upcoming database and computational statistics modules.
- Hands-on session.
HOMEWORK
Don't forget last session's: write a MATLAB function that reads an HTML table into a cell array.
Tuesday, January 22, 2008
Integrative Bioinformatics 4
January 22 - Data structures and data services (Pablo, Jonas):
- Extracting data structures from data services.
- Attribute/Value pairs --> XML --> RDF triples
- Regular Expressions.
- UCSC Genome Browser as a data service providing an aggregating data structure. For online tutorials see this page.
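The progression from attribute/value pairs to XML to triples listed above can be sketched in a few lines. This is only an illustration, written in Python rather than the MATLAB used in class; the function names (`pairs_to_xml`, `xml_to_triples`), the `record` element, and the sample gene record are hypothetical, and the triples are plain tuples rather than a real RDF serialization, which would use URIs for subjects and predicates.

```python
import xml.etree.ElementTree as ET


def pairs_to_xml(subject_id, pairs):
    """Wrap attribute/value pairs in an XML element (hypothetical layout)."""
    record = ET.Element("record", {"id": subject_id})
    for name, value in pairs.items():
        child = ET.SubElement(record, name)  # attribute name becomes the tag
        child.text = str(value)              # attribute value becomes the text
    return record


def xml_to_triples(record):
    """Flatten the element into (subject, predicate, object) tuples."""
    subject = record.get("id")
    return [(subject, child.tag, child.text) for child in record]


# Illustrative record only; not taken from any data service.
pairs = {"symbol": "TP53", "chrom": "chr17"}
record = pairs_to_xml("gene1", pairs)
xml_text = ET.tostring(record, encoding="unicode")
triples = xml_to_triples(record)
```

Each step keeps the same information but makes the subject more explicit: the pairs leave it implied, the XML carries it as an `id`, and every triple states it outright.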
HOMEWORK
Write a MATLAB function that reads an HTML table into a cell array.
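As a rough sketch of the logic involved, here is the same idea in Python rather than MATLAB, using only the standard library's `html.parser`. The names `TableReader` and `read_html_table` are made up for this example, and the nested list it returns stands in for a cell array. It assumes simple, non-nested tables and collects the rows of every table on the page into one list.

```python
from html.parser import HTMLParser


class TableReader(HTMLParser):
    """Collect the text of each <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # row currently being filled, or None
        self._cell = None   # text fragments of the current cell, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def read_html_table(html):
    """Return the table rows in `html` as a list of lists of strings."""
    reader = TableReader()
    reader.feed(html)
    reader.close()
    return reader.rows
```

For example, `read_html_table("<table><tr><td>a</td><td>b</td></tr></table>")` gives one row with two cells. Markup inside a cell, such as links or bold text, is reduced to its text content.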
Thursday, January 17, 2008
Integrative Bioinformatics 3
1. Discussion of matrix notation using the homework assignment. [my solution].
2. Alignment as a similarity metric. [Presentation].
3. Discussion of collective assignment on developing a client that will use UCSC Genome Browser as a data service.
HOMEWORK
Since you did so well in the introductory class, today we move to an advanced algorithm deployment assignment. The homework is described in the last slide of the presentation.
Tuesday, January 15, 2008
Integrated Bioinformatics 2008 2
This class will introduce the two main components of the integrative exercise: data structures and programming languages. The exploration of these two topics will be pursued in MATLAB, a fast prototyping scientific and engineering programming environment.
In addition to the very extensive help material that comes with MATLAB (from manuals to videos; click on "Help" in the top menu to find more), MathWorks' website also includes a great selection of webinars.
HOMEWORK
Today we have a small homework assignment just to make sure we all know how to send them to me: write an m-function that identifies the largest element of a matrix and returns its location.
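To make the task concrete, here is a minimal sketch of the logic in Python rather than as an m-function. The name `largest_element` is made up for this example, and the matrix is assumed to be a non-empty list of equal-length rows. Indices are 0-based here, whereas MATLAB counts from 1, and every position is returned when the maximum occurs more than once.

```python
def largest_element(matrix):
    """Return (max_value, locations) for a non-empty list-of-lists matrix.

    `locations` is a list of (row, column) pairs, 0-based, one for each
    occurrence of the maximum value.
    """
    if not matrix or not matrix[0]:
        raise ValueError("matrix must be non-empty")
    best = max(max(row) for row in matrix)
    locations = [
        (i, j)
        for i, row in enumerate(matrix)
        for j, value in enumerate(row)
        if value == best
    ]
    return best, locations
```

Calling `largest_element([[1, 5], [5, 2]])` reports the value 5 at two positions, which is why the function returns a list of locations and not a single pair.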
Monday, November 26, 2007
Upcoming event: 2007 Computational & Theoretical Biology Symposium: Dec. 7-9 at Rice University
It's worth checking out: registration is free and the speakers will be talking about interesting topics. I just don't know why they're holding it the week before finals, including at Rice...
************************
We would like to invite you to attend the 4th annual Computational &
Theoretical Biology Symposium at Rice University. It will be held from
December 7 - 9 and features invited talks of more than 20 speakers from
leading institutions across the US. Participants at this annual
symposium will gain new insights into a variety of approaches in
theoretical methods of statistical mechanics, nonlinear dynamics, and
systems biology that are being developed and applied to study and
manipulate nature.
Admission to the symposium is free and open to everyone; registration is
not required. To learn more about the hosts/venue, program and invited
speakers, visit http://ctbs.rice.edu.
Student and Postdoc participants are encouraged to present in a poster
session to be held on Saturday, December 8th from 1:00 - 2:30 pm and
during the coffee break. Presenters are asked to send an e-mail to
ctbs@mailman.rice.edu with
poster details by Thursday, November 29th so that logistical
arrangements can be made.
Best Regards,
Symposium Organizing Committee