Integrative Bioinformatics Laboratory

Monday, May 28, 2007

AMIA 2007 Spring Congress Synopsis

This year's AMIA Spring Congress brought together researchers from many fields, including the medical community, nurses, informaticians, basic (bench) researchers and some industry representatives, all with a common goal: translating biomedical knowledge into medical practice. The meeting included 5 tracks, ranging from nursing informatics to clinical decision support, personalized health records and translational research, so it was a good mix of several domains, each with its own challenges and methodologies.
Adam Bosworth, vice-president of Google, opened the meeting with a blast. His vision of how health-related information will/should be handled in the not-so-distant future was mesmerizing, and his insight into how Google is prepared to address this challenge came as a surprise. Here are his notes.
The translational research informatics track was one of the most interesting. It became clear that CTSA has had, and is still having, a key role in the development of a new science: Integrative Bioinformatics. Several CTSA awardees made their voices heard and lots of ideas flew around the room, from tools already being developed and evaluated to promising tools still in the planning stage. Integration was the keyword and the main challenge.
caBIG was also given a chance to give its 2 cents, but the most enthusiastic participants seemed to rely mostly on in-house customized tools. There were plenty of semantic web technology aficionados, including users of Protege, but no Semantic or Sloppy Databases (except ours :D).
Christopher Chute also made his presence felt; his take on the "Chasm of Semantic Despair" between basic and clinical research was particularly insightful.

Tuesday, May 15, 2007

IBLabook

An internal web-based resource for IBL people can be accessed here: IBLabook. If you have problems accessing it, think you should be given access, or have recommendations for changes to what is available there, please bring it up at the weekly lab meeting, Tuesdays at 9 am in HMB 13.304.

Thursday, May 10, 2007

Integrative Bioinformatics 7

XML-mediated interoperability between data structures. The basic idea is very simple: [environment/structure 1] --> XML --> [environment/structure 2]. A very powerful suite of tools, formal and computational, has been developed to deal with the mediation, so we have plenty to keep you busy for a couple of hours.
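
The [structure 1] --> XML --> [structure 2] idea can be sketched in a few lines. The class itself works in Matlab; the sketch below uses Python only because it is easy to run anywhere, and the record fields (patient, id, tissue, assay) and function names are made up for illustration. It assumes a nested structure whose leaves are strings and whose field names are unique at each level.

```python
import xml.etree.ElementTree as ET


def struct_to_xml(name, value):
    """Turn a nested dict (leaves are strings) into an XML element."""
    element = ET.Element(name)
    if isinstance(value, dict):
        for key, child in value.items():
            element.append(struct_to_xml(key, child))
    else:
        element.text = str(value)
    return element


def xml_to_struct(element):
    """Inverse mapping: element with children -> dict, leaf -> its text."""
    children = list(element)
    if not children:
        return element.text or ""
    return {child.tag: xml_to_struct(child) for child in children}


# Hypothetical record standing in for [environment/structure 1].
sample = {"patient": {"id": "P001", "tissue": "ovary"}, "assay": "RPPA"}

# structure 1 --> XML
xml_text = ET.tostring(struct_to_xml("record", sample), encoding="unicode")

# XML --> structure 2 (here the same kind of structure, rebuilt from text)
recovered = xml_to_struct(ET.fromstring(xml_text))

print(xml_text)
print(recovered == sample)
```

This simple mapping loses information in the general case: repeated field names would overwrite each other and every leaf comes back as a string, which is part of why dedicated mediation libraries exist.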

For the Matlab-centric exercises please note the xmlread, xmlwrite and xslt commands. Have a go at the W3C links listed in the help files of those commands. In today's class we are going to cheat and use tools that rely on Matlab structures and/or can manipulate them using XPath. I'll also explain why this is a very popular and useful cheat in any language. Note also that the best libraries are often produced by people with the same problems as you. For example, see these two: XML Toolbox and XML Parser. I also wrote a library to deal with XML mediation, XML4MAT, but one of you (no names please) told me it is ugly.
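
To give a feel for what XPath buys you, here is a small sketch, again in Python rather than Matlab and with an invented document (the experiment/sample/value element names are assumptions). It relies on the limited XPath subset supported by Python's standard xml.etree.ElementTree, not on a full XPath engine.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; element and attribute names are illustrative only.
xml_text = """
<experiment>
  <sample id="s1" type="tumor"><value>2.5</value></sample>
  <sample id="s2" type="normal"><value>1.0</value></sample>
  <sample id="s3" type="tumor"><value>3.5</value></sample>
</experiment>
"""

root = ET.fromstring(xml_text)

# Path expression with a predicate: only samples whose type is "tumor".
tumor_samples = root.findall("./sample[@type='tumor']")
tumor_ids = [sample.get("id") for sample in tumor_samples]

# The same predicate, followed by one more step down to the values.
tumor_values = [
    float(node.text) for node in root.findall("./sample[@type='tumor']/value")
]

print(tumor_ids)
print(tumor_values)
```

The point of the cheat is that a single path expression replaces a hand-written loop over the nested structure.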

Thursday, May 03, 2007

Integrative Bioinformatics 6

Modelling Strings

Today we have a hands-on tutorial on modelling strings and how you can use them to represent and/or retrieve structured information. This will be a class on menial Bioinformatics.

Modelling strings relies heavily on regular expressions. Spend some time getting familiar with the concepts of patterning, lazy and greedy quantifiers and tokens. Try the regexp function with all 5 output arguments to see what is there.
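
As a quick illustration of greedy versus lazy quantifiers and of tokens, here is a sketch using Python's re module rather than Matlab's regexp; the input string is made up. The last lines pull out the start, end, match and token of the first hit, which is roughly the kind of information the different regexp outputs give you.

```python
import re

# Hypothetical string with two tagged gene names.
text = "<gene>TP53</gene><gene>BRCA1</gene>"

# Greedy: ".*" grabs as much as possible, so both genes land in one match.
greedy = re.findall(r"<gene>.*</gene>", text)

# Lazy: ".*?" grabs as little as possible, giving one match per gene.
lazy = re.findall(r"<gene>.*?</gene>", text)

# Tokens: parentheses capture only the part of the match we care about.
tokens = re.findall(r"<gene>(.*?)</gene>", text)

# Position, full match and token for the first hit.
first = re.search(r"<gene>(.*?)</gene>", text)
start, end = first.start(), first.end()
match_text = first.group(0)
token_text = first.group(1)

print(greedy)
print(lazy)
print(tokens)
print(start, end, match_text, token_text)
```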

Wednesday, April 25, 2007

Integrative Bioinformatics 5

Case Study: Ovarian Cancer data integration at the Kleberg Foundation

NOTE THE LOCATION FOR THIS CLASS IS IN THE SOUTH CAMPUS
(SCRB1. conference room 4)

[Index of classes]

Let's get serious: you now have data structures that describe your (or at least Katherine's) data, so the time has come to show you how it all fits together. Instead of having the class in HMB, this time you are invited to join a workshop about this topic in the South Campus. We'll do two things:

1. I'll post a sample solution of a script that would assemble a data structure for K's data. The idea is not that my solution is the best; it is just to encourage you to post your own m-files assembling alternative data structures. This way next week we'll be generating multiple XML structures from the different solutions.

2. You are asked to listen to my short presentation at 2:10 pm and encouraged to participate in the follow-up discussion. Here's the full agenda for the afternoon in case you can and want to hear the full story:

Workshop at the Kleberg Center for Biomarker Discovery, South Campus Research Building (SCRB1, conference room 4), behind the welcome desk to the right upon entering the building.
01:15 pm - 01:30 pm
Dr. Rahul Mitra – Kleberg Center
Introduction

01:35 pm - 02:05 pm
Dr. Bryan Hennessy – Kleberg Center
Molecular Medicine Studies on ovarian and breast cancers at Kleberg Center

02:10 pm - 02:40 pm
Dr. Jonas Almeida – Kleberg Center
Data collection and analysis at Kleberg Center

02:45 pm - 03:45 pm
All: Open discussion

Wednesday, April 18, 2007

Integrative Bioinformatics 4

XML constructs

What is XML?
In this class we are going to learn the main features of XML and how it can be used to build data structures (including some insights on the homework ;-) ). XML is one of the most popular and widely used representation languages on the web. Development started in 1996 (originally under the SGML Editorial Review Board), XML 1.0 became a W3C Recommendation in 1998, and it has been widely extended and revised since, most recently in 2006. Due to its flexibility, many standards development teams in biomedical research have relied on XML to implement knowledge domain representations, such as MAGE-ML, MiniML, SBML, AGML, etc.
On a different line of thought, we are also going to learn how to use S3DB through the API, by building (surprisingly!!) an XML structure that resembles SQL.

For more on XML visit the W3C recommendation and the tutorial.

Google presentations

Google is adding presentations to their Docs and Spreadsheets package.
"Presentations" is a new addition to Google Docs and Spreadsheets (GOffice), where docs are saved to an online storage facility and can be accessed from any computer. Owners of the documents can edit docs and can grant access to collaborators, who are able to modify docs and invite more collaborators, and to viewers, who can only view the most recent version of the documents or spreadsheets. It seems that about 99% of the time GOffice can replace MS Office. On top of that, GOffice has "intarwebbiness", is free, and is so easy to share!