Olivier Elemento’s weblog

Olivier’s science weblog

Mining the Deep Web February 23, 2009

Filed under: Uncategorized — oelemento @ 8:49 pm

There’s an interesting article in the NY Times about the Deep Web, that is, the data that is stored in databases and available through web interfaces. The article mentions some of the strategies that scientists (and web search companies) use to mine the Deep Web. Essentially, these strategies involve making a few initial queries to guess the type and structure of the data contained in a given database, then either building a model of the data or making many more targeted queries to map out the content of the database. This type of research is particularly interesting for us biologists, since we use many such databases (PubMed, genome browsers, gene expression databases, etc.), and these databases are not at all connected with each other. Clearly, tools that automatically query and integrate data from the Deep Web would be very useful for us.

http://www.nytimes.com/2009/02/23/technology/internet/23search.html
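To make the "a few queries first, then many targeted ones" idea concrete, here is a minimal Python sketch. It is not taken from the article: the tiny in-memory RECORDS table, the search() function that stands in for a web form, and the probe() loop are all hypothetical, and real systems are far more sophisticated. The sketch starts from a seed keyword, collects the records that come back, and reuses the words in those records as follow-up queries.

```python
from collections import deque

# Hypothetical stand-in for a database hidden behind a web form: it only
# answers keyword queries and returns at most `limit` records per query.
RECORDS = {
    1: "p53 tumor suppressor human",
    2: "p53 mouse knockout",
    3: "mouse circadian clock gene",
    4: "circadian rhythm drosophila",
    5: "yeast ribosome biogenesis",
}

def search(keyword, limit=2):
    """Return up to `limit` (id, text) records containing the exact keyword."""
    hits = [(rid, text) for rid, text in sorted(RECORDS.items())
            if keyword in text.split()]
    return hits[:limit]

def probe(seed_terms, max_queries=50):
    """Map out the database by issuing queries built from words seen so far."""
    seen_records = {}          # record id -> text, everything retrieved so far
    tried = set()              # keywords already sent to the database
    queue = deque(seed_terms)  # keywords still to try
    queries = 0
    while queue and queries < max_queries:
        term = queue.popleft()
        if term in tried:
            continue
        tried.add(term)
        queries += 1
        for rid, text in search(term):
            if rid not in seen_records:
                seen_records[rid] = text
                # Words in newly found records become follow-up queries.
                queue.extend(w for w in text.split() if w not in tried)
    return seen_records, queries

found, n_queries = probe(["p53"])
print(sorted(found), n_queries)
```

In this toy example the crawl that starts from "p53" reaches records 1 to 4 but never finds record 5, because no word links it to the others. That is one reason the choice of initial queries matters so much in this kind of approach.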

 

Poplar genome now supported in FIRE February 5, 2009

Filed under: FIRE, genome — oelemento @ 6:41 pm

My lab doesn’t work on plants, but following a request from someone in Malcolm Campbell’s lab at U of Toronto, I’ve just added poplar to the list of organisms supported by FIRE.

The genome data comes from http://genome.jgi-psf.org/Poptr1_1/

To analyze Poplar expression data, you’ll need to download FIRE, then download the poplar_data.zip file from http://tavazoielab.princeton.edu/FIRE/

I tested it on a clustered dataset of poplar tissue expression profiles (http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE6422), and I got very impressive results. The test expression profile is included in the EXPFILES directory of the poplar_data.zip file.

 

The US Senate wants to increase NIH funding February 4, 2009

Filed under: Funding — oelemento @ 6:03 pm

Update Feb 18th. Here is more information about how the stimulus money is going to be spent, from Nature:

“National Institutes of Health: $7.4 billion divided among the agency’s scientific institutes and centres will fund grants from a backlog of 14,000 investigator-initiated ‘R01’ grants already reviewed and categorized as “highly meritorious”. It will also fund new R01 applications for projects that could reasonably make progress in two years. The agency will add supplemental funding to existing grants and fund new “challenge grants” aimed at thorny problems.

Kington notes: “We are being very careful to focus on funding that only covers the two years of the stimulus package. There will be relatively little, if any, money that entails a four-year commitment.””

Source: http://www.nature.com/news/2009/090218/full/457942a.html