I saw a great tech talk at work today about Science Commons (by the way, I’m at Google over the summer writing code). James Boyle from Duke talked about the current state of scientific research and a few ideas for improving it.

The first thing that he talked about was a semantic search engine designed to pull together complex ideas from papers. Today, if I’m interested in Alzheimer’s and want to identify possible drug targets for treatment or model a cell signaling pathway, the first thing I’ll need to do is start with the literature to find out what’s been done. Sounds reasonable, right? It is, except for the hundred thousand papers related to a process like apoptosis (cell death) that I’ll be digging through until I need to find a cure for myself. The problem is compounded by the fact that useful information is spread across so many specialized disciplines that I can’t just ask my Alzheimer’s researcher buddy to break things down. Review papers can help somewhat, but you won’t necessarily find unexplored connections by reading them, and someone has to know enough to write them in the first place. Their solution is a search engine that finds actual connections by identifying cause-and-effect relationships in the text. He showed some interesting results, but you could tell that it has a very long way to go.
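To give a flavor of what that means, here’s a toy sketch (my own illustration, not their system) of the crudest possible version: matching causal cue phrases with regular expressions. A real engine would need parsing, entity normalization, and far more patterns, but it shows the idea of turning prose into cause/effect pairs.

    import re

    # Toy cause/effect extraction: scan a sentence for simple causal cue
    # phrases and record which entity is claimed to act on which.
    # (Illustrative only; nothing like the system shown in the talk.)
    CAUSAL_CUES = [
        r"(?P<cause>[\w\- ]+?) (?:induces|inhibits|activates|triggers) (?P<effect>[\w\- ]+)",
        r"(?P<effect>[\w\- ]+?) is regulated by (?P<cause>[\w\- ]+)",
    ]

    def extract_relations(sentence):
        """Return (cause, effect) pairs matched by any cue pattern."""
        relations = []
        for cue in CAUSAL_CUES:
            for m in re.finditer(cue, sentence):
                relations.append((m.group("cause").strip(), m.group("effect").strip()))
        return relations

    print(extract_relations("amyloid-beta accumulation triggers neuronal apoptosis"))
    # [('amyloid-beta accumulation', 'neuronal apoptosis')]

Multiply that across a hundred thousand papers and you could, in principle, query for chains of connections no single author ever wrote down.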

He talked about a few other things too, but what I found most interesting was the problem of sharing materials between labs. Being ‘scooped’ is on most researchers’ minds, and it’s definitely the most important problem limiting collaboration between labs. If I have a cell line that I think I can get two good papers out of, there is a lot of motivation to hold off on sharing it with other labs until I have finished my work on it. And that’s aside from the work required to prepare something for sharing in the first place. Their solution was to set up a formalized sharing system, so that in addition to things like publications and citations being considered in evaluations, the fact that lab X shared 800 embryos with other labs would also be a big plus on the CV.

I’ve thought a lot about this problem of giving people the motivation to share and haven’t come up with a great solution, so it was especially interesting to hear James speak. If there’s a good personal relationship where everyone will receive due credit, it’s easy, but how can someone with a great idea in Japan coordinate with a fabrication wizard at Cornell when they don’t even know each other yet? There must be a way to provide an incentive to share ideas and materials (anything from mask layouts to microfluidic devices to cell lines) and to ensure their quality. Once that’s accomplished and you can put together materials from many different people in a black-box fashion, output will really shoot off the charts. And then that perfect search engine will really be important…

The final thing that he touched on was the current debate over public access to journal articles funded by the federal government. The huge journal subscription fees are mainly justified by the journals’ management of the peer review and editorial process. But when you think about it, why does that really take so much work? If an online system were set up that all grant awardees were required to enroll in with their specialties, it would be incredibly easy, and probably a lot less biased, to distribute articles for review that way. And then there’s even the option of open pre-publication review, which Nature has flirted with lately, but that raises the specter of scooping again.
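Just to sketch what that routing could look like (my own toy illustration with made-up names, not anything proposed in the talk): every awardee registers their specialties, and a submission goes to the reviewers whose specialties overlap it most.

    # Hypothetical reviewer matching: grant awardees register specialties,
    # and a submission is routed to the reviewers with the most overlap.
    # All names and topics here are invented for illustration.
    reviewers = {
        "a.tanaka": {"microfluidics", "cell signaling"},
        "b.jones": {"apoptosis", "neurodegeneration"},
        "c.garcia": {"apoptosis", "cell signaling"},
    }

    def match_reviewers(paper_topics, pool, n=2):
        """Rank registered reviewers by how many of the paper's topics they cover."""
        ranked = sorted(pool, key=lambda name: len(pool[name] & paper_topics), reverse=True)
        return ranked[:n]

    print(match_reviewers({"apoptosis", "neurodegeneration"}, reviewers))
    # ['b.jones', 'c.garcia']

The hard parts are obviously policy and conflict-of-interest screening, not the matching itself, which is kind of the point.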

If you aren’t familiar with Creative Commons (say, you’ve never used flickr), you should check it out if you have a second. Also, the actual tech talk will hopefully be online in a few days.