<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0"><channel><title>bbgm - the discussion - Latest Comments</title><link xmlns="http://www.w3.org/2005/Atom" rel="http://api.friendfeed.com/2008/03#sup" href="http://disqus.com/sup/all.sup#forumcomments-5f146156" type="application/json"/><link>http://mndoci.disqus.com/</link><description>At the interface of biotech and infotech</description><language>en</language><lastBuildDate>Thu, 10 Dec 2009 17:04:51 -0000</lastBuildDate><item><title>Re: Bioinformatics and mythology.  You still need to manage the data</title><link>http://mndoci.com/2009/12/09/bioinformatics-and-mythology-you-still-need-to-manage-the-data/#comment-25474190</link><description>Good blog post. Biologists and computer scientists need to work together and to realize that there are two parallel approaches that can be pursued separately or in tandem.&lt;br&gt;&lt;br&gt;1) Defining the biological problem and using technology to approach it.&lt;br&gt;2) Using technology to define biological problems that can be answered.</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Anirban</dc:creator><pubDate>Thu, 10 Dec 2009 17:04:51 -0000</pubDate></item><item><title>Re: Scaling out for analytics</title><link>http://mndoci.com/2009/12/04/scaling-out-for-analytics/#comment-24864347</link><description>I concur and think you really hit the nail on the head when you mentioned the need for domain-specific tooling. Non-engineers (and I'd even argue most engineers) will not speak raw MapReduce natively. Pig, Hive, and Cascading get the job done in an elegant fashion, but in my opinion, the greatest potential for broad adoption in informatics hinges on further abstraction.  Rethinking algorithms so that they take advantage of such infrastructures out of the box (think CrossBow and Bowtie) is a step in the right direction. Bundling and sharing AMIs is another step in the right direction. 
Baby steps, but steps nonetheless.</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">chrisbaglieri</dc:creator><pubDate>Sat, 05 Dec 2009 01:13:21 -0000</pubDate></item><item><title>Re: Academic software and infrastructure &amp;#8211; AKA more ranting</title><link>http://mndoci.com/2009/11/28/academic-software-and-infrastructure-aka-more-ranting/#comment-24345202</link><description>Sort of adding on to this discussion&lt;br&gt;&lt;br&gt;1. Open Source does not preclude commercialization.  You need to choose the appropriate license.  The reason companies buy software is to get support, custom patches, more input into the dev cycle etc.  There is an entire, well established model on how you can make this work.&lt;br&gt;&lt;br&gt;2. The one problem with niche software is that it requires you to have licensing fees, the kind no one is willing to pay which is why scientific software companies do not usually make money for most algorithmic software (you can make money on platforms, data management solutions, etc).   This makes open source even more attractive really as it broadens out development and results in better code.  Think of CHARMM.  I think one reason the software has not evolved in quality (it has had oodles of algorithms thrown in) is because it's not open source.  IMO the benevolent dictator model works really well in these cases</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">mndoci</dc:creator><pubDate>Mon, 30 Nov 2009 11:31:17 -0000</pubDate></item><item><title>Re: Academic software and infrastructure &amp;#8211; AKA more ranting</title><link>http://mndoci.com/2009/11/28/academic-software-and-infrastructure-aka-more-ranting/#comment-24333198</link><description>I agree with you Cameron, a researcher needs to do research,  not worry about funding mechanisms and such. 
It will be great when every researcher realizes that opening things up builds community in a way that almost guarantees continuity and maybe eventually profitability.&lt;br&gt;&lt;br&gt;It is also an unfortunate by-product of software being closed to non-academics that "hobbyists" and non-profit users get caught in the crossfire; there is nothing worse than alienating the motivated.&lt;br&gt;&lt;br&gt;But my point still is that to demand or even expect academics to adopt open source because the funds come from the public is unrealistic and impractical.&lt;br&gt;&lt;br&gt;From my own experience, the best crystallography software I use happens to be split down the middle: 50% of the packages are closed source and non-free for non-academics, and the rest are free and open source. The going is equally good for both and I cannot live without either. I would never expect the non-free software to adopt the model of the free, because what they have going just works!  If I have to stop using the non-free version because I move to a non-academic setting, then so be it! I still have the free version to fall back on.  &lt;br&gt;&lt;br&gt;I think the best way to persuade academics to adopt open source is to adopt a "Gandhian" attitude to persuasion. Contribute selflessly and hope they see the point. If you are a hobbyist, I would email the author and ask for permission to use it. If the email works, great; if not, just sit back and hope the software gets liberated sometime soon, or better yet fund an open source alternative.</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">harijay</dc:creator><pubDate>Mon, 30 Nov 2009 10:00:35 -0000</pubDate></item><item><title>Re: Academic software and infrastructure &amp;#8211; AKA more ranting</title><link>http://mndoci.com/2009/11/28/academic-software-and-infrastructure-aka-more-ranting/#comment-24324686</link><description>Disagree pretty violently with this. 
It is not the responsibility of researchers to determine funding mechanisms to ensure the effective continuity of research outputs; it is the responsibility of funders to ensure that their (public) money is effectively spent. And if that includes funding continuity for important software projects (or data projects, or specialist materials) then we have to have that discussion.&lt;br&gt;&lt;br&gt;Okay, realistically this is not going to happen, but given that the interest of the researchers is in having the software used, and in finding a way of keeping it supported, surely it would be preferable to allow it to be available for any non-profit use? The more people using it, the more likely it is for them to come up with something that might turn a profit - limiting use to people who have neither the time nor the inclination to actually make it useful seems farcical. &lt;br&gt;&lt;br&gt;And of course, open sourcing it would be even better in this regard :-)</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">openid-14292</dc:creator><pubDate>Mon, 30 Nov 2009 05:08:36 -0000</pubDate></item><item><title>Re: Academic software and infrastructure &amp;#8211; AKA more ranting</title><link>http://mndoci.com/2009/11/28/academic-software-and-infrastructure-aka-more-ranting/#comment-24299572</link><description>I recently tried to talk a friend out of going to grad school in bioinformatics (computer side), arguing that&lt;br&gt;&lt;br&gt;* since he had no interest in further academic work -- only in the knowledge gained and the signaling mechanism thereto -- grad school is vastly expensive for that purpose&lt;br&gt;* interesting parts of the field are so immature that huge chunks are accessible to the motivated amateur: e.g. Hadoop, command of the baseline machine learning toolkit&lt;br&gt;* armed with those tools, he can get a job to fill out his toolbox at a bioinformaticist's salary rather than at a grad student's salary &lt;br&gt;* much of the interesting 
research done on the computer-engineering end of bioinformatics is done in a commercial and not academic setting&lt;br&gt;&lt;br&gt;It's frustrating to be reminded of the many petty obstacles to amateur science -- not only academic-only software with no carve-out for the amateur scientist, but lack of access to journals and all the rest.&lt;br&gt;&lt;br&gt;Imagine if we had to prove our bona fides to contribute to Linux or view its mailing lists?</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">mrflip</dc:creator><pubDate>Sun, 29 Nov 2009 18:50:51 -0000</pubDate></item><item><title>Re: Academic software and infrastructure &amp;#8211; AKA more ranting</title><link>http://mndoci.com/2009/11/28/academic-software-and-infrastructure-aka-more-ranting/#comment-24282744</link><description>I was just saying that for Robetta or whatever to restrict usage to academia is perfectly acceptable.  Your grants get rejected, your funding dries up... what do you do? Rein in the give-it-all-away ideals and start getting pragmatic! Allow academics to use your software for free, and charge everyone else to use it! &lt;br&gt;&lt;br&gt;That said, open source works and works very well; there is no denying it. Numpy/Python, R, Ruby on Rails etc. succeed because they are open source and quite universally applicable. It's a different matter to open source a niche application that is used by a small group of people. I don't think it's fair to expect the same model to translate to other platforms or software. &lt;br&gt;&lt;br&gt;Although I can see your point that setting things free guarantees their long-term survival, helps keep quality up etc., it rarely pays the bills!  Accordingly, I think the best way to encourage open source is to start contributing to projects, and hit those PayPal buttons every time an open source project makes your work easier. 
I would rather do that than expect academics or government-funded projects to give things away every time.</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Hari</dc:creator><pubDate>Sun, 29 Nov 2009 12:47:37 -0000</pubDate></item><item><title>Re: Academic software and infrastructure &amp;#8211; AKA more ranting</title><link>http://mndoci.com/2009/11/28/academic-software-and-infrastructure-aka-more-ranting/#comment-24282604</link><description>I'd like to add that where there is value, companies will and do fund software development, either directly or through consortia, and that is fine.&lt;br&gt;&lt;br&gt;This half-way model is the one I have always had problems with, and more now than ever.  It also results in academics not appreciating open source.</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">mndoci</dc:creator><pubDate>Sun, 29 Nov 2009 12:42:43 -0000</pubDate></item><item><title>Re: Academic software and infrastructure &amp;#8211; AKA more ranting</title><link>http://mndoci.com/2009/11/28/academic-software-and-infrastructure-aka-more-ranting/#comment-24281892</link><description>Open source does not close avenues for commercialization.  Most of the current models (and I've worked in those for a long time) do not really work.  Well, perhaps for a few individuals, but not for the quality of the code and the improvement of science.  &lt;br&gt;&lt;br&gt;The reason is that almost none of the software by itself is worth that much in the first place, i.e. unique enough to be an absolute must.  Bioinformatics has done fine with very little closed source software (in fact, closed source has lost) and the places where money is paid are areas such as data management, not algorithms.  Pharma etc. will pay for custom development and would rather be contributors to the open source world.  Can you imagine if R were not open source?  Would all of you even be talking that much about it?  
Would it be half as successful? And it is actually broadly usable and has a lot of value.</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">mndoci</dc:creator><pubDate>Sun, 29 Nov 2009 12:28:15 -0000</pubDate></item><item><title>Re: Academic software and infrastructure &amp;#8211; AKA more ranting</title><link>http://mndoci.com/2009/11/28/academic-software-and-infrastructure-aka-more-ranting/#comment-24278650</link><description>One way out is for companies and individuals to fund science that they get value from.  If an academic site that had some value to the general community goes down, then the community should rally to write to the authors/maintainers and port it to a suitable platform like Google App Engine or an AWS account backed by a loosely constructed, PayPal-donation-backed foundation.  In these cash-strapped times, I would not blame any government entity that shuts down a service that costs money to keep running.&lt;br&gt;&lt;br&gt;Also along these lines, I don't think it is practical or right, given the funding situation, that academics be expected to close all avenues for commercialization by giving away the algorithms they develop. The markets involved are too small for academics to sit and wait for value to accrue from the application of any open-source model. The cash-rich entities in the equation, a.k.a. most big Pharma companies, should be expected to pay for software/algorithms developed with public funds. That helps academics stay afloat as government funding gets more and more scarce.</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">harijay</dc:creator><pubDate>Sun, 29 Nov 2009 11:23:00 -0000</pubDate></item><item><title>Re: The web becomes an even better platform for chemistry</title><link>http://mndoci.com/2009/11/25/the-web-becomes-an-even-better-platform-for-chemistry/#comment-24102834</link><description>Chempedia is a terrific project.  
Glad I found your blog.</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">labgrab</dc:creator><pubDate>Wed, 25 Nov 2009 18:26:21 -0000</pubDate></item><item><title>Re: A GPU community</title><link>http://mndoci.com/2009/11/23/a-gpu-community/#comment-23926936</link><description>Another major issue with restricting membership to people with a .edu email address is that you thereby exclude everyone who works at a university outside the US.</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">larsjuhljensen</dc:creator><pubDate>Tue, 24 Nov 2009 00:10:57 -0000</pubDate></item><item><title>Re: Revisting data and people</title><link>http://mndoci.com/2009/11/19/revisting-data-and-people/#comment-23541534</link><description>Deepak - Not really sure I agree with your diagnosis.  I can use myself as an example: why don't I know your musical friends?  I've read and enjoyed (and occasionally commented on) your occasional musical posts on FriendFeed.  But music isn't why I'm on FriendFeed, and so I never quite manage to cross the gap to start really chatting with your friends who are.  It's not that I wouldn't like to - I'm pretty sure I'd enjoy it a lot - nor that I'm unaware, but there's an awful lot of communities I'd like to join but don't quite have the time.  By way of contrast, programming is a major interest of mine, which is how I've ended up connected to a lot of developers on FriendFeed.</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Michael Nielsen</dc:creator><pubDate>Thu, 19 Nov 2009 11:04:26 -0000</pubDate></item><item><title>Re: Zeroing in on the public domain</title><link>http://mndoci.com/blog/2009/03/11/zeroing-in-on-the-public-domain/#comment-21600356</link><description>Hey Deepak,&lt;br&gt;&lt;br&gt;What about the Genome Commons? 
Steven Brenner is speaking about this at Advances Towards Personalized Medicine &lt;a href="http://lifescience.planetconnect.com/program/pmberkeley2009" rel="nofollow"&gt;http://lifescience.planetconnect.com/program/pm...&lt;/a&gt;. I'm very hopeful for open source science!&lt;br&gt;&lt;br&gt;Also, at BIL:PIL on Friday in San Diego, I heard about Open Science Summit 2010 from Joseph Jackson. No website yet.&lt;br&gt;&lt;br&gt;Cheers, Rita</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">ritalim8</dc:creator><pubDate>Sun, 01 Nov 2009 23:18:45 -0000</pubDate></item><item><title>Re: Matt&amp;#8217;s manifesto for a science data platform</title><link>http://mndoci.com/2009/10/28/matts-manifesto-for-a-science-data-platform/#comment-21396030</link><description>Hey folks,&lt;br&gt;&lt;br&gt;I'm really thrilled to see discussion coming together on this topic, and am trying to come up to speed on all the existing technologies, projects, and ontologies.  Jamie, thanks very much for the links and information about AnIML - looks fantastic, and I'll get in touch after I do some reading.&lt;br&gt;&lt;br&gt;I'm *particularly* pumped about this statement: "At the moment we have gained the attention of most of the big instrumentation manufactures and are in the process of wrapping up version 1.0 of the standard."&lt;br&gt;&lt;br&gt;I was also quite excited to read Cameron Neylon's "Head in the clouds: Re-imagining the experimental laboratory record for the web-based networked world" at &lt;a href="http://www.aejournal.net/content/1/1/3" rel="nofollow"&gt;http://www.aejournal.net/content/1/1/3&lt;/a&gt; - thoughts?</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">jasonmorrisontb</dc:creator><pubDate>Fri, 30 Oct 2009 18:57:54 -0000</pubDate></item><item><title>Re: Matt&amp;#8217;s manifesto for a science data platform</title><link>http://mndoci.com/2009/10/28/matts-manifesto-for-a-science-data-platform/#comment-21350170</link><description>There 
is a lot of work going on in this area at the moment.  I'm currently a member of a working group at NIST developing a new data format for analytical data - AnIML (Analytical Information Markup Language).&lt;br&gt;&lt;br&gt;One piece of advice when it comes to scientific data formats: look for existing efforts and contribute your skills there.  In the analytical data world we have ANDI, JCAMP, MzData etc. The AnIML project's goal is to take the good points from these standards while correcting their shortfalls.&lt;br&gt;&lt;br&gt;At the moment we have gained the attention of most of the big instrumentation manufactures and are in the process of wrapping up version 1.0 of the standard.&lt;br&gt;&lt;br&gt;This data format will be an ASTM standard when completed.  The standard will be free to use (i.e. "open"); the license is still pending but likely LGPL.&lt;br&gt;&lt;br&gt;If you forge out on your own to start a new format, please (please!) at least get in touch with the existing groups.  A lot of legwork has been done in this field and most people are willing to share.</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">jamcquay</dc:creator><pubDate>Fri, 30 Oct 2009 09:28:25 -0000</pubDate></item><item><title>Re: Matt&amp;#8217;s manifesto for a science data platform</title><link>http://mndoci.com/2009/10/28/matts-manifesto-for-a-science-data-platform/#comment-21255264</link><description>One reason for the lack of innovation in scientific software is the lack of software development skills among scientists: a small minority are as good as anyone in the world, but as we found last year [1,2], the vast majority are too busy learning and doing science to pick these skills up on their own, much less to create the "well designed, high quality programming interfaces" that Matt feels we need.  
I think this is the biggest roadblock to wider adoption of cloud computing and anything with "peta-scale" in its name.&lt;br&gt;&lt;br&gt;[1] &lt;a href="http://www.cs.utoronto.ca/%7Egvwilson/articles/amsci-survey-2009.pdf" rel="nofollow"&gt;http://www.cs.utoronto.ca/~gvwilson/articles/am...&lt;/a&gt;&lt;br&gt;[2] &lt;a href="http://www.cs.utoronto.ca/%7Egvwilson/articles/how-scientists-use-computers-2009.pdf" rel="nofollow"&gt;http://www.cs.utoronto.ca/~gvwilson/articles/ho...&lt;/a&gt;</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Greg Wilson</dc:creator><pubDate>Thu, 29 Oct 2009 07:22:50 -0000</pubDate></item><item><title>Re: The disconnect in funding data resources</title><link>http://mndoci.com/2009/10/18/the-disconnect-in-funding-data-resources/#comment-20871273</link><description>I guess part of the problem is with the grants given by the funding agencies, which include support for data generation but minimal or no long-term support for the people + hardware needed to maintain the generated data. Of course this problem did not exist 5-7 years ago, when nobody was cranking out 1000 Affy chips from a soybean population or generating metagenomic data.&lt;br&gt;&lt;br&gt;There's GenBank as a repository, but please read the word "repository" aloud again. I am saying that because when a small to medium-size lab gets a grant to generate a bunch of data (1000 Affy chips - I keep insisting on that example because of personal experience), it's not only about placing the data on an FTP site; it's about the analytics built around the data. For example, a website created as part of the grant by the lab generating the data, offering a mini-portal with some viewers, or algorithms to run on the data via a small computational back-end.&lt;br&gt;&lt;br&gt;Now when the grant runs out, nobody will maintain that website. 
NCBI is not a repository, so it's up to the users to grab the deposited data and find a way to analyze them.&lt;br&gt;&lt;br&gt;What is a possible solution? Well, to praise my own craft, put your Affy expression measurements or sequences on a data cloud (NCBI can become a data cloud - they have good hardware, but their API is not easy to work with and may not scale), and compute on your data using a SaaS approach. What this involves is machine images on the Amazon (or any other) compute cloud, with the appropriate software installed; those machines pull the data from the repository and run computations as needed.&lt;br&gt;&lt;br&gt;As William Gibson said, "The future is already here, it's not evenly distributed yet".</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">agbiotec</dc:creator><pubDate>Fri, 23 Oct 2009 13:53:05 -0000</pubDate></item><item><title>Re: Awesomeness</title><link>http://mndoci.com/2009/10/14/awesomeness/#comment-20343667</link><description>This is obvious, as noted.  Unfortunately, a lot of "awesomeness" gets pushed aside for marketing these days.  Our consumer society is teaching us that packaging and customer service are more important.  I'm glad to hear the "real thing" brought to the fore again.  The "awesomeness" of truth or craft understood thoroughly and communicated well still floats my boat a lot higher.  Thank you!</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Peggy Fahnestock</dc:creator><pubDate>Sun, 18 Oct 2009 12:40:08 -0000</pubDate></item><item><title>Re: Post Hadoop World thoughts</title><link>http://mndoci.com/2009/10/03/post-hadoop-world-thoughts/#comment-19805889</link><description>Likewise.  
And hopefully under less busy conditions.</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">mndoci</dc:creator><pubDate>Sat, 10 Oct 2009 16:15:07 -0000</pubDate></item><item><title>Re: Post Hadoop World thoughts</title><link>http://mndoci.com/2009/10/03/post-hadoop-world-thoughts/#comment-19801269</link><description>Thanks for the kind words, Deepak! It was great to meet you as well, and your words have certainly sent me off on a few intellectual escapades over the last few years. Looking forward to running into you more often soon.</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">jhammerb</dc:creator><pubDate>Sat, 10 Oct 2009 14:34:40 -0000</pubDate></item><item><title>Re: Blog away</title><link>http://mndoci.com/blog/2008/10/15/blog-away/#comment-19273215</link><description>I have no idea what I was thinking about at the time.  My memory is not that good :).&lt;br&gt;&lt;br&gt;I think I was speaking about blogs as a whole, rather than specifics.  I do agree that in the past year the quality of blogging has ratcheted up.</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">mndoci</dc:creator><pubDate>Tue, 06 Oct 2009 23:39:27 -0000</pubDate></item><item><title>Re: Blog away</title><link>http://mndoci.com/blog/2008/10/15/blog-away/#comment-19268971</link><description>What do you mean "blogs are superficial"?  I know of many blogs that write in GREAT detail about a particular subject.</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">P212121</dc:creator><pubDate>Tue, 06 Oct 2009 22:04:26 -0000</pubDate></item><item><title>Re: Notable</title><link>http://mndoci.com/2009/09/23/notable/#comment-17934300</link><description>I thought about this recently: &lt;a href="http://blog.jonudell.net/2009/07/09/understanding-wikipedia-notability/" rel="nofollow"&gt;http://blog.jonudell.net/2009/07/09/understandi...&lt;/a&gt;. 
People were impressed to see that I was a "notable inhabitant" of Keene, but from my perspective I was a pretty random choice. I thought I'd review Keene's revision history to show how random it actually was. But I got bogged down in the difficulty of analyzing the revision history.</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">openid-11344</dc:creator><pubDate>Thu, 01 Oct 2009 11:31:28 -0000</pubDate></item><item><title>Re: When HPC will not be the HPC you remember</title><link>http://mndoci.com/2009/09/26/when-hpc-will-not-be-the-hpc-you-remember/#comment-17847974</link><description>Re. the last one. I'm absolutely convinced that the biggest benefit of Amazon-style clouding is that you can do a 1000-CPU-hour job in 1 hour on 1000 CPUs for the same cost. But I don't think many people (in bioinformatics at least) have grokked this yet.&lt;br&gt;&lt;br&gt;Andrew.&lt;br&gt;&lt;br&gt;(Firefox spellchecker has grokked but not bioinformatics or CPUs... heh)</description><dc:creator xmlns:dc="http://purl.org/dc/elements/1.1/">Andrew Clegg</dc:creator><pubDate>Wed, 30 Sep 2009 10:59:16 -0000</pubDate></item></channel></rss>