Ten simple rules for the open development of scientific software

Posted Dec 31, 2012 21:00 UTC (Mon) by Trelane (subscriber, #56877)
In reply to: Ten simple rules for the open development of scientific software by brooksmoses
Parent article: Ten simple rules for the open development of scientific software

This assumes that the current model, namely develop in private and throw over the wall, is the model to go for, instead of active collaboration between similar research groups. A collaborative model would be much more sustainable, since presumably the PIs or senior post docs would have been involved with the overarching project long enough that they are truly co-maintainers of the project.

Unfortunately, I suspect that, due to competition over increasingly scarce funding coupled with the lack of interest in the software compared to the papers produced with the software, the situation is not going to improve in the foreseeable future. Plus the PIs are used to fiefdoms within their domain, not collaboration with similar groups.

--a former solid-state physics post-doc.
(A PI I worked with stated that there was (paraphrasing from memory) no line in grant applications or progress reports for lines of code written. Rather, it's all papers. Seeing as how I didn't have nearly enough of those, I decided it was in my family's best interest to move into high-performance software in an industry setting. :)


Ten simple rules for the open development of scientific software

Posted Dec 31, 2012 21:34 UTC (Mon) by brooksmoses (subscriber, #88422)

I would disagree that a lot of my points rely on that assumption. My conclusion does rely on it to some extent -- you are in essence arguing that writing reusable-quality code and building a community around it produces benefits in terms of reduced programming effort from being able to reuse the work of one's collaborators, and this partly offsets the extra costs; I had neglected that potential offset. And that's a very valid point.

However, I think that my general argument still holds. A model of active collaboration on code development between similar research groups will require that the research groups develop high-quality code, and that means (a) a lot of additional effort in making the code reusable by others, and (b) the graduate students who are writing the code need to have training, mentoring, and code review that they currently aren't getting -- and which the current structure generally doesn't have anybody with skills or time to provide. And, (c) you also get politics of maintainership when people have different ideas of where the code should go and what level of code quality is acceptable, which means you end up spending time and effort dealing with the politics. Maybe the benefits of collaboration can offset that for shared foundational work in some cases, but that's quite a lot of extra work that needs to be offset -- and the need for universities to invest in people who can provide programming mentorship is still there.

(I've seen software written by PIs. I've worked with software written by PIs. I learned a lot about how not to write code by working with software written by PIs. Senior postdocs, maybe, but are you selecting and training for programming skills or research skills? These days, the programming skills only seem to come along accidentally.)

There's also the point that, even when you have a shared foundation, there's a lot of one-time-use code that gets written to support a single experiment, because every experiment is (by definition) doing something new. That's still going to be in the "develop in private" model simply because only one person ever needs it!

Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds