No code needed

Posted May 21, 2012 15:10 UTC (Mon) by southey (subscriber, #9466)
In reply to: A scientific basis for Open Source Software by Del-
Parent article: A scientific basis for Open Source Software

I very much agree that any person trained in the area should be able to independently verify any result without requiring any code from the authors. If you cannot do that, then I do not see that you have a right to complain about code availability.

Code licenses are really a small issue, as often the author may send you the code (or not). Usually it is other aspects that are more problematic. One is user support (documentation and running the code), as the authors have no time or money for that - hence my first comment. Probably under that is also code quality - some code is so well written that you can find what you want, other code is more complex (but not incorrect). Often, it is far easier to write your own than to try to modify existing code.

Most of the applications have very specific code bases that are not suitable for distribution. Sure, there are community efforts (just see what the Scientific Linux distro provides) that provide the basic libraries, yet you still must know how to use them. It is very easy to say "provide the code", but it just isn't that simple. You need to find a dedicated person to help when the code does not compile (especially when porting to x86-64 or from one platform to another). Even if you have money, finding a person with suitable training (i.e., knows the area AND programming) is very difficult. Furthermore, I doubt that the return on that investment is greater than that of correctly training a person.

Finally, there is one of the most important components, competitive edge. Grant money is essential and I am NOT going to help someone using my code beat me to the same grant!

(Actually I consider having the data used way more critical than the code!)



No code needed

Posted May 21, 2012 15:55 UTC (Mon) by pboddie (subscriber, #50784)

The problems you're describing have everything to do with the sustainability of an activity, which in this case is about a piece of research that is supposed to inform further research. If the level of engineering is more or less "it works for me", both in the environment that produced some work and in any environment that wishes to build on it, then the code is likely to be no more than a curiosity, particularly if all people are going to do is just run it and get it to do something before it crashes.

Finally, there is one of the most important components, competitive edge. Grant money is essential and I am NOT going to help someone using my code beat me to the same grant!

And this is precisely why the sustainability situation is so hopeless. It's all "We got our result, on to the next publication!" and just hope that somebody else absorbs the cost of picking up any pieces worth keeping.

Meanwhile, as I write this, an academic somewhere on the planet is probably seeing open source for the first time and wondering if it's an extraterrestrial artifact: "What? They do sharing like this? How wonderful/perverse!"

No code needed

Posted May 24, 2012 8:35 UTC (Thu) by man_ls (subscriber, #15091)

Finally, there is one of the most important components, competitive edge. Grant money is essential and I am NOT going to help someone using my code beat me to the same grant!

Shameful. Instead of advancing the state of the art we are back to Alchemy, but with software as the secret ingredient that nobody else must have. Just replace "code" with "formula" or "reaction" in the above.

For this reason alone, public research grants should mandate publication of the code bases under free licenses. The days when a few equations were enough to reproduce someone else's results are long gone in too many fields.

No code needed

Posted May 24, 2012 14:49 UTC (Thu) by raven667 (subscriber, #5198)

These motivations are understandable and they are regrettable. In any event, a scientific paper should describe all of the analysis in sufficient detail that it can be reproduced. I wouldn't say source code would be required, but certainly sufficient detail to re-implement any tools would be required. For sufficiently complex analyses, maybe the source code would be the best documentation. One worry I have is the same as for electronic elections: what does it really mean if you just rerun the same tools and they spit out a number? Any errors in the analysis will be faithfully reproduced, which I think would impede scientific understanding.

No code needed

Posted May 24, 2012 15:17 UTC (Thu) by nybble41 (subscriber, #55106)

It seems to me that in a study of this kind there are two aspects to reproducibility: the raw data, and the analysis performed by the software. By including both the raw data and the actual software used in the original study, you make it possible to check each part separately. Without the original software, it's difficult to say whether any differences in the processed results are due to problems with the original software, problems with the reimplementation, or differences in the input.

Having the original software for comparison also makes it easier to guarantee that the results can be reproduced with a different _style_ of implementation; otherwise, not knowing how the original software was implemented, you might end up recreating it the same way, with the same built-in flaws. If the software is included you can deliberately choose a different approach.
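
To make the separation concrete, here is a minimal sketch, under assumptions of my own, of checking the two aspects independently: first confirm that both runs see identical raw data, then compare the original analysis against a reimplementation written in a deliberately different style. The function names, the toy "analysis" (a plain mean) and the tolerance are all hypothetical stand-ins, not anything taken from a real study.

```python
import hashlib
import math


def data_fingerprint(values):
    """Hash the raw input so both runs can be shown to use identical data."""
    joined = ",".join(repr(v) for v in values).encode("utf-8")
    return hashlib.sha256(joined).hexdigest()


def original_mean(values):
    """Hypothetical stand-in for the analysis shipped with the paper."""
    return sum(values) / len(values)


def reimplemented_mean(values):
    """Independent reimplementation using a running (incremental) mean."""
    mean = 0.0
    for count, value in enumerate(values, start=1):
        mean += (value - mean) / count
    return mean


def compare(published_data, local_data, rel_tol=1e-9):
    """Return (inputs_match, results_agree) so each part is checked separately."""
    inputs_match = data_fingerprint(published_data) == data_fingerprint(local_data)
    published_result = original_mean(published_data)
    local_result = reimplemented_mean(local_data)
    results_agree = math.isclose(published_result, local_result, rel_tol=rel_tol)
    return inputs_match, results_agree
```

If the first value is false, the disagreement is in the input; if the inputs match but the results do not, the difference lies in one of the two implementations, and having the original software is what lets you tell which.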

No code needed

Posted May 24, 2012 15:35 UTC (Thu) by man_ls (subscriber, #15091)

There is also gradual improvement of results, which has been a tenet of science for many centuries. A first researcher publishes their basic results, a second researcher publishes their enhancements, the next one publishes a refinement in certain conditions... In these days of computer simulations it becomes essential to have both data and software, as you say, and improve on them gradually. Otherwise research papers become just a lot of hand-waving around estimations and algorithms.

Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds