they are writing Linux code, but not releasing it.

Posted Jul 24, 2007 20:38 UTC (Tue) by dlang (subscriber, #313)
Parent article: Where have the universities gone?

I was at the Usenix conference last month and there were several very interesting presentations from students relating to enhancements for Linux. Not one of them had the code available (under any conditions), let alone had attempted to get it merged.

I brought up the issue with the Usenix staff and they acknowledged that this is a problem and started talking about releasing the code becoming a requirement (or, at the very least, very strongly encouraged) in the future.

When I asked the students about the availability of their code, I got responses such as "I'm too busy to release it", "the code isn't clean enough to let anyone see it", and "I'll release it 'soon'". (Note that these are the same things you hear from companies about releasing their code.)

I think it's a general problem, not university specific.


they are writing Linux code, but not releasing it.

Posted Jul 24, 2007 22:40 UTC (Tue) by JoeF (guest, #4486) [Link]

For graduate students, that's rather normal. Grad students need to write papers, and the code just has to be good enough to run, so it is often prototype quality. Some universities have staff programmers who actually clean up the code and make it release quality.
Grad students who submit code to major projects like Apache usually have a personal interest outside of their studies for that (sometimes it leads to the topic for the PhD thesis.)

they are writing Linux code, but not releasing it.

Posted Jul 25, 2007 13:56 UTC (Wed) by zooko (subscriber, #2589) [Link]

Scientific work should come with source code, not because of the value of Free Software, but because of the older value of Reproducible Science.

Referees should reject as "unreproducible" any results which require source code if that source code is not included with the results.

they are writing Linux code, but not releasing it.

Posted Jul 25, 2007 17:28 UTC (Wed) by khim (subscriber, #9252) [Link]

Referees should reject as "unreproducible" any results which require source code if that source code is not included with the results.

This is a strange approach. Source code can usually be regarded as equipment, and any reproduction that uses the same equipment cannot be called independent! So if you want to talk about science and not alchemy, then you must write your own code...

they are writing Linux code, but not releasing it.

Posted Jul 26, 2007 20:20 UTC (Thu) by amikins (subscriber, #451) [Link]

On the contrary; source code is in its essence a set of steps necessary to complete a task.

Can you call any results "reproducible" if you don't have all the steps used to attain the result?

Unless you're publishing or citing a full and -very- well documented algorithm, including the source code is critical for proper science.

they are writing Linux code, but not releasing it.

Posted Jul 26, 2007 1:25 UTC (Thu) by njs (subscriber, #40338) [Link]

I fully agree, but it's going to take a while -- we're only now reaching the point where results in *any* area of science are habitually released alongside raw data, and the practice is only now spreading beyond the medical/biology journals where it started. For the vast majority of papers in any field, it's completely impossible to double-check the conclusions.

It's going to be a big pain, too. Remember that in many cases, critical pieces of code are not even written down anywhere, just some series of commands were run in a terminal (or by clicking in a GUI!), and then the number shown on the screen pasted into an article. Or what code there is has pervasive, hard-wired assumptions about the particular way the author organizes their home directory, and unmentioned assumptions about how exactly they organized their raw data in said home directory. And so on...

(This would all be *much* *easier* if there were better tools. Where's the R/matlab/numpy-like language that automatically tracks which operations are applied to produce each piece of data, so that after an interactive session you can always walk back through the various intermediate expressions that produced each variable (and graph, etc.) in your workspace? It's not like this would be computationally expensive these days, but you can't really fake it if you aren't playing around in the language guts, either.)
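As a rough illustration of the kind of tracking described above, here is a toy sketch in Python. The `Tracked` class and its `history` method are hypothetical names made up for this example, not part of any existing library, and it only shows how far operator overloading gets you for a few arithmetic operations. It does not follow data through library calls, file I/O, or plotting, which is where hooks in the language itself would be needed.

```python
import operator


class Tracked:
    """Hypothetical wrapper: a value that remembers the expression that produced it."""

    def __init__(self, value, expr, parents=()):
        self.value = value            # the actual data
        self.expr = expr              # textual form of how it was computed
        self.parents = tuple(parents) # the Tracked operands it was built from

    def _binary(self, other, op, symbol):
        # Wrap plain literals so they appear in the recorded expression.
        if not isinstance(other, Tracked):
            other = Tracked(other, repr(other))
        return Tracked(op(self.value, other.value),
                       f"({self.expr} {symbol} {other.expr})",
                       (self, other))

    def __add__(self, other):
        return self._binary(other, operator.add, "+")

    def __mul__(self, other):
        return self._binary(other, operator.mul, "*")

    def __truediv__(self, other):
        return self._binary(other, operator.truediv, "/")

    def history(self):
        """Return every intermediate expression, earliest first."""
        steps = []
        for parent in self.parents:
            for step in parent.history():
                if step not in steps:
                    steps.append(step)
        if self.parents:  # raw inputs have no derivation to report
            steps.append(f"{self.expr} = {self.value}")
        return steps


a = Tracked(3, "a")
b = Tracked(4, "b")
mean = (a + b) / 2

for step in mean.history():
    print(step)
```

Walking back through `mean.history()` lists each intermediate expression together with the value it produced, which is the "walk back through the workspace" idea in miniature.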

Data Processing.

Posted Jul 26, 2007 14:20 UTC (Thu) by grantingram (subscriber, #18390) [Link]

we're only now reaching the point where results in *any* area of science are habitually released alongside raw data

I'm a big fan of releasing data as well as papers. I have a wonderful thesis from 1986 in my office, full of tables of numbers, which means that even twenty-odd years later it's still extremely useful. It's a real shame that we have a culture of secrecy in many areas.

...critical pieces of code are not even written down anywhere, just some series of commands were run in a terminal (or by clicking in a GUI!), and then the number shown on the screen pasted into an article.

Well, I hope that this is not true in "many cases", though I have a suspicion that you might be right. Not being able to reproduce your own data is poor, to say the least!

Where's the R/matlab/numpy-like language that automatically tracks which operations are applied to produce each piece of data

Posted Jul 27, 2007 2:51 UTC (Fri) by illtyd (guest, #2124) [Link]

http://www.bepress.com/jhubiostat/paper142/

is one approach to this

Where's the R/matlab/numpy-like language that automatically tracks which operations are applied to produce each piece of data

Posted Jul 28, 2007 4:26 UTC (Sat) by njs (subscriber, #40338) [Link]

How cool!

(I'll have to whinge about random things on LWN more often...)

they are writing Linux code, but not releasing it.

Posted Jul 24, 2007 23:00 UTC (Tue) by PO8 (guest, #41661) [Link]

The Freenix Track of Usenix ATC used to require that code be released under an open source license. We got a lot of contributors of great code, including kernel code, out of universities with that track. Unfortunately, Usenix killed it a few years ago.

they are writing Linux code, but not releasing it.

Posted Jul 25, 2007 11:22 UTC (Wed) by nix (subscriber, #2304) [Link]

It sounds oddly like the LWN code to me. ;}}}}

Copyright © 2008, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds