they are writing Linux code, but not releasing it.

Posted Jul 26, 2007 1:25 UTC (Thu) by njs (subscriber, #40338)
In reply to: they are writing Linux code, but not releasing it. by zooko
Parent article: Where have the universities gone?

I fully agree, but it's going to take a while -- we're only now reaching the point where results in *any* area of science are habitually released alongside raw data, and the practice is only now spreading beyond the medical/biology journals where it started. For the vast majority of papers in any field, it's completely impossible to double-check the conclusions.

It's going to be a big pain, too. Remember that in many cases, critical pieces of code are not even written down anywhere, just some series of commands were run in a terminal (or by clicking in a GUI!), and then the number shown on the screen pasted into an article. Or what code there is has pervasive, hard-wired assumptions about the particular way the author organizes their home directory, and unmentioned assumptions about how exactly they organized their raw data in said home directory. And so on...

(This would all be *much* *easier* if there were better tools. Where's the R/matlab/numpy-like language that automatically tracks which operations are applied to produce each piece of data, so that after an interactive session you can always walk back through the various intermediate expressions that produced each variable (and graph, etc.) in your workspace? It's not like this would be computationally expensive these days, but you can't really fake it if you aren't playing around in the language guts, either.)
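To make the idea concrete, here is a minimal Python sketch of that kind of tracking. It is not an existing tool: the `Tracked` class, the `load` helper and the `history` method are hypothetical names invented for illustration. Each value carries the expression that produced it, so you can walk back through the intermediates afterwards.

```python
import operator


class Tracked:
    """Hypothetical wrapper: a value plus the expression that produced it."""

    def __init__(self, value, expr, parents=()):
        self.value = value
        self.expr = expr          # human-readable expression string
        self.parents = parents    # Tracked inputs this value was derived from

    @staticmethod
    def _lift(other):
        # Plain constants become leaf nodes whose expression is their repr.
        return other if isinstance(other, Tracked) else Tracked(other, repr(other))

    def _binop(self, other, op, symbol):
        other = Tracked._lift(other)
        return Tracked(op(self.value, other.value),
                       f"({self.expr} {symbol} {other.expr})",
                       (self, other))

    def __add__(self, other):
        return self._binop(other, operator.add, "+")

    def __sub__(self, other):
        return self._binop(other, operator.sub, "-")

    def __mul__(self, other):
        return self._binop(other, operator.mul, "*")

    def __truediv__(self, other):
        return self._binop(other, operator.truediv, "/")

    def history(self, depth=0):
        """Return indented lines walking back through intermediate expressions."""
        lines = ["  " * depth + f"{self.expr} = {self.value!r}"]
        for parent in self.parents:
            lines.extend(parent.history(depth + 1))
        return lines


def load(name, value):
    """Name a raw input so it shows up as a leaf in later histories."""
    return Tracked(value, name)


a = load("a", 6.0)
b = load("b", 2.0)
ratio = (a - b) / b
print("\n".join(ratio.history()))
```

This only covers arithmetic on values that were wrapped explicitly. Anything that goes through an ordinary library call or a GUI click escapes the record, which is the "can't really fake it" limitation mentioned above.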



Data Processing.

Posted Jul 26, 2007 14:20 UTC (Thu) by grantingram (subscriber, #18390)

we're only now reaching the point where results in *any* area of science are habitually released alongside raw data

I'm a big fan of releasing data as well as papers. I have a wonderful thesis from 1986 in my office; it is full of tables of numbers, which means that even twenty-odd years later it's still extremely useful. It's a real shame that we have a culture of secrecy in many areas.

...critical pieces of code are not even written down anywhere, just some series of commands were run in a terminal (or by clicking in a GUI!), and then the number shown on the screen pasted into an article.

Well, I hope that this is not true in "many cases", though I have a suspicion that you might be right. Not being able to reproduce your own data is poor, to say the least!

Where's the R/matlab/numpy-like language that automatically tracks which operations are applied to produce each piece of data

Posted Jul 27, 2007 2:51 UTC (Fri) by illtyd (guest, #2124)

http://www.bepress.com/jhubiostat/paper142/ is one approach to this.

Where's the R/matlab/numpy-like language that automatically tracks which operations are applied to produce each piece of data

Posted Jul 28, 2007 4:26 UTC (Sat) by njs (subscriber, #40338)

How cool!

(I'll have to whinge about random things on LWN more often...)

Copyright © 2008, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds