Statistics from the 4.7 development cycle
Posted Aug 3, 2016 14:03 UTC (Wed) by corbet (editor, #1) In reply to: Statistics from the 4.7 development cycle by johannbg
Parent article: Statistics from the 4.7 development cycle
The lack of contributions from academia has been an interesting problem for years. There are guesses as to why (once the work has gone far enough to be published it stops and there's no incentive to polish it for inclusion, for example), but nobody really seems to know what the roots of it are.
Nobody has ever "asked to be taken out of the list."
Most of the unknowns are small contributors, often cleanups. When we see unknowns making significant contributions, we try harder to figure out who they work for.
Gender ratio is hard; there is no gender tag attached to patches. People often ask for country-based statistics as well. It would all be interesting to know, but somehow I don't want to be the one sending "gender and location?" emails to developers...
Posted Aug 3, 2016 14:16 UTC (Wed)
by patrick_g (subscriber, #44470)
[Link] (1 responses)
There are some statistics here: http://www.remword.com/kps_result (look at NT:Nation by Patch).
Posted Aug 3, 2016 14:22 UTC (Wed)
by corbet (editor, #1)
[Link]
Posted Aug 3, 2016 14:37 UTC (Wed)
by fratti (guest, #105722)
[Link] (3 responses)
Perhaps academia is also focusing on solely academic kernels, since a kernel that does not have to deal with all the pitfalls of real world hardware is a lot easier to work on when you're trying to implement a proof-of-concept feature, though that's just a guess of mine. Someone (with access to comp sci publications) would have to actually dig through all the papers to find out where the work ended up.
It is also quite possible that some company or individual re-implements the work in an upstreamable shape after reading the paper, which would mean academic contributions are still very much real, just not as direct. Searching the kernel git log for the word "paper" brings up some commit messages where people mention work to be published in a paper and such.
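For anyone who wants to repeat that search, here is a minimal Python sketch. It assumes git is installed and that you have a local clone of the kernel tree; the function names and the "linux" path are made up for illustration. Note that `--grep=paper` is a substring match, so it also catches things like "papering over", and the hits need to be read by hand.

```python
import subprocess


def build_git_log_command(word):
    """Build a git log command that matches commit messages, ignoring case."""
    # --oneline: one line per commit; -i: case-insensitive; --grep: filter on the message
    return ["git", "log", "--oneline", "-i", f"--grep={word}"]


def search_commit_messages(repo_path, word):
    """Return one summary line per commit whose message mentions `word`."""
    result = subprocess.run(
        build_git_log_command(word),
        cwd=repo_path,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.splitlines()


if __name__ == "__main__":
    # "linux" is a hypothetical path to a local kernel clone.
    for line in search_commit_messages("linux", "paper"):
        print(line)
```

This only finds commits that happen to mention a paper; it says nothing about who did the work or who they were working for at the time.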
Posted Aug 3, 2016 22:24 UTC (Wed)
by mathstuf (subscriber, #69389)
[Link] (2 responses)
Posted Aug 3, 2016 22:41 UTC (Wed)
by anselm (subscriber, #2796)
[Link]
GCC and LLVM are probably closer to the cutting edge of research into compiler technology than the Linux kernel is to the cutting edge of research into operating systems, so that's not a huge surprise.
Posted Aug 6, 2016 15:41 UTC (Sat)
by anton (subscriber, #25547)
[Link]
As for the Linux kernel, other postings have given the reasons; or in other words, there is a gap between where a research project ends and where a piece of code is good enough for inclusion in the kernel. How big is that gap? Philipp Reisner finished his Diplomarbeit (~master's thesis) on DRBD in 2000, then continued working on it commercially (forming a company along the way), and DRBD was finally accepted into the Linux kernel in 2009; I am sure this did not count as an academic contribution at that time, and given that many more years had been spent on it commercially than academically, counting it as academic would have been wrong.
Posted Aug 3, 2016 15:14 UTC (Wed)
by johannbg (guest, #65743)
[Link]
This raises the question of whether the same thing might apply to the kernel community.
Posted Aug 3, 2016 19:53 UTC (Wed)
by Fats (guest, #14882)
[Link]
As said in the article, the kernel is boring and academia needs hot and sexy. Also, the kernel is likely old OS technology, so there is nothing really novel in it that is fit for research.
Posted Aug 4, 2016 8:46 UTC (Thu)
by paulj (subscriber, #341)
[Link] (3 responses)
Academics cannot fix this alone. One would need to go to the governments and government agencies funding CS work and make the case to have factors other than paper output considered as success criteria in funding applications, in departmental assessments, in career progression, etc. Now, the relevance of "Impact" (i.e. real-world effects of research) in academia has slowly become more important to funding agencies, and academics do often now have to pay some attention to this in funding applications. However, it still generally seems to be a side-line performance metric compared to the traditional measure of papers (weighted by venue).
Posted Aug 4, 2016 11:06 UTC (Thu)
by jamesmorris (subscriber, #82698)
[Link] (2 responses)
Posted Aug 4, 2016 12:48 UTC (Thu)
by paulj (subscriber, #341)
[Link]
Most academics in universities (certainly in the UK) have their career progression measured on the success of their papers, not on polishing code.
Posted Aug 4, 2016 17:33 UTC (Thu)
by Cyberax (✭ supporter ✭, #52523)
[Link]
Posted Aug 4, 2016 14:20 UTC (Thu)
by deater (subscriber, #11746)
[Link] (1 responses)
How is "academia" tabulated on the list? I try to contribute regularly (but possibly not during the 4.7 timeframe) using my .edu address. Would all people with .edu be tabulated under academia, or would we be individually broken out by our University?
The main problems with academia are threefold:
So most of the people I know from academia who contribute back are ones who (like me) were open-source developers first, academics second. And the fact we bother trying to get things contributed back probably hurts our careers both financially and timewise.
Posted Aug 4, 2016 21:34 UTC (Thu)
by Lekensteyn (guest, #99903)
[Link]
For the full description, see https://github.com/gregkh/kernel-history/blob/master/emai...
Posted Aug 15, 2016 13:29 UTC (Mon)
by broonie (subscriber, #7078)
[Link] (2 responses)
Posted Aug 15, 2016 14:10 UTC (Mon)
by Jonno (subscriber, #49613)
[Link] (1 responses)
Posted Aug 15, 2016 14:18 UTC (Mon)
by broonie (subscriber, #7078)
[Link]
Posted Aug 18, 2016 15:50 UTC (Thu)
by ortalo (guest, #4654)
[Link]
But apparently the site has not been updated since November 2015 :-(
And those numbers show just the sort of hazard you can run into; it seems to be based mostly on domain names. I'm sure Neil Brown would be surprised to learn that he's German..:)
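To make the hazard concrete, here is a minimal Python sketch of domain-based guessing. How that site actually computes its numbers is not known here; the lookup table, the function name and the example addresses are all hypothetical.

```python
# Hypothetical, deliberately tiny table mapping country-code TLDs to countries.
CCTLD_TO_COUNTRY = {
    "de": "Germany",
    "uk": "United Kingdom",
    "au": "Australia",
}


def guess_country(email):
    """Guess a country from the last label of an email address's domain."""
    domain = email.rsplit("@", 1)[-1].lower()
    tld = domain.rsplit(".", 1)[-1]
    # Generic TLDs (.com, .org, ...) carry no country information at all.
    return CCTLD_TO_COUNTRY.get(tld, "unknown")


# A developer living anywhere in the world, but using an employer's .de
# address, is counted as German by this heuristic.
print(guess_country("developer@example.de"))
print(guess_country("developer@example.com"))
```

The guess reflects where the mail domain is registered, not where the person lives, which is exactly the kind of misattribution described above.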
Given the direction that GCC and LLVM/Clang are taking, I am happy that the Linux kernel accepts fewer academic contributions. "Optimizations" based on unrealistic assumptions are an interesting academic curiosity, but should never become the default in production compilers.
> for years. There are guesses as to why (once the work has gone far enough
> to be published it stops and there's no incentive to polish it for inclusion, for
> example), but nobody really seems to know what the roots of it are.
1. Most academic code contributions are *awful*, generally one-off hacks made during a mad rush to get a paper/thesis out the door
2. There are no incentives to merge your results back in (i.e. federal grants and such don't stipulate this, and really outside of Google I'm not sure if there's anyone who is sponsoring Linux-kernel-related research grants), and also open-source contributions don't matter for anything on tenure packages.
3. There's a perception (probably rightly so) that trying to get code merged in is going to be a long, frustrating process. Often by the time the student has finished the work and it's time to contribute back, the student has graduated, moved on to a new job, and has no incentive or time to deal with the hassle.
So, what would we do with this information and the associated statistics? I am not sure we can do anything useful (to either gender) with the result.
Similar for countries.