Posted Feb 17, 2010 17:09 UTC (Wed) by corbet (editor, #1)
In reply to: How old is our kernel? by joey
Parent article: How old is our kernel?

I've thought about doing that; clearly, there would be more work involved, and I was interested in the five-year horizon for now.

We have good history through the BitKeeper era, which could easily extend the view back a few years. Prior to that, of course, it's a big mess, though we do have per-release resolution for the most part.

But, yes, wouldn't it be interesting to know how much of 2.4.x we're still running?

Older trees

Posted Feb 17, 2010 17:57 UTC (Wed) by eli (guest, #11265) [Link]

I think pulling in additional history could prove very interesting, so when you get a bit of free time.... ;)

I am picturing a different graph that I think might tell us more about kernel development than the bar graphs above.

The Y-axis would be the absolute number of lines of code, and the X-axis would be kernel releases.

Each kernel release would get its own line, starting at that release on the X-axis. For 2.6.12, you'd mark the total number of lines in the kernel. Then you'd draw a line to the 2.6.13 release to show how many lines of 2.6.12 remained in 2.6.13, and you'd start a new line for 2.6.13 showing the total number of lines in 2.6.13. Each new release would add a new line on the graph. It should look a bit like strata layers or something.

Over time, you should see some patterns in how the releases get replaced. If one release was particularly badly done, we'd see it start out with a large number of lines of code at its release, then get rapidly squeezed down to a small number by later releases.
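
A minimal sketch of how the numbers behind such a graph could be tabulated. It assumes that, for every line in each release's tree, you already know which release introduced it (for instance from something like git blame run at each release tag). The names `strata`, `stacked` and `blame_by_release`, and the toy numbers, are hypothetical and are not taken from the article's scripts.

```python
from collections import Counter


def strata(blame_by_release, releases):
    """Return {origin: [lines from that origin present at each release]}.

    blame_by_release maps a release name to a list holding, for every
    line in that release's tree, the release that introduced the line.
    """
    table = {origin: [] for origin in releases}
    for snapshot in releases:
        counts = Counter(blame_by_release[snapshot])
        for origin in releases:
            table[origin].append(counts.get(origin, 0))
    return table


def stacked(table, releases):
    """Cumulative heights: lines dating from a given release or earlier.

    Entries to the left of a release's own column are not meaningful for
    plotting, since that release's line only starts at its own release.
    """
    heights = {}
    running = [0] * len(releases)
    for origin in releases:
        running = [r + c for r, c in zip(running, table[origin])]
        heights[origin] = running
    return heights


# Toy, made-up data: three releases with a handful of lines each.
releases = ["2.6.12", "2.6.13", "2.6.14"]
blame = {
    "2.6.12": ["2.6.12"] * 10,
    "2.6.13": ["2.6.12"] * 8 + ["2.6.13"] * 5,
    "2.6.14": ["2.6.12"] * 6 + ["2.6.13"] * 4 + ["2.6.14"] * 7,
}

table = strata(blame, releases)
heights = stacked(table, releases)
for origin in releases:
    print(origin, table[origin], heights[origin])
```

Plotting each row of `heights` against the release list, starting each line at its own release, would give the stacked "strata" picture. The top line is simply the total size of the tree at each release.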

I wonder if there is such a thing as a "code half-life"...

Older trees

Posted Feb 19, 2010 22:32 UTC (Fri) by aegl (guest, #37581) [Link]

The current git history looks too short to tell whether the speed of code
removal follows a "half-life" curve. We still have 58.6% of the git origin
(2.6.12-rc2) code present in the current kernel (it only makes up 30.5% of
the current code because the kernel is almost twice as large now).
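
The two percentages above fix the growth factor: the surviving origin lines are 58.6% of the original tree and 30.5% of the current one, so the ratio of the two gives the relative size. The sketch below works that out. It also shows what half-life those figures would imply if the decay were purely exponential, which is exactly the assumption that the history is too short to confirm. The elapsed time of about 4.85 years (April 2005 to February 2010) and the function name `half_life` are assumptions made for illustration.

```python
import math

# Figures quoted in the comment above.
origin_surviving = 0.586   # fraction of 2.6.12-rc2 lines still present
share_of_current = 0.305   # the same lines as a fraction of the current tree

# surviving lines = 0.586 * origin_size = 0.305 * current_size
growth = origin_surviving / share_of_current


def half_life(fraction_remaining, elapsed):
    """Half-life implied by pure exponential decay (an assumption)."""
    return elapsed * math.log(2) / math.log(1 / fraction_remaining)


# Assumption: roughly 4.85 years between 2.6.12-rc2 and this comment.
ELAPSED_YEARS = 4.85
estimate = half_life(origin_surviving, ELAPSED_YEARS)

print(f"current tree is about {growth:.2f}x the origin size")
print(f"implied half-life: about {estimate:.1f} years")
```

The growth factor of roughly 1.92 matches "almost twice as large". The half-life figure is only as good as the exponential assumption behind it.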

Here's a graph showing growth of the kernel, and decline of the original

Older trees

Posted Feb 22, 2010 17:04 UTC (Mon) by aegl (guest, #37581) [Link]

Here's the "strata" picture you asked for:

The lowest line is the 2.6.12-rc2 git origin. Count up from there
to 2.6.32 at the top. Scripts were run with current tip of "linus"
tree at v2.6.33-rc8-113-gf8b55f2 so it doesn't take into account the
320 lines of code added and 149 deleted over the weekend.

Visually there does seem to be an inflection point around 2.6.27
where we slowed down at deleting old code (perhaps because there
was so much new code to be deleted instead?)

Older trees

Posted Feb 22, 2010 17:40 UTC (Mon) by nix (subscriber, #2304) [Link]

Oh, that's a lovely graph.

I wonder if the inflection point can be attributed to the staging tree?
(That's certainly a lot of new crap^Wcode to be deleted...)

Older trees

Posted Feb 26, 2010 0:26 UTC (Fri) by robert_s (subscriber, #42402) [Link]

With this graph, sir, you have read my mind.

Older trees

Posted Feb 26, 2010 14:44 UTC (Fri) by eli (guest, #11265) [Link]

Thank you, sir!

'Course, now I need to stare at it for an hour looking for all the interesting things it's trying to tell me. ;)

Older trees

Posted Feb 17, 2010 18:31 UTC (Wed) by alex (subscriber, #1355) [Link]

I'm sure I've seen a git tree that goes all the way back to the 0.* series. The
best I could Google today, though, was an archive of discussions about it:

Older trees

Posted Feb 17, 2010 18:35 UTC (Wed) by corbet (editor, #1) [Link]

Such things exist, yes. And, indeed, I've grabbed copies of them over time. Lots of old stuff in git://, for example. Eventually I'll see what I can do about trawling through it all.

WoPDaSD 2010

Posted Feb 17, 2010 18:57 UTC (Wed) by PO8 (guest, #41661) [Link]

I'm on the Program Committee for the 5th Workshop on Public Data about Software Development (WoPDaSD 2010), and the paper deadline is coming up in March. I would really love to see you and/or other readers of LWN get this kind of data and analysis together as a workshop paper and submit it there. We don't get so many submissions from outside academia, and that's a shame—I'm confident that this work would be quite well-received.

WoPDaSD 2010

Posted Feb 17, 2010 20:50 UTC (Wed) by ajross (guest, #4563) [Link]

Were all the pronounceable acronyms already taken?

WoPDaSD 2010

Posted Feb 18, 2010 0:02 UTC (Thu) by felixfix (subscriber, #242) [Link]

It's related to INTERCAL ...

"The full name of the compiler is 'Compiler Language With No Pronounceable Acronym', which is, for obvious reasons, abbreviated 'INTERCAL'."

WoPDaSD 2010

Posted Feb 18, 2010 7:48 UTC (Thu) by PO8 (guest, #41661) [Link]

The acronym is kind of awkward, but the workshop is pretty cool. It works that way sometimes. :-)

Older trees

Posted Feb 17, 2010 19:25 UTC (Wed) by marineam (guest, #28387) [Link]

I started a rebase of the current linux-2.6 on top of old-2.6-bkcvs to see if your little findoldfiles finds anything that hasn't changed since 2.4, but this could take a loooooong time; the rebase is going at about two patches a second on my machine.

I'm guessing writing a smarter script would be faster. :-P

Older trees

Posted Feb 17, 2010 22:24 UTC (Wed) by dlang (subscriber, #313) [Link]

There is a historical git archive that can be grafted onto the current 2.6.12 archive.

Once this is done, the combined archive can be treated as a single archive, and I expect that the scripts used for this report could be used as-is (although they will obviously take longer to run).

IIRC, the historical git archive goes all the way back to the 0.0x days (although not without gaps).

David Lang

Older trees

Posted Feb 17, 2010 22:31 UTC (Wed) by corbet (editor, #1) [Link]

That's davej's repository, yes. I have it. The tools will require some tweaks to work well with that data source, but it's all certainly doable.

Older trees

Posted Feb 17, 2010 22:38 UTC (Wed) by viro (subscriber, #7872) [Link]

FWIW, I've got a slightly more complete tree (several versions missed by davej added to his) plus the CVS-exported 2.4 BK tree; I need to convert the latter to git, and then we'll get a full tree with all branches (right now there's stuff up to 2.4.0 + 2.4.0--2.6.12 + 2.4.31--2.4.current + 2.6.12--2.6-current, with a gap between 2.4.15 and 2.4.31).

Older trees

Posted Feb 23, 2010 0:34 UTC (Tue) by Aissen (guest, #59976) [Link]

Is your tree online? How does this compare to ?
I contacted the original author about 3 months ago and built a tree using his ocaml program. It seems to gather data from dave's, tglx's and linus' trees.

If anyone is interested, I can forward the ~210k archive of the program building the tree.

