
Lines-changed algo?


Posted Jan 21, 2025 14:56 UTC (Tue) by andy_shev (subscriber, #75870)
Parent article: Development statistics for 6.13

I'm puzzled by how the lines-changed algorithm works. My simple `git log --numstat ...` approach gives larger values (only slightly larger in some cases):

Philipp Hortmann 76514
Jan Kara 32848
...
Dmitry Baryshkov 14785
...
Andy Shevchenko 9006
...
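
For readers who want to reproduce a per-author tally like the one above, here is a minimal sketch. It is not the script LWN uses; the `@` marker in the `--pretty` format and the function name `tally_by_author` are assumptions made for illustration, and the sample input is invented rather than taken from the kernel history.

```shell
#!/bin/sh
# Sum added + deleted lines per author from `git log --numstat` output on stdin.
# Assumes the log was produced with --pretty='@%an', so every commit starts
# with a line of the form "@Author Name" (a convention chosen for this sketch).
tally_by_author() {
	awk -F '\t' '
		/^@/    { author = substr($0, 2); next }  # commit header: remember the author
		NF >= 3 { total[author] += $1 + $2 }      # added<TAB>deleted<TAB>path; "-" counts as 0
		END     { for (a in total) printf "%d\t%s\n", total[a], a }
	' | sort -rn
}

# In a kernel tree this could be driven by something like:
#   git log -M --pretty='@%an' --numstat v6.12..v6.13 | tally_by_author

# Invented sample input (not real kernel data), including one binary file.
printf '@Alice Example\n\n3\t1\ta.c\n-\t-\tlogo.png\n@Bob Example\n\n10\t0\tb.c\n@Alice Example\n\n2\t2\tc.c\n' |
	tally_by_author
```

Splitting on tabs rather than on any whitespace keeps paths that contain spaces in a single field, and binary files (reported as `-`) simply contribute zero.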



Lines-changed algo?

Posted Jan 21, 2025 16:07 UTC (Tue) by mathstuf (subscriber, #69389) [Link] (1 responses)

Do you have some level of copied/moved code detection enabled?

Lines-changed algo?

Posted Jan 23, 2025 20:22 UTC (Thu) by andy_shev (subscriber, #75870) [Link]

Yes, `-M`, but even with `-M -C` I still get different numbers.

Lines-changed algo?

Posted Jan 22, 2025 8:59 UTC (Wed) by taladar (subscriber, #68407) [Link] (3 responses)

Maybe they use one of the options to ignore white space changes?

Lines-changed algo?

Posted Jan 23, 2025 20:23 UTC (Thu) by andy_shev (subscriber, #75870) [Link] (2 responses)

Do you believe we produced N thousand lines of whitespace (the difference between the LWN statistics and mine)? :-)

Lines-changed algo?

Posted Jan 28, 2025 12:12 UTC (Tue) by taladar (subscriber, #68407) [Link] (1 responses)

Thousands of lines of whitespace changes are not that unusual if, e.g., something is wrapped in a new `if` and then re-indented, or some similar change was applied all over the code base.
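
The re-indentation effect described above can be seen on a toy example. The sketch below uses plain `diff` with and without `-w` on two invented files; the assumption is that git's `--ignore-all-space` behaves analogously when it counts lines. The file contents and the helper name `count_changed` are made up for illustration.

```shell
#!/bin/sh
# Toy demo: wrap two statements in a new `if` and re-indent them.
tmp=$(mktemp -d)
cat > "$tmp/old.c" <<'EOF'
void f(void)
{
	do_a();
	do_b();
}
EOF
cat > "$tmp/new.c" <<'EOF'
void f(void)
{
	if (ready) {
		do_a();
		do_b();
	}
}
EOF

# Count added plus removed lines in a unified diff,
# skipping the two "---"/"+++" file header lines.
count_changed() {
	diff -u "$@" | awk 'NR > 2 && /^[+-]/ { n++ } END { print n + 0 }'
}

plain=$(count_changed "$tmp/old.c" "$tmp/new.c")
nows=$(count_changed -w "$tmp/old.c" "$tmp/new.c")
echo "changed lines: $plain, ignoring whitespace: $nows"
rm -rf "$tmp"
```

Without `-w`, the two re-indented statements count as removed and added again; with `-w`, only the new `if` line and its closing brace remain as changes.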

Lines-changed algo?

Posted Jan 31, 2025 13:30 UTC (Fri) by andy_shev (subscriber, #75870) [Link]

It's still rare, and on top of that I know precisely what I did in that release (most LoC changes came from removing old GPL boilerplate text, not from adding or removing whitespace). I have even just checked by adding these to my script: "-C -D --ignore-all-space". It still gives 8790 (9006 without them), but the statistics show 7755. I believe there is a mystery (bug or feature?) in the LWN scripts. For reference, the full script I used:
git log -M -C -D --author="Andy Shevchenko" --ignore-all-space --pretty="" --numstat v6.12..v6.13 |
awk 'BEGIN { RS = "\n" } {
	# each numstat line is: added, deleted, path
	for (i = 0; i < int(NF / 3); i++) {
		sum += $(3*i+1) + $(3*i+2)
	}
} END { print sum }'
I even went further and cut the filenames from the `git log` output to be sure we have only numbers and calculated a sum using `bc`, same result.
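
As a further cross-check, here is a hedged sketch of a tab-aware variant of the sum that also reports added and deleted lines separately. It is not claimed to explain the difference from the LWN numbers. The function name `sum_numstat` and the sample lines are invented for illustration. Splitting on tabs matters only in an edge case: a path containing spaces and numeric tokens yields extra whitespace-separated fields, which a loop over field triples could count as line totals.

```shell
#!/bin/sh
# Sum the added and deleted columns of `git log --numstat` output read from stdin.
# Fields are split on tabs, so a path with spaces stays in one field.
sum_numstat() {
	awk -F '\t' '
		NF >= 3 { added += $1; deleted += $2 }  # binary files show "-", which awk treats as 0
		END     { printf "added=%d deleted=%d total=%d\n", added, deleted, added + deleted }
	'
}

# In a kernel tree this could replace the awk stage of the pipeline, e.g.:
#   git log -M --author="..." --pretty="" --numstat v6.12..v6.13 | sum_numstat

# Invented sample lines (not real kernel data): a normal file,
# a path containing spaces and numbers, and a binary file.
printf '5\t2\tdrivers/foo.c\n1\t40\tdocs/old 10 20 notes.txt\n-\t-\tlogo.png\n' |
	sum_numstat
```

On the sample input this prints `added=6 deleted=42 total=48`, whereas splitting the second line on any whitespace would also pick up the `10` and `20` from its path.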


Copyright © 2025, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds