User: Password:
|
|
Subscribe / Log in / New account

vmsplice(): the making of a local root exploit

vmsplice(): the making of a local root exploit

Posted Feb 13, 2008 7:27 UTC (Wed) by JoeF (subscriber, #4486)
In reply to: vmsplice(): the making of a local root exploit by tialaramex
Parent article: vmsplice(): the making of a local root exploit

For example, if you work on code related to SCSI or filesytems or otherwise connected with disks, you're expected to recognise that (bytes >> 9) converts from a byte count to a sector count, since sectors are 512 bytes.

Hmm, no. The use of magic numbers is something I beat out of undergrads. Use a define, or a macro.
When I do code reviews at my job, I always latch onto the use of magic numbers.
And, your example is flawed, anyway. While hardware sectors may be 512 bytes (currently), filesystems usually deal with larger blocks.


(Log in to post comments)

vmsplice(): the making of a local root exploit

Posted Feb 14, 2008 6:40 UTC (Thu) by jimparis (subscriber, #38647) [Link]

> While hardware sectors may be 512 bytes (currently), filesystems usually deal with larger
blocks.

All the more reason to be just specify it clearly at the point of use rather than hiding it
away in a #define!  See my response above too.

vmsplice(): the making of a local root exploit

Posted Feb 14, 2008 19:44 UTC (Thu) by JoeF (subscriber, #4486) [Link]

You either need to add a comment, or use a define.
bytes >> 9 without a clarifier is something that should never be in production-quality code.
Magic numbers should never be in code without a clear explanation what they mean.
If you don't want a macro that can hide things, write something like bytes >> SECTOR_SHIFT.

vmsplice(): the making of a local root exploit

Posted Feb 14, 2008 20:42 UTC (Thu) by jimparis (subscriber, #38647) [Link]

> You either need to add a comment, or use a define.
bytes >> 9 without a clarifier is something that should never be in production-quality code.
Magic numbers should never be in code without a clear explanation what they mean.  If you don't
want a macro that can hide things, write something like bytes >> SECTOR_SHIFT.

If one doesn't want to hide things, hide it inside "SECTOR_SHIFT"?
I'm sorry, I remain unconvinced.
Using ">> 9" is essentially a comment saying "I'm converting this to a 512-byte sector count".
It's such a fundamentally basic operation that anyone working with filesystem or disk code
would understand it immediately.

Abstracting common constants is most useful when they might change, and they're just arbitrary
to begin with -- HZ for example.  Or things that actually have no real meaning, like magic
numbers that identify a filesystem.  For a number that has a real, well-known meaning, and for
which changing would involve huge logic changes in the code, I think macros like that are just
extra levels of obfuscation.

vmsplice(): the making of a local root exploit

Posted Feb 15, 2008 6:54 UTC (Fri) by JoeF (subscriber, #4486) [Link]

Using ">> 9" is essentially a comment saying "I'm converting this to a 512-byte sector count".

No. It only says that you shift a value by 9 bits to the right.
It may say to you that you are converting a value to a 512-byte sector count. It does not necessarily say that to others.

It's such a fundamentally basic operation that anyone working with filesystem or disk code would understand it immediately.

That mindset is what results in a lot of the flaws in all kinds of code. There will always be somebody working on the code for whom it isn't obvious. I venture that if you can't see that, you must still be in your first job and haven't yet had the task to maintain somebody else's code. I can tell you from (painful) experience that this kind of stuff is among the worst stuff out there.
">>9" makes implicit assumptions, and implicit assumptions, even if they may seem reasonable to the original author, are often not obvious to maintainers, possibly years down the road.
Oh, and just in case, I am writing filesystem code. I would never even get the idea to write something like ">>9" without a comment, or better, in a macro.

vmsplice(): the making of a local root exploit

Posted Feb 15, 2008 7:43 UTC (Fri) by jimparis (subscriber, #38647) [Link]

> I venture that if you can't see that, you must still be in your
> first job and haven't yet had the task to maintain somebody else's code.

And I venture that, in my experience (which is not as limited as you impolitely presume),
"somebody else's code" is a whole lot more difficult to work with when I have to dig through
arbitrary levels of opaque macros to figure out what they're really trying to do.

Clearly we have a difference in opinion.  I prefer in code that is written clearly enough to
be self-explanatory.  Consider:
  inode->i_blocks = bytes >> 9;
vs:
  inode->i_blocks = BYTES_TO_BLOCKS(bytes);

If you maintain that the second version conveys more information than the first, then I'm
afraid we will just have to disagree.

vmsplice(): the making of a local root exploit

Posted Feb 16, 2008 1:52 UTC (Sat) by dododge (subscriber, #2870) [Link]

> Consider:
>   inode->i_blocks = bytes >> 9;
> vs:
>   inode->i_blocks = BYTES_TO_BLOCKS(bytes);

It's a bit of an unfair example, though, because you're computing
against a value called "bytes" and assigning it to something called
"blocks".  You've put enough context around the expression to make
it clear what the shift is trying to accomplish.

The problem is when someone assumes ">> 9" is inherently
self-documenting and throws it into the middle of a much more
complex statement.  Consider:

  process_frag_2(sig,(get_ent(curr) >> 9) + 2,HEX_ENCODE);
vs:
  process_frag_2(sig,BYTES_TO_BLOCKS(get_ent(curr)) + 2,HEX_ENCODE);

When I'm reading code, I'd much rather see the latter.  It doesn't
just tell me why the shift is being done; it even adds useful
information about the APIs for get_ent() and process_frag_2().





vmsplice(): the making of a local root exploit

Posted Feb 16, 2008 17:13 UTC (Sat) by bronson (subscriber, #4806) [Link]

I guess we will disagree.  Your way is still hostile to new maintainers because it carries so
much implicit information.  What if your code code has this?

   inode->i_blocks = bytes >> 8;

What is a new maintainer to think?  Even if he does recognize that this expression is
different from the others, he's still confused.  Did you auto-optimize the expression (bytes
>> 9)*2?  Is it typoed or a thinko?

Will every potentially confusing use of the shift operator in your code include a comment
stating the author's intention?  If so, then will that convention still be true once someone
else has modified your code?

Also, you've given yourself a rather heavy restriction on variable naming haven't you?  Will
all your variable names really contain "blocks" and "bytes" as appropriate?  You'll probably
want to add this restriction as a comment to each file to which it applies or it won't last
long once other people start committing patches to your code!

Finally, your technique only works for trivial expressions like assignment.  What happens if
you need to actually perform a calculation?

A BYTES_TO_BLOCKS macro solves all these problems by making the conversion explicit.  Implicit
rules and unwritten conventions always make life hard for new maintainers.  Always.

vmsplice(): the making of a local root exploit

Posted Jan 11, 2009 4:06 UTC (Sun) by dibbles_you (guest, #45004) [Link]

cd /to/linux/kernel
# find . -name "*.c" | xargs grep ">>" | awk -F ">>" '{ for(n=2;n<=NF;n++) { $c=$n; sub(/^=/,"",$c); sub(/ /,"",$c); gsub(/\(/,"",$c); gsub(/\)/,"",$c); $c=substr($c,1,1); if ($c>='0' && $c<='9') print "number"; else print "text"; } }' > a
# cat a| grep text | wc -l
# cat a| grep number | wc -l

number=19877
text=4936

so it would seem the linux kernel believes numbers are usually clearer for others to read and understand.

if you believe otherwise submit a patch to the kernel removing all occurrences and replacing them with meaningful #defines.

#define 'ing all numbers is stupid, using identifiers that describe what they are should allow anybody to understand their context, if you use lots of operators and numbers in a single expression split them up to use multiple variables with appropriate names, if they are in conditions use multiple conditions with comments if necessary. The compiler will optimize all these away and make your code more readable without have 15 editors open or using code navigators to work out that the actual value of CLUSTERS_DIVIDE_BY_BYTES_IN_SECTORS_MULTIPLIED_BY_PI(x) is 2.

PS. the script is an approximation.


Copyright © 2017, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds