
C as portable assembler


Posted May 30, 2011 18:56 UTC (Mon) by jrn (subscriber, #64214)
In reply to: Sorry, but this is just wrong... by khim
Parent article: What Every C Programmer Should Know About Undefined Behavior #3/3

> Sure, some platforms were perfectly happy with such code. But then if you want low-level non-portable language... asm is always there.

And so is C. :) After all, what language is the Linux kernel written in?

> Or, alternatively, you can actually read specifications and see what the language actually supports.

I don't think the case of signed overflow is one of trial and error versus reading specifications. It seems more like one of folk knowledge versus new optimizations --- old gcc on x86 and many similar platforms would use instructions that wrap around for signed overflow, so when compiling old code that targeted such platforms, it seems wise to use -fwrapv, and when writing new code it seems wise to add assertions to document why you do not expect overflow to occur.
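As a rough illustration of "add assertions to document why you do not expect overflow", here is a minimal sketch. The function name `total_bytes` and its parameters are hypothetical, not taken from any real code base, and the assertions disappear when the code is built with NDEBUG.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: total size of `count` records of `size` bytes each.
 * The assertions state the preconditions under which the signed
 * multiplication cannot overflow. */
int total_bytes(int count, int size)
{
    assert(count >= 0 && size > 0);
    /* For size > 0: count * size <= INT_MAX  <=>  count <= INT_MAX / size */
    assert(count <= INT_MAX / size);
    return count * size;
}
```

The check is done with a division on the limit rather than by inspecting the product, so the overflowing operation itself is never executed.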

Of course, reading the spec can be a pleasant experience independently from that.



C as portable assembler

Posted May 31, 2011 7:17 UTC (Tue) by khim (subscriber, #9252) [Link]

> > Sure, some platforms were perfectly happy with such code. But then if you want low-level non-portable language... asm is always there.

> And so is C. :) After all, what language is the Linux kernel written in?

The Linux kernel is written in C and is quite portable, and people fight constantly to fix hardware and software compatibility problems. Note that while GCC improvements are the source of a few errors, they are dwarfed by the number of hardware compatibility errors. Most of the compiler problems happen when people forget to use the appropriate constructs defined to make hardware happy: for some reason, the macro constructs designed to fight hardware also make code sidestep a wide range of undefined C behaviors. Think about it.

> I don't think the case of signed overflow is one of trial and error versus reading specifications.

It is, as was explained before. There are other similar cases. For example, the standard gives you the ability to convert a pointer to an int in some cases, but even then you cannot convert the int back to a pointer, because on some platforms a pointer is not just a number. Yet people who don't know better often do that. Will you object if gcc and/or clang start to miscompile such programs tomorrow?
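For reference, a small sketch of the one pointer/integer round trip that C99 does describe: a pointer to void converted to uintptr_t and back compares equal to the original. The function name `round_trip` is made up for illustration, and uintptr_t is an optional type, so this assumes the platform provides it. Plain int gives no such guarantee (on typical LP64 systems it is narrower than a pointer).

```c
#include <stdint.h>

/* Round-trip a pointer through an integer type wide enough to hold it.
 * Assumes uintptr_t exists on the target; C99 makes it optional. */
int *round_trip(int *p)
{
    uintptr_t bits = (uintptr_t)(void *)p;   /* pointer -> integer */
    return (int *)(void *)bits;              /* integer -> pointer */
}
```

What the integer value looks like in between is implementation-defined; only the round trip itself is specified.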

> It seems more like one of folk knowledge versus new optimizations --- old gcc on x86 and many similar platforms would use instructions that wrap around for signed overflow, so when compiling old code that targeted such platforms, it seems wise to use -fwrapv, and when writing new code it seems wise to add assertions to document why you do not expect overflow to occur.

Note that all these new optimizations are perfectly valid for portable code. Surprisingly enough, -fwrapv exists not to make broken programs valid but to make sure Java overflow semantics are implementable in GCC. Sure, you can use it in C, but that does not mean your code is suddenly correct.
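To make the "valid for portable code" point concrete: a check written as `a + b < a` relies on wraparound, and for signed int with b known to be positive the compiler is allowed to fold it to false. Below is a sketch of an overflow test that never performs the overflowing addition; the name `add_would_overflow` is only illustrative.

```c
#include <limits.h>
#include <stdbool.h>

/* Returns true if a + b would overflow int. Every subtraction below
 * stays in range, so no undefined behavior is involved. */
bool add_would_overflow(int a, int b)
{
    if (b > 0)
        return a > INT_MAX - b;   /* INT_MAX - b is in [0, INT_MAX) */
    return a < INT_MIN - b;       /* b <= 0, so INT_MIN - b <= 0 is in range */
}
```

This behaves the same with or without -fwrapv, since it does not depend on what overflow does.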

> Of course, reading the spec can be a pleasant experience independently from that.

Actually, it's kind of sad that the only guide we have here is the standard... Given how often undefined behavior bites us, you'd think we would have books that explain where and how it can be triggered in "normal" code. Why do people accept that i = i++ + ++i; is unsafe and unpredictable code but lots of other cases which trigger undefined behavior are perceived as safe? It's a matter of education...

C as portable assembler

Posted May 31, 2011 17:56 UTC (Tue) by anton (guest, #25547) [Link]

> Why do people accept that i = i++ + ++i; is unsafe and unpredictable code

  1. Who would write "i = i++ + ++i;" anyway?
  2. It is easy to write what you intended here (whatever that was) in a way that's similarly short and fast and generates a similar amount of code.

> but lots of other cases which trigger undefined behavior are perceived as safe?

Because they were safe, until the gcc maintainers decided to break them (and the LLVM maintainers follow them like lemmings).
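As a sketch of point 2, here is one possible reading of the expression spelled out with sequenced statements. It assumes the intended meaning was strict left-to-right evaluation (which is how Java defines it); other readings are equally plausible, and the function name is hypothetical.

```c
/* One possible reading of "i = i++ + ++i", evaluated left to right:
 * i++ yields the old value, then ++i yields old + 2. */
int sum_left_to_right(int i)
{
    int first = i;       /* value produced by i++ */
    i += 2;              /* both increments applied */
    int second = i;      /* value produced by ++i */
    return first + second;
}
```

With this reading, starting from i equal to 1 the result is 1 + 3, that is, 4.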

That's the point...

Posted Jun 1, 2011 9:16 UTC (Wed) by khim (subscriber, #9252) [Link]

> It is easy to write what you intended here (whatever that was) in a way that's similarly short and fast and generates a similar amount of code.

It's easy to do in other cases, too. You can always use memcpy to copy from float to int. GCC will eliminate the memcpy and the unneeded variables.

$ cat test.c
#include <string.h>

int convert_float_to_int(float f) {
  int i;
  memcpy(&i, &f, sizeof(float));
  return i;
}
$ gcc -O2 -S test.c
$ cat test.s
        .file "test.c"
        .text
        .p2align 4,,15
.globl convert_float_to_int
        .type convert_float_to_int, @function
convert_float_to_int:
.LFB22:
        .cfi_startproc
        movss %xmm0, -4(%rsp)
        movl -4(%rsp), %eax
        ret
        .cfi_endproc
.LFE22:
        .size convert_float_to_int, .-convert_float_to_int
        .ident "GCC: (Ubuntu 4.4.3-4ubuntu5) 4.4.3"
        .section .note.GNU-stack,"",@progbits
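A variant of the same memcpy technique with the size assumption checked at compile time, as a sketch only. It assumes a C11 compiler (for _Static_assert) and IEEE 754 single precision floats; the name `float_bits` is made up here.

```c
#include <stdint.h>
#include <string.h>

/* Fails to compile if float is not exactly 32 bits wide. */
_Static_assert(sizeof(uint32_t) == sizeof(float),
               "float must be 32 bits for this conversion");

/* Returns the object representation of f as a 32-bit integer. */
uint32_t float_bits(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}
```

Under IEEE 754, float_bits(1.0f) is 0x3f800000.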

> Because they were safe, until the gcc maintainers decided to break them (and the LLVM maintainers follow them like lemmings).

They were never completely safe, although the cases where they broke were rare. Today it happens more often. This is not the end of the world, but it is something you must know and accept.

C as portable assembler

Posted May 31, 2011 18:05 UTC (Tue) by anton (guest, #25547) [Link]

The funny thing is that new gcc (at least up to 4.4) still generates code for signed addition that wraps around instead of code that traps on overflow. It's as if the aim of the gcc maintainers was to be least helpful to everyone: for the low-level coders, miscompile their code silently; for the specification pedants, avoid giving them ways to detect when they have violated the spec.

C as portable assembler

Posted May 31, 2011 21:00 UTC (Tue) by jrn (subscriber, #64214) [Link]

There are -fwrapv and -ftrapv. If you think one of those should be the default, no doubt there are some other optimization tweaks (-fno-strict-aliasing?) that you also like; so I encourage you to work on an -Oanton switch in either a wrapper for gcc or gcc itself, for the sake of sharing.

I am pretty happy with -O2 for my own needs, since it generates fast code for loops, but I understand that different situations may involve different requirements and would be happy to live in a world with more people helping make the gcc UI more intuitive.
