
McIntyre: Scanning for assembly code in Free Software packages

Posted Apr 2, 2013 17:32 UTC (Tue) by butlerm (guest, #13312)
In reply to: McIntyre: Scanning for assembly code in Free Software packages by stevem
Parent article: McIntyre: Scanning for assembly code in Free Software packages

There are places where the semantics of the C language are too weak to avoid assembly language, atomic operations in particular. Performance-wise, there are also major issues with the inability of C code to take advantage of the carry bit. Ideally you could write (x + y) >> 32 to get the carry out of a 32-bit addition, but on a 32-bit machine you either can't do that or it is very slow.
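For illustration only, the widening form of that idea might look like the sketch below. The function name `carry_out_wide` is hypothetical, and how cheaply the 64-bit addition is compiled on a 32-bit target depends on the compiler and architecture.

```c
#include <stdint.h>

/* Carry out of a 32-bit addition, obtained by widening both operands
 * to 64 bits and looking at bit 32 of the sum.
 * carry_out_wide is an illustrative name, not from the thread. */
uint32_t carry_out_wide(uint32_t x, uint32_t y)
{
    uint64_t wide = (uint64_t)x + (uint64_t)y;  /* cannot overflow 64 bits */
    return (uint32_t)(wide >> 32);              /* 0 or 1 */
}
```

The result is 1 exactly when the 32-bit sum would have wrapped, and 0 otherwise.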



McIntyre: Scanning for assembly code in Free Software packages

Posted Apr 2, 2013 17:46 UTC (Tue) by stevem (subscriber, #1512) [Link]

Of course, there are a number of places where C just can't/won't do the job.

Atomics is a good example. BUT: I think there's no excuse for lots and lots of people all using assembly for locking directly in their code, with all the attendant porting and maintenance problems. The compiler should have working builtins for whatever you need here, on any platform the compiler supports. If it doesn't (or they're too slow, or whatever), then that's a bug and it's easily fixable once - not in every program out there.

A lot of the other uses of assembly are similar, from what I've seen in this study. I was shocked to see how many people were using x86 assembly for trivial bitops or byte-swapping.
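As a rough sketch of what the byte-swapping case can look like without assembly, here is a plain C version next to the GCC/Clang builtin `__builtin_bswap32`. The function names `bswap32_portable` and `bswap32_builtin` are made up for this example.

```c
#include <stdint.h>

/* Plain C byte swap: move each byte to its mirrored position. */
uint32_t bswap32_portable(uint32_t v)
{
    return ((v & 0x000000FFu) << 24) |
           ((v & 0x0000FF00u) <<  8) |
           ((v & 0x00FF0000u) >>  8) |
           ((v & 0xFF000000u) >> 24);
}

/* Same operation through the compiler builtin where it is available,
 * falling back to the plain C version elsewhere. */
uint32_t bswap32_builtin(uint32_t v)
{
#if defined(__GNUC__)
    return __builtin_bswap32(v);
#else
    return bswap32_portable(v);
#endif
}
```

Both functions should return the same value for every input; the builtin simply leaves the choice of instruction to the compiler.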

McIntyre: Scanning for assembly code in Free Software packages

Posted Apr 2, 2013 18:23 UTC (Tue) by JoeBuck (guest, #2330) [Link]

If you have an older package that uses assembly language for performance, it might be worth re-evaluating whether the assembly code still beats the GCC output.

If you think that you need to write assembly language because you need atomic operations, you should first read the GCC manual and learn about the __sync and __atomic builtins (these are also supported by LLVM and Intel's compiler, so you aren't locking yourself into GCC). The compiler will then choose the correct implementation for the target architecture, so your program works on ARM and Sparc even if you don't know the assembly language for those architectures.
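A minimal sketch of the `__atomic` builtins in use, assuming a GCC-compatible compiler. The names `demo_counter`, `demo_lock`, `counter_increment`, `spin_try_lock` and `spin_unlock` are hypothetical, and the memory orderings shown are one reasonable choice, not the only one.

```c
#include <stdint.h>

/* Hypothetical shared state for the example. */
uint32_t demo_counter;
int demo_lock;

/* Atomically add 1 and return the new value. */
uint32_t counter_increment(void)
{
    return __atomic_add_fetch(&demo_counter, 1, __ATOMIC_SEQ_CST);
}

/* Try to take the lock: swap 0 -> 1 atomically.
 * Returns 1 on success, 0 if the lock was already held. */
int spin_try_lock(int *lock)
{
    int expected = 0;
    return __atomic_compare_exchange_n(lock, &expected, 1,
                                       0 /* strong */,
                                       __ATOMIC_ACQUIRE,
                                       __ATOMIC_RELAXED);
}

/* Release the lock with a store that publishes earlier writes. */
void spin_unlock(int *lock)
{
    __atomic_store_n(lock, 0, __ATOMIC_RELEASE);
}
```

No target-specific assembly appears here; the compiler picks the instructions (or library calls) for the architecture being built.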

There will still be specialized cases where assembly language might help, but it makes the program less portable (unless fallback C/C++ code is provided, and then maybe it makes sense to try to reduce the gap between the C++ and the assembly performance by improving the code or possibly by helping the GCC folks to improve the compiler by providing good bug reports).

McIntyre: Scanning for assembly code in Free Software packages

Posted Apr 2, 2013 19:03 UTC (Tue) by Aliasundercover (subscriber, #69009) [Link]

There is another option. If it ain't broke, don't fix it. Just because you might do something different with today's tools and hardware doesn't mean you should open up old working code spending time creating new bugs. There is time to deal with portability when doing a port.

McIntyre: Scanning for assembly code in Free Software packages

Posted Apr 2, 2013 20:22 UTC (Tue) by justincormack (subscriber, #70439) [Link]

A lot of this code, judging from the report, is not working. Open source code is expected to work on new architectures and with new C compilers, but clearly a lot of this code does not. Upstream does not "do a port" just because Debian or Fedora is doing a port. And if a new version of gcc breaks your code because it assumed old behaviour that "ain't broke", then it is broke. Binary versions might work, but source is what matters here.

McIntyre: Scanning for assembly code in Free Software packages

Posted Apr 2, 2013 22:55 UTC (Tue) by robert_s (subscriber, #42402) [Link]

I wonder if this is the right place to bring up the last notable case of a distribution package maintainer (no less) "fixing" things that they didn't 100% understand in a package, after going for a semi-automated trawl.

(Debian & OpenSSL for those who don't remember)

McIntyre: Scanning for assembly code in Free Software packages

Posted Apr 2, 2013 21:15 UTC (Tue) by FranTaylor (guest, #80190) [Link]

Anything written in assembler is clearly "broke" with respect to portability, which is the criterion in question.

To put a finer point on it, incomprehensible code that "just works" should be put high up on the list of things to FIX, not "leave alone".

Honestly your "old saw" about "leaving things alone" is just POOR ENGINEERING PRACTICE.

---

Programs must be written for people to read, and only incidentally for machines to execute.

- H. Abelson and G. Sussman (in "Structure and Interpretation of Computer Programs")

McIntyre: Scanning for assembly code in Free Software packages

Posted Apr 2, 2013 21:24 UTC (Tue) by dlang (subscriber, #313) [Link]

who said that this code is "incomprehensible"?

But in any case, if you re-write incomprehensible code, you are almost guaranteed that the result is code that doesn't do the job that the original did, because you don't fully understand the problems that the code is solving.

You probably understand the more obvious problems, but the subtle problems and corner cases will bite you.

That doesn't mean that you should never re-write something, but rather that when you do so, you need to recognize that you aren't going to get it right on the first try, and you need to be sure that the value of having the new code (leaner/faster/better documented/etc) is greater than the effort to re-write the code AND then debug the code after it hits the real world (including whatever damage the bugs can do).

McIntyre: Scanning for assembly code in Free Software packages

Posted Apr 3, 2013 3:45 UTC (Wed) by rsidd (subscriber, #2582) [Link]

You are taking "incomprehensible" literally. No code is incomprehensible. But taking assembly code that takes an hour to understand, and replacing it with C code that takes 5 minutes to understand, is a win, especially if you are the maintainer (it may not be worth it if it's an obscure package and you're a distro packager).

McIntyre: Scanning for assembly code in Free Software packages

Posted Apr 2, 2013 22:13 UTC (Tue) by Aliasundercover (subscriber, #69009) [Link]

> Honestly your "old saw" about "leaving things alone" is just POOR ENGINEERING PRACTICE.

There is a reason why software has a reputation for mickey mouse engineering. Even the things that did once work break in the endless update churn. Other fields respect leaving working designs alone until there is a genuine need to change them and time to verify those changes are correct.

Even this field respected leaving working things alone before security paranoia set in. Now we have an endless arms race with the hackers and a new set of patches every time you look away. Only hack resistance is served while all other measures of quality suffer.

Since you liked my last old saw so much I have another for you. There is no such thing as portable software, only software that has been ported.

McIntyre: Scanning for assembly code in Free Software packages

Posted Apr 2, 2013 22:24 UTC (Tue) by xbobx (subscriber, #51363) [Link]

> > Honestly your "old saw" about "leaving things alone" is just POOR ENGINEERING PRACTICE.

> There is a reason why software has a reputation for mickey mouse engineering.

Both are true. In mechanical or civil engineering, just because a bridge hasn't fallen over yet doesn't mean that it doesn't need to be monitored for flaws and maintained to stay up to code. Then again, a perfectly good concrete bridge doesn't need to be replaced by a fancy new suspension bridge just because suspension bridges are all the rage nowadays.

Engineering is the practice of applying judgement to decide when the current solution is sufficient and can be left alone, or needs refinement and to what extent. Doing either extreme by default is going to bite you.

QotW

Posted Apr 3, 2013 20:56 UTC (Wed) by man_ls (guest, #15091) [Link]

> Engineering is the practice of applying judgement to decide when the current solution is sufficient and can be left alone, or needs refinement and to what extent. Doing either extreme by default is going to bite you.

Good Quote of the Week, if you ask me.

McIntyre: Scanning for assembly code in Free Software packages

Posted Apr 3, 2013 8:57 UTC (Wed) by ssam (guest, #46587) [Link]

the new bugs you get in an update are because some change has unforeseen consequences. this probably happens a lot because software is complex with many interdependent parts, some of them more fragile than you would expect.

so modifying any code is potentially dangerous, and needs to be tested. translating asm to C may introduce a subtle behaviour change. but if the change is in a corner case, it's quite possible that it was doing the wrong thing in asm and no one ever noticed.

maybe the asm version is fast because it does not check for alignment, or that something is non-zero (maybe poor examples). maybe when the asm was written all the data was aligned, and x was never zero, but that assumption might not always be true.

so replacing a fragile bit of asm with a robust bit of C might be a very good thing. (not that all asm is fragile, or all c is robust. but i am sure the compiler and static analysis tools can give you much better warnings for the C).

Lack of CarryOut in C

Posted Apr 2, 2013 18:45 UTC (Tue) by jreiser (subscriber, #11027) [Link]

> the inability of C code to take advantage of the carry bit

Amen. However, sometimes ((unsigned)(x+y) < (unsigned)x) plus a comment is good enough (courtesy of MIPS, which has no Carry in hardware.)

That still isn't good enough for decoding a big-endian bitstream, which wants both CarryOut and Zero after ((x<<=1)|CarryIn).

Lack of CarryOut in C

Posted Apr 2, 2013 19:24 UTC (Tue) by brunowolff (guest, #71160) [Link]

This risks getting removed during optimization.

Lack of CarryOut in C

Posted Apr 2, 2013 20:14 UTC (Tue) by pbonzini (subscriber, #60935) [Link]

Not for ((unsigned)x+(unsigned)(y) < (unsigned)x). jreiser almost got it right.
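To make the difference concrete, here is a small sketch of the comparison trick with both operands unsigned before the addition, so the sum wraps with defined behaviour instead of overflowing a signed type. The helpers `add_carries` and `add_words` are hypothetical names, and the multi-word loop is just one way the carry could be chained.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if x + y wraps around 32 bits, 0 otherwise.
 * Unsigned wraparound is well defined, so the comparison
 * cannot be optimized away as "never true". */
unsigned add_carries(uint32_t x, uint32_t y)
{
    uint32_t sum = x + y;
    return sum < x;
}

/* Adds two n-word little-endian numbers, propagating the carry
 * from word to word. Returns the final carry out (0 or 1). */
uint32_t add_words(uint32_t *dst, const uint32_t *a,
                   const uint32_t *b, size_t n)
{
    uint32_t carry = 0;
    for (size_t i = 0; i < n; i++) {
        uint32_t s  = a[i] + carry;
        uint32_t c1 = s < carry;   /* wrapped while adding the carry in */
        s += b[i];
        uint32_t c2 = s < b[i];    /* wrapped while adding b[i] */
        dst[i] = s;
        carry = c1 | c2;           /* at most one of them can be set */
    }
    return carry;
}
```

Whether a compiler turns this into an add-with-carry instruction is a quality-of-implementation question; the sketch only shows that the carry is expressible in C.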

Lack of CarryOut in C

Posted Apr 3, 2013 3:28 UTC (Wed) by tterribe (✭ supporter ✭, #66972) [Link]

> ((x<<=1)|CarryIn)

Conveniently, x<<=1 can be implemented as x=(unsigned)x+(unsigned)x, which reduces this to a previously solved problem. But honestly if you're decoding a bitstream a bit at a time, there are better optimizations to be done.
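A sketch of that reduction, assuming 32-bit words: the shift is done as an unsigned self-addition, the carry out falls out of the same comparison as before, and the zero test is an ordinary compare. The struct `shift_result` and the function `shl1_with_carry` are hypothetical names for this example.

```c
#include <stdint.h>

/* Result of shifting one bit left with a carry fed into bit 0. */
struct shift_result {
    uint32_t value;      /* (x << 1) | carry_in */
    unsigned carry_out;  /* the bit shifted out of the top */
    unsigned zero;       /* 1 if value is 0 */
};

struct shift_result shl1_with_carry(uint32_t x, unsigned carry_in)
{
    struct shift_result r;
    uint32_t doubled = x + x;             /* same as x << 1, wraps on overflow */
    r.carry_out = doubled < x;            /* wrapped exactly when the top bit was set */
    r.value = doubled | (carry_in & 1u);  /* bit 0 is free after doubling */
    r.zero = (r.value == 0);
    return r;
}
```

This only shows that both flags are recoverable in C; as the comment says, decoding a bitstream one bit at a time is rarely the best approach anyway.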

