Glibc change exposing bugs

Posted Nov 10, 2010 19:46 UTC (Wed) by rodgerd (guest, #58896)
In reply to: Glibc change exposing bugs by clugstj
Parent article: Glibc change exposing bugs

After the o_ponies poo-flinging from kernel developers in the direction of app developers, it's pretty funny seeing the lead kernel developer complaining about... code conforming to its documented behaviour.

Glibc change exposing bugs

Posted Nov 10, 2010 20:03 UTC (Wed) by corbet (editor, #1) [Link] (37 responses)

That's a germane example, actually. "Poo flinging" notwithstanding, the kernel developers fixed things so that applications would not lose data even if they weren't following standard behavior. Not breaking things was seen as more important than doing something because the posted rules say you can.

I don't believe that Linus (or anybody else) is saying that the broken applications are not buggy. What I'm hearing is that those applications have worked for years and that people should think for a long time before introducing a change which breaks them. Thus, Linus asks: what's the benefit that justifies such a change? I think it's a reasonable question.

Glibc change exposing bugs

Posted Nov 10, 2010 20:10 UTC (Wed) by jwb (guest, #15467) [Link] (18 responses)

There are a huge variety of improvements to Linux which have broken or will break Flash Player. For example, Flash abused the ALSA API until Pulse came along and exposed that abuse.

Glibc change exposing bugs

Posted Nov 10, 2010 20:38 UTC (Wed) by neilbrown (subscriber, #359) [Link] (17 responses)

This all sounds like a very strong recommendation in favour of Rusty Russell's Maxim of API development: APIs should be hard to misuse. memcpy, and apparently ALSA, are easy to misuse.

So implementing memcpy as memmove - which Linus says in the bugzilla threads is largely what the kernel does - sounds very sensible. memmove is much harder to misuse.

Glibc change exposing bugs

Posted Nov 13, 2010 1:07 UTC (Sat) by rriggs (guest, #11598) [Link] (3 responses)

memmove: safe, fast, verbose function name
memcpy: unsafe, at least as fast as memmove, one less character to type

Which one do you think your average C programmer will choose?

Which one do you think new programmers are taught to use (in schools that still teach C programming)?

Glibc change exposing bugs

Posted Nov 13, 2010 2:52 UTC (Sat) by neilbrown (subscriber, #359) [Link] (1 responses)

So we can save the world by creating a 'memmv' in glibc which aliases memmove? Brilliant!

Glibc change exposing bugs

Posted Nov 15, 2010 16:43 UTC (Mon) by renox (guest, #23785) [Link]

Too late!

And memcpy should also be named mem_unsafe_copy. But yes, if you tell developers to use safe functions by default and to optimize only when they can show benchmarks proving that the optimisation makes a difference, then you'd probably get better software (if a bit slower).

Glibc change exposing bugs

Posted Oct 17, 2013 12:49 UTC (Thu) by jzbiciak (guest, #5246) [Link]

You're calling memmove verbose as compared to memcpy? Even Ken Thompson said if he had it to do over, he'd spell creat() with the final 'e'.

Glibc change exposing bugs

Posted Nov 25, 2010 15:13 UTC (Thu) by Spudd86 (subscriber, #51683) [Link] (2 responses)

It's not so much that ALSA is easy to misuse (although it probably is), it's that certain parts of it are impossible to emulate from userspace. An app that actually NEEDS those parts is NOT misusing the API when it uses them (for example, pulseaudio itself uses those bits).

The problem is that most apps don't actually need those bits, so they just needlessly break software like pulseaudio (and also break on bluetooth audio).

Pulseaudio does use those unemulatable APIs, but it also falls back if they don't work, and it has good reasons to use them: it can hand over large chunks of audio data and still change that same data later (for example, if something else starts playing audio). This saves power because pulse won't wake your CPU as often. But those APIs don't emulate well, and until pulse came along nobody had ever tried to do that sort of thing, so it broke.

Glibc change exposing bugs

Posted Nov 25, 2010 16:07 UTC (Thu) by foom (subscriber, #14868) [Link] (1 responses)

If the documentation wasn't so terrible, this probably wouldn't be a problem. It doesn't give *any* clue that, for example, a developer shouldn't use the mmap functions. In fact it makes it sound like you should use them, because they're zero-copy (and that's better, right?)

Glibc change exposing bugs

Posted Nov 25, 2010 16:39 UTC (Thu) by Spudd86 (subscriber, #51683) [Link]

Well yea, but Lennart Poettering does have a blog post where he says exactly what subset of ALSA's API you should restrict yourself to; perhaps someone should put that into the ALSA docs.

Glibc change exposing bugs

Posted Oct 17, 2013 12:42 UTC (Thu) by jzbiciak (guest, #5246) [Link] (9 responses)

One major reason the remaining distinction between memcpy and memmove exists in the standard seems to be this:

To write memmove completely within conformant C, you need a malloc and a double-copy. That's because in that mythical Platonic ideal of a language, you cannot compare two pointers that do not point into the same object, and you are not guaranteed that the arguments to memmove point within the same object. That is, a fully compliant memmove would look something like this:

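    #include <stdlib.h>   /* malloc, free, size_t */

    void *memmove(void *dst, const void *src, size_t len)
    {
        const char *srcc = src;
        char *dstc = dst;
        char *temp = malloc(len);
        size_t i;

        /* What if 'malloc' fails?  call abort()?  Unspecified! */

        /* First pass: copy the whole source into the temporary buffer. */
        for (i = 0; i != len; i++)
            temp[i] = srcc[i];

        /* Second pass: copy the temporary buffer to the destination. */
        for (i = 0; i != len; i++)
            dstc[i] = temp[i];

        free(temp);
        return dst;
    }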

And, on 16-bit segmented computers or other computers lacking flat memory spaces, both of which are rather from a Platonic ideal, comparing two pointers isn't always as straightforward as you might like. So practically, memcpy offers some noticeable performance benefits on those machines.

Yes, I'm aware that the actual language in the standard says 'as if' the source was first copied to a temporary array. But, as I recall, a fully conformant C program has no other option. The 'as-if' clause allows library writers to avoid such shenanigans, without requiring them to do so. So much hair-splitting...

If it weren't for that, you could make the argument that separate memcpy and memmove were historical accidents, and change the C standard at some point to remove the restrictions on memcpy to make them both equivalent. That new memcpy would then adhere to Rusty's Maxim, or at least come much closer. And, from the thread linked above, that's pretty much what BSD did, it sounds like.

As a half step, you could define memcpy as always copying forward, to make "sliding down" safe, but that just seems a little goofy for a number of reasons.
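For comparison with the double-copy version above, here is a minimal sketch of the shortcut that the 'as-if' rule permits on a flat address space: compare the two addresses and pick a copy direction. The name flat_memmove is hypothetical, and the uintptr_t comparison is an assumption about the platform rather than something the C standard promises to be meaningful. This is not glibc's implementation.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical name; assumes a flat address space in which converting
 * pointers to uintptr_t yields values that order like the addresses. */
void *flat_memmove(void *dst, const void *src, size_t len)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    if ((uintptr_t)d <= (uintptr_t)s) {
        /* Destination at or below source ("sliding down"): copy forward. */
        for (size_t i = 0; i < len; i++)
            d[i] = s[i];
    } else {
        /* Destination above source: copy backward so every source byte
         * is read before it can be overwritten. */
        for (size_t i = len; i > 0; i--)
            d[i - 1] = s[i - 1];
    }
    return dst;
}
```

The forward branch on its own is the "always copying forward" half step: it handles sliding down but not sliding up.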

I'm personally with Linus that the glibc breakage seems gratuitous. I'd lean towards making memcpy and memmove equivalent if their performance is largely indistinguishable. Arguing that the software is broken when it worked for years with the old library reminds me of this silly meme. It's the kind of hair-splitting that only a bureaucrat or chapter-verser could love.

Glibc change exposing bugs

Posted Oct 17, 2013 12:44 UTC (Thu) by jzbiciak (guest, #5246) [Link] (8 responses)

...rather far from a Platonic ideal...

Need. More. Coffee.

Glibc change exposing bugs

Posted Oct 18, 2013 13:31 UTC (Fri) by meuh (guest, #22042) [Link] (7 responses)

... or a time machine to go back to 2010 ...

If we were on "stackoverflow", you would have earned the "Necromancer" badge ;)

Glibc change exposing bugs

Posted Oct 18, 2013 14:08 UTC (Fri) by jzbiciak (guest, #5246) [Link] (6 responses)

Yeah, I was up late and followed a link into the ancient thread. The next morning, I resumed reading, forgetting I was in a 3 year old thread. Ah well. :-)

Glibc change exposing bugs

Posted Oct 21, 2013 20:37 UTC (Mon) by nix (subscriber, #2304) [Link] (5 responses)

Your comment was interesting anyway. This is the relevant guarantee from C89 (C99 and C11 have similar wording):

If two pointers to object or incomplete types compare equal, they point to the same object. If two pointers to functions compare equal, they point to the same function. If two pointers point to the same object or function, they compare equal. If one of the operands is a pointer to an object or incomplete type and the other has type pointer to a qualified or unqualified version of void , the pointer to an object or incomplete type is converted to the type of the other operand.

The problem here is that this does not guarantee that two pointers to the same object always compare equal, but rather that if they compare equal, they are pointers to the same object (and similarly for comparison operators). We can tell if two pointers definitely are pointers within the same object, but if the comparison fails we cannot conclude anything. This is unfortunately the opposite of the guarantee that memmove() needs if it is to transform itself into a memcpy() when needed, so (in the absence of a Standard-blessed way to normalize pointers) you are indeed forced to do a double-copy at all times when writing memmove() in Standard C.

Glibc change exposing bugs

Posted Oct 21, 2013 20:49 UTC (Mon) by khim (subscriber, #9252) [Link] (4 responses)

> This is unfortunately the opposite of the guarantee that memmove() needs if it is to transform itself into a memcpy() when needed, so (in the absence of a Standard-blessed way to normalize pointers) you are indeed forced to do a double-copy at all times when writing memmove() in Standard C.

Note that in the real world there is no such guarantee (hint, hint), thus GLibC's memmove sometimes works and sometimes does not work.

Glibc change exposing bugs

Posted Oct 23, 2013 14:23 UTC (Wed) by nix (subscriber, #2304) [Link] (3 responses)

You appear to have read what I said exactly backwards. Of *course* C on Unix conforms to the guarantee that pointers that compare equal will point to the same object! What you can do with mmap() is produce two pointers that do *not* compare equal but which nevertheless point to the same object. This is exactly what torpedoes a fast auto-reducing-to-memcpy() memmove() implementation, since there is no way to efficiently tell if two pointers point into the same aliased region: even modifying the region via one pointer and probing via the other won't work because they could be pointing at different parts of the aliased region rather than e.g. at the start of it (you are not restricted to call memcpy()/memmove() on pointers returned from malloc(): you can copy parts of objects, and the like).

This behaviour is explicitly permitted by the Standard: segmented architectures like MS-DOS were like this decades ago. The guarantee that a == b returns nonzero only when a and b are pointers to the same object holds nonetheless. It's just a less useful guarantee than we might like.
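A minimal sketch of the aliasing case, assuming a POSIX system with mmap() and a writable temporary directory. The helper name map_twice is made up for illustration. It maps the same page of a temporary file at two addresses, so the two pointers compare unequal while referring to the same underlying memory.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: map one page of a temporary file twice.
 * Returns 0 on success and stores the two views in *a and *b. */
int map_twice(char **a, char **b)
{
    long page = sysconf(_SC_PAGESIZE);
    FILE *f = tmpfile();

    if (f == NULL || page <= 0)
        return -1;
    if (ftruncate(fileno(f), page) != 0) {
        fclose(f);
        return -1;
    }
    /* Two independent shared mappings of the same file offset. */
    *a = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE, MAP_SHARED,
              fileno(f), 0);
    *b = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE, MAP_SHARED,
              fileno(f), 0);
    fclose(f); /* mappings remain valid after the descriptor is closed */
    if (*a == MAP_FAILED || *b == MAP_FAILED)
        return -1;
    return 0;
}
```

After a successful call, a store through one view is visible through the other even though the two pointers differ, which is exactly the situation an address comparison cannot detect.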

Glibc change exposing bugs

Posted Oct 23, 2013 15:47 UTC (Wed) by khim (subscriber, #9252) [Link] (2 responses)

My point was that real-world GLibC-implemented memmove does not actually work when used on a POSIX system. It compares pointers and assumes that if they are different then underlying memory is also different!

Which means, strictly speaking, that memmove in GLibC is not standards-compliant :-)

Glibc change exposing bugs

Posted Oct 23, 2013 16:47 UTC (Wed) by jzbiciak (guest, #5246) [Link]

From my perspective, the two of you are in violent agreement. :-)

Glibc change exposing bugs

Posted Oct 23, 2013 18:09 UTC (Wed) by nix (subscriber, #2304) [Link]

What? Surely not...

... bloody hell, it does. Or many of the assembler versions do anyway. Or, rather, it assumes that distinct addresses cannot alias.

I suppose this is probably safe in practice, because if you *do* use mmap() to set up aliased regions at distinct addresses you are suddenly in hardware-dependent land (due to machines with VIPT caches such as, IIRC, MIPS, not being able to detect such aliasing at the caching level, so you suddenly need to introduce -- necessarily hardware-dependent -- cache flushes and memory barriers) so you have to know what you're doing anyway, and little things like memmove() falling back to memcpy() at unexpected times are things you're supposed to know about.

I hope.

Glibc change exposing bugs

Posted Nov 10, 2010 20:56 UTC (Wed) by clugstj (subscriber, #4020) [Link] (8 responses)

Even if there isn't a currently demonstrable benefit, there could be in the future, so why not get the buggy code fixed now instead of later? Yes, it's a very fine line, but, just my opinion, I don't have a problem w/ GlibC not reverting the change.

Glibc change exposing bugs

Posted Nov 10, 2010 22:24 UTC (Wed) by lmb (subscriber, #39048) [Link] (6 responses)

Because the user whose data has just been corrupted or whose important business meeting presentation just crashed or whose mail has been eaten no longer cares, and has switched to a less hostile platform.

That behavior is undefined makes one only right as far as technicality is concerned; it does not imply that changing it silently is good software engineering practice, nor that it is right in terms of software providing a service to users.

Glibc change exposing bugs

Posted Nov 10, 2010 23:58 UTC (Wed) by nix (subscriber, #2304) [Link] (4 responses)

But the compiler makes undefined stuff break all the time, and the set of undefined stuff which is broken is changed by all sorts of things. Nobody complained when LTO came in, although it surely broke programs relying on numerous instances of undefined behaviour which had been harmless before due to wider optimization opportunities when optimizing across translation units. So why complain about this? Just because Flash was affected?

Glibc change exposing bugs

Posted Nov 11, 2010 2:29 UTC (Thu) by foom (subscriber, #14868) [Link] (3 responses)

Because changes in the compiler don't break already-installed working binaries. They break newly compiled versions of software. Presumably such newly-compiled software gets tested, and if there's a problem, the program is perhaps recompiled with an older version of the compiler until the issue is fixed.

Glibc change exposing bugs

Posted Nov 11, 2010 7:29 UTC (Thu) by nix (subscriber, #2304) [Link] (2 responses)

That's a very large assumption indeed. When glibc gets recompiled, is everything on the distro tested? When libpng gets recompiled (every other week), is everything that uses it tested? I doubt it.

Glibc change exposing bugs

Posted Nov 11, 2010 17:30 UTC (Thu) by foom (subscriber, #14868) [Link] (1 responses)

Shrug, yet still, even if it was only discovered sometime later...if there was a bug in the new libpng binary that only appeared because it was compiled with a new gcc, it's still a bug in the new libpng binary that can be fixed by uploading yet another new libpng binary.

Here we have a new bug in flash which appeared without a new version of the flash binary being uploaded. It's a substantively different situation.

Glibc change exposing bugs

Posted Nov 12, 2010 7:12 UTC (Fri) by hozelda (guest, #19341) [Link]

If you update your libpng then the corruption has already happened, just as if you update glibc. [But the odds that problems will arise are higher when you update glibc because of its vast use.]

If you are that worried, you should work off stable versions or off a stable distributor that will manage this for you. You should not change key parts of the system if possible. glibc is a very key part. You should not update for optimizations, at least not without significant tests and only if you think the gains are worthwhile. Stick to security updates or when a crucial problem has been solved.

Anyway, when an important "bug" like this comes up, projects should audit the code. In this case, the possible entry points to potential problems can be identified quickly for many projects (just search for memcpy).

The case of glibc involves well-defined standards. Most libraries do not have such carefully defined semantics, and we must rely on access to source code for the juicy bits.

OK, despite what I just said, if the gains here are not that useful, glibc should revert, at least for the time being. Reverting should not hurt those that adjusted already and will save those that have not. On the other hand, when will be the right time to change? Will people remember to fix this problem or will we just have a repeat later on? [Again, if the gains are negligible, the change in glibc should probably be avoided.]

Glibc change exposing bugs

Posted Nov 11, 2010 0:50 UTC (Thu) by MattPerry (guest, #46341) [Link]

> That behavior is undefined makes one only right as far as technicality
> is concerned;

But it is defined. The man page says not to use that function on overlapping regions. That applications ignored that and still functioned for so long is more a matter of good luck. That luck has run out due to their poor implementations and they should now be fixed.

Glibc change exposing bugs

Posted Nov 11, 2010 1:15 UTC (Thu) by Lovechild (guest, #3592) [Link]

Perhaps if somehow one could get it to emit a warning message instead of crashing that might work. For now it might be best done in testing environments such as being enabled perhaps during the development cycle of a distribution.

Glibc change exposing bugs

Posted Nov 11, 2010 2:41 UTC (Thu) by quotemstr (subscriber, #45331) [Link] (8 responses)

> the kernel developers fixed things so that applications would not lose data even if they weren't following standard behavior

What some filesystem developers propose applications do isn't defined by any standard. POSIX, SuS, and so on don't state what happens after a crash, fsync() or not. The argument was over what to do in certain circumstances outside any standard. The argument was most muddled because one side kept claiming that its brand of brain damage was endorsed by the standard. Fortunately, sanity prevailed. Calling fsync() after every rename would have inconvenienced application developers and decreased performance.

memcpy, on the other hand, is clearly described by the relevant standards. Application developers deserve what they get.

Glibc change exposing bugs

Posted Nov 11, 2010 8:07 UTC (Thu) by bojan (subscriber, #14302) [Link] (7 responses)

> The argument was most muddled because one side kept claiming that its brand of brain damage was endorsed by the standard.

He, he... Nice try :-)

Nothing could be further from the truth. The problem is that the standard doesn't _specify_ in which order things should happen on the underlying FS, which then gives implementers the ability to implement _any_ order (which they do). Relying on a _particular_ order (which is completely undocumented, of course) by application writers is the problem.

The suggestion about the specification not dealing with crashes is irrelevant, because, once again, it doesn't specify _any_ behaviour. In other words, if your FS is hosed completely after a crash, that's OK. If it's half hosed, that's OK too. If it's completely OK, that's OK as well. Obviously, the _interesting_ case is when it's completely OK, in which case the _implemented_ ordering actually makes a difference. And, once again, _any_ ordering is OK, because the standard specifies _none_.

The only difference between this and the memcpy() fiasco is that in the case of rename() folks may get an _impression_ that the operation is atomic on the FS level, because it is atomic as viewed from the processes currently running on the system. Of course, this is documented nowhere, but is a common misreading of the standard.

With memcpy() it is quite clear overlapping regions should be copied with memmove().

Glibc change exposing bugs

Posted Nov 11, 2010 8:32 UTC (Thu) by Mook (subscriber, #71173) [Link] (6 responses)

Funnily enough... that has to do with glibc too. In particular, its manual on rename(): http://www.gnu.org/s/libc/manual/html_node/Renaming-Files...

Yes, glibc's rename() API guarantees atomic renames. Since normal applications do not make syscalls directly, but call the libc API to do it on their behalf, they are not to blame.

Glibc change exposing bugs

Posted Nov 11, 2010 8:46 UTC (Thu) by bojan (subscriber, #14302) [Link] (5 responses)

And even more "funnily", glibc doesn't deal with file system implementation (i.e. the persistence of the change) at all. In fact, that very page you pointed to states that strange things may indeed happen after a crash.

The atomicity of rename() refers to a view from the running system and not much else. But it has sure been misread a lot :-)

Glibc change exposing bugs

Posted Nov 11, 2010 9:06 UTC (Thu) by Mook (subscriber, #71173) [Link] (4 responses)

Hmm, odd; I parse "If there is a system crash during the operation, it is possible for both names to still exist; but newname will always be intact if it exists at all. " as "the file named by the destination will either not exist, or have some sort of sensible value, but not be truncated at zero bytes unless that was one of the two inputs".

Glibc change exposing bugs

Posted Nov 11, 2010 9:52 UTC (Thu) by bojan (subscriber, #14302) [Link] (3 responses)

You are confusing file names (i.e. what is recorded in the directory) with contents of files.

Glibc change exposing bugs

Posted Nov 11, 2010 13:49 UTC (Thu) by pbonzini (subscriber, #60935) [Link] (2 responses)

"intact" seems to refer to the contents?

Glibc change exposing bugs

Posted Nov 11, 2010 23:05 UTC (Thu) by bojan (subscriber, #14302) [Link] (1 responses)

Suppose there are two entries in the directory, with oldname being renamed to newname, and each (obviously) pointing to an inode. If the system crashes during the rename, it is possible that both will survive (because the directory was not committed to disk yet).

What glibc docs are talking about is that rename() is not implemented by copying content of the oldname to newname. So, if there was newname before rename and the directory commit doesn't go through, the content of newname will not be changed. It is a pure directory operation. On the other hand, if the directory gets committed, there will be just newname there, pointing to whatever content oldname had. All of that is if your FS knows how to survive a crash - otherwise situation is not interesting (well, unless you're the sysadmin recovering the mess :-).

Now note the situation from the ext4 "problem". The oldname content was not fsync()-ed to disk before the rename(). Ergo, when the directory got committed, oldname became newname on disk, pointing to zero bytes, due to delayed allocation. This has nothing to do with the fact that on unsuccessful (i.e. not committed before the crash) rename(), both oldname and newname would remain in the directory.
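A minimal sketch of the sequence under discussion, assuming POSIX open()/write()/fsync()/rename(). The helper name replace_file and its parameters are made up for illustration. The point is only the ordering: the new content is flushed with fsync() before rename() swaps the directory entry, so a committed rename cannot point at unwritten data.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: replace `path` with `len` bytes from `data` by
 * writing `tmp_path`, flushing it, then renaming it into place. */
int replace_file(const char *path, const char *tmp_path,
                 const void *data, size_t len)
{
    int fd = open(tmp_path, O_WRONLY | O_CREAT | O_TRUNC, 0644);

    if (fd < 0)
        return -1;
    /* A short write is treated as failure to keep the sketch small. */
    if (write(fd, data, len) != (ssize_t)len || fsync(fd) != 0) {
        close(fd);
        unlink(tmp_path);
        return -1;
    }
    if (close(fd) != 0 || rename(tmp_path, path) != 0) {
        unlink(tmp_path);
        return -1;
    }
    return 0;
}
```

This does not by itself make the rename durable; that would additionally need an fsync() on the containing directory, which the sketch leaves out.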

Glibc change exposing bugs

Posted Nov 12, 2010 7:12 UTC (Fri) by Mook (subscriber, #71173) [Link]

Thank you for the clear explanation! It does clearly say that I'm wrong :)

Glibc change exposing bugs

Posted Nov 10, 2010 20:07 UTC (Wed) by dlang (guest, #313) [Link] (51 responses)

Linus is being very consistent here.

If a userspace program does things that have been working, even if they weren't supposed to work, that's part of the ABI of the kernel and he is very reluctant to change anything, and will only do so when there is a _very_ compelling reason.

Glibc change exposing bugs

Posted Nov 10, 2010 20:36 UTC (Wed) by JoeBuck (subscriber, #2330) [Link] (50 responses)

The existing memcpy implementation did copying in a forward direction, so it would give the expected result for memcpy(buf, buf + 4, 8) but a wrong result for memcpy(buf, buf - 4, 8). The change (in at least some circumstances) does the reverse, and both ways satisfy the spec, which says that src and dst must not overlap, and if they might, memmove should be used. Linus is apparently calling for the original implementation decision (forward, not backward) to be set in stone, even if a backward-copy might be faster on a particular processor. This doesn't seem right to me. However, it seems reasonable to provide a cleaner workaround until old code can be fixed (it might just be a cleaned-up version of his proposed LD_PRELOAD trick).
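To make the direction issue concrete, here is a minimal sketch using a hypothetical byte-at-a-time forward_copy (an assumption for illustration, not glibc's code). A forward copy is safe when the destination lies below the source, and it clobbers not-yet-read source bytes when the destination lies above the source.

```c
#include <stddef.h>

/* Hypothetical helper standing in for "an implementation that copies
 * forward": one byte at a time, lowest address first. */
void forward_copy(char *dst, const char *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i];
}
```

With a buffer holding "abcdefghijklmnop", forward_copy(buf, buf + 4, 8) leaves "efghijkl" at the start as intended, while forward_copy(buf + 4, buf, 8) leaves "abcdabcd" at buf + 4 instead of "abcdefgh", because the first four copied bytes are read again as source.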

An alternative LD_PRELOAD, pointing to a memcpy that crashes for overlapping arguments, could be used to expose accidental misuse of the API.
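A rough sketch of what such a checking copy could look like, assuming a flat address space. The names regions_overlap and checked_memcpy are hypothetical. A real LD_PRELOAD shim would have to export the function under the name memcpy and could not simply call memcpy itself, so treat this only as the core of the check.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Returns nonzero if [dst, dst+n) and [src, src+n) share any byte.
 * Assumes uintptr_t values order like addresses (flat address space). */
int regions_overlap(const void *dst, const void *src, size_t n)
{
    uintptr_t d = (uintptr_t)dst;
    uintptr_t s = (uintptr_t)src;

    if (n == 0)
        return 0;
    return d < s ? s - d < n : d - s < n;
}

/* Hypothetical debugging wrapper: abort on overlap, otherwise copy. */
void *checked_memcpy(void *dst, const void *src, size_t n)
{
    if (regions_overlap(dst, src, n))
        abort(); /* crash loudly so the misuse is noticed */
    return memcpy(dst, src, n);
}
```

Crashing in abort() rather than returning an error keeps the signature identical to memcpy, which is what a drop-in replacement would need.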

Glibc change exposing bugs

Posted Nov 10, 2010 20:51 UTC (Wed) by clugstj (subscriber, #4020) [Link]

I think that history has shown that old code won't get fixed until it actually manifests itself as broken - at least in the commercial world.

Please don't attack strawmen. Thnx.

Posted Nov 10, 2010 22:47 UTC (Wed) by khim (subscriber, #9252) [Link] (2 responses)

The actual cite:

> So in the kernel we have a pretty strict "no regressions" rule, and that if people depend on interfaces we exported having side effects that weren't intentional, we try to fix things so that they still work unless there is a major reason not to.
>
> ...
>
> Regardless, it boils down to: we know the glibc change resulted in problems for real users. We do _not_ know that it helped anything at all.

Linus is Ok with changes that break buggy programs (it happened before, it'll happen again) but only if there is a "major reason". What's the justification for this particular case?

Please don't attack strawmen. Thnx.

Posted Nov 10, 2010 23:17 UTC (Wed) by bojan (subscriber, #14302) [Link] (1 responses)

> What's the justification for this particular case?

Linus couldn't play his favourite YouTube videos ;-)

Please don't attack strawmen. Thnx.

Posted Nov 11, 2010 1:31 UTC (Thu) by jonabbey (guest, #2736) [Link]

Ah! Andreas was trying to get Linus to quit wasting time on YouTube and get back to kernel development.

It's not, in fact, a bug. It's a feature.

Glibc change exposing bugs

Posted Nov 10, 2010 23:15 UTC (Wed) by charlieb (guest, #23340) [Link] (45 responses)

> ... and both ways satisfy the spec, which says that src and dst must
> not overlap, ...

Does it? The man page says:

The memory areas should not overlap.

It does not say:

The memory areas must not overlap.

It also says:

The memcpy() function copies n bytes from memory area src to
memory area dest.

It doesn't say:

The memcpy() function copies n bytes from memory area src to
memory area dest, unless the memory areas overlap.

"should" provisions are not mandatory. Unless you decide to redefine the terminology.

Glibc change exposing bugs

Posted Nov 10, 2010 23:23 UTC (Wed) by bojan (subscriber, #14302) [Link]

Nice exercise in verbal gymnastics. However, you forgot:

> Use memmove(3) if the memory areas do overlap.

Glibc change exposing bugs

Posted Nov 10, 2010 23:24 UTC (Wed) by donwaugaman (subscriber, #4214) [Link] (42 responses)

Hmm... the man page on my desktop computer (RHEL4) says:

If copying takes place between objects that overlap, the behavior is undefined.

In the context of standardese, that specifies that exactly anything can happen in the event of overlapping memory areas, with no 'should' or 'must' about it. The standard doesn't set down any rules that a developer must follow, only what will happen under certain conditions (in this case, the result is 'anything').

'must' and 'should' are more in the vein of RFCs.

Glibc change exposing bugs

Posted Nov 11, 2010 0:25 UTC (Thu) by nicooo (guest, #69134) [Link] (38 responses)

On my system there are two man pages, one from POSIX and one from the linux man-pages project.

Glibc's info page says it's undefined. It's the official documentation but nobody uses info.

Glibc change exposing bugs

Posted Nov 11, 2010 0:28 UTC (Thu) by bojan (subscriber, #14302) [Link]

And both clearly state that if regions overlap, one should use memmove().

Glibc change exposing bugs

Posted Nov 11, 2010 0:33 UTC (Thu) by charlieb (guest, #23340) [Link]

> On my system there are two man pages, one from POSIX and one from
> the linux man-pages project.

Ideally the linux man-page will be clarified. "should" there seems just a recommendation. Not "your software will eat babies unless you do this".

Glibc change exposing bugs

Posted Nov 11, 2010 2:42 UTC (Thu) by butlerm (subscriber, #13312) [Link] (35 responses)

> It's the official documentation but nobody uses info.

That is because 'info' is user hostile and dangerously close to useless. A web search is a dozen times faster than navigating an info document.

Glibc change exposing bugs

Posted Nov 11, 2010 6:41 UTC (Thu) by HelloWorld (guest, #56129) [Link] (25 responses)

What's useless isn't info, but man, at least for documentation spanning more than a couple of pages (like for gcc or mplayer).

Glibc change exposing bugs

Posted Nov 11, 2010 9:46 UTC (Thu) by mpr22 (subscriber, #60784) [Link] (12 responses)

For large manuals, my experience is that info merely sucks less than a man page; the user interface of both /usr/bin/info and /usr/bin/emacs -f info is horrible. For simple things, man wins by a country mile, because it doesn't slice-and-dice a simple program's documentation into 742 one-paragraph pages.

info considered harmful?

Posted Nov 11, 2010 22:47 UTC (Thu) by vonbrand (subscriber, #4458) [Link] (1 responses)

Try pinfo

info considered harmful?

Posted Nov 12, 2010 14:08 UTC (Fri) by jzbiciak (guest, #5246) [Link]

Another vote for pinfo. It doesn't hate me for wanting to know something like "info" does.

Glibc change exposing bugs

Posted Nov 11, 2010 22:56 UTC (Thu) by HelloWorld (guest, #56129) [Link] (8 responses)

Dividing manuals into chunks of a sensible size is a feature, not a bug. And if you don't like GNU info or emacs, just use something else. You can view info manuals with konqueror by typing info:<program name> into the address bar, and yelp is also capable of displaying info documents.

Glibc change exposing bugs

Posted Nov 12, 2010 18:19 UTC (Fri) by sorpigal (guest, #36106) [Link] (1 responses)

I don't know about you but for anything less than ten pages I find man much easier than info for one very simple reason: It's easy to scroll through a stream of text. It's also easier to hit / and search the whole document, it's easy to not get lost, etc. Info's problem is that info readers don't default to a man-like one-big-document, which is well known, well accepted and suitable to a terminal (which is, I imagine, where most man and info pages are consumed).

I've used pinfo and it helps some in the UI department, but I'd still use man over pinfo for almost every trivial lookup. If your goal is to completely replace man then your system needs to be a drop-in replacement from a user interaction point of view, with the advantages discoverable by users who are interested in learning them.

Glibc change exposing bugs

Posted Nov 12, 2010 18:45 UTC (Fri) by foom (subscriber, #14868) [Link]

> easier to hit / and search the whole document

Not really: "info" also searches the whole document if you hit /. (although I share the general dislike for the info browser).

Glibc change exposing bugs

Posted Nov 25, 2010 15:22 UTC (Thu) by Spudd86 (subscriber, #51683) [Link] (5 responses)

info's major problem is that its interface SUCKS: there's no real 'back' command, and the keybindings are just plain weird (unless you're an EMACS user...).

It'd be nice to have an info viewer that converts to HTML on the fly and uses webkit to render it.

Glibc change exposing bugs

Posted Nov 25, 2010 22:13 UTC (Thu) by paulj (subscriber, #341) [Link] (4 responses)

Have you tried going to System -> Help? GNOME's "Yelp" supports browsing info docs - providing a web browser style GUI...

Glibc change exposing bugs

Posted Nov 26, 2010 0:31 UTC (Fri) by Spudd86 (subscriber, #51683) [Link] (3 responses)

I don't use GNOME. I wonder how much of GNOME Yelp pulls in.

Glibc change exposing bugs

Posted Nov 26, 2010 0:40 UTC (Fri) by sfeam (subscriber, #2841) [Link] (2 responses)

You could use konqueror instead
konqueror info:tar

Glibc change exposing bugs

Posted Nov 26, 2010 1:33 UTC (Fri) by Spudd86 (subscriber, #51683) [Link] (1 responses)

Don't use KDE either... I use XFCE and try to keep most of the GNOME stuff not installed.

Glibc change exposing bugs

Posted Nov 27, 2010 13:26 UTC (Sat) by paulj (subscriber, #341) [Link]

Well, if you want a web interface style GUI for info, but don't want to install either of the main two GUI environments, then... ;) Pinfo possibly is closest to what you want. A lynx/elinks style browser interface, for the terminal.

Glibc change exposing bugs

Posted Nov 12, 2010 10:36 UTC (Fri) by marcH (subscriber, #57642) [Link]

You are mixing in the same very short post three entirely unrelated things:
- the info format
- the info reader
- how fine the writer sliced the document
Very confusing.

Glibc change exposing bugs

Posted Nov 12, 2010 5:01 UTC (Fri) by nicooo (guest, #69134) [Link] (5 responses)

The rest of the world uses HTML and PDF for that kind of documentation.

Glibc change exposing bugs

Posted Nov 12, 2010 7:33 UTC (Fri) by paulj (subscriber, #341) [Link]

Funnily enough, a lot of PDFs (the ones I read, anyway) are written in some other language and generated through TeX, with PDF being just one possible output format. Which is just how GNU _Tex_info works too.

Glibc change exposing bugs

Posted Nov 12, 2010 10:42 UTC (Fri) by marcH (subscriber, #57642) [Link]

HTML does not support indexes, a very useful feature of the info document format. I find most PDF viewers cumbersome for screen browsing; not very surprising since it is a *printer* format at the core.

I find it too bad that a not-so-good default user interface is rebuffing users before they even start to see the nice features of the format. The fix is to promote alternative user interfaces, something I keep doing constantly (and which has already been done here).

Glibc change exposing bugs

Posted Nov 12, 2010 13:59 UTC (Fri) by HelloWorld (guest, #56129) [Link] (1 responses)

What's your point? You can generate both PDF and HTML from info.

Glibc change exposing bugs

Posted Nov 12, 2010 20:10 UTC (Fri) by nicooo (guest, #69134) [Link]

That's texinfo. Using info for online documentation is what everyone hates.

Glibc change exposing bugs

Posted Nov 12, 2010 23:32 UTC (Fri) by Wol (subscriber, #4433) [Link]

And pdf is (done properly) one big page, just like man :-)

Which is why I like man, and like pdf, and just curse profusely every time I'm exhorted to use info!

Cheers,
Wol

Glibc change exposing bugs

Posted Nov 12, 2010 13:52 UTC (Fri) by Wol (subscriber, #4433) [Link] (5 responses)

I'd actually say the complete opposite! Even for a complex chunk of documentation, I'd rather have man than info.

At least with man, I can scroll down (or search) until I find what I'm looking for.

info, on the other hand: "you are in a maze of twisty little passages, all alike". When presented with the instruction to "use info", I give up and use the web. When presented with a 1000-line man page, no problem ... :-)

Cheers,
Wol

Glibc change exposing bugs

Posted Nov 12, 2010 14:06 UTC (Fri) by HelloWorld (guest, #56129) [Link] (4 responses)

> I'd actually say the complete opposite! Even for a complex chunk of documentation, I'd rather have man than info.

> At least with man, I can scroll down (or search) until I find what I'm looking for.

So you can with info. You can search the complete manual with the s key. The fact that you don't know this indicates you don't bother to read documentation at all really.

> info, on the other hand, "you are in a maze of twisty little passages, all alike".

If you had actually read the headings of the "twisty little passages", you would have found that they're really not alike at all. Alas, you don't seem to have bothered and decided to pointlessly whine about info instead.

Glibc change exposing bugs

Posted Nov 12, 2010 19:11 UTC (Fri) by bronson (subscriber, #4806) [Link]

Wonder if self-important replies like this have contributed to info's utter obscurity...?

Take a deep breath dude. Different people like different things.

Glibc change exposing bugs

Posted Nov 12, 2010 23:39 UTC (Fri) by Wol (subscriber, #4433) [Link] (2 responses)

Ah. "s" for "search".

The problem with that is if I can't articulate what I'm searching for. The number of times I've searched on what I think is the obvious search key, wasted half-an-hour or so doing it, then done a manual scroll through whatever I can find.

I then find what I'm looking for, and discover that it's called something (to me) extremely obscure, and doesn't mention my search term at all, etc etc.

Plus the fact that I'm one of those strange people who actually DOES tend to read documentation, from cover to cover, and likes to have a straight line path through it, not with redirects and jumps and god knows what all over the place. About the only place I can find information on info is in info - and if I find info repellent, how on earth am I going to find out how to use it if I have to use it to find out?

THERE is your problem with info - if you hate it because you can't find out how to use it, it's catch 22. You need to know how to use it to find out how to use it :-)

Cheers,
Wol

Glibc change exposing bugs

Posted Nov 13, 2010 0:42 UTC (Sat) by foom (subscriber, #14868) [Link] (1 responses)

Oh come on, if you can't stand to use "info info" long enough to figure out that you can use "space" and "backspace" to scroll forward and backward through the document (including going to the next page automatically upon reaching the end of the current one), then I dunno what to say.

Glibc change exposing bugs

Posted Nov 14, 2010 22:32 UTC (Sun) by nix (subscriber, #2304) [Link]

Well, info's handling of backspace in particular has long been buggy: it has a habit of going up to the top of the current page only, and then halting. Space has always worked, though.

Glibc change exposing bugs

Posted Nov 11, 2010 7:31 UTC (Thu) by nix (subscriber, #2304) [Link] (8 responses)

And POSIX is on the web and in the 3p manpages, so developers *still* have no excuse. (They should be developing to the POSIX manpages anyway, not the Linux ones.)

Glibc change exposing bugs

Posted Nov 12, 2010 7:30 UTC (Fri) by hozelda (guest, #19341) [Link] (7 responses)

I think the man pages that say "should" may want to clarify that issue a little better; however, it does appear to have the correct information.

If you use Linux, the Linux documentation should be authoritative. Hopefully, it will agree with POSIX and C99 (or whatever is the latest memcpy standard) as much as possible. If there is a reason for a change (or to document a Linux bug) and you use Linux, I would pay attention to the Linux documentation and treat everything else as advisory. If you use Red Hat or whatever other distro, I would treat those docs as authoritative and not whatever other standard you think should apply.

A different matter is arguing about keeping Linux in sync with POSIX, etc, but if you want to build software that will work, short of maintaining your personal set of patches not accepted by upstream, you would probably want to code to "Linux" (at least for the Linux port).

Glibc change exposing bugs

Posted Nov 12, 2010 7:37 UTC (Fri) by hozelda (guest, #19341) [Link]

Before you say that a man page is not authoritative: I don't know the answer to that, but it depends on your Linux vendor. In practice you will want to follow the major standards and consider anything that disagrees with them to probably be an error in the man page; however, if your vendor says that X and Y are the documents, then that is what you go by (perhaps bringing doubtful points to your vendor's attention). In particular, if you don't like a vendor that hacks Linux to bypass certain standards, then change vendors or ask for help in identifying these hacks.

Glibc change exposing bugs

Posted Nov 14, 2010 21:19 UTC (Sun) by nix (subscriber, #2304) [Link] (5 responses)

No, you normally want to code to POSIX. Carefully-written software does not *require* much if any porting to work on Linux rather than Solaris or IRIX or even sometimes AIX. If it's POSIX, it should work.

(You might need to adjust for bits of older systems that are non-POSIX, but that is really quite rare these days unless you're aiming for some strange emulation layer like Cygwin. Also you might need to do byteorder detection and so forth, but, again, that's stuff which is left unspecified by POSIX. You should not generally have to use Linux-specific stuff unless you really want to, and you normally shouldn't want to.)
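
For what it's worth, the byteorder part can usually be handled without any platform-specific headers. A minimal sketch in plain ISO C follows; the names `host_is_little_endian` and `load_be32` are made up for illustration and do not come from any particular library.

```c
#include <stdint.h>
#include <string.h>

/* Runtime probe: returns 1 on a little-endian host, 0 otherwise.
 * (Hypothetical helper name; only standard C is used.) */
int host_is_little_endian(void)
{
    const uint32_t probe = 1;
    unsigned char first;

    /* Look at the byte stored at the lowest address of the integer. */
    memcpy(&first, &probe, 1);
    return first == 1;
}

/* Decode a 32-bit big-endian value byte by byte. Because it never
 * reinterprets memory as an integer, the result is the same on any host. */
uint32_t load_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}
```

Decoding byte by byte, as `load_be32` does, often means the probe is not needed at all, since the code no longer depends on the host's byte order.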

Glibc change exposing bugs

Posted Nov 14, 2010 22:03 UTC (Sun) by promotion-account (guest, #70778) [Link] (3 responses)

> You should not generally have to use Linux-specific stuff unless you really want to, and you normally shouldn't want to.

I'm sure you know this, but for some applications, POSIX is not really enough. Thus, for example, the need for some portable abstraction libraries like libevent.

Glibc change exposing bugs

Posted Nov 14, 2010 23:08 UTC (Sun) by nix (subscriber, #2304) [Link] (2 responses)

Yes, exactly. But at worst you should stuff the nonportability into a library with an API which can be replicated on other platforms (or make that library as portable as possible, and keep it a separate library to keep the ugly away from everyone else.)
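
A rough sketch of what "stuffing the nonportability into a library" might look like, loosely in the spirit of what event libraries do: one portable function signature, with the Linux-only epoll calls hidden behind an `#ifdef` and a POSIX `poll()` fallback everywhere else. The function name `port_wait_readable` and its return convention are invented for this example and are not taken from libevent or any other real library.

```c
#include <unistd.h>
#ifdef __linux__
#include <sys/epoll.h>
#else
#include <poll.h>
#endif

/*
 * Hypothetical portable wrapper: wait until fd becomes readable.
 * Returns 1 if fd is ready, 0 on timeout, -1 on error.
 * Callers never see which backend was used.
 */
int port_wait_readable(int fd, int timeout_ms)
{
#ifdef __linux__
    /* Linux-specific backend (epoll); note that epoll rejects regular files. */
    struct epoll_event ev = { 0 };
    struct epoll_event out;
    int ready;
    int ep = epoll_create1(0);

    if (ep < 0)
        return -1;
    ev.events = EPOLLIN;
    ev.data.fd = fd;
    if (epoll_ctl(ep, EPOLL_CTL_ADD, fd, &ev) < 0) {
        close(ep);
        return -1;
    }
    ready = epoll_wait(ep, &out, 1, timeout_ms);
    close(ep);
    return ready < 0 ? -1 : ready > 0;
#else
    /* POSIX fallback for every other platform. */
    struct pollfd pfd;
    int ready;

    pfd.fd = fd;
    pfd.events = POLLIN;
    pfd.revents = 0;
    ready = poll(&pfd, 1, timeout_ms);
    return ready < 0 ? -1 : ready > 0;
#endif
}
```

Creating an epoll instance per call is wasteful and only done here to keep the sketch short; a real library would keep the instance around. The point is only that the ugly part lives in one file, behind an API that can be reimplemented on other platforms.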

(btw, your account name is... *interesting*.)

Glibc change exposing bugs

Posted Nov 15, 2010 1:37 UTC (Mon) by promotion-account (guest, #70778) [Link] (1 responses)

> (btw, your account name is... *interesting*.)

That's descriptive anonymity :)

Readers usually give higher weight to subscribers' opinions here, so this handle honestly states that I'm a promoted guest.

Glibc change exposing bugs

Posted Nov 15, 2010 10:39 UTC (Mon) by nix (subscriber, #2304) [Link]

Ah. I interpreted it as 'account bought to promote something else', and got confused because most advertisers would try to lie about it and *not* mention their affiliations :)

'Promotion' is a word with many meanings...

Glibc change exposing bugs

Posted Nov 15, 2010 8:13 UTC (Mon) by dlang (guest, #313) [Link]

you are assuming that the program authors care about Irix, AIX, Solaris, or anything else.

most programs do not start off being written portably, usually portability is something that shows up after the program starts being used when people ask about using it on other platforms (and it's not uncommon for it to wait until those people asking submit patches)

not saying that this is right, just saying that it's the way things are. When Solaris dominated the same thing happened favoring it.

Glibc change exposing bugs

Posted Nov 11, 2010 0:30 UTC (Thu) by charlieb (guest, #23340) [Link] (2 responses)

> Hmm... the man page on my desktop computer (RHEL4) says:

What manpage is that? The memcpy(3) manpage on my CentOS4 box does not say "the behavior is undefined". Ah, I see that the memcpy(3p) one does.

> 'must' and 'should' are more in the vein of RFCs.

OK. But at least those are clear. "should" in the context of an API man page is not.

Glibc change exposing bugs

Posted Nov 11, 2010 12:23 UTC (Thu) by gidoca (subscriber, #62438) [Link]

> OK. But at least those are clear. "should" in the context of an API man page is not.
If "should" is interpreted the way you interpret it, then they might as well have omitted the sentence.

Glibc change exposing bugs

Posted Nov 11, 2010 18:03 UTC (Thu) by donwaugaman (subscriber, #4214) [Link]

memcpy(3) on the same RHEL4 box says:

"The memory areas may not overlap."

... which sounds a little stronger than "should" to me.

Not sure why CentOS4 differs...

At any rate, arguing over the man pages is irrelevant to the standard - if the man pages don't match the standard, the man pages need to be fixed rather than the standard.

That being said, it would sure be nice to have some kind of formal deprecation of the previous behavior. One of the nice things about the free software world is that it should be more possible to make these kinds of changes because it's easier to change the programs whose assumptions worked OK with the previous behavior but are violated by the new behavior. Of course, with closed-source Flash players, that goes out the window, and it becomes a question of whether it is more important to pacify Adobe users or to give Adobe an incentive to clean up its software.

Glibc change exposing bugs

Posted Nov 11, 2010 0:04 UTC (Thu) by nix (subscriber, #2304) [Link]

POSIX states:

> If copying takes place between objects that overlap, the behavior is undefined.

The behaviour of Linux (and Unix) systems in this area is governed by POSIX, not a random manpage. (And in this case POSIX is aligned with ISO C, and even uses the same phrasing.)
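
To make the rule concrete, here is a small sketch of how a caller can stay on the defined side of it. The helper names `regions_overlap` and `copy_bytes` are invented for this example. The overlap test compares addresses as integers, which assumes a flat address space; that holds on the usual Linux targets but is not something ISO C guarantees.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Returns nonzero if [a, a+n) and [b, b+n) share at least one byte.
 * Assumption: pointers convert to uintptr_t in a flat address space. */
int regions_overlap(const void *a, const void *b, size_t n)
{
    uintptr_t pa = (uintptr_t)a;
    uintptr_t pb = (uintptr_t)b;

    return pa < pb ? pb - pa < n : pa - pb < n;
}

/* Hypothetical wrapper: only call memcpy when the regions are disjoint,
 * and use memmove (which is defined for overlapping regions) otherwise. */
void *copy_bytes(void *dst, const void *src, size_t n)
{
    if (regions_overlap(dst, src, n))
        return memmove(dst, src, n);
    return memcpy(dst, src, n);
}
```

In practice, simply calling `memmove` whenever overlap is possible is the simpler fix; the explicit check is shown only to spell out what "overlap" means here. For example, shifting the contents of a buffer one byte to the right with `copy_bytes(buf + 1, buf, 5)` takes the `memmove` path.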