Handling argc==0 in the kernel

By Jonathan Corbet
January 28, 2022

By now, most readers are likely to be familiar with the Polkit vulnerability known as CVE-2021-4034. The fix for Polkit is relatively straightforward and is being rolled out across the net. The root of this problem, though, lies in a misunderstanding about how programs are run on Unix-like systems. This problem is highly likely to exist in other programs, so it would be nice to find a more general solution. The best place to address this issue may be in the kernel, but properly working around this misunderstanding without causing regressions is not an easy task.

I'd like to have an argument, please

Most developers are familiar with the prototype of the main program as expressed in the C language:

    int main(int argc, char *argv[], char *envp[]);

The program is invoked with its command-line arguments in argv and the environment in envp; both are pointers to null-terminated arrays of char * strings. The number of non-null entries in argv is stored in argc. This API is a user-space creation, though; what happens when the kernel first runs a program is a little bit different: on Linux, that program is passed a single pointer to the argv array. The envp array begins immediately after the NULL value that terminates argv. Thus, in a C program, the following statement will be true on entry to main():

    envp == argv + argc + 1

By convention, argv[0] is the name of the program that is being executed, and many programs rely on that convention. As it happens, though, this convention is exactly that: a convention, but not a guarantee. The actual contents of argv are entirely under the control of whoever calls execve() to run the program in the first place, and that caller is not required to put the program name in argv[0].

Indeed, the caller is not required to provide argv[0] at all. If the argv array passed to execve() is empty (or the argv pointer is simply NULL), the first pointer in the new program's argv array will be NULL, and the envp array will start immediately thereafter. Unfortunately, Polkit (or, more specifically, the setuid pkexec utility) "knew" that argv[0] would always be present, so it performed its argument processing by iterating over the argv array starting at argv[1]. If there are no arguments at all, argv[1] is the same as envp, so pkexec was iterating through its environment variables instead. Throw in some in-place argument modification (pkexec overwrites its argv array), and pkexec could be convinced to rewrite its environment variables, thus bypassing the sanitizing of those variables done for setuid programs. At that point, the game was over.

This problem is not new, and neither is awareness of it. Ryan Mallon wrote about it in 2013, noting that "it does allow for some amusing behaviour from setuid binaries"; he also evidently sent a Polkit patch to address it, but that patch was never applied. Even further back, in 2007, Michael Kerrisk reported the kernel's behavior as a bug, but the report was closed with little discussion. So the problem persisted, culminating in the vulnerability administrators are scrambling to patch now.

Toward a more general fix

Fixing this issue is a simple matter of making pkexec check that argc is at least one. But there are surely other programs out there containing similar assumptions. Given the strength of the argv[0] convention, it is natural to ask whether it makes sense to allow programs to be run with an empty argv array at all. Perhaps it doesn't, but the current API has a lot of history and cannot be changed without a lot of thought.

Ariadne Conill started the linux-kernel discussion with a patch that would simply disallow calls to execve() without at least one argv entry. Offending callers would get an EFAULT return value instead. This would solve the problem by providing a guarantee that argv would not be empty, but at the potential cost of introducing problems of its own. One is that, as Kees Cook discovered, there is actually a fair amount of code out there that calls execve() with an empty argv array. Conill wrote those off as "lazily-written test cases which should be fixed", but regressions in lazily-written test cases are still regressions. Also, as Heikki Kallasjoki and Rich Felker both pointed out, an empty argv array is actually allowed by the POSIX standard.

Felker also suggested an alternative with less potential for regressions: only enforce a non-empty argv at privilege boundaries — when execve() is being called to run a setuid program, in other words. Cook said that he would rather avoid taking the privilege boundary into account if possible, though. He proposed a different solution to the problem: inject an extra null pointer at the end of an empty argv array so that even code that tries to skip argv[0] will notice that there is nothing there. This solution will not work either, as it turns out: the ABI promise is that envp starts immediately after argv, and the extra NULL breaks that promise. There are evidently programs that rely on that layout and would break if it were changed.

Yet another approach, first suggested by Eric Biederman, would be to replace an empty argv with one containing a single pointer to an empty string. This proposal had some support (though nobody has implemented it as of this writing), but also provoked some worries of its own. Perhaps there are programs out there that will respond badly to an empty-string argument, or which do something special when argc is zero. Changing the number of arguments passed to the program run by execve() just looks like it could create surprises.

Cook eventually summarized the situation this way:

Given the code we've found that depends on NULL argv, I think we likely can't make the change outright, so we're down this weird rabbit hole of trying to reject what we can and create work-around behaviors for the cases that currently exist.

That notwithstanding, he then went on to express a preference for the initial change (simply disallowing a zero-argument execve() call anyway, albeit with an EINVAL return rather than EFAULT), with a suggestion to fix a Valgrind test that is already known to break with that restriction. Conill responded with a new version of the original patch; this time it emits a warning before failing an empty-argv call to execve() with EINVAL. Cook acked the patch, saying: "Let's do it and see what breaks"; Biederman concurred: "Especially since you are signing up to help fix the tests".

That is where the discussion stands as of this writing, but it is far from clear that this is how the problem will eventually be addressed. This patch has, crucially, not yet survived its first encounter with Linus Torvalds, who may take a dim view of its potential for regressions. It is an ABI change, after all, and there may well be code out there that responds badly to it, though the fact that BSD systems already prohibit an empty argv will, with luck, have already shaken out most of those. Should unfortunate regressions arise anyway, a different solution will almost certainly need to be found.

Index entries for this article
Kernel	Security/Vulnerabilities
Kernel	System calls/execve()
Security	Linux kernel

Handling argc==0 in the kernel

Posted Jan 28, 2022 15:21 UTC (Fri) by maxfragg (guest, #122266) [Link] (9 responses)

For security reasons it would be a good idea to actually break with the rule of "envp == argv + argc + 1" and move env and argv to separate pages, with one guard in between. This convention really is risky and saving 2 pages does not seem worth it today.

Handling argc==0 in the kernel

Posted Jan 28, 2022 15:42 UTC (Fri) by johnnyapol (subscriber, #151785) [Link]

I like this idea - I had a similar one where issues of argc==0 could be addressed by modifying libc (specifically glibc) to pass in NULL for argv so attempts to read/write would just segfault.

Handling argc==0 in the kernel

Posted Jan 28, 2022 16:02 UTC (Fri) by Tobu (subscriber, #24111) [Link]

The way argc / argv[] / envp[] are laid out on the stack by the elf loader is part of the elf ABI (for existing architectures at least). You might be able to shuffle it in userspace, I guess.

Handling argc==0 in the kernel

Posted Jan 28, 2022 16:15 UTC (Fri) by jreiser (subscriber, #11027) [Link]

Breaking envp == (1+ argc + argv) wlll break all current ELF executables that have been statically linked against glibc. [Asking for a system that is highly robust requires that there be at least a few such programs.] It will also break all executable main programs that have been compressed by upx.

Handling argc==0 in the kernel

Posted Jan 28, 2022 17:26 UTC (Fri) by ariadne (subscriber, #138312) [Link] (5 responses)

Breaking contracts like the ELF ABI for security reasons is a bad idea, and this would be a far more severe ABI break than the one I proposed.

Handling argc==0 in the kernel

Posted Jan 28, 2022 20:14 UTC (Fri) by khim (subscriber, #9252) [Link] (4 responses)

Note that while, at one time, both C standard and POSIX allowed argc == 0 both are fixed now and POSIX, in fact, says that explicitly to remove any shadow of doubt:

> The wording, in particular the use of the word should, requires a Strictly Conforming POSIX Application to pass at least one argument to the exec function, thus guaranteeing that argc be one or greater when invoked by such an application.

So yes, it's a mess, kinda, because yes, old standards explicitly allowed this, but these standards were fixed.

Whether Linux can or should take advantage of that change is another story, of course.

Handling argc==0 in the kernel

Posted Jan 28, 2022 21:52 UTC (Fri) by developer122 (guest, #152928) [Link]

People care far less about standards written on paper than all the programs that will break or behave unexpectedly.

Handling argc==0 in the kernel

Posted Jan 29, 2022 0:23 UTC (Sat) by nybble41 (subscriber, #55106) [Link] (1 responses)

Linus's "no user regressions" policy covers *many* programs which are not Strictly Conforming POSIX Applications, so I'm not sure how that change in the standard is relevant. For example, anything that depends on Linux-specific interfaces, or on any implementation-defined behavior not specified by POSIX, is not a Strictly Conforming POSIX Application. (How many Strictly Conforming POSIX Applications are there in an average Linux installation, really?)

If you do claim to be a Strictly Conforming POSIX Application then you can't call exec with less than one argument. However, a Strictly Conforming POSIX Application may be *invoked* with zero arguments, and is required to handle that case as well to qualify as Strictly Conforming since POSIX does not specify that the argument list passed to main() will be non-empty and you can't assume every other program in the system is also Strictly Conforming.

Handling argc==0 in the kernel

Posted Jan 29, 2022 1:43 UTC (Sat) by Paf (guest, #91811) [Link]

It’s relevant insofar as the argument - which I believe was made in the article? - that “POSIX allows this” is used in support of it. It’s irrelevant insofar as there are real applications which use it (and one cares about them).

Handling argc==0 in the kernel

Posted Jan 29, 2022 10:43 UTC (Sat) by larkey (guest, #104463) [Link]

Absolutely not, POSIX says, more completely:

> Early proposals required that the value of argc passed to main() be "one or greater". This was driven by the same requirement in drafts of the ISO C standard. In fact, historical implementations have passed a value of zero when no arguments are supplied to the caller of the exec functions. This requirement was removed from the ISO C standard and subsequently removed from this volume of POSIX.1-2017 as well.

There's no hard requirement on argc.

> The wording, in particular the use of the word should, requires a Strictly Conforming POSIX Application to pass at least one argument to the exec function, thus guaranteeing that argc be one or greater when invoked by such an application. In fact, this is good practice, since many existing applications reference argv[0] without first checking the value of argc.

A *strictly* conforming application. This is more than just "plain" POSIX.

Handling argc==0 in the kernel

Posted Jan 28, 2022 15:47 UTC (Fri) by larkey (guest, #104463) [Link] (28 responses)

I think there are few things confused here and in the Kernel ML, namely the case a) argv == NULL and the case b) argv[0] == NULL:

> If the argv array passed to execve() is empty (or the argv pointer is simply NULL), the first pointer in the new program's argv array will be NULL, and the envp array will start immediately thereafter.

POSIX says that argv == NULL isn't allowed:

> The argument argv is an array of character pointers to null-terminated strings. The application shall ensure that the last member of this array is a null pointer.

(https://pubs.opengroup.org/onlinepubs/9699919799.2018edit...)

That's also what the kernel bug (https://bugzilla.kernel.org/show_bug.cgi?id=8408) from years back is about. It is *not* about enforcing argc >= 1 (i.e., argv[0] != NULL).

The patch by Ariadne Conill does *not* address what was reported in the bug, despite claiming so. It enforces that case b) never happens which is a stricter requirement and indeed not required by POSIX, as correctly pointed out by Heikki Kallasjoki and Rich Felker. This is just a recommendation:

> The value in argv[0] should point to a filename string that is
associated with the process [...]

I agree that Linux should not allow argv == NULL, that's definitely against the spec. argv[0] == NULL is a different matter although I think it is a reasonable demand, from a security stand point.

-------

Appendix:

POSIX on shall and should:

> shall
>
> For an implementation that conforms to POSIX.1-2017, describes a feature or behavior that is mandatory. An application can rely on the existence of the feature or behavior.

> should
>
> For an implementation that conforms to POSIX.1-2017, describes a feature or behavior that is recommended but not mandatory. An application should not rely on the existence of the feature or behavior. An application that relies on such a feature or behavior cannot be assured to be portable across conforming implementations.

(https://pubs.opengroup.org/onlinepubs/9699919799.2018edit...)

Handling argc==0 in the kernel

Posted Jan 28, 2022 16:26 UTC (Fri) by matthias (subscriber, #94967) [Link] (25 responses)

> I think there are few things confused here and in the Kernel ML, namely the case a) argv == NULL and the case b) argv[0] == NULL:

Probably the confusion is for a simple reason. The two things are more or less the same. If the execve() function is called with argv == NULL, the kernel actually constructs a NULL-terminated array argv with argv[0] == NULL and calls main(argv,0). Thus for the called application, these two things are indistinguishable.

> I agree that Linux should not allow argv == NULL, that's definitely against the spec.

From the examples in the discussions it seems that the "lazily written tests" do exactly this: call execve() with argv == NULL. If Linux disallows this, these tests will break due to an ABI/API change.

> argv[0] == NULL is a different matter although I think it is a reasonable demand, from a security stand point.

Yeah, but disallowing the former and allowing the latter does not make any sense, as the two cases cannot be distinguished by the called application. There would be no advantage from just disallowing argv == NULL.

The problematic case from the security persepctive is allowing argv[0] == NULL (or more precisely argc==0). The problematic case from the compatibility perspective is disallowing argv == NULL in the execve() call. Thus only disallowing argv == NULL in the call gives us the worst from both perspectives. We get all compatibility problems with no increase in security.

Handling argc==0 in the kernel

Posted Jan 28, 2022 16:35 UTC (Fri) by larkey (guest, #104463) [Link] (24 responses)

That makes sense, I didn't know that Linux constructed the {NULL} array. However, this makes many people be confused about the original bug and the BSD behavior. E.g., FreeBSD, until 2 days ago, allowed a {NULL} array as argv but not argv==NULL.

> Yeah, but disallowing the former and allowing the latter does not make any sense, as the two cases cannot be distinguished by the called application. There would be no advantage from just disallowing argv == NULL.
>
> The problematic case from the security persepctive is allowing argv[0] == NULL (or more precisely argc==0). The problematic case from the compatibility perspective is disallowing argv == NULL in the execve() call. Thus only disallowing argv == NULL in the call gives us the worst from both perspectives. We get all compatibility problems with no increase in security.

I disagree here though. Disallowing argv == NULL but allowing argv[0] == NULL would allow for this "iterator" pattern (which is as old as UNIX):

for (**arg = &argv[1]; arg < &argv[argc]; arg++)

Handling argc==0 in the kernel

Posted Jan 28, 2022 16:57 UTC (Fri) by corbet (editor, #1) [Link] (4 responses)

Allowing argv[0]==NULL would indeed allow the "old as Unix" pattern to work. But it also works fine now in the argv==NULL case. The pattern, perhaps equally old, that fails is:

    for (arg = argv + 1; *arg; arg++)
        /* ... */

...and that pattern will still fail badly with argv[0]==NULL.

Handling argc==0 in the kernel

Posted Jan 28, 2022 17:00 UTC (Fri) by larkey (guest, #104463) [Link] (3 responses)

Arguably a bad pattern :-)

But I think that Ariadne's patch should be applied nontheless. It's just that there's much confusion about these two different issues (although related).

Handling argc==0 in the kernel

Posted Jan 29, 2022 18:56 UTC (Sat) by adobriyan (subscriber, #30858) [Link] (2 responses)

The best, UB free, pattern is

for (int i = 1; i < argc; ++i) { const char *arg = argv[i]; ... }

Handling argc==0 in the kernel

Posted Jan 30, 2022 23:32 UTC (Sun) by rschroev (subscriber, #4164) [Link] (1 responses)

That's simple, readable, clearly states the intent. Why do people even try to come up with other ways to do things?

Handling argc==0 in the kernel

Posted Jan 31, 2022 1:01 UTC (Mon) by jreiser (subscriber, #11027) [Link]

>Why do people even try to come up with other ways to do things?

Because that does not accommodate state changes: deleting, re-arranging, or re-writing pieces of argv (and thus argc); because the GNU getopt routine has not always existed, has not always offered all its current features, and early suffered from bugs; and probably other reasons, too.

Handling argc==0 in the kernel

Posted Jan 28, 2022 17:12 UTC (Fri) by matthias (subscriber, #94967) [Link]

> I disagree here though. Disallowing argv == NULL but allowing argv[0] == NULL would allow for this "iterator" pattern (which is as old as UNIX):
> for (**arg = &argv[1]; arg < &argv[argc]; arg++)

Disallowing argv[0] still allows for this pattern. The security problem arises with applications relying on argc>0 (like pkexec did).

And there is really no reason to disallow argv == NULL (in the execve()-call) while allowing argv[0] == NULL. As Linux constructs the { NULL } array, both things cannot be distinguished from the called application. There is no reason to get all the compatibility problems arising from disallowing argv == NULL without gaining any security benefit. Either we want the get rid of the security issues. Then we have to disallow argv[0] == NULL. Or we can leave everything as it is now.

Handling argc==0 in the kernel

Posted Jan 29, 2022 8:31 UTC (Sat) by matthias (subscriber, #94967) [Link] (17 responses)

> for (**arg = &argv[1]; arg < &argv[argc]; arg++)

I just was wondering if this is UB if argc==0. Probably it is. According to the C standard argv is a argc+1 sized array, where argv[argc] == NULL. If argc is 0, argv[1] uses an index outside of the bounds of the array. Thus even writing argv[1] would be UB. The code argv+1 instead of &argv[1] would be valid, as it is perfectly allowed to construct a pointer pointing to the address immediately after the end of an array. However one is not allowed to dereference it.

I do not really expect this to break even in the case of argc==0 and of course as long as the program is called with argc>0 this code is not UB at all. But a conforming compiler can probably eliminate a check argc!=0 that occurs later on, because the very fact that argv[1] has been used tells the compiler that either argc>0 or the code is UB and the compiler is free to do whatever it wants.

Handling argc==0 in the kernel

Posted Jan 29, 2022 10:39 UTC (Sat) by larkey (guest, #104463) [Link] (12 responses)

> However one is not allowed to dereference it.

But there's no dereferencing? But more plainly perhaps:

for (char **arg = argv + 1; arg < argv+argv; arg++)

Handling argc==0 in the kernel

Posted Jan 29, 2022 11:05 UTC (Sat) by matthias (subscriber, #94967) [Link] (11 responses)

argv[1] is clearly dereferencing and the & operator then takes the address of the dereferenced value. If you have an invalid pointer, you are not allowed to dereference it. Writing &*a is UB if a is an invalid pointer, even if the combination of & and * should be a no-op. And writing &argv[1] is UB if argv is an array of length 1 (only containing the terminating NULL) as writing argv[1] is already UB. You cannot make the UB of argv[1] be defined again by later taking the address.

I agree that the compiler will probably generate the exact same code for **arg = &argv[1] and **arg = argv + 1. But in the first case, the compiler is allowed to conclude that the length of the argv array is at least two (including the final NULL value), as argv[1] is UB if the array has length one. I am not saying that any compiler actually does this, but when it comes to UB, compilers are known to do very crazy things, so it is not entirely impossible. And in the case of the main function it is even less likely that the compiler will do anything crazy. The pointer is provided by the OS and the compiler does not know the length of the array (unless it uses the special connection between argv and argc). So the code should probably be fine. But this is one of the things I really do not like about C. The threat of UB is almost everywhere. And at some point in the future some clever compiler will exploit this against you.

Handling argc==0 in the kernel

Posted Jan 29, 2022 11:22 UTC (Sat) by mchapman (subscriber, #66589) [Link]

> Writing &*a is UB if a is an invalid pointer, even if the combination of & and * should be a no-op.

No, this is not correct.

C specifies that &*a is equivalent to a: "neither [the * operator] nor the & operator is evaluated and the result is as if both were omitted". This expression is valid for any value of a, even a null pointer.

Handling argc==0 in the kernel

Posted Jan 29, 2022 23:08 UTC (Sat) by areilly (subscriber, #87829) [Link] (9 responses)

"And writing &argv[1] is UB if argv is an array of length 1 (only containing the terminating NULL) as writing argv[1] is already UB."

No: &argv[1] is semantically identical to argv + 1. It is pointer arithmetic and does not cause a dereference.

However: in deference to C support on AS400 (and Unisys? anything with fancy pointer representations) it *is* undefined behaviour for arrays of length zero. It's only legal to construct a pointer value to the single element past the end of the array, whether or not the pointer is subsequently dereferenced. This is also why it's technically illegal to use a post-decremented pointer traversal that accesses the first element of an array. The one-past-the-end rule doesn't extend to one-before-the-beginning. Sigh.

Handling argc==0 in the kernel

Posted Jan 30, 2022 10:43 UTC (Sun) by larkey (guest, #104463) [Link] (8 responses)

> However: in deference to C support on AS400 (and Unisys? anything with fancy pointer representations) it *is* undefined behaviour for arrays of length zero.

Aren't zero-length arrays a GNU extension? Or do you refer to flexible struct members?

Handling argc==0 in the kernel

Posted Jan 30, 2022 11:23 UTC (Sun) by areilly (subscriber, #87829) [Link] (1 responses)

To be honest, I was thinking about the argc==0 example here, where the argument vector allocation was real, because it was just the first part before an explicit null and the environment variable elements. In that case I don't think that the one-past-the-end situation arises, because the memory object is the whole thing and clearly not zero length. OK, I don't think that the zero-length issue is a thing, sorry.

Handling argc==0 in the kernel

Posted Jan 31, 2022 9:50 UTC (Mon) by larkey (guest, #104463) [Link]

Ah, I see, I was just curious if I missed something :) No problem!

Handling argc==0 in the kernel

Posted Jan 30, 2022 20:21 UTC (Sun) by NYKevin (subscriber, #129325) [Link] (5 responses)

It is legal to call malloc(0), and it is legal for malloc(0) to return a value other than NULL. If it does so, then the only thing you're allowed to do with that value is pass it to free(). In particular, you cannot add one to it, because it's functionally equivalent to a zero-length array (but I don't know if the standard actually uses the exact phrase "zero-length array").

In practice, most implementations either return NULL or convert it into a call to malloc(1). But even then, that's a one-byte array, not a one-object array, so you still can't legally add one to it unless the pointer's type has sizeof() == 1.

Handling argc==0 in the kernel

Posted Jan 30, 2022 20:23 UTC (Sun) by NYKevin (subscriber, #129325) [Link]

> then the only thing you're allowed to do with that value is pass it to free().

Or realloc(), I suppose. Point is, you definitely can't dereference it.

malloc(0)

Posted Jan 31, 2022 1:20 UTC (Mon) by jreiser (subscriber, #11027) [Link] (2 responses)

>It is legal to call malloc(0), and it is legal for malloc(0) to return a value other than NULL. If it does so, then the only thing you're allowed to do with that value is pass it to free()

One of the uses of malloc(0) is to serve as an analogue of the Lisp function (gensym): return a unique value, one that does not point to anything else returned by malloc, both now and for the remaining life of the process. (Yes, the standard requires this.) An implementation for which malloc(0) always returns NULL is not acceptable. So in practice, many implementations begin with something like size += (0==size), i.e. malloc(0) is treated as malloc(1).

malloc(0)

Posted Jan 31, 2022 1:31 UTC (Mon) by ABCD (subscriber, #53650) [Link]

An implementation for which malloc(0) always returns NULL is expressly permitted by the POSIX standard:

If the size of the space requested is 0, the behavior is implementation-defined: the value returned shall be either a null pointer or a unique pointer.

The C17 standard says something similar:

If the size of the space requested is zero, the behavior is implementation-defined: either a null pointer is returned to indicate an error, or the behavior is as if the size were some nonzero value, except that the returned pointer shall not be used to access an object.

malloc(0)

Posted Jan 31, 2022 9:43 UTC (Mon) by NYKevin (subscriber, #129325) [Link]

The null pointer satisfies all of the technical requirements of a heap-allocated zero-length array (you can form a pointer that is one past the end, you can iterate over it zero times with a standard for loop, and you can pass it to realloc without breaking anything), and all heap-allocated zero-length arrays are functionally identical because they have no state which can be mutated, so returning NULL can be thought of as semantically equivalent to a copy elision (i.e. you could think of all heap-allocated zero-length arrays as "copies" of the zero-length array that lives at NULL, and then you can elide those copies because zero-length arrays are immutable, so making a copy is unnecessary).

The fact that it happens to vaguely resemble some feature of Lisp, but the standard does not actually require it to fulfill all the requirements of that feature is, frankly, Lisp's problem, not C's problem.

Handling argc==0 in the kernel

Posted Jan 31, 2022 10:02 UTC (Mon) by larkey (guest, #104463) [Link]

Fair enough, although I *think* C restricts the meaning of array to things that are declared T D[N]; Zero-allocations are definitely possible though, yes, and POSIX is a bit more lax about wording since they say "argv is an array" while, in C terminology, it's "just" a pointer to some memory that... and so on.

Anyway, the POSIX standard pretty clearly says that argv is NULL-terminated, that is, not by itself a zero-length array, but at least

char **argv = { NULL };

Handling argc==0 in the kernel

Posted Jan 30, 2022 3:24 UTC (Sun) by nybble41 (subscriber, #55106) [Link]

> If argc is 0, argv[1] uses an index outside of the bounds of the array. Thus even writing argv[1] would be UB.

Another comment by areilly hinted at this already, but evaluating argv[1] for an argv array of size 1 (where argc == 0) is not undefined behavior. You are allowed to construct (but not dereference) a pointer to the element immediately after the end of an array. (Taking the address of a dereference or array indexing expression does not count as actually dereferencing the pointer: even though argv[1] in an expression by itself would be UB, &argv[1] and &*(argv + 1) are both semantically identical to the pointer arithmetic operation argv + 1. The address-of operator "cancels out" the dereference.) So there is no undefined behavior in the example, even if argc == 0, as &argv[1] and &argv[0] are both valid pointers.

Handling argc==0 in the kernel

Posted Mar 28, 2022 21:41 UTC (Mon) by fest3er (guest, #60379) [Link] (1 responses)

"According to the C standard argv is a argc+1 sized array, where argv[argc] == NULL. If argc is 0, argv[1] uses an index outside of the bounds of the array."

Hmmm. Off-by-one seems to rear its head here. If argc is 0, then argv[] should contain 0+1 elements, thus only argv[0] exists and should be null and argv[1] is out-of-bounds.

Aside, why are people saying that argv is the same as argv[0]? Isn't argv the address of the array, and argv[0] the address of the zeroeth argument (or null to indicate the end of the array)? Thus, shouldn't argv always be a valid pointer to an array and the array always contain at least the terminating NULL pointer?

And when the kernel constructs the argv array, does it use its memory or the user's memory?

Handling argc==0 in the kernel

Posted Mar 29, 2022 12:51 UTC (Tue) by jem (subscriber, #24231) [Link]

>Hmmm. Off-by-one seems to rear its head here. If argc is 0, then argv[] should contain 0+1 elements, thus only argv[0] exists and should be null and argv[1] is out-of-bounds.

Yes, of course. If argc is 0, then the argument vector is empty and only contains the terminating NULL at index 0. You can't expect to find anything of interest in argv[1].

>And when the kernel constructs the argv array, does it use its memory or the user's memory?

The argv array is in the process (user) address space. You can print the address of argv if you are interested to find out where it is located.

Handling argc==0 in the kernel

Posted Mar 29, 2022 20:16 UTC (Tue) by nybble41 (subscriber, #55106) [Link]

> The code argv+1 instead of &argv[1] would be valid, as it is perfectly allowed to construct a pointer pointing to the address immediately after the end of an array. However one is not allowed to dereference it.

From C99 (draft) §6.5.3.2:

> If the operand is the result of a unary * operator, neither that operator nor the & operator is evaluated and the result is as if both were omitted, except that the constraints on the operators still apply and the result is not an lvalue. Similarly, if the operand is the result of a [] operator, neither the & operator nor the unary * that is implied by the [] is evaluated and the result is as if the & operator were removed and the [] operator were changed to a + operator.

So "&argv[1]" is effectively rewritten as "argv+1" and there is no undefined behavior either way. You could even have an expression like "&(*p)" or "&p[0]" where p == NULL and the result will be well-defined (NULL). A conforming compiler cannot assume that argc > 0 based on the presence of "&argv[1]".

Handling argc==0 in the kernel

Posted Jan 28, 2022 20:07 UTC (Fri) by khim (subscriber, #9252) [Link] (1 responses)

You found the right page, just, perhaps, not the right quote. I think that one should close that discussion about meanings of different words:

> The wording, in particular the use of the word should, requires a Strictly Conforming POSIX Application to pass at least one argument to the exec function, thus guaranteeing that argc be one or greater when invoked by such an application.

I don't think explanations can be more clear than that.

You don't need to dig around for strict meanings of different words, authors of the standard did that for you.

Handling argc==0 in the kernel

Posted Jan 28, 2022 21:13 UTC (Fri) by larkey (guest, #104463) [Link]

But in what way does this contradict what I say? "should" means that this is a recommendation, i.e., a *strictly* conforming application needs to fulfill this criterion. But POSIX only uses should for argv[0] being the program name, but they use *shall* for argv ≠ NULL. Which is precisely my point. The latter is a *must do*, the former is not as strict as a requirement.

Handling argc==0 in the kernel

Posted Jan 28, 2022 16:08 UTC (Fri) by judas_iscariote (guest, #47386) [Link] (7 responses)

POSIX says a lot a crazy things..and is technically wrong worringly often.. When will people stop language-lawyering things and get real ?

BSDs on this case return -EFAULT , other OSs return -EINVAL..so allowing this behaviour is not the common standard..

Handling argc==0 in the kernel

Posted Jan 28, 2022 16:18 UTC (Fri) by larkey (guest, #104463) [Link] (6 responses)

At least FreeBSD runs this fine. They do return -EFAULT though when you do:

execve("/bin/date", NULL, NULL);

as mentioned in the year-old bug report. But this patch is about

execve("/bin/date", &(char*){NULL}, NULL);

which neither FreeBSD thinks is faulty, nor POSIX requires (i.e., FreeBSD is exactly in line with POSIX here). POSIX however recommends that argv[0] != NULL.

There are two issues here, a) that Linux isn't POSIX conforming, allowing argv == NULL, and b) pkexec relying on argv[0] != NULL which is only a POSIX recommendation. I do think the patch from Ariadne Conill should be applied, but then Linux is decidedly stricter than FreeBSD.

Handling argc==0 in the kernel

Posted Jan 28, 2022 16:22 UTC (Fri) by larkey (guest, #104463) [Link]

However, FreeBSD has applied a similar patch as proposed by Ariadne 2 days ago:

https://github.com/freebsd/freebsd-src/commit/773fa8cd136...

Copied from OpenBSD, says the commit message.

Handling argc==0 in the kernel

Posted Jan 28, 2022 17:33 UTC (Fri) by ariadne (subscriber, #138312) [Link] (1 responses)

It's about both, Linux turns NULL into {NULL} which is then rejected by the patch.

Handling argc==0 in the kernel

Posted Jan 28, 2022 17:44 UTC (Fri) by larkey (guest, #104463) [Link]

Fair enough :D

But, at least to me, it was kind of confusing at first to see these two mangled.

Handling argc==0 in the kernel

Posted Jan 30, 2022 18:44 UTC (Sun) by anton (subscriber, #25547) [Link] (2 responses)

Linux isn't POSIX conforming, allowing argv == NULL

Can you support this claim with actual POSIX wording?

Handling argc==0 in the kernel

Posted Jan 31, 2022 9:58 UTC (Mon) by larkey (guest, #104463) [Link] (1 responses)

The POSIX standard says:

> The argument argv is an array of character pointers to null-terminated strings. The application shall ensure that the last member of this array is a null pointer.

There is nothing in POSIX saying "it may be an array, but it also may be NULL". If a pointer may be NULL in any C function, this is explicitly noted. Compare the standard on strtol(3):

> A pointer to the final string shall be stored in the object pointed to by endptr, provided that endptr is not a null pointer.

(https://pubs.opengroup.org/onlinepubs/9699919799/function...).

This is making it explicit that endptr may be NULL. The nptr argument, however, may not be NULL i.e.,

strtol(NULL, NULL, 10);

is *not* allowed. In the same vain, you can't just make argv be NULL.

Handling argc==0 in the kernel

Posted Jan 31, 2022 12:35 UTC (Mon) by anton (subscriber, #25547) [Link]

In all of this text I see only requirements on the application, not on the OS. An application must not pass argv=NULL, and POSIX does not define define what the OS shall do if the application is non-conforming in this respect. Linux has a certain behaviour, BSD a different one, both conforming AFAICT.

Handling argc==0 in the kernel

Posted Jan 28, 2022 18:53 UTC (Fri) by flussence (guest, #85566) [Link] (29 responses)

Just idly speculating here... what would the fallout be if instead of throwing an error, the kernel filled in an empty argv with "/proc/self/exe"? That's going from one Linux-specific behaviour to another but I'm guessing it'd still close the original exploit.

Handling argc==0 in the kernel

Posted Jan 28, 2022 19:16 UTC (Fri) by NYKevin (subscriber, #129325) [Link] (8 responses)

IMHO the kernel should only try to provide a meaningful value there if we can be 100% sure it will actually work (because if it only works 95% of the time, someone will depend on it until it breaks). Problems include:

* /proc might not be mounted at all, or it might be mounted somewhere other than /proc.
* / might not be / (e.g. chroot, containers, etc.), and so /proc might be inaccessible.
* The callee might try to pass the argv[0] string on to some other process, but then /proc/self will be interpreted as that process.
* If you try to fix the above using an absolute PID instead of "self," then you run into PID namespaces.
* There are a variety of situations where the called process's binary is not directly accessible to the called process as a regular file, and /proc/self/exe appears to be a broken symlink. The only way to fix this problem is for the caller to provide a suitable alternative path, if a functional argv[0] is desired at all.

Handling argc==0 in the kernel

Posted Jan 30, 2022 21:42 UTC (Sun) by developer122 (guest, #152928) [Link] (7 responses)

Would the case where you delete the executable file also fall into this?

Handling argc==0 in the kernel

Posted Feb 1, 2022 1:34 UTC (Tue) by NYKevin (subscriber, #129325) [Link] (6 responses)

I don't know the inner guts of procfs symlink resolution, but I imagine so.

Handling argc==0 in the kernel

Posted Feb 1, 2022 14:57 UTC (Tue) by mathstuf (subscriber, #69389) [Link] (5 responses)

I've actually played around with this before. So if you create an executable and check its `/proc/$pid/exe` (I think I was actually playing with `/proc/$pid/fd/$n`, but I imagine it's similar behavior). It is linked to `/path/to/exe`. If you delete it, the symlink now "points to" `/path/to/exe (deleted)`. If you then create *this* path, the symlink is still broken even though the "target path" exists if you manually use the `readlink` result. Creating the original path also keeps it in the "(deleted)" state IIRC.

Handling argc==0 in the kernel

Posted Feb 1, 2022 18:07 UTC (Tue) by nybble41 (subscriber, #55106) [Link] (4 responses)

> If you delete it, the symlink now "points to" `/path/to/exe (deleted)`.

Interesting fact: Even though the /proc/$PID/exe symlink is "broken" you can still open it, producing a reference to the original file. The /proc filesystem has its own special rules for resolving symlinks. However, attempting to restore the file by creating a hard-link with `ln` or `link` gives the error "Invalid cross-device link" even though the original (deleted) file and the target directory are on the same filesystem. (You can *copy* /proc/$PID/exe to a new file, though.)

Handling argc==0 in the kernel

Posted Feb 7, 2022 9:28 UTC (Mon) by jwilk (subscriber, #63328) [Link] (3 responses)

"ln /proc/$PID/exe foo" fails with EXDEV, because it's trying to hardlink the symlink that resides in /proc.

You probably wanted "ln -L /proc/$PID/exe foo" to dereference the symlink… which still doesn't work, but at least you get a more understandable error message. (The dereferencing happens on the kernel side, so it could work in principle.)

Handling argc==0 in the kernel

Posted Feb 7, 2022 18:27 UTC (Mon) by nybble41 (subscriber, #55106) [Link] (2 responses)

> You probably wanted "ln -L /proc/$PID/exe foo" to dereference the symlink...

Yes, I tried it both with and without the -L option—I suppose I should have mentioned the difference in the error message. The kernel specifically blocks the creation of a hard-link to a previously deleted (zero-refcount) inode. It does work, however, if the inode still exists in the filesystem due to hard-links, even if the original path was deleted and the /proc/$PID/fd/N symlink is broken.

Handling argc==0 in the kernel

Posted Feb 10, 2022 21:21 UTC (Thu) by Jandar (subscriber, #85683) [Link] (1 responses)

> The kernel specifically blocks the creation of a hard-link to a previously deleted (zero-refcount) inode.

If you an open the deleted /proc/pid/exe, why would linkat(fd, "", AT_FDCWD, "/path/for/file", AT_EMPTY_PATH); not work? A filedescriptor obtained with open( ".", , | O_RDWR ) has to have a zero st_nlink as well.

I have never done an open(,O_TMPFILE) + linkat(fd, "", AT_FDCWD,...) so this may be a misconception.

Handling argc==0 in the kernel

Posted Feb 11, 2022 0:29 UTC (Fri) by nybble41 (subscriber, #55106) [Link]

I also find the restrictions a bit surprising, but Linus's response[0] when someone proposed an flink() system call[1]—which would have been more-or-less equivalent to hard-linking from /proc/$PID/fd/$N, with the primary use-case involving creating links to previously unlinked files—was a flat rejection: "there is no way in HELL we can do this securely without major other incursions." I believe the result of this discussion was a number of new restrictions on hard-linking files via /proc.

There are various things you can still do with /proc/$PID/fd/$N which may prove surprising from a security point of view. For example, if a file descriptor is opened read-only and passed to another user over a socket, or inherited by a child process, then normally the recipient can only read from that FD, even if they would have had write access to the original file. However, the target process inheriting that file descriptor can open /proc/self/fd/$N *for writing* and modify the contents of the file, even without access to the original path, so long as permissions on the file itself are compatible:

user$ mkdir my_dir
user$ chmod 0700 my_dir
user$ cd my_dir
user$ echo test > testfile
user$ chmod 0666 testfile
user$ exec 3< testfile
user$ su -c 'su nobody -s /bin/bash' # can't use sudo here as by default it would close the FD
nobody$ ls -l /proc/self/fd/3
lr-x------ 1 nobody nogroup 64 Feb 10 00:00 /proc/self/fd/3 -> /home/user/my_dir/testfile
nobody$ cat /home/user/my_dir/testfile # no access via my_dir
cat: /home/user/my_dir/testfile: Permission denied
nobody$ cat /proc/self/fd/3
test
nobody$ echo attempt1 1>&3 # writing to the FD itself fails, because it was opened read-only
bash: echo: write error: Bad file descriptor
nobody$ echo attempt2 > /proc/self/fd/3 # but this succeeds!
nobody$ cat /proc/self/fd/3
attempt2
nobody$ mkdir /tmp/other_dir && cd /tmp/other_dir
nobody$ ln -L /proc/self/fd/3 testfile_link # this works despite /proc/sys/fs/protected_hardlinks because we have read and write access to the file
nobody$ ls -l testfile_link
-rw-rw-rw- 3 user user 9 Feb 10 00:00 testfile_link
nobody$ echo attempt3 > testfile_link
nobody$ cat testfile_link
attempt3
nobody$ exit
user$ cat testfile # confirm that the original file was changed
attempt3

The overall moral of this example is that if you're relying on restrictive permissions on a parent directory to restrict access to otherwise-writable files it may not work out quite the way you hoped.

[0] https://marc.info/?l=linux-kernel&m=104973707408577&...

[1] https://marc.info/?l=linux-kernel&m=104965452917349

Handling argc==0 in the kernel

Posted Jan 28, 2022 20:06 UTC (Fri) by dskoll (subscriber, #1630) [Link] (18 responses)

You don't need to muck about in /proc. The first argument to execve() is the pathname being executed; that could be placed as argv[0].

Handling argc==0 in the kernel

Posted Jan 28, 2022 20:20 UTC (Fri) by matthias (subscriber, #94967) [Link] (1 responses)

Why not just pass "", i.e. the empty string?

If a callee does not inspect argv[0], then any string is good. If a callee expects some non-empty string as argv[0], then calling with argv pointing to { NULL } does not work today. And there is no reason why it should work tomorrow. Doing anything else than aborting with an error would only be to be backwards compatible with the current situation.

The only reason this could break anything that is working now that I can images is, that some callee actually expects to be called with argc==0 and argv pointing to { NULL }. But this would be really weird.

Handling argc==0 in the kernel

Posted Jan 30, 2022 21:43 UTC (Sun) by developer122 (guest, #152928) [Link]

You could run into issues with programs that check argc. They may expect it to be zero, or they may expect it to be 1 with a valid path, but they may not expect it to be 1 with an invalid path.

Handling argc==0 in the kernel

Posted Jan 28, 2022 20:24 UTC (Fri) by khim (subscriber, #9252) [Link] (4 responses)

Nope. It couldn't. Since that one, too, may be /proc/self/exe (and, indeed there are one case where pattern exec("/proc/self/exe", NULL, NULL); is used).

So it's not bullet-proof and, I think, few examples where this Linux-only misfeature is exploited are not worth inventing crazy schemes.

Handling argc==0 in the kernel

Posted Jan 28, 2022 21:07 UTC (Fri) by dskoll (subscriber, #1630) [Link] (3 responses)

But if execve("/proc/self/exe", NULL, NULL); succeeds, under what circumstances could /proc/self/exe not be relied on in the execed program?

Handling argc==0 in the kernel

Posted Jan 28, 2022 21:15 UTC (Fri) by khim (subscriber, #9252) [Link] (2 responses)

When someone would call `realpath` on it? That never happened in real-world usecase so yeah, you are right, that may a way out.

Of course this would also break most tests anyway thus it's not clear if couple of real programs are worth that complexity.

Kernel config option looks more and more sensible: we know such programs are rare, but how rare exactly?

Handling argc==0 in the kernel

Posted Jan 28, 2022 21:26 UTC (Fri) by dskoll (subscriber, #1630) [Link] (1 responses)

realpath is an executable; I think you meant readlink(2). But my point is this: If the original execve call succeeds, then "/proc/self/exe" must still be valid in the exec'd program. if /proc/self/exe were not valid, then the execve call would fail.

Handling argc==0 in the kernel

Posted Jan 29, 2022 2:48 UTC (Sat) by comex (subscriber, #71521) [Link]

Not to get lost in the weeds, but realpath is also the name of a function; see `man 3 realpath`.

Handling argc==0 in the kernel

Posted Jan 28, 2022 20:31 UTC (Fri) by floppus (guest, #137245) [Link] (7 responses)

But if any setuid program actually relies on argv[0] being the name of the program, that's a severe bug (symlinks, TOCTOU, etc., never mind the fact that the caller can always specify any string they want.)

If the kernel were to silently replace argc==0 with argc==1, it seems like using an empty string would be most parsimonious. At least an empty string is guaranteed not to be a valid filename, whereas /proc/self/exe or the name of the executable might or might not be.

Handling argc==0 in the kernel

Posted Jan 28, 2022 21:11 UTC (Fri) by dskoll (subscriber, #1630) [Link]

I think either an empty string or a copy of the first argument to execve is a good choice.

Handling argc==0 in the kernel

Posted Jan 28, 2022 21:21 UTC (Fri) by khim (subscriber, #9252) [Link] (3 responses)

> But if any setuid program actually relies on argv[0] being the name of the program, that's a severe bug (symlinks, TOCTOU, etc., never mind the fact that the caller can always specify any string they want.)

How? All examples I have ever saw are using argv[0] only to multiplex binaries. They don't ever look on the actual binary which is specified in argv[0], rather, they are only interested in `basename`. Symlinks, TOCTOU and even the fact that caller can specify anything there are irrelevant: for them it's just an argument of the command line which happen to include name of the program.

And empty arguments have tendency to make programs wonky.

Handling argc==0 in the kernel

Posted Jan 28, 2022 22:12 UTC (Fri) by NYKevin (subscriber, #129325) [Link] (2 responses)

> And empty arguments have tendency to make programs wonky.

setuid programs already need to defend against argv[0] == "" (or any other actual string, for that matter) because they must not trust the caller. POSIX explicitly specifies that the caller can set argv[0] to whatever it wants, and passing an invalid or incorrect argv[0] has been well understood for a long time as A Thing That Can Be Done (even though it's probably a bad idea in most cases). OTOH, the "there is no argv[0]" case is much less obvious and (I think) is not supported at all on some platforms, so it should not be too surprising that at least one setuid program failed to check for it.

So, from a security perspective, changing "the string doesn't exist" to "the string is empty" is the most straightforward way to close the security hole, since the callee is very likely already checking for an empty string (to the extent that it cares about argv[0] at all).

Handling argc==0 in the kernel

Posted Jan 28, 2022 22:41 UTC (Fri) by JoeBuck (subscriber, #2330) [Link]

Replacing argv[0] via an exec call was a common trick back in the days of students sharing some Vax machine and wanting to hide the fact that they were playing games from "ps".

Handling argc==0 in the kernel

Posted Feb 10, 2022 4:45 UTC (Thu) by rlhamil (guest, #6472) [Link]

And arguments against altering argv in the kernel don't make sense, since that already happens for interpreter execs, right?

That's a much more complex transformation than inserting argv[0]="" if argv is NULL or argv[0] is NULL.

Handling argc==0 in the kernel

Posted Jan 30, 2022 21:48 UTC (Sun) by developer122 (guest, #152928) [Link] (1 responses)

I think changing argc from 0 to 1 and adding an empty string will probably trip up legitimate programs. If argc is 0, then they may handle that by not checking argv and all is well. If argc is one, then they may expect a valid entry in argv[0]. It's still not great to require a valid name in argv[0], but regardless of what vulnerabilities may exist already I can see this causing more breakage to programs that attempt to handle argc==0 correctly.

Handling argc==0 in the kernel

Posted Feb 1, 2022 1:37 UTC (Tue) by NYKevin (subscriber, #129325) [Link]

As I explained in another comment, the caller can set argv[0] to any string it wishes, and a setuid program must not trust the caller (because the caller is unprivileged). So all setuid programs *must* behave correctly for invalid or "wrong" argv[0]. If they don't, then that's a separate vulnerability which must be fixed regardless of what the kernel does in the argc == 0 case. By coalescing the two cases into one case, you halve the number of potential vulnerabilities that application writers need to worry about.

Handling argc==0 in the kernel

Posted Feb 4, 2022 16:48 UTC (Fri) by sbaugh (guest, #103291) [Link] (2 responses)

>The first argument to execve() is the pathname being executed; that could be placed as argv[0].

Not when using execveat(fd, "", ..., AT_EMPTY_PATH)

Handling argc==0 in the kernel

Posted Feb 4, 2022 17:04 UTC (Fri) by adobriyan (subscriber, #30858) [Link]

d_path() can name any file.

Handling argc==0 in the kernel

Posted Feb 5, 2022 1:14 UTC (Sat) by mchapman (subscriber, #66589) [Link]

What is the process name -- as returned by prctl(PR_GET_NAME) -- when you do that? It seems like whatever it chooses would be an appropriate thing to default argv[0] to.

Handling argc==0 in the kernel

Posted Jan 29, 2022 0:06 UTC (Sat) by ariadne (subscriber, #138312) [Link]

In the argv construction case, I suggested {bprm->filename, NULL} which is even better than that. But the conversation shifted back to "lets just see if we can reject this entirely," so let's see how that plays out first.

Handling argc==0 in the kernel

Posted Jan 29, 2022 16:45 UTC (Sat) by ballombe (subscriber, #9523) [Link]

Handling this in the kernel is likely too rigid.
What not a special ldsuid.so loader for SUID program that would sanitize everything before the real program start ?
This way security-conscious distro could ship a ldsuid.so program that implement the security they want.

Handling argc==0 in the kernel

Posted Jan 31, 2022 2:41 UTC (Mon) by marcH (subscriber, #57642) [Link] (10 responses)

> The envp array begins immediately after the NULL value that terminates argv.

First, a brief reminder that some standard requires argv[argc] to be NULL would be IMHO very useful. It's not obvious that it does because why would you need a clumsy NULL terminator when you already have a proper argc?

For a while I wondered "Did he mean the \0 byte terminating argv[argc - 1] instead?". The pointer arithmetic and discussion that follows clarifies but it comes after. Also, pointer arithmetic with a mix of pointers and arrays degrading to pointers is... not fun :-)

Handling argc==0 in the kernel

Posted Jan 31, 2022 8:53 UTC (Mon) by anselm (subscriber, #2796) [Link] (9 responses)

First, a brief reminder that some standard requires argv[argc] to be NULL would be IMHO very useful.

FWIW, my venerable copy of ANSI/ISO 9899-1990 says, in section 5.1.2.2.1:

argv[argc] shall be a null pointer.

Perhaps this was changed in the meantime but I would be surprised.

It's not obvious that it does because why would you need a clumsy NULL terminator when you already have a proper argc?

Presumably so you can say char **p; for (p = argv; *p; p++) {…}. One might argue that this is superfluous and that people ought to be doing something like int i; for (i = 0; i < argc; i++) {…} instead, but chances are that in the run-up to ANSI/ISO 9899-1990 there were C implementations that did guarantee argv[argc] == 0 already, and the former idiom was sufficiently widespread that it was easier to mandate this for everyone than to outlaw (or leave vague) something that worked fine for a large enough number of people.

Handling argc==0 in the kernel

Posted Jan 31, 2022 9:55 UTC (Mon) by NYKevin (subscriber, #129325) [Link] (1 responses)

I would bet that there was somebody who insisted that one or the other of those loops was 2% faster on architecture X (for some value of X), and therefore demanded that it be allowed.

I would tend to assume that the int loop is marginally easier to branch-predict? But I'm somewhat skeptical that that was relevant on the hardware of the time...

Handling argc==0 in the kernel

Posted Jan 31, 2022 11:19 UTC (Mon) by mfuzzey (subscriber, #57966) [Link]

Especially as the time spent processing the argument list is likely to be negligable compared with the time doing the exec() and all the subsequent setup of the new process (loading shared libraries etc). The strace output for hello world is quite "frightening"...

Handling argc==0 in the kernel

Posted Jan 31, 2022 13:14 UTC (Mon) by jem (subscriber, #24231) [Link] (6 responses)

One might also argue that argc is superfluous, and iterating over argv by incrementing a pointer variable, stopping when the terminating NULL is reached, would be the idiomatic C thing to do. After all, this would be analogous to how C's null-terminated strings are handled.

Handling argc==0 in the kernel

Posted Jan 31, 2022 19:10 UTC (Mon) by marcH (subscriber, #57642) [Link] (5 responses)

> One might also argue that argc is superfluous, ...

Yes and that would be very ironic in an article about a pkexec security issue caused by not using argc as termination and fixed by looking at it :-)

Anyway comparing termination in C versus all other programming languages was not my point. Using _both_ argc and NULL at the same time is idiomatic in exactly zero programming language, so that's confusing. So I'm merely suggesting adding one short sentence about that.

Handling argc==0 in the kernel

Posted Jan 31, 2022 19:42 UTC (Mon) by nybble41 (subscriber, #55106) [Link] (4 responses)

> caused by not using argc as termination

One could equally well cast this issue as "caused by not checking the first element of the array for the NULL terminator". It's exactly the same error as if you received a NUL-terminated string and started iterating over it from the second element, assuming the string to be non-empty.

I'm in the Pascal-style string/list camp myself—lengths should be explicit—but using argc instead of checking for NULL is neither necessary nor sufficient to avoid this problem. (To illustrate the latter, consider the case of a program making the same non-empty argv assumption and using a loop of the form "int i = 1; do { ... } while (i++ < argc)" to shave off a "redundant" bounds check.)

Handling argc==0 in the kernel

Posted Jan 31, 2022 19:57 UTC (Mon) by marcH (subscriber, #57642) [Link] (3 responses)

> One could equally well cast this issue as "caused by not checking the first element of the array for the NULL terminator". It's exactly the same error as if you received a NUL-terminated string and started iterating over it from the second element, assuming the string to be non-empty.

It's "exactly the same" only if your termination looks like "i != argc" but no one ever does that. Everyone does "i < argc" or something like it. I mean no one writes:

for (int i; i++; i!=argc)

> but using argc instead of checking for NULL is neither necessary nor sufficient to avoid this problem.

I haven't looked at the actual code but if pkexec's loop had used an argc-based termination then the loop would have not run at all.

> int i = 1; do { ... } while (i++ < argc)" to shave off a "redundant" bounds check.)

That's bending over backwards to avoid a simpler for loop. I'm sure you can find some real world and valid cases for this but most of the time people just write a simpler for loop.

Handling argc==0 in the kernel

Posted Jan 31, 2022 22:19 UTC (Mon) by marcH (subscriber, #57642) [Link]

> for (int i; i++; i!=argc)

Of course I meant no one writes this:

for (int i=start; i!=argc; i++)

Sorry for the noise.

Handling argc==0 in the kernel

Posted Feb 1, 2022 1:58 UTC (Tue) by NYKevin (subscriber, #129325) [Link] (1 responses)

> It's "exactly the same" only if your termination looks like "i != argc" but no one ever does that. Everyone does "i < argc" or something like it. I mean no one writes:
>
> for (int i; i++; i!=argc)

Maybe not in C, but C++ tacitly encourages this sort of nonsense because iterators are not, strictly speaking, required to be ordered (for example, a linked-list iterator may only be comparable for equality, because there is no O(1) way of ordering two iterators). So if you're taking C++ code and porting it to straight C, the most straightforward translation of a foreach loop is to use an inequality comparison instead of a less-than comparison, and it's possible that some programmers who started out as C++ programmers and later learned C may have picked this up as an idiom.

(Incidentally, these people also tend to favor the preincrement operator over the postincrement operator in contexts where there's no semantic difference, because in C++, postincrement can be significantly costlier due to the need to make a copy.)

Handling argc==0 in the kernel

Posted Feb 1, 2022 12:59 UTC (Tue) by madscientist (subscriber, #16861) [Link]

In my experience with C++, people use < most of the time when looping over integers. You can't perform integer looping with an iterator, unless you add your own specializations which people don't really bother to do, so these kinds of loops are written differently than loops with STL iterators anyway.

Also even in C, old-school programmers preferred pre-increment because post-increment was less efficient. Of course it was hard to detect that inefficiency, but it was a peg to hang one's preferential hat on and that's all that's needed when faced with two essentially equivalent choices. In C++ with operator overloading it can be measurably less efficient, of course, depending on the type implementation.

Handling argc==0 in the kernel

Posted Feb 13, 2022 23:46 UTC (Sun) by Smon (guest, #104795) [Link]

There will be less breaking userspace, if two methods are combined:
Only set argv[0] to "" if the binary is setuid (or has other privileges).

I am also happy with a kernel parameter or compile flag. When all distributions are using it, it can get the kernel default behavior.