|
|
Subscribe / Log in / New account

Catching the mismatch

Catching the mismatch

Posted Feb 11, 2025 10:12 UTC (Tue) by farnz (subscriber, #17727)
In reply to: Catching the mismatch by josh
Parent article: Maintainer opinions on Rust-for-Linux

One key usability difference here (and C compilers could adopt this); Rust allows you to signal to the compiler that a parameter or variable is intentionally unused for now with a single-character prefix, _. The equivalent warning in GCC requires an attribute to suppress it; so you'd have to write something like bool f(int *a, [[gnu::unused]] int *b) or bool f(int *a, __attribute__((unused)) int *b) to suppress the warning (I may have put the attribute in the wrong place).


to post comments

Catching the mismatch

Posted Feb 11, 2025 11:20 UTC (Tue) by adobriyan (subscriber, #30858) [Link] (2 responses)

> warning: unused parameter '_' [-Wunused-parameter]

C compilers could (not warn) for underscored names while C programmers are waiting for C23.

Catching the mismatch

Posted Feb 11, 2025 11:28 UTC (Tue) by farnz (subscriber, #17727) [Link] (1 responses)

I just checked, because I recalled that the Standard reserves names beginning with underscore; but that only applies to globally visible names, not local ones, so it'd be fine to use a name beginning underscore to suppress the warning.

Catching the mismatch

Posted Feb 13, 2025 5:26 UTC (Thu) by aaronmdjones (subscriber, #119973) [Link]

It's more complicated than that. The standard reserves names beginning with a single underscore only if the following character is an ASCII uppercase letter. A name beginning with an underscore followed by a lowercase letter is not reserved. If you want to guarantee a reserved name, it needs to begin with two underscores.

Catching the mismatch

Posted Feb 11, 2025 11:24 UTC (Tue) by q3cpma (subscriber, #120859) [Link] (2 responses)

Does it matter much? Such a real world case shouldn't happen often. Also, a very common way of achieving the same result (we use that in our C++ code base to please clang-tidy and co.) is to comment the parameter name, like so: `bool f(int* a, int* /*b*/)`

Catching the mismatch

Posted Feb 11, 2025 11:41 UTC (Tue) by farnz (subscriber, #17727) [Link]

It makes a huge difference to the usability of the warning. Ideally, you want your codebase to be warning-free after each reviewed patch is applied[1], so you want to be able to quickly suppress an unused variable warning when it's only going to exist until the next patch in the series is applied. You also want the name to stay around, so that if the code is usable as-is, with only part of the series applied, anyone using it has a clue about what value to fill in.

That leads to you wanting a simple and obvious way to suppress the warning when you're not yet using a parameter, and yet to still give it a name. The alternative (and what most C and C++ codebases I've seen do) is to simply not bother enabling the warning at all, because it's too noisy.

[1] First, you want the codebase to be warning free because that means that a new warning appearing is something to fix; you want warnings to tell you that something is wrong with the codebase, and not to be something that you routinely ignore because they're meaningless. Second, you want maintainers to be able to say "patches 1 through 5 are applied, but I don't like the way you frobnicate your filament in patch 6, so please redo patches 6 onwards with the following advice in mind". And third, if I see a function bool do_useful_thing(int *operations_table, int index, bit_mask), it's hard to work out what the last parameter will mean later. If I see bool do_useful_thing(int *operations_table, int index, bit_mask _ignored_cpus_mask), it's obvious that that last parameter will be an ignored CPUs bitmask when ignoring CPUs is implemented later.

Catching the mismatch

Posted Feb 11, 2025 13:05 UTC (Tue) by Wol (subscriber, #4433) [Link]

> Does it matter much? Such a real world case shouldn't happen often.

When I was programming in C regularly, it happened a lot, and it was a pain in the arse - the warnings swamped everything else.

As soon as you start using callbacks, passing functions to libraries, what have you (and it's libraries that are the problem, where the generic case needs the parameter but your case doesn't), you want to be able to say "I know this variable isn't used, it's there for a reason".

This was Microsoft C6 with warning level 4, and I just couldn't find a way to suppress a warning - everything I tried might suppress the warning I was trying to get rid of, but just triggered a different warning instead.

Fortunately, it was me that set the rules about errors, warnings etc, so the rule basically said "all warnings must be fixed or explained away".

Cheers,
Wol

Catching the mismatch

Posted Feb 11, 2025 11:44 UTC (Tue) by josh (subscriber, #17465) [Link] (14 responses)

Yeah, having a one-character way to signal "this is intentional" goes a long way towards making this comfortable to have on by default.

One of the most common cases for it: you need to pass a closure accepting an argument, and you want to ignore the argument: |_| do_thing()

Underscores in Rust

Posted Feb 11, 2025 12:30 UTC (Tue) by farnz (subscriber, #17727) [Link] (13 responses)

One "interesting" surprise in Rust, however, is that in let bindings, there is a significant difference between let _ = … and let _foo = …; in the former, whatever you put in place of … is dropped immediately, while in the latter, it's dropped when _foo goes out scope (just as it would be if you used foo instead of _foo).

I understand the reasons for this, and why it's also a useful distinction to be able to make, but I've also seen people get caught out by it because they're trying to copy locking habits from other languages, and write let _ = structure.mutex.lock(); expecting that this means they hold the mutex - where mutex has type mutex: Mutex<()> to imitate a plain lock from another language.

Underscores in Rust

Posted Feb 11, 2025 12:48 UTC (Tue) by heftig (subscriber, #73632) [Link] (2 responses)

As useful as this can sometimes be, e.g. when you really don't care about a Result, I wish clippy would suggest using drop(…) instead of allowing let _ = ….

Underscores in Rust

Posted Feb 11, 2025 14:05 UTC (Tue) by khim (subscriber, #9252) [Link]

Have you submitted a request? Clippy is not supposed to be “opionated by default”, but it is Ok for it to have “opionated lints” (as long as they are not default) and that one sounds both useful (for some people) and easy to implement.

Underscores in Rust

Posted Feb 11, 2025 14:22 UTC (Tue) by farnz (subscriber, #17727) [Link]

There's a lint to help with that - #![warn(let_underscore_drop)] will catch all the footgun cases where you've used let _ =. It won't stop you using it completely, just in the cases where there's a risk of changed behaviour due to the rules around drop timing.

Underscores in Rust

Posted Feb 11, 2025 13:56 UTC (Tue) by adobriyan (subscriber, #30858) [Link] (9 responses)

Did you just said that Rust has footguns? Blasphemy!

Underscores in Rust

Posted Feb 11, 2025 14:46 UTC (Tue) by Wol (subscriber, #4433) [Link]

> because they're trying to copy locking habits from other languages

If you give a crossbow-man a musket, of course he's going to try to shoot his foot off :-)

Cheers,
Wol

Underscores in Rust

Posted Feb 11, 2025 15:36 UTC (Tue) by farnz (subscriber, #17727) [Link] (7 responses)

All usable languages have footguns - if they didn't, they'd also be blocking you from doing something useful.

And it's unhealthy to not talk about the footguns in a language you like - just because you like it doesn't mean it's completely perfect :-)

Underscores in Rust

Posted Feb 11, 2025 15:45 UTC (Tue) by adobriyan (subscriber, #30858) [Link] (6 responses)

I, for one, didn't even know that renaming variable in Rust could affect code in such a critical way.

Underscores in Rust

Posted Feb 11, 2025 15:59 UTC (Tue) by farnz (subscriber, #17727) [Link] (1 responses)

It's very specifically a special case where you name a let binding _; you can't read it (_ isn't a real variable), and it drops anything bound to it immediately. _foo and foo behave in exactly the same way, however.

Underscores in Rust

Posted Feb 11, 2025 20:42 UTC (Tue) by mathstuf (subscriber, #69389) [Link]

A small nitpick: `_foo` will not trigger diagnostics if it is not used whereas an unused `foo` will.

Underscores in Rust

Posted Feb 12, 2025 3:58 UTC (Wed) by geofft (subscriber, #59789) [Link] (2 responses)

It's not quite as bad as you might think from the above example about a mutex, because a Rust-y mutex API (such as std::sync::Mutex) returns a guard object, which holds the mutex locked until it's dropped, and there's no way to get to the value protected by the mutex without having a guard object. (That is, if you want a mutex-protected structure, you write it as a mutex object that wraps the rest of the data, to make you deal with the mutex before getting to the data, as opposed to a structure with several members, one of which is the mutex that protects the other members, where you can easily bypass the mutex intentionally or unintentionally.) Usually this is implemented via the Deref/DerefMut traits, where you can treat the guard object as a smart pointer and do let mut guard = mutex.lock(); *guard += 1 or whatever, but you can also choose to design an API where the guard object has some sort of methods that return borrowed references to the data. The borrows cannot outlast the guard object, and the mutex is locked so long as the guard object remains in scope.

So if you did write let _ = structure.mutex.lock();, yes, you would unlock the mutex immediately, but you also wouldn't have the ability to access the data behind the mutex unless you gave a name to the variable. Because Rust prevents you from completely forgetting to lock the mutex and accessing the data without first locking it, it also prevents you from ineffectively locking the mutex and accessing the data after you unlocked it.

Or in other words, there usually isn't a pattern of "get this RAII object and keep it around for its side effect while doing other stuff". Either you get the RAII object and actually reference it in the stuff you're doing, or you're using some non-RAII API like raw bindings to explicit lock() and unlock() calls where automatic drop isn't relevant.

Underscores in Rust

Posted Feb 12, 2025 7:09 UTC (Wed) by mb (subscriber, #50428) [Link]

>there usually isn't a pattern of "get this RAII object and keep it around for its side effect while doing other stuff".

That's true. It's not done like this in the vast majority of cases.
But there are rare exceptions:
https://docs.rs/tokio/1.43.0/tokio/sync/struct.Semaphore....

But misusing this wouldn't (and must not) cause UB.

Underscores in Rust

Posted Feb 12, 2025 11:04 UTC (Wed) by farnz (subscriber, #17727) [Link]

Or in other words, there usually isn't a pattern of "get this RAII object and keep it around for its side effect while doing other stuff". Either you get the RAII object and actually reference it in the stuff you're doing, or you're using some non-RAII API like raw bindings to explicit lock() and unlock() calls where automatic drop isn't relevant.

There isn't such a pattern in idiomatic Rust, but it gets written when you're still thinking in terms of C++ std::mutex or similar facilities from other languages.

And that makes this a very important footgun to call out, since someone who learnt about concurrency using Rust won't even realise this is an issue, while someone who comes from another language will perceive it as Rust's promises around "fearless concurrency" being broken unless they've already been made aware of this risk - or ask a Rust expert to explain their bug.

Underscores in Rust

Posted Feb 12, 2025 8:31 UTC (Wed) by ralfj (subscriber, #172874) [Link]

The thing is, it's not just renaming a variable. '_' is not a variable name, it is an entirely different syntactic entity: it is a *pattern*. Other examples of patterns are 'Some(x)' and '(a, b, c)'. The meaning of the '_' pattern is "discard the value matched against this pattern as soon as possible".

That said, I agree this is quite surprising, and I've been bitten by this myself in the past.

Catching the mismatch

Posted Feb 11, 2025 12:22 UTC (Tue) by TomH (subscriber, #56149) [Link] (2 responses)

You can also leave the name out completely in C++ to avoid the warning.

In C that's only technically allowed from C23 on but gcc won't complain unless -pedantic is used, though clang will.

Catching the mismatch

Posted Feb 11, 2025 12:43 UTC (Tue) by farnz (subscriber, #17727) [Link] (1 responses)

The comment trick q3cpma points out would work, but I'd normally want to have a name in there because of the partially applied patch series problem, where I want people to be aware that this parameter does have a meaning.

And yes, I can (and have) worked around this with documentation comments, but those have a bad habit of getting stale, such that the doc comment refers to a parameter called "ignored_cpu_mask" that's been removed and replaced by a "active_cpu_mask"…

Catching the mismatch

Posted Feb 11, 2025 17:35 UTC (Tue) by mathstuf (subscriber, #69389) [Link]

There are `clang-tidy` checks that check that the declaration parameter name matches the definition name. Nothing more fun than staring at code with a `int f(int a, int b)` declaration but a `int f(int b, int a)` typo in the definition. Though these tools can "read" comments to know that `/*b*/` at the definition site matches the declaration name as well.

See this Rust issue as well for when `_arg` naming is unwelcome: https://github.com/rust-lang/rust/issues/91074

Catching the mismatch

Posted Feb 11, 2025 14:37 UTC (Tue) by gray_-_wolf (subscriber, #131074) [Link]

I would write:

int f(int *a, int *b) {
(void)b;
...
}

Sure, takes one extra line, but I find it more readable compared to the attributes.

Unused parameters in C are easy

Posted Feb 11, 2025 14:42 UTC (Tue) by alx.manpages (subscriber, #145117) [Link] (5 responses)

The way to have unused parameters in C is even easier than in Rust. No need for an underscore; just leave it unnamed.

In the following program, I don't use argc. Simply don't name it.

```
alx@devuan:~/tmp$ cat unused.c
#include <stdio.h>
int
main(int, char *argv[])
{
puts(argv[0]);
return 0;
}
alx@devuan:~/tmp$ gcc -Wall -Wextra unused.c
alx@devuan:~/tmp$ ./a.out
./a.out
```

Unused parameters in C are easy

Posted Feb 11, 2025 14:47 UTC (Tue) by farnz (subscriber, #17727) [Link] (4 responses)

OK, so how do I name the parameter for documentation purposes while marking it as unused?

Unused parameters in C are easy

Posted Feb 11, 2025 15:15 UTC (Tue) by alx.manpages (subscriber, #145117) [Link] (3 responses)

> how do I name the parameter for documentation purposes while marking it as unused?

Then you'll need what others have mentioned.

Either a comment:

int
main(int /*argc*/, char *argv[])
{...}

or a cast to void:

int
main(int argc, char *argv[])
{
(void) argc;

...
}

But Rust's _ is no documentation name either, so the C equivalent is indeed no name at all.
The cast-to-void is nice, when you want a name.

Unused parameters in C are worse than in Rust

Posted Feb 11, 2025 15:18 UTC (Tue) by farnz (subscriber, #17727) [Link] (2 responses)

I'm sorry, you've confused me; why is Rust's use of _ignored_cpu_mask "no documentation name either"? All names starting with an underscore are flagged as explicitly unused, not just the underscore on its own. And the C things so far are all much more work than just adding an `_` to the name to indicate that it's currently deliberately unused - my contention is that if you don't make it really simple, people will prefer to turn off the entire warning rather than use the convention for unused parameters.

Unused parameters in C are worse than in Rust

Posted Feb 11, 2025 15:30 UTC (Tue) by alx.manpages (subscriber, #145117) [Link]

> why is Rust's use of _ignored_cpu_mask "no documentation name either"?
> All names starting with an underscore are flagged as explicitly unused, not just the underscore on its own.

Ahh, sorry, I missed that. How about this?

#define _ [[maybe_unused]]

int
main(_ int argc, char *argv[])
{...}

The _() function already exists (for internationalization of strings), which is why I wouldn't shadow it with this macro, but if you don't use _(), when you could define the undescore to be [[maybe_unused]]. Or you could find another name that serves you.

Unused parameters in C are worse than in Rust

Posted Feb 11, 2025 21:33 UTC (Tue) by mathstuf (subscriber, #69389) [Link]

I filed this issue[1] about the case of default implementations of trait methods which don't use all of their arguments. I don't want the underscore-leading name to show up in the docs for the trait, but I also don't want the diagnostic. `let _for_rustdoc = unused_arg;` works and documents *why* I am doing it, but I'd still like a better solution.

[1] https://github.com/rust-lang/rust/issues/91074


Copyright © 2025, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds