Catching the mismatch
Catching the mismatch
Posted Feb 11, 2025 10:12 UTC (Tue) by farnz (subscriber, #17727)In reply to: Catching the mismatch by josh
Parent article: Maintainer opinions on Rust-for-Linux
One key usability difference here (and C compilers could adopt this); Rust allows you to signal to the compiler that a parameter or variable is intentionally unused for now with a single-character prefix, _. The equivalent warning in GCC requires an attribute to suppress it; so you'd have to write something like bool f(int *a, [[gnu::unused]] int *b) or bool f(int *a, __attribute__((unused)) int *b) to suppress the warning (I may have put the attribute in the wrong place).
Posted Feb 11, 2025 11:20 UTC (Tue)
by adobriyan (subscriber, #30858)
[Link] (2 responses)
C compilers could (not warn) for underscored names while C programmers are waiting for C23.
Posted Feb 11, 2025 11:28 UTC (Tue)
by farnz (subscriber, #17727)
[Link] (1 responses)
Posted Feb 13, 2025 5:26 UTC (Thu)
by aaronmdjones (subscriber, #119973)
[Link]
Posted Feb 11, 2025 11:24 UTC (Tue)
by q3cpma (subscriber, #120859)
[Link] (2 responses)
Posted Feb 11, 2025 11:41 UTC (Tue)
by farnz (subscriber, #17727)
[Link]
That leads to you wanting a simple and obvious way to suppress the warning when you're not yet using a parameter, and yet to still give it a name. The alternative (and what most C and C++ codebases I've seen do) is to simply not bother enabling the warning at all, because it's too noisy.
[1] First, you want the codebase to be warning free because that means that a new warning appearing is something to fix; you want warnings to tell you that something is wrong with the codebase, and not to be something that you routinely ignore because they're meaningless. Second, you want maintainers to be able to say "patches 1 through 5 are applied, but I don't like the way you frobnicate your filament in patch 6, so please redo patches 6 onwards with the following advice in mind". And third, if I see a function bool do_useful_thing(int *operations_table, int index, bit_mask), it's hard to work out what the last parameter will mean later. If I see bool do_useful_thing(int *operations_table, int index, bit_mask _ignored_cpus_mask), it's obvious that that last parameter will be an ignored CPUs bitmask when ignoring CPUs is implemented later.
Posted Feb 11, 2025 13:05 UTC (Tue)
by Wol (subscriber, #4433)
[Link]
When I was programming in C regularly, it happened a lot, and it was a pain in the arse - the warnings swamped everything else.
As soon as you start using callbacks, passing functions to libraries, what have you (and it's libraries that are the problem, where the generic case needs the parameter but your case doesn't), you want to be able to say "I know this variable isn't used, it's there for a reason".
This was Microsoft C6 with warning level 4, and I just couldn't find a way to suppress a warning - everything I tried might suppress the warning I was trying to get rid of, but just triggered a different warning instead.
Fortunately, it was me that set the rules about errors, warnings etc, so the rule basically said "all warnings must be fixed or explained away".
Cheers,
Posted Feb 11, 2025 11:44 UTC (Tue)
by josh (subscriber, #17465)
[Link] (14 responses)
One of the most common cases for it: you need to pass a closure accepting an argument, and you want to ignore the argument: |_| do_thing()
Posted Feb 11, 2025 12:30 UTC (Tue)
by farnz (subscriber, #17727)
[Link] (13 responses)
I understand the reasons for this, and why it's also a useful distinction to be able to make, but I've also seen people get caught out by it because they're trying to copy locking habits from other languages, and write let _ = structure.mutex.lock(); expecting that this means they hold the mutex - where mutex has type mutex: Mutex<()> to imitate a plain lock from another language.
Posted Feb 11, 2025 12:48 UTC (Tue)
by heftig (subscriber, #73632)
[Link] (2 responses)
Posted Feb 11, 2025 14:05 UTC (Tue)
by khim (subscriber, #9252)
[Link]
Have you submitted a request? Clippy is not supposed to be “opionated by default”, but it is Ok for it to have “opionated lints” (as long as they are not default) and that one sounds both useful (for some people) and easy to implement.
Posted Feb 11, 2025 14:22 UTC (Tue)
by farnz (subscriber, #17727)
[Link]
Posted Feb 11, 2025 13:56 UTC (Tue)
by adobriyan (subscriber, #30858)
[Link] (9 responses)
Posted Feb 11, 2025 14:46 UTC (Tue)
by Wol (subscriber, #4433)
[Link]
If you give a crossbow-man a musket, of course he's going to try to shoot his foot off :-)
Cheers,
Posted Feb 11, 2025 15:36 UTC (Tue)
by farnz (subscriber, #17727)
[Link] (7 responses)
And it's unhealthy to not talk about the footguns in a language you like - just because you like it doesn't mean it's completely perfect :-)
Posted Feb 11, 2025 15:45 UTC (Tue)
by adobriyan (subscriber, #30858)
[Link] (6 responses)
Posted Feb 11, 2025 15:59 UTC (Tue)
by farnz (subscriber, #17727)
[Link] (1 responses)
Posted Feb 11, 2025 20:42 UTC (Tue)
by mathstuf (subscriber, #69389)
[Link]
Posted Feb 12, 2025 3:58 UTC (Wed)
by geofft (subscriber, #59789)
[Link] (2 responses)
So if you did write let _ = structure.mutex.lock();, yes, you would unlock the mutex immediately, but you also wouldn't have the ability to access the data behind the mutex unless you gave a name to the variable. Because Rust prevents you from completely forgetting to lock the mutex and accessing the data without first locking it, it also prevents you from ineffectively locking the mutex and accessing the data after you unlocked it.
Or in other words, there usually isn't a pattern of "get this RAII object and keep it around for its side effect while doing other stuff". Either you get the RAII object and actually reference it in the stuff you're doing, or you're using some non-RAII API like raw bindings to explicit lock() and unlock() calls where automatic drop isn't relevant.
Posted Feb 12, 2025 7:09 UTC (Wed)
by mb (subscriber, #50428)
[Link]
That's true. It's not done like this in the vast majority of cases.
But misusing this wouldn't (and must not) cause UB.
Posted Feb 12, 2025 11:04 UTC (Wed)
by farnz (subscriber, #17727)
[Link]
There isn't such a pattern in idiomatic Rust, but it gets written when you're still thinking in terms of C++ std::mutex or similar facilities from other languages.
And that makes this a very important footgun to call out, since someone who learnt about concurrency using Rust won't even realise this is an issue, while someone who comes from another language will perceive it as Rust's promises around "fearless concurrency" being broken unless they've already been made aware of this risk - or ask a Rust expert to explain their bug.
Posted Feb 12, 2025 8:31 UTC (Wed)
by ralfj (subscriber, #172874)
[Link]
That said, I agree this is quite surprising, and I've been bitten by this myself in the past.
Posted Feb 11, 2025 12:22 UTC (Tue)
by TomH (subscriber, #56149)
[Link] (2 responses)
In C that's only technically allowed from C23 on but gcc won't complain unless -pedantic is used, though clang will.
Posted Feb 11, 2025 12:43 UTC (Tue)
by farnz (subscriber, #17727)
[Link] (1 responses)
And yes, I can (and have) worked around this with documentation comments, but those have a bad habit of getting stale, such that the doc comment refers to a parameter called "ignored_cpu_mask" that's been removed and replaced by a "active_cpu_mask"…
Posted Feb 11, 2025 17:35 UTC (Tue)
by mathstuf (subscriber, #69389)
[Link]
See this Rust issue as well for when `_arg` naming is unwelcome: https://github.com/rust-lang/rust/issues/91074
Posted Feb 11, 2025 14:37 UTC (Tue)
by gray_-_wolf (subscriber, #131074)
[Link]
int f(int *a, int *b) {
Sure, takes one extra line, but I find it more readable compared to the attributes.
Posted Feb 11, 2025 14:42 UTC (Tue)
by alx.manpages (subscriber, #145117)
[Link] (5 responses)
In the following program, I don't use argc. Simply don't name it.
```
Posted Feb 11, 2025 14:47 UTC (Tue)
by farnz (subscriber, #17727)
[Link] (4 responses)
Posted Feb 11, 2025 15:15 UTC (Tue)
by alx.manpages (subscriber, #145117)
[Link] (3 responses)
Then you'll need what others have mentioned.
Either a comment:
int
or a cast to void:
int
...
But Rust's _ is no documentation name either, so the C equivalent is indeed no name at all.
Posted Feb 11, 2025 15:18 UTC (Tue)
by farnz (subscriber, #17727)
[Link] (2 responses)
Posted Feb 11, 2025 15:30 UTC (Tue)
by alx.manpages (subscriber, #145117)
[Link]
Ahh, sorry, I missed that. How about this?
#define _ [[maybe_unused]]
int
The _() function already exists (for internationalization of strings), which is why I wouldn't shadow it with this macro, but if you don't use _(), when you could define the undescore to be [[maybe_unused]]. Or you could find another name that serves you.
Posted Feb 11, 2025 21:33 UTC (Tue)
by mathstuf (subscriber, #69389)
[Link]
Catching the mismatch
I just checked, because I recalled that the Standard reserves names beginning with underscore; but that only applies to globally visible names, not local ones, so it'd be fine to use a name beginning underscore to suppress the warning.
Catching the mismatch
Catching the mismatch
Catching the mismatch
It makes a huge difference to the usability of the warning. Ideally, you want your codebase to be warning-free after each reviewed patch is applied[1], so you want to be able to quickly suppress an unused variable warning when it's only going to exist until the next patch in the series is applied. You also want the name to stay around, so that if the code is usable as-is, with only part of the series applied, anyone using it has a clue about what value to fill in.
Catching the mismatch
Catching the mismatch
Wol
Catching the mismatch
One "interesting" surprise in Rust, however, is that in let bindings, there is a significant difference between let _ = … and let _foo = …; in the former, whatever you put in place of … is dropped immediately, while in the latter, it's dropped when _foo goes out scope (just as it would be if you used foo instead of _foo).
Underscores in Rust
As useful as this can sometimes be, e.g. when you really don't care about a Underscores in Rust
Result
, I wish clippy would suggest using drop(…)
instead of allowing let _ = …
.
Underscores in Rust
There's a lint to help with that - #![warn(let_underscore_drop)] will catch all the footgun cases where you've used let _ =. It won't stop you using it completely, just in the cases where there's a risk of changed behaviour due to the rules around drop timing.
Underscores in Rust
Underscores in Rust
Underscores in Rust
Wol
All usable languages have footguns - if they didn't, they'd also be blocking you from doing something useful.
Underscores in Rust
Underscores in Rust
It's very specifically a special case where you name a let binding _; you can't read it (_ isn't a real variable), and it drops anything bound to it immediately. _foo and foo behave in exactly the same way, however.
Underscores in Rust
Underscores in Rust
It's not quite as bad as you might think from the above example about a mutex, because a Rust-y mutex API (such as std::sync::Mutex) returns a guard object, which holds the mutex locked until it's dropped, and there's no way to get to the value protected by the mutex without having a guard object. (That is, if you want a mutex-protected structure, you write it as a mutex object that wraps the rest of the data, to make you deal with the mutex before getting to the data, as opposed to a structure with several members, one of which is the mutex that protects the other members, where you can easily bypass the mutex intentionally or unintentionally.) Usually this is implemented via the Deref/DerefMut traits, where you can treat the guard object as a smart pointer and do let mut guard = mutex.lock(); *guard += 1 or whatever, but you can also choose to design an API where the guard object has some sort of methods that return borrowed references to the data. The borrows cannot outlast the guard object, and the mutex is locked so long as the guard object remains in scope.
Underscores in Rust
Underscores in Rust
But there are rare exceptions:
https://docs.rs/tokio/1.43.0/tokio/sync/struct.Semaphore....
Underscores in Rust
Or in other words, there usually isn't a pattern of "get this RAII object and keep it around for its side effect while doing other stuff". Either you get the RAII object and actually reference it in the stuff you're doing, or you're using some non-RAII API like raw bindings to explicit lock() and unlock() calls where automatic drop isn't relevant.
Underscores in Rust
Catching the mismatch
The comment trick q3cpma points out would work, but I'd normally want to have a name in there because of the partially applied patch series problem, where I want people to be aware that this parameter does have a meaning.
Catching the mismatch
Catching the mismatch
Catching the mismatch
(void)b;
...
}
Unused parameters in C are easy
alx@devuan:~/tmp$ cat unused.c
#include <stdio.h>
int
main(int, char *argv[])
{
puts(argv[0]);
return 0;
}
alx@devuan:~/tmp$ gcc -Wall -Wextra unused.c
alx@devuan:~/tmp$ ./a.out
./a.out
```
OK, so how do I name the parameter for documentation purposes while marking it as unused?
Unused parameters in C are easy
Unused parameters in C are easy
main(int /*argc*/, char *argv[])
{...}
main(int argc, char *argv[])
{
(void) argc;
}
The cast-to-void is nice, when you want a name.
I'm sorry, you've confused me; why is Rust's use of _ignored_cpu_mask "no documentation name either"? All names starting with an underscore are flagged as explicitly unused, not just the underscore on its own. And the C things so far are all much more work than just adding an `_` to the name to indicate that it's currently deliberately unused - my contention is that if you don't make it really simple, people will prefer to turn off the entire warning rather than use the convention for unused parameters.
Unused parameters in C are worse than in Rust
Unused parameters in C are worse than in Rust
> All names starting with an underscore are flagged as explicitly unused, not just the underscore on its own.
main(_ int argc, char *argv[])
{...}
Unused parameters in C are worse than in Rust