
Google's effort to mitigate memory-safety issues

The Google Security Blog carries an announcement of a heightened effort to reimplement security-critical software in memory-safe languages. "The new Rust-based HTTP and TLS backends for curl and now this new TLS library for Apache httpd are an important starting point in this overall effort. These codebases sit at the gateway to the internet and their security is critical in the protection of data for millions of users worldwide."


Google's effort to mitigate memory-safety issues

Posted Feb 18, 2021 14:23 UTC (Thu) by ledow (guest, #11753) [Link] (60 responses)

How much of it is marked as "unsafe" Rust code?

Google's effort to mitigate memory-safety issues

Posted Feb 18, 2021 14:30 UTC (Thu) by ledow (guest, #11753) [Link] (57 responses)

I'll reply to myself:

Just the HTTP library alone has 60+ "unsafe" keywords in it.

Google's effort to mitigate memory-safety issues

Posted Feb 18, 2021 16:07 UTC (Thu) by TheGopher (subscriber, #59256) [Link] (25 responses)

Why would you need to do unsafe things in an encoding/decoding implementation?

Google's effort to mitigate memory-safety issues

Posted Feb 18, 2021 18:28 UTC (Thu) by jpab (subscriber, #105231) [Link] (24 responses)

The 'hyper' library does more than just raw encoding & decoding; it pulls many pieces together - e.g., it pulls in HTTP parsing libraries for both HTTP/1.1 and HTTP/2 and detects which to use for an incoming connection, it does the book-keeping for live connections, it provides an async/futures-based implementation of the protocols, etc.

For the core HTTP (not HTTP/2) 'decoding' part you want to look at the 'httparse' library (which hyper uses). httparse also includes uses of 'unsafe'. From some grepping (but no deep review), it looks like there are two main categories of unsafe use in httparse: (a) Use to access SIMD intrinsics for some fast scanning - you can't call these functions without using an unsafe block, just like you can't call out to many C library functions without an unsafe block, so I consider this normal and expected. It needs some care in review but assessing safety should be relatively easy and only require knowledge of the local piece of code. Not a big deal IMHO. (b) Some manual byte array scanning that skips some index bounds checks. That stuff worries me more. I haven't reviewed the code in any detail so I wouldn't want to make any overall claims about whether this is ok or not (it has been used a lot, it's got good tests, fuzzing, etc), but I would personally hope that this category of uses of unsafe can be reduced, even if they can't be eliminated entirely.
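For a feel of what category (b) looks like, here is a hand-written sketch (not httparse's actual code) of bounds-check skipping; the safety argument lives entirely in the comment, not in anything the compiler checks:

fn find_space(buf: &[u8]) -> Option<usize> {
    let mut i = 0;
    while i < buf.len() {
        // SAFETY: the loop condition guarantees i < buf.len(), so the
        // unchecked access cannot read out of bounds.
        let b = unsafe { *buf.get_unchecked(i) };
        if b == b' ' {
            return Some(i);
        }
        i += 1;
    }
    None
}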

As a point of comparison, the 'rustls' library, which implements the TLS protocol, does not use 'unsafe' at all (in the library itself - some of the libraries it depends on do use 'unsafe'!), and went through an independent security audit which it passed with flying colours (the report is available in the rustls git repository, https://github.com/ctz/rustls/tree/main/audit). I can't claim to be an expert, but I believe rustls is one of the strongest TLS implementations available, quite possibly *the* strongest. rustls achieves that strength because the developers have made it an overriding concern in the design and development of the library. As always, the priorities of individual developers still make a difference. Rust provides the *capability* of achieving a very high level of confidence and robustness.

Google's effort to mitigate memory-safety issues

Posted Feb 19, 2021 15:04 UTC (Fri) by hmh (subscriber, #3838) [Link] (23 responses)

I wonder about the supply-chain security of this thing? curl is no joke, it is used *everywhere*.

It can't itself be subject to library bloat; that would be much, much more dangerous than its already-mature C back-ends. There is no language safety against trojanned supply-chain attacks. And you can't secure a supply chain without full agreement of all involved.

Google's effort to mitigate memory-safety issues

Posted Feb 19, 2021 16:06 UTC (Fri) by sandsmark (guest, #62172) [Link] (20 responses)

Well, more or less the only reason I haven't bothered learning rust yet is because it seems like the reliance on cargo (npm style) makes it almost impossible to package rust projects properly, which in turn makes auditing the supply chain harder.

And then there's the fact that a lot of rust projects tell me to do a classic "curl http://foo.bar/baz | sudo sh"... (quasi-pun using curl not intended)

Google's effort to mitigate memory-safety issues

Posted Feb 19, 2021 21:14 UTC (Fri) by NAR (subscriber, #1313) [Link] (19 responses)

What do you do when you develop a C++ program, find a library that could be really useful, but it's not packaged on your favourite distribution (or worse - a too old version is packaged)?

Google's effort to mitigate memory-safety issues

Posted Feb 20, 2021 1:19 UTC (Sat) by LtWorf (subscriber, #124958) [Link] (18 responses)

I'm a different person, but the answer is "I package it".

It has happened a couple of times in python for me.

Google's effort to mitigate memory-safety issues

Posted Feb 20, 2021 12:20 UTC (Sat) by roc (subscriber, #30627) [Link] (17 responses)

For every version of every Linux distribution any of your users uses?

Google's effort to mitigate memory-safety issues

Posted Feb 21, 2021 8:17 UTC (Sun) by laf0rge (subscriber, #6469) [Link] (1 responses)

Packaging stuff for many distributions has become extremely easy ever since OBS (open build service) has been around. I'm running a couple of projects that use OBS for providing packages to a variety of distributions, including our own dependencies and, as needed, third-party dependencies.

Google's effort to mitigate memory-safety issues

Posted Feb 21, 2021 23:59 UTC (Sun) by roc (subscriber, #30627) [Link]

How do you distribute those packages?

Google's effort to mitigate memory-safety issues

Posted Feb 22, 2021 7:32 UTC (Mon) by LtWorf (subscriber, #124958) [Link] (14 responses)

For one big one.

Then, between derivatives and users on Arch and Gentoo usually being very proactive in making packages, the rest gets covered.

Anyway making a setup.exe or a osx .app is 100000x more time consuming than packaging for all linux distributions that exist.

Google's effort to mitigate memory-safety issues

Posted Feb 22, 2021 10:40 UTC (Mon) by mbunkus (subscriber, #87248) [Link] (2 responses)

> Anyway making a setup.exe or a osx .app is 100000x more time consuming than packaging for all linux distributions that exist.

You're completely wrong. In the almost 20 years I've been maintaining MKVToolNix now, I've probably spent quadruple the amount of time on Linux packaging as I've had to spend on Windows packaging, if not more. Even packaging for macOS, which is a major PITA, cost me a lot less time than having to package for Debian, Ubuntu & CentOS (combined).

And Linux is my primary development platform, not the other systems.

Google's effort to mitigate memory-safety issues

Posted Feb 23, 2021 7:28 UTC (Tue) by LtWorf (subscriber, #124958) [Link] (1 responses)

I also maintain something linux/windows (used to for osx as well). https://ltworf.github.io/relational/

Windows doesn't include the redistributables. For no reason, since the install is huge anyway.

Python links different redistributables.

Then you must be sure to install Qt.

Then you might want to install fonts if you use math symbols.

Then you must use a python2exe kind of tool that detects the dependencies and packages them.

As of last summer there are a number of those, but none of them actually works for me, so my installer embeds the Microsoft redistributable and the Python installer, launches them, and then runs a shell script that uses pip to download all the dependencies.

Generally every time I do a release, something that used to work no longer works so the process needs to be tweaked and can't be automated.

Compare this to making a control file with the list of packages that need to be installed and an install target in the makefile that copies the various files where they need to be.

Google's effort to mitigate memory-safety issues

Posted Feb 23, 2021 13:16 UTC (Tue) by mathstuf (subscriber, #69389) [Link]

Well, Python is its own nightmare, yes, I will grant you that. We just build our own Python and ship it (without pip, because we don't want to ship libssl due to the urgency that bugs in it cause) on each platform. That does save a lot of trouble. And yes, we also bundle the redistributables, but CMake/CPack make that part easy.

Google's effort to mitigate memory-safety issues

Posted Feb 22, 2021 13:11 UTC (Mon) by Cyberax (✭ supporter ✭, #52523) [Link] (3 responses)

> Anyway making a setup.exe or a osx .app is 100000x more time consuming than packaging for all linux distributions that exist.
That's... not even close to the truth. Even for the simplest of packages a Windows installer is way easier than any Linux distro packaging. It's even easier for macOS, especially if you are OK with the App Store.

The whole Docker mess happened exactly because it's close to impossible to make reliable native Linux packages.

Google's effort to mitigate memory-safety issues

Posted Feb 22, 2021 13:50 UTC (Mon) by mathstuf (subscriber, #69389) [Link] (2 responses)

Agreed. Windows is "easy" to package an app for because you just toss a pile of DLLs in a directory and call it a day. SDKs are harder, but that's something completely different.

Apple is a pain because if you're not using Xcode, you're off in the weeds of undocumented directory layouts and such. RPATH is also asinine (IMO) on the platform. SDKs are also a nightmare without Xcode to just do it for you.

Linux is hard to make a single package for (we don't have the testing capacity for per-distro, nevermind per-distro-release binaries) because you have to decide what is a system package (X, OpenGL, libc) and what you need to bundle (just about everything else). We also just make tarballs so as to not require root permissions just to test a new build. Soname shifts are a real pain to deal with too. We just end up building on CentOS 7 (mainly for age and because devtoolset is a nice way to get new compilers without requiring a newer stdlib) so as to support any platform with a newer libc around.

Google's effort to mitigate memory-safety issues

Posted Feb 23, 2021 7:30 UTC (Tue) by LtWorf (subscriber, #124958) [Link] (1 responses)

> Agreed. Windows is "easy" to package an app because you just toss a pile of DLLs in a directory and call it a day. SDKs are harder, but that's something completely different.

Try to do it with a python app.

On Linux I just list the dependencies; on Windows I actually have to provide them, testing on a clean system requires a VM rather than a chroot, and then licensing issues happen. It's so much harder to automate.

Google's effort to mitigate memory-safety issues

Posted Feb 23, 2021 13:17 UTC (Tue) by mathstuf (subscriber, #69389) [Link]

*shrug* We bundle everything on Linux too because we can't make a package for each platform *and* maintain it. So even Linux gets a tarball that we expect to just be extracted wherever.

Google's effort to mitigate memory-safety issues

Posted Feb 22, 2021 15:07 UTC (Mon) by rahulsundaram (subscriber, #21946) [Link] (6 responses)

> Anyway making a setup.exe or a osx .app is 100000x more time consuming than packaging for all linux distributions that exist.

No, exactly the opposite. I have done distro packaging for about two decades, both for the distribution itself and for various jobs, and Linux is by far more time consuming. It is a wild, wild jungle of all sorts of different library versions, packaging formats, etc. This is in part what Docker solves, but that isn't suitable for everyone either. The equivalent for desktop applications would be something like Flatpak, and you essentially have to bundle all your major dependencies (outside of the Flatpak runtimes) to be able to pull that off.

Google's effort to mitigate memory-safety issues

Posted Feb 23, 2021 7:33 UTC (Tue) by LtWorf (subscriber, #124958) [Link] (5 responses)

> It is a wild wild jungle of all sort of different library versions,

Target stable distributions.

Also for packaging for distributions I meant put it in the distribution so they automatically recompile whenever that's needed.

> packaging formats etc

Packaging formats normally expect a standard Makefile, and have a bit of overhead for metadata. Unless you are doing some overly complicated system software or had a terrible build from before, making a package is just about filling in some fields on a few text files.

Google's effort to mitigate memory-safety issues

Posted Feb 23, 2021 8:19 UTC (Tue) by Cyberax (✭ supporter ✭, #52523) [Link] (2 responses)

> Target stable distributions.
Like RHEL 6?

Versions differ by quite a lot even across "stable" distributions.

Google's effort to mitigate memory-safety issues

Posted Feb 23, 2021 9:18 UTC (Tue) by zdzichu (subscriber, #17118) [Link] (1 responses)

Ha ha, RHEL 6 reached end of life in November last year. It is not stable, it's dead.
If recipients of your software have money to pay RH for Extended Life Cycle support beyond EOL, then they should find money to pay you extra for packaging software for an obsolete platform.
At the same time, if they're using a dead platform for misguided "stability" reasons, then they do not want any fresh software on it, including yours.

Google's effort to mitigate memory-safety issues

Posted Feb 23, 2021 10:35 UTC (Tue) by Cyberax (✭ supporter ✭, #52523) [Link]

Or maybe RHEL 7 then (which is still very much alive)? Enjoy the 8 year old libraries!

Never mind that the official repos don't actually have that many libraries and you have to vendor them.

Google's effort to mitigate memory-safety issues

Posted Feb 23, 2021 10:26 UTC (Tue) by rahulsundaram (subscriber, #21946) [Link]

> Target stable distributions.

> Also for packaging for distributions I meant put it in the distribution so they automatically recompile whenever that's needed.

There isn't just one stable distribution but plenty, depending on how you define it, and they all might have consistent (within their distro lifecycle) but different library versions. And no, you cannot always put it in the distribution: in the case of commercial distributions, they ship what they can support, your licensing isn't compliant with the distribution's requirements, or you just have a different lifecycle to the distro (you know, the thing that makes the distribution "stable").

> making a package is just about filling in some fields on a few text files

Sure if you have something very simple to package. Lucky you.

Google's effort to mitigate memory-safety issues

Posted Feb 23, 2021 13:10 UTC (Tue) by roc (subscriber, #30627) [Link]

Lots of rr users aren't using stable distributions. We don't get to dictate what distros our users use.

Google's effort to mitigate memory-safety issues

Posted Feb 19, 2021 16:31 UTC (Fri) by jpab (subscriber, #105231) [Link]

Yeah, supply-chain security is an interesting question. Both the packages themselves and the common infrastructure at crates.io must be a very attractive target.

I think at a technical level cargo provides reasonably good facilities to mitigate supply chain risk. It follows the now-commonplace pattern of keeping a lock file that includes a content hash for each dependency - that kind of thing. You can of course also make sure to use local sources for all your dependencies (ie, vendor them). I think at the level of "can the curl developers set up their build system etc so that they have strong assurances about what they're building", then I would say the answer is yes, the capability is there. I don't know anything about curl's build process in general so I don't know how much integration pain there is there.
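For illustration, a Cargo.lock entry looks roughly like this (a hypothetical example; the checksum is the SHA-256 of the packaged crate, and cargo refuses to build if a downloaded dependency no longer matches it):

[[package]]
name = "httparse"
version = "1.3.5"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "<sha256 of the published .crate file>"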

I think the social side is much more interesting - does the Rust community have a good culture w.r.t. dependency management, auditing, vetting, etc. I certainly see a lot of packages on crates.io have quite a lot of dependencies in total (cargo makes it easy, so it happens). I'm not in general too worried about the total counts/dep tree size. I'm more worried about the level of review and auditing that happens - or doesn't happen - when dependencies are added, and the difficulty of re-reviewing things given ongoing development across a whole dependency tree.

It's clear that people in the community are taking this seriously though. In particular, check out 'crev', a distributed review project: https://web.crev.dev/rust-reviews/
crev gives me a lot of hope that the supply chain risks can be mitigated effectively. I think to some extent it will come with maturity - as more of the most commonly used libraries reach a state where changes are small and infrequent (because the library is basically "done"), making review status relatively easy to maintain, and as participation in distributed reviews increases, and review tooling improves and so on. Of course it's also possible for attackers to just provide their own bogus reviews, so... you've got to also somehow decide which reviewers to trust. It's not trivial :-)

With regard to the big vs. small dependency count question, which often comes up as an indicator of supply chain risk: I think just looking at number of packages in the dependency tree is not quite the right thing: in many cases there are groups of packages all maintained by the same people as part of the same project, they've just chosen to split up the packages for technical or organizational reasons. For example, 'hyper' and 'httparse' which I mentioned above are separate packages but they have the same developer/maintainer and they kind of go together - in terms of supply chain risk it's clearly not 2x the risk just because it's split into two packages.

Google's effort to mitigate memory-safety issues

Posted Mar 7, 2021 3:38 UTC (Sun) by ssmith32 (subscriber, #72404) [Link]

From what I've heard of the company, since this is Google, pretty much the entire supply chain will be in house. They do have a tendency towards vendoring and monorepos... We'll end up with a 50 MB 'gurl' executable you can install as a flatpak

Google's effort to mitigate memory-safety issues

Posted Feb 18, 2021 16:09 UTC (Thu) by matthias (subscriber, #94967) [Link]

> Just the HTTP library alone has 60+ "unsafe" keywords in it.

And most of them are in the foreign function interface, where they merely encode that it is unsafe to trust a pointer that is received from C code (or other unsafe languages). The rust compiler cannot verify that the pointer points to a valid structure and thus the access has to be declared unsafe.

If you access this library from rust, this part is dead code. No rust code will use the FFI to access the library. Thus the FFI will not be linked to the rust executable at all. There are just a few unsafe keywords outside of the FFI.
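To illustrate the pattern (a minimal hypothetical sketch, not hyper's actual FFI code; the name example_task_free is made up): the pointer arrives from C, and the unsafe block is where the programmer asserts the things the compiler cannot check:

pub struct Task { /* fields omitted */ }

#[no_mangle]
pub extern "C" fn example_task_free(task: *mut Task) {
    if task.is_null() {
        return; // tolerate NULL, like free() does
    }
    // SAFETY: the C caller must pass a pointer previously handed out by
    // this library (via Box::into_raw) and must not use it afterwards.
    // The compiler cannot verify that, which is exactly what `unsafe` marks.
    unsafe { drop(Box::from_raw(task)) };
}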

Google's effort to mitigate memory-safety issues

Posted Feb 18, 2021 17:23 UTC (Thu) by lunaryorn (subscriber, #111088) [Link] (29 responses)

At least you now know which parts of the code are unsafe (by Rust's definition of safety). I think this is a step forward compared to C's "everything is unsafe" approach.

Google's effort to mitigate memory-safety issues

Posted Feb 18, 2021 19:53 UTC (Thu) by quanstro (guest, #77996) [Link] (11 responses)

i have found in the code i deal with that rust programs, while a fraction of total code, are the dominant source of significant bugs.

"memory safe" is not a panacea, and i think that the unfamiliar patterns in rust, combined with the "safe" mantra induce incorrect thinking about safety with rust.

Google's effort to mitigate memory-safety issues

Posted Feb 18, 2021 22:12 UTC (Thu) by mathstuf (subscriber, #69389) [Link] (10 responses)

Do you have more than spooky stories to tell around the campfire here? Maybe some details about the kinds of bugs you tend to encounter? Are you a web developer? Kernel hacker? I'd expect different answers for different fields at least.

Google's effort to mitigate memory-safety issues

Posted Feb 19, 2021 7:20 UTC (Fri) by mti (subscriber, #5390) [Link] (9 responses)

I have seen similar things with respect to Java. When moving a big, complex product from C to Java, both the development time and the number of bugs increased.

Not what I would have expected!

Quite a few of the bugs are related to timeouts, locks and so on. So part of the problem might be moving from single-threaded to multi-threaded design.

But mostly I think the problem is not related to the programming language at all, but more to changes in the organization, changes in testing, lack of a system architect etc. Maybe a bit of "second system effect".

It would be interesting to make a deeper analysis ...

Google's effort to mitigate memory-safety issues

Posted Feb 19, 2021 13:58 UTC (Fri) by mathstuf (subscriber, #69389) [Link] (8 responses)

Timeouts and locks are problems no language (I know of) has tackled as a compilation guarantee. Maybe some of the verified languages like Idris or the like, but nothing "mainstream".

> But mostly I think the problem is not related to the programming language at all, but more to changes in the organization, changes in testing, lack of a system architect etc. Maybe a bit of "second system effect".

This is certainly more likely. The lack of testing suites in older projects doesn't help for sure.

FWIW, we had this nasty Python script that was handling our git checks upon pushes to MRs and code repositories. The thing was, quite frankly, terrible because it was borne out of "need to get this deployed before we can migrate to GitLab" and was based on what we had running before for Gerrit. Rewriting it in Rust allowed us to hammer down our types, stop fighting Unicode issues, library-ify everything relevant, get a proper test suite up and running, etc. The thing has been rock solid except for things where you're just going to have issues in any language: GraphQL schemas changing, JSON payloads suddenly having `null` where you've never observed them before, etc. But Rust's error handling meant that it shuffled such things off into a place where we got alerts instead of having a `KeyError` bubble up into main (because defensive Python programming is pretty terrible and verbose) and take the thing down during the weekend.
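A minimal sketch of that pattern, assuming serde_json and a made-up author.login field: the missing/null case becomes an ordinary error value that can be routed to an alert instead of blowing up like a KeyError:

use serde_json::Value;

fn author_login(payload: &Value) -> Result<String, String> {
    payload
        .get("author")                 // Option: the key may simply be absent
        .and_then(|a| a.get("login"))
        .and_then(Value::as_str)       // None if the field is null or not a string
        .map(|s| s.to_string())
        .ok_or_else(|| "payload is missing author.login".to_string())
}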

Google's effort to mitigate memory-safety issues

Posted Feb 20, 2021 1:23 UTC (Sat) by LtWorf (subscriber, #124958) [Link] (2 responses)

> stop fighting Unicode issues

Well it's a bit your fault for still using python2…

Google's effort to mitigate memory-safety issues

Posted Feb 20, 2021 3:08 UTC (Sat) by Cyberax (✭ supporter ✭, #52523) [Link] (1 responses)

Python2 actually has no Unicode issues. It works fine. In Py3 I have to put encode/decode in tons of places that mix real world data and the happy-land of Py3 Unicode strings.

With Rust or Go it's back to Py2 level of Unicode usability.

Google's effort to mitigate memory-safety issues

Posted Feb 20, 2021 3:50 UTC (Sat) by mathstuf (subscriber, #69389) [Link]

In this case, it was Python2 (the code died 5 years ago now). We were stuck because of dependencies not supporting Python3. The problem was that 'literal_str' % unicode died a fiery death and we had to catch each field individually because we didn't have a test suite or anything to check our types. Not the greatest code, but the utilities around it were in Python2. The replacement might have been in Python3 if there had been a useful inotify library available at the time (without dragging Twisted around).

Google's effort to mitigate memory-safety issues

Posted Feb 20, 2021 12:23 UTC (Sat) by roc (subscriber, #30627) [Link] (4 responses)

Unlike Java, Rust does statically guarantee no data races in safe code.

Google's effort to mitigate memory-safety issues

Posted Feb 20, 2021 14:51 UTC (Sat) by mathstuf (subscriber, #69389) [Link] (3 responses)

True, but I also read deadlocks in "locking problems" which Rust doesn't aim or claim to solve.
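For example (a minimal sketch): this is 100% safe Rust, it compiles without complaint, and it will (very likely) deadlock at runtime because the two threads take the locks in opposite orders:

use std::sync::{Arc, Mutex};
use std::thread;
use std::time::Duration;

fn main() {
    let a = Arc::new(Mutex::new(0u32));
    let b = Arc::new(Mutex::new(0u32));

    let (a2, b2) = (Arc::clone(&a), Arc::clone(&b));
    let t = thread::spawn(move || {
        let _b = b2.lock().unwrap();              // worker: takes B first...
        thread::sleep(Duration::from_millis(50));
        let _a = a2.lock().unwrap();              // ...then waits for A
    });

    let _a = a.lock().unwrap();                   // main: takes A first...
    thread::sleep(Duration::from_millis(50));
    let _b = b.lock().unwrap();                   // ...then waits for B: deadlock
    t.join().unwrap();
}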

Google's effort to mitigate memory-safety issues

Posted Feb 20, 2021 18:32 UTC (Sat) by ibukanov (subscriber, #3942) [Link]

Deadlocks do not violate memory safety.

Google's effort to mitigate memory-safety issues

Posted Feb 21, 2021 5:45 UTC (Sun) by roc (subscriber, #30627) [Link]

That's true. Fortunately deadlocks are typically a lot easier to debug because the system is in a stuck state that usually shows the problem directly.

compile time deadlock guarantees

Posted Feb 22, 2021 12:03 UTC (Mon) by tim_small (guest, #35401) [Link]

Although Rust doesn't aim to guarantee deadlock free execution at the language level, it's possible to construct such systems on top of Rust using the type system. For instance, in the (bare metal) embedded space, the RTIC framework https://rtic.rs/ features "Deadlock free execution guaranteed at compile time".
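As a rough illustration of the general idea only (this is not RTIC's actual API): the type system can be used to force a fixed lock-acquisition order, so the classic two-lock deadlock cannot even be written:

use std::sync::{Mutex, MutexGuard};

pub struct AData(pub u32);
pub struct BData(pub u32);

pub struct A(Mutex<AData>);
pub struct B(Mutex<BData>);

impl A {
    pub fn new(v: u32) -> Self { A(Mutex::new(AData(v))) }
    pub fn lock(&self) -> MutexGuard<'_, AData> { self.0.lock().unwrap() }
}

impl B {
    pub fn new(v: u32) -> Self { B(Mutex::new(BData(v))) }
    // B can only be locked by a caller that proves it already holds A,
    // so every thread acquires the locks in the same A-then-B order.
    pub fn lock<'a>(&'a self, _holding_a: &MutexGuard<'_, AData>) -> MutexGuard<'a, BData> {
        self.0.lock().unwrap()
    }
}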

Google's effort to mitigate memory-safety issues

Posted Feb 19, 2021 9:56 UTC (Fri) by ledow (guest, #11753) [Link] (16 responses)

I think that provides a false sense of security, myself.

It invites people working on the code to assume that problems can only happen in the "unsafe" parts, when those unsafe parts can easily and silently destroy safety guarantees elsewhere, and it also tricks people into spending far less time checking the "safe" parts.

My question is really: Are those unsafe parts necessary? In an HTTP library. At what point do you need to trust or manipulate an outside pointer in an HTTP library? To my knowledge, there's nothing in HTTP that would require that. Yet there are instances of unsafe in code that is inside a third-party project, that's sucked into this "safe" library that's pitching itself explicitly as being "safe".

I think that's disingenuous.

And looking at the code for those, there seems to be very, very little explanation as to why unsafe is used (I'll take the comments on the FFI interface from another poster here onboard, but there's still over a dozen instances of unsafe where there's no explanatory comment of why unsafe is necessary).

I'll point out one example where it is documented:

// this is safe because the only operations permitted on this data structure require exclusive
// access or ownership
unsafe impl<T: Send> Sync for SyncWrapper<T> {}

Now, I admit this is only a cursory-glance kind of analysis, but this is a comment on the bottom of a file explaining that this is "safe" based on the assumption that the entire code above never changes basically because "we don't do things like that". That doesn't inspire confidence, to me, and this appears to be (again, I'm no expert in this code), some kind of thread-mutex wrapper - which to my mind isn't the kind of places we should be playing such games.

I'd be looking for an explanation why that unsafe is necessary, and some assertion throughout the file that changing the "operations permitted" on this are somehow warned against quite drastically.

Google's effort to mitigate memory-safety issues

Posted Feb 19, 2021 13:37 UTC (Fri) by matthias (subscriber, #94967) [Link] (15 responses)

I just grepped for unsafe in the hyper library.

92 uses of unsafe in total
76 in FFI interface (documenting that trusting calls from C code is unsafe)
4 because of accessing low level socket functions (this is ffi in the other direction, calling into the OS which is non-rust)
2 dealing with polling io into buffers which are shared between OS and userspace
1 io related that I did not understand immediately
3 in tests and benchmarks (doing some mock up, not in production code)
5 to avoid double initialization (these could be avoided, but would have a performance hit, it is documented why this is safe)

and this one:
// this is safe because the only operations permitted on this data structure require exclusive
// access or ownership
unsafe impl<T: Send> Sync for SyncWrapper<T> {}

It is clearly documented, why this is safe. The structure is a wrapper that has three operations defined: a constructor, one access function and a "destructor". The access function and the destructor can only be called when there is exclusive access (statically verified by the compiler). There is not only the comment at the bottom of the file, but a big comment for every method explaining why this is safe plus a big comment at the start of the file explaining how this structure works. The whole file has roughly 10 lines of code and roughly 100 lines of documentation. It is impossible to miss the requirements on operations of this structure.

In my view this is an excellent example of how unsafe is intended to work. The developer needs something that the compiler cannot verify to be safe. This fact is extensively documented, so that a reviewer can verify that the code is indeed safe.

Also it is clearly documented why this is necessary: to avoid mutexes for structures where there can only be exclusive access. The wrapper is written in a way that lets the compiler verify that the access is exclusive. And yes, performance matters. If this library had crappy performance, then most people would use the C version, which has thousands of unsafe pointer accesses, most of them without any documentation of why they are safe.

The Rust habit of documenting every unsafe access allowed me to look at this in a few minutes. A careful validation will take a bit longer. How long would this take for C code, i.e., checking for every pointer access that the pointer cannot be null, that it points to initialized memory, and that it points to the correct type of structure?
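Roughly, the wrapper has this shape (a simplified sketch, not the exact source): every way to reach the inner value either takes &mut self or consumes the wrapper, which is why sharing a &SyncWrapper<T> between threads cannot cause a data race:

pub struct SyncWrapper<T>(T);

impl<T> SyncWrapper<T> {
    pub fn new(value: T) -> Self {
        SyncWrapper(value)
    }

    // &mut self: the borrow checker guarantees exclusive access here.
    pub fn get_mut(&mut self) -> &mut T {
        &mut self.0
    }

    // Consuming self likewise requires ownership, i.e. exclusivity.
    pub fn into_inner(self) -> T {
        self.0
    }
}

// SAFETY: a shared &SyncWrapper<T> exposes no way to touch the inner T,
// so handing such references to other threads cannot race.
unsafe impl<T: Send> Sync for SyncWrapper<T> {}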

Google's effort to mitigate memory-safety issues

Posted Feb 22, 2021 19:13 UTC (Mon) by nknight (subscriber, #71864) [Link] (14 responses)

If the compiler can verify it’s safe, why is it unsafe?

Rust thinks it’s bringing us into a magical era of safety. What it’s really doing is pushing an unstable language (whose devs are even actively hostile to standardization and alternate implementations, great open source cred there) with incomplete tooling claiming to have solved problems that have been researched in both academia and business for decades. It hasn’t. It’s just another take on the same old themes with its own deep flaws.

Google's effort to mitigate memory-safety issues

Posted Feb 22, 2021 20:13 UTC (Mon) by mathstuf (subscriber, #69389) [Link] (11 responses)

It is marked unsafe because the compiler cannot verify that it is safe (at least Rust's definition of safe). What the comments around it are doing is explaining why the code is, in fact, upholding the rules that Rust expects within the codeblock that the compiler cannot fully reason about.

Also, evidence is needed for "actively hostile" towards standardization (either in terms of a spec or some ISO-like thing) or alternative implementations. There are discussions and efforts towards making a spec, but at different levels of the stack. mrustc exists. Where is there any indication that devs discouraged such a thing?

Incomplete tooling? What is missing?

Google's effort to mitigate memory-safety issues

Posted Feb 22, 2021 22:45 UTC (Mon) by nknight (subscriber, #71864) [Link] (10 responses)

> the compiler cannot verify that it is safe

Then it's not actually safe, and could be silently broken in the future. "Unsafe" may be the worst design own-goal I've ever seen. Even just its name. Unsafe is Rust's monad: Inscrutably named, gets you harassment if you actually use it (cf. actix-web), and yet absolutely necessary in some form. Even just naming it "noborrowcheck" would have *drastically* improved the situation.

(Rust, Haskell, Postgres, Java, I'm sure there are others. In my mind the pattern is well-established: Ecosystems that think they're somehow purer attract masses of the most vocally toxic, violent people. Almost like all of human history. Shocking.)

> Also, evidence is needed for "actively hostile" towards standardization

From a discussion just last year:

https://news.ycombinator.com/item?id=24528193

https://news.ycombinator.com/item?id=25236525

(Ironically, I could agree with the author of these comments on many of his points, but overall it's utterly tone-deaf and thinks Python's mistakes somehow justify Rust's. Forest, trees.)

> There are discussions and efforts towards making a spec

Yes yes, discussions and efforts. I've been hearing that since at least 2014 before 1.0. Where is it? The docs team was even disbanded last year, that's how much Rust cares about having useful docs, even for its one non-neutered implementation [0], and the gaslighting by Rust partisans has only gotten more intense since then. :(

As a practical, real-world matter utterly divorced from the propaganda, Rust has adopted the Python model of standardization, learned nothing from Python's failures in the area, and proceeded to make it *worse* by cranking the churn up to 11. The ecosystem breaks every six weeks, I guess because Rust developers desperately need to create and consume shiny new features of dubious (or even negative) benefit.

Rust presents as a disco full of hyperactive ferrets. Unless you have a team of your own hyperactive ferrets, you're going to struggle to keep up with the churn. Down some meth and maybe you'll keep up for a while, but then the crash comes and you realize boring languages have staying power in the real world for a reason.

Until this mess actually manages to produce a spec, or a radical change happens in Rust's development practices, I won't believe one is *ever* coming.

Even once a standard comes, if it does, I'll be waiting a while to see if I can actually use it in any meaningful sense. I fear it will look much like Microsoft's standardization of C#: dump a document on a standards body that becomes obsolete six weeks later.

In almost any other language, even bus factor 1 projects usually go years without API breakage. In Rust, if I'm not using the latest compiler, updated crates won't work at all, and even if I am, API breakage and a disregard for anything more than six weeks old means that I more often than not have to modify my code, even if all I needed was a security patch. None of this happens to me in any other language I have ever used.

The entire ecosystem is a complete clusterfuck, and it starts right at the top. The release cadence is too fast, the barrier for new features too low, and the appetite for documenting anything before it becomes obsolete again appears nonexistent.

I don't always agree with their particular decisions, but WG14 has more or less internalized all of this for C, as have the developers of D, Go, and even Python after getting bitten in the ass by it enough times. New features have a high barrier because people *will* use them and if they're wrong they *will* make the ecosystem worse.

Rust is refusing to learn the lessons of the past, to its and everyone else's detriment. Rust has all the problems of every other programming language, has invented a few new problems, and if a house burns down, they just rebuild it from scratch without addressing the actual cause of the fires.

The sad irony is, the projects Rust is needed for the most are the projects that can't use it, due to instability.

> Incomplete tooling? What is missing?

Huge portions of a reasonable standard library, a sane, fast compiler, a sane, fast build system, a sane, fast package management system, and, apparently, the ability to determine what is or isn't "safe" in the language. Hmm... Maybe a spec would help with that last one? Oh well, probably not important...

[0] I consider mrustc more of a meme/joke than anything. Rust without a borrow checker is itself churn for no gain.

Google's effort to mitigate memory-safety issues

Posted Feb 23, 2021 0:02 UTC (Tue) by mathstuf (subscriber, #69389) [Link] (7 responses)

> gets you harassment if you actually use it (cf. actix-web), and yet absolutely necessary in some form.

actix-web was using unsafe to do things that Rust's compiler cannot check and then not guaranteeing those things once the block was done. That's just asking for crashes and the like. The compiler can't check all code. That's just a fact of life. There are documented expectations of Rust code that unsafe blocks *still must adhere to*. But since the compiler can't check it, you're left with manual checks and comments documenting the logic. Even if it is, as you say, "a meme", it is *far far* better than what C or C++ even attempt to offer.

> Even just naming it "noborrowcheck" would have *drastically* improved the situation.

That's a terrible name. There is more than just borrow checking that is relaxed in an unsafe block. Dereferencing a raw pointer for example.

> Ecosystems that think they're somehow purer attract masses of the most vocally toxic, violent people.

I don't know what corner of the ecosystem you've interacted with, but it certainly hasn't been seen in the larger community. Yes, there are folks who are vehemently "RIIR", but I've mainly just seen them on Reddit and not part of developer circles (AFAIK).

> As a practical, real-world matter utterly divorced from the propaganda, Rust has adopted the Python model of standardization, learned nothing from Python's failures in the area, and proceeded to make it *worse* by cranking the churn up to 11. The ecosystem breaks every six weeks, I guess because Rust developers desperately need to create and consume shiny new features of dubious (or even negative) benefit.

I've got a 50k+ SLOC project that's about 4 or 5 years old now. Nothing the Rust *compiler* has done has ever broken it for non-CI considerations (the lockfile format change broke the "mindeps" build, but that is mostly academic at this point anyways). *Dependencies* have migrated to new features, but the last big thing was async (and it is understandable that crates doing async stuff would start to use it). The biggest changes made these days are for clippy fixes, which is completely voluntary based on our CI setup. For comparison, Python 3.8 *still* broke things even now (os.add_dll_directory). 3.9 looks like it did too (changing structure members in PyTypeObject). I don't think the comparison is even in the same ballpark.

> Until this mess actually manages to produce a spec, or a radical change happens in Rust's development practices, I won't believe one is *ever* coming.

So I take it you don't use Python, Ruby, or Perl either?

> In Rust, if I'm not using the latest compiler, updated crates won't work at all, and even if I am, API breakage and a disregard for anything more than six weeks old means that I more often than not have to modify my code, even if all I needed was a security patch. None of this happens to me in any other language I have ever used.

Minimum versions across crates I maintain still sit at 1.31, a couple at 1.32, a 1.36, one at 1.37, 1.40, 1.42, then the top-level crates are at 1.45 (306 entries in Cargo.lock if you think they're "tiny"). Certainly not an indicator that one needs to update "every six weeks".

> the barrier for new features too low

Ha! Have you seen what it takes for features to land?

> The sad irony is, the projects Rust is needed for the most are the projects that can't use it, due to instability.

My experience is directly counter to this, so I don't know what to tell you.

> Huge portions of a reasonable standard library, a sane, fast compiler, a sane, fast build system, a sane, fast package management system, and, apparently, the ability to determine what is or isn't "safe" in the language.

I don't know what you want from a stdlib, but if it's Python's mess, no thank you. A fast compiler is underway (I much prefer a *good* compiler to a fast, crappy compiler). Cargo gets performance improvements and I don't have issues with it. Cargo is also the sanest package management system I've ever worked with (maybe Nix is better because it handles more, but that's not exactly cross-platform either). As for determining what is safe, the compiler checks it, and if you're worried about your deps, cargo-geiger and cargo-crev are there to help you.

Google's effort to mitigate memory-safety issues

Posted Feb 23, 2021 1:58 UTC (Tue) by nknight (subscriber, #71864) [Link] (6 responses)

... Wow. Defending the shit that happened with actix-web is just beyond the pale. I'm done here.

Google's effort to mitigate memory-safety issues

Posted Feb 23, 2021 5:18 UTC (Tue) by Cyberax (✭ supporter ✭, #52523) [Link] (4 responses)

Wow, it's funny how Rust-haters latch on to some small issue and ignore everything else.

Google's effort to mitigate memory-safety issues

Posted Feb 23, 2021 22:53 UTC (Tue) by nknight (subscriber, #71864) [Link] (3 responses)

Death threats are a “small issue”. Noted. I will let everyone know this is Rust’s view, that criminal behavior is welcome and encouraged.

Google's effort to mitigate memory-safety issues

Posted Feb 24, 2021 21:52 UTC (Wed) by mathstuf (subscriber, #69389) [Link]

Woah, I knew there was friction over at actix, but death threats are news to me. Links? Feel free to email them (the address is easy to find based on the handle) if you don't want to publicize them.

Google's effort to mitigate memory-safety issues

Posted Feb 25, 2021 22:38 UTC (Thu) by rahulsundaram (subscriber, #21946) [Link]

> I will let everyone know this is Rust’s view, that criminal behavior is welcome and encouraged.

You would be knowingly and actively misleading them, since you are well aware that no one person represents "Rust's view"; such a thing does not exist.

Google's effort to mitigate memory-safety issues

Posted Mar 13, 2021 8:02 UTC (Sat) by nilsmeyer (guest, #122604) [Link]

To me you come across as pretty hostile; others behaving in a similar way is no excuse for that. No one is condoning death threats, and accusing someone of this is pretty rotten behaviour that serves to further poison the discussion.

I think death threats are a matter for law enforcement.

Google's effort to mitigate memory-safety issues

Posted Feb 23, 2021 10:35 UTC (Tue) by farnz (subscriber, #17727) [Link]

Fundamentally, actix-web was getting performance by relying on Undefined Behaviour always working out in its favour; in Rust, UB is only possible if you use the unsafe keyword.

Did the author deserve the flak they got for that? No - they were no worse than many C and C++ programmers in that respect, and demonstrated that if you take the same shortcuts C++ programmers often do, you can get as good a result from Rust. Was it a good thing for the ecosystem as a whole? Also no, because the selling point of Rust is that it can perform as well as (or better than) C and C++ but with less effort needed to avoid UB.

Google's effort to mitigate memory-safety issues

Posted Feb 23, 2021 6:27 UTC (Tue) by roc (subscriber, #30627) [Link]

> In Rust, if I'm not using the latest compiler, updated crates won't work at all, and even if I am, API breakage and a disregard for anything more than six weeks old means that I more often than not have to modify my code, even if all I needed was a security patch.

I have a 200K line, 5 years old Rust project with 700 dependent crates and 75 crates of its own, and our experience is nothing like this. We do stick to relatively recent compilers, we do update dependency minor versions a lot, and we almost never have to modify our code due to compiler changes or dependency API breaks.

Google's effort to mitigate memory-safety issues

Posted Feb 25, 2021 22:22 UTC (Thu) by apoelstra (subscriber, #75205) [Link]

> Even just naming it "noborrowcheck" would have *drastically* improved the situation.

The borrow checker operates exactly the same inside an unsafe block as outside one. "unsafe" has nothing to do with the borrow-checker.

I agree that "unsafe" isn't a great name and would have preferred "trustme", but I think the existing name is close enough to having the right connotation.
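For example (a tiny sketch): taking a raw pointer is ordinary safe code; it's only the dereference that needs the unsafe block, and the borrow checker keeps running inside it:

fn main() {
    let x = 42u32;
    let p = &x as *const u32; // creating a raw pointer is safe
    // Reading through it is not, because the compiler can no longer
    // prove the pointer is valid - that is what `unsafe` is for.
    let y = unsafe { *p };
    assert_eq!(y, 42);
}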

Google's effort to mitigate memory-safety issues

Posted Feb 23, 2021 6:23 UTC (Tue) by roc (subscriber, #30627) [Link] (1 responses)

> claiming to have solved problems that have been researched in both academia and business for decades. It hasn’t.

It really has solved the problem "fast, safe systems language with no GC that is actually attractive to developers", which really was researched in academia (less so in business) for decades.

Google's effort to mitigate memory-safety issues

Posted Feb 25, 2021 9:04 UTC (Thu) by marcH (subscriber, #57642) [Link]

It's actually nice to start with the most obviously wrong statement: it saves time paying attention to the more complex assertions that follow.

Google's effort to mitigate memory-safety issues

Posted Feb 18, 2021 14:51 UTC (Thu) by jpab (subscriber, #105231) [Link] (1 responses)

You can use cargo-geiger (https://crates.io/crates/cargo-geiger) to get a report of how much 'unsafe' is used across a whole dependency tree. Unfortunately, the build for geiger is broken right now (https://github.com/rust-secure-code/cargo-geiger/issues/185). There is a workaround noted in the issue. Hopefully it will be fixed in the published version some time soon.

Google's effort to mitigate memory-safety issues

Posted Feb 18, 2021 15:15 UTC (Thu) by ledow (guest, #11753) [Link]

Useful info, thanks.

Google's effort to mitigate memory-safety issues

Posted Feb 18, 2021 17:23 UTC (Thu) by gerdesj (subscriber, #5446) [Link] (4 responses)

What about stopping curl piping direct to BASH (it's always BASH)? 8)

Google's effort to mitigate memory-safety issues

Posted Feb 18, 2021 20:49 UTC (Thu) by mathstuf (subscriber, #69389) [Link] (3 responses)

Because POSIX-compatible shell is harder to write than bash. I try to, and yet bash-isms still leak in from time to time.

Google's effort to mitigate memory-safety issues

Posted Feb 22, 2021 10:09 UTC (Mon) by tao (subscriber, #17563) [Link] (2 responses)

Personally I've found that at the point where I'd need bash-isms it's typically time to switch to Python.
A good way to avoid bash-isms leaking in is to link /bin/sh to /bin/dash and use #! /bin/sh instead of #! /bin/bash...

Google's effort to mitigate memory-safety issues

Posted Feb 22, 2021 13:51 UTC (Mon) by mathstuf (subscriber, #69389) [Link] (1 responses)

It's not about needing them. It's about accidentally using them. Personally I use ShellCheck to guide me for anything that isn't one-off.

Google's effort to mitigate memory-safety issues

Posted Feb 23, 2021 12:17 UTC (Tue) by abo (subscriber, #77288) [Link]

Or, you know, just write for bash and put /bin/bash at the top. It's simple, honest and usually not a problem, and if it is a problem then it's likely someone else's problem!

