Rust 1.0 released
The 1.0 release marks the end of that churn. This release is the official beginning of our commitment to stability, and as such it offers a firm foundation for building applications and libraries. From this point forward, breaking changes are largely out of scope (some minor caveats apply, such as compiler bugs). That said, releasing 1.0 doesn’t mean that the Rust language is “done”. We have many improvements in store. In fact, the Nightly builds of Rust already demonstrate improvements to compile times (with more to come) and include work on new APIs and language features, like std::fs and associated constants.
Posted May 15, 2015 17:24 UTC (Fri)
by HelloWorld (guest, #56129)
[Link] (58 responses)
Posted May 15, 2015 17:40 UTC (Fri)
by cesarb (subscriber, #6266)
[Link]
Good question.
The Rust compiler has a mode which seems to be the equivalent of the "freestanding" mode in C (http://doc.rust-lang.org/stable/book/no-stdlib.html). With it, it would probably be possible to generate a static library with no C library dependency. However, it needs a #![feature(...)] attribute to be enabled, and annoyingly #![feature(...)] is AFAIK disabled in "stable" Rust releases.
It wouldn't surprise me if somebody manages to write a working "Hello, world" Linux kernel driver using the "nightly" Rust release in less than a month. A more functional driver would face a large obstacle: the Linux kernel uses a lot of C macros and inline functions, struct layouts which vary depending on the kernel configuration, and arcane gcc magic.
Posted May 15, 2015 17:49 UTC (Fri)
by rillian (subscriber, #11344)
[Link] (6 responses)
Posted May 19, 2015 2:24 UTC (Tue)
by voltagex (guest, #86296)
[Link] (5 responses)
Posted May 19, 2015 10:44 UTC (Tue)
by cesarb (subscriber, #6266)
[Link]
Probably not. Let's look at what this C shim has:
BUG() is an architecture-dependent C macro which calls inline assembly.
kmalloc() is a C inline function.
kfree() is not a C inline function, but should be kept together with kmalloc() for symmetry.
printk() is "asmlinkage", which is an architecture-specific calling convention, which might or might not be the same as the default calling convention for the architecture. For x86-32, for instance, it's regparm(0), while IIRC the default calling convention might or might not be regparm(3) depending on the kernel configuration.
module_init(), module_exit(), and MODULE_LICENSE() are C macros which do linker magic. You need the first one for the module to do anything useful, and the last one for it to not taint the kernel.
Posted May 19, 2015 19:00 UTC (Tue)
by Cyberax (✭ supporter ✭, #52523)
[Link] (3 responses)
Inline functions have to be reimplemented in Rust to remain inline; cross-language inlining would require whole-program optimization.
BUG can be reimplemented in Rust (yes, it has inline assembly support).
Linker attributes are currently missing, but they can be added.
Posted May 19, 2015 19:17 UTC (Tue)
by cesarb (subscriber, #6266)
[Link] (2 responses)
Be sure to reimplement the kernel CONFIG_ system. Because some of these inline functions (like, say, kmalloc) change depending on the kernel configuration.
Don't forget to reimplement the data structures accessed by any of the inline functions. By the way, these also often change depending on the kernel configuration.
And don't forget that these inline function definitions are considered part of their own subsystem, so the reimplementation must be kept in sync with any changes to their subsystem. Changes which can easily happen on any minor release, since they're hidden within the inline function body.
Really, the only sane way to do it would be to either use a C stub to wrap all calls to macros or inline functions, or to somehow autogenerate the Rust equivalent from the kernel C structures, inline functions, and macros.
Posted May 19, 2015 21:38 UTC (Tue)
by Cyberax (✭ supporter ✭, #52523)
[Link] (1 responses)
The trickiest part would be inline functions. And I think Rust is also missing a simple way to wrap unions.
> Really, the only sane way to do it would be to either use a C stub to wrap all calls to macros or inline functions, or to somehow autogenerate the Rust equivalent from the kernel C structures, inline functions, and macros.
That's a sizable amount of work, but it's _possible_. And in the extreme, Rust can be used just like pure "C": simply slap an 'unsafe' qualifier on your code and go mad with raw pointers.
From what I see, right now Rust is the only major language capable of replacing pure C/C++ even for the most low-level tasks.
Posted May 19, 2015 23:22 UTC (Tue)
by cesarb (subscriber, #6266)
[Link]
Yes! That's what makes me so interested in it. IMO, it sits between C and C++, and adds a few very interesting new features of its own (like the borrow checker). And it has a gradual series of "escape valves" in case it feels limiting: unsafe blocks, inline assembly (in a future version or in unstable nightly), and as a last resort near-seamless linking to C (and indirectly to C++).
Posted May 15, 2015 21:04 UTC (Fri)
by wahern (subscriber, #37304)
[Link] (48 responses)
The biggest problem with Rust is that it aborts the current task on allocation failure. Regardless of one's opinion regarding this in userspace, it's a significant problem in the context of a kernel. See section 3.3 of the above paper.
Posted May 15, 2015 21:57 UTC (Fri)
by leech (subscriber, #25047)
[Link] (47 responses)
Posted May 15, 2015 23:24 UTC (Fri)
by wahern (subscriber, #37304)
[Link] (46 responses)
Interoperating with the kernel slab allocator was a separate issue entirely. It still remains that Rust by and large cannot handle heap allocation failure. Rust looks really interesting, but I think it's a shame so many of the developers are of the abort-on-OOM mindset. If I have an asynchronous network daemon juggling 10,000 connections on a single thread, it's preposterous for any engineer to think that killing all 10,000 connections is okay just because we couldn't service one of them. In terms of robustness it doesn't make any sense, and from a security perspective it leaves plenty of opportunities for DoS attacks.

Actually, I should be more specific. Obviously on OOM you usually want to abort _something_. It's just not always the entire thread or process. Most complex applications have much more fine-grained contexts that can be unwound in isolation. If doing so is too complex, that sounds to me like a PEBCAK issue: if they had been committed to handling OOM from the outset, they would have designed their data structures and control flow more carefully.

It's even more of a shame because following strict disciplines like RAII, single ownership, and immutability, all of which Rust embraces almost to a fault, is precisely the kind of method that makes handling OOM easier and more convenient. So Rust could potentially be an excellent environment for careful, robust programming, even when dealing with OOM conditions.
Posted May 15, 2015 23:49 UTC (Fri)
by tetromino (guest, #33846)
[Link] (17 responses)
Rust the language != Rust's standard library.
Of course the Rust stdlib cannot be used by kernel code, for the same reasons that libc cannot be used by kernel code either. Userspace and kernel-mode libraries have rather different design requirements.
Posted May 16, 2015 1:03 UTC (Sat)
by wahern (subscriber, #37304)
[Link] (16 responses)
In any event, if the accepted practice for the Rust community is to simply ignore OOM, then that would suck. It's generally understood to be best practice in the Unix world for _libraries_ to always handle OOM. Applications can choose to abort or recover, but at least they have the choice.
One of the things I like about working with Lua is that it can propagate OOM while maintaining VM consistency. That kind of attention to detail shows a concern for code correctness. Like Perl, Python, or Rust, they could have punted, but they didn't. That means _I_ get to make the choice of how to handle OOM, based on the particular constraints and requirements of my project. And it means library authors who bothered to worry about OOM haven't wasted their time.
Posted May 16, 2015 12:03 UTC (Sat)
by ms_43 (subscriber, #99293)
[Link] (15 responses)
The only way it could actually work in practice is if you have a robust test suite that achieves 100% code coverage of the error paths via fault injection (returning NULL for every memory allocation call site).
Here's the story of libdbus, which tried to do this:
http://blog.ometer.com/2008/02/04/out-of-memory-handling-...
Posted May 16, 2015 16:18 UTC (Sat)
by ncm (guest, #165)
[Link] (3 responses)
In Rust, which doesn't have exception handling, programs idiomatically use standard library components to effectively emulate it. These library components are simple and well-tested. This does depend on the whole system returning Result and Option objects for operations that can fail. I don't know how disruptive it would be to change a low-level function that once could not fail into one that returns a Result.
It is worth noting that the 1.0 release asserts a stable core language, but the library is not declared stable.
Posted May 16, 2015 21:57 UTC (Sat)
by roc (subscriber, #30627)
[Link] (2 responses)
Also, you'd better be sure your destructors don't trigger allocation directly or indirectly.
Neither of those issues are tested "on every run of the program".
Posted May 19, 2015 13:34 UTC (Tue)
by dgm (subscriber, #49227)
[Link] (1 responses)
Posted May 21, 2015 5:08 UTC (Thu)
by roc (subscriber, #30627)
[Link]
Ultimately all bugs are "problems with the coder that wrote it", but coders aren't perfect and their time isn't free, so making coding easier matters.
Posted May 16, 2015 16:24 UTC (Sat)
by cesarb (subscriber, #6266)
[Link]
> We went to lunch afterward, and I remarked to Dennis that easily half the code I was writing in Multics was error recovery code. He said, "We left all that stuff out. If there's an error, we have this routine called panic, and when it is called, the machine crashes, and you holler down the hall, 'Hey, reboot it.'"
Posted May 18, 2015 13:01 UTC (Mon)
by ibukanov (subscriber, #3942)
[Link] (9 responses)
It is trivial to get 100% coverage by changing error-handling style. Consider, for example, numerical floating-point calculations. Typically such code does not check for overflow/underflow errors and instead relies on NaN propagation. Such code simply does not have separate error paths, so you get automatic full error-path coverage as long as the code is tested at all. The drawback, of course, is that it is harder to debug NaN issues due to poor tooling support, but the code itself is robust.
It is possible to use this style for non-numerical code as well, but the problem is that language and library support for it is typically nonexistent.
Posted May 18, 2015 16:44 UTC (Mon)
by tterribe (guest, #66972)
[Link] (8 responses)
Some examples:
I agree with ms_43: the only way this works is if you actually test with fault-injection. We had next to no allocations in libopus (just a few mallocs when setting up the encoder and decoder, absolutely nothing when encoding or decoding a frame), and we _still_ had to do this to catch our bugs, and it's not like this is our first project in C.
Posted May 18, 2015 17:22 UTC (Mon)
by ibukanov (subscriber, #3942)
[Link] (7 responses)
These bugs happen when the NaN model interacts with code that does not follow it, which is a nice demonstration of my point about poor language/library support for applying that style to the rest of the code.
Now consider what happens to the code with those bugs when it is compiled for the asm.js target, which defines precisely what should happen when code accesses unallocated memory: a sort of NaN for pointers. The end result would be no crash and no possibility of arbitrary code execution, but rather a corrupted video frame.
Posted May 18, 2015 20:51 UTC (Mon)
by tterribe (guest, #66972)
[Link] (6 responses)
But I don't even buy the argument about interacting with "code that does not follow [the NaN model]". Take the first change for example:
- if (x>=8)
The behavior differs precisely when, and only when, the comparison follows the NaN model, and while in this case the difference happened to lead to a crash later on because of a float->int conversion, there are plenty of other cases where it would simply produce a wrong result. I do not believe that every time a developer writes a comparison against a float, they ask themselves, "What happens if one of these values is NaN (or Inf)?", or that even when they do ask, they reason correctly about what *should* happen without testing it. Think about things like convergence tests, etc., that could lead to infinite loops. There's no "NaN for pointers" that is going to fix that.
Posted May 18, 2015 21:16 UTC (Mon)
by ibukanov (subscriber, #3942)
[Link] (5 responses)
This is a tooling issue. Implementation can generate a stacktrace when it generates NaN the first time.
> The behavior differs precisely because and only when the comparison follows the NaN model,
NaN model in this case does not add an extra branch. If one has a test coverage for both branches for normal code, that test coverage covers NaN case as well.
> Think about things like convergence tests, etc., that could lead to infinite loops.
I can trivially kill a media application that is stuck in an infinite loop. Compare that with a bug that leads to arbitrary code execution from a randomly downloaded media file. These outcomes are vastly different in their consequences. Similarly, compare the same buggy C code that corrupts memory when compiled as asm.js and run in a browser (effectively forcing something like the NaN model on C pointers) and when run as a native desktop application. I personally would vastly prefer to experience the bug in the former incarnation rather than the latter.
In general I do not claim that the NaN model leads to fewer bugs. Rather, the claim is that the total cost of the consequences of those bugs is lower.
Posted May 19, 2015 14:38 UTC (Tue)
by nybble41 (subscriber, #55106)
[Link] (4 responses)
I think the real point here is that branch coverage is _insufficient_. The tests should exercise the boundaries of all of the "equivalence classes", groups of inputs which are expected to cause similar behavior in the code. That includes covering all the branches, but in this case a NaN input—or an input which can cause NaN in an intermediate calculation—would also be a separate equivalence class, even if the same branches are used (because the real branches are hidden in the ALU hardware).
NaN inputs, or ranges of inputs which result in NaN, represent discontinuities in the program, and discontinuities need to be tested. Branches are merely a special case of this rule, discontinuities in a piecewise-defined function which determines the control flow.
Posted May 19, 2015 16:40 UTC (Tue)
by ibukanov (subscriber, #3942)
[Link] (3 responses)
Ideally this should be expressed by types. For example, consider the following fragment which should report an error if z becomes NaN, not only when it is negative:
double x, y, z;
With better typesystem the relational operators could only be applied to non-NaN doubles requiring one to write something like:
double x, y, z;
Now, this looks like a contradiction of my assertion that the NaN model reduces the number of branches and the number of tests, but consider what happens if NaN were not supported. Then the code would use the common C practice of returning a false value to indicate an error:
double x, y, z;
Notice that there are now 5 branches rather than the 3 with NaN. This reduction comes from multiplication having well-defined semantics for NaN values. In turn this translates directly into simpler test coverage: to test the code path leading to an error with NaN values, it is sufficient to arrange for f1 to return NaN, rather than making f1, f2 and mult all return false.
Posted May 19, 2015 19:06 UTC (Tue)
by nybble41 (subscriber, #55106)
[Link]
I don't think anyone has really suggested removing support for NaN—sum types of this kind are indeed very useful for error handling and other tasks—but merely that the NaN cases need to be tested separately from the non-NaN cases.
> ... In turn this directly translates into simpler test coverage as to test the code path leading to error with NaN values it is sufficient to arrange for f1 to return NaN, rather than making f1, f2 and mult to return false.
Treating f1() and f2() as the inputs, I would define six equivalence classes based on the _product_ of the inputs, z, and how z is used in the condition: -Inf, <0, 0, >0, +Inf, NaN. After all, the point of the exercise is not to test the machine's multiplication algorithm, and the behavior of the code depends only on the product and not the individual inputs. Of course, this presumes white-box testing; unless the specifications are unusually detailed, a black-box tester wouldn't be able to assume the use of the multiplication primitive and would therefore need more test cases to cover different combinations of inputs.
Full branch coverage would only require two tests, but that isn't enough to show that the condition is implemented correctly for +/-Inf, NaN, or zero, each of which could easily exhibit incorrect behavior without suggesting deliberate malice on the part of the implementer.
Posted May 20, 2015 11:58 UTC (Wed)
by paulj (subscriber, #341)
[Link] (1 responses)
E.g., Monads? Isn't this what they were invented for?
Posted May 20, 2015 13:06 UTC (Wed)
by ibukanov (subscriber, #3942)
[Link]
Yes, ideally Monads as implemented if not as in Koka [1] then at least as in PureScript [2]. Haskell's type system is not powerful enough to express many useful idioms that are required by a systems language, or by a language that needs to interface a lot with code in other languages. But even just supporting kind-2 types and some syntax sugar should help Rust a lot.
[1] - http://research.microsoft.com/en-us/projects/koka/
Posted May 16, 2015 0:05 UTC (Sat)
by HelloWorld (guest, #56129)
[Link] (4 responses)
Posted May 16, 2015 1:40 UTC (Sat)
by wahern (subscriber, #37304)
[Link] (1 responses)
Obviously you can manage. But handling errors is one of the most fundamental and difficult aspects of programming, and for a so-called systems language like Rust to skip around it kinda sucks.
try!{}, unwrap, and the language features undergirding those idioms apparently came too late in development. Those features represent a compromise and resolution of internal disputes that could have changed the course of other aspects of the language and core types, as well as the standard library. For example, perhaps the Copy trait (which is part of the language reference, not the standard library) could have been amended to allow returning something like Result<T, E>, and a language construct added to inform the compiler of where to branch when copying failed.
Anyhow, at the end of the day it's the dude who _actually_ wrote parts of a _real_ operating system using Rust who lamented these deficiencies.
Posted May 16, 2015 10:49 UTC (Sat)
by roc (subscriber, #30627)
[Link]
But there isn't a C implementation of smart pointers that's shared by kernel and userspace code. So how would Rust look "much [more] different"?
On error handling: it's a very difficult problem. I don't think we're ready to declare a winning approach that's worth betting a language on. I don't think any of the popular approaches (other than "fail catastrophically") deal with the core problem: fine-grained error handling introduces an explosion of rarely-taken code paths that are very expensive to test and verify (much like threads with fine-grained locking).
As a browser developer, what I want from Rust (which isn't there yet in 1.0, but hopefully someday) is for OOM and other difficult errors to cause task termination, and for applications and libraries to detect and recover from task termination --- i.e. using tasks to delineate boundaries of failure and recovery. Mapping all catastrophic errors onto "the task died" should reduce the number of observable error states, in particular because Rust provides tools to constrain communication between tasks (e.g. preventing data races).
BTW from Mozilla's point of view, systems programming includes browsers and low-level userspace libraries as well as kernels. Robust OOM handling for every individual allocation is so unknown in userspace that it would have been a bad idea to complicate Rust to allow for it. Heck, as recently seen on LWN, it's not even the rule in the Linux kernel.
Posted May 16, 2015 2:31 UTC (Sat)
by viro (subscriber, #7872)
[Link] (1 responses)
Said that, the language is interesting and might be usable for writing a kernel; would take a work, though, and not of advocacy/awareness raising kind.
[1] for some value of young - there is such thing as youthful maximalism seamlessly transitioning into senile dementia, so it's hard to estimate the chronological age...
Posted May 18, 2015 21:32 UTC (Mon)
by Cyberax (✭ supporter ✭, #52523)
[Link]
There's nothing fundamental that can prevent Rust from being used for kernel development. I know folks who are already building Linux kernel API bindings to allow Rust modules. And my friend works on a commercial OS X product that has a driver written in Rust.
Posted May 16, 2015 1:59 UTC (Sat)
by cesarb (subscriber, #6266)
[Link] (1 responses)
And it's not only OOM handling. Memory allocation within the Linux kernel also needs a context (is it GFP_ATOMIC, GFP_KERNEL, GFP_NOIO, GFP_NOFS, or something else?), so the standard library would need to be modified anyway. I'd expect anyone trying to write a Linux kernel driver in Rust to disable the standard library (#![no_std]) and use lower-level interfaces.
Posted May 16, 2015 3:08 UTC (Sat)
by wahern (subscriber, #37304)
[Link]
I prefer to fail fast as a discipline. That means I always try to make sure that at any point of failure--memory, thread-contended resource, data source (file, socket), etc.--my state is consistent. Focus on keeping your state consistent and transparent, and everything else follows from that.
Whether I _actually_ fail fast in the end, or leverage some other technique (blocking, reserve pool, cache flushing, etc) is more of a policy decision. But if you rely on those other techniques from the outset, I think you'll tend to end up with messy and complex code. For one thing, those measures are leaky--blocking, reserve pools, and cache flushing all tend to create cross-component dependencies, at both the interface level and at run-time. A data sink allocating a buffer now needs to know about some other random component's caching interface, or it needs to share or duplicate a reserve pool. And in the end it might all be for naught. It's seeing that kind of complexity, I think, where people get the idea that handling OOM is impractical.
Plus, much like the phenomenon of buffer bloat, trying to hide resource exhaustion will often only compound the problem at the macro scale. Decisions on how to handle exhaustion are better handled at the edges, not by the actors in the middle. They can decide to fail or retry. The bulk of the software in the middle should simply be concerned with facilitating such decisions.
Posted May 16, 2015 14:02 UTC (Sat)
by epa (subscriber, #39769)
[Link] (12 responses)
Posted May 16, 2015 14:32 UTC (Sat)
by HelloWorld (guest, #56129)
[Link]
Posted May 16, 2015 17:05 UTC (Sat)
by ncm (guest, #165)
[Link] (10 responses)
C++ needed noexcept to support move semantics on objects in containers, where moving can throw. In Rust, moving is a much simpler operation and can't fail. The consequences of unnecessarily declaring a Result return should be smaller than those of failing to propagate noexcept on a C++ move constructor.
Since I gather most of the standard traits don't (yet?) allow for returning Result, it remains to be seen how deeply the discipline of comprehensive failure handling can penetrate. The reason to hope for improvement is that the Rust community has routinely embraced changes that break every existing program. People are understandably reluctant, though, to wrap every statement in a try! block. Rust may be forced to accept a flavor of exception mechanism, which would be less onerous if it doesn't come with a demand for mannered "exception-safe" style.
Posted May 19, 2015 17:02 UTC (Tue)
by ofranja (guest, #11084)
[Link] (9 responses)
However, I fail to see how it could be called an exception handling mechanism, as it has very different properties from any exception handling mechanism and doesn't rely on separate unwinding stacks.
Personally, I would rather see Rust going towards a monadic or similar (sounder) approach than falling for the tales of exception handling à la C++. Doing otherwise would only weaken its safety guarantees and move the language backwards.
Posted May 21, 2015 13:06 UTC (Thu)
by ncm (guest, #165)
[Link] (8 responses)
How can you know, when you make a trait, whether the implementation might fail? The only sensible spec for any function you aren't implementing yourself is to return Result. Changing it later requires changing every caller.
This is exactly the sort of thing that is better automated in the language. Fear of the "exceptions" boogie man leads you straight into the arms of his much-worse cousin.
Since Rust lacks every one of the flaws that C++ inherits from C that make exception-safe programming difficult and hazardous, there is nothing to fear from exceptions. Make everything return a Result by default and you're halfway there. Make every function body implicitly a try! block and you're 90% there, painlessly. The rest is code you would be obliged to write anyway.
Posted May 21, 2015 15:25 UTC (Thu)
by cesarb (subscriber, #6266)
[Link] (2 responses)
What about std::ops::Drop (http://doc.rust-lang.org/stable/std/ops/trait.Drop.html)? What would the compiler do with the return value of its drop method?
Posted May 22, 2015 5:51 UTC (Fri)
by ncm (guest, #165)
[Link] (1 responses)
Posted May 22, 2015 13:43 UTC (Fri)
by jwakely (subscriber, #60262)
[Link]
Posted May 21, 2015 19:26 UTC (Thu)
by ofranja (guest, #11084)
[Link] (4 responses)
I'm talking about error handling; exceptions are a specialization of that. I understand the analogy, but keep in mind it doesn't have to be done by wrapping everything in a try!() block, albeit that's more idiomatic for people used to exception handling mechanisms.
> "Exception mechanism" because what happens in a try!/Result is no different, in principle, from unwinding one stack frame. [...] How can you know, when you make a trait, whether the implementation might fail? The only sensible spec for any function you aren't implementing yourself is to return Result. Changing it later requires changing every caller.
Some of which can be considered advantages depending on your point of view. Changing all the callers might be interpreted as a feature: the compiler is able to state that your error handling is unsound, instead of the programmer having to do manual inspection of every single caller because the compiler won't be able to catch the inconsistencies.
Also, you miss the point of the trait declaration here: if you are implementing a trait you should indeed use Result<T,E>, unless you *want* the function not to return an error, in which case the implementor would have to fail!(). That's an *explicit* way to encode whether errors should be handled or returned, in a compiler-checked way, which is far superior compared to the guarantees of the exception mechanism in C++.
> This is exactly the sort of thing that is better automated in the language. Fear of the "exceptions" boogie man leads you straight into the arms of his much-worse cousin.
This is where we disagree. If you think with a C++ mindset, you'll see using the type system as a weaker alternative, but that's because C++ cannot express code the same way, at least not without a lot of boilerplate in place. Rust can easily use the type system to ensure your code is sound with regard to error handling too, which is much more powerful and another great feature of the language.
> Since Rust lacks every one of the flaws that C++ inherits from C that make exception-safe programming difficult and hazardous, there is nothing to fear from exceptions. Make everything return a Result by default and you're halfway
Maybe a static/checked exception mechanism would have some similar properties, but it would bring some additional issues too - and complexities.
I'd rather go with a monadic approach, still allowing the user to explicitly fail!() when the code can't deal with the error in a sensible way. Falling back to an exception approach à la C++ would only create implicit holes in the soundness of programs, which IMHO would be a major step back for the language.
Posted May 22, 2015 6:51 UTC (Fri)
by ncm (guest, #165)
[Link] (3 responses)
Manifestly, any regime that makes you fill your program with boilerplate try! blocks (or worse) and Result<> apparatus is weak. Any that makes it impossible to recover gracefully from failures is worse. If your type system increases your cognitive load you are worse off, and no compounding of monads will save you. The type system can be twisted up in increasingly labored knots to try to encompass error handling, but the resulting mess just demonstrates it is a poor match for the job.
Rust implements lots of new, innovative, and good ideas, but it still can easily fail, and will if the response to its failings is to insist it is better for them. C++ started out with many flaws inherited from C, and succeeded because it has never suffered from delusions of perfection.
Posted May 22, 2015 8:18 UTC (Fri)
by ofranja (guest, #11084)
[Link] (2 responses)
What I'm talking about is not how fail!() is clever or how "weakness" is good, but how modeling a language over an explicit model of error handling is better than an implicit one.
In that sense, it's indeed a C++ mindset to consider it inferior to C++ only because in C++ that would be bad design. I can say a Haskell programmer - for instance - would have very different opinions about exceptions, error handling, and good design. The same way, Rust and C++ are different languages, so not necessarily the same choices apply.
I'm not talking about perfection here, but soundness. It's tempting to throw soundness away when we are faced with the complexity of the problem, but doing so for the sake of some preconceived syntactic pattern or some quirky language construct makes little sense when you are trying to make a real improvement at the language level.
Posted May 23, 2015 16:04 UTC (Sat)
by ncm (guest, #165)
[Link] (1 responses)
An alternative would be to fork the language, and let Rustoy go the way of so many before it, while those of us who have serious engineering goals get on with them. That would be unfortunate but not tragic.
Posted May 23, 2015 16:50 UTC (Sat)
by ofranja (guest, #11084)
[Link]
I'd rather view this as a design decision instead of just ideological purity. After all, if you are trying to design a safe systems language, you have to limit the unsafety to the strictly necessary points. If you are trying to argue that safety and soundness are not important, Rust might not be a proper fit for your needs.
I guess Rust is not trying to be the glorious successor of C++ or a new C++ with aesthetic improvements - as others like D did - but it's trying to be something different. And this might be the key point of its success.
Posted May 17, 2015 23:17 UTC (Sun)
by jameslivingston (guest, #57330)
[Link] (7 responses)
I think that's about the only sensible thing to do for general-purpose libraries, since all the alternatives I'm aware of cause their own set of problems that are just as bad. In specific situations (e.g. a kernel) alternatives may be better, but I don't believe they are in the general case.
You can block waiting until memory is free, which would sometimes work if the problem is transient memory usage rather than a semi-permanent increase in usage. But what happens when you block during allocation while holding a lock, or when other thread(s) are waiting for a result from the blocking one?
You can propagate errors up the call stack, whether via exceptions, manual handling, or so on. Generally, the error handling paths will be less tested, and there is always the problem that correctly handling the error may require allocating some memory.
What you really need to do is split your application into pieces which can independently fail without disrupting the other pieces. Using your network example, aborting one connection when running out of memory would be fine - if you can correctly terminate the connection and the resources it uses, which is non-trivial if you have any shared state (including locks).
It's certainly possible, but it's quite difficult to keep it from pervading the design to such an extent that you are forced into tradeoffs you don't want.
Posted May 18, 2015 15:06 UTC (Mon)
by epa (subscriber, #39769)
[Link] (6 responses)
To take your suggestion - splitting the application into pieces is a fine solution. But the small supervisor core which co-ordinates the pieces and is in charge of restarting them on failure must itself be robust, and cannot just die and be restarted - otherwise it's turtles all the way down... So there needs to be the possibility of writing code in a provably safe way, even if the amount of code that ends up being like that is quite small.
Posted May 18, 2015 15:42 UTC (Mon)
by Limdi (guest, #100500)
[Link] (3 responses)
What is the current strategy to avoid OOM, apart from praying that whoever is in the front seat calculated the allowed memory use for every application correctly, that the applications obey those limits, and that he did not overcommit on purpose?
Posted May 18, 2015 15:45 UTC (Mon)
by epa (subscriber, #39769)
[Link] (2 responses)
Posted May 20, 2015 17:42 UTC (Wed)
by gmatht (guest, #58961)
[Link] (1 responses)
However, (2) could be a syscall of its own. Roughly, the idea is that to avoid the OOM killer, a process is responsible for reserving memory before allocating it. This allows a process to choose, for example, to:
Doing (5) may mean the process is denied reservations long before the machine is low on memory, but that's already what happens with overcommit disabled, and it may be exactly what the application writer wants. (1) can be quite nice too; sometimes it is much easier to recover from an abort than from low memory. Allowing some processes to choose (1) and (5) already seems like an advantage. For lots of userspace code, (4) seems cleaner to me than checking the return value of every malloc(sizeof(int)). When a bug is found, fixing the aftermath of an abort may be easier than fixing the corruption from a mishandled null pointer. How often do we really need (5) in userspace?
Posted May 23, 2015 8:53 UTC (Sat)
by epa (subscriber, #39769)
[Link]
Posted May 18, 2015 21:17 UTC (Mon)
by roc (subscriber, #30627)
[Link] (1 response)
As others have pointed out, on Linux and other mainstream OSes handling OOM in userspace is pointless because your process is likely to be killed by the system before it sees any out-of-memory errors. Therefore having the standard library return OOM errors is a bad tradeoff, because you're adding complexity for all Rust users that most of them will never be able to benefit from.
For the niche users who can benefit from explicit OOM errors, it would make sense to offer an alternative standard library which does return those errors. Fortunately, Rust is one of the few languages which lets you rip out the entire standard library and replace it with something else (or nothing).
Posted May 19, 2015 10:39 UTC (Tue)
by epa (subscriber, #39769)
[Link]
I do agree that in practice, doing unchecked allocations may be the best tradeoff for a lot of code. Although I suggest that it needs better tools and runtime support to set limits: in my process, allocations made from *this* particular shared library should not exceed 100 megabytes total, while *that* function may only allocate at most 2 megs each time it is called... Since if libpng goes mad and develops a memory leak, I would much rather have the application die quickly (and with an informative message) than have it get slower and slower, thrashing the disk more and more until finally the OOM killer puts it out of its misery. Of course, breaking the program into several independent processes is one way to do this, but possibly with a bit more userspace accounting of memory usage the same goal could be achieved without needing separate processes.
However, safe allocation is not just for 'niche users', or if so, kernel programming is quite a large niche. And there may well be a case for writing small parts of your program in the checked-allocation style while leaving other parts to assume allocation never fails. So then if my app does go kaboom, at least I can be certain it wasn't my string class that did it.
Posted May 18, 2015 22:17 UTC (Mon)
by dvainsencher (guest, #4143)
[Link]
Great!
Rust kernel modules
That's not a problem: any realistic project to integrate with the kernel infrastructure would use bindgen to generate bindings for structure definitions and functions automatically from the C source.
It should be possible to eventually create something like Go's "import C" feature, which automatically wraps C code in native bindings - perhaps even preserving inline qualifiers by using Clang to compile the C blocks within Rust code.
Rust kernel modules
Great!
"Secondly, large parts of the rust standard library depend on the pseudo-infallible nature of allocation present at this time. At an absolute minimum every single one of the collections data structures and smart pointers will need to be rewritten to support a fallible allocation model, as well as all data-structures that directly depend on them. Furthermore many common interfaces will need to be changed since it can no longer be assumed that, for example, inserting a value into a list will succeed. In the end making this change would probably require rewriting almost the entire standard library, adding functions that expose the fallible nature of memory."
Great!
https://git.xiph.org/?p=opus.git;a=commitdiff;h=d6b56793d...
https://git.xiph.org/?p=opus.git;a=commitdiff;h=58ecb1ac1...
Great!
> https://git.xiph.org/?p=opus.git;a=commitdiff;h=d6b56793d...
> https://git.xiph.org/?p=opus.git;a=commitdiff;h=58ecb1ac1...
Great!
+ /* Tests are reversed to catch NaNs */
+ if (!(x<8))
Great!
x = f1();
y = f2();
z = x * y;
if (z < 0) return error;

x = f1();
y = f2();
z = x * y;
if (isNaN(z) || z.asDefinedNumber() < 0) return error;

if (!f1(&x)) return error;
if (!f2(&y)) return error;
if (!mult(x, y, &z)) return error;
if (z < 0) return error;
Great!
Ideally this should be expressed by types.
Great!
[2] - http://www.purescript.org/
Great!
Oh look how terrible, we have to rewrite all those generic, type-safe data structures! Obviously C is a much better choice because it doesn't have them in the first place and is generally stuck in the 60s.
Great!
> than do kernel C and userspace C code.
Great!
> there. Make every function body implicitly a try! block and you're 90% there, painlessly. The rest is code you would be obliged to write anyway.
Great!
Avoiding the OOM Killer by Quotas?
1) Not bother reserving memory; it probably won't be OOM-killed anyway,
2) Permanently reserve 640K (which should be enough, right?),
3) Release all reservations when in a safe idle state, allowing an OOM kill,
4) Make sure it has 10MB extra in reserve before accepting a new connection, or
5) Wrap every call to malloc, fork, etc. to make sure it has enough in reserve for all children.
Great!
seL4