
Zig heading toward a self-hosting compiler

By Jake Edge
October 6, 2020

The Zig programming language is a relatively recent entrant into the "systems programming" realm; it looks to interoperate with C, while adding safety features without sacrificing performance. The language has been gaining attention of late; in September, the project announced progress toward a compiler for Zig that is written in Zig. That change will allow LLVM to become an optional component, which will be a big step forward for the "maturity and stability" of Zig.

Zig came about in 2015, when Andrew Kelley started a GitHub repository to house his work. He described the project and its goals in an introductory blog post in 2016. As he noted then, it is an ambitious project, with a goal to effectively supplant C; in part, that is done by adopting the C application binary interface (ABI) for exported functions and providing easy mechanisms to import C header files. "Interop with C is crucial. Zig embraces C like the mean older brother who you are a little afraid of but you still want to like you and be your friend."

Hello

The canonical "hello world" program in Zig might look like the following, from the documentation:

const std = @import("std");

pub fn main() !void {
    const stdout = std.io.getStdOut().outStream();
    try stdout.print("Hello, {}!\n", .{"world"});
}

The @import() builtin returns a reference to the Zig standard library, which gets assigned to the constant std. That evaluation is done at compile time, which is why it can be "assigned" to a constant. Similarly, stdout is assigned the standard output stream, which then gets used to print() the string (using the positional formatting mechanism for "world"). The try does not swallow the error: if print() fails, try returns that error value from main(). In Zig, errors are values that can be returned from functions and cannot be ignored; try is one way to handle them.
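The other common way to consume an error is catch, which handles it on the spot, often by substituting a fallback value. A minimal sketch (not from the article; the failing function is made up for illustration):

const std = @import("std");

fn mightFail() !u32 {
    return error.Boom; // an error is just a value from an error set
}

pub fn main() void {
    // catch consumes the error and substitutes a default,
    // so main() does not need an error return type
    const x = mightFail() catch 0;
    std.debug.warn("x = {}\n", .{x});
}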

As the documentation points out, though, the string being printed is really more like a warning message; it should perhaps go to the standard error stream, if possible, without being concerned about any error that occurs along the way. That allows for a simpler version:

const warn = @import("std").debug.warn;

pub fn main() void {
    warn("Hello, world!\n", .{});
}

Because this main() cannot return an error, its return type can be void, rather than !void as above. Meanwhile, the positional formatting was left out of this example, but it could be used with warn() as well. In either case, the program would be put into hello.zig and built as follows:

$ zig build-exe hello.zig
$ ./hello
Hello, world!

Compiler and build environment

The existing compiler is written in C++ and there is a stage-2 compiler written in Zig, but that compiler cannot (yet) compile itself. That project is in the works; the recent announcement targets the imminent 0.7.0 release for an experimental version. The 0.8.0 release, which is due in seven months or so, will replace the C++ compiler entirely, so that Zig itself will be the only compiler required moving forward.

The Zig build system is another of its distinguishing features. Instead of using make or other tools of that sort, developers build programs using the Zig compiler and, naturally, Zig programs to control the building process. In addition, the compiler has four different build modes that provide different tradeoffs in optimization, compilation speed, and run-time performance.
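A project's build logic goes into a build.zig file that the zig build command runs. A minimal sketch of what one might look like for the hello example (a hypothetical reconstruction based on the template that zig init-exe generated in that era):

const Builder = @import("std").build.Builder;

pub fn build(b: *Builder) void {
    // let the user choose one of the build modes on the command line
    const mode = b.standardReleaseOptions();
    const exe = b.addExecutable("hello", "hello.zig");
    exe.setBuildMode(mode);
    exe.install();
}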

Beyond that, Zig has a zig cc front-end to Clang that can be used to build C programs for a wide variety of targets. In a March blog post, Kelley argues that zig cc is a better C compiler than either GCC or Clang. As an example in the post, he downloads a ZIP file of Zig for Windows to a Linux box, unzips it, runs the binary Zig compiler on hello.c in Wine targeting x86_64-linux, and then runs the resulting binary on Linux.

That ability is not limited to "toy" programs like hello.c. In another example, he builds LuaJIT, first natively for his x86_64 system, then cross-compiles it for aarch64. Both of those were accomplished with some simple changes to the make variables (e.g. CC, HOST_CC); each LuaJIT binary ran fine in its respective environment (natively or in QEMU). One of the use cases that Kelley envisions for the feature is as a lightweight cross-compilation environment; he sees general experimentation and providing an easy way to bundle a C compiler with another project as further possibilities.
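The invocations were roughly along these lines (a reconstruction for illustration, not the exact commands from the post):

$ make CC="zig cc"                                             # native build
$ make CC="zig cc -target aarch64-linux-gnu" HOST_CC="zig cc"  # cross-build for aarch64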

The Zig compiler has a caching system that makes incremental builds go faster by only building those things that truly require it. The 0.4.0 release notes have a detailed look at the caching mechanism, which is surprisingly hard to get right, due in part to the granularity of the modification time (mtime) of a file, he said:

The caching system uses a combination of hashing inputs and checking the fstat values of file paths, while being mindful of mtime granularity. This makes it avoid needlessly hashing files, while at the same time detecting when a modified file has the same contents. It always has correct behavior, whether the file system has nanosecond mtime granularity, second granularity, always sets mtime to zero, or anything in between.

The tarball (or ZIP file) for Zig is around 45MB, but comes equipped with cross-compilation and libc targets for nearly 50 different environments. Multiple architectures are available, including WebAssembly, along with support for the GNU C library (glibc), musl, and MinGW-w64 C libraries. A full list can be found in the "libc" section toward the end of the zig cc blog post.

Types

Types in Zig have first-class status in the language. They can be assigned to variables, passed to functions, and returned from them just like any other Zig data type. Combining types with the comptime designation (which indicates a value that must be known at compile time) is the way to get generic types in Zig. This example from the documentation shows how that works:

fn max(comptime T: type, a: T, b: T) T {
    return if (a > b) a else b;
}
fn gimmeTheBiggerFloat(a: f32, b: f32) f32 {
    return max(f32, a, b);
}
fn gimmeTheBiggerInteger(a: u64, b: u64) u64 {
    return max(u64, a, b);
}

T is the type that will be compared in max(). The example shows two different types being used: f32 is a 32-bit floating-point value, while u64 is an unsigned 64-bit integer. The documentation notes that the bool type cannot be used, because applying the greater-than operator to bool values results in a compile-time error. However, bool could be accommodated if that were deemed useful:

fn max(comptime T: type, a: T, b: T) T {
    if (T == bool) {
        return a or b;
    } else if (a > b) {
        return a;
    } else {
        return b;
    }
}

Because the type T is known at compile time, Zig will only generate code for the first return statement when bool is being passed; the rest of the code for that function is discarded in that case.
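A hypothetical call (not from the documentation) would then compile down to just the or:

const result = max(bool, false, true); // evaluates to true via the `or` branch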

Instead of null references, Zig uses optional types, and optional pointers in particular, to avoid many of the problems associated with null. As the documentation puts it:

Null references are the source of many runtime exceptions, and even stand accused of being the worst mistake of computer science.

Zig does not have them.

Instead, you can use an optional pointer. This secretly compiles down to a normal pointer, since we know we can use 0 as the null value for the optional type. But the compiler can check your work and make sure you don't assign null to something that can't be null.

Optional types are indicated by using "?" in front of a type name.

// normal integer
const normal_int: i32 = 1234;

// optional integer
const optional_int: ?i32 = 5678;

The value of optional_int could be null, and an optional value cannot be assigned to normal_int without being unwrapped first. A pointer to an integer could be declared with type *i32; since that type is non-optional, the pointer can be dereferenced without concern for a null pointer:

    var ptr: *i32 = &x;
    ...
    ptr.* = 42;

That declares ptr to be a (non-optional) pointer to a 32-bit signed integer, the address of x here, and later assigns to where it points using the ".*" dereferencing operator. It is impossible for ptr to get a null value, so it can be used with impunity; no checks for null are needed.
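To use the payload of an optional, the program must unwrap it first; a small sketch (not from the article) of two common idioms:

// supply a default value when the optional is null
const value = optional_int orelse 0;

// an if with a capture unwraps the payload inside the branch
if (optional_int) |v| {
    // v is a plain i32 here
}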

So much more

It is a bit hard to consider this article as even an introduction to the Zig language, though it might serve as an introduction to the language's existence and some of the areas it is targeting. For a "small, simple language", Zig has a ton of facets, most of which were not even alluded to above. It is a little difficult to come up to speed on Zig, perhaps in part because of the lack of a comprehensive tutorial or similar guide. A "Kernighan and Ritchie" (K&R) style introduction to Zig would be more than welcome. There is lots of information available in the documentation and various blog posts, but much of it centers around isolated examples; a coherent overarching view of the language seems sorely lacking at this point.

Zig is a young project, but one with a seemingly active community and multiple avenues for communication beyond just the GitHub repository. In just over five years, Zig has made a good deal of progress, with more on the horizon. The language is now supported by the Zig Software Foundation, which is a non-profit that employs Kelley (and, eventually, others) via donations. Its mission is:

[...] to promote, protect, and advance the Zig programming language, to support and facilitate the growth of a diverse and international community of Zig programmers, and to provide education and guidance to students, teaching the next generation of programmers to be competent, ethical, and to hold each other to high standards.

It should be noted that while Zig has some safety features, "Zig is not a fully safe language". That situation may well improve; there are two entries in the GitHub issue tracker that look to better define and clarify undefined behavior, as well as to add even more safety features. Unlike some other languages, though, Zig programmers manually manage memory, which can lead to memory leaks and use-after-free bugs. Kelley and other Zig developers would like to see more memory-safety features, especially with respect to allocation lifetimes, in the language.

Rust is an obvious choice of language to compare Zig to, as both are seen as potential replacements for C and C++. The Zig wiki has a page that compares Zig to Rust, C++, and the D language, outlining advantages the Zig project believes the language has. For example, neither flow control nor allocations are hidden by Zig; there is no operator overloading or other mechanism where a function or method might get called in a surprising spot, nor is there support for new, garbage collection, and the like. It is also interesting to note that there is a project to use Zig to build Linux kernel modules, which is an active area of interest for Rust developers as well.

One of the more interesting parts of the plan for a self-hosting Zig compiler is an idea to use in-place binary patching, instead of always rebuilding the binary artifact for a build. Since the Zig-based compiler will have full control of the dependency tracking and code generation, it can generate machine code specifically to support patching and use that technique to speed up incremental builds of Zig projects. It seems fairly ambitious, but is in keeping with Zig's overall philosophy. In any case, Zig seems like a project to keep an eye on in coming years.




Zig heading toward a self-hosting compiler

Posted Oct 7, 2020 7:37 UTC (Wed) by pabs (subscriber, #43278) [Link] (3 responses)

It seems it will still be possible to bootstrap Zig via LLVM:

https://github.com/ziglang/zig-bootstrap

Zig heading toward a self-hosting compiler

Posted Oct 7, 2020 13:30 UTC (Wed) by flussence (guest, #85566) [Link] (2 responses)

It's useful to keep a bootstrap entry point that doesn't depend on a blockchain-like structure of previous versions. A lot of other self-hosting languages don't, and I imagine it causes downstream “reproducible build” efforts endless migraines.

Zig heading toward a self-hosting compiler

Posted Oct 7, 2020 16:39 UTC (Wed) by willy (subscriber, #9762) [Link] (1 responses)

Zig heading toward a self-hosting compiler

Posted Oct 8, 2020 1:16 UTC (Thu) by pabs (subscriber, #43278) [Link]

The umbrella project for these sorts of efforts:

https://bootstrappable.org/

Comprehensive guide for zig

Posted Oct 7, 2020 15:26 UTC (Wed) by Sobeston (guest, #142410) [Link] (1 responses)

> It is a little difficult to come up to speed on Zig, perhaps in part because of the lack of a comprehensive tutorial or similar guide.

I am trying to fill this gap via https://ziglearn.org/ (https://github.com/Sobeston/ziglearn) :)

Comprehensive guide for zig

Posted Oct 25, 2020 16:29 UTC (Sun) by atomiczep (guest, #142685) [Link]

Much appreciated! I have found this site very useful, even though more documentation will be needed.

Zig heading toward a self-hosting compiler

Posted Oct 7, 2020 17:26 UTC (Wed) by khim (subscriber, #9252) [Link] (49 responses)

It's strange that the article shows us the new language and then fails to explain why we would ever want it. This is especially puzzling since it ultimately does give us a link to a place that explains it very clearly and unambiguously.

Because, surprisingly enough, the biggest selling point for languages today is not what they can do; on the contrary, the important thing is what they cannot do!

The times of fixed languages delivered to you as a binary blob are gone. Languages are one area where Free Software (not merely open source) has truly won. Proprietary languages are mostly holdouts of the old era, and even they are developing constantly.

This essentially means that something like “my language has modules, your language doesn't, it's time to switch” is not a very compelling offer: today my language has no modules, tomorrow they will be added… why would I need to start from scratch again?

But “your language requires something, my language doesn't…”: that one may be compelling enough to bother switching. Because, ultimately, it's not hard to add something to a language, but quite often it's insanely hard to carve something out.

And Zig offers something really unique in the modern era: the ability to survive in a world where memory is finite. This is really surprising, since it seems like something any low-level language should support, but both C++ and Rust do poorly there (Rust the language should, in theory, be fine, but Rust's standard library is not really designed for that, which, for all practical purposes, makes the event of “running out of memory” very hard to handle).

I'm not entirely sure I'm sold (Zig is fairly new and it's not yet easy to see if it could actually be used for what's promised), but the fact that this is not even mentioned in the article is surprising.

So while I'm not sure I would [try to] switch from C++ any time soon (C++ has the same problem as Rust and, with C++20, actually stopped even pretending that you can live without infinite memory)… the idea sounds quite compelling…

So… yeah… Zig seems like a project to keep an eye on in coming years.

Zig heading toward a self-hosting compiler

Posted Oct 7, 2020 17:46 UTC (Wed) by snopp (guest, #138977) [Link] (2 responses)

>>> And Zig offers something really unique in a modern era: the ability to survive in a world where memory is finite

Do you mind elaborating a bit more on the point above? Or providing a link to where I can read more about it.

Zig heading toward a self-hosting compiler

Posted Oct 7, 2020 18:17 UTC (Wed) by chris.sykes (subscriber, #54374) [Link] (1 responses)

Check out:

https://ziglang.org/#Manual-memory-management

https://ziglang.org/documentation/master/#Memory

The second link is worth reading in its entirety if you're interested in an overview of the language features.
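To give a flavor of what those links describe: allocating APIs in Zig's standard library take an allocator explicitly, and allocation failure surfaces as an ordinary error value. A minimal sketch (an illustration, not taken from those pages; names follow the 0.6-era standard library):

const std = @import("std");

pub fn main() !void {
    // all memory comes from this fixed buffer; no heap is involved
    var buffer: [1024]u8 = undefined;
    var fba = std.heap.FixedBufferAllocator.init(&buffer);

    var list = std.ArrayList(u32).init(&fba.allocator);
    defer list.deinit();

    // append() returns error.OutOfMemory once the buffer is exhausted
    try list.append(42);
}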

Zig heading toward a self-hosting compiler

Posted Oct 8, 2020 13:43 UTC (Thu) by kleptog (subscriber, #1183) [Link]

The use of arenas is a good method for dealing with out-of-memory issues. Programs like PostgreSQL and Samba use these techniques (in C) to handle running out of memory, but they are also needed if you want to do any kind of exception handling. So this is something Zig does well (a sketch of the arena idiom appears below). But then I read things like:

> The API documentation for functions and data structures should take great care to explain the ownership and lifetime semantics of pointers. Ownership determines whose responsibility it is to free the memory referenced by the pointer, and lifetime determines the point at which the memory becomes inaccessible (lest Undefined Behavior occur).

Since we now have Rust as demonstration that all the ownership checks can be done at compile time (thus no runtime cost) this feels like a missed opportunity.
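To make the arena point concrete: Zig's standard library ships an arena allocator. A minimal sketch, assuming the 0.6-era API:

const std = @import("std");

pub fn main() !void {
    // an arena frees everything allocated from it in a single deinit()
    var arena = std.heap.ArenaAllocator.init(std.heap.page_allocator);
    defer arena.deinit();

    const allocator = &arena.allocator;
    // allocation failure is an ordinary error value, not a crash
    const buf = try allocator.alloc(u8, 4096);
    std.mem.set(u8, buf, 0); // use the memory
}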

Zig heading toward a self-hosting compiler

Posted Oct 7, 2020 18:52 UTC (Wed) by Cyberax (✭ supporter ✭, #52523) [Link] (16 responses)

> Rust the language should, in theory, be fine, but Rust's standard library is not really designed for that, which, for all practical purposes makes an event of “running out of memory” very hard to handle
You can't realistically get an "allocation failed" situation on Linux, because of all the overcommit.

So this mostly leaves small, constrained devices, and it's not really relevant there either. It's very hard to dig yourself out of the OOM hole, so you write the code to avoid getting there in the first place.

Zig heading toward a self-hosting compiler

Posted Oct 7, 2020 20:04 UTC (Wed) by ballombe (subscriber, #9523) [Link] (9 responses)

You can disable overcommit, see /proc/sys/vm/overcommit_memory

Zig heading toward a self-hosting compiler

Posted Oct 7, 2020 20:07 UTC (Wed) by Cyberax (✭ supporter ✭, #52523) [Link] (8 responses)

It doesn't actually disable it; you will still get killed by the OOM killer rather than get NULL from malloc(). In my experience, to force malloc() on Linux to return NULL, you need to disable overcommit and try a really large allocation.

Zig heading toward a self-hosting compiler

Posted Oct 9, 2020 13:17 UTC (Fri) by zlynx (guest, #2285) [Link] (7 responses)

I don't think that you set it correctly then, because strict commit definitely works. I run my servers that way.

You have to read the documentation pretty carefully because there are actually three modes: 0 for heuristic, 1 for overcommitting anything, and 2 for strict commit (well, strict depending on the overcommit_ratio value).
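For the record, switching modes is just a write to the file mentioned above; for example, to select strict commit (mode 2) as root:

# echo 2 > /proc/sys/vm/overcommit_memory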

Zig heading toward a self-hosting compiler

Posted Oct 9, 2020 17:40 UTC (Fri) by Cyberax (✭ supporter ✭, #52523) [Link] (6 responses)

Strict commit works, sure. In the sense that the OOM killer will come out immediately, rather than later.

As I've shown, there's simply no way to get -ENOMEM out of sbrk() as an example.

Zig heading toward a self-hosting compiler

Posted Oct 9, 2020 18:25 UTC (Fri) by zlynx (guest, #2285) [Link] (5 responses)

And yet, it does do it somehow. I just wrote a little C program to test it, and tried it on my laptop and one of my servers.
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

intptr_t arg_to_size(const char *arg) {
  assert(sizeof(intptr_t) == sizeof(long));

  errno = 0;
  char *endp;
  long result = strtol(arg, &endp, 0);
  if (errno) {
    perror("strtol");
    exit(EXIT_FAILURE);
  }
  if (*endp != '\0') {
    switch (*endp) {
    default:
      exit(EXIT_FAILURE);
      break;
    case 'k':
      result *= 1024;
      break;
    case 'm':
      result *= 1024 * 1024;
      break;
    case 'g':
      result *= 1024 * 1024 * 1024;
      break;
    }
  }
  return result;
}

int main(int argc, char *argv[]) {
  if (argc < 2)
    exit(EXIT_FAILURE);
  intptr_t inc = arg_to_size(argv[1]);
  if (inc < 0)
    exit(EXIT_FAILURE);

  printf("allocating 0x%lx bytes\n", (long)inc);
  void *prev = sbrk(inc);
  if (prev == (void *)(-1)) {
    perror("sbrk");
    exit(EXIT_FAILURE);
  }

  return EXIT_SUCCESS;
}
On a 32 GiB server with strict overcommit:
$ ./sbrk-large 24g
allocating 0x600000000 bytes

$ ./sbrk-large 28g
allocating 0x700000000 bytes
sbrk: Cannot allocate memory
Here are the interesting bits from the strace on the strict-commit server for ./sbrk-large 32g. You can see that sbrk() is emulated by getting the current brk and adding the increment to it; when the kernel does not move brk, the wrapper returns an error code.
brk(NULL)                               = 0x1d71000
brk(0x801d71000)                        = 0x1d71000
And on the laptop, after turning on full overcommit: heuristic mode was failing on big numbers, but with overcommit_memory set to 1 there were no problems.
./sbrk-large 64g
allocating 0x1000000000 bytes

Zig heading toward a self-hosting compiler

Posted Oct 9, 2020 18:28 UTC (Fri) by Cyberax (✭ supporter ✭, #52523) [Link] (4 responses)

Try allocating in small increments, instead of a huge allocation that blows past the VMA borders.

Zig heading toward a self-hosting compiler

Posted Oct 9, 2020 19:14 UTC (Fri) by zlynx (guest, #2285) [Link] (3 responses)

With sbrk it won't make any difference. It's a single contiguous memory block.

I'm not even writing into it. It's the writing that triggers OOM. The Linux OOM system is happy to let you have as much virtual memory as you want as long as you don't use it.

But as you can see when I exceed the amount of available RAM (free -g says there's 27g available) in a single allocation on the server with strict overcommit it fails immediately.

Zig heading toward a self-hosting compiler

Posted Oct 11, 2020 17:47 UTC (Sun) by epa (subscriber, #39769) [Link] (2 responses)

> It's the writing that triggers OOM.

Isn't that exactly the point? If the memory isn't actually available, the allocation appears to succeed, but then blows up when you try to use it. There is no way to say "please allocate some memory, and I do intend to use it, so if we're out of RAM tell me now (I'll cope), and if not, please stick to your promise that the memory exists and can be used".

It's good that a single massive allocation returns failure, but that does not come close to having a reliable failure mode in all cases.

Zig heading toward a self-hosting compiler

Posted Oct 11, 2020 18:29 UTC (Sun) by zlynx (guest, #2285) [Link] (1 responses)

With strict commit any allocation that succeeds is guaranteed to be available. You won't get the OOM handler killing anything when the memory is used. That's why I run my servers that way. Server applications tend to be built to handle memory allocation failures.

Unless it's Redis. You have to run Redis with full overcommit enabled.

Zig heading toward a self-hosting compiler

Posted Oct 18, 2020 15:06 UTC (Sun) by epa (subscriber, #39769) [Link]

Thanks, sorry I misunderstood your earlier comment.

Zig heading toward a self-hosting compiler

Posted Oct 8, 2020 9:46 UTC (Thu) by khim (subscriber, #9252) [Link] (3 responses)

> It's very hard to dig yourself out of the OOM hole, so you write the code to avoid getting there in the first place.

Practically speaking, you end up in a situation where you need to hit the reset switch (or wait for the watchdog to kill you) anyway.

This may be an OK approach for a smartphone or even your PC. But IoT with this approach is a disaster waiting to happen (read about Beresheet to see how that ultimately works out).

So yeah, Zig is "worth watching". Let's see how it would work.

Zig heading toward a self-hosting compiler

Posted Oct 8, 2020 11:23 UTC (Thu) by farnz (subscriber, #17727) [Link] (2 responses)

IME, having worked in the sort of environment where you can't OOM safely, you don't actually care that much about catching allocation failures at the point of allocation; the Rust approach of unwinding to an exception handler via catch_unwind is good enough for allocation failures.

The harder problem is spending a lot of effort bounding your memory use at compile time, allowing for things like fragmentation. Rust isn't quite there yet (notably, I can't use per-collection allocation pools to reduce the impact of fragmentation).

Zig heading toward a self-hosting compiler

Posted Oct 8, 2020 15:04 UTC (Thu) by khim (subscriber, #9252) [Link] (1 responses)

Yup. And as I wrote in the initial post (maybe it just wasn't clear enough): it's not even a question of language design, but more of the features of the standard library.

Both Rust and C++ should, in theory, support designing for limited memory. But both have standard libraries which assume that memory is endless and that, if we ever run out of it, it's OK to crash. And now, with C++20, C++ has finally gained language constructs which deliver significant functionality not easily achievable by other means, yet they rely on that “memory is endless and if it ever runs out then it's OK to crash” assumption.

So Zig is definitely covering a unique niche which is not insignificant. But only time will tell if it's large enough to sustain the language.

Zig heading toward a self-hosting compiler

Posted Oct 11, 2020 17:52 UTC (Sun) by epa (subscriber, #39769) [Link]

It's unfashionable to write programs with fixed-size buffers or arbitrary limits, but I think that would often be a way to get better reliability in more "static" applications where the workload is known in advance. Of course, you need to fail gracefully when the buffer is full or the limit is reached, but you can write test cases for that, certainly a lot more easily than you can have test cases for running out of memory at every single dynamic allocation in the codebase or, even worse, being OOM-killed at any arbitrary point.

Zig heading toward a self-hosting compiler

Posted Oct 9, 2020 3:39 UTC (Fri) by alkbyby (subscriber, #61687) [Link] (1 responses)

Not entirely true. Programs may, and sometimes do, have their own limits on total malloc()ed size, and that is very useful sometimes. And you already posted below that larger allocations can fail (though not entirely correctly, by the way; actually, even with default overcommit, larger allocations or forks may fail).

Zig heading toward a self-hosting compiler

Posted Oct 9, 2020 3:58 UTC (Fri) by Cyberax (✭ supporter ✭, #52523) [Link]

Some years ago I did try to test my program for OOM robustness. It was a regular C++ program compiled for glibc. I was not actually able to get allocations to fail! Instead, the OOM killer usually just murdered something unrelated.

Out of curiosity, I decided to look at how allocators are implemented. I didn't want to wade through the glibc source code, so I looked at musl. The allocator there uses the good old brk syscall to expand the heap (and direct mmap() for large allocations).

Yet the brk() source code in Linux does _not_ support an ENOMEM return: https://elixir.bootlin.com/linux/latest/source/mm/mmap.c#... Even if you lock the process into RAM via mlockall(MCL_FUTURE), sbrk() will simply run the infallible mm_populate(), which will cause the OOM killer to awaken if the system is out of RAM.

You certainly can inject failures by writing your own allocator, but for regular glibc/musl based code it's simply not going to happen.

Zig heading toward a self-hosting compiler

Posted Oct 8, 2020 2:21 UTC (Thu) by roc (subscriber, #30627) [Link] (16 responses)

It is insanely difficult to handle out-of-memory conditions reliably in a complex application. First you have to figure out what to do in every situation where an allocation fails. Usually there is nothing reasonable you *can* do other than give up on some request. Then you have to implement the failure handling --- make sure that your error handling (including RAII cleanup!) doesn't itself try to allocate memory! Then you have to test all that code, which requires the ability to inject failures at every allocation point. Then you have to run those tests continuously because otherwise things will certainly regress.

For almost every application, it simply isn't worth handling individual allocation failures. It *does* make sense to handle allocation failure as part of large-granularity failure recovery, e.g. by isolating large chunks of your application in separate processes and restarting them when they die. That works just fine with "fatal" OOM handling.

In theory, C and C++ support handling of individual allocation failures. In practice, it's very hard to find any C or C++ application that reliably does so. The vast majority don't even try and most of the rest pretend to try but actually crash in any OOM situation because OOM recovery is not adequately tested.

Adding OOM errors to every library API just in case one of those unicorn applications wants to use the library adds API complexity just where you don't want it. In particular, a lot of API calls that normally can't fail now have a failure case that needs to be handled/propagated.

Therefore, Rust made the right call here, and Zig --- although it has some really cool ideas --- made the wrong call.

Zig heading toward a self-hosting compiler

Posted Oct 8, 2020 8:54 UTC (Thu) by smcv (subscriber, #53363) [Link]

> In theory, C and C++ support handling of individual allocation failures. In practice, it's very hard to find any C or C++ application that reliably does so. The vast majority don't even try and most of the rest pretend to try but actually crash in any OOM situation because OOM recovery is not adequately tested.

dbus, the reference implementation of D-Bus, is perhaps a good example: it's meant to handle individual allocation failures, and has been since 2003, with test infrastructure to verify that it does (which makes the test suite annoyingly slow to run, and makes tests awkward to write, because every "assert success" in a test that exercises OOM turns into "if OOM occurred, end test successfully, else assert success"). Despite all that, we're *still* occasionally finding and fixing places where OOM isn't handled correctly.

The original author's article on this from 2008 <https://blog.ometer.com/2008/02/04/out-of-memory-handling...> makes interesting reading, particularly these:

> I wrote a lot of the code thinking OOM was handled, then later I added testing of most OOM codepaths (with a hack to fail each malloc, running the code over and over). I would guess that when I first added the tests, at least 5% of mallocs were handled in a buggy way

> When adding the tests, I had to change the API in several cases in order to fix the bugs. For example adding dbus_connection_send_preallocated() or DBUS_DISPATCH_NEED_MEMORY.

Zig heading toward a self-hosting compiler

Posted Oct 8, 2020 15:22 UTC (Thu) by khim (subscriber, #9252) [Link] (12 responses)

It's true that OOM handling can't be added to an existing codebase. I'm not so sure it's as hard as you describe if you design everything from scratch.

It's like exception safety: it's insanely hard to redo an existing codebase to make it exception-safe. The Google style guide even expressly forbids it. Yet if you use certain idioms and libraries, it becomes manageable.

If you want or need to handle the OOM case, the situation is similar: you change your code structure to handle that case… and suddenly it becomes much less troubling and hard to deal with.

I'm not sure Zig will manage to pull it off… but I wouldn't dismiss it because it tries to solve that issue: lots of the issues with OOM handling in existing applications/libraries come from the fact that their APIs were designed for the usual “memory is infinite” world… and then OOM handling was bolted on afterward… it doesn't work.

But you can go and check those old MS-DOS apps which had to deal with limited memory. They handle it just fine, and it's not hard to make them show you a “couldn't allocate memory” message without crashing. Please don't say that people were different back then and could do that, but that today we have lost that art. That's just not true.

Zig heading toward a self-hosting compiler

Posted Oct 8, 2020 22:28 UTC (Thu) by roc (subscriber, #30627) [Link] (11 responses)

C and C++ and their standard libraries *were* designed from scratch to allow handling of individual allocation failures. Lots of people built libraries and applications on top of them that they thought would handle allocation failures. That didn't work out.

MS-DOS apps were a lot simpler than what we have today and often did misbehave when you ran out of memory. Those that did not would often just allocate a fixed amount of memory at startup and were simple enough that they could ensure they worked within that limit, without handling individual allocation failures. For example, if you look up 'New' in the Turbo Pascal manual (chapter 15), you can see that it doesn't even *mention* New returning OOM or how to handle it. The best you can do is call MaxAvail before every allocation, which I don't recall anyone doing. http://bitsavers.trailing-edge.com/pdf/borland/turbo_pasc...

Zig heading toward a self-hosting compiler

Posted Oct 8, 2020 23:27 UTC (Thu) by khim (subscriber, #9252) [Link] (10 responses)

It's funny that you picked Turbo Pascal 3.0, the last version without proper care for the out-of-memory case. Even then, it had the $K+ option, which was enabled by default and would generate a runtime error if memory was exhausted.

If you open the very site you pointed to and look at the manual for Turbo Pascal 4.0, you'll find the HeapError error-handling routine there. The Turbo Vision manual even has a whole chapter 6 named “Writing safe programs”, complete with a “safety pool”, a “LowMemory” condition, and so on. It worked.

Turbo Pascal itself used it, and many other programs did, too. I'm not quite sure when the notion of “safe programming” was abandoned, but I suspect it was when Windows arrived. Partly because Windows itself handles OOM conditions poorly (why bother making your program robust if the whole OS will come crashing down on you when you run out of memory?) and partly because it brought many new programmers to the PC who were happy to make programs that worked only sometimes and cared not about making them robust.

Ultimately, there's nothing mystical about writing such programs. Sure, you need tests. Sure, you need a proper API. But hey, it's not as if you can handle other kinds of failures properly without tests, and it's not as if you don't need to think about your API if you want to satisfy other kinds of requirements.

It's kind of a pity that Unix basically pushed us down the road of not caring about OOM errors with its fork/exec model. It's really elegant… yet really flawed. Once you go down that road, the only way to efficiently use all of the available memory is via overcommit, and once you have overcommit and malloc() stops returning NULL and you get SIGSEGV at random times… you can no longer write reliable programs, so people just stop writing reliable libraries, too.

Your only hope at that point is something like what smartphones and routers are doing: split your hardware into two parts and put the “reliable” piece into one and the “fail-happy” piece into the other. People just have to deal with the need to do a hard reset at times.

But is that a good way to go for ubiquitous computing, where a failure and watchdog-induced reset may literally mean life or death? Maybe this two-part approach will scale. Maybe it won't. I don't know. Time will tell.

Zig heading toward a self-hosting compiler

Posted Oct 9, 2020 3:08 UTC (Fri) by roc (subscriber, #30627) [Link] (1 responses)

Thanks for the TP4/Vision references. The vogue for OOM-safe application programming in DOS, to the extent it happened, must have been quite brief.

> But hey, it's not as if you can handle other kinds of failures properly without tests and it's not as if you don't need to think about your API if you want to satisfy other kinds of requirements.

It sounds like you're arguing "You have to have *some* tests and *some* API complexity so why not just make those a lot more work".

Zig heading toward a self-hosting compiler

Posted Oct 9, 2020 7:04 UTC (Fri) by khim (subscriber, #9252) [Link]

> It sounds like you're arguing "You have to have *some* tests and *some* API complexity so why not just make those a lot more work".

No. It's “a lot more work” only if you don't think about it upfront. It's funny that this link is used as an example of how hard it is to handle OOM, because it really shows how easy it is to do. That 5% of mallocs were handled in a buggy way means that 95% of them were handled correctly on the first try. That's a success rate much higher than for most other design decisions.

Handling OOM conditions is not hard, really. It's only hard if you already have finished code designed for the “memory is infinite” world and want to retrofit OOM handling into it. Then it's really hard. The situation is very analogous to thread safety, exception safety, and many other such things: just design primitives which handle 95% of the work for you, and write tests to cover the remaining 5%.

Zig heading toward a self-hosting compiler

Posted Oct 10, 2020 15:16 UTC (Sat) by dvdeug (guest, #10998) [Link] (7 responses)

I suspect it was when Windows arrived, because that's when the first serious multiprocess computing happened on the PC and when the first complex OS interactions were happening. If I have several applications open when I hit the memory limit, what fails will be more or less random; it could be Photoshop, or the music player, or some random background program. It's also possible for it to be some lower-level OS code that has little option but to invoke the OOM killer or crash the system. It's quite possible you can't open a dialog box to tell the user of the problem without memory, nor save anything. Add to that the fact that your multithreaded code (and pretty much all GUI programs should run their interface on a separate thread) may be hitting this problem on multiple threads at once. What was once one program running on an OS simple enough to avoid memory allocation is now a complex collection of individually more complicated programs on a complex OS.

Zig heading toward a self-hosting compiler

Posted Oct 10, 2020 22:50 UTC (Sat) by khim (subscriber, #9252) [Link] (2 responses)

>It's quite possible you can't open a dialog box to tell the user of the problem without memory,

MacOS classic solved that by setting aside some memory for that dialog box.

>nor save anything.

Again: not a problem on classic MacOS, since there an application requests memory upfront and then has to live within it. Another app couldn't “steal” it.

>I suspect it was when Windows arrived

And made it impossible to reliably handle OOM, yes. Most likely.

>What was once one program running on an OS simple enough to avoid memory allocation is now a complex collection of individually more complicated programs on a complex OS.

More complex than a typical z/OS installation, which handles OOM just fine?

I don't think so.

No, I think you are right: when Windows (the original one, not Windows NT 3.1, which properly handles OOM too) and Unix (because of the fork/exec model) made it impossible to reliably handle OOM conditions, people stopped caring.

SMP and general complexity had nothing to do with it; it was just the general rise of Worse is Better.

As I've said: it's not impossible to handle, and not even especially hard… but in a world where people have been trained to accept that programs may fail randomly for no apparent reason, that kind of care is just seen as entirely unnecessary.

Zig heading toward a self-hosting compiler

Posted Oct 11, 2020 4:25 UTC (Sun) by dvdeug (guest, #10998) [Link] (1 responses)

> Again: not a problem on MacOS since there application requests memory upfront and then have to deal with it.

You could do that anywhere. Go ahead and allocate all the memory you need upfront.

> More complex than typical zOS installation? Which handles OOM just fine?

If it does, it's because it keeps things in nice neat boxes and runs on a closed set of IBM hardware, in a way that a desktop OS can't and doesn't. A kindergarten class at recess is more complex in some ways than a thousand military men marching in formation, because you never know when a kindergartner is going to punch another one or make a break for freedom.

> SMP or general complexity had nothing to do with it.

That's silly. If you're writing a game for a Nintendo or a Commodore 64, you know how much memory you have and that you will be the only program running. MS-DOS was slightly more complicated, with TSRs, but not a whole lot. Things nowadays are complex; a message box calls into a windowing system and needs fonts and text shapers loaded into memory; the original MacOS didn't handle Arabic or Hindi or anything beyond 8-bit charsets. Modern systems have any number of processes popping up and going away, and even if you're, say, a word processor, that web browser or PDF reader may be as important as you are. Memory amounts will vary all over the place and memory usage will vary all over the place, and checking a function telling you how much memory you have left won't tell you anything particularly useful about what's going to be happening sixty seconds from now. What was once a tractable problem of telling how much memory is available is now completely unpredictable.

> Just general Rise of Worse is Better.

To quote that essay: "However, I believe that worse-is-better, even in its strawman form, has better survival characteristics than the-right-thing, and that the New Jersey approach when used for software is a better approach than the MIT approach." The simple fact is you're adding a lot of complexity to your system; there's a reason why so much code is written in memory-managed languages like Python, Go, Java, C# and friends. You're spending a lot of programmer time to solve a problem that rarely comes up and that you can't do much about when it does. (If it might be important, regularly autosave a recovery file; OOM is not the only or even most frequent reason your program or the system as a whole might die.)

> in a world where people just trained to accept the fact that programs may fail randomly for no apparent reason

How, exactly, is issuing a message box saying "ERROR: Computer jargon" going to help that? Because that's all most people are going to read. There is no way you can fix the problem that failing to open a new tab or file because the program is out of memory is going to be considered "failing randomly for no apparent reason" by most people.

I fully believe you could do better, but it's like BeOS: it was a great OS, but when it was made widely available in 1998, faced with a choice between Windows 98 and an OS that didn't run a browser that could deal with the Web as it was in 1998, people went with Windows 98. Worse-is-better in a nutshell.

Zig heading toward a self-hosting compiler

Posted Oct 11, 2020 19:49 UTC (Sun) by Wol (subscriber, #4433) [Link]

> "However, I believe that worse-is-better, even in its strawman form, has better survival characteristics than the-right-thing, and that the New Jersey approach when used for software is a better approach than the MIT approach."

Like another saying - "the wrong decision is better than no decision". Just making a decision NOW can be very important - if you don't pick a direction to run - any direction - when a bear is after you then you very quickly won't need to make any further decisions!

Cheers,
Wol

16-bit Windows applications tried to deal with OOM

Posted Oct 11, 2020 12:38 UTC (Sun) by quboid (subscriber, #54017) [Link] (3 responses)

Perhaps it was when 32-bit Windows arrived that applications stopped caring about running out of memory.

The 16-bit Windows SDK had a tool called STRESS.EXE which, among other things, could cause memory allocation failures in order to check that your program coped with them correctly.

16-bit Windows required large memory allocations (GlobalAlloc) to be locked while in use and unlocked when not, so that Windows could move the memory around without an MMU. It was even possible to specify that allocated memory was discardable, in which case you didn't know whether you'd still have the memory when you tried to lock it to use it again. This was great for caches and is a feature I wish my web browser had today. :-)

Mike.

16-bit Windows applications tried to deal with OOM

Posted Oct 11, 2020 21:14 UTC (Sun) by dtlin (subscriber, #36537) [Link]

Android has discardable memory: ashmem can be unpinned, and the system may purge it if under memory pressure. I think you can simulate this with madvise(MADV_FREE), but ashmem will tell you whether it was purged and MADV_FREE won't (the pages will just be silently zeroed).

16-bit Windows applications tried to deal with OOM

Posted Oct 11, 2020 22:28 UTC (Sun) by roc (subscriber, #30627) [Link] (1 responses)

You should be glad browsers don't have that today. If they did, people would use it, and browsers on developer machines would rarely discard memory, so when your machine discards memory applications would break.

16-bit Windows applications tried to deal with OOM

Posted Oct 15, 2020 16:19 UTC (Thu) by lysse (guest, #3190) [Link]

Better they break by themselves than freeze up the entire system while it tries to page every single executable VM page through a single 4K page of physical RAM, because the rest of it has been overcommitted to memory that just got written to.

Zig heading toward a self-hosting compiler

Posted Oct 8, 2020 19:23 UTC (Thu) by excors (subscriber, #95769) [Link] (1 responses)

For embedded-style development, there is a third option beyond handling every individual allocation failure or restarting the whole application/OS on any allocation failure: don't do dynamic allocation. There are simple things like replacing std::vector<T> with std::array<T, UPPER_BOUND> if you can work out the bounds statically. Or, whenever an API function allocates an object, change it to just initialize an object that has been allocated by the caller. That caller can get the memory from its own caller (recursively), or from a static global variable, or from its stack (which is sort of dynamic, but it's not too hard to be confident you won't run out of stack space), or from a memory pool when you can statically determine the maximum number of objects needed, or, in rare cases, it can allocate dynamically from a shared heap and be very careful about OOM handling.

E.g. FreeRTOS can be used with partly or entirely static allocation (https://www.freertos.org/Static_Vs_Dynamic_Memory_Allocat...). Your application can implement a new thread as a struct/class that contains an array for the stack, a StaticTask_t, and a bunch of queues and timers and mutexes and whatever. You pass the memory into FreeRTOS APIs which connect it to other threads with linked lists, so FreeRTOS doesn't do any allocation itself but doesn't impose any hardcoded bounds. And since you know your application will only have one instance of that thread, it can be statically allocated and the linker will guarantee there's enough RAM for it.

In terms of the application's call graph, you want to move the allocations (and therefore the possibility of allocation failure) as far away from the leaf functions as possible. Just do a few big allocations at a high level where it's easier to unwind. Leaf functions include the OS and the language's standard library and logging functions etc, so you really need them to be designed to not do dynamic allocation themselves, otherwise you have no hope of making this work.

The C++ standard library is bad at that, but the language gives you reasonable tools to implement your own statically-allocated containers (in particular using templates for parameterised sizes; it's much more painful in C without templates). From an extremely brief look at Zig, it appears to have similar tools (generics with compile-time sizes) and at least some of the standard library is designed to work with memory passed in by the caller (and the rest lets the caller provide the dynamic allocator). Rust presumably has similar tools, but I get the impression a lot of the standard library relies on a global allocator and has little interest in providing non-allocating APIs.

It's not always easy to write allocation-free code, and it's not always the most memory-efficient (because if your program uses objects A and B at non-overlapping times, it'll statically allocate A+B instead of dynamically allocating max(A,B)), but sometimes it is feasible and it's really nice to have the guarantee that you will never have to debug an out-of-memory crash. And even if you can't do it for the whole application, you still get some benefit from making large parts of it allocation-free.

(This is for code that's a long way below "complex applications" from a typical Linux developer's perspective. But nowadays there's still a load of development for e.g. IoT devices where memory is limited to KBs or single-digit MBs, implementing complicated protocols across a potentially hostile network, so it's a niche where a language that's nicer and safer than C/C++ but no less efficient would be very useful.)

Zig heading toward a self-hosting compiler

Posted Oct 8, 2020 22:33 UTC (Thu) by roc (subscriber, #30627) [Link]

Yes, not allocating at all is a good option for some applications --- probably a lot more applications than those for which "check every allocation" is suitable.

There is a growing ecosystem of no-allocation Rust libraries, and Rust libraries that can optionally be configured to not allocate. These are "no-std" (but still use "core", which doesn't allocate). https://lib.rs/no-std

Rust const generics (getting closer!) will make static-allocation code easier to write in Rust.

Zig heading toward a self-hosting compiler

Posted Oct 8, 2020 4:03 UTC (Thu) by ofranja (guest, #11084) [Link] (11 responses)

Two caveats:

1. You very correctly noted that the Rust std library is the issue, not the language, but then went on to incorrectly say that it makes the situation hard to handle.

The Rust std library panics on OOM, and a panic can itself be handled. If unwinding is not desirable, not using the std library collections is a valid approach: there are alternative libraries providing fallible allocation.

There is also an RFC for adding support for fallible allocations to the std library. So this is a very weak point against it.

2. Adding features is not that easy. You can't retrofit an ownership system a la Rust on a pre-existing language without discarding or rewriting entire code bases.

Overall, Zig has a compelling simplicity and brings some improvements over C. Unfortunately, it may need more than a few improvements to justify a transition to a new language.

Zig heading toward a self-hosting compiler

Posted Oct 8, 2020 15:38 UTC (Thu) by khim (subscriber, #9252) [Link] (10 responses)

It's funny that you say “you can't retrofit an ownership system a la Rust on a pre-existing language without discarding or rewriting entire code bases” and then turn around and say “there are alternative libraries providing fallible allocation — so this is a very weak point against it.”

How much Rust code can you use if you give up on the standard library? No, really, that's a valid question. Maybe I have misunderstood how the Rust ecosystem works, but I was under the impression that the majority of crates assume that you use the standard library and thus would be unusable if you decide to go with the “alternative libraries providing fallible allocation”.

And if it's OK for you to just take those libraries and redo everything from scratch… then it's OK for you to rewrite everything after you add a borrow checker to the language.

That being said, I'm not entirely sure Zig took the right approach by starting from C. I think an “OOM-safe Rust” would have been better. But that would have raised the barriers to entry significantly, and I'm not even sure Rust's approach to ownership makes sense in an OOM-safe language!

Because the simplest way to handle OOM is with arenas. But then you don't need to track ownership so strictly: if a certain set of objects is destined to disappear as a set… then you don't care who owns what within that set. Links between these objects don't need the parent/child model… but you still need something like it for relationships between arenas.

Only time will tell if Zig made the right call or not. For now… I'm not ready to jump. But I will be looking.

Zig heading toward a self-hosting compiler

Posted Oct 8, 2020 18:20 UTC (Thu) by mathstuf (subscriber, #69389) [Link] (3 responses)

> It's funny that you say “you can't retrofit an ownership system a la Rust on a pre-existing language without discarding or rewriting entire code bases” and then turn around and say “there are alternative libraries providing fallible allocation — so this is a very weak point against it.”

One is a language feature that affects the validity of innumerable APIs with relevant data that is only stored (at best) in comments. The other is a library extension. I wouldn't call them comparable. You certainly can't add *any* given feature to a language (without basically telling everyone to go and rewrite their code). Could C++ get lifetime tracking? Sure. Could it do it without upending oodles of code? Very unlikely.

> How much Rust code can you use if you give up on the standard library? No, really? That's valid question

There are a number of crates which have a `no_std` feature which turns off the standard library. Some functionality is lost. There is an underlying `core` library which I think is basically not removable (as it's where compiler and language primitives tend to have their roots).

Here's a list of crates which support modes of being compiled without the standard library: https://lib.rs/no-std

Zig heading toward a self-hosting compiler

Posted Oct 8, 2020 23:43 UTC (Thu) by khim (subscriber, #9252) [Link] (2 responses)

> I wouldn't call them comparable.

What's the difference? Both mean that you need to abandon basically the entire code collection for a given language and start more-or-less from scratch.

> You certainly can't add any given feature to a language (without basically telling everyone to go and rewrite their code)

You certainly could. Rust does it all the time. C++ does it less frequently. Heck, even C does it. You couldn't remove features, though; that's different.

> Here's a list of crates which support modes of being compiled without the standard library: https://lib.rs/no-std

About what I expected: 492 crates out of 48,264. And, most likely, mostly on the simplistic side.

So, basically, literally 99% of the codebase becomes unavailable to you if you want to handle OOM… at that point, it's not materially different from switching to a new language. Even a language 50-70 times less popular than Rust would give you more choices to pick from.

Zig heading toward a self-hosting compiler

Posted Oct 9, 2020 7:49 UTC (Fri) by laarmen (subscriber, #63948) [Link]

The list is certainly not exhaustive, as it's based on user-defined tags for each crate. For instance, both the "serde" and "nom" crates (the main (de)serialization crate and a parser-combinator library, respectively) can be used (with a reduced feature set) in a no_std environment, yet neither appears on the list.

Zig heading toward a self-hosting compiler

Posted Oct 9, 2020 13:11 UTC (Fri) by mathstuf (subscriber, #69389) [Link]

> Both mean that you need to abandon basically all the codebase collection for a given language and start more-or-less from scratch.

Having an allocation-failure-resilient standard library wouldn't require me to rewrite code using the current standard library. Certainly not with Rust's stability guarantees.

> You certainly could. Rust does it all the time. C++ does it less frequently. Heck, even C does that. You couldn't remove — that's different.

I said *any*. Sure, specific features get added all the time. Others get rejected all the time (even if they'd be nice to have) because they'd disrupt way too much.

> About what I expected. 492 crates out of 48,264. And, most likely, mostly on simplistic side.

You have to balance this against the interest involved here. There's enough to build *something* at least. Where that line is? Not sure, but it's almost certainly not at a level that is effectively useless.

Zig heading toward a self-hosting compiler

Posted Oct 8, 2020 23:05 UTC (Thu) by ofranja (guest, #11084) [Link] (5 responses)

> How much Rust code can you use if you give up on the standard library?

There are a number of crates with support for "no_std", and that's usually an assumption if you're doing embedded programming.

You could ask the same thing about C++, by the way, and the answer would be the same. I personally use the STL very sparingly; not at all in my last C++ project.

> Maybe I misunderstood how Rust ecosystem works [..]

Rust has "core" and "std"; "core" is fine since it does not allocate (it doesn't even know what a heap is), "std" is the high-level library.

> I think “OOM-safe Rust” would have been better. [..]

Rust *is* OOM-safe. The only thing that allocates is the std library, which is already optional and will likely have support for fallible allocation in the near future.
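
To give a flavor of what that looks like, here is a sketch using the `try_reserve` API (still unstable at the time of this discussion, so treat the details as provisional):

    use std::collections::TryReserveError;

    // Vec::with_capacity() aborts the process on allocation failure;
    // try_reserve() instead reports failure as an ordinary Result,
    // which the caller can handle like any other error.
    fn allocate_buffer(len: usize) -> Result<Vec<u8>, TryReserveError> {
        let mut buf = Vec::new();
        buf.try_reserve(len)?; // Err(...) on OOM instead of aborting
        Ok(buf)
    }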

Again, there are a number of alternatives.

> Because the simplest way to handle OOM is with arenas. [..]

No, the easier way is to do static memory allocation so you never OOM.

On embedded you want everything pre-allocated as much as possible, since managing memory is a cost in itself.

If you have dynamic allocation you need to handle OOM, period. Arenas help with the cost but only push the problem to a different place.
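
To sketch that style in Rust (the type and its capacity are invented for the example): all storage is reserved up front, so there is no runtime allocation that can fail, and hitting the capacity limit is an ordinary, handleable error.

    /// Hypothetical fixed-capacity stack. The whole buffer lives inline
    /// (on the stack or in a static), so it never touches the heap.
    struct FixedStack {
        buf: [u8; 64],
        len: usize,
    }

    impl FixedStack {
        const fn new() -> Self {
            FixedStack { buf: [0; 64], len: 0 }
        }

        // When full, push() returns the rejected byte instead of
        // allocating more space; the limit is handled up front.
        fn push(&mut self, byte: u8) -> Result<(), u8> {
            if self.len == self.buf.len() {
                return Err(byte);
            }
            self.buf[self.len] = byte;
            self.len += 1;
            Ok(())
        }
    }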

> Only time will tell if Zig made the right call or not.

Don't get me wrong, I like some of the ideas from Zig, like treating types as values and using "comptime" for generic programming. Some languages have a similar concept of multi-stage compilation, which is much more ergonomic than macros with clumsy syntax, but as soon as you try to get fancy with types you can step into research-level territory. Zig does not have a really advanced type system, so that's not a problem for now, but it also limits the language's expressiveness.

Last, but not least, you have to consider what the language offers you. As I said before: improving C is one thing, but maybe not enough to justify a new language; a paradigm shift, however, is something much more appealing.

Zig heading toward a self-hosting compiler

Posted Oct 9, 2020 0:12 UTC (Fri) by khim (subscriber, #9252) [Link] (4 responses)

>There are a number of crates with support for "no_std", and that's usually an assumption if you're doing embedded programming.

About 1% of the total. So the already-not-that-popular Rust suddenly becomes 100 times less capable for you.

>You could ask the same thing about C++ by the way, and the answer would be the same.

Yes and no. On one hand C++ is much, much bigger. So even with 1% of the codebase you still have more choice.

On the other hand, C++20 made a total blunder: coroutines, a feature which looks like a godsend for embedded… are built on top of dynamic allocation. Doh.

Sure, you can play some tricks: you can disassemble the compiled code to check whether allocations were actually eliminated, you can look at how much stack is used… but at that point "simply pick another language" starts to look like the more viable long-term approach.

>Rust has "core" and "std"; "core" is fine since it does not allocate (it doesn't even know what a heap is), "std" is the high-level library.

Thanks for the explanation. I knew that some crates are non-allocating; I was just not sure how much actual ready-to-use code you could still use if you give up on "std"… and the answer is about what I expected: 1% or so.

>The only thing that allocates is the std library - which is already optional, and will likely have support for fallible allocation in the near future.

You can't call something which is used by 99% of the codebase "optional". It just doesn't work. That's the mistake D made (and which probably doomed it): the GC was declared "optional" in that same sense, yet the majority of the codebase couldn't be used without it. This meant that some things couldn't be done in the language because the GC is "optional", yet you couldn't, practically speaking, go without it either, because then you would have to write everything from scratch. Thus you got the worst of both worlds.

>No, the easier way is to do static memory allocation so you never OOM.

That's not always feasible. And embedded is not the whole world. I still hate the fact that I can't just open a large file and see a simple "out of memory, sorry" message instead of a frozen desktop that I'm forced to reset because otherwise my system would be unusable for hours (literally: I measured it; between one and about two hours before the OOM killer wreaks enough havoc for the system to react to Ctrl-Alt-Fx and switch to a text console… which by then is no longer needed, because some processes have been killed and the GUI is responsive again).

>If you have dynamic allocation you need to handle OOM, period. Arenas help with the cost but only push the problem to a different place.

Well… arenas make it feasible to handle OOM, but that, by itself, doesn't mean that anyone would actually bother. That's true.

>As I said before: improving C is one thing but maybe not enough for justifying a new language; a paradigm shift, however, is something much more appealing.

Paradigm shift, my ass. How have we ended up in a world where "a program shouldn't randomly die without warning" is considered "a paradigm shift", I wonder?

Zig heading toward a self-hosting compiler

Posted Oct 9, 2020 2:19 UTC (Fri) by ofranja (guest, #11084) [Link]

By paradigm shift I meant a safe-by-default language without GC.

I already addressed your points in my other comments and I don't feel like repeating myself - especially to someone being rude and sarcastic - so let's just agree to disagree.

Zig heading toward a self-hosting compiler

Posted Oct 9, 2020 13:20 UTC (Fri) by mathstuf (subscriber, #69389) [Link] (2 responses)

> I still hate the fact that I couldn't just open large file and see simple message “out of memory, sorry” instead of looking on frozen desktop which I'm forced to reset because otherwise my system would be unusable for hours

There's some super-templated code in the codebase I work on regularly that eats 4G+ of memory per translation unit. I've learned to wrap its compilation in a `systemd-run --user` command which limits that command's memory. This way it is always first on the chopping block for exceeding its allocated slot (instead of X, tmux, or Firefox, all of which are far more intrusive to recover from). Of course, this doesn't help with opening large files in existing editors, but I tend to open and close Vim instances all the time, so it'd be workable at least for my usage pattern.
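
Something along these lines (the exact limit is whatever fits the machine; `MemoryMax=` is the cgroup v2 property that systemd enforces):

    $ systemd-run --user --scope -p MemoryMax=6G make -j4

With `--scope`, the command runs in the foreground inside its own transient cgroup, so only it and its children are candidates for the kill when the limit is exceeded.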

Zig heading toward a self-hosting compiler

Posted Oct 10, 2020 16:52 UTC (Sat) by ofranja (guest, #11084) [Link] (1 responses)

> Of course, this doesn't help opening large files in existing editors [..]

Just to clarify the earlier point in the discussion: neither would a language that handles OOM. It makes zero difference, actually: since malloc() never fails on systems with overcommit enabled, there is no OOM for the program to handle.

There are a few solutions to the problem. The one you mentioned works, but it depends on the program's behaviour, and it can still lead to OOM if other programs need more memory than expected. The most general one - short of disabling overcommit - is to disable swap and set a minimum amount of page cache to keep in memory. When memory runs out, the system will kill something instead of thrashing, since there would be no pages left to reclaim.
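
(For reference, the "disabling overcommit" option set aside above is a one-line sysctl, though it has side effects on programs that rely on large sparse mappings:)

    # strict accounting: malloc() can actually fail instead of overcommitting
    $ sysctl -w vm.overcommit_memory=2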

Zig heading toward a self-hosting compiler

Posted Oct 10, 2020 23:04 UTC (Sat) by khim (subscriber, #9252) [Link]

>When memory runs out the system will kill something instead of trash away, since there would be no pages left to allocate.

It's not useful. On my desktop with 192GB of RAM it takes between two and three hours before the system finally comes back. And quite often the whole thing ends up useless anyway, because some critical process of the modern desktop becomes half-alive: it continues to run but no longer responds to D-Bus requests.

You can't do that with today's desktop, period.

You can build a series of kludges to make your life tolerable (running the compilation in a cgroup is one way to prevent an OOM situation for the whole system), but you can't do what you could with the humble Turbo Pascal 7.0: open files until memory runs out, then close some when the system complains.

You have to buy a system big enough to handle all your needs and keep an eye on it so it doesn't become overloaded.

This works because today's systems are ridiculously overprovisioned compared to what Turbo Pascal 7.0 usually ran on… it just looks a bit ridiculous…


Copyright © 2020, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds