Python cryptography, Rust, and Gentoo

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 0:39 UTC (Thu) by BirAdam (guest, #132170)
Parent article: Python cryptography, Rust, and Gentoo

This mania for “memory safety” isn’t necessarily bad, but the Rust people are making me hate everyone who complains about C’s lack of memory safety.

First, Rust solves one problem and adds 3 more. It adds backward compatibility breaks. It isn’t as bad as Python at this, but then the Python people are not advocating Python as a systems language. C’s one great strength is that C code is C code. It tends to just keep working over time. The second added problem is precisely this one. Rust is being promoted as a systems language when it doesn’t work on all of the hardware needed by a systems language. The third major issue is that Rust has the cargo system as part of its standard use model. This encourages bad behavior. I do not care how “memory safe” your language is if people regularly include unvetted code from some repo.

The final point that I have yet to hear properly explained is why C is good enough to write other languages in, but not okay for others to use. You’re either admitting that other programmers are “talented enough” to use C and that you are not, or you’re just pawning responsibility off on someone else because you’re too lazy to properly do your job. Either way, C as a tool is blameless of programmer error.

(btw, I know that Rust is not written in C; the compiler was initially written in OCaml and then rewritten in Rust. I’m just making a point about the constant screaming that “C is bad because everyone knows C is bad”.)


Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 0:59 UTC (Thu) by Paf (subscriber, #91811) [Link] (7 responses)

“The final point that I have yet to hear properly explained is why C is good enough to write other languages in, but not okay for others to use. You’re either admitting that other programmers are “talented enough” to use C and that you are not, or you’re just pawning responsibility off on someone else because you’re too lazy to properly do your job. Either way, C as a tool is blameless of programmer error.”

Come on, you know better. The idea is that it is sometimes necessary or desirable but *difficult and time consuming* to write well/safely/securely in C. The goal here is to reduce the amount of code that needs to be written in it, and I think that’s reasonable. (I’m a file system and kernel dev who makes his living in C, btw.)

I love working in C, I love the simplicity and feeling of precision. But it’s been amply demonstrated that humans are not great at getting memory allocation and pointer arithmetic, etc, right, and if we can remove that as a problem for more code, then that’s desirable. And yeah, sure, some developers are better at it than others. But why should we make anything harder unless we need to?

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 5:12 UTC (Thu) by marcH (subscriber, #57642) [Link] (6 responses)

> I love the simplicity and feeling of precision

Emphasis on "feeling"

https://queue.acm.org/detail.cfm?id=3212479

C was a great low-level language - for the PDP-11

Posted Feb 12, 2021 11:40 UTC (Fri) by sdalley (subscriber, #18550) [Link] (5 responses)

That was a *really* good article!

The increasingly mind-boggling and foot-shooting complexity of modern C compiler optimizations is the clearest evidence one could wish for that C is not "close to the metal" of any modern mainstream processor. Like a tree growing on top of a pile of buried scrap metal, modern architectures and compilers have had to distort and twist themselves to grow around the need of preserving the illusion that they have flat memory, fixed registers, pointer arithmetic and sequential operation.

What would a useful modern low-level language that treats vectors, co-processors, threads, segments, references and caches as first-class objects look like?

C was a great low-level language - for the PDP-11

Posted Feb 12, 2021 12:17 UTC (Fri) by pizza (subscriber, #46) [Link] (2 responses)

The reasons compilers are so re-writingly complex is the same reason that modern CPUs are so re-writingly complex: squeezing every last drop of performance out of _existing_ code.

After the top-line price, raw performance is the only thing that folks actually care about.

(Granted, the tide has begun to shift slightly in favor of "security", but given the choice, folks will choose "faster" over "more secure"... every. single. time.)

C was a great low-level language - for the PDP-11

Posted Feb 12, 2021 17:35 UTC (Fri) by anselm (subscriber, #2796) [Link] (1 responses)

> The reasons compilers are so re-writingly complex is the same reason that modern CPUs are so re-writingly complex: squeezing every last drop of performance out of _existing_ code.

Also, humans have a better chance of writing working (let alone efficient) code if they don't need to think about “vectors, co-processors, threads, segments, references and caches as first-class objects”. We have compilers so we don't need to worry about all of those (the vast majority of us who aren't working on actual compilers, anyway).

C was a great low-level language - for the PDP-11

Posted Feb 12, 2021 22:08 UTC (Fri) by marcH (subscriber, #57642) [Link]

For the PDP-11, C provided an outstanding trade-off: user-friendly programming concepts that mapped really well to the hardware.

While these concepts don't map with the hardware anymore, they stayed familiar and their programmer-friendliness has indeed not regressed. But it hasn't progressed either.

It is a very sad vicious circle to see that programming concepts and hardware keep meeting in a place that does not exist any more. Something like "retpoline" is the absolute irony: still meeting the hardware in that old, fictional place BUT with the knowledge of what hardware really does behind the scenes AND the intention to defeat that! Multiple layers of masquerading; what a carnival.

It's fantastic to see that a new crop of programming languages is at least trying to evolve a bit.

http://worrydream.com/#!/TheFutureOfProgramming (Bret Victor)

C was a great low-level language - for the PDP-11

Posted Feb 15, 2021 9:46 UTC (Mon) by anton (subscriber, #25547) [Link] (1 responses)

The referenced article is not particularly good, just a hodgepodge of pet peeves.

As for the complexity of gcc and clang/LLVM, it is an indication that they have too much budget and want to produce good benchmark results (at the cost of worse usability) to justify that (admittedly they are also doing things that help usability, but they could do that without doing the other nonsense).

As for flat memory and caches (and, mentioned in the paper, cache coherency protocols), that is indeed hardware architecture for speeding up existing software written for a simple memory model, plus being able to run processes with large memory needs. Hardware architects needed a long time to get here, and tried to throw the complexity over to programmers the whole time (and are still doing it, with weak memory consistency): Instead of caches, they wanted us to manage fast memory by software, with the most recent instance being the SPEs of the Cell Broadband Engine (used in the PlayStation 3). Instead of somewhat consistent shared memory, they would rather have given us distributed memory, with software managing the transfer of data from remote to local memory before processing (supercomputers still have this). All this would make general-purpose programming so much harder that the alternatives with more complex hardware won out. So the architectures provide at least single-threaded programs with a "flat" memory model, and a language that reflects that memory model with, e.g., address arithmetic is a sensible low-level language for that (but note that C as understood by the gcc and clang maintainers is not such a language).

Segments are what I first thought of when you mentioned "flat memory". This has been pretty much eliminated as an architectural (mis)feature (and where it is present, it has not been used for a while); having it in an architecture costs extra hardware, and costs extra in software. As to how a low-level language would look that supports it, look at the C standard; it includes many restrictions that cater for these kinds of architectures; and these days the gcc and clang maintainers use these restrictions as justification for miscompiling programs on architectures with flat memory.

As for register renaming (vs. "fixed registers"), Intel has spent billions on IA-64 aka Itanium based on the idea that compilers could rename "fixed registers" and reorder instructions better than the hardware can. In the end it turned out that the hardware with register renaming performs better for most software. The IA-64 approach would also have required more complex compilers to perform well, and the Itanium CPUs are also quite power-hungry even without a register renamer.

Vectors as first-class objects: Look at APL, J, or FP, although I would not call these languages low-level. Still, Backus was not pleased with architecture and programming languages and proposed FP as an alternative programming model. But despite Backus' standing and his high-profile presentation of his critique and alternative, FP/FL have not seen mainstream success nor taken the functional programming community by storm.

On a completely different track, you can look at GNU C's vector extensions, which are pretty low-level.

As for threads, we have seen SMT in mainstream CPUs since 2002 and multi-core CPUs in the mainstream since 2005. The low-level approaches to that have been pthreads and the C++ memory model, but they are hard to program with. By contrast, Unix pipes (a high-level concept) let me use multiple cores or hardware threads without particular effort (but typically only for rather limited amounts of parallelism).
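
To make the point about pipes concrete, here is a minimal sketch of a two-stage pipeline in which each stage is a separate process connected by a pipe, so the operating system is free to schedule the stages on different cores. It is only an illustration, not anything from the original comment: the workload (summing squares) and the names `PRODUCER`, `CONSUMER` and `pipeline_sum_of_squares` are made up for this example, and how much parallelism you actually get depends on the scheduler and on how evenly the work is split between the stages.

```python
import subprocess
import sys

# Stage 1 (hypothetical workload): print the integers 0..n-1, one per line.
PRODUCER = "import sys\nfor i in range(int(sys.argv[1])): print(i)"
# Stage 2: read integers from stdin and print the sum of their squares.
CONSUMER = "import sys\nprint(sum(int(line) ** 2 for line in sys.stdin))"


def pipeline_sum_of_squares(n: int) -> int:
    """Run producer | consumer as two processes joined by a pipe."""
    producer = subprocess.Popen(
        [sys.executable, "-c", PRODUCER, str(n)],
        stdout=subprocess.PIPE,
    )
    consumer = subprocess.Popen(
        [sys.executable, "-c", CONSUMER],
        stdin=producer.stdout,
        stdout=subprocess.PIPE,
        text=True,
    )
    # Drop the parent's copy of the pipe's read end; the consumer keeps its own.
    producer.stdout.close()
    out, _ = consumer.communicate()
    producer.wait()
    return int(out)


if __name__ == "__main__":
    print(pipeline_sum_of_squares(10))
```

No locks or memory-ordering reasoning appear anywhere in the sketch; the only shared state is the byte stream in the pipe, which is also why the available parallelism is limited to the number of stages.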

Occam is a programming language for programming distributed-memory multiprocessors (but even on shared-memory machines, each thread could get its private memory, limiting the memory ordering headaches to the implementation of communications). I think that one other thing that the transputers and Occam did right was to make thread creation, destruction and communications very cheap, so finding the right granularity of parallel processing was not as critical as on current mainstream stuff. Still, I don't see these aspects of Occam being picked up in the mainstream, so maybe they are not as important as I think.

Overall, the problem of making good use of many threads with little burden on the programmers is still unsolved, and that's why architectures with lots of slow threads have not found mainstream success.

C was a great low-level language - for the PDP-11

Posted Feb 15, 2021 12:47 UTC (Mon) by excors (subscriber, #95769) [Link]

> Instead of caches, they wanted us to manage fast memory by software, with the most recent instance being the SPEs of the Cell Broadband Engine (used in the PlayStation 3). Instead of somewhat consistent shared memory, they would rather have given us distributed memory, with software managing the transfer of data from remote to local memory before processing (supercomputers still have this). All this would make general-purpose programming so much harder that the alternatives with more complex hardware won out.

On the other hand GPGPU has risen in popularity, and that often does require the programmer to explicitly handle distributed memory. In OpenCL terminology you have host memory (the system RAM shared with the CPU), global memory (VRAM), local memory (shared by a large group of work-items), and private memory (basically the register file for a single work-item, though with some sharing between nearby work-items). You have to declare where all your data will live in that hierarchy, and write code to copy it between different levels, and partition your work-items to be in the same group/subgroup when they need to share data efficiently, and that can have a massive effect (maybe 1-2 orders of magnitude) on performance.

For serious number-crunching, GPUs won out over CPUs, which I suspect is because their memory model is much more scalable than the CPU's illusion of consistent shared memory, *and* they have a programming model that makes it relatively easy to exploit that memory model (by running many thousands of parallel threads so the programmer can usually ignore memory latency and branch latency - even if 90% of threads are stalled, there's enough runnable threads to keep all the ALUs busy or to saturate memory bandwidth - and by having just enough sharing between threads so they can coordinate on non-trivially-parallelisable problems).

As far as I can see, Cell was somewhere in the middle: it had GPU-like memory (8 SPEs with 256KB of local memory, and 2KB of private memory (/registers) split between 4-16 work-items (/SIMD lanes)) but it had a more traditional CPU-like programming model (just a single thread per SPE, running SIMD instructions, but even worse than regular CPUs at branches). The problem wasn't the distributed memory model, the problem was that it didn't commit hard enough in either direction and so it was beaten by GPUs on one side and traditional CPUs on the other side.

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 1:58 UTC (Thu) by Cyberax (✭ supporter ✭, #52523) [Link] (1 responses)

> The final point that I have yet to hear properly explained is why C is good enough to write other languages in
The Rust compiler is written in Rust (although there's an alternative incomplete reimplementation in C++ for bootstrapping).

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 7:48 UTC (Thu) by rsidd (subscriber, #2582) [Link]

You didn't read the full comment. Last para.

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 2:01 UTC (Thu) by marcH (subscriber, #57642) [Link] (58 responses)

> I do not care how “memory safe” your language is if people regularly include unvetted code from some repo.

These are both very serious security issues but unrelated to each other. They seem related only because C sucks at both safety _and_ code re-use.

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 12:23 UTC (Thu) by LtWorf (subscriber, #124958) [Link] (57 responses)

There are lots of C libraries.

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 12:45 UTC (Thu) by rahulsundaram (subscriber, #21946) [Link] (56 responses)

> There are lots of C libraries.

Yes there are some but unlike Rust or Python, there is no single place you can go to look for them and the tooling around installing or updating the dependencies isn't as straightforward.

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 12:55 UTC (Thu) by k3ninho (subscriber, #50375) [Link]

Ex-kernel developer Rusty Russell's library of patterns, the Comprehensive C Archive Network (CCAN), didn't achieve ubiquity.
 
K3n.

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 18:54 UTC (Thu) by logang (subscriber, #127618) [Link] (54 responses)

apt search xyz
apt install libxyz

How is that not straightforward?

The difference between C here and other languages isn't in the straightforwardness of finding and installing libraries but the difficulty of publishing them. Getting a library into PyPI/whatever requires zero effort and there is zero quality control. Getting a library into a distribution is a lot harder and as a result the C libraries there tend to be of a higher quality; but the cost of this is that there are fewer choices.

However, I believe this is a good thing. No serious programmer should be choosing to depend on tiny and marginally maintained libraries that often don't care one whit about breaking their consumers. This can create very serious headaches down the road. Thought and care should be put into every dependency. Just because it's trendy these days to do otherwise, doesn't mean it's a good idea.

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 19:31 UTC (Thu) by roc (subscriber, #30627) [Link] (34 responses)

Depending on distro libraries is a nightmare for developers. It creates so many problems.

When I make my software depend on a distro library, I now have to worry about:
-- Adding a step before the build that makes sure the library package is installed, e.g. providing instructions *per-distro* to install it manually, making my software harder to build
-- For distros that don't package the library (or package a version of it that's older than I need), providing instructions to build and install that library manually, making my software even harder to build
-- Making sure my software builds and runs with a range of library versions packaged by different distros and distro versions, potentially packaged in different ways with different directory layouts etc across distros
-- On platforms like Windows, iOS and Android (i.e. where almost all users are), where users cannot or will not build the software themselves and I need to provide binaries, and there definitely will not be a "distro package" I can use, I need to vendor the library myself anyway

Once I vendor the library for Windows/mobile, it's usually easier to just use that for Linux too. This is why big projects like Firefox/Chrome vendor everything.

Example: rr uses Capnproto, BLAKE2 and brotli, but we only depend on Capnproto as an external library; we vendor BLAKE2 and brotli. Even the single Capnproto dependency is horrible to deal with. For example we want to distribute rr binaries that work across distros, which means we want the rr build to support static linking of Capnproto, but many distro Capnproto packages don't support static linking.

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 20:18 UTC (Thu) by logang (subscriber, #127618) [Link] (25 responses)

Many of these problems already have solutions and haven't really been that big of a deal in the past. autoconf and CMake have existed for a long time.

It's hard to avoid the issues with Windows/iOS/Android but if you're developing such an application you are probably not using C. Windows has always had a hellish story for libraries.

Firefox, for one, is distributed by many in an unvendored form.

But the overall theme is that the libraries you are using (or not) need help. If a library you want to use is good, and well maintained but not packaged, help them package it. If an algorithm isn't in a library, find an existing library that is a good fit and add to it (or, in the worst case start a new library, preferably that contains a lot more than just one algorithm). Or maybe the benefits of the latest and greatest compression algorithm are outweighed by older ones due to their accessibility. Develop with library versions that are commonly available, not the latest and greatest. Wait for features to mature (and possibly help them mature) before depending on them. If distros don't package a static library of something, send a patch so they can. Ultimately doing all this work allows you to write software that can be included in a distro and that should be the long term goal that is by far the easiest for all your users and easiest for the people that end up maintaining your software.

There is an awfully large amount of well written C software that has been written this way, has stood the test of time and will likely be around for a long time to come.

Yes, this can take more time and may mean you have to do more work in the short term, or wait for new features to percolate through the process. But the long term end result is a more sustainable ecosystem with a lot less work over the entire community. Vendoring something might make less work for you in the moment, but is more work for other people (or even your future self) down the line and doesn't solve anything for other people with the same problems as you.

If you want to write brittle broken software that needs constant attention and maybe doesn't even work at all in a few years, then yes, go ahead and keep doing things this way. Those that engineer things properly will still be around, still making constant progress.

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 20:36 UTC (Thu) by Cyberax (✭ supporter ✭, #52523) [Link]

> Ultimately doing all this work allows you to write software that can be included in a distro and that should be the long term goal that is by far the easiest for all your users and easiest for the people that end up maintaining your software.
Why should mutilating your development process in order to conform to arbitrary distro whims be your goal?

A goal of an application developer is to provide value to users. Around 99% of users use iOS/Android/Windows/macOS, not classic distros.

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 21:29 UTC (Thu) by roc (subscriber, #30627) [Link]

>If a library you want to use is good, and well maintained but not packaged, help them package it.

For all distros that any of my users might conceivably use? And then wait for several years for users to actually update to distro versions where the new package is present? No, that is completely unreasonable.

It's actually kind of breathtaking what you're suggesting here --- become a member of many different distro communities, learn all their different processes, persuade all of them to accept the library (what if they don't?), and stay engaged long term. All to avoid vendoring one library. I doubt there is a single person who has ever done this.

> Or maybe the benefits of the latest and greatest compression algorithm are outweighed by older ones due to their accessibility. Develop with library versions that are commonly available, not the latest and greatest. Wait for features to mature (and possibly help them mature) before depending on them.

Yes, creating worse performing, less capable software is definitely an option. I prefer not to.

> If you want to write brittle broken software that needs constant attention and maybe doesn't even work at all in a few years, then yes, go ahead and keep doing things this way.

Your preferred approach "needs constant attention and may not even work at all in a few years" --- you require me to pay constant attention to how distros are packaging my dependent libraries and regularly contribute to that process. In fact, because bugs are found and requirements change, any project with external dependencies requires ongoing attention.

My main project Pernosco is in Rust, has tons of dependencies (because it does a lot), and Rust+cargo have done a great job of managing those dependencies over the last five years. I am happy to keep on doing it this way.

Python cryptography, Rust, and Gentoo

Posted Feb 12, 2021 14:58 UTC (Fri) by MrWim (subscriber, #47432) [Link] (21 responses)

> But the overall theme is that the libraries you are using (or not) need help.

I agree. I think distro library management incentivises working around bugs, while cargo* incentivises helping upstream libraries.

I don't see this point brought up very often when discussing cargo, but I consider it to be one of the principal advantages of cargo.

> Develop with library versions that are commonly available, not the latest and greatest.

I think this is a sensible approach if you limit yourself to libraries, dynamically linked and available in distros. However I think it demonstrates how distro package managers incentivise *not* helping the libraries you're using.

Imagine you're writing some code and you come across a bug in a library you're using. You can choose to fix the bug upstream, or you can choose to work around it in your downstream code. With cargo you clone your dependency's git repo, fix the bug, push the change to a pull request upstream and update your Cargo.toml dependency to point at your new git revision with:

mydep = { git = "https://github.com/me/mydep.git", rev = "9f35b8e" }

You can leave it pointing at that specific revision until the upstream makes a new release at which point you update your Cargo.toml back to:

mydep = "3"
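
For readers unfamiliar with Cargo's requirement syntax: a bare `"3"` is a caret requirement, which accepts any release that is at least 3.0.0 and below 4.0.0, so the next compatible upstream release is picked up without further edits. The sketch below models that rule in Python purely as an illustration. It is not Cargo's implementation; `caret_bounds` and `caret_matches` are hypothetical names, and pre-release versions and the other requirement operators are deliberately ignored.

```python
def caret_bounds(req: str):
    """Return (lower, upper) version tuples for a Cargo-style caret requirement.

    The upper bound bumps the leftmost non-zero component that was given,
    or the last given component if all of them are zero.
    """
    parts = [int(p) for p in req.split(".")]
    lower = tuple(parts + [0] * (3 - len(parts)))
    bump = next((i for i, p in enumerate(parts) if p != 0), len(parts) - 1)
    upper = tuple(list(lower[:bump]) + [lower[bump] + 1] + [0] * (2 - bump))
    return lower, upper


def caret_matches(req: str, version: str) -> bool:
    """True if a full MAJOR.MINOR.PATCH version satisfies the requirement."""
    lower, upper = caret_bounds(req)
    candidate = tuple(int(p) for p in version.split("."))
    return lower <= candidate < upper


if __name__ == "__main__":
    print(caret_bounds("3"))            # bounds for the requirement above
    print(caret_matches("3", "3.4.1"))  # a later 3.x release
    print(caret_matches("3", "4.0.0"))  # the next major release
```

The pinned `rev = "9f35b8e"` form, by contrast, names exactly one commit, which is why it has to be switched back by hand once upstream ships a release containing the fix.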

Fixing the bug (or adding the feature) upstream is the path of least resistance. Once you do it, others who are using the library can benefit at the time of their choosing. In my mind, many small fixes like this **are** the maturation process.

Now what's the process with distro package managers? You're working on your new feature for your software. You come across a bug. You fix it upstream, you wait for it to get accepted upstream, you wait for upstream to make a new release and then you wait a few years for it to get into enterprise distros. Then you upgrade your infrastructure to a new major distro version, and only then can you deploy your new software that depends on this bug-fix/feature to get it in the hands of your users.

No, waiting, waiting and waiting is not going to fly. You want to help upstream but depriving your users of the new feature in your software for years is too high a cost to pay. So you work around the bug in your software and maybe if you've got time left over you also submit a fix upstream.

> Wait for features to mature (and possibly help them mature) before depending on them

I think this is the crux of my argument. cargo makes it easy to help features mature. Limiting yourself to distro repos means you have to wait for them to mature.

* possibly other language package managers too, but I'm not sure. I think cargo is best-in-class in this regard and some of its advantages may not apply to non-compiled languages/languages that don't statically link.

Python cryptography, Rust, and Gentoo

Posted Feb 12, 2021 15:21 UTC (Fri) by MrWim (subscriber, #47432) [Link]

Another way cargo encourages upstream collaboration is standardisation. I believe that the biggest barrier to open-source contribution is actually getting the software built in the first place. It's generally easy with rust, because it's always the same, and because the compilation model and cargo seem well designed. Check out the source code and:

cargo build

cargo takes care of finding and building the required dependencies. When you want to test your change it's cargo test. Finding the git repo for a dependency is easy too. It's linked to from its page on crates.io.

Note that nothing I've said above is related to rust as a language, it's all about tooling, but most importantly the culture of the rust community.

Python cryptography, Rust, and Gentoo

Posted Feb 12, 2021 21:35 UTC (Fri) by roc (subscriber, #30627) [Link]

This is a really good point.

Python cryptography, Rust, and Gentoo

Posted Feb 13, 2021 14:16 UTC (Sat) by LtWorf (subscriber, #124958) [Link] (18 responses)

> Imagine you're writing some code and you come across a bug in a library you're using. You can choose to fix the bug upstream, or you can choose to work around it in your downstream code.

You can also send the patch to the distribution directly, or send it to both parties.

> Fixing the bug (or adding the feature) upstream is the path of least resistance.

You claim that the work of:

* forking
* fixing
* making a pull request to upstream
* going through multiple rounds until your patch is good enough to be included upstream and respects their standard of quality
* monitoring upstream's releases to know when a new release with your fix is out
* change your dependencies back to use upstream

is the path of least resistance

LOL.

It isn't. Want to know what people will do? Fork, patch, and point forever to their out of date fork.

Now THAT is the path of least resistance, it only includes 2 of the steps of the previous list. Of course now all this software might contain security vulnerabilities that will never be fixed.

> Now what's the process with distro package managers?

For a bugfix you can patch a package directly in the distribution.

> You fix it upstream,

Or directly downstream, as I said.

> No, waiting, waiting and waiting is not going to fly.

You assume that distributions and upstream projects are maintained by members of 2 different races. Distribution maintainers can be fast, and upstream maintainers can take months to reply. It depends entirely on the specific project.

Also you are saying loads of incorrect things and forgetting that distributions can and do patch bugs out.

> I don't see this point brought up very often when discussing cargo, but I consider it to be one of the principal advantages of cargo.

As we have seen, your entire assumption of what the "path of least resistance" is, was completely wrong. So was the conclusion :)

Python cryptography, Rust, and Gentoo

Posted Feb 13, 2021 14:42 UTC (Sat) by mathstuf (subscriber, #69389) [Link] (2 responses)

> You can also send the patch to the distribution directly, or send it to both parties.

Not all patches should have this done. For example, those which change API are certainly not eligible for direct distro inclusion (IMO). Upstream should have a look before someone else ships a new API in their name for sure. Even for bugfixes, I don't know if my patch is an X/Y problem and that I'm actually patching a symptom and not a root cause. Upstream can certainly help improve these patches better than packagers (on average).

> You claim that the work of: … is the path of least resistance

IME? Yes. Because things like PyPI, crates.io, etc. make releases so easy, once it is in, the release shouldn't be *too* hard. Because I can't publish *my* crate to crates.io while pointing to my fork (unless I publish it as a crate of its own on crates.io, but that requires renaming due to collisions…which is then *more* work on my consuming side).

> For a bugfix you can patch a package directly in the distribution.

"the distribution". As if there's only one.

> Distribution maintainers can be fast, and upstream maintainers can take months to reply.

What does this have to do with anything? The reverse is also certainly possible.

Python cryptography, Rust, and Gentoo

Posted Feb 13, 2021 15:45 UTC (Sat) by LtWorf (subscriber, #124958) [Link] (1 responses)

> those which change API are certainly not eligible for direct distro inclusion (IMO).

Those are not eligible to be accepted anywhere.

> Upstream can certainly help improve these patches better than packagers (on average).

There is an amount of software that distribution maintainers fork and become the "new upstream" because the actual upstream completely abandoned the project.

Yes, upstream people abandon projects all the time. See Python 2 in Red Hat.

> IME? Yes. Because things like PyPI, crates.io, etc. make releases so easy, once it is in, the release shouldn't be *too* hard.

You can just point to your commit forever. Your software certainly wouldn't break.

> "the distribution". As if there's only one.

Uhm distributions share patches with each other.

> What does this have to do with anything? The reverse is also certainly possible.

It is possible, but you presented it as the only existing possibility.

Python cryptography, Rust, and Gentoo

Posted Feb 13, 2021 22:56 UTC (Sat) by mathstuf (subscriber, #69389) [Link]

> Those are not eligible to be accepted anywhere.

APIs do change. I'm including things like "add a new enum variant for some new OpenSSL feature" kind of API changes in this. These patches certainly have a place, just not in some distro-specific patch (woe be unto anyone relying on distro packages being representative of upstream decisions in this case). See https://lwn.net/Articles/845448/ for a real-world case of this happening.

> There is an amount of software that distribution maintainers fork and become the "new upstream" because the actual upstream completely abandoned the project.

Why would I select such a project for a new dependency? All you're left with is projects that now need to port off of it (at least that would be my decision assuming there wasn't a distro-agnostic maintenance process set up). Case in point: scrot in Fedora (maintainer here). giblib and scrot were abandoned by upstream. The community picked up scrot, but left giblib alone. giblib starts to FTBFS. I don't want to maintain it; it's just a dependency of a project I do care about and I really don't want questions about it outside of that use. I file an issue upstream to port away from giblib. Still nothing. It's certainly not a patchset I want to maintain. So, scrot is currently dead in Fedora because I explicitly do *not* want to become an upstream.

As for something like Python2, yeah, that'll get some distro pickup. giblib? Not worth my time.

> You can just point to your commit forever. Your software certainly wouldn't break.

Not if I want to publish it anywhere (useful); crates.io requires that crates.io provide all your dependencies. I imagine PyPI is probably similar, but don't know.

> Uhm distributions share patches with each other

As if that's typical or even common (I'd like to see evidence). I've had to hunt down distro patches to our project that never got contributed to us, upstream. If they're not sharing upstream (or even filing issues about what they are patching), why would they share with each other? Granted, things have gotten better, but why must upstream be the one prodding here?

> It is possible, but you presented it as the only existing possibility.

Maybe that's MrWim you're thinking of?

Python cryptography, Rust, and Gentoo

Posted Feb 13, 2021 15:08 UTC (Sat) by Cyberax (✭ supporter ✭, #52523) [Link]

> It isn't. Want to know what people will do? Fork, patch, and point forever to their out of date fork.
This is true. However, clicking a couple of buttons on a web form and submitting a PR is pretty easy.

> You assume that distributions and upstream projects are maintained by members of 2 different races. Distribution maintainers can be fast, and upstream maintainers can take months to reply. It depends entirely on the specific project..
So your users must depend on a whim of an unpaid maintainer for months-to-years? That's a nice model.

> Also you are saying loads of incorrect things and forgetting that distributions can and do patch bugs out.
The other poster actually nails most obvious issues with distros. They simply suck for application writers.

Python cryptography, Rust, and Gentoo

Posted Feb 13, 2021 20:36 UTC (Sat) by roc (subscriber, #30627) [Link] (9 responses)

Submitting changes to distro library packagers instead of upstream would only be my last resort, for the case where upstream is completely unhelpful. (In which case I'll be looking to move off that dependency anyway.) It simply doesn't scale given the number of distributions in use. In fact I have never, ever done this.

What I *have* done, many times, is exactly what MrWim proposed: made local changes to a Rust library via a temporary Cargo [patch], and later submitted those changes upstream --- and had them accepted. The former step is indeed the path of least resistance and lets me make progress in my project. The latter step is justified because there is an ongoing maintenance cost to those patches, so reducing the number of them that we're carrying at any one time pays off long term. The review they get upstream is also valuable. I'm working through this right now at https://github.com/rayon-rs/rayon/issues/562 for example.
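For anyone who hasn't used it: the real mechanism is a `[patch.crates-io]` table in Cargo.toml that points a crate at a local path or git fork. The following is only a toy model of that workflow in Python, not how Cargo itself resolves anything; the crate list and the fork URL are hypothetical placeholders.

```python
# Toy model of "carry a temporary override, then drop it once upstream merges".
# This is NOT Cargo's resolver; names and URLs are hypothetical placeholders.

REGISTRY = "registry"  # stands in for crates.io


def resolve_sources(dependencies, patches):
    """Return {crate: source}, preferring a patched source when one exists."""
    return {crate: patches.get(crate, REGISTRY) for crate in dependencies}


def outstanding_patches(dependencies, patches):
    """List the overrides still being carried, i.e. the ongoing maintenance cost."""
    return sorted(crate for crate in patches if crate in dependencies)


deps = ["rayon", "serde", "libc"]

# While the upstream PR is under review, point one crate at a fork.
patches = {"rayon": "git:https://example.invalid/me/rayon#my-fix"}
print(resolve_sources(deps, patches))
print(outstanding_patches(deps, patches))

# Once upstream has merged and released the change, the override goes away.
patches = {}
print(resolve_sources(deps, patches))
```

The point of tracking the outstanding list is the one made above: every entry in it is a patch someone has to keep rebasing, so the goal is to get it back to empty.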

Python cryptography, Rust, and Gentoo

Posted Feb 14, 2021 22:10 UTC (Sun) by marcH (subscriber, #57642) [Link] (8 responses)

> made local changes to a Rust library via a temporary Cargo [patch], and later submitted those changes upstream --- and had them accepted. The former step is indeed the path of least resistance and lets me make progress in my project. The latter step is justified because there is an ongoing maintenance cost to those patches, so reducing the number of them that we're carrying at any one time pays off long term.

_This_ is "real" open-source: zero boundary between downloading/using/cloning/branching/forking/experimenting = complete freedom. This is why decentralized version control felt liberating. I would even argue that a project still stuck in centralized/medieval version control cannot really be considered open-source because of the added friction. And don't get me started on directories with sometimes long lists of *.patch files... never heard about branches?

Configuring and building C/C++ code at large is a nightmare and Linux distributions have been performing an amazing and critical job there. However to solve this they had to add layer(s) of indirection between software authors and users which adds friction and delays. So it's really not a surprise to see many authors trying to connect directly with their users. Random and recent example:

git clone some_python_project
pip install --editable .
<hack, test, hack, test>
git push new_pull_request

It should never be more complicated than this.

Python cryptography, Rust, and Gentoo

Posted Feb 16, 2021 6:43 UTC (Tue) by LtWorf (subscriber, #124958) [Link] (3 responses)

Open source has a precise definition that has absolutely nothing to do with your preferred version control system.

Python cryptography, Rust, and Gentoo

Posted Feb 16, 2021 7:38 UTC (Tue) by marcH (subscriber, #57642) [Link] (2 responses)

Next time at least pretend to try to get the point.

Python cryptography, Rust, and Gentoo

Posted Feb 16, 2021 10:54 UTC (Tue) by LtWorf (subscriber, #124958) [Link] (1 responses)

Next time make one that makes sense?

Python cryptography, Rust, and Gentoo

Posted Feb 17, 2021 8:44 UTC (Wed) by marcH (subscriber, #57642) [Link]

FYI, the usual behavior on this site when you don't understand something is to either ask or ignore.

Python cryptography, Rust, and Gentoo

Posted Feb 17, 2021 10:39 UTC (Wed) by MrWim (subscriber, #47432) [Link] (2 responses)

> This is why decentralized version control felt liberating.

Thanks for this analogy, there's definitely something to it. Something about not having to ask permission before acting, but instead being able to develop using the same tools as anyone, publish the results and have the results be judged instead.

Maybe what git is to a project, cargo is to a super project, or dependency graph. Hmm, that doesn't quite feel right because the versioning is still provided by git. It's the lockfile that extends git semantics to your entire dependency graph. Hmm, not sure if that's right, I'll have to think on it, but the analogy is food for thought.

As it is I'm a big fan of lockfiles, which can even be applied to whole distros and I believe that good tooling can unlock functional freedom, although I agree with LtWorf that "cannot really be considered open-source" is rather hyperbolic.

Python cryptography, Rust, and Gentoo

Posted Feb 17, 2021 19:15 UTC (Wed) by rgmoore (✭ supporter ✭, #75) [Link] (1 responses)

> I agree with LtWorf that "cannot really be considered open-source" is rather hyperbolic.

I think it's closer to true than you might expect. The GPL requires that source be provided in the preferred format for making modifications. That was meant to exclude things like generated code (rather than the material used to generate it) and obfuscated source, but it's not that far out to extend it to source being a copy of the version control system rather than just the raw source files.

Python cryptography, Rust, and Gentoo

Posted Feb 18, 2021 22:19 UTC (Thu) by marcH (subscriber, #57642) [Link]

Imagine a few people on the Internet want to collaborate and start some experimental branch for a project hosted with subversion _without_ bothering the maintainers. They would most likely start by cloning the subversion project to git or similar - and submit back to subversion only in the end (if ever).

Python cryptography, Rust, and Gentoo

Posted Jul 9, 2024 18:27 UTC (Tue) by jengelh (subscriber, #33263) [Link]

>_This_ is "real" open-source: zero boundary between downloading/using

What you describe sounds more like "collaboration" than "open-source": a project can achieve open-source status with as little as having an OSI license, even if its sole author rejects all your PRs because he thinks the project only makes sense for him.

>Configuring and building C/C++ code at large is a nightmare and Linux distributions have been performing an amazing and critical job there. However to solve this they had to add layer(s) of indirection between software authors and users

It is this indirection - or rather, integration - that gives projects visibility in the first place. Things could have turned out quite differently: if librsvg could not be integrated again after the language switch, the distro would sport either no SVG support, no browser, or no users (or any combination).

Python cryptography, Rust, and Gentoo

Posted Feb 15, 2021 12:29 UTC (Mon) by MrWim (subscriber, #47432) [Link] (3 responses)

> Distribution maintainers can be fast, and upstream maintainers can take months to reply. It depends entirely on the specific project..

I agree. Distro maintainers can be fast, or not, and the same is true of upstream maintainers. Patches may be suitable for inclusion in stable distros, or not. Upstream maintainers may request changes to the patches, maybe in several rounds, or not.

The point is that with cargo the process is both asynchronous and uniform.

# Asynchronous

I don't need to wait for the maintainer to evaluate my patches for me to continue with my development. We can iterate over the best way to implement something at leisure without me making demands over their time "Please reply promptly this is very important to us", etc.

The other important asynchrony here is that the modifications we make needn't affect other users of the library until both the modifications and the applications are ready. This means that you can be sure that modifications you've made to a dependency don't break curl for example which also uses this dependency. Your changes are isolated until the point they are confirmed good. The way rust and cargo works is that you can even have multiple versions of the same dependency in your application at the same time.

You might respond that multiple versions of a single dependency on a system is a bad thing - it makes security validation and updates more difficult. I'd agree that the ideal state is that all applications on your system use the same version of a library. What I don't like about the traditional dynamically linked distro model is that this is enforced by technology, rather than policy.

I don't like it because I think it makes upgrading a library needlessly fraught, and places too much responsibility on the shoulders of the library maintainer, rather than spreading it more widely among the application maintainers. The library maintainer can review a patch and refuse it on various grounds like it contains bugs, or that it's already working as intended, or that it's likely to break compatibility, etc. They may also refuse it because they're worried that it will break a dependent package. This level of effort from a library maintainer is reasonable to expect - but it would be unfair to ask that maintainer to do QA on all the packages that depend on their package to confirm the lack of breakage. They may not even know how some of their dependencies are supposed to behave, so noting that they're not behaving correctly after the patch has been applied would be very difficult indeed.

Instead, by removing the technical requirement that the library be upgraded in lockstep across the whole of the system we can more gracefully upgrade libraries without giving up on the **policy** that there should only be one version. Application maintainers can validate that a dependency upgrade has not broken their application and apply the upgrade then. This is a much narrower task than validating that a library hasn't broken any application, and the task falls on the person best placed to perform it.
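To illustrate what "policy rather than technology" could look like, here is a minimal sketch in Python of a check that a distro (or anyone) could run over per-application lockfiles. It is an assumption about how such a check might work, not a description of any existing tool; the application and library names are made up.

```python
from collections import defaultdict


def find_version_conflicts(app_lockfiles):
    """Return {library: [versions]} for libraries pinned at more than one version.

    app_lockfiles: {app_name: {library_name: pinned_version}}
    """
    versions = defaultdict(set)
    for pins in app_lockfiles.values():
        for library, version in pins.items():
            versions[library].add(version)
    # Plain string sort, only to make the report stable.
    return {lib: sorted(vs) for lib, vs in versions.items() if len(vs) > 1}


def lagging_apps(app_lockfiles, library, target_version):
    """Apps whose maintainers have not yet validated the target version."""
    return sorted(
        app
        for app, pins in app_lockfiles.items()
        if library in pins and pins[library] != target_version
    )


# Hypothetical system state: one app has moved to the new libfoo, one has not.
lockfiles = {
    "curl-like-app": {"libfoo": "1.4.2", "libbar": "2.0.0"},
    "my-app": {"libfoo": "1.5.0", "libbar": "2.0.0"},
}

print(find_version_conflicts(lockfiles))
print(lagging_apps(lockfiles, "libfoo", "1.5.0"))
```

The system keeps working while the report is non-empty; the policy is simply that the report should trend towards empty, with each application maintainer clearing their own entry.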

The current process works well for patches that are both small and urgent. This applies to most buffer overflow or integer overflow fixes for example. I think a different process would be better for patches that are either not small or not urgent.

I haven't mentioned the importance of uniformity yet, but this reply is already long enough and has taken enough time so I'll worry about that later.

> > You fix it upstream,
>
> Or directly downstream, as I said.

I was responding to logang's comment. Specifically: "the libraries you are using (or not) need help.". I interpret "the libraries" to mean the upstream library. My point is that it's easier for one to help the upstream libraries when they've got them from cargo, rather than from the distro.

It seems that our disagreement is a matter of priorities. My understanding is that you believe in the primacy of the distro, and as such helping the distro has highest priority. Conversely my priorities are my application and users, the upstream library and only then the distro.

I also believe in distros, but principally as distributors/integrators/maintainers of applications, rather than of libraries.

Python cryptography, Rust, and Gentoo

Posted Feb 16, 2021 6:57 UTC (Tue) by LtWorf (subscriber, #124958) [Link] (2 responses)

> I don't need to wait for the maintainer to evaluate my patches for me to continue with my development.

And what if they completely reject your changes or require substantial changes?

This can be for several reasons.

What would you do at that point? You'd need to either give up using the feature you added or refactor your code.

> What I don't like about the traditional dynamically linked distro model is that this is enforced by technology, rather than policy.

The Debian policy does not say anything against static linking vs dynamic linking. It does mandate that there can exist only 1 copy of a certain source within the archive. Browsers are granted exceptions because they are irreplaceable and do whatever they want.

> This level of effort from a library maintainer is reasonable to expect - but it would be unfair to ask that maintainer to do QA on all the packages that depend on their package to confirm the lack of breakage. They may not even know how some their dependencies are supposed to behave, so noting that they're not behaving correctly after the patch has been applied would be very difficult indeed.

That's basically the opposite of Torvalds' approach :D

> we can more gracefully upgrade libraries without giving up on the **policy** that there should only be one version.

Libraries that are made to support it are normally available in multiple versions for a period of time. Typical example is Qt. Qt4 has been removed only very recently. But point releases replace the previous version, because we don't want to do the Go way of depending on a specific commit.

> It seems that our disagreement is a matter of priorities. My understanding is that you believe in the primacy of the distro, and as such helping the distro has highest priority. Conversely my priorities are my application and users, the upstream library and only then the distro.

And when the user ends up with unpatched vulnerabilities because he downloaded some binary from some website because it wasn't included in a distribution because it violated every existing policy… how is the user served well by this?

He has to hope the author has a system in place to recompile when a CVE appears, that there is a repository set up to get the latest version. Using a distribution this would all be solved already rather than having to be solved 10000 times by every single project.

Python cryptography, Rust, and Gentoo

Posted Feb 16, 2021 8:28 UTC (Tue) by matthias (subscriber, #94967) [Link]

>> This level of effort from a library maintainer is reasonable to expect - but it would be unfair to ask that maintainer to do QA on all the packages that depend on their package to confirm the lack of breakage. They may not even know how some their dependencies are supposed to behave, so noting that they're not behaving correctly after the patch has been applied would be very difficult indeed.
> That's basically the opposite of Torvald's approach :D
No, it is exactly Torvalds' approach. The kernel policy is no regressions, yes. But Torvalds does not do the QA for all software that runs on linux. That would be simply impossible. He releases -rc versions of the kernel and asks everyone to test. If regressions are reported, the corresponding patches are reverted (or improved). But it is the job of the users (distros, cloud service providers, etc.) to do the testing and validation.

And it is definitely up to the distros to test whether a new kernel version works for them before they include it into the distribution.

The main difference between the kernel and many other projects is, that the kernel developers care much more about the regression reports received from the users and that regression reports have higher priority than new features.

Python cryptography, Rust, and Gentoo

Posted Feb 16, 2021 13:07 UTC (Tue) by mathstuf (subscriber, #69389) [Link]

> And what if they completely reject your changes or require substantial changes?

If we had gone with your proposal, now my distro is stuck with a patch rejected by upstream. Yay?

In this case, I rework the patch and adapt my code when I point to the next proposed patch. Same as step 1, just with a different baseline to diff from.

> Using a distribution this would all be solved already rather than having to be solved 10000 times by every single project.

As if Linux is the only distribution platform for projects these days. You do realize that Linux (and the BSDs) are the oddballs out here, right? Pretty much everything else does vendoring or the like to a large extent. And if I want a turnkey release from my website, a tarball with dependencies embedded is the answer even for Linux without waiting for distros to churn on the new release (which generally takes a month or two to hit the unstable channels for our project; add a release cycle for the stable channel to have a chance).

Python cryptography, Rust, and Gentoo

Posted Feb 13, 2021 0:46 UTC (Sat) by hunger (subscriber, #36242) [Link]

> Many of these problems already have solutions and haven't really been that big of a deal in the past. autoconf and Cmake have existed for a long time.

I have seen both autoconf and cmake referred to as a problem more often than as a solution:-)

Watch out: The C/C++ world has tooling for dependency management incoming as well. Sooner or later you will have similar problems with those languages depending on very specific versions of libraries as well. Developers want that stuff, so they will write it. In the end each developer gets the tooling she deserves;-)

> If a library you want to use is good, and well maintained but not packaged, help them package it.

Why? The package will either be too old, or have the functionality my program needs stripped out since some packager did not like a dependency it introduced. Or it's crippled or patched to no longer work properly for my application. Or the necessary files for your build system of choice are not installed. Or they can't be found. Or users have crippled the libraries to prevent some imaginary issue or another. I had to deal with all of the above in bug reports already. It is such a huge pain to debug this kind of issue.

As a developer you need to vendor libraries (at the very least as a fallback if system libraries are not found!), even when running on distributions that have official packages of the required libraries. And you will get bug reports due to incompatible libraries because some distro packager will unvendor your libraries for you, blissfully ignoring the documented requirements.
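The "vendored copy as a fallback" pattern is easy to show in Python. This is only a sketch: the standard-library `json` module stands in for a bundled copy, and `simplejson` for a system-provided package that may or may not be installed; the helper name `load_library` is my own.

```python
import importlib


def load_library(system_name, vendored_name):
    """Prefer the system-wide module; fall back to the bundled copy.

    Returns (module, origin) so bug reports can say which copy was in use.
    """
    try:
        return importlib.import_module(system_name), "system"
    except ImportError:
        return importlib.import_module(vendored_name), "vendored"


# Which branch is taken depends on what is installed on the machine.
json_impl, origin = load_library("simplejson", "json")
print(f"using {json_impl.__name__} ({origin})")
print(json_impl.dumps({"ok": True}))
```

Recording the origin is the part that pays off later: when a report comes in, you at least know whether you are debugging your own copy or whatever the distro shipped.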

> There is an awfully large amount of well written C software that has been written this way, has stood the test of time and will likely be around for a long time to come.

That is survivorship bias... tons of poorly written crap written in C got lost and good riddance! Undoubtedly a lot of code written in any other language will not stand the test of time either.

> But the long term end result is a more sustainable ecosystem with a lot less work over the entire community.

There is less code because it takes ages to write and maintain. Is that a benefit to the ecosystem as a whole? I doubt it: Making something hard does make the persistent people stick around, and many good programmers I know are lazy and easy to distract:-)

> Vendoring something might make less work for you in the moment, but is more work for other people (or even your future self) down the line and doesn't solve anything for other people with the same problems as you.

It is more work for me and other people down the line. You get so many more bugs due to library incompatibilities and such. Those produce extra work for the users, the packagers and the developers. We are just used to this work load, so we ignore it.

> If you want to write brittle broken software that needs constant attention and maybe doesn't even work at all in a few years, then yes, go ahead and keep doing things this way.

If you want brittle software now, then go ahead:-)

Seriously there is great and well tested C code out there. That is wonderful! There is also some great rust code out there. Also awesome! Great code is a wonderful thing to have in any language! In my mind code quality is not related to how hard it is to use 3rd party libraries in the programming language of choice.

I also do not want to see programming as an activity where you need to recite obscure texts that got cargo-culted down to you by your elders! And it is rare to find a medium sized project in C or C++ that does not do exactly that -- at the very least in some dark corner of its build system. Rust projects at least do have fewer dark corners. Part of the reason is of course that rust has not accumulated so much historical baggage yet:-)

Python cryptography, Rust, and Gentoo

Posted Feb 12, 2021 11:14 UTC (Fri) by LtWorf (subscriber, #124958) [Link] (7 responses)

> Adding a step before the build that makes sure the library package is installed, e.g. providing instructions *per-distro* to install it manually, making my software harder to build

You mean a README file with a list of dependencies? I'm sure people on a certain distribution know how to use their package manager.
If they don't know, they won't be compiling your software anyway, because they don't know how to install a compiler.

> For distros that don't package the library (or package a version of it that's older than I need), providing instructions to build and install that library manually, making my software even harder to build

Users of stable distributions are familiar with the issue.

> Making sure my software builds and runs with a range of library versions packaged by different distros and distro versions, potentially packaged in different ways with different directory layouts etc across distros.

That is incentive to:
1. Do not depend on amateur libraries that change API
2. Use autotools and let it figure out all this stuff

> On platforms like Windows

There are no package managers on windows. So that is a completely different situation. But anyway you won't be using the same binary on linux and windows.

Python cryptography, Rust, and Gentoo

Posted Feb 12, 2021 13:57 UTC (Fri) by mathstuf (subscriber, #69389) [Link] (4 responses)

> There are no package managers on windows.

There are. They're not as mature as anything Linux has AFAIK, but there is at least (in no particular order):

- vcpkg (probably the most useful for the discussion at hand)
- Conan (Python-based, with CMake integration)
- chocolatey (binary-based, includes Visual Studio/MSVC packages)
- anaconda (scientific/Python oriented, but has other bits too)

I think there's another, but I can't remember its name. There's also zero surprise from me if there are others I haven't heard of.

Python cryptography, Rust, and Gentoo

Posted Feb 12, 2021 21:58 UTC (Fri) by roc (subscriber, #30627) [Link] (3 responses)

You really need a single standard package manager, preferably shipped with the OS, so there is a high chance users already have it installed and "install package manager" doesn't just make your installation process longer.

Python cryptography, Rust, and Gentoo

Posted Feb 13, 2021 13:35 UTC (Sat) by mathstuf (subscriber, #69389) [Link] (2 responses)

While true, I think what I'd do is use vcpkg for developer management, bundle everything up into a single package and ship that via normal means (installer/relocatable zip) and maybe chocolatey depending on the tool target audience.

Anaconda would probably be better if you're already in that realm though.

Python cryptography, Rust, and Gentoo

Posted Feb 13, 2021 20:39 UTC (Sat) by roc (subscriber, #30627) [Link] (1 responses)

vcpkg looks cool, thanks for pointing to it. But it looks more like "cargo for C++" than a distro package manager.

Python cryptography, Rust, and Gentoo

Posted Feb 13, 2021 20:41 UTC (Sat) by roc (subscriber, #30627) [Link]

Of course, as such, it may be a good answer to the problem of "how do I consume C third-party libraries" which was the original issue before we got into a discussion of the "use distro packages" non-solution.

Python cryptography, Rust, and Gentoo

Posted Feb 12, 2021 21:56 UTC (Fri) by roc (subscriber, #30627) [Link] (1 responses)

> You mean a README file with a list of dependencies? I'm sure people on a certain distribution know how to use their package manager.

Sure. One problem is, that package list changes with each distribution and sometimes within versions of each distribution. So we only have instructions for Fedora and Ubuntu, and those instructions are wrong for some versions of those distros.
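To show why this does not stay a simple README, here is a rough Python sketch of what "instructions per distro and per version" turns into once you try to automate it. The table contents are illustrative assumptions (two entries only), and the function names are mine; a real project would need many more rows and would still miss some.

```python
def parse_os_release(text):
    """Parse the KEY=value lines of an os-release file into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        fields[key] = value.strip().strip('"')
    return fields


# Illustrative table: package names differ per distro for the same library.
BUILD_DEPS = {
    ("fedora", "33"): ["gcc", "openssl-devel"],
    ("ubuntu", "20.04"): ["gcc", "libssl-dev"],
}


def install_command(os_release_text):
    """Return the install command for a tested distro, or fail loudly."""
    info = parse_os_release(os_release_text)
    key = (info.get("ID"), info.get("VERSION_ID"))
    if key not in BUILD_DEPS:
        raise LookupError(f"no tested package list for {key}")
    tool = "dnf" if key[0] == "fedora" else "apt-get"
    return [tool, "install", "-y"] + BUILD_DEPS[key]


print(install_command('ID=ubuntu\nVERSION_ID="20.04"\n'))
```

Every new distro release is another row to add and test, which is exactly the maintenance burden being described.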

Anyway, every single step makes the software harder to build.

> Users of stable distributions are familiar with the issue.

So? The issues still exist.

> That is incentive to:
> 1. Do not depend on amateur libraries that change API

Distro policies are responsible for varying file layouts and naming conventions. Distros also make varying decisions about library versions and which features they enable at build time.

> 2. Use autotools and let it figure out all this stuff

Autotools are a nightmare and writing autotools feature tests for everything I care about would be a ton of extra work.

> you won't be using the same binary on linux and windows.

Indeed, but once you've done the work to build a Windows binary you can reuse that work to build a Linux binary with vendored libraries.

People arguing that the Right Way to build C/C++ software is to make it Linux only, use distro libraries, do a ton of extra work, and downgrade the performance and functionality of that software to fit the shipped libraries, are not doing C/C++ any favours.

Python cryptography, Rust, and Gentoo

Posted Feb 13, 2021 8:12 UTC (Sat) by abartlet (subscriber, #3928) [Link]

> Sure. One problem is, that package list changes with each distribution and sometimes within versions of each distribution. So we only have instructions for Fedora and Ubuntu, and those instructions are wrong for some versions of those distros.

This got so bad for Samba (both the distribution versions to cover and the Samba versions to cover) that we ended up building a massive infrastructure to:
- Create Docker images for CI
- Test every build in all the supported distributions
- Publish an 'install dependencies for samba' script.

Just look at the table here: https://wiki.samba.org/index.php/Package_Dependencies_Req... to see where this ends up.

Even the source data for those generated scripts, for a single release is quite complex: https://gitlab.com/samba-team/samba/-/blob/master/bootstr...

So for software of any serious size, it is not just a README with a list of dependencies. Furthermore, Samba has found we have to have configure checks looking for the library (otherwise folks complain that their build failed) and to make those fail by default (not 'auto-detect' and work around) because otherwise features just go missing.
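The "fail by default instead of auto-detect" rule can be sketched in a few lines of Python. This is not Samba's actual build code; the function name and the `--without-NAME` flag spelling are hypothetical, chosen only to show the decision logic.

```python
class ConfigureError(Exception):
    """Raised when a needed library is missing and was not explicitly disabled."""


def check_feature(name, library_found, requested=None):
    """Decide whether to build an optional feature.

    requested: None  = user gave no flag
               True  = user asked for the feature
               False = user explicitly disabled it (hypothetical --without-NAME)
    """
    if requested is False:
        return False  # explicit opt-out: the only way to build without it
    if library_found:
        return True
    # Auto-detection would silently return False here and the feature
    # would just go missing from the build. Fail instead.
    raise ConfigureError(
        f"{name}: required library not found; "
        f"install it or pass --without-{name}"
    )


print(check_feature("ldap", library_found=True))
print(check_feature("ldap", library_found=False, requested=False))
```

The cost is a stricter configure step; the benefit is that a missing feature is always a decision somebody made rather than an accident of what happened to be installed.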

All in all it is hard to argue that this is really a good vision to match.

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 19:56 UTC (Thu) by roc (subscriber, #30627) [Link]

Making consumption of third-party libraries extremely painful is not a good way to address whatever downsides there are of depending on third-party libraries. In reality C/C++ programmers react to that pain by either vendoring libraries (with bad tools, which make updates expensive, which creates security and correctness hazards), or by reimplementation (which on average means lower quality because development effort is spread over more implementations).

For example in our Rust project (https://pernos.co) we use cargo-deny in CI to scan our dependencies for known CVEs and break the build if there is one. This is working very well. Nothing like it exists for C because the infrastructure for consuming third-party libraries in C is hopelessly fractured.
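Conceptually the check is simple once you have one uniform lockfile format to read. The sketch below is in Python and is not how cargo-deny is implemented; the advisory IDs, package names, and versions are invented for illustration.

```python
def parse_version(text):
    """Turn '1.2.3' into (1, 2, 3) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def find_vulnerable(locked, advisories):
    """Return (package, version, advisory_id) for each affected locked dependency.

    locked:     {package: pinned_version}
    advisories: list of dicts with 'id', 'package', 'patched' (first fixed version)
    """
    hits = []
    for adv in advisories:
        version = locked.get(adv["package"])
        if version is not None and parse_version(version) < parse_version(adv["patched"]):
            hits.append((adv["package"], version, adv["id"]))
    return hits


def ci_exit_code(locked, advisories):
    """Non-zero breaks the build, which is the whole point."""
    return 1 if find_vulnerable(locked, advisories) else 0


# Hypothetical data.
locked = {"libfoo": "1.2.3", "libbar": "0.9.0"}
advisories = [
    {"id": "EXAMPLE-0001", "package": "libfoo", "patched": "1.2.4"},
    {"id": "EXAMPLE-0002", "package": "libbar", "patched": "0.8.1"},
    {"id": "EXAMPLE-0003", "package": "libqux", "patched": "3.0.0"},
]

print(find_vulnerable(locked, advisories))
print(ci_exit_code(locked, advisories))
```

The hard part is not this loop, it is having a complete and machine-readable list of what you depend on, which is what a fractured ecosystem does not give you.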

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 20:16 UTC (Thu) by marcH (subscriber, #57642) [Link] (17 responses)

> apt search xyz
> apt install libxyz
> How is that not straightforward?

It seems straight-forward when you ignore all the work that distributions perform behind the scenes to achieve that result.

It seems straight-forward if you ignore the incredibly large attack surface involved every time you run "apt update".

It seems straight-forward if you've never debugged CMake or (much worse) autotools.

It seems straight-forward as long as you don't need different packages that require different versions of xyz.

It seems straight-forward as long as you don't try to use a package from another distro because it's missing on yours.

It seems straight-forward as long as you don't try to naively "upgrade" the LTS version of your distro with packages from a newer version of the _same_ distro.

If it's so straight-forward, why have brand new projects like flatpak, snap etc. just been created?

Code re-use, software distribution and maintenance is hard, really hard. I'm not claiming rust or anything else cracked that nut, far from it and downloading random code from the Internet (in _any_ language) is of course a security disaster[*] Pretending on the other hand that this problem has already been solved is either dishonest or incredibly naive and probably why the entire industry is still so bad at this. Have you never heard about "DLL Hell?". We should all keep an open mind, take interest in any new approach and ignore anyone recommending to keep doing what they've always been doing.

[*] latest and greatest fun: https://www.theregister.com/2021/02/10/library_dependenci...

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 20:36 UTC (Thu) by logang (subscriber, #127618) [Link] (6 responses)

>It seems straight-forward when you ignore all the work that distributions perform behind the scenes to achieve that result.

Absolutely right. It's a lot of work and a hard problem to deal with dependencies. Which is why we should pool the work in distributions and everyone should use and benefit from it.

>It seems straight-forward if you ignore the incredibly large attack surface involved every time you run "apt update".

That's an odd statement. I do that multiple times a week on more than a dozen machines.

>It seems straight-forward if you've never debugged CMake or (much worse) autotools.

I've done both. Not that hard.

>It seems straight-forward as long as you don't need different packages that require different versions of xyz.

If libraries are well maintained and care about not breaking their users, and support a range of their own dependencies (instead of essentially vendoring their own dependencies by insisting on a very specific version) this problem tends not to be that bad. Even in python, good well maintained libraries ensure they work on a wide range of python versions and with a range of versions of their own dependencies. But also, in general, long deep dependency trees should be avoided and pushed back against.
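A small Python sketch of the difference between pinning an exact version and declaring a supported range. The distro names and versions are placeholders, and real version schemes are messier than dotted integers.

```python
def parse_version(text):
    """Turn '1.4.3' into (1, 4, 3) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def supported_distros(distro_versions, minimum, below):
    """Distros whose packaged version satisfies minimum <= version < below."""
    lo, hi = parse_version(minimum), parse_version(below)
    return sorted(
        distro
        for distro, version in distro_versions.items()
        if lo <= parse_version(version) < hi
    )


# Hypothetical versions of one library as shipped by three distros.
distro_versions = {"distro-a": "1.2.0", "distro-b": "1.4.3", "distro-c": "2.0.1"}

# A library that supports the whole 1.x series builds on two of them...
print(supported_distros(distro_versions, "1.2.0", "2.0.0"))

# ...while one that insists on exactly 1.4.3 builds on only one.
print(supported_distros(distro_versions, "1.4.3", "1.4.4"))
```

The wider the declared range, the more likely a single system-wide copy can satisfy everyone, which is the property I am arguing libraries should aim for.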

>It seems straight-forward as long as you don't try to naively "upgrade" the LTS version of your distro with packages from a newer version of the _same_ distro.

I've done this a lot. For the rare critical package, this is hard and should simply not be done. 9 times out of 10, it is easy.

> If it's so straight-forward, why have brand new projects like flatpak, snap etc. just been created?

No idea. But I avoid those like the plague. They don't solve any of my problems.

> Code re-use, software distribution and maintenance are hard, really hard. I'm not claiming Rust or anything else cracked that nut, far from it, and downloading random code from the Internet (in _any_ language) is of course a security disaster[*]. Pretending on the other hand that this problem has already been solved is either dishonest or incredibly naive, and probably why the entire industry is still so bad at this. Have you never heard of "DLL Hell"? We should all keep an open mind, take an interest in any new approach, and ignore anyone recommending to keep doing what they've always been doing.

Absolutely right. But the new languages don't seem to solve these problems, they just ignore them and try to vendor everything. From a security, maintenance and longevity perspective the distros have been doing far better, which is why I always go to them first and strongly resist the newer trends to vendor everything.

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 21:07 UTC (Thu) by mathstuf (subscriber, #69389) [Link]

> Absolutely right. It's a lot of work and a hard problem to deal with dependencies. Which is why we should pool the work in distributions and everyone should use and benefit from it.

You know how many users we have using distro-provided deployment mechanisms? Zero (that I hear from). I hear from distribution maintainers and we work to accommodate building with external deps (because *I* care, and testing against external versions is the easiest way to set off warning bells for API changes coming down the pipe).

Existing deployments are on bespoke machines with oddball dependencies not packaged by distros. They use custom MPI builds that are tuned for the hardware. External libraries compiled against those MPI libraries. And other things too.

I agree that distros do a lot of work and I'm grateful for it, but the "everyone deploys to Linux (or FreeBSD)" mentality (this is especially rampant in the web world too) is short-sighted to me. We vendor the "core" libraries we need. I even made sure we do it properly: no untracked patches to them, mangle the symbols, soname, and header include paths to avoid conflicts with external copies, provide options to *use* external copies, etc. It's a lot of work.

And after all that? I would really rather just drop a `Cargo.lock` file in for stability and have CI churn on new releases to let me know of what's up in the future.

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 21:52 UTC (Thu) by roc (subscriber, #30627) [Link] (3 responses)

> But the new languages don't seem to solve these problems, they just ignore them and try to vendor everything

Effectively Rust wants developers to vendor everything, but a lot of work has gone into Rust+cargo to solve a lot of hard problems. For example:

cargo provides simple commands to update a dependency to the latest version, usually as simple as "cargo update" or "cargo update -p <library>".

cargo makes it easy to override a (possibly indirect) dependency with a patched version, via "[patch]".

rust-sec/advisory-db collects CVEs for Rust libraries and you can configure the cargo-deny tool to automatically break your build if one of your dependencies has an outstanding CVE.

Rust is designed so that by default linking multiple versions of the same library into a single binary works fine (always undesirable, but sometimes a necessary last resort).

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 23:31 UTC (Thu) by rgmoore (✭ supporter ✭, #75) [Link] (2 responses)

> Effectively Rust wants developers to vendor everything

I don't think this is quite right. As I understand it "vendoring" means copying the source code of libraries you used into your own source tree rather than linking to the distribution-provided library at run time. As I understand it, there are a few problems with vendoring:

  • Library fragmentation. When people on the project discover something wrong with the library (a bug or missing feature), there's a tendency to patch it in the local copy rather than pushing the fix upstream. Even if the project attempts to push changes upstream, the project may keep them if upstream is uninterested, resulting in fragmentation of the library.
  • Patch delays. If something upstream gets patched, it takes extra time and effort to push the patch out to all the projects that have vendored the library compared to patching the single distribution provided version. This is annoying with ordinary bugs and a serious danger with security bugs.
  • Hidden copies. It can be difficult even to track down all the projects that have vendored the library to make sure their copy has been fixed. This further slows patch rollout.

What Rust (and many other languages with their own dependency resolution systems) does is slightly different. They incorporate libraries into a statically linked binary, but they still treat the library as an external dependency rather than copying it into the project wholesale. That means they still have problems with patch delays but much less of one with library fragmentation or hidden copies than projects which have truly vendored libraries.

Python cryptography, Rust, and Gentoo

Posted Feb 12, 2021 1:07 UTC (Fri) by marcH (subscriber, #57642) [Link]

> As I understand it "vendoring" means copying the source code of libraries you used into your own source tree [...] tendency to patch it in the local copy rather than pushing the fix to upstream

In other words forking the source.

> They incorporate libraries into a statically linked binary, but they still treat the library as an external dependency rather than copying it into the project wholesale.

In other words forking the binaries but not the source.

There are probably a few other (and incompatible...) "definitions" of vendoring, for instance those that (wrongly) care about where the copy is hosted, but I don't think any other vendoring definition matters besides the two ways of forking above. I suspect we can get rid of that new word and not lose anything - actually gain some clarity. Please prove me wrong!

Duplication is not bad in itself, it's bad only when it leads to Divergence.
https://doc.rust-lang.org/book/ch03-01-variables-and-muta...

I stopped saying "Copy/Paste", now I say Copy/Paste/Diverge. Even the least technical managers understand the latter.

Examples of Duplication that keeps Divergence under control: cache invalidation, RCU, version control, snapshot isolation, transactional memory,...

Python cryptography, Rust, and Gentoo

Posted Feb 12, 2021 1:23 UTC (Fri) by roc (subscriber, #30627) [Link]

Yes, I used the term loosely. Sorry about that.

Python cryptography, Rust, and Gentoo

Posted Feb 12, 2021 1:13 UTC (Fri) by marcH (subscriber, #57642) [Link]

> > It seems straight-forward if you ignore the incredibly large attack surface involved every time you run "apt update".

(I meant "apt upgrade")

> That's an odd statement. I do that multiple times a week on more than a dozen machines.

Then pause once and try to gauge how many people and lines of code you trust every time you install or upgrade a few dozen packages. Maybe pip and cargo won't look that bad after all.

Python cryptography, Rust, and Gentoo

Posted Feb 12, 2021 11:15 UTC (Fri) by LtWorf (subscriber, #124958) [Link] (9 responses)

> It seems straight-forward if you ignore the incredibly large attack surface involved every time you run "apt update".

Uh?

How is this more risky than having 900 copies of libpng and hoping that all of them will be upgraded when inevitably the next buffer overflow is found?

Python cryptography, Rust, and Gentoo

Posted Feb 12, 2021 17:03 UTC (Fri) by marcH (subscriber, #57642) [Link]

I'm not the one pretending to know which is more risky.

Python cryptography, Rust, and Gentoo

Posted Feb 12, 2021 22:12 UTC (Fri) by roc (subscriber, #30627) [Link] (7 responses)

Suppose we had a distro where all packages were built with Rust and statically linked the "png" crate, a CVE is issued for that crate, and a new minor version of "png" is available that fixes the bug. It would be very simple to scan all Cargo.lock files for all packages to see which ones are using a vulnerable version of "png". For each affected package, "cargo update -p png" would update to a non-vulnerable version. It would be easy to automate the entire process.
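As a rough illustration of the scan described above, here is a minimal Python sketch that checks Cargo.lock text for an old "png" entry. The lockfile contents, the version numbers and the helper names (`locked_versions`, `version_key`, `is_vulnerable`) are all made up for illustration, and the regex plus tuple comparison is a simplification; a real tool would use a proper TOML parser and full semver rules.

```python
import re

# Hypothetical Cargo.lock text. The [[package]] / name / version layout
# mirrors real lockfiles, but these packages and versions are invented.
SAMPLE_LOCK = """
[[package]]
name = "image-tool"
version = "1.2.0"
dependencies = [
 "png",
]

[[package]]
name = "png"
version = "0.16.7"
"""

# Matches a `name = "..."` line immediately followed by `version = "..."`.
PACKAGE_RE = re.compile(
    r'^name = "(?P<name>[^"]+)"\nversion = "(?P<version>[^"]+)"$',
    re.MULTILINE,
)


def locked_versions(lock_text, crate):
    """Return every locked version of `crate` found in the lockfile text."""
    return [
        m.group("version")
        for m in PACKAGE_RE.finditer(lock_text)
        if m.group("name") == crate
    ]


def version_key(version):
    """Simplification: compare plain MAJOR.MINOR.PATCH numbers only."""
    return tuple(int(part) for part in version.split("."))


def is_vulnerable(lock_text, crate, first_fixed):
    """True if any locked version of `crate` is older than `first_fixed`."""
    fixed = version_key(first_fixed)
    return any(version_key(v) < fixed for v in locked_versions(lock_text, crate))


if __name__ == "__main__":
    # "0.16.8" is a made-up first fixed version.
    print(is_vulnerable(SAMPLE_LOCK, "png", "0.16.8"))
```

Running something like this over every package's Cargo.lock would yield the list of packages that need "cargo update -p png".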

In this hypothetical distro you would also want to run 'cargo-deny' in CI to ensure that every time a package is built, the build fails if there is an outstanding CVE against one of its components.

The big picture here is that Rust+cargo standardize the build process and metadata to make managing dependencies much easier, more consistent and scalable.

(Of course we're ignoring the issue that you will have to do this much less frequently for a Rust PNG library because Rust code isn't prone to buffer overflows...)

Python cryptography, Rust, and Gentoo

Posted Feb 16, 2021 2:14 UTC (Tue) by dvdeug (subscriber, #10998) [Link] (6 responses)

Let's compare that to what we have right now; we have a CVE in libpng, we upgrade the version of libpng in the distro, and fix all the packages without recompilation. That's already complex enough without literally recompiling almost every program on the system.

Python cryptography, Rust, and Gentoo

Posted Feb 16, 2021 2:55 UTC (Tue) by Cyberax (✭ supporter ✭, #52523) [Link] (3 responses)

And then some applications randomly break because of an ABI change.

Python cryptography, Rust, and Gentoo

Posted Feb 17, 2021 2:31 UTC (Wed) by dvdeug (subscriber, #10998) [Link] (2 responses)

Recompilation has been known to randomly break applications as well. The art of a good security patch is that it doesn't change anything besides making the security fix.

Python cryptography, Rust, and Gentoo

Posted Feb 17, 2021 4:59 UTC (Wed) by Cyberax (✭ supporter ✭, #52523) [Link] (1 responses)

> Recompilation has been known to randomly break applications as well.
Uhh.... Whut? Recompilation can't break applications, especially with repeatable builds. A bad fix that changes the API can certainly do that.

But it's way better than random breakages because the ABI has subtly changed.

Python cryptography, Rust, and Gentoo

Posted Feb 17, 2021 8:33 UTC (Wed) by geert (subscriber, #98403) [Link]

If the compilation is needed due to a change in a dependency, it is not a repeat of the previous build. If it was, there was no point in recompiling it.
If the compiler has changed, the recompiled application may behave differently, too.

Python cryptography, Rust, and Gentoo

Posted Feb 16, 2021 16:40 UTC (Tue) by foom (subscriber, #14868) [Link] (1 responses)

If we had the ability to cross-compile for slow target architectures, and proper build automation, recompiling everything that depends on libpng wouldn't need to be a problem.

Distributing the update to users might need some adjustments, too, in order to avoid massive bandwidth usage -- a good mechanism to send just binary deltas for the affected files would be more important than it is now.

Python cryptography, Rust, and Gentoo

Posted Feb 18, 2021 22:50 UTC (Thu) by flussence (guest, #85566) [Link]

It's a bit ironic that the software I *most* need cross-compilation for is the stuff most resistant to being cross-compiled…

(Bought more RAM than I thought I'd ever need. The compiler crashes because it runs out of i686 registers now. *sigh*)

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 5:22 UTC (Thu) by roc (subscriber, #30627) [Link] (53 responses)

Rust is incredibly good at avoiding compatibility breaks in practice. They are very rare and generally involve discovering that some code pattern is unsound and needs to be illegal. My Rust project is nearly 5 years old at this point and we have hardly ever had to deal with compatibility breaks. On the rare occasions when there is a break, it almost always manifests as your code failing to build with a new version of the compiler.

On the other hand, in practice C *does* have compatibility breaks. All large C programs contain bugs where they rely on subtle undefined behavior. Periodically a compiler update will change how the compiler handles that behavior. This is worse than Rust because these regressions are generally not caught at compile time; they show up in testing or production.

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 12:23 UTC (Thu) by vstinner (subscriber, #42675) [Link] (41 responses)

CPython is made of 500K lines of C code. I can testify that it breaks at every GCC major release. Each time, we discover new "undefined behavior" which were running fine previously.

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 14:03 UTC (Thu) by mathstuf (subscriber, #69389) [Link] (40 responses)

> Each time, we discover new "undefined behavior" which were running fine previously.

Should be:

> Each time, we discover where we were using undefined behavior and just getting lucky previously.

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 18:44 UTC (Thu) by rgmoore (✭ supporter ✭, #75) [Link] (39 responses)

This isn't a terribly meaningful distinction. It doesn't really matter if you want to blame it on C having lax standards that allow undefined behavior or on lazy programmers allowing their programs to depend on it, it shows that C is not such a stable platform for building complex programs in practice. It's not like the CPython team are a bunch of slackers who don't know how to program. If they're running into this kind of problem, it's a problem with the system rather than their specific group.

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 19:15 UTC (Thu) by mathstuf (subscriber, #69389) [Link] (38 responses)

Oh, I certainly agree with that. Maybe I got a bit too quip-y here.

I'll note that I just added clang-tidy checking to one of our code bases (takes almost 2 hours too :( ). Tons of things are ignored because we've been lax for far too long, but getting things like ASan, UBSan, clang-tidy and a whole host of other tools looking at the code is important for C and C++ code bases to keep their sanity in the unfortunately not-always-well-understood corners of the languages that stick out all over the place.

But it's also a mistake to then turn around and blame the compiler for utilizing the freedom the language gives it for the developer's lack of knowledge in that area (and is why, IMO, the burden of proof is on the developer, not the linter, when masking lint detections). You either have to live with the dish C has been serving all of us for the past 40+ years with all the rot and flavor-enhancing spices we now have available or step up, get in the kitchen, and improve things. IMNSHO, Rust developers have been doing that.

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 19:41 UTC (Thu) by Wol (subscriber, #4433) [Link] (37 responses)

But this is what another post on LWN pointed me at - C IS NO LONGER A LOW-LEVEL LANGUAGE.

The whole point behind "undefined" or "implementation specific" behaviour was that - where CPU behaviour varied - it would do whatever was easiest for the CPU. The logical models behind C and modern processors have diverged so much that there is no longer a simple equivalence between the C language and the processor machine/assembly code. So "undefined behaviour is whatever the hardware does" no longer makes sense, but that is what it is supposed to mean!

Cheers,
Wol

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 20:44 UTC (Thu) by mathstuf (subscriber, #69389) [Link] (30 responses)

That's fine when you're coding for a given processor. When you're coding a portable program, undefined behavior is just not acceptable (unless someone foolishly decided "whatever C does here" as part of *their* spec).

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 21:30 UTC (Thu) by Wol (subscriber, #4433) [Link] (29 responses)

Which is why C is not a good language - which is what a lot of posters here are saying.

Personally, I find C a perfectly okay language. I just feel that C, and Unix, and all that are a perfect example of what matters is not being any good, what matters is being in the right place at the right time. I cut my coding teeth on FORTRAN, and would probably still be using it if I had the opportunity.

As that article said, C is the perfect language for programming a PDP-11. It's just that modern computers behave completely differently to a PDP-11. Again, I cut my teeth on 50-series Pr1mes. Pr1me tried to re-write a large slab of the system in C, and I suspect that was (a small) part of the reason they went under (the bigger part being microprocessors like the 6502, the 8080 etc, were beginning to eat the minicomputers' lunch). And the 50-series having a strongly segmented architecture, it just didn't map on to the microprocessors' way of working.

Someone needs to do a "C", and design a new low-level language for programming x64.

Cheers,
Wol

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 21:38 UTC (Thu) by mathstuf (subscriber, #69389) [Link]

> Someone needs to do a "C", and design a new low-level language for programming x64.

Just as aarch64 enters the stage in a meaningful way? Seems apt ;) .

Python cryptography, Rust, and Gentoo

Posted Feb 12, 2021 2:18 UTC (Fri) by NYKevin (subscriber, #129325) [Link] (17 responses)

> Someone needs to do a "C", and design a new low-level language for programming x64.

Well, there's assembly language. Or LLVM IR, if you wanted something a bit more optimized. But I imagine you wanted something higher-level than either of those options.

IMHO the single most significant pain point for C is undefined behavior. You can broadly divide UB into three types:

1. Essential UB - UB that results from stack/heap corruption or other cases where "You can only figure out what will happen if you know exactly how everything is laid out in memory, the order in which threads are executed, etc." It's "essential" because knowing what architecture you're using only gives you a little information about the program's likely behavior.
2. Accidental UB - UB that results from differences in architectural behavior (e.g. how negative numbers are represented, whether trap representations are a thing, whether memory is segmented, etc.). It's "accidental" because many of these instances of UB are artifacts of the state of the market at the time C was standardized, rather than fundamental constraints on what we can predict about program behavior.
3. UB that should always crash - Mostly, this is just "dereferencing NULL, dividing by zero, and anything else that everyone agrees should always immediately trap," but for the sake of completeness, I would define this as any situation where it's possible (on a reasonable, modern system, when running in userspace) to immediately detect the problem and crash, with no meaningful performance penalty for doing so (e.g. the runtime doesn't have to do array bounds checking or similar).

For addressing #3, the answer is obvious: Crash, and don't have it be UB. For #2, the answer is similarly obvious: Either pick "whatever the x86-64 does" or say "it's implementation-defined" (and not UB). But for #1, the only really effective way to remove it is to prevent stack/heap corruption statically, at compile time. And if you go down that road, you will fairly quickly find yourself reinventing the Rust wheel. Alternatively, you can insert bounds checks everywhere, and go down the Java road instead, but then you're not really a "low-level language" anymore.

TL;DR: I am unable to visualize anything that matches your description, but doesn't already exist.

Python cryptography, Rust, and Gentoo

Posted Feb 12, 2021 11:09 UTC (Fri) by khim (subscriber, #9252) [Link] (2 responses)

> e.g. the runtime doesn't have to do array bounds checking or similar

But even your short list (with two elements) includes two things which are hard to implement on some platforms. Accessing NULL wouldn't be caught on MS-DOS or many other “small” CPUs (and real mode is not dead if we consider the platforms discussed in the article to be alive… heck, in a world where Windows 3.0 support is added to compilers in the year 2020, it can be considered more alive than the other architectures discussed here). Catching “divide by zero” is not trivial on, e.g., AArch64 (fp exceptions are optional there and you need to periodically check whether they happened; that looks more or less like “array bounds checking or similar” to me).

> Alternatively, you can insert bounds checks everywhere, and go down the Java road instead, but then you're not really a "low-level language" anymore.

But you have just said that you should crash instead! Make up your mind, please!

Python cryptography, Rust, and Gentoo

Posted Feb 12, 2021 17:57 UTC (Fri) by NYKevin (subscriber, #129325) [Link]

> But even your short list (with two elements) includes two things which are hard to implement on some platforms.

Those platforms can use C. I was asked to design a language "for programming x64," so I deliberately neglected to support older platforms.

I also explicitly stated that we were talking about a "modern system." MS-DOS is not a modern system. Windows 3.0 is not a modern system. Please do not snip out parts of my comment and then complain that the snipped out pieces no longer make any sense.

Python cryptography, Rust, and Gentoo

Posted Feb 12, 2021 18:31 UTC (Fri) by mpr22 (subscriber, #60784) [Link]

Java throws an unchecked exception (which is a reasonable, but much more controlled, approximation to "crashes") if you make an out-of-bounds array access.

Python cryptography, Rust, and Gentoo

Posted Feb 13, 2021 1:39 UTC (Sat) by Wol (subscriber, #4433) [Link] (13 responses)

> 3. UB that should always crash - Mostly, this is just "dereferencing NULL, dividing by zero, and anything else that everyone agrees should always immediately trap,"

And here we have to disagree - computers are supposed to do maths, and division by zero is a common mathematical operation. The result is (scalar) infinity, I believe, and it's actually absolutely fundamental to that branch of mathematics known as calculus.

(One of the problems people have with infinity(s) is that there are so many, and you can't mix them ... :-)

One of my early projects that I remember involved a lot of Pythagoras. The problem was, the three vertices of my triangle could easily lie on a straight line, which would result (as far as the maths was concerned) in a "divide by divide-by-zero". To which the answer is zero. As far as the program was concerned, though, it did result in a crash, resulting in a load of extra code to trap the fact that computers can't do maths properly :-)

I don't know whether the language people are doing this, but imho they should get rid of both implementation-specific behaviour and undefined behaviour. Let's take the example of shifting by a negative amount. imho the principle of least surprise says that a negative left shift is a right shift, so if you explicitly ask for the new standard you get the defined behaviour. Unless you also ask explicitly for the old behaviour. If you don't ask for anything it remains implementation-specific (until the compiler default advances to the new standard :-) And fix undefined behaviour the same way - that should only be allowed when asking for something nonsensical :-)
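The proposed rule is small enough to write down. This is only a sketch of the semantics suggested above, in Python rather than C, so two things differ from the C case: Python integers are unbounded (no width-related overflow), and Python's built-in `<<` raises ValueError for a negative count instead of leaving it undefined. `shift_left` is a hypothetical name.

```python
def shift_left(value: int, count: int) -> int:
    """Hypothetical 'least surprise' shift: a negative count shifts right."""
    if count >= 0:
        return value << count
    # Python's >> on negative values is an arithmetic (sign-preserving) shift.
    return value >> -count


if __name__ == "__main__":
    print(shift_left(1, 3), shift_left(8, -3))
```

A C version would additionally have to decide what happens when the count reaches or exceeds the width of the type, which this sketch does not address.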

They had this exact problem with FOR/NEXT loops in 1977 :-) FORTRAN did the test at the end, so all loops executed at least once, while Fortran77 did it at the start so loops could possibly not execute at all. So we had switches to force either new or old behaviour.

Cheers,
Wol

Python cryptography, Rust, and Gentoo

Posted Feb 13, 2021 9:02 UTC (Sat) by mkbosmans (subscriber, #65556) [Link] (4 responses)

> And here we have to disagree - computers are supposed to do maths, and division by zero is a common mathematical operation.
> The result is (scalar) infinity, I believe, and it's actually absolutely fundamental to that branch of mathematics known as calculus.

That is not the case at all.
While you can say: lim x→0 n/x = inf, it does not follow that n/0 = inf.

And as for the more general point, calculus deals with real numbers for the most part. Computers operate on floating point and integer numbers. Operations that make sense in one domain don't necessarily translate 1:1 to another.

Python cryptography, Rust, and Gentoo

Posted Feb 13, 2021 10:09 UTC (Sat) by Wol (subscriber, #4433) [Link] (3 responses)

> That is not the case at all.
> While you can say: lim x→0 n/x = inf, it does not follow that n/0 = inf.

But isn't that what the MATHS MEANS? It doesn't follow that x will *reach* 0, but if it does, then n/x *must* equal infinity. (Quite often, x=0 is illegal in the problem domain.)

> And as for the more general point, calculus deals with real numbers for the most part. Computers operate on floating point and integer numbers. Operations that make sense in one domain don't necessarily translate 1:1 to another.

Principle of least surprise. Yes, floating point is a quantum operation, while reals aren't, but given that (I believe) the IEEE definition of floating point includes both NaN and inf, it would be nice if computers actually used them - I believe some popular computers did 40 years ago (DEC Vax), and I guess it's the ubiquity of x86 that killed it :-(

And the whole point of fp is to imitate real. Again, principle of least surprise, the fp model should not crash when fed something that is valid in the real domain. It should get as close as possible.

People are too eager to accept that "digital is better" "because it's maths", and ignore the fact that it's just a model. And people find it hard to challenge a mathematical model, even when it's blatantly wrong, "because the maths says so".

Cheers,
Wol

Python cryptography, Rust, and Gentoo

Posted Feb 13, 2021 11:38 UTC (Sat) by Jonno (guest, #49613) [Link]

> While you can say: lim x→0 n/x = inf, it does not follow that n/0 = inf.

No, you can't say that. For n∈ℝ⁺, lim (x⭢0)⁺ (n/x) = ∞, but lim (x⭢0)⁻ (n/x) = -∞, so lim (x⭢0) (n/x) does not exist. [For n∈ℝ⁻, lim (x⭢0)⁺ (n/x) = -∞ and lim (x⭢0)⁻ (n/x) = ∞, so lim (x⭢0) (n/x) does not exist either; but for n∈{0}, lim (x⭢0)⁺ (n/x) = 0 and lim (x⭢0)⁻ (n/x) = 0, and so lim (x⭢0) (n/x) = 0.]

> But isn't that what the MATHS MEANS? It doesn't follow that x will *reach* 0, but if it does, then n/x *must* equal infinity. (Quite often, x=0 is illegal in the problem domain.)

No, the maths says that the domain of the divisor does not include zero; that the closer to zero a positive divisor gets, the closer to positive infinity the value gets; and that the closer to zero a negative divisor gets, the closer to negative infinity the value gets.
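The one-sided behaviour described above is easy to check numerically. A minimal sketch, assuming ordinary Python floats; `one_sided_samples` is a made-up helper that evaluates n/x for x = ±10⁻¹, ±10⁻², … without ever using x = 0:

```python
def one_sided_samples(n, steps=5):
    """Evaluate n/x for x approaching 0 from the right and from the left."""
    xs = [10.0 ** -k for k in range(1, steps + 1)]
    from_right = [n / x for x in xs]     # x -> 0 through positive values
    from_left = [n / -x for x in xs]     # x -> 0 through negative values
    return from_right, from_left


if __name__ == "__main__":
    right, left = one_sided_samples(1.0)
    for r, l in zip(right, left):
        print(r, l)
```

For positive n the two columns run off in opposite directions, which is why no single two-sided limit exists; for n = 0 both stay at zero.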

Python cryptography, Rust, and Gentoo

Posted Feb 13, 2021 13:15 UTC (Sat) by mpr22 (subscriber, #60784) [Link]

x86 floating point is (hardware defects aside) IEEE-754, floating point division by 0.0 is defined in C, and if you compile:

#include <stdio.h>

int main(void)
{
    float f = 1.0f/0.0f;
    printf("%f\n", f);
    return 0;
}

with gcc or clang and link against glibc, it prints "inf".

Integer division by 0, on the other hand, is undefined under finite-width two's complement (or unsigned) arithmetic.

Python cryptography, Rust, and Gentoo

Posted Feb 14, 2021 2:47 UTC (Sun) by NYKevin (subscriber, #129325) [Link]

Infinity is neither an integer nor a real number (when both terms are defined in the mathematical sense rather than the computational sense). The real numbers observe something called the "Archimedean property," which states that there are no infinities or infinitesimals (except that zero is infinitely smaller than all non-zero values).

Why do real numbers have this limitation? Well, the blunt fact is, there's only one totally ordered metrically complete field, and it's the real numbers.[1] If you want to introduce an infinity, you have to give up one of the following:

1. The field axioms (which, broadly speaking, say that you can add, subtract, multiply, and divide real numbers, and these operations behave in sensible and familiar ways).
2. The total ordering of the reals (i.e. for any two reals a and b, either a > b, a < b, or a = b, where = is given its ordinary interpretation of "are literally the same mathematical object" rather than something which the ordering is allowed to define).
3. Two additional axioms about how the ordering interacts with the field operations (basically, you can always add the same number to both sides of an inequality without invalidating it, and the product of two positive numbers is always positive).
4. The reals are Dedekind-complete (in simple terms, "you can take limits" - in more precise terms, every non-empty subset of the reals that has an upper bound, has a least upper bound).

For example, IEEE 754:

1. Everything is non-associative, which is not allowed under the field axioms. Also, NaN and ±inf don't have additive inverses.
2. Since 1.0 / 0.0 != 1.0 / -0.0, we cannot have 0.0 and -0.0 be "the same value" (because you get different answers when you try to use them in the same expression). Neither number is greater than the other according to IEEE 754, and so they violate total ordering. Also, all of the NaNs violate total ordering, too.
3. There are cases for which x < y, but x + z == y == y + z (because x is the largest value with exponent n and y is the smallest value with exponent n+1). Also, you can trivially break this with ±inf.
4. Almost satisfied: The set of negative floats has two upper bounds which are incomparable (0.0 and -0.0), so we cannot say which is the "least" upper bound. But I'm pretty sure this is the only counterexample (ignoring trivial alterations such as "negative floats greater than -1.0," etc.) because I can't think of a way to construct a counterexample out of NaN or inf.
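Points 1 to 3 of the IEEE 754 list can be observed directly. A small sketch, assuming Python's `float` is an IEEE 754 double (true on essentially every current platform); the variable names are mine. Note that Python raises ZeroDivisionError for 1.0 / 0.0 rather than returning inf, so the two zeros are told apart with `math.copysign` instead of by dividing by them.

```python
import math

inf = float("inf")
nan = float("nan")

# 1. Field axioms: addition is not associative, and inf has no additive inverse.
non_associative = (0.1 + 0.2) + 0.3 != 0.1 + (0.2 + 0.3)
inf_has_no_inverse = math.isnan(inf + (-inf))

# 2. Total ordering: 0.0 and -0.0 compare equal yet are distinguishable,
#    and NaN is neither less than, greater than, nor equal to anything.
zeros_compare_equal = (0.0 == -0.0)
zeros_distinguishable = math.copysign(1.0, 0.0) != math.copysign(1.0, -0.0)
nan_unordered = not (nan < 1.0 or nan > 1.0 or nan == 1.0 or nan == nan)

# 3. Order/field interaction: x < y, yet adding the same z erases the difference.
x, y, z = 1.0, 2.0, 1e300
addition_loses_order = x < y and x + z == y + z

if __name__ == "__main__":
    print(non_associative, inf_has_no_inverse, zeros_compare_equal,
          zeros_distinguishable, nan_unordered, addition_loses_order)
```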

Or the extended real numbers, which IEEE 754 is intended to mimic:

1. inf - inf is not defined (rather than giving an NaN), which is not allowed under the field axioms. Also, ±inf don't have additive inverses.
2. Satisfied if you assume that -inf < all reals < inf.
3. Not satisfied because, for finite numbers x and y with x < y, x + inf = y + inf = inf.
4. The extended reals are compact, which in this context is an even stronger property than completeness.

Or the hyperreal numbers, which are explicitly designed to "follow all of the usual rules" (for use in nonstandard analysis):

1. Satisfied by the transfer principle.
2. Satisfied by the transfer principle.
3. Satisfied by the transfer principle.
4. Not satisfied: Consider the set of infinitesimals. This is clearly bounded, but it cannot have a *least* upper bound, or else you could derive contradictions by doubling this least upper bound (which must give you a non-infinitesimal) and reasoning about the relationship between the resulting number and the set of infinitesimals.

TL;DR: If you like doing calculus etc. in the usual way, then you can't have infinities or infinitesimals.

[1]: Every totally ordered metrically complete field is isomorphic to the real numbers. So they're all, effectively, "the same field" with different names for their elements. We need this caveat because, if you really wanted to, you could just start calling 2 ** 128 "infinity." But it would still be 2 ** 128. You could still multiply two by itself 128 times to get to your "infinity." "A rose by any other name" and all that.

Python cryptography, Rust, and Gentoo

Posted Feb 13, 2021 15:59 UTC (Sat) by mafrasi2 (guest, #144830) [Link] (7 responses)

> And here we have to disagree - computers are supposed to do maths, and division by zero is a common mathematical operation. The result is (scalar) infinity, I believe, and it's actually absolutely fundamental to that branch of mathematics known as calculus.

> (One of the problems people have with infinity(s) is that there are so many, and you can't mix them ... :-)

Division by zero is *not* a common mathematical operation. It is literally undefined in mathematics as well.

In fact, it is absolutely fundamental that it is undefined, because otherwise you could do for any number x

x / 0 = infinity
x = infinity * 0

which would mean that any number is equal to any other number (because they all equal infinity * 0 and equality is transitive).

Python cryptography, Rust, and Gentoo

Posted Feb 13, 2021 23:57 UTC (Sat) by Wol (subscriber, #4433) [Link] (6 responses)

Except that anything multiplied by 0 is zero.

So we have infinity *0 = 5*0 = 4*0 =3*0 = 2*0 = 1*0 =0*0.

If we divide each term by 0, does that mean infinity = 5 = 4 = 3 = 2 = 1 = 0?

I think we've fallen foul of Godel's incompleteness theorem. In order to make the maths work, we need special rules outside of the maths like "divide by zero, you get infinity" and "divide by infinity, you get zero". And a whole lot of physics depends on infinities. I can't give you any examples (or maybe I can), but there are various different types such that quite often infinity != infinity, and the physics doesn't work. And the "is it 10 or 11 dimensions" model of space-time works, I believe, because it just happens to be true that infinity does actually equal infinity.

Infinity and zero are special cases, required by Godel, that are needed to make everything else work. Take that example I gave of calculating the sides of a triangle - as soon as we accept that "divide by divide-by-zero equals 0" I can use THE SAME maths on any two points in a cartesian system to calculate the distance between them. Basically I try and construct a right angle triangle and calculate the hypotenuse, and if the triangle collapses into a line THE SAME maths still works. And it makes sense that it works...

Cheers,
Wol

Python cryptography, Rust, and Gentoo

Posted Feb 14, 2021 1:28 UTC (Sun) by mathstuf (subscriber, #69389) [Link] (5 responses)

> So we have infinity *0 = 5*0 = 4*0 =3*0 = 2*0 = 1*0 =0*0.

inf * 0 is an indeterminate form. It isn't zero, it isn't inf, it isn't a number. Your logic just breaks down here.
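IEEE 754 arithmetic happens to agree on this point. Here is a minimal sketch in C, assuming an implementation that follows IEC 60559 (Annex F) so that floating-point division by zero is defined. The function names are invented for the example.

```c
#include <math.h>
#include <stdbool.h>

/* NaN is the only value that compares unequal to itself. */
bool is_not_a_number(double x) {
    return x != x;
}

/* x / 0.0: +inf or -inf for nonzero finite x under IEEE 754. */
double divide_by_zero(double x) {
    double zero = 0.0;
    return x / zero;
}

/* Try to "undo" the division by multiplying by zero again.
 * inf * 0 is NaN, so x is not recovered. */
double divide_then_multiply_by_zero(double x) {
    double zero = 0.0;
    return divide_by_zero(x) * zero;
}
```

Both 5.0 and 4.0 map to the same infinity, and multiplying back by zero yields NaN rather than 0 or the original number, so the chain "infinity * 0 = 5 * 0 = 4 * 0" never gets started.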

> I think we've fallen foul of Godel's incompleteness theorem.

Umm, no. This is way before Gödel gets involved.

> In order to make the maths work, we need special rules outside of the maths like "divide by zero, you get infinity" and "divide by infinity, you get zero".

No, these rules don't exist (in normal mathematics, see later). They may "make sense" in specific instances, but they are nonsense if you try to extrapolate from them. Dividing by zero is not an operation you can do. It isn't inf, nan, or any other "thing", it just can't be done (at least in the axiomatic framework generally used; IEEE is notably lacking in axioms, so sure inf is fine there). I'm sure one could make an algebra where division by zero "makes sense" (cf. modular algebra or surreal numbers for other number systems; surreal *might* have division by zero, but it is…weird), but it might not be as useful as the algebra we use all the time.

> And a whole lot of physics depends on infinities.

I think you mean infinite series or infinitesimals, not infinities.

> I believe, because it just happens to be true that infinity does actually equal infinity.

Maybe you're thinking of the continuum hypothesis? Though I don't think string theory cares about it in particular (its truth is independent of ZF or ZFC). Though I don't know for sure in that specific instance.

> Infinity and zero are special cases, required by Godel,

I feel like you're not understanding Gödel. Gödel states that there are truths that are unprovable in any given proof system that is consistent. Or you can have all truths, but then you gain all falsities as well without the power to tell the difference. There's nothing in it about infinity or zero (as applied to number theory). Those existed before Gödel came along and are fine. I recommend the book Gödel's Proof by Nagel and Newman which is what finally turned the light bulb on for me (after not getting it in Gödel, Escher, Bach by Hofstadter and another reference I can't remember).

Python cryptography, Rust, and Gentoo

Posted Feb 14, 2021 10:06 UTC (Sun) by Wol (subscriber, #4433) [Link] (2 responses)

As I understand Godel, a simple way to put it is "you cannot use a system to prove itself correct". So it's easy to prove boolean logic correct, AS LONG AS you don't restrict the proof to using only boolean logic. It's easy to prove number theory correct AS LONG AS you don't restrict the proof to using only number theory. That's why we can't prove logic correct, because we have nothing else to throw into the mix.

So I have no qualms about throwing that infinity stuff into the proof, because otherwise you can't class zero as a number, because it behaves completely differently to all the other numbers. "My logic breaks down". Yes, because my logic (as per Godel) MUST be either incomplete, or inconsistent. Without that rule, it's inconsistent. With that rule it's incomplete. Pick one ... I've gone for consistency.

Cheers,
Wol

Python cryptography, Rust, and Gentoo

Posted Feb 14, 2021 13:08 UTC (Sun) by mathstuf (subscriber, #69389) [Link] (1 responses)

> As I understand Godel, a simple way to put it is "you cannot use a system to prove itself correct".

There are two parts.

The first is that any sufficiently powerful[1] system of arithmetic is incomplete. In this sense it means that there are statements one can make in the system for which no proof exists (of either its truth or falsity).

The second is that in such a system, the consistency of the system itself is one such statement.

It makes no such claim as to which statement is required.

> That's why we can't prove logic correct, because we have nothing else to throw into the mix.

Sure, but *that* system is also not provably correct. So what have you gained? You (claim to have) jumped one rung up a countably infinite ladder among a countably infinite selection of such ladders. Yay? :)

> So I have no qualms about throwing that infinity stuff into the proof, because otherwise you can't class zero as a number, because it behaves completely differently to all the other numbers.

Zero is a number. It works just fine. Division has a singularity at its value, but all kinds of functions have singularities. Do we need something else for tan(π/2)? Why not extend to the complex numbers with sqrt(-1) while we're at it? Quaternions? Octonions? Sedenions? Each of these is a separate algebra, an extension of algebra over the reals. We don't use them in general because we don't need the additional power they offer in day-to-day uses.

> Yes, because my logic (as per Godel) MUST be either incomplete, or inconsistent. Without that rule, it's inconsistent. With that rule it's incomplete. Pick one ... I've gone for consistency.

You're using the wrong definition of "consistency". It isn't consistent as in "all values must be able to take places of all other values in all expressions".[2] It is consistent as in "there are no contradictions between provable statements" which is *way* more important in (useful) mathematics.

[1] Peano arithmetic is sufficiently powerful. Arithmetic with just the natural numbers and addition (no multiplication), I believe, is not.
[2] You're still "inconsistent" in this sense about the square root of negative numbers for example. Why not toss those in? Why stop at trying to make division "consistent" in this sense when you're leaving out the trigonometric functions, square root, and the other infinite singularity-containing or domain-limited functions alone?

Python cryptography, Rust, and Gentoo

Posted Feb 14, 2021 16:34 UTC (Sun) by nix (subscriber, #2304) [Link]

> Each of these is a separate algebra, an extension of algebra over the reals. We don't use them in general because we don't need the additional power they offer in day-to-day uses.

... and because they gain annoying limitations (lose a useful property of the reals) with every such extension, and by the time you get to the sedenions there aren't really very many useful properties left (power associativity is more or less it).

Python cryptography, Rust, and Gentoo

Posted Feb 15, 2021 7:17 UTC (Mon) by NYKevin (subscriber, #129325) [Link] (1 responses)

> I feel like you're not understanding Gödel.

This is nothing to be ashamed of, by the way. I took an entire college course just focusing on the incompleteness theorems, and I still only have a very loose ability to follow their basic form. The incompleteness theorems are very, *very* hairy math. You cannot simply skim a couple of Wikipedia articles and expect to understand Gödel.

If you insist on trying to figure out what Gödel was saying without spending multiple years of your life studying the surrounding mathematics, then I would suggest starting out with Gödel Escher Bach. Yes, that's a very thick book, no, you should not skim it. The main advantage of GEB is that it actually does explain why and how completeness breaks down under arithmetic, using a real (if rather awkward) implementation of PA. For this reason, it is not an easy read, but it's better than an introductory model theory textbook.

Python cryptography, Rust, and Gentoo

Posted Feb 15, 2021 22:46 UTC (Mon) by rgmoore (✭ supporter ✭, #75) [Link]

GEB is not an easy read, but it is probably as easy and fun a read as any book that makes a serious pretense of explaining Gödel's Incompleteness Theorem is likely to be.

Python cryptography, Rust, and Gentoo

Posted Feb 12, 2021 3:29 UTC (Fri) by zev (subscriber, #88455) [Link] (4 responses)

> C is the perfect language for programming a PDP-11. It's just that modern computers behave completely differently to a PDP-11. [...] Someone needs to do a "C", and design a new low-level language for programming x64.

I see this kind of thing said a lot, and frankly it's never made the slightest bit of sense to me. What exactly about C itself is remotely PDP-specific? It doesn't strike me as specialized for a PDP or any other particular ISA so much as for the von Neumann model of computation, which was still pretty ubiquitous last time I checked. If we were all doing dataflow on FPGAs or whatnot, then sure, it'd be a poor fit, but we're still fetching and executing instructions (semantically) one at a time that load and store bytes in memory, pretty much just like Ken and Dennis did on their DECs.

"But modern machines have out-of-order execution and branch prediction and multi-level cache hierarchies!" I've seen some people argue...sure, but the whole point of that kind of microarchitectural sophistication is that it's microarchitectural -- it's not even directly visible at the assembly level, let alone in a high-level language. (Itanium exposed bits of its microarchitecture at the ISA level and look what a raging success that was.)

C's not without its shortcomings, but this notion that it's inappropriate for today's machines because it was initially run on a PDP-11 seems rather silly. Some of those shortcomings:

  • unsafety: if a 1970 PDP had been exposed to the variety of hostile inputs today's internet-connected machines are, this would have been just as much an issue.
  • pointer aliasing: C's challenges are much more entangled with modern compilers than they are with our hardware.
  • lack of abstractions/facilities for "programming in the large": pretty clearly unrelated to the underlying hardware.

None of these seem at all connected to its PDP origins. How would a "language for programming x64" differ?

Python cryptography, Rust, and Gentoo

Posted Feb 12, 2021 4:21 UTC (Fri) by roc (subscriber, #30627) [Link]

Memory accesses are much slower on modern machines relative to other operations, so it is more important than it used to be to avoid redundant loads and stores. Thus, alias analysis has become more important to optimization, and C compilers more aggressive about exploiting whatever assumptions they can get away with (e.g. type-based alias analysis).
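To make the redundant-load point concrete, here is a small C sketch. Note that it uses C99's restrict qualifier rather than type-based alias analysis, since restrict is the easier one to show in a few lines; the function names are hypothetical.

```c
#include <stddef.h>

/* Without restrict, the compiler has to assume that a store to dst[i]
 * might change *scale, so it must reload *scale on every iteration. */
void scale_may_alias(int *dst, size_t n, const int *scale) {
    for (size_t i = 0; i < n; i++)
        dst[i] *= *scale;
}

/* With restrict, the caller promises that dst and scale do not overlap,
 * so the compiler is allowed to load *scale once before the loop. */
void scale_no_alias(int *restrict dst, size_t n, const int *restrict scale) {
    for (size_t i = 0; i < n; i++)
        dst[i] *= *scale;
}
```

The two functions give the same answer whenever the arguments don't overlap. Passing &dst[0] as scale is legal for the first version (and changes the multiplier after the first store) but breaks the promise made to the second.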

Python cryptography, Rust, and Gentoo

Posted Feb 12, 2021 11:20 UTC (Fri) by Wol (subscriber, #4433) [Link] (2 responses)

> C's not without its shortcomings, but this notion that it's inappropriate for today's machines because it was initially run on a PDP-11 seems rather silly. Some of those shortcomings:

It's not that it's inappropriate. One of its major failings is that people *think* it's low level, but it doesn't map that well to what modern processors actually DO. In short, we treat it like the low-level language it *was*.

And it's that disconnect between what we think, and what actually happens, that causes all the problems.

Let's take your "unsafety" point, for example. On a PDP-11, I could have easily reasoned about what was ACTUALLY HAPPENING inside the CPU. That's not to say my programming is perfect, but my mental model of reality would have been reasonably close to reality. Nowadays, that's not true AT ALL.

And that's what bites kernel programmers all the time. Especially for the noobs, their mental model of what's going on is wildly out of kilter with reality. The compiler takes the code they wrote and massively rewrites it behind their backs. And then, I often get the impression, the CPU effectively runs the object code in an interpreter ...

That's the point of a low-level language. Imho, if you have well-written code in a low-level language, the compiler SHOULD NOT be able to do that much optimisation. That's not a description of modern C !!!

And therein lies our problem.

Cheers,
Wol

Python cryptography, Rust, and Gentoo

Posted Feb 12, 2021 19:08 UTC (Fri) by khim (subscriber, #9252) [Link] (1 responses)

> That's the point of a low-level language. Imho, if you have well-written code in a low-level language, the compiler SHOULD NOT be able to do that much optimisation.

If we used that definition then modern systems wouldn't have *any* low-level languages. Not even machine code conforms: CPUs with speculative execution may make massive changes to what you wrote in your code!

Python cryptography, Rust, and Gentoo

Posted Feb 12, 2021 23:13 UTC (Fri) by Wol (subscriber, #4433) [Link]

Didn't I say that the CPU was an object code INTERPRETER? :-)

But if I'm unable to REASON LOGICALLY about what the CPU is going to do, how on earth am I going to get deterministic (ie it does what I want it to do) behaviour from my program?

It's turtles all the way down and logic (and the ability to debug!) has just gone down the plughole with the bathwater ...

Cheers,
Wol

Python cryptography, Rust, and Gentoo

Posted Feb 15, 2021 15:28 UTC (Mon) by anton (subscriber, #25547) [Link] (4 responses)

Interestingly, I have seen repeated claims that current widely-used architectures have been designed for C. While I don't think that's what actually happened in most cases, the claim that C is a bad fit for current architectures is grotesque (although, admittedly, C does not have language features for all the architectural features that architectures have; some are reflected in GNU C extensions, e.g., labels-as-values or vector extensions).

Concerning a new low-level language, yes, we need that, not because C (used as a low-level language) is a bad fit for current architectures, but because the gcc and clang maintainers do not want to support C as a low-level language, and the mindset behind that seems to pervade the C compiler community.

Python cryptography, Rust, and Gentoo

Posted Feb 15, 2021 19:44 UTC (Mon) by Wol (subscriber, #4433) [Link] (3 responses)

Thing is, C *is* a bad fit for modern architectures. It has a whole bunch of features that are undefined, or implementation-defined, which are MEANT to be low-level "match the hardware". Except that they aren't.

Let's just get rid of all these so-called "low level" cock-ups, accept that C is now a high-level language and that undefined and implementation-specific behaviours shouldn't exist, and move on.

Someone brought up retpolines - that monstrosity that tries to make sure that both the hardware and the software agree on the imaginary hardware interface that's where the CPU microcode and language macrocode meet ... wtf are we doing!

Cheers,
Wol

Python cryptography, Rust, and Gentoo

Posted Feb 16, 2021 9:27 UTC (Tue) by anton (subscriber, #25547) [Link] (2 responses)

I did not mean the language lawyer version of C. That version is a bad fit for any architecture (including the PDP-11). However, it's great for adversarial compiler maintainers who want to do whatever they want (e.g., produce good benchmark results, grudgingly cater to requests by paying customers and tell other users that their bug reports are invalid), because this version allows them to always blame the programmer for something or other. After all, no terminating C program is a "strictly conforming program", and whenever someone mentions "conforming program" (the only other conformance level for programs defined in the C standard), the adversaries produce advocacy for why we should consider "conforming programs" meaningless (interestingly, they claim that we should take the rest of the C standard at face value).

I mean C as used in many programs, which has a pretty simple correspondence to architectural features (and you see it easily in contexts where optimization does not set in, e.g., when you separately compile a function that performs just one thing).

The adversaries want us to consider C as a high-level language with no correspondence to the bare metal; that makes it easier to blame the programmers and absolve compiler maintainers of responsibility. The question is why any programmer would want that. We have plenty of high-level languages, often better than C in that capacity, but not that many low-level languages; basically C is the only popular one.

Concerning a totally defined C: I think that is at odds with a low-level language for multiple architectures, but as most (all?) C compilers have demonstrated for the first quarter-century of the language, that's no hindrance for implementing C in a benign rather than adversarial way. And for those who don't know how to do that, I have written a paper (which also explains why I consider totally defined C impractical).

I don't know what retpolines have to do with any of that. They are a workaround for a vulnerability in some microarchitectures and they cannot be implemented in C (there are limitations to C's low-level nature). The vulnerability should be fixed at the microarchitecture level, and I expect that the hardware manufacturers will come out with microarchitectures that do not have this vulnerability.

Python cryptography, Rust, and Gentoo

Posted Feb 16, 2021 13:18 UTC (Tue) by mathstuf (subscriber, #69389) [Link] (1 responses)

> I did not mean the language lawyer version of C.

What version of C are you talking about? The ISO standard? The image of C that you have in your head? My head? The C that GCC 2.95 accepted and worked with?

Let's imagine a world where C compilers magically stop doing "magic optimization" steps that tend to break code. What's going to happen is that C programmers who don't already know this stuff are going to have their code pessimized and, presumably, run slower in practice. What are they going to do? Start writing their C the way the compiler was transforming it internally during optimization passes anyway. They'll learn C better (and, I imagine, be less satisfied with it), and hopefully use linters and tooling to tell them where their NULL checks are inverted with uses and such.
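For readers who haven't met the "NULL check inverted with use" pattern, here is a minimal sketch in C. The struct and function are hypothetical and only illustrate the ordering.

```c
#include <stddef.h>

/* Hypothetical record type, used only for this illustration. */
struct packet {
    int length;
};

/* Check first, then dereference.
 *
 * The inverted form reads p->length into a local *before* testing
 * p == NULL. Dereferencing a null pointer is undefined behavior, so an
 * optimizer may conclude p is non-null and delete the later check. */
int packet_length(const struct packet *p) {
    if (p == NULL)
        return -1;      /* sentinel chosen for this sketch */
    return p->length;
}
```

With the check ahead of the access there is nothing for the optimizer to "prove" away, and a linter has nothing to complain about.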

Rereading that, maybe it wouldn't be so bad. Maybe folks would migrate to better languages. Others might actually learn more about how loose C is in practice. The optimization passes could be migrated to the linters rather than the compiler to explain "hey, you could reorder your code to This Way and gain some performance". Maybe these passes would then gain some prose explaining what and why of them.

Then again, I have no idea how such a C would be specified at ISO to disallow these optimizations while still allowing for architectures to not be forced into twos-complement representations or the like because "it's faster/easier for them".

Python cryptography, Rust, and Gentoo

Posted Feb 16, 2021 15:28 UTC (Tue) by anton (subscriber, #25547) [Link]

As I wrote: "I mean C as used in many programs", and I actually point to a paper where I explain this in more detail. As for "pessimizing", it's certainly the case that advocates of adversarial C compilers claim that the adversarial behaviour is good for performance, invariably without giving any numbers to support these claims; maybe they think that repeating these claims and wishful thinking makes them true.

Wang et al. checked that for gcc-4.7 and clang-3.1 and found that the adversarial "optimizations" produced a minimal speedup on SPECint 2006, and that speedup could also be achieved by small changes to the source code in two places.

Yes, a performance advisor that points out places where changing the source code may improve performance would be a more productive way to spend both the compiler maintainers' and the compiler users' time than writing "optimizations" that break existing code, "sanitizers" to find the places where the "optimizations" break it, and, on the programmer's side, "sanitizing" their code to withstand the latest attacks by "optimizers" (but not the next one). Moreover, such an advisor could point out optimizations that a programmer can do but a conformant compiler cannot (e.g., because there is an alias that even language lawyering cannot explain away). Of course, unlike maintained programs, benchmarks would not benefit from such a performance advisor, that's why no work goes into performance advisors; and conversely, "optimizations" don't break benchmarks (the compiler maintainers revert the "optimization" in that case), unlike other programs, and that's why we see "optimizations".

But what's more, by insisting on a very limited interpretation of what C means, the language lawyers remove opportunities for optimizations that programmers can make in the source code. I have discussed this at length.

I, too, am skeptical that trying to change the C standard is the way to get rid of adversarial C compilers (not least because you won't be able to achieve consensus with the implementors of these compilers on the committee), and I guess that's why advocates of adversarial compilers like to direct the blame for the misdeeds of these compilers at the standard, rather than at the compiler maintainers. It's not the standard that requires them to miscompile existing, tested and working, programs, it's the compiler maintainers' choice, so they alone are responsible.

Concerning architectures with other than two's complement representation of signed numbers, the last new such architecture was introduced over 50 years ago, and descendants of such architectures exist only in very few places and run select programs. There are far fewer of these machines than of the architectures (all two's-complement) that are not LLVM targets and that have brought about the parent article. And coding in a way that makes use of knowledge about the representation is one of the things that you can do for performance in a low-level language (and compilers do not perform all of these optimizations).

Python cryptography, Rust, and Gentoo

Posted Feb 12, 2021 10:55 UTC (Fri) by khim (subscriber, #9252) [Link] (5 responses)

> undefined behaviour is whatever the hardware does

If that is the definition of undefined behavior, then what the heck is implementation-defined behavior?

No, the confusion is much deeper. “Undefined behavior” always meant what it means today. And, in fact, most types of undefined behavior don't cause any confusion. Attempts to read through a pointer after calling free, or to read from an uninitialized variable, rarely cause confusion.

Something like this:

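#include <stdio.h>

/* Writes 42 to a local, hoping it stays behind in that stack slot. */
int foo() {
  int i;
  i = 42;
}

/* Deliberately reads an uninitialized local: undefined behavior. */
int bar() {
  int i;
  return i;
}

int main() {
  foo();
  printf("%d\n", bar());
}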

Should code like the above work or not? Clang breaks it even when compiled with -O0 (but gcc with -O0 works, although any other optimization level breaks it).

I don't know any practicing programmer who says compilers should support code like the above example.

The tragedy happened when decisions of the C standards committee clashed with developers' expectations. Because C was designed for writing portable programs, lots of things which are actually well-defined (yet different!) on many platforms were put into the “undefined behavior, don't use” bucket (instead of the “implementation-defined, use carefully” bucket).

The intention was, of course, to make programs portable, but something completely different happened instead: so many “implementation-defined, use carefully” things were marked as “undefined behavior” that developers started thinking that “undefined behavior” means precisely that: whatever the hardware does.

And now we have all that mess.

But no, “undefined behavior” never meant whatever the hardware does. Not even in C89. It was always “something your program should never do.”

Python cryptography, Rust, and Gentoo

Posted Feb 12, 2021 12:52 UTC (Fri) by Wol (subscriber, #4433) [Link] (2 responses)

I think that mistake proves my point ... :-)

Undefined, implementation dependent, whatever. The point is, it BREAKS THE PROGRAMMER'S MENTAL MODEL.

And however much you want to blame the programmer, if programmers keep on doing it, it's a design fault ...

Cheers,
Wol

Python cryptography, Rust, and Gentoo

Posted Feb 12, 2021 17:38 UTC (Fri) by khim (subscriber, #9252) [Link] (1 responses)

> And however much you want to blame the programmer, if programmers keep on doing it, it's a design fault ...

Got it. So we have issues with C which even Rust doesn't fully address:

- if you put a check outside of a loop then it won't test all the elements of the array.

- if you initialize your variable after it's used then the program doesn't work.

- if you change a variable then other variables (which were calculated on the basis of that variable) don't change as they should.

- you need to actually allocate memory for your data structure; just declaring a pointer doesn't mean you can use it.

And I can probably add dozens more.

</sarcasm off>.

Granted: these are the expectations of people who started studying programming about two months ago… but they are very, very common.

Should we do something about them? If yes then what… if no, then why the heck no?…

> The point is, it BREAKS THE PROGRAMMER'S MENTAL MODEL.

Sure, but pretty much anything can break it if the programmer is not taught properly.

C (and C++) suffer mostly from Hyrum's Law: many things which were not supposed to work… actually work with real-world compilers. And then, later… they stop (even if the documentation always warned not to use them)… that is when trouble happens (think of the glibc story).

That's the only problem with C/C++… but it's pretty severe: C language on paper and C language as implemented by typical compiler were different for so long that it's unclear what can be done at this point.

The thing is: I'm not sure switching to Rust (or any other language) would save us. After 10-20-30 years they would be in the same situation, too.

I'm not even really sure what can be done about it. Have just one fixed compiler without any changes? I don't think it would really work.

Python cryptography, Rust, and Gentoo

Posted Feb 12, 2021 22:20 UTC (Fri) by roc (subscriber, #30627) [Link]

> I'm not sure switching to Rust (or any other language) would save us. After 10-20-30 years they would be in the same situation, too.

No they won't.

Rust is designed to eliminate "undefined" or "implementation defined" behavior outside of explicit "unsafe" blocks. Yes, there will be compiler bugs etc, but really there will be vastly less of such problematic behaviors in Rust programs than in C and C++ programs.

That means we can expect Rust programs to behave much more consistently over time than C/C++ programs, as hardware and compilers evolve.

Python cryptography, Rust, and Gentoo

Posted Feb 12, 2021 18:13 UTC (Fri) by anselm (subscriber, #2796) [Link] (1 responses)

> The tragedy happened when decisions of the C standards committee clashed with developers' expectations. Because C was designed for writing portable programs, lots of things which are actually well-defined (yet different!) on many platforms were put into the “undefined behavior, don't use” bucket (instead of the “implementation-defined, use carefully” bucket).

AFAIR, the C89 standard carefully distinguished between “undefined” and “implementation-defined” behaviour. “Implementation-defined” behaviour is very emphatically not “undefined” behaviour, it's just that it is not defined by the language standard but by the various implementations (or their underlying platforms).

For example, the result of the >> operator applied to a negative signed integer is implementation-defined – many platforms offer a choice between arithmetical and logical right-shift and the compiler writer needs to pick one of the two, but after that, that particular compiler on that platform will always do it that way. (The reason why this particular behaviour was declared implementation-defined is probably that Ritchie didn't stipulate what was desired and by the late 1980s there were enough C implementations doing it one way or the other that nobody could agree anymore on which way was “correct” without making the other half of the industry “wrong”, and breaking programs that relied on the other behaviour.)

With appropriate care, you can exploit implementation-defined behaviour – especially if your set of implementations is small –, but with undefined behaviour, all bets are off. If you're interested in C code that is maximally portable between implementations, implementation-defined behaviour is, of course, something to avoid, but again it is a good idea to flag it as such in the standard so people can be aware of it.
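If you would rather not depend on which right shift a given compiler picked, both behaviours can be spelled out using only operations the standard defines. This is a sketch, assuming int is at least 32 bits wide; asr32 and lsr32 are names invented for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Arithmetic right shift: rounds toward negative infinity, like a
 * sign-extending shift instruction. Only non-negative values are ever
 * shifted, so the result does not depend on the implementation. */
int32_t asr32(int32_t x, unsigned n) {
    assert(n < 32);                   /* larger counts are undefined */
    if (x >= 0)
        return x >> n;
    /* For negative x, -(x + 1) is non-negative and cannot overflow. */
    return -((-(x + 1)) >> n) - 1;
}

/* Logical right shift: zero-fills from the top. Converting to an
 * unsigned type first makes the shift fully defined. */
uint32_t lsr32(int32_t x, unsigned n) {
    assert(n < 32);
    return (uint32_t)x >> n;
}
```

For example, asr32(-8, 1) gives -4 while lsr32(-8, 1) gives 0x7FFFFFFC, whatever the compiler does for a plain -8 >> 1.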

Python cryptography, Rust, and Gentoo

Posted Feb 12, 2021 20:31 UTC (Fri) by khim (subscriber, #9252) [Link]

> AFAIR, the C89 standard carefully distinguished between “undefined” and “implementation-defined” behaviour.

Yes, but that wasn't my point.

You have explained perfectly why right shift of the negative value is “implementation-defined” behavior. All is very logical and proper.

But what about a shift by a negative value? Many (most?) low-level programmers expect that this would be “implementation-defined”, too. After all, most CPUs do something predictable when they get a negative value as the shift amount (different ones do different things, but all CPUs I know do something predictable). It's more or less the same as with a shift of a negative value: there may be different outcomes on different CPUs, yet there would be some outcome, right?

Well… no.

If you actually open the C89 standard you will see that “the result of a right shift of a negative-valued signed integral type (6.3.7)” is listed in “Appendix G, part 3 Implementation-defined behavior”… yet “an expression is shifted by a negative number or by an amount greater than or equal to the width in bits of the expression being shifted (6.3.7)” is not in part 3… it's in “Appendix G, part 2 Undefined behavior”!

I would love to know why that difference is there. Do some CPUs lock up when faced with a negative shift? Or does something crazy happen (like: it takes so long that DRAM starts losing its contents)? Or maybe some compiler couldn't handle it? Or… maybe the committee just decided that if they declared it “undefined behavior” then people would stop using it and compiler writers could generate better code?

I have no idea, really. But the end result: -1 >> 1 is “implementation-defined behavior” yet 1 >> -1 is “undefined behavior”.

To most low-level guys this is sheer insanity… yet that's how C89 is defined.
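For contrast with the language this thread is arguing about, here is a minimal sketch of how the same two cases look in Rust. It is not taken from the C89 text, and the values are picked only for illustration. The shift amount of the checked and wrapping methods is a u32, so a negative amount cannot even be expressed, and an over-wide amount gives a defined answer (None, or a masked shift) rather than undefined behavior.

```rust
fn main() {
    // Right shift of a negative signed value: defined as an arithmetic shift.
    let negative: i32 = -8;
    println!("{}", negative >> 1);

    // Shift amounts computed at run time, as they would be in real code.
    let width = i32::BITS; // 32 for i32

    // In range: yields Some(result).
    println!("{:?}", 1i32.checked_shr(width - 1));
    // Out of range (>= bit width): yields None instead of undefined behavior.
    println!("{:?}", 1i32.checked_shr(width));
    // wrapping_shr masks the amount to the bit width, so 32 acts like 0.
    println!("{}", 1i32.wrapping_shr(width));
}
```

With the plain >> operator, an over-wide run-time shift amount panics in debug builds, which is still a defined outcome.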

> If you're interested in C code that is maximally portable between implementations, implementation-defined behaviour is, of course, something to avoid, but again it is a good idea to flag it as such in the standard so people can be aware of it.

It's actually done in exactly this way. Not only does the C standard distinguish “unspecified behavior”, “implementation-defined behavior”, and “undefined behavior”; it actually has all of them listed in three separate lists (the parts of Appendix G), to make sure no one mixes them up.

The only problem: actual programmers don't consult these lists when they are writing code. They guess, based on their mental model. For most programmers that mental model says one of two things. Either you can't shift a negative value and you can't shift by a negative value, either (these are the sorta-lucky ones: they may not write the fastest code, yet they tend to write correct code). Or, alternatively, you can push anything you want into a shift and get something back… and then they write something like (a >> (i-1)) * i with the comment /* if i == 0 then result is zero and we don't care what a >> (i-1) produces */… only then a modern compiler “looks” at that, notices that i can never be zero (because that would lead to undefined behavior), happily nukes the if (i == 0) check, and removes the “dead code”.

And that is where the shouting starts. The C89 standard clearly says that “undefined behavior” could lead to anything at all… yet “advanced programmers” say that “removing code which I specifically wrote there to catch errors is not anything at all in my book”… hilarity ensues.

P.S. I wonder if people who developed C89 are still alive and can say what they think about all that… does anyone know?

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 15:58 UTC (Thu) by luto (subscriber, #39314) [Link]

C++ breaks my code with almost every major gcc update due to improved standard compliance.

Python cryptography, Rust, and Gentoo

Posted Feb 12, 2021 21:55 UTC (Fri) by ceplm (subscriber, #41334) [Link] (9 responses)

> Rust is incredibly good at avoiding compatibility breaks in practice.

Just an anecdotal piece of my personal experience. The only piece of Rust software I was following (for a year or so) was https://github.com/daa84/neovim-gtk/ and it forced me to upgrade my Rust compiler twice, because the code of the later versions was no longer compatible with the old compiler.

Python cryptography, Rust, and Gentoo

Posted Feb 12, 2021 22:29 UTC (Fri) by roc (subscriber, #30627) [Link] (8 responses)

That is a completely different issue.

The original commenter said:
> [Rust] adds backward compatibility breaks. It isn’t as bad as Python at this, but then the Python people are not advocating Python as a systems language. C’s one great strength is that C code is C code. It tends to just keep working over time.

Likewise, Rust code tends to keep working over time: new versions of the compiler can compile old Rust code. That's what we were talking about.

You are talking about a different issue: can old versions of the compiler compile new Rust code? No, not always, because Rust adds features and some Rust projects like to use those new features.

If you're developing software and you want to never upgrade your compiler, that's fine. Just don't, and stick with the set of features it supports.

If you're consuming someone else's software and never want to upgrade your compiler, better pick projects with the above policy.

Python cryptography, Rust, and Gentoo

Posted Feb 13, 2021 18:51 UTC (Sat) by ceplm (subscriber, #41334) [Link] (7 responses)

Of course, technically, you are completely correct, but I don’t have these problems (or at least not that frequently) with programs in most other programming languages (not mentioning C, because that’s really unfair). Either Rust is still too unstable, or older versions were so lacking that programmers are forced to use the latest features from the bleeding-edge compilers. Am I right?

Python cryptography, Rust, and Gentoo

Posted Feb 13, 2021 22:59 UTC (Sat) by mathstuf (subscriber, #69389) [Link]

Forced to use the latest features? Not likely. Allowed them to remove bad patterns, simplify others, or just use new ones? Sure. There's the nightly-only `-Z minimal-versions` flag to use the *minimum* declared versions of dependencies, but discussions about stabilizing it and making it work across the ecosystem haven't gotten very far (and I've submitted over a dozen PRs to make my dependencies work with the flag).

Python cryptography, Rust, and Gentoo

Posted Feb 15, 2021 1:21 UTC (Mon) by roc (subscriber, #30627) [Link]

I don't know why the neovim-gtk authors bumped their MSRV but most likely they saw some small feature that would be useful, and view upgrading the compiler as a trivial step (`rustup upgrade`), and so saw it as an easy win. If upgrading the Rust compiler is actually hard for some significant part of their community (I don't know why that would be), let them know!

Python cryptography, Rust, and Gentoo

Posted Feb 15, 2021 10:25 UTC (Mon) by laarmen (subscriber, #63948) [Link] (4 responses)

I think it has to do with the fact that Rust has really good tooling to manage the toolchain from a developer PoV along with a good backward compatibility. Upgrading the rustc version is assumed to be a trivial step (rustup update and voilà), and if you're an application author you can provide binaries for most platforms, which means the upgrade doesn't concern those users as there is no need for them to upgrade their runtime.

In contrast, most other languages either have a runtime component, which makes upgrading painful for all users, and/or an upgrade process that was not trivial when the community around the language started forming its habits. I would assume you'll find a similar attitude in the Go ecosystem.

Python cryptography, Rust, and Gentoo

Posted Feb 15, 2021 12:03 UTC (Mon) by ceplm (subscriber, #41334) [Link] (3 responses)

That’s what I call immature environment: if it works on my laptop, it’s perfect, ship it!

Without considering maintenance costs, long-time support, and or combining multiple Rust projects in one system (e.g., Linux/Mac OS X distributions).

Python cryptography, Rust, and Gentoo

Posted Feb 15, 2021 13:47 UTC (Mon) by laarmen (subscriber, #63948) [Link] (1 responses)

You're putting words in my mouth.

The fact that application developers using Rust don't particularly care about sticking to a lower version of Rust is indeed partly because the language is still evolving and gaining features, for instance async/await which was not available in the version of rustc originally shipped with Debian (the situation has since changed). But my point is that it also comes from the fact that it is *very* easy for a Rust developer using what is considered the standard way of developing in Rust to install and use multiple versions of a toolchain, and for most users *from their PoV* the version of Rust doesn't matter, since either they compile from source and can thus be expected to use standard (for Rust) tooling, or they are using already-compiled binaries with no runtime dependency on the version of Rust.

Saying that maintenance costs, long-time support and combining multiple Rust projects in one system aren't considered by the Rust community is just plainly false. It's just a matter of perspective: they consider that rustc is just another build-time dependency and it's okay to require those that build to bump it, which on its face makes sense: you're already updating something in your system (the project itself), so it shouldn't be a big bother to update something else.

I'm not saying this is the absolute right choice, as there clearly is a mismatch with how distros work, but things are not black and white.

Python cryptography, Rust, and Gentoo

Posted Feb 15, 2021 14:32 UTC (Mon) by ceplm (subscriber, #41334) [Link]

> You're putting words in my mouth.

I am not. I didn't mean it as a quote of you, nor did I claim that you would support this statement; only that it seems to me that this attitude is too present in the Rust community.

> […] the language is still evolving and gaining features, for instance async/await which was not available in the version of rustc originally shipped with Debian (the situation has since changed).

Yes, in other words, the language is still too immature for the projects of the size of Linux distros and similar.

Python cryptography, Rust, and Gentoo

Posted Feb 15, 2021 15:28 UTC (Mon) by Cyberax (✭ supporter ✭, #52523) [Link]

> and or combining multiple Rust projects in one system (e.g., Linux/Mac OS X distributions).
It works just fine with multiple Rust projects and environments (see https://doc.rust-lang.org/edition-guide/rust-2018/rustup-... ). I have several Rust versions installed on my laptop side-by-side for tests and cross-compilation.

And the fact that you mention Windows/macOS is especially funny, because Rustup has native installers for them, making experimenting with them very easy.

And of course, Cargo makes sure that libraries don't interfere with each other, so each project gets its own dependency closure.

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 5:28 UTC (Thu) by roc (subscriber, #30627) [Link] (4 responses)

> Rust is being promoted as a systems language when it doesn’t work on all of the hardware needed by a systems language.

Who gets to define which hardware needs to be supported for a language to be "a systems language"?

When gcc drops support for a CPU architecture, does that mean "systems languages" no longer need to support that hardware? Did someone appoint the gcc maintainers as the guardians who get to define what it means to be "a systems language"?

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 6:12 UTC (Thu) by jalla (guest, #101175) [Link] (3 responses)

No, but when gcc drops support for an architecture you're still able to build C software for the target. brcm still ships gcc4 as the primary toolchain, as an example. Many others additionally ship gcc4 as the primary toolchain. Requiring software that never existed to build software for real systems (like s390x) is preposterous and missing the $ of the market.

What has happened here is the epitome of the python mindset, which is "if it doesn't impact me, it doesn't matter". I'm not going to take a stance on if this is right or wrong, but it's actively harmful against users.

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 6:54 UTC (Thu) by StillSubjectToChange (guest, #128662) [Link]

"Requiring software that never existed to build software for real systems (like s390x) is preposterous and missing the $ of the market."

Rust supports the s390x as a Tier 2 platform, Gentoo just doesn't have packages for it yet. However, Rust does not support the s390 and neither has Linux since 2015. Besides, if a company bought an IBM mainframe then they shouldn't be making *any* complaints about support from open source projects.

"I'm not going to take a stance on if this is right or wrong, but it's actively harmful against users."

Realistically it isn't very many users. If a platform is so anemic that it doesn't have an LLVM backend and isn't implementing one, then it's functionally abandoned.

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 9:07 UTC (Thu) by josh (subscriber, #17465) [Link]

> brcm still ships gcc4 as the primary toolchain, as an example

Which means that the same complaints would arise for using any C features that GCC 4 doesn't support, such as most C11 features.

I remember seeing complaints when projects dropped support for pre-C89 compilers. Those complaints don't make it reasonable to keep K&R C support forever.

> Requiring software that never existed to build software for real systems (like s390x)

Rust supports s390x. It sounds like Gentoo didn't ship Rust for that platform.

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 14:07 UTC (Thu) by banana (guest, #144773) [Link]

LLVM supports s390x. It doesn’t support s390, which hasn’t been manufactured in 21 years.

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 6:11 UTC (Thu) by StillSubjectToChange (guest, #128662) [Link] (24 responses)

"The third major issue is that Rust has the cargo system as part of its standard use model. This encourages bad behavior. I do not care how “memory safe” your language is if people regularly include unvetted code from some repo.
...
Either way, C as a tool is blameless of programmer error."

So C is blameless for its numerous pitfalls, but Rust is responsible for people misusing cargo? No, I don't think that is a reasonable opinion.

"The final point that I have yet to hear properly explained is why C is good enough to write other languages in, but not okay for others to use."

This seems like a complete non sequitur. If your language compiler is written in C then it's only a build time dependency. If your language interpreter/vm is written in C then you must spend a lot of time making sure it's reliable and secure. In either case it is much safer than having everyone write C.

But C is on the way out for implementing new programming languages. LLVM is a C++ based project, they are even implementing their libc in C++ instead of C. GCC is using C++ for more and more of the compiler. The Go toolchain doesn't need C at all. All major JavaScript engines are written in C++. The list goes on, but clearly people don't believe that C is fit for implementing their languages anymore and will choose not to if possible.

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 10:41 UTC (Thu) by Sesse (subscriber, #53779) [Link] (19 responses)

Well, Rust is the only language I've seen where you cannot have a global variable without pulling in a crate.

(You can have a global variable, but not reasonably access it without a mutex, and to initialize that mutex, you de facto need the lazy_static crate.)

There are so many things I think Rust has done right. I really want to love the language. But I so dislike that it is yet another language with dependency sprawl and its own package manager that works for its one language only.

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 11:51 UTC (Thu) by moltonel (subscriber, #45207) [Link] (18 responses)

You can use std::sync::Once instead of an external crate, or even just a plain static for basic types.

If you want your global to be mutable after init then of course you need to protect it with a mutex or similar, that's a basic C mistake that Rust is protecting you from.
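A minimal sketch of both options, using only the standard library. The names GREETING, CONFIG_VALUE, INIT_RUNS and expensive_default are made up for the illustration; the value 42 stands in for whatever one-time setup you actually need.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Once;

// A plain static is enough for basic, immutable data.
static GREETING: &str = "hello";

// One-time initialization with std::sync::Once and an atomic slot.
static INIT: Once = Once::new();
static CONFIG_VALUE: AtomicUsize = AtomicUsize::new(0);
// Counts how often the initializer actually ran (illustration only).
static INIT_RUNS: AtomicUsize = AtomicUsize::new(0);

// Hypothetical stand-in for some costly setup work.
fn expensive_default() -> usize {
    INIT_RUNS.fetch_add(1, Ordering::SeqCst);
    42
}

fn config_value() -> usize {
    // The closure runs at most once, even if many threads race to get here.
    INIT.call_once(|| {
        CONFIG_VALUE.store(expensive_default(), Ordering::Relaxed);
    });
    CONFIG_VALUE.load(Ordering::Relaxed)
}

fn main() {
    println!("{} {}", GREETING, config_value());
    println!("{} {}", GREETING, config_value());
    println!("initializer ran {} time(s)", INIT_RUNS.load(Ordering::SeqCst));
}
```

This covers the init-once case; a global that stays mutable afterwards still needs a lock or atomics around every access.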

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 11:56 UTC (Thu) by Sesse (subscriber, #53779) [Link] (17 responses)

Yes, I wanted it to be mutable after init; it's a cache of state between multiple HTTP requests. So I need a Mutex, and how do you initialize a Mutex safely without lazy_static?

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 14:07 UTC (Thu) by mathstuf (subscriber, #69389) [Link] (15 responses)

You could do what `lazy_static` is doing behind the scenes. It's not compiler magic or anything, but plain Rust code. The crate just packages it up in a nicer API than stamping out the manual code all the time.

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 14:13 UTC (Thu) by Sesse (subscriber, #53779) [Link] (14 responses)

Sure, but when I searched around for this, people were “do not reimplement lazy_static, you'll be doing it wrong, use the crate”.

It's a bit like Turing completeness. It's _possible_ to do without a crate, but it's definitely much harder, and it doesn't really matter what you do as a single developer as long as the entire ecosystem goes the other way.

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 14:25 UTC (Thu) by mathstuf (subscriber, #69389) [Link] (13 responses)

Sure, I didn't say it'd be easy. The fact that lazy_static makes it so easy behind its easy-to-use API is a *benefit*, not a downside.

Global mutable state is a tricky thing in C and C++ too. It's a mistake that they make it so easy to appear that one got it right. The fact that one can easily add a dependency that does something as "trivial" as getting this right is a vast improvement. C and C++ dependency management is a PITA on a good day with reasonable projects as dependencies. Rarely do you get one, never mind both.

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 14:44 UTC (Thu) by Sesse (subscriber, #53779) [Link] (12 responses)

What? In C++, I can have a global std::mutex with no external dependencies. There's absolutely no reason why Rust couldn't have a simple way of initializing one in std, without requiring a crate.

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 15:37 UTC (Thu) by mathstuf (subscriber, #69389) [Link] (11 responses)

Sure, you can make the mutex, but what data is it guarding? Who ensures the mutex is *used* when accessing the guarded data (spoiler: no one)? When does any of it get initialized at runtime? C++ doesn't guarantee any of this stuff (other than "it happens before it's needed" for the initialization). For example, static initializers might only be run when other code in the .o is used. If the guarded data lives in the data section and is looked up in some way other than through the TU that contains the mutex (e.g., `dladdr` or something), the mutex initializer might not be run at the right time, so good luck with that.

In the case of lazy_static specifically, there is another way that looks better and is also more performant[1]: once_cell. It doesn't use a macro and looks cleaner anyways (no weird macro-syntax of `static ref`). So in this specific instance, the stdlib would have gained a subpar API for such a thing anyways. This is, IMO, a vast improvement over the Python way which enshrines bad APIs because "they were available first" and is how one ends up with urllib, urllib2, urllib3, and requests.

[1]https://github.com/async-rs/async-std/issues/406

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 15:58 UTC (Thu) by Sesse (subscriber, #53779) [Link] (10 responses)

> Sure, you can make the mutex, but what data is it guarding? Who ensures the mutex is *used* when accessing the guarded data (spoiler: no one)?

Please stop the whataboutism.

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 16:13 UTC (Thu) by mathstuf (subscriber, #69389) [Link] (7 responses)

You're arguing that Rust makes global mutable state harder to do. Yes, it does. It is in service of helping to point out improper use of such things. That is, IMO, important in such a discussion of comparing the ease of making such declarations in each language.

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 16:30 UTC (Thu) by Sesse (subscriber, #53779) [Link] (6 responses)

No, I am arguing that Rust makes even simple things hard to do _without pulling in crates_. I am notably not making a comparison against C++.

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 16:46 UTC (Thu) by mathstuf (subscriber, #69389) [Link]

I'm arguing that what seems "simple" is not as simple as you might think it is based on how "easy" C and C++ have made it in the past.

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 19:36 UTC (Thu) by Cyberax (✭ supporter ✭, #52523) [Link]

Why do you even need global state? That's an uncommon thing to use, so having users write the code for it is perfectly fine.

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 20:30 UTC (Thu) by marcH (subscriber, #57642) [Link] (3 responses)

If you think a global mutable state is a "simple thing" then you have a serious memory safety problem.

It took a very long wait, but in 2011 even C finally got a memory model that realizes concurrency has to be part of the language

https://en.wikipedia.org/wiki/C11_(C_standard_revision)

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 20:41 UTC (Thu) by Sesse (subscriber, #53779) [Link] (2 responses)

I think initializing a mutex should be simple thing!

I'm giving up this discussion; too many people are interested in arguing against strawmen, and too few people are interested in discussing the actual problem. It's pretty off-putting when a community's reaction to criticism is “who needs to do that anyway”.

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 20:59 UTC (Thu) by mathstuf (subscriber, #69389) [Link]

> I think initializing a mutex should be simple thing!

And it is, mechanically. Semantically, it is *not* a simple thing. These kinds of issues are what Rust is aiming to tackle as a whole.

Could once_cell or lazy_static be added to the stdlib? Sure. Why not yet? Maybe the API isn't sufficiently nailed down, soundness cases considered, etc. enough for the stdlib. Until then, crates.io is a handy place for these things to mature *while getting real world (ab)use*.

Some context for the C++ side of things. Improvements living in some random P paper on the ISO C++ standard committee mailings aren't going to get battle-hardened by anyone other than the author without the heroic work of making them available on existing language specs (e.g., Eric Niebler's Ranges library). This kind of stuff is nigh impossible with language features too. There are still errata coming in for `for (auto i : expr)` for crying out loud because this is undefined behavior:

std::vector<std::string> func();
// ...
for (auto i : func()[0]) // oops, you're iterating on a temporary that just got destructed
{}

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 21:58 UTC (Thu) by Cyberax (✭ supporter ✭, #52523) [Link]

> I think initializing a mutex should be simple thing!
It's actually not. For example, on some systems mutexes can require a system call.

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 16:14 UTC (Thu) by laarmen (subscriber, #63948) [Link]

Actually, this is relevant to the discussion. A Rust mutex isn't standalone but a container, and it must be able to guarantee that its contents are a valid value for the contained type, whereas a C++ std::mutex doesn't have any knowledge of the data it protects. The former approach makes it possible to guarantee protected access, but it makes the mutex implementation that much more difficult.
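To make the "container" point concrete, here is a minimal sketch (the function name count_in_parallel is made up for the illustration). The counter lives inside the Mutex, so the only way to reach it is through lock(), and the compiler rejects any attempt to touch it without the guard.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Spawns `threads` workers that each add `per_thread` to a shared counter.
fn count_in_parallel(threads: usize, per_thread: u32) -> u32 {
    // The Mutex owns the u32; there is no separate, unguarded variable.
    let counter = Arc::new(Mutex::new(0u32));

    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..per_thread {
                    // lock() returns a guard; the data is only reachable through it.
                    *counter.lock().unwrap() += 1;
                }
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    let total = *counter.lock().unwrap();
    total
}

fn main() {
    println!("{}", count_in_parallel(4, 1000));
}
```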

I agree that it's a bit of a shame to have to resort to a third-party crate to easily have a global mutex-protected variable, but as pointed out by someone else in the thread, it turns out the approach used by the prominent crate for this might not be the best after all, which makes me glad it has not been imported into std :)

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 17:24 UTC (Thu) by farnz (subscriber, #17727) [Link]

All of those are reasons why what appears to be simple in C++ ("just" create a std::mutex at global scope) is in fact not simple at all once you consider the details. It's just that C++'s (and C's) way of doing things relies on you knowing that it's not that simple, and that you have a whole boatload of other complexity to consider when doing this.

And Rust globals are writable - you have to use the unsafe marker with writes to a global, because writing to a global without sufficient consideration of concurrency results in memory unsafety (in C, Rust, or C++ - this is a common thing to all systems languages). C++ just doesn't force you to confront that front-and-centre, where Rust does.
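A minimal sketch of what that looks like, with made-up names (RAW_HITS, SAFE_HITS). The first counter is a writable global that needs the unsafe marker on every access; the second is one way to get the same effect without it, by using an atomic.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

// A writable global: every read or write must sit inside `unsafe`.
static mut RAW_HITS: u64 = 0;

// The same counter as an atomic: no `unsafe` needed, safe across threads.
static SAFE_HITS: AtomicU64 = AtomicU64::new(0);

fn bump_raw() -> u64 {
    // SAFETY: only sound if no other thread touches RAW_HITS concurrently.
    // That promise is on the programmer, which is why the marker is required.
    unsafe {
        RAW_HITS += 1;
        RAW_HITS
    }
}

fn bump_safe() -> u64 {
    // fetch_add returns the previous value, so add 1 to get the new one.
    SAFE_HITS.fetch_add(1, Ordering::Relaxed) + 1
}

fn main() {
    println!("{} {}", bump_raw(), bump_safe());
    println!("{} {}", bump_raw(), bump_safe());
}
```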

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 21:19 UTC (Thu) by josh (subscriber, #17465) [Link]

> Yes, I wanted it to be mutable after init; it's a cache of state between multiple HTTP requests. So I need a Mutex, and how do you initialize a Mutex safely without lazy_static?

To answer your question:

We're currently adding an equivalent to lazy_static in the standard library: https://doc.rust-lang.org/std/lazy/index.html . It's currently available on nightly, and folks are working towards marking it stable.

We're generally very careful before adding something to the Rust standard library. We have strong stability guarantees for anything in the standard library, and once we add something it's subject to those same guarantees. The Python project has the philosophy that "the standard library is where code goes to die", for much the same reason; there are various pieces of the Python standard library for which the standard wisdom is "don't use it, use this third-party module instead". We want to avoid that situation in Rust whenever we can, even if that means that some common functionality requires a crate. It's very easy to add a dependency on a crate from the crates.io ecosystem.

Now, separate from the answer to your question, there are two reasons you might not want to use a global mutex-guarded variable as the cache for your HTTP requests. First, you might want to use a concurrent data structure instead, such as "dashmap", a fast concurrent hashmap. And second, you might consider putting that data structure in one of your library's objects instead, so that you (or other code calling into your code) can use multiple such objects concurrently without global state. All that said, you *can* use a global mutex-guarded variable if you want to, and std::lazy should let you do that using just the standard library.
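As a rough sketch of the global mutex-guarded cache, using only the standard library: this assumes a newer toolchain than the nightly mentioned above, one where the lazy-initialization cell is available on stable as std::sync::OnceLock (Rust 1.70 and later). The names CACHE, remember and lookup, and the String-to-String cache shape, are made up for the illustration.

```rust
use std::collections::HashMap;
use std::sync::{Mutex, OnceLock};

// The global itself: empty until first use, then initialized exactly once.
static CACHE: OnceLock<Mutex<HashMap<String, String>>> = OnceLock::new();

// Returns the cache, creating it on the first call.
fn cache() -> &'static Mutex<HashMap<String, String>> {
    CACHE.get_or_init(|| Mutex::new(HashMap::new()))
}

// Stores a value under a key (hypothetical helper).
fn remember(key: &str, value: &str) {
    cache()
        .lock()
        .unwrap()
        .insert(key.to_string(), value.to_string());
}

// Looks a key up, cloning the value so the lock is released right away.
fn lookup(key: &str) -> Option<String> {
    cache().lock().unwrap().get(key).cloned()
}

fn main() {
    remember("/index.html", "cached body");
    println!("{:?}", lookup("/index.html"));
    println!("{:?}", lookup("/missing"));
}
```

A concurrent map or a cache owned by one of your own objects, as suggested above, avoids the single global lock entirely.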

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 18:58 UTC (Thu) by ceplm (subscriber, #41334) [Link] (3 responses)

> So C is blameless for its numerous pitfalls, but Rust is responsible for people misusing cargo?

No, but C has glibc, Rust has ??? Is there a complete standard library for Rust (in the style of glibc or the Python stdlib), or are Rust people the same as Node or Lua ones (although, the latter is more forgivable, because of embedded space): “La, la, la, there is no problem, just pick some stuff from NPM/Luarocks/Cargo.”

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 19:45 UTC (Thu) by Cyberax (✭ supporter ✭, #52523) [Link] (2 responses)

Rust's stdlib is pretty much equal in features to glibc. It's not like glibc has anything apart from basic IO.

Python cryptography, Rust, and Gentoo

Posted Feb 12, 2021 4:51 UTC (Fri) by zev (subscriber, #88455) [Link]

I dunno, glibc's got, say, strftime(), and a PRNG (or three). And sure, the latter's not cryptographically secure, but it's nice to be able to generate some quick-n-dirty test data without having to take a third-party dependency.

Python cryptography, Rust, and Gentoo

Posted Feb 12, 2021 12:22 UTC (Fri) by khim (subscriber, #9252) [Link]

Unfortunately glibc has lots of things besides basic IO: internationalization and authorization, Berkeley DB and elliptic functions.

If you look into glibc you'll find a bazillion different things there, most of them in a half-finished state and not very useful at all.

While glibc does the best job it can, it's really a pile of crazy things which are there just because Unix variants have grown all these warts and glibc needs to support them all.

I, for one, am very glad that Rust has nothing like glibc.

What I find most ironic is that some of the same people who condemned systemd “because it violates Unix philosophy” (specifically they claim it violates the principle: “do one thing well”) now condemn Rust because it refuses to provide a huge pile of… things in its standard library.

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 8:57 UTC (Thu) by josh (subscriber, #17465) [Link]

> It adds backward compatibility breaks.

Rust cares quite a bit about *not* having backwards compatibility issues. A project dropping compatibility with ancient platforms would be a potential backwards compatibility break in that project, not in Rust.

> Rust is being promoted as a systems language when it doesn’t work on all of the hardware needed by a systems language.

Only if you're tautologically defining "all the hardware needed by a systems language" as "every platform that has ever had a C compiler".

Rust runs on most current platforms, and many non-current ones.

Python cryptography, Rust, and Gentoo

Posted Feb 11, 2021 17:36 UTC (Thu) by farnz (subscriber, #17727) [Link]

In answer to your thing about "why is C good enough to write other languages in?" - it's not. It's been used because it's what we had when those languages started being written, and good things that exist are better than perfect things that don't exist, and so we now have technical debt to pay down in relation to security and undefined behaviour.

Rust is one path for paying down that technical debt - it's not the only possibility, but it's one that exists now and has found a sweet spot that Agda (theoretically better, but harder to use) and Object Pascal (Lazarus project) have not found. I'm confident that in the future, we will find a new sweet spot language, and will have tech debt written in Rust to pay down, too.

Python cryptography, Rust, and Gentoo

Posted Feb 17, 2021 13:25 UTC (Wed) by kmweber (guest, #114635) [Link] (3 responses)

And, I mean, it's not like it's particularly difficult to write memory-safe code in C. I don't know where this myth that C is an "inherently insecure platform" comes from. It's not. It's exactly as easy to write safe code in C, as it is to write unsafe. It's programmers who choose not to, not the language forcing it on them.

Python cryptography, Rust, and Gentoo

Posted Feb 17, 2021 14:28 UTC (Wed) by mathstuf (subscriber, #69389) [Link] (1 responses)

Sure, writing secure C is *possible*, but I think that, as a whole, we programmers have proven to be pretty shitty at it. If the Linux kernel code review process with all the C veterans can't get it right (just look at the stable kernel patch queue!), what makes you think it's viable for the general coding population to use it? Sure, one can use clang-tidy, sanitizers, valgrind, etc. on it, but I see that as a failing of the language being propped up by expensive tooling rather than a benefit of the language itself.

Python cryptography, Rust, and Gentoo

Posted Feb 17, 2021 15:37 UTC (Wed) by Wol (subscriber, #4433) [Link]

Yup. Different languages, different strengths, different weaknesses. C *encourages* you to play with pointers, which means even experienced programmers use them when they're not necessary. And if you play with knives when you don't need to, you WILL, on average, get cut. Sometimes badly.

I'm sure Rust has its faults. My favourite language, DataBasic, has quite a few. But one of the biggest flaws in a language is using it in an environment for which it is not suited. C *was* brilliant as a low-level system language. Hardware has evolved. C is no longer low-level. People still use it as a low-level language and get badly sliced by the impedance mismatch between what C thinks the hardware is, and what the hardware really is. And it's the easy access to pointers that encourages this dangerous behaviour.

Cheers,
Wol

Python cryptography, Rust, and Gentoo

Posted Feb 17, 2021 15:41 UTC (Wed) by pizza (subscriber, #46) [Link]

It's disingenuous to claim that "programmers choose to not write memory-safe" code. Bugs are (almost) never intentional.

But you're far more likely to get cut when playing with knives than with spoons.

Meanwhile, in the world where I spend most of my F/OSS (and often, professional) time, the majority of the code I write is what other languages consider inherently "unsafe". It's probably fair to say that C is the least-worst option.

