Fedora ponders the Python 2 end game

Posted Aug 2, 2017 4:11 UTC (Wed) by wahern (subscriber, #37304)
In reply to: Fedora ponders the Python 2 end game by Cyberax
Parent article: Fedora ponders the Python 2 end game

At work the recent Dirty Cow kernel patches broke the JVM. Something similar could happen with Go some day (or maybe already has?), which could be much worse as breaks in backward compatibility would require rebuilding every single application. That could prove a nightmare.

I know Linux tries to maintain strong compatibility guarantees, but both the kernel as well as, more typically, distributions fall short of that goal. For example, the deprecation of the sysctl(2) syscall, which broke descriptor-less acquisition of system randomness. (And because getrandom(2) wasn't added until recently, several long-term RHEL releases suffer from a flat-out regression in behavior and sandboxing capabilities.) For all their stodginess, commercial Unices like Solaris had good track records in this regard. (Perhaps this was their downfall!) On balance Go's static compilation model works well for most teams and in fact a unique advantage, but this could change.

External interpreters are more resilient in this regard. There's a reason the ELF specification is so complex, and why enterprise Unix systems evolved toward aggressive dynamic linking; namely, so you could modify and upgrade discrete components in a more fine-grained matter. On a platform like OpenBSD, which ruthlessly changes its ABI, a static compilation model is much more costly. As a recent DEF CON presentation suggests, improved security might require more frequent breaks in backward compatibility as a practical matter.

Fedora ponders the Python 2 end game

Posted Aug 2, 2017 7:14 UTC (Wed) by Cyberax (✭ supporter ✭, #52523) [Link] (6 responses)

> At work the recent Dirty Cow kernel patches broke the JVM.
The mainline kernel has never broken JVM in released versions. Linux Torvalds would eat anybody who tried to do this on purposed. And Go has a critical mass of users, so any kernel-level breakages will be obvious. Go's static linking and independence from libc is a stroke of genius in my opinion. You just drop a binary and it's immediately usable.

This would also be a problem for Python, if you're using a C-based module.

Fedora ponders the Python 2 end game

Posted Aug 2, 2017 9:43 UTC (Wed) by wahern (subscriber, #37304) [Link] (5 responses)

> Go's static linking and independence from libc is a stroke of genius in my opinion.

It's a return to historic Unix. Plan 9 intentionally rejected dynamic linking, too.

I agree that in current production Linux environments static linking provides a significant net benefit. Static linking decouples components, allowing you to iterate development easier. But the spirit behind static linking is predicated on your codebase being well maintained, your programmers responsive to changes in interfaces, and that operating system interfaces are minimal and extremely stable. This is how things _should_ be, ideally. But if this were true in reality than CentOS and RHEL wouldn't dominate on the backs of their ABI compatibility guarantees, and people wouldn't sweat kernel upgrades or even deploying alternative operating systems. "Well maintained" does not describe most code bases; nor does "responsive" describe most engineers. Have you upgraded your vendor kernels (which would have broken the JVM), or recompile all your C and cgo programs, to mitigate Stackclash?

It's no coincidence that Plan 9 represents almost every resource as a file or set of files. A file-based object contract only needs four interfaces at the languag level to manipulate--open, read, write, and close. More over, in Plan 9 control channels rigorously restrict themselves to ASCII-based string protocols. That provides an incredibly stable contract for process-process and kernel-process integration, providing opportunities for rich and deep code reuse without sacrificing reliable backward and forward compatibility. Theoretically Plan 9 is a finished operating system: you aren't going to need to wait around for the kernel to provide improved sandboxing or entropy gathering as it's already provides all the interfaces you need or will ever get.

But Linux isn't Plan 9. New interfaces like seccomp aren't generally implemented via file APIs for very practical reasons. Even interfaces like epoll() were a regression in this regard, as an early predecessor to epoll() had you open /dev/poll instead of via a new syscall. Linux added getrandom(2) to supplement /dev/urandom because the Unix filesystem namespace model is fundamentally limited when it comes to achieving "everything is a file" semantics--unprivileged, unlimited namespace manipulation breaks deeply embedded assumptions, making it a security nightmare and necessitating difficult trade-offs when you wish to leverage namespace manipulation for, e.g., sandboxing. I could go on endlessly about how Linux extensions subtly break the object-as-file interface contract.

There are similar problems wrt resource management when you rely heavily on multi-threaded processes. Go isn't designed for simply spinning up thousands of goroutines in a single process; both Go and Plan 9 were designed to make it easy (and predicated upon the ability to) scale a service across thousands of machines, in which case you really don't care about the resiliency of a single process or even a single machine.[1] But that kind of model doesn't work for enterprise databases or IoT devices, or in situations where communicating processes implicitly depend on the resiliency of a small set of processes (local or remote) for persistence. Do you implement two-phase commit every time you perform IPC? That kind of model for achieving robustness is neither universally applicable, nor even universally useful; and it makes demands of its own. In practice execution is never perfect even if you try, but that's true just as much as when writing a highly distributed system as when trying to achieve resiliency using more typically approaches.

As I said before, dynamic linking and complex ABIs didn't arise in a vacuum. Torvalds guarantee about backward compatibility is meaningful precisely because it permits static compilation of user processes so you can decouple kernel upgrades from application process upgrades. But that guarantee wouldn't be required for different operating system models. It's not an absolute guarantee, especially given that almost nobody runs mainline kernels directly--you'd be an idiot to do do some from a security perspective. And the kernel-process boundary isn't the only place where you care about independently upgrading components.[2] If we don't appreciate why it came about and assume that returning to a static linking model will solve everything, we're doomed to recapitulate all our previous choices. Instead, we need to learn to recognize the many contexts where the static model breaks down, and continue working to improving the utility and ease of use of the existing tools.

[1] For example, Go developers harbor no pretense about making OOM recoverable, and AFAICT generally believe that enabling strict memory accounting is pointless. That makes it effectively impossible to design and deploy high reliability Go apps for a small number of hosts that don't risk randomly crashing the entire system under heavy load, such as during a DDoS. (Guesstimating free memory is inherently susceptible to TOCTTOU races. And OOM can kill any number of processes that ultimately require, directly or indirectly, restarting the host or at least restarting all the daemon processes which shared persistent state, dropping existing connections.) It's one thing to say that you're _usually_ better off designing your software to be "web scale". I whole heartily agree with approach, at least in the abstract. It's something else entirely to bake that perspective into the very bones of your language, at least in so far as you claim that the language can be used as a general alternative to another systems languages like C. (Which AFAIK is not actually something the Go developers claim.)

[2] Upgrading the system OpenSSL is much easier than rebuilding every deployed application. Go already had it's Heartbleed: see the Ticketbleed bug at https://blog.filippo.io/finding-ticketbleed/

Fedora ponders the Python 2 end game

Posted Aug 2, 2017 19:40 UTC (Wed) by drag (guest, #31333) [Link]

> "Well maintained" does not describe most code bases; nor does "responsive" describe most engineers.

If your developer sucks and projects have bad habits then static vs binary dependencies are not going to save you. It's going to suck in slightly different ways, but it's still going to suck.

> There are similar problems wrt resource management when you rely heavily on multi-threaded processes. Go isn't designed for simply spinning up thousands of goroutines in a single process; both Go and Plan 9 were designed to make it easy (and predicated upon the ability to) scale a service across thousands of machines, in which case you really don't care about the resiliency of a single process or even a single machine.

Mult-threaded processes are for performance within a application. Dealing with resiliency is a entirely separate question.

If you have a application or database that isn't able to spread across multiple machines and can't deal with hardware failures and process crashings then you have a bad architecture that needs to be fixed.

Go users shoot for 'stateless applications' which allows easy scalibility and resiliency. Getting fully stateless applications is not easy and many times not even possible, but when it is possible it brings tremendous benefits and cost savings. It's the same type of mentality that stays that functional programming is better then procedural programming. It's the same idea that is used for RESTful applications and APIs.

Eventually, however, you need to store state and for that you want to use databases of one type or another.

The same rules apply however when it comes to resiliency. You have to design the database architecture with the anticipation that hardware is going to fail and databases crash and file systems corrupt. It's just that the constraints are much higher and thus so is the costs.

It really has nothing to do with static vs dynamic binaries, however, except that Go's approach makes it a lot easy to deploy and manage services/applications that are not packaged or tracked by distributions. The number of things actually tracked and managed by distributions is just a tiny fraction of the software that people run on Linux.

> For example, Go developers harbor no pretense about making OOM recoverable, and AFAICT generally believe that enabling strict memory accounting is pointless. That makes it effectively impossible to design and deploy high reliability Go apps for a small number of hosts that don't risk randomly crashing the entire system under heavy load, such as during a DDoS.

When you are dealing with a OOM app they are effectively blocked anyways. It's not like your Java application is going to be able to serve up customer data when it's stuck using up 100% of your CPU and 100% of your swap furiously trying to use garbage collecting to try to recover during the middle of a DDOS.

Many situations the quickest and best approach is just to 'kill -9' the processes or power cycle the hardware. Having a application that is able to just die and is able to automatically restart quickly is a big win in these sorts of situations. As soon as the DDOS ends then you are right back at running fresh new processes. With big applications and thousands of threads and massive single processes that are eating up GB of memory... it's a crap shoot whether or not they are going to be able to recover or enter a good state after a severe outage. Now you are stuck tracking down bad PIDs in a datacenter with hundreds or thousands of them.

> And the kernel-process boundary isn't the only place where you care about independently upgrading components.

No, but right now with Linux distributions it's the only boundary that really exists.

Linux userland exists as a one big spider-web of interconnected inter-dependencies. There really is no 'layers' there. There really isn't any 'boundries' there... It's all one big mush. Some libraries do a very good job at having strong ABI assurances, but for the most part it doesn't happen.

Whether or not it's a easier to just upgrade a single library or if it's even possible or if it's easier to recompile a program... all this is very in 'It depends'. It really really depends. It depends on the application, how it's built, how the library is managed, and thousands and thousands of different factors.

Fedora ponders the Python 2 end game

Posted Aug 3, 2017 7:10 UTC (Thu) by mjthayer (guest, #39183) [Link] (2 responses)

The thought that always goes through my head when I hear this discussion is "why not link statically but shell out to openssl(1) for encrypted connections". I am sure there is a good reason, which wiser people than me (I am no openssl expert, neither the library nor the command line tool) will tell me. This is not specific to openssl of course. It applies just as much to image format translation.

Generally, there must be a security risk threshold below which static linking can make sense. If you are at risk from a vulnerability in a dependency, you are presumably at risk from vulnerabilities in your own code too, so you have to be vigilant and to do updates from time to time anyway. The point where most updates are due to security issues in components is probably the point where the threshold has been passed.

Fedora ponders the Python 2 end game

Posted Aug 3, 2017 7:46 UTC (Thu) by mjthayer (guest, #39183) [Link] (1 responses)

Having posted that and tried to anticipate the responses, I have to admit that I don't know whether shelling out to openssl(1) is really such a big gain over dynamically linking in most Linux use cases. The main one I see where it is is when you are shipping a binary outside of the distribution and don't want to support multiple openssl shared library ABI versions.

Fedora ponders the Python 2 end game

Posted Aug 3, 2017 13:45 UTC (Thu) by niner (subscriber, #26151) [Link]

Funny that you bring this up in a news post that's dealing with "python2/3" vs. "python" command name. What do you hope to gain by shelling out to openssl instead of dynamically linking? The binary still has an interface that may or may not change in incompatible ways. At least for dynamic libraries there's support for versioning. As the article demonstrates, there is no such thing for system commands.

Fedora ponders the Python 2 end game

Posted Aug 4, 2017 1:23 UTC (Fri) by lsl (subscriber, #86508) [Link]

> Go already had it's Heartbleed: see the Ticketbleed bug at https://blog.filippo.io/finding-ticketbleed/

Uhm, that is not a Go issue at all but a bug in the TLS implementation of some F5 load balancer appliances. You could just trigger the bug using the Go TLS library client-side as it uses TLS session identifers of different size than OpenSSL (which was probably the only thing the vendor tested with).

When you sent a 1-byte session ID to these F5 appliances they still assumed these were 32 bytes long (as that's what OpenSSL and NSS happen to use) and echoed back your session ID plus the 31 bytes immediately following it in memory.