Long battle

Posted Mar 21, 2025 20:49 UTC (Fri) by cen (subscriber, #170575)
Parent article: Julien Malka proposes method for detecting XZ-like backdoors

I think this effort will take at least 10 years, because you need to start at the bottom with the low-level libraries and build up to the final consumers, images, and the whole OS. You need reproducibility, build metadata, attestations, multiple independent builders producing the same thing, possibly some TEEs mixed in, and everything signed all the way down. At the same time, you need to battle compromise at the source (probably the hardest), compromised CI workers (Pagure/OBS article from the other day), and "FTP"/release team compromises.
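As a rough illustration of the "multiple independent builders producing the same thing" part, here is a minimal Python sketch. The builder names, the `min_builders` threshold, and the function names are assumptions made up for this example; a real system would also verify each builder's signed attestation rather than trusting the reported digests.

```python
import hashlib
from typing import Dict, Optional


def artifact_digest(data: bytes) -> str:
    """Return the SHA-256 hex digest of a build artifact."""
    return hashlib.sha256(data).hexdigest()


def check_reproducible(reports: Dict[str, str], min_builders: int = 2) -> Optional[str]:
    """Accept an artifact only if enough builders reported and all agree.

    `reports` maps a (hypothetical) builder name to the digest it produced.
    Returns the agreed digest, or None if there are too few reports or
    any two builders disagree.
    """
    digests = set(reports.values())
    if len(reports) < min_builders or len(digests) != 1:
        return None
    return digests.pop()
```

Any disagreement is treated as a failure here rather than settled by majority vote, on the assumption that a bit-for-bit mismatch is exactly the signal you want a human to look at.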

The key is for the automation tools to get to a point where all these security measures are already built in and on by default, for example in GitHub Actions for the general population. Just like we take TLS for granted now, but that wasn't the case in the 2000s.

Long battle

Posted Mar 22, 2025 21:33 UTC (Sat) by NYKevin (subscriber, #129325)

Ironically, most of these measures are far easier to deploy in a proprietary environment than in a FOSS environment, because the proprietary world can implement policy by fiat, and the FOSS world mostly cannot.

As a simple example, your employer could tell you that:

* Your code must be built in Bazel (or some other thing which resembles Bazel)...
* ...out of the company's monorepo, which enforces mandatory code review...
* ...on the company's standard CI system...
* ...and then signed with the company's private key that is only held by that same CI system...
* ...or else the production environment will refuse to execve() it.
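The last two bullets (sign in CI, refuse to run anything unsigned) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses HMAC from the standard library as a stand-in for a real signature scheme, and `sign_artifact` and `may_execute` are hypothetical names. A real deployment would use asymmetric signatures, so that production machines hold only a public verification key, and would enforce the check in the kernel or loader rather than in a script.

```python
import hashlib
import hmac


def sign_artifact(key: bytes, artifact: bytes) -> bytes:
    """CI side: produce a tag over the built binary (HMAC as a stand-in)."""
    return hmac.new(key, artifact, hashlib.sha256).digest()


def may_execute(key: bytes, artifact: bytes, signature: bytes) -> bool:
    """Production side: allow execution only if the tag verifies."""
    expected = hmac.new(key, artifact, hashlib.sha256).digest()
    # compare_digest avoids leaking how many leading bytes matched.
    return hmac.compare_digest(expected, signature)
```

The point of the sketch is the shape of the policy: the only path to a runnable binary goes through whoever holds the key, which is why the key is held by the CI system alone.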

"Bazel is hard and the upstream doesn't like it?" Too bad. You have to use it, or your binaries won't run in production. If you can't convince the upstream, then you have to vendor the dependency and make whatever changes are necessary to build it in Bazel anyway. It's probably not even worth trying to talk to the upstream, seeing as nobody likes to get "you have to do this because my company requires it" bugs.

An exception? Sure, your great-great grandboss (possibly with several additional "great"s depending on how deep the company's hierarchy is) can maybe give you an exception, if they feel like it and you manage to get five seconds of their incredibly limited time. It has to be renewed every six months (when we rotate the key), and we also expect you to draft a plan for getting your build workflow up to company standards. That plan must have a specified end date, and if that end date changes from one renewal to the next, we will notice and pepper you with (written) questions (that require written answers). Or you can suck it up, vendor the damn thing now, and skip all of this faff.

(Yes, there are standing exceptions for things like "the server is on fire and I need to run something unsigned to fix it." Most of these exceptions cannot be fully automated, generate copious audit logs, or more often both. Inappropriately using these mechanisms to bypass security policy is a fireable offense.)

You can't implement something like that in the FOSS world, because half those things don't even have FOSS counterparts (What is "the company's private key" in FOSS? What is "production" and who gets to set policy for it? etc.), and the other half are at least somewhat possible, but controversial at best (Bazel, vendoring, "firing" people, bureaucracy as a policy lever, etc.).

Long battle

Posted Mar 22, 2025 23:16 UTC (Sat) by linuxrocks123 (subscriber, #34648)

When the proprietary world uses open source software, they either do it normally if they're smart or pay Red Hat / Oracle / SuSE / Canonical an arm and a leg and use whatever they are given by them if they are not. When the proprietary world forks something or builds something on top of something from the open source world, they can do it different ways, but one way they definitely will not do it is open a PR checking in the entire Chromium source tree and having another employee code review each of those million lines of code.

So, the proprietary world is not likely to do anything resembling the overengineered nonsense you propose because the proprietary world knows their employees' real names and addresses and therefore that they are probably not Russian spies, and they're not going to review their OSS dependencies or upstream codebases before checking them in anyway, so, if those dependencies have been infiltrated by Russian spies, signing the Russian spies' code won't add value.

Now, if your general point is "centralization is better because I can make everyone else do what I want", that mindset has a lot of other problems ... but that's a different conversation.

Long battle

Posted Mar 23, 2025 1:24 UTC (Sun) by NYKevin (subscriber, #129325)

> So, the proprietary world is not likely to do anything resembling the overengineered nonsense you propose

I find it utterly baffling when people tell me, citing no evidence whatsoever, that my employer does not do the exact thing that I personally experience on a day-to-day basis. I am not making up a hypothetical; I am telling you how Google actually develops software. See [1] if you don't believe me.

> but one way they definitely will not do it is open a PR checking in the entire Chromium source tree and having another employee code review each of those million lines of code.

There is tooling and automation to facilitate processes like this. It does not necessarily require a human to manually review every line of third-party code on first integration, especially if it comes from a reputable project that is known to have a serious process for finding and fixing vulnerabilities. It is enough for there to be some team of security professionals who can make policy judgments about which FOSS projects are trustworthy and which ones are going to require manual code review.

But regardless, the ultimate goal here is not just to get all of the code audited. It is to ensure that you have a standardized and hermetic build process, that isn't curl | bash or anything resembling curl | bash, and that can be subject to reproducibility and signing requirements in a centralized fashion.

[1]: https://bazel.build/about/faq#how_does_the_google_develop...

Long battle

Posted Mar 23, 2025 6:57 UTC (Sun) by pm215 (subscriber, #98099)

I think that the miscommunication is that you say "the proprietary world" when what you mean is "the absolutely gigantic tech firms, of which there are perhaps half a dozen". Much more of the proprietary world is smaller companies who do not have "invest in software tooling" as a corporate value the way Google does, but instead see infrastructure and tooling as a cost to be minimised.

IDK, I don't have much insight into internal processes at a wide range of companies: but I strongly suspect Google and the amount of investment Google can and will put into tooling is an outlier. Most companies do not write their own build systems!

Long battle

Posted Mar 23, 2025 7:30 UTC (Sun) by pm215 (subscriber, #98099)

(I should have written "what you appear to mean" rather than "what you mean"; sorry about that.)

Long battle

Posted Mar 23, 2025 9:02 UTC (Sun) by NYKevin (subscriber, #129325)

Bazel is FOSS. Anyone can use it right now. There are also numerous FOSS CI solutions, numerous FOSS code signing solutions, etc., all of which are individually capable of every piece of functionality I describe in my original comment, or could be trivially extended to support such functionality. You don't actually *need* to write your own build system, because other people have already done that work for you.

Some technical knowledge is required to snap all of the individual Lego bricks together, so I'm not suggesting that random non-tech companies are going to do this (they will buy a turnkey solution from somebody like IBM or Oracle, which will do something like this, but with more audit logs and misc. "compliance" features), but it is not nearly as hard as you seem to think. If you have a few competent engineers, it's mostly a question of political will and budgeting. That does not make it easy. It makes it feasible, under the right conditions, with managerial support. Ultimately, this is a business decision. If a company('s management) does not want reproducible builds, or is not willing to say "no" to a large number of employees in order to get reproducible builds, then it will not have reproducible builds.

***

Aside from that, while I appreciate that you are trying to deescalate the discussion, I really do not think the phrase "overengineered nonsense" is helpful, nor do I see any qualification in the original comment suggesting that it was restricted to smaller companies. I really wish we could be kinder to one another in discussions like this one.

Long battle

Posted Mar 23, 2025 12:45 UTC (Sun) by pm215 (subscriber, #98099)

I think my take is that most proprietary shops indeed do not have that political will and budget, and are unlikely to acquire it short of external forcing factors like regulation.

Long battle

Posted Mar 23, 2025 14:36 UTC (Sun) by RaitoBezarius (subscriber, #106052)

What is interesting is that Nixpkgs is implementing many of these policies by virtue of the store-based model.

(Even things like execve() policies are possible with eBPF and used by people in production with image-based NixOS.)
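To make the "store-based model" idea concrete, here is a toy Python sketch in which an output's path is derived from a hash of everything that went into building it. This is not Nix's actual hashing scheme or path format; `store_path`, the `/store/` prefix, and the recipe strings are all assumptions for illustration.

```python
import hashlib
from typing import Sequence


def store_path(name: str, recipe: str, dep_paths: Sequence[str]) -> str:
    """Derive a toy store path from the package name, recipe, and dependencies."""
    h = hashlib.sha256()
    for field in (name, recipe, *sorted(dep_paths)):
        data = field.encode("utf-8")
        # Length-prefix each field so different inputs cannot collide
        # by concatenating to the same byte string.
        h.update(len(data).to_bytes(8, "big"))
        h.update(data)
    return f"/store/{h.hexdigest()[:32]}-{name}"
```

Because each dependency's path is itself an input, changing the recipe of a low-level library changes the path of everything built on top of it, which is what lets policy be enforced on paths rather than on trust in whoever ran the build.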

And yes, we even wrap Bazel builds in Nix, e.g. Gerrit!

Long battle

Posted Mar 23, 2025 16:47 UTC (Sun) by tialaramex (subscriber, #21167)

Nah. I have worked for small, medium and large proprietary outfits over the decades.

At every place there was a theory, which usually looked much like what you describe, and then a reality, which did not at all.

On Friday, for example, we discussed a work card for (I will change names to protect the guilty) "Raspberry scheduler upgrade?" marked as an Infosec problem. Trivial upgrade, small card, upgrading the Raspberry scheduler won't be hard, so why are we talking about it? Ah yes, said the very experienced team lead, the thing you need to know is that the Raspberry scheduler machine xyz1234 was a really convenient system to dump other stuff onto, either when it had no natural home or while waiting for Infrastructure to spin up a permanent home. So, if you just upgrade the Raspberry scheduler and tick done, either Infosec will make these other arbitrary systems no longer work OR they will re-open your ticket, now with a higher priority, saying you didn't fix it because xyz1234 still fails their checks. You need to go scuba diving in that machine, figure out absolutely everything we're actually using it for, document it, and write more cards about all the upgrades needed to meet Infosec's requirements, but also please upgrade the Raspberry scheduler while you're about it.

At my last big-corp Linux job, I had SSH and sudo on the production machines. I didn't want that access, but I had unsuccessfully argued against having it for maybe 5+ years before I quit for other reasons.

Long battle

Posted Mar 24, 2025 2:12 UTC (Mon) by buck (subscriber, #55985)

I'm not sure I understand exactly, but GitHub Actions themselves seem to be bringing more supply-chain concerns to the party:

https://github.com/advisories/GHSA-mrrh-fwg8-r2c3

So, in terms of "CI worker" trustworthiness, it's turtles all the way down.

(Not that I am asserting you said it wasn't.)