A few relevant quotes
I'm on a holiday and only happened to look at my emails and it seems to be a major mess.— Lasse Collin
The reality that we are struggling with is that the free software infrastructure on which much of computing runs is massively and painfully underfunded by society as a whole, and is almost entirely dependent on random people maintaining things in their free time because they find it fun, many of whom are close to burnout. This is, in many ways, the true root cause of this entire event.— Russ Allbery
Incredible work from Andres. The attackers made a serious strategic mistake: they made PostgreSQL slightly slower.— Thomas Munro
There is no way to discuss this in public without turning a single malicious entity into 10 000 malicious entities once the information is widely known.— Marc Deslauriers
Making sure the impact and mitigations are known before posting this publicly so that everyone knows what to do before the 10 000 malicious entities start attacking is just common sense.
Again the FOSS world has proven to be vigilant and proactive in finding bugs and backdoors, IMHO. The level of transparency is stellar, especially compared to proprietary software companies. What the FOSS world has accomplished in 24 hours after detection of the backdoor code in #xz deserves a moment of humbleness. Instead we have flamewars and armchair experts shouting that we must change everything NOW. Which would introduce even more risks. Progress is made iteratively. Learn, adapt, repeat.— Jan Wildeboer
Posted Mar 30, 2024 17:24 UTC (Sat)
by jcdickinson (guest, #168509)
[Link] (12 responses)
Posted Mar 30, 2024 22:24 UTC (Sat)
by draco (subscriber, #1792)
[Link] (11 responses)
Based on the evidence as reconstructed so far, I believe Lasse is a victim that deserves the benefit of the doubt
Posted Mar 30, 2024 23:58 UTC (Sat)
by MarcB (guest, #101804)
[Link] (10 responses)
This is made worse by the fact that he made some benign contributions to various projects, likely to earn trust and reputation, and in doing so interacted with several people.
Admittedly, he also seems to have made more questionable contributions, like removing a filter for escape sequences contained in file names: https://github.com/libarchive/libarchive/pull/1609
Also, another malicious one to xz-utils: https://git.tukaani.org/?p=xz.git;a=commitdiff;h=328c52da... (Fix: https://git.tukaani.org/?p=xz.git;a=commitdiff;h=f9cf4c05...)
All in all, this is extremely damaging. Suddenly people who accepted a benign - or questionable at worst - commit are suspected as co-conspirators.
Posted Mar 31, 2024 0:46 UTC (Sun)
by dullfire (guest, #111432)
[Link] (2 responses)
Considering how much of a long-term game appears to have been in play here (at least from what I have heard), the name sounds entirely questionable. Unless there are reliable people who have met the person, there is no reason to believe that it is their real name, or that it has any reflection on their real identity.
And grouping people together that way (especially with a complete lack of *any other evidence*) is almost never helpful.
Posted Mar 31, 2024 1:03 UTC (Sun)
by lambda (subscriber, #40735)
[Link] (1 responses)
There is no good reason to suspect that any of these have any relationship to the attacker's actual identity.
Posted Mar 31, 2024 13:15 UTC (Sun)
by nix (subscriber, #2304)
[Link]
Posted Apr 3, 2024 13:14 UTC (Wed)
by paulj (subscriber, #341)
[Link] (6 responses)
Posted Apr 3, 2024 23:56 UTC (Wed)
by atnot (subscriber, #124910)
[Link] (1 responses)
Posted Apr 4, 2024 0:42 UTC (Thu)
by viro (subscriber, #7872)
[Link]
Speculating about the origin of the bastard(s) in question on the basis of name is, of course, completely pointless. It neither points to China nor discards such possibility. Fake is fake (and Lunar New Year is not an argument either - if major says that lieutenant is going to work on some date, the lieutenant _is_ going to work, holiday or no holiday, whatever country or agency they are in).
Posted Apr 4, 2024 7:21 UTC (Thu)
by wsy (subscriber, #121706)
[Link] (3 responses)
Posted Apr 4, 2024 9:03 UTC (Thu)
by paulj (subscriber, #341)
[Link] (2 responses)
Posted Apr 4, 2024 9:43 UTC (Thu)
by farnz (subscriber, #17727)
[Link]
Of course, one thing to take into account when looking at the name - malicious entities like it when you blame their enemies for their actions, and whenever they've got a free choice (like a name for an attacker), they'll choose one that they hope encourages you to blame their enemy, not them.
You thus can't draw any conclusion from the name given what we know so far, since it's just as likely that Jia Tan is a name chosen by a Western agency to put the blame on China as it is one chosen by a Chinese agency to ensure that if their agent makes a slip-up and appears Chinese in action, it'll not be as suspicious as if the agent used the name "George Smith".
Posted Apr 4, 2024 10:28 UTC (Thu)
by paulj (subscriber, #341)
[Link]
Posted Mar 30, 2024 18:29 UTC (Sat)
by gmgod (guest, #143864)
[Link] (22 responses)
Posted Mar 30, 2024 20:39 UTC (Sat)
by geuder (subscriber, #62854)
[Link] (16 responses)
However, the practice of using release tarballs which do not match what is under version control made the wrongdoing more difficult to detect. And I can see no single valid reason to do so; it should be extremely easy to stop.
Posted Mar 30, 2024 22:33 UTC (Sat)
by ballombe (subscriber, #9523)
[Link] (8 responses)
Most of the payload was checked in the GIT repository anyway and nobody noticed.
Posted Mar 31, 2024 7:44 UTC (Sun)
by geuder (subscriber, #62854)
[Link] (3 responses)
Microsoft uses codesigning, but I assume few in this forum would support the idea that that is a sufficient measure to trust software. Being able to inspect the code in a favorable form with the support of version management tools is an essential part of FOSS. Having a large amount of m4 stuff, illegible to most of us in any reasonable amount of time, is not better than distributing binaries.
> You cannot usually redistribute the whole git repository
Why not? Are you referring to size?
My Yocto builds use BB_GIT_SHALLOW [1], so I clone only once until the next update. Yes, the tooling would need improvement, so it could handle deltas.
> git archive of a commit is not reproducible.
Poor tooling once again.
> Most of the payload was checked in the GIT repository anyway and nobody noticed.
So we need to learn to treat test data as untrusted, like you treat inputs from the internet. I guess it should be possible to use landlock not to read from test data locations during the build phase, before entering test phase. And the test phase not being allowed to change the binaries produced anymore. (Guessing, haven't used landlock myself yet.)
> It is more important to protect against webhost compromise than upstream going rogue.
Not sure I am following here. I think most build systems verify a hash already. The bigger risk is really do I have a hash belonging to good code in the beginning.
My point with not 100% agreeing with the original quote that FOSS did great in this case is: Yes, the aftermath is handled much better than any closed source company would do. But our tools and processes have many weaknesses, too. We tend to trust FOSS developers because they are good guys. Many IT managers trust Microsoft because they are such a successful company. And each side is convinced the other side is just naive.
[1] https://docs.yoctoproject.org/bitbake/dev/bitbake-user-ma...
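The hash check mentioned above ("most build systems verify a hash already") can be sketched in a few lines. This is only an illustration of the mechanism; as noted, it just moves the trust question to wherever the pinned digest came from in the first place.

```python
import hashlib


def verify_sha256(path: str, pinned_hex: str) -> bool:
    """Compare a downloaded artifact against a pinned SHA-256 digest."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # Stream in chunks so large tarballs don't need to fit in memory.
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest() == pinned_hex
```

A build system would refuse to unpack the artifact when this returns False; the harder problem, as the comment says, is knowing that the pinned digest itself corresponds to good code.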
Posted Mar 31, 2024 12:40 UTC (Sun)
by ballombe (subscriber, #9523)
[Link] (2 responses)
Yes. It would not be reasonable for Debian to ship the whole git archive as orig.tar.gz
>> It is more important to protect against webhost compromise than upstream going rogue.
The git repository can have been tampered with.
> I think most build systems verify a hash already. The bigger risk is really do I have a hash belonging to good code in the beginning.
Yes, we still need code signing.
Posted Mar 31, 2024 16:59 UTC (Sun)
by geuder (subscriber, #62854)
[Link] (1 responses)
A bare shallow clone should not be significantly different in size, and you can immediately see whether your SHA-1 matches or not.
> The git repository can have been tampered with.
Tampered such that the SHA-1 of your release has changed? Yes that's possible, but you should know what is a good SHA-1 to base your build on. Either from a signed commit/tag or a signed release note.
Or are you referring to SHA-1 collisions? Well, they might come some day, but at the moment I am not overly worried yet. To hide some payload it would have to be done not at the tip, but a bit deeper in the history. That seems to be many years out.
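The reason a matching SHA-1 is tamper-evident at all is that git object IDs are content-addressed: a blob's ID is the SHA-1 of a `blob <size>\0` header plus the exact file bytes, and tree and commit IDs are built on top of those, so any change anywhere in history changes every ID above it. A minimal sketch of the blob case:

```python
import hashlib


def git_blob_sha1(data: bytes) -> str:
    """Compute the object ID git assigns to a blob, as `git hash-object` does:
    SHA-1 over a "blob <size>\\0" header followed by the raw content."""
    header = b"blob %d\x00" % len(data)
    return hashlib.sha1(header + data).hexdigest()


# The well-known ID of the empty blob:
# git_blob_sha1(b"") == "e69de29bb2d1d6434b8b29ae775ad8c2e48c5391"
```

This is why checking a release against a known-good commit SHA-1 works, and also why the collision question matters: the scheme is only as tamper-evident as the hash function.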
Posted Apr 5, 2024 13:43 UTC (Fri)
by mathstuf (subscriber, #69389)
[Link]
Formats in `export-subst` contents which use replacements backed by the likes of `git describe`, short hashes, or other repository-state-sensitive replacements need consideration here.
Posted Mar 31, 2024 8:47 UTC (Sun)
by walters (subscriber, #7396)
[Link] (2 responses)
Posted Mar 31, 2024 12:27 UTC (Sun)
by geuder (subscriber, #62854)
[Link] (1 responses)
"make dist" is the step I have never liked, mostly out of debugging/reproducibility reasons. As I wrote in a sibling it's close to distributing binaries. And in the current case it was a relevant part of the exploit.
About SHA-1 I personally haven't been too worried yet. Creating git hash collisions hasn't been demonstrated yet. But maybe inserting "suitable" test data will allow that some day.
Posted Apr 2, 2024 9:10 UTC (Tue)
by nim-nim (subscriber, #34454)
[Link]
And you can not analyse this test code and data because some of it is intentionally bad and designed to break tools.
It’s a huge mess. The whole build infra is in sore need of being systemd-ed by someone with enough know-how and guts to deprecate the pile of cruft and impose clear limits on what can and can not be done.
Posted Mar 31, 2024 6:11 UTC (Sun)
by cozzyd (guest, #110972)
[Link] (5 responses)
Posted Mar 31, 2024 16:08 UTC (Sun)
by dullfire (guest, #111432)
[Link]
Posted Apr 4, 2024 12:17 UTC (Thu)
by detiste (subscriber, #96117)
[Link]
Just noticed this one last week ... incidentally I noticed it while
Before it would give a good nostalgia vibe :-(
Posted Apr 5, 2024 13:20 UTC (Fri)
by mathstuf (subscriber, #69389)
[Link] (2 responses)
Posted Apr 5, 2024 13:23 UTC (Fri)
by gioele (subscriber, #61675)
[Link] (1 responses)
Posted Apr 5, 2024 13:38 UTC (Fri)
by rahulsundaram (subscriber, #21946)
[Link]
Yes, looks like this changed early 2021 but before that, you can see it was just snapshots and tarballs committed into git directly. The tarball part is still true.
Posted Apr 3, 2024 8:38 UTC (Wed)
by LtWorf (subscriber, #124958)
[Link]
The configure script is needed to generate the Makefile
The Makefile is needed to build.
It is considered bad form to keep generated files in VCS, and a commit regenerating them might include anything; it won't be vetted because it's huge.
Posted Mar 30, 2024 23:21 UTC (Sat)
by MarcB (guest, #101804)
[Link] (4 responses)
Nothing was "proactive". Basically all rolling release distributions happily shipped it; Gentoo and Arch were just lucky the attackers didn't consider them relevant enough and the backdoor code wasn't even injected in their build environments. I had the package on my Arch system, but the backdoor was indeed not present.
Only a single person (Andres Freund) was vigilant, and even he says that a lot of lucky coincidence was involved (and this wasn't even on one of the rolling release distributions). Maybe it would have been discovered by someone else before the Fedora 40 and Ubuntu 24.04 releases in April, maybe not.
Also, the statement on Mastodon "The backdoor code was inserted only under very specific circumstances in the build process" is IMHO very misleading. The "very specific circumstances" were: Building a package for any of the largest distributions, for the most common architecture. Ignoring the mobile and embedded world, this would have easily affected 80% of all Linux installations. Likely more.
The statement from Russ Allbery is (unfortunately) much closer to 100%. It is pretty clear to me, that they specifically targeted a quasi-core project with just a single maintainer who was already struggling significantly and publicly. This is as malicious as it is smart. A truly proactive FOSS world would have acted on the maintainer's situation: Either by supporting him or by removing the package from core components (ideally both).
Posted Mar 31, 2024 0:11 UTC (Sun)
by judas_iscariote (guest, #47386)
[Link] (3 responses)
You can't. distributions are built with xz compressed packages, rpm package payloads specifically.
Posted Mar 31, 2024 2:17 UTC (Sun)
by salimma (subscriber, #34460)
[Link] (2 responses)
Posted Mar 31, 2024 2:54 UTC (Sun)
by draco (subscriber, #1792)
[Link]
$ rpm -ql zstd |grep lib |grep -vF .build-id
Posted Mar 31, 2024 7:18 UTC (Sun)
by pbonzini (subscriber, #60935)
[Link]
Posted Mar 30, 2024 18:34 UTC (Sat)
by bluss (guest, #47454)
[Link] (2 responses)
Which suggests to me they are treating their build infrastructure as compromised, which is not strange given how their tools, including dpkg, link to the compromised library (in newer versions).
Posted Mar 30, 2024 18:41 UTC (Sat)
by rra (subscriber, #99804)
[Link]
Posted Mar 30, 2024 19:32 UTC (Sat)
by cjwatson (subscriber, #7322)
[Link]
Posted Mar 30, 2024 20:28 UTC (Sat)
by birdie (guest, #114905)
[Link]
Not to mention RedHat/IBM, who have been heavily investing in and sponsoring Linux for almost three decades now.
Posted Mar 30, 2024 21:23 UTC (Sat)
by Phantom_Hoover (subscriber, #167627)
[Link] (69 responses)
Posted Mar 30, 2024 23:48 UTC (Sat)
by tomsi (subscriber, #2306)
[Link] (1 responses)
So that no company or organization stepped in to help him?
Posted Mar 31, 2024 13:19 UTC (Sun)
by nix (subscriber, #2304)
[Link]
Posted Mar 31, 2024 3:11 UTC (Sun)
by rra (subscriber, #99804)
[Link] (17 responses)
If you want to force me to do something in a specific way, you will have to pay me, because at that point it is no longer volunteer and it is no longer fun. And I say this despite the fact that I probably agree with you on 95% of the higher standards that you would like to impose and I am trying to hold myself to them voluntarily. But as soon as you bring legal force to bear, I'm out; that will be the last free software you see from me on my own time. If you want to control me and tell me what to do, that is the definition of a job and I expect a salary that is worth the headaches of having to follow your rules.
Posted Mar 31, 2024 4:21 UTC (Sun)
by Phantom_Hoover (subscriber, #167627)
[Link] (13 responses)
Posted Mar 31, 2024 5:03 UTC (Sun)
by rra (subscriber, #99804)
[Link] (12 responses)
I think the trick is to make the obnoxious bits people don't want to do a paid job, while letting the fun coding work that I kind of want to do anyway something for which I might get a bit of a stipend that's less than what's paid for the obnoxious bits but enough to give people some options other than a full-time day job. Those two things may combine in one person in some cases, and not in others. For example, if there was money for free software tied to specific projects that are already being used, I think a lot of maintainers would organize into non-profit umbrella organizations and pool their resources, and that umbrella organization could hire people to do the tedious bureaucratic work that the maintainers don't want to do.
Down that path you potentially get non-profits with a predictable funding stream whose whole reason for existence is to make free software and give it away for free, at a high quality. Which has been the holy grail of the free software movement since day one. It would be like the FSF or the Linux Foundation, except with enough separate funding streams that you could form your own or throw in with other like-minded people if those organizations gave you hives for whatever reason. (And not have to then spend the rest of your life begging other people for money.)
I think you could fund an amazing ecosystem through taxes on just the FAANG companies, if we actually taxed them the way that they should be taxed in any fair social system.
Posted Mar 31, 2024 6:03 UTC (Sun)
by cpanceac (guest, #80967)
[Link] (4 responses)
Posted Mar 31, 2024 6:27 UTC (Sun)
by rra (subscriber, #99804)
[Link] (3 responses)
Corporations do not behave precisely according to this model for a few different reasons. One major one is that they're made up of people and those people have normal non-sociopathic human reactions to the rest of society and often can convince corporations to spend some money on things that are not purely profit-maximizing. But those deviations are bounded aberrations and can't be relied upon.
For a functional society, one either has to design laws to not create legal sociopaths, or one has to constrain those legal sociopaths by forcing them to not externalize their costs and free-ride on the rest of society. The former is extremely hard. We've largely chosen to do the latter. The normal mechanism to do that is taxes, which is how all the other infrastructure that corporations rely on (roads, bridges, water, sewer, etc.) is funded. Free software is not treated as infrastructure.
Posted Mar 31, 2024 11:55 UTC (Sun)
by bluca (subscriber, #118303)
[Link]
The German-govt backed STF is exploring grants/fellowships for OSS maintainers. We should tax the mega corporations and ring-fence the revenue to fund such efforts.
Posted Apr 1, 2024 1:36 UTC (Mon)
by rgmoore (✭ supporter ✭, #75)
[Link] (1 responses)
As a practical matter, I don't think they really can shift the cost of a compromise onto somebody else. They may be able to shift the blame, but the business consequences of important computers being compromised are not something you can ever be fully compensated for. Nobody can make $BIG_CORPORATION whole for their most vital secrets being stolen or for the time they waste rooting out every possible backdoor an attacker might have left on their systems.
Not that this really affects your underlying point. One of the huge dangers of this general kind of thing is that humans are terrible at assessing risk. We really don't have an adequate basis for determining how big the risk of a malicious programmer compromising our systems is; there simply isn't enough data to make a good estimate. The result is that business people tend to ignore tail risk as too rare to worry about, even if it's something that could destroy their company if it happens.
Posted Apr 1, 2024 11:59 UTC (Mon)
by khim (subscriber, #9252)
[Link]
But who would ever care about that? Managers? Nope, they lose nothing when important computers are compromised. Stockholders? Nope, they only care about the price of their stock. The CEO? The CEO, too, only cares about the price of the company's stock, not about its actual finances. And that's the only thing that matters. Yes, but there is also nobody who cares about all that. Large corporations are dysfunctional and fail precisely because they have no one who cares about the fate of the company as a whole. A private company has an owner who is supposed to care and who may push others to care. $BIG_CORPORATION? Nope, there is no such entity! If we are talking about $BIG_CORPORATION then it's not theirs! That's the core issue!
Posted Mar 31, 2024 18:22 UTC (Sun)
by willy (subscriber, #9762)
[Link] (6 responses)
We've been labouring under the presumption that somebody who writes an amazing piece of software is inherently trustworthy. The fact that this has worked acceptably well for so long should give us faith in humanity. Most of us aren't awful people!
Posted Mar 31, 2024 19:12 UTC (Sun)
by rra (subscriber, #99804)
[Link] (5 responses)
Funding will inherently come with reporting requirements; that's just part of life. That's going to involve some amount of keeping accounting records, filling out forms, auditing books, etc. Another good paid position. Other examples would be bug triage, moderating project forums for large projects, getting an additional sanity check on whether a contributor seems trustworthy based on their behavior elsewhere (probably wouldn't have stopped this attack but might have when it became obvious that the people pushing for them to become maintainer seemed to be sock puppets), maintaining legal notices for copyright... There are lots of things that most maintainers don't like doing, but that are potentially decent jobs for someone who is getting compensated for kind of boring work.
Or, also, the maintainer may choose to do that work themselves and take the additional money. For example, I love doing release management. I find it deeply satisfying and would continue to do it myself; I don't find it tedious or annoying, personally. But I'd be happy for someone to deal with all of the accounting paperwork.
The idea is that the places where we want to require people to do tedious and kind of annoying things for legal or security or sustainability reasons, we should also pay them for their effort (where "we" means the society that benefits from the job being done well).
Posted Apr 1, 2024 0:04 UTC (Mon)
by willy (subscriber, #9762)
[Link]
I have some vicarious experience with government funding, and the reporting requirements can be quite onerous for organisations which are unaccustomed to it. Of course, the government is required to demonstrate that it is spending taxpayer money responsibly, and it can't do that without a mountain of paperwork.
Posted Apr 1, 2024 9:20 UTC (Mon)
by Wol (subscriber, #4433)
[Link] (3 responses)
And that is a BAD idea. EVERYBODY should get paid. For example, I enjoy a lot of the tedious, "let's get rid of technical debt" maintenance work. If you take the line "people who enjoy it will/should get paid less", we're rapidly going to spiral straight back into "nobody gets paid".
Open Source should be seen as a great place to work, because you get paid to do what you like doing. And that includes doing tedious grunt work !!!
Cheers,
Posted Apr 1, 2024 14:43 UTC (Mon)
by farnz (subscriber, #17727)
[Link] (2 responses)
I disagree that everybody involved should be paid; for example, I've contributed to open source projects while job-searching so that I can point people to examples of how I write code, and how I behave during the review process. This was unpaid work, done for my benefit and only incidentally upstream's benefit (since I closed off some backlog issues for them).
This is why I like the idea that we should pay for work we want done, rather than paying everybody. If you want (for example) a GCHQ "Assessment of Security Risk" form filled in, then you should pay people to do that; if you want dependencies meeting certain criteria (e.g. no response from upstream to contributions within the last 12 months) removed, you should pay people to do that. But if I choose to volunteer to do something, then it's fine for it to be unpaid.
Yes, this means that the amount you have to pay to get the things you want done done varies over time, because the availability of volunteer labour for each project varies over time (for example, I'm less likely to contribute to a C++03 codebase that won't accept C++11 features now than I was 10 years ago); but ultimately, the point of paying someone to do something is to get something done your way when they would prefer to either not do it, or do it a different way.
So, if I want you to work on fixing technical debt in MariaDB, then I should expect to have to pay you to make those fixes for me. If I want you to fix up a DataBASIC example, I might not have to pay as much, but I'll still have to pay you since this is something you wouldn't otherwise choose to do. But if you want to contribute to ScarletDME (say to make it 64-bit clean, or easy to package on RHEL + Debian) to encourage people to pick up Pick-style databases, that shouldn't automatically come with payment, since you're doing it because you want to do it, not because someone else wants you to.
Posted Apr 1, 2024 16:04 UTC (Mon)
by Wol (subscriber, #4433)
[Link] (1 responses)
I think we've accidentally ended up talking past each other :-)
My original quote I responded to said something about "you require me" ...
So if you do work to showcase your skills, as part of your CV, or I do work on ScarletDME because I want to make it 64-bit clean and tidy up technical debt, we're both getting paid (although not in cash).
But as soon as other people start depending on us, then the expectation should be that we get paid cash. If people like your demonstration, if people like that ScarletDME is solid and robust, then if they use it they should pay us towards it.
The quote I was responding to left me very much with the feeling that only *some* work was worthy of payment - the work that nobody wants to do. And I don't believe such work exists. (Although quite often, the people who enjoy doing it don't have any desire to do it for free.) Plus the view that only some work should be paid for is, of course, extremely discriminatory.
Cheers,
Posted Apr 1, 2024 17:00 UTC (Mon)
by farnz (subscriber, #17727)
[Link]
I disagree - someone who depends on you should not be paying you just because you're doing the work that you want to do right now. I may like your work on ScarletDME, and depend on ScarletDME being solid and robust, but that does not, in and of itself, create a moral imperative for me to pay you for it.
The point at which money should come in is the point at which you want me to do something other than follow my desires; the money is how you get me to do something that I might not otherwise do. And yes, that means that if I'm doing what you want without being paid, you're in luck - but you can't expect me to keep doing what you want unless you set up an arrangement with me, for which I will want to be paid, in order to get me to do what you want.
Posted Apr 1, 2024 9:13 UTC (Mon)
by Wol (subscriber, #4433)
[Link] (2 responses)
Which is the whole point behind the CE mark. And now the rules appear to have been changed to say "if you're not paid, you can't issue a CE mark", the people who use it ARE on the hook to pay for it one way or another.
Cheers,
Posted Apr 3, 2024 14:47 UTC (Wed)
by jepsis (subscriber, #130218)
[Link] (1 responses)
Posted Apr 3, 2024 15:38 UTC (Wed)
by Wol (subscriber, #4433)
[Link]
The manufacturer needs a CE. They can create their own, no problem.
What they can NOT do is ask a volunteer to provide a gratis CE. If they're not paying for it, it's legally worthless and invalid. Whether they pay their own staff, or have a contract with an outsider, the law doesn't care. But they can't offload liability without paying for it.
Cheers,
Posted Mar 31, 2024 15:00 UTC (Sun)
by mcatanzaro (subscriber, #93033)
[Link]
Posted Mar 31, 2024 18:00 UTC (Sun)
by pizza (subscriber, #46)
[Link] (47 responses)
What exactly does "vet contributors" in this context mean, anyway?
And it's not enough to "attach strings", you also have to "attach a mechanism" that enables themselves to do so. [1]
...And that still won't protect you if one of those vetted contributors gets compromised down the line..
[1] More explicitly -- It's not enough to require a scan of some sort of government ID; you also need a way to *validate* said ID, plus confirming it matches the person who sent it to you. But even if they are who they say they are, how do you know their intentions are "good"? You're now in the territory of a comprehensive background check that requires detailed personal and financial histories [2] and even that's not foolproof [3]. And then there's the "data protection" aspects of receiving this PII.
Posted Mar 31, 2024 19:43 UTC (Sun)
by rra (subscriber, #99804)
[Link] (46 responses)
Elsewhere in this discussion, people have been thinking like US defense contractors and their idea of vetting, but I think this specific example points to a different type of vetting that's also tedious but much less prone to mistaking geopolitics for trustworthiness: detailed code inspection and reproduction.
One of the critical moments in this exploit came when the "test files" were committed. Vetting may look like asking questions: where did these come from? How did you generate them? Please provide detailed instructions so that I can regenerate them and make sure they match. Let's check the scripts used to generate them into the Git repository. Etc. That's a type of vetting, and it's exactly the type of vetting that overworked maintainers have a hard time doing.
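One shape this kind of vetting could take is the practice described above: check in a small generator script plus a recorded digest, so a reviewer can regenerate the "test files" and compare, rather than trusting opaque binaries. A hypothetical sketch (the seed/size parameters and the idea of a recorded corpus digest are illustrative, not from any real project):

```python
import hashlib
import random


def make_test_blob(seed: int, size: int) -> bytes:
    """Deterministically generate pseudo-random test data from a recorded seed,
    so anyone can reproduce the exact bytes that were committed."""
    rng = random.Random(seed)
    return bytes(rng.getrandbits(8) for _ in range(size))


def corpus_digest(seed: int, size: int) -> str:
    """Digest a reviewer recomputes and compares against the checked-in value;
    a mismatch means the committed file is not what the script produces."""
    return hashlib.sha256(make_test_blob(seed, size)).hexdigest()
```

The point is not the specific generator but the property: every byte of test data is reproducible from a reviewed script, leaving no room for a payload smuggled in as "random" data.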
Another critical moment is when the new maintainer did their first release. There, checking the first release would just move the problem; they'd inject the backdoor in their second release, etc. But the tooling to verify that a release is a correct representation of the Git tree was absent. Ideally someone should write it. That's not a very fun program to write, but a very useful program for the community to have. (I realize that the other approach is to move away from tarball releases; I don't mean to open that debate here, I'm just giving an example and other release methods will have other examples.)
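A rough sketch of the kind of checker described above: walk a release tarball and an extracted copy of the tagged tree, and report any file whose content differs or that exists only in the tarball (as the malicious build-to-host.m4 effectively did). This is a simplification under stated assumptions — real `make dist` output legitimately adds generated files (configure, Makefile.in, ...), and distinguishing those from injected ones is exactly what makes the real tool non-trivial.

```python
import hashlib
import os
import tarfile


def _sha256(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def tree_hashes(root: str) -> dict:
    """Map relative path -> SHA-256 for every regular file under root."""
    out = {}
    for dirpath, _, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            with open(path, "rb") as f:
                out[os.path.relpath(path, root)] = _sha256(f.read())
    return out


def tarball_hashes(tar_path: str, strip_components: int = 1) -> dict:
    """Map relative path -> SHA-256 for every regular file in a tarball,
    stripping the leading 'project-1.2.3/' directory by default."""
    out = {}
    with tarfile.open(tar_path) as tf:
        for member in tf.getmembers():
            if not member.isfile():
                continue
            parts = member.name.split("/")[strip_components:]
            if not parts:
                continue
            out["/".join(parts)] = _sha256(tf.extractfile(member).read())
    return out


def diff_release(tar_path: str, tree_root: str) -> dict:
    """Classify differences between a release tarball and a source tree."""
    tar, tree = tarball_hashes(tar_path), tree_hashes(tree_root)
    both = set(tar) & set(tree)
    return {
        "only_in_tarball": sorted(set(tar) - set(tree)),
        "only_in_tree": sorted(set(tree) - set(tar)),
        "content_differs": sorted(p for p in both if tar[p] != tree[p]),
    }
```

Anything in "only_in_tarball" or "content_differs" is what a reviewer would need to explain before trusting the release; an allowlist of known generated files would keep the report readable.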
Posted Mar 31, 2024 20:50 UTC (Sun)
by pizza (subscriber, #46)
[Link] (24 responses)
That's moving the goalposts from "vetting contributors" to "vetting contributions", which are very much not the same thing.
We can't solve the problem of not having enough suitably-skilled (and now, -trusted) people to handle the expected workload by requiring *more* work from the existing people.
And if you bring more people onboard to spread the workload, how long do you carefully vet them before they are treated as "trusted"? Apparently 2.5 years is no longer enough.
Posted Mar 31, 2024 21:16 UTC (Sun)
by rra (subscriber, #99804)
[Link] (23 responses)
They're not the same thing, but they're certainly closely related.
Are you going to be able to detect someone who carefully behaves entirely aboveboard for years before springing their trap? No. But neither will background checks and citizenship checks and all of the other institutional machinery of governments reliably, and they have a much higher social cost and exclude all sorts of people who are entirely trustworthy, which I think was part of your original point.
This specific attacker was not careful enough to survive careful vetting of contributions. This is a problem that we could have potentially solved, and it's worth imagining a world in which it would be possible to do that. Would they have altered strategies if that vetting was in place? Yes, probably, but this sort of thing is hard and you're creating more opportunities for them to trip up.
> Apparently 2.5 years is no longer enough.
This contributor did several detectable things way earlier than 2.5 years in. There's the pressure from sock puppets, the mysterious test files, commit messages for commits that do not do what the commit message claims they do, etc. Would I have caught those things? Probably not, but I might in the future.
Sure, I too can imagine an attacker that didn't do any of those things, but I really dislike both throwing up our hands and saying this is impossible or going down a path that inevitably leads to saying only people with certain passports can ever be trusted. There are concrete things that we can do that still involve judging people by their actions. Are they perfect? No, nothing in security will ever be perfect. Are they sustainable ways to make an attacker's life harder? I think so, maybe.
Posted Mar 31, 2024 21:47 UTC (Sun)
by mb (subscriber, #50428)
[Link]
Yes, I agree. This certainly taught me lessons about how to do reviews.
Posted Mar 31, 2024 22:37 UTC (Sun)
by pizza (subscriber, #46)
[Link] (21 responses)
My GitHub presence is nearly nonexistent, and nearly all of it is filing bug tickets or commenting on pull requests that I have an interest in. Does that make me a sock puppet? (Meanwhile, my first public contribution to the Linux kernel came with accusations of using a pseudonym because my legal name "sounds silly".)
....You only know them to be "sock puppets" _after the fact_, because everyone has to start _somewhere_, and even then, there are entire software development ecosystems beyond the confines of GitHub or any other public repositories.
Incidentally, I've also committed "mysterious" test files. They're not mysterious to _me_, but they are to pretty much everyone outside of my project. Does that make my actions questionable? Or is it a matter of "you have to have <this much> domain expertise to meaningfully participate"?
The difference between "legit" and "hostile" is a matter of _intent_, not one of _appearance_, because on the first few layers they appear to be the same thing.
> Sure, I too can imagine an attacker that didn't do any of those things, but I really dislike both throwing up our hands and saying this is impossible
It's not that this is "impossible", it's that you need to be *very clear* about what you're actually trying to protect yourself against or otherwise accomplish, and what the costs (both individual and collective) will be -- and only then can one determine if the cost is "worth it". Because often enough, the rational answer is "nope" -- on both the individual and collective levels.
I'd hate to see the collective F/OSS community lose its collective mind over this and self-immolate in the name of "security".
Posted Apr 1, 2024 11:43 UTC (Mon)
by jafd (subscriber, #129642)
[Link] (20 responses)
If you get into a project out of the blue and start campaigning to get its maintainer replaced because there haven't been releases made in some arbitrary interval of time or because whatever, along with several other users that haven't been in the project until that moment, then very likely yes, it does.
Posted Apr 1, 2024 12:41 UTC (Mon)
by pizza (subscriber, #46)
[Link] (19 responses)
The ill intent here was only visible some time (years?) after the fact. At the time, there's no way to tell the difference.
...What you are calling "Sock puppetry" here is what others call "drive-by contributions"; ie supposedly the entire point of putting a project on the likes of github to begin with.
(There are numerous projects that I've only ever interacted with as a "drive-by", because through the course of my employment I found bugs or had issues and was able to get permission [1] to contribute something back. After that particular task (or employer), I never interacted with said projects again.)
[1] Permission I only obtained through dogged insistence; the bureaucracy defaults to "hell no", for $reasons.
Posted Apr 2, 2024 21:53 UTC (Tue)
by jafd (subscriber, #129642)
[Link] (18 responses)
Making demands, whining that, quote, "the community desires more", and that the project's governance needs to be changed is a negative contribution, if anything.
Posted Apr 2, 2024 22:32 UTC (Tue)
by pizza (subscriber, #46)
[Link] (17 responses)
I don't disagree, but the more widely used the project becomes, the more likely that this entitlement represents the (overwhelming) norm.
Posted Apr 3, 2024 11:07 UTC (Wed)
by farnz (subscriber, #17727)
[Link]
And arguably, that's the core problem; instead of seeing open source contributors (including maintainers) as gift-givers whose generosity is not guaranteed, people see them as obligated[1] to provide software that meets everyone's needs. If we can't get that entitlement under control, we will continue to have problems where malicious people can "take over" a project and backdoor it, simply because they promise to deliver what the users are demanding.
This problem also applies in the proprietary world; paying for binaries is not enough to ensure that the entity you're paying is not malicious. And it's within the budget of a nation-state attacker to get a few people into key points within a big company so that they can backdoor the binaries you buy; it's also possible for an attacker to outright purchase a small company that produces binaries you use specifically to backdoor them.
[1] I once contributed a small change to X.org's Xserver, targeting my then-employer's use case; I had two competitors to my employer ask me to change what I'd done to suit them. One was polite and went to the upstream mailing lists upon request, and took my advice on alternatives to my patch, and what they could do to implement what they needed. The other one was not so polite, and complained to my employer that I was refusing to spend my working time on a competitor's needs. Needless to say, I got no pushback from my employer on refusing to work on improving a competitor's product in my working time.
Posted Apr 3, 2024 15:21 UTC (Wed)
by paulj (subscriber, #341)
[Link] (15 responses)
I find it interesting how I and some others were discussing the toxicity and entitlement present in the expectations and demands a good chunk of users seem to think they can hang on creators and maintainers of free software, just a while ago in the context of what happened to Rust Actix. And now related issues (i.e., the lack of appreciation by the wider world for free software creators and maintainers, and how that opened up this social engineering channel) pop up in connection with what would have been one of the worst security issues in a long while - if not for the luck of Andres Freund noticing.
As a former maintainer, I'm sorry, but the attitudes of many around Free Software are seriously off. Entitlement is ingrained in many, even some of the most well-meaning. It is a significant cultural issue.
Posted Apr 3, 2024 15:35 UTC (Wed)
by rra (subscriber, #99804)
[Link] (13 responses)
Yes. This is the part of the compromise that I was the most struck by, and that I keep thinking about. I kind of knew this already, but reading back through the messages and watching this social engineering happen crystallized it for me.
People use free software as if they were the consumers of a product and treat maintainers as if they were companies producing substandard equipment. Not all people, not a majority of people, but a persistent minority. And heaven help you as a maintainer if you decide that you would like to take the software that you give away for free on the Internet in a direction that you like better but that some user views as an unwanted change. The carping and nasty insults and raging entitlement can go on for literally years, despite the fact that any one of those people could fork the code and do whatever they would like with it.
There is a deep rot in our culture, and it's very off-putting.
Posted Apr 3, 2024 15:58 UTC (Wed)
by paulj (subscriber, #341)
[Link] (12 responses)
The scolds and the trolls thrive on having a public, an audience. It's the (implied) audience that allows one's sense of entitlement to override the fact that you're talking to a human who has given you something, and to scold them and demand things instead. Trolls don't always need an audience, but it definitely encourages them more.
Get rid of the audience, make it a clear 1:1 communication instead, and I think the problem would be significantly abated.
So... if I end up maintaining something useful again, that's what I'll try next time.
(And there are large and core Free Software projects that work on that "email a private address" basis for comments/patches).
Posted Apr 3, 2024 16:32 UTC (Wed)
by farnz (subscriber, #17727)
[Link] (11 responses)
You end up with a double-edged sword. Doing things in private means that it's easy to cut off the trolls, but it also means that you don't get drive-by assistance from someone who's well-meaning but not good at expressing themselves.
I'd be interested to hear how your experiment goes; my experience suggests that it won't go well, because I've had people be as entitled in private mails as I have seen in public - going as far as e-mailing my employer to request that I be fired for refusing to spend my working time changing a patch I sent upstream to suit their needs, not just my employer's needs.
Fortunately for me, they ran a competitor of my then-employer, so when my boss asked me what was going on, I just pointed him at the e-mail domain. He went to their website, and understood why I'd refused to help out - top thing on their site was a hit-piece against our product.
Posted Apr 4, 2024 15:30 UTC (Thu)
by paulj (subscriber, #341)
[Link] (10 responses)
On your other point, you're into the topic of corporate and industry politics. That is, IME, another thing beyond "mere" trolling and entitlement. That's a whole other level!
My view here would be that, as far as possible, you should try to avoid having multiple corporates be competitors while also trying to "collaborate" on an open-source project. So try to arrange it so that developers of the project are not at competing entities, and so that corporate users are not developers of the project. That is, have one entity that can accept resources in some way (be that via donations or support contracts - the latter is probably a lot easier for users to justify) and distribute those resources to the developers in some fair way. The number of structures possible here is large - unincorporated association, co-ops, non-profit corporates, etc. The developers should control it, though. Also, it should not be a charity - a non-profit is fine, but charitable (or equivalent) tax status is not (charity status is hard to get in the UK and Ireland, I think, and heavily regulated - but unfortunately it's relatively easy in the USA via 501(c)(3)).
[To any young Free Software hackers reading, if you're involved in some project that has an entity/trust/association/corporate/foundation managing your donations; and you _don't_ have *full* transparency into _what_ its income is and _how_ those are being distributed, and on a *timely basis* (not "2 years after the current fiscal year, whenever ProPublica manage to acquire the /meagre/ IRS filing" - but more like weekly), then _get_ that transparency and make sure everything is correct and fair. Do *not* simply take things on trust that the 50+yo FOSS-svengalis who run the board and appoint the executive (if not one of themselves) are working in your interests. They may or may not look after you, but they will /certainly/ look after themselves when it comes to any salaries. If you can not get that transparency, that is a _major red flag_!! Please heed this warning, from someone who was once unfortunately naive on this.]
If you already have a rat's nest of corporates who each sell the FOSS project's code in some way, are already jostling for position, and are looking for ways to offload their own maintenance costs onto others whenever possible, then you may be stuck with the politics. You could try setting up a trade association, to take up some tasks and be funded, with rules, but.... even if that helps a bit, you're probably stuck with corporate politics. :( I'd be looking for another job at that point. ;)
That's what happened with the project I was on. It started out as a normal community kind of thing, with individual maintainers. As it got wider adoption and recognition, we eventually started to attract some corporates and "svengali" types. We also started to get more and more patches that were about offloading some corporates pain points onto the (unpaid) maintainers - without any benefit to the code or community.
E.g., patches to add "APIs" into the GPL code - so hooks all over the place, and then exported over some RPC - custom, JSON, capn'proto, whatever. But we never got code that built on those APIs and actually did something useful; and generally never even got code to even exercise and test the exported API. Or we got code to further abstract sets of APIs, e.g. by adding some kind of extra context - again, never provided with code to actually make use of that, never provided with code to even exercise it. Sometimes there would be /promises/ of such code in the future, but... ha!
Clearly, these companies had proprietary software/solutions they were selling, which relied on this GPL software. And they wanted to get rid of the continuous maintenance hassle to them of keeping their changes for their hooks synced with upstream. Possibly also the legal risk of their proprietary software relying on GPL software this way, at least for RPC API hooks. If they could upstream their patches, that maintenance burden would be offloaded to upstream - yay! - and they'd also be able to point to the acceptance of the RPC API changes as showing legitimacy for that (for such patches at least). Double yay!
I know others in this thread say the job of a maintainer is to say "No", but when you do that, once you've reached this "infested by shady corporates" stage, what will happen is the shady corporates start playing power politics. And they do not play nice at all.
I could continue... but this is already long.
There are some very shitty people out there.
Posted Apr 4, 2024 16:00 UTC (Thu)
by paulj (subscriber, #341)
[Link]
Posted Apr 4, 2024 17:52 UTC (Thu)
by farnz (subscriber, #17727)
[Link] (8 responses)
My other point was a lot simpler; IME, you get the same entitlement issues over private e-mail as you get on public forums. It's just less visible, because it's inherently private unless the target of the request publishes it (because the origin doesn't).
Posted Apr 5, 2024 11:08 UTC (Fri)
by paulj (subscriber, #341)
[Link] (7 responses)
I think there is a difference between the entitlement of users being ungratefully demanding of maintainers - getting the "who owes whom what" context 180 degrees backwards by thinking the maintainer owes them something - and developers from different competing corporates jostling over contributions to the same project. In the latter, the nastiness is more due to the entities being competitors, and if one can leverage the naivety or the Free Software principles of a competitor's developer to extract more free work out of them to their benefit, they'll do it.
I guess it's a sense of entitlement, or at least exploiting a general culture of entitlement in FOSS to try to extract free work from a competitor. But it's a competitive behaviour.
The first case is just private individuals being shitty - perhaps unwittingly, because this culture is so pervasive. The second case is people working for corporates, exploiting that culture (perhaps unconsciously) for competitive gain.
Posted Apr 5, 2024 11:24 UTC (Fri)
by farnz (subscriber, #17727)
[Link] (6 responses)
I've seen both of the things you describe in my e-mail inbox when contributing to big projects (not maintaining, just contributing), including people attempting to threaten me for implementing something differently to the way they want it done and asserting that I should do it their way because they work for a big corporate, and I should redo my contribution to match the BigCo way.
I don't think you can escape the shittiness of some people; the only choice you have is whether they display that shittiness in public, or whether it's done in private. And the advantage of it being done in public is that you can alert their BigCo's "press relations" team to the behaviour of the shitty people, which is often sufficiently bad that BigCo's PR team will take action to get it under control for fear that it gets picked up as an example of how BigCo behaves.
Posted Apr 5, 2024 11:49 UTC (Fri)
by paulj (subscriber, #341)
[Link] (5 responses)
People are shitty.
The private entitlement stuff, I think private comms would alleviate a lot of that.
The shitty corporate power politics - which becomes engrained into certain people who work at certain shitty corps [cough, company from San Francisco with a bridge logo, cough] - you can't fix that. Those are shitty people who enjoy playing power games.
Posted Apr 5, 2024 11:53 UTC (Fri)
by paulj (subscriber, #341)
[Link]
I never realised the significance of that until a colleague of mine, unrelated to my issues with said people, was complaining to me about how difficult/nasty it can be to work with people from the bridge-logo company, because it is notorious for horrendous, cut-throat internal power politics - which he told me he thought were largely due to their no-mercy stack-ranking system, where you must keep making regular promotion progress or you're out.
Posted Apr 5, 2024 12:04 UTC (Fri)
by farnz (subscriber, #17727)
[Link] (3 responses)
My direct experience is that people are shittier and more entitled in private comms to someone outside the business than in public, because they know that they can leverage their contacts in their business to dismiss your issue with them in private as "he's forging the e-mail because he wants me to get in trouble - I would never risk bringing the company into disrepute like that".
In public, they say things that are relatively manageable, because they know that if they do the sort of leverage that you've had, they'll have to justify the stuff they said in public. In private, they can be as abusive and shitty as they like, because they can lie their way out of trouble, and have your boss come down on you twice over - once for not doing what the shitty person wants you to, and once for lying to get the shitty person into trouble.
Having all communications be public alleviates the worst of it, because the truly shitty people out there know that they cannot lie their way out of trouble if communications are public - and thus that they'll lose at the power games because their boss will lose at their power games if PR are having to say "we need to be ready to deal with this before the press get hold of it".
Posted Apr 5, 2024 12:40 UTC (Fri)
by paulj (subscriber, #341)
[Link] (1 responses)
I'm just drawing a line between the "private individuals" context and the "competing corporates" context. I'm saying the /former/ likely is fixable with private comms.
I agree it won't fix things in the latter context. I don't know how to fix the latter context; my intention is to avoid being in that scenario again. If I were a maintainer, I would try to discourage other businesses from building their business around something I maintained - because it would ultimately result in pain for me.
The latter context is a complex topic.
The former is a lot more tractable though.
Posted Apr 5, 2024 13:22 UTC (Fri)
by farnz (subscriber, #17727)
[Link]
IME, private individuals aren't shitty in comms in public where they think they're talking to a person, not a company, and they're as shitty or worse in private if they think they're dealing with a company, not a person.
Basically, my experience tells me that the only problem you fix by moving comms to private instead of public is that of corporate PR flacks asking you to remove corporate attribution of shitty comments, while you create new problems of people expecting to get away with being shitty because they can blame you for everything.
Posted Apr 5, 2024 12:43 UTC (Fri)
by paulj (subscriber, #341)
[Link]
Indeed, very good public comms is probably essential to it. You need to get all the agendas and interests teased out and specified, try to identify the common interests as much as possible, and try to set clear lines for competition. Good comms and negotiation are needed for this.
Posted Apr 3, 2024 16:08 UTC (Wed)
by farnz (subscriber, #17727)
[Link]
I think that part of it is that people have lost sight of the intention of the "no warranty"[1] clause from Free Software licences, since similar clauses are rife in the proprietary software world, and thus they don't trigger an "oh, this isn't the normal buyer/vendor relationship" reaction. Instead, because that sort of clause is common in every licence, they assume that it's just software boiler-plate, and that their relationship to a Free Software provider is exactly the same as their relationship with someone who charges them $10,000/year/seat for licences for a piece of software.
In fact, though, there's an essential difference - Microsoft are existentially threatened if their customers decide en-masse that they're not paying for Windows licences in future, but are instead going to use Debian Linux for free, while Debian does not lose anything if a non-contributing user says they're going to pay for a Windows licence instead of downloading Debian for free.
And once you've got that difference in your head, you realise that the value a user provides to Free Software is in contributions - good quality bug reports, documentation that makes it easier for others to use Free Software, helping out on forums and the like, or even helping with the code itself, and not simply using the software. This means that a threat to not use this software, but to use a fork or something different isn't a big deal to the supplier of Free Software.
[1] Like the following from some BSD licence variants:
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE
Posted Mar 31, 2024 22:13 UTC (Sun)
by calumapplepie (guest, #143655)
[Link] (2 responses)
The whole point of co-maintainers is not inspecting every line they commit.
> Let's check the scripts used to generate them into the Git repository.
This would be a good practice; you never know when you'll need more test files. But, again, it adds a significant amount of work, and the file might have been generated in a fundamentally unreproducible way. It could have come from a fuzzer, a user, the maintainer just poking at things, etc.; documenting provenance would be good. But who wants to say "thanks for your free work, now do some more"? And who wants to be told that?
> But the tooling to verify that a release is a correct representation of the Git tree was absent.
That's because, in many cases, release tarballs aren't just the output of "git archive". It's actually somewhat of a problem, and the folks at Debian are talking about mitigations for it. This is hardly the first time that someone getting clever with adding files not found in VCS has been a problem (self-plug for my Great Suspender article), but it remains an issue. Tarballs are often generated to be one-stop shops for users, bundling dependencies, etc.: "Download this file, verify it matches what I signed, and be confident that it will just work". It's a philosophy that fades in and out of favor; distributions work to check the work, but it can be quite hard (see the conversation Russ was having).
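As a minimal sketch of the kind of check being discussed here: unpack the release tarball and a VCS export (e.g. the output of "git archive") into two directories, then flag files that exist only in the tarball or whose contents differ. The function names and structure are illustrative, not any distribution's actual tooling; the xz backdoor's malicious build-to-host.m4 was exactly a tarball-only file that a check like this would surface.

```python
import hashlib
import os


def tree_hashes(root):
    """Map each file's path (relative to root) to its SHA-256 digest."""
    hashes = {}
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            rel = os.path.relpath(path, root)
            with open(path, "rb") as f:
                hashes[rel] = hashlib.sha256(f.read()).hexdigest()
    return hashes


def diff_release_against_vcs(release_dir, vcs_dir):
    """Return (extra, changed): files present only in the release
    tarball, and files whose contents differ from the VCS export."""
    release = tree_hashes(release_dir)
    vcs = tree_hashes(vcs_dir)
    extra = sorted(set(release) - set(vcs))
    changed = sorted(p for p in release if p in vcs and release[p] != vcs[p])
    return extra, changed
```

Anything in the "extra" list (generated configure scripts, bundled dependencies, pre-built m4 goo) is precisely the material a reviewer never sees in the repository, so it deserves the closest scrutiny.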
Posted Apr 1, 2024 7:16 UTC (Mon)
by pabs (subscriber, #43278)
[Link] (1 responses)
Posted Apr 1, 2024 10:23 UTC (Mon)
by ballombe (subscriber, #9523)
[Link]
There are reasons why people ship pregenerated files, even while that makes diff
Posted Apr 1, 2024 7:13 UTC (Mon)
by pabs (subscriber, #43278)
[Link] (14 responses)
Posted Apr 1, 2024 16:03 UTC (Mon)
by emk (subscriber, #1128)
[Link] (13 responses)
Anyway, GNU Poke is probably reasonable for data files that are built up in sensible ways. But a key part of my test infrastructure for certain libraries is fuzzer-generated binary blobs. These tend to be maximally "evil" misinterpretations of an input format, generated via guided search of billions of possible inputs. At some point, each of these blobs caused a crash or an overflow. And I keep them around to detect regressions, or to seed future fuzzing runs. There's nothing nice or sensible in these files; they break the core assumptions of a format. And almost none of them have ever been analyzed in depth by a human.
Or for another example, if I want to test speech recognition, I'm probably going to want a few seconds of spoken audio somewhere in my test suite.
I'm actually wrestling with this more, recently: A lot of machine learning and AI-based tools really need "test sets" containing a significant amount of real-world data, combined with minimum expected performance metrics. Figuring out how to manage this in the open source world has been tricky at times. These data sets may be too big to distribute as part of a release. Which leaves fun options like "git submodule", plus the issue of building non-proprietary test sets.
Posted Apr 1, 2024 16:39 UTC (Mon)
by nix (subscriber, #2304)
[Link] (2 responses)
Posted Apr 2, 2024 4:11 UTC (Tue)
by pabs (subscriber, #43278)
[Link] (1 responses)
Posted Apr 2, 2024 12:42 UTC (Tue)
by pizza (subscriber, #46)
[Link]
"Other folks" == "one other person" in this case. Which I suspect is the overwhelmingly second-most-common case (ie other than having one person in total)
The whole point of having co-maintainers is so you can spread the work around, not create more for yourself.
Posted Apr 2, 2024 4:09 UTC (Tue)
by pabs (subscriber, #43278)
[Link] (9 responses)
Sure, poke wouldn't be useful for some cases, but for the fuzzer generated blobs you could minimise them down to just the bits that trigger the bug and then convert that to poke. Those test files also don't have to be part of the source code tree. For the file format libraries I maintain, most of the test files aren't redistributable anyway, because they came from some bug reporter in private, or in public but not licensed.
Most of the data inputs for ML/AI are not licensed, just scraped from somewhere, so not redistributable even if they were small enough. There are some exceptions, some datasets are under proprietary commercial licenses, and there are libre datasets but they are rare. This is worth a read:
Posted Apr 2, 2024 9:27 UTC (Tue)
by farnz (subscriber, #17727)
[Link] (8 responses)
The other direction with fuzzer blobs (at least with some fuzzers) is to provide instructions for recreating the fuzzer blob on your own system. This goes well with fuzzers that have a guided mode; I can provide guidance to the fuzzer, and you can run the guided fuzzer against the known buggy version of code to get the same blob yourself, thus allowing you to be confident that the blob came from the fuzzer.
Posted Apr 2, 2024 10:52 UTC (Tue)
by excors (subscriber, #95769)
[Link] (2 responses)
1) Many people would stop contributing fuzzer-based regression tests, because it's too much of a hassle, so the project would be at greater risk of security vulnerabilities from untested regressions.
(In fact, Jia Tan did contribute an xz-logo.png recently, and updates to the translations. I don't see anything suspicious about them, but I couldn't be certain...)
Posted Apr 2, 2024 11:39 UTC (Tue)
by ms (subscriber, #41272)
[Link]
Posted Apr 2, 2024 13:09 UTC (Tue)
by farnz (subscriber, #17727)
[Link]
The fundamental point I'm aiming for is proof of good faith where reasonable. If you make a claim ("this blob is fuzzer output"), can you provide a reasonable proof that it is actually fuzzer output, and not malicious data?
E.g. if I claim that this is "fuzzer output", and that you can regenerate it by running AFL with this block of guidance data against this revision, you can quickly confirm that I'm not lying, because while it may have taken you 6 CPU-months to find that blob, with the guidance data I can get AFL to reproduce it in seconds. If I say it's my own work, there's nothing I can do to prove good faith.
Posted Apr 2, 2024 11:41 UTC (Tue)
by emk (subscriber, #1128)
[Link] (4 responses)
I mean, I certainly couldn't (and wouldn't) recreate the fuzzer blobs on my own system. To find those blobs, I rented some expensive multi-core monster in the cloud, and I ran it for days. We're definitely looking at over $100 of CPU time here. If someone is going to try to reproduce that, I'd prefer they look for new bugs!
But I do try to purge my largeish test fixtures from distributed packages, even though that means my packages don't match git.
(I fear that supply-chain attacks may be intrinsically difficult to protect against, if you assume that some of your adversaries are nation-states playing the long game. Let's not forget all the tales of companies hiring people who do not exist, whose address turned out to be an empty house rented in advance for cash. Or of national intelligence agencies intercepting shipped packages and modifying the hardware prior to delivery. And it's not like the CIA kept out Aldrich Ames, either. So I think our first steps here should be to add multiple layers of mitigations and checks, aimed at greatly reducing the risk of less sophisticated attackers succeeding. I would love it if we began by removing autotools and M4 from all the key parts of the ecosystem. And maybe started keeping databases of identities with commit rights and upload rights for critical base packages—much of which could be automatically figured out from public information.)
Posted Apr 2, 2024 12:17 UTC (Tue)
by farnz (subscriber, #17727)
[Link] (2 responses)
If you're going this direction, you're using a fuzzer where, having found a blob, it's trivial to tell the user how to recreate the blob via the fuzzer; in effect, you're providing the fuzzer guidance input that gets it to find the blob extremely quickly as a proof that the blob was found via the fuzzer, and not by hand, even though finding out what guidance is needed is going to take a long time and cost a lot of money.
And fundamentally, this is what it all comes down to - proving that you're acting in good faith, and not an attacker. An attacker can't provide evidence that they created the blob via a fuzzer; you can provide a guidance string from the fuzzer that has it generate the same blob that took you days to find, but taking seconds to find it because the guidance gets it down the right path immediately.
Posted Apr 2, 2024 16:23 UTC (Tue)
by kleptog (subscriber, #1183)
[Link] (1 responses)
That's only true if you have a program that actually fails, and there's no guarantee that there is a committed version that fails in the way the test case is testing. I suppose you could add an assert to the code that is specifically defined so that it fails during fuzzing at the specific test case. So the test case is really: does the fuzzer find the "flag" with the specified input?
However, if you're using the input to guide the fuzzer, all you will prove is that the input is a possible way to trigger the bug. You can't prove that the test case wasn't modified afterwards to include malicious code in a way that still triggers the bug. To do that you'd need some kind of unicity test: that the test case is the lexicographically earliest test case that could possibly trigger this path. I'm not sure if that's a typical output of fuzzers. And I'm not sure the added effort is worth it (unless the tools can be improved to the point where it is no extra effort).
Posted Apr 2, 2024 17:26 UTC (Tue)
by farnz (subscriber, #17727)
[Link]
The blob output by the fuzzer is the test input; you have a separate input for the fuzzer that's your "certificate" that the fuzzer found the test input blob, and that it's not been modified since the fuzzer created it. If you modify it afterwards, then when I attempt to regenerate the test input using the fuzzer and the certificate input, I'll see that your certificate and the test input don't match.
You'd therefore only run the fuzzer if you wanted to see it reproduce the blob - you're checking that this certificate matches up with the blob, and raising the alarm if there's a mismatch. Mostly, you'd just trust the committed blob, even though you have instructions for reproducing it - but the idea is that it raises the bar for an attacker, since they have to worry that at any point, a random passer-by could decide to try to reproduce the blob, discover that the checked-in blob differs from what the reproduction instructions generate, and raise the alarm.
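A toy sketch of this certificate check, under the assumption of a fully deterministic, seed-guided fuzzer. Here guided_fuzz merely simulates such a fuzzer with a seeded PRNG, and "certificate" is just the seed; real fuzzer replay mechanics (e.g. AFL's) are more involved, so everything below is illustrative only.

```python
import hashlib
import random


def guided_fuzz(seed: int, length: int) -> bytes:
    """Stand-in for a deterministic, guidance-driven fuzzer run: the
    same seed always regenerates exactly the same blob."""
    rng = random.Random(seed)
    return bytes(rng.randrange(256) for _ in range(length))


def verify_certificate(committed_blob: bytes, seed: int) -> bool:
    """Re-run the 'fuzzer' with the certificate (the seed) and check
    that the regenerated blob hashes identically to the committed one.
    A mismatch means the blob was altered after the fuzzer produced it."""
    regenerated = guided_fuzz(seed, len(committed_blob))
    return (hashlib.sha256(regenerated).digest()
            == hashlib.sha256(committed_blob).digest())
```

The point of the scheme is exactly what the comment describes: the honest contributor can commit both the blob and the cheap-to-replay certificate, while an attacker who hand-edits the blob after the fact cannot produce a certificate that regenerates their modified bytes.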
Posted Apr 2, 2024 12:36 UTC (Tue)
by pizza (subscriber, #46)
[Link]
This is a complete non-starter (for all but the largest/well-funded projects) in many jurisdictions, due to the massive legal/regulatory requirements it triggers.
Posted Apr 1, 2024 15:10 UTC (Mon)
by welinder (guest, #4699)
[Link] (2 responses)
If I want payload.o in, I could do...
1. Create obfuscated.o from payload.o
This will not stand out in any way. And there are endless variations of the above -- enough to smuggle a herd of elephants into the repository.
Broken binary files are common. They are often created by unknown broken software or even hardware (USB sticks, for example). You can't vet that.
Posted Apr 1, 2024 15:25 UTC (Mon)
by farnz (subscriber, #17727)
[Link]
And as a significant case, I might well check in evil.xz because I'd found a way to make xz fail in an interesting fashion using the damaged file.
For example, I have a test case in my current job which looks at a block of binary data (in a known format) and verifies that our protocol decoder correctly fails because it runs out of data to decode, rather than getting stuck because an inner field claims to need more data than the size of the data block (which we know from a header).
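The failure mode described there can be illustrated with a toy decoder. This is not the commenter's actual code, just a hypothetical length-prefixed format: an outer header gives the block size, an inner field carries its own length prefix, and the decoder must reject an inner length that exceeds the enclosing block rather than waiting for data that will never arrive.

```python
# Hypothetical sketch of a decoder that fails fast on inconsistent
# length fields instead of stalling. Format (invented for this example):
#   2-byte big-endian block length, then the block body;
#   inside the body, a 2-byte big-endian inner length, then the payload.
def decode_block(data: bytes) -> bytes:
    if len(data) < 2:
        raise ValueError("truncated header")
    block_len = int.from_bytes(data[:2], "big")
    body = data[2:2 + block_len]
    if len(body) < block_len:
        raise ValueError("block shorter than header claims")
    inner_len = int.from_bytes(body[:2], "big")
    if inner_len > len(body) - 2:
        # The key check: the inner field claims more data than the
        # block contains, so fail now rather than ask for more input.
        raise ValueError("inner field exceeds block size")
    return body[2:2 + inner_len]
```

A test for such a decoder feeds it a block whose inner length is deliberately inconsistent with the outer header and asserts that decoding raises an error instead of hanging.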
Posted Apr 1, 2024 15:39 UTC (Mon)
by dezgeg (subscriber, #92243)
[Link]
I guess the next problem is that some binaries do end up in the final .deb/.rpms (like images for icons) where stuff can still be hidden (and steganography could probably be used on plaintext test cases as well). But maybe that sort of rule would still help, as libraries or pure CLI tools don't need to include images.
Posted Mar 30, 2024 22:13 UTC (Sat)
by jg71 (guest, #67102)
[Link]
I still look after a flock of dedicated servers running various flavours of Slackware in my spare time (because I like doing that), and I am so glad this major mess was stopped as soon as it was.
Posted Mar 30, 2024 23:00 UTC (Sat)
by ejr (subscriber, #51652)
[Link]
Software supply chain security is an ongoing area of research and development. Ain't solved yet, and no simple hammer is going to solve it other than just being offline.
Yeah, I have to use AT&T, but this isn't related to their other thing.
Posted Mar 31, 2024 1:11 UTC (Sun)
by carlosrodfern (subscriber, #166486)
[Link] (4 responses)
Anyways, a lot of room to improve still.
Posted Mar 31, 2024 3:06 UTC (Sun)
by rra (subscriber, #99804)
[Link] (3 responses)
Admittedly, I may have also been muttering a rant about how no one writes sufficient documentation. (In this case, that rant was wrong, though; the behavior I was seeing was sufficiently documented, I was just reading the wrong documentation because I had an erroneous understanding of the problem.)
Posted Mar 31, 2024 6:01 UTC (Sun)
by mjg59 (subscriber, #23239)
[Link] (2 responses)
I initially spent time developing reverse engineering skills for hobby purposes, but these days a fair amount of it ends up being to figure out how proprietary software is behaving and how to work around it. All else being equal I'd definitely prefer to have the source code, but given a decent incentive it's really not as hard to deal with this in the proprietary world as you might think.
Posted Apr 2, 2024 12:56 UTC (Tue)
by Sesse (subscriber, #53779)
[Link] (1 responses)
Posted Apr 3, 2024 11:57 UTC (Wed)
by Wol (subscriber, #4433)
[Link]
I wouldn't call my C skills that good (and I have no need to dig in to the guts of most stuff), but I have a hell of a lot of experience, and also an almost pathological desire for simple, logical behaviour.
So you'll have seen me moan a lot about what I consider stupid behaviour by software, but it also actually makes it fairly easy for me to reason about said "stupid" behaviour - usually making me berate it even more for the clear lack of any thought by the authors! - but I can work out what it's doing, and how to work round it.
The amount of sheer CRAP I have to deal with in VBA is amazing (or rather it's not: the lack of any coherent design is overwhelmingly obvious) :-(
Cheers,
Posted Mar 31, 2024 17:11 UTC (Sun)
by matkon (subscriber, #109282)
[Link] (4 responses)
Posted Mar 31, 2024 19:09 UTC (Sun)
by Vorpal (guest, #136011)
[Link] (1 responses)
* You need to identify the git repo for the project, given only a URL to the tarball (which may be hosted elsewhere; e.g. GCC does that).
For older C/C++ this is going to be a pain. I think it would be doable for more modern build systems (CMake, Meson, Bazel) and languages (Rust+cargo, JS+npm, etc., where you have a standard way of distributing releases), where you don't have this tradition of a complicated "make dist".
Doing the actual build and comparing as you suggested might be even more difficult: how you build from the repo can differ from how you build from the tarball (extra steps needed). And you need to match build flags, compiler versions, dependency versions, etc.
Posted Apr 1, 2024 7:21 UTC (Mon)
by pabs (subscriber, #43278)
[Link]
This is incorrect; the file that got modified originated in gnulib, a collection of m4, C, etc. code that is designed to be used as an embedded code copy and is often committed to git.
Posted Mar 31, 2024 22:24 UTC (Sun)
by calumapplepie (guest, #143655)
[Link] (1 responses)
Perhaps reproducible builds should get a bit harsher with their variations, e.g. setting/unsetting $TERM. Of course, that will mean a lot more work for them; but as they theoretically could have caught this one, perhaps now their efforts will get more funding and attention?
Posted Apr 2, 2024 17:02 UTC (Tue)
by jwilk (subscriber, #63328)
[Link]
A few relevant quotes
To elaborate, the three names given for the attacker are all from drastically disparate languages, cultures and geographic regions in China.
It is about as credible a name as, idk, "Jacques Vladimir Smith". Or those American baseball player names Japanese game developers made up, like "Sleve McDichael". Except that it's even less credible, because how you transliterate characters into the Latin alphabet differs drastically between all of the Chinese languages (it really is a severe understatement to call them dialects, although they do lack an army and a navy). A bit like how the name "Jacques" gets anglicized to "Jack", except more pervasive.
Add to that the random timezone jumps and not observing Lunar New Year, a time when literally the entire country shuts down. It really makes the attempt to seem Chinese look almost laughably amateurish in retrospect. It's incredible that there are still people willing to believe it. But they picked China because they knew anti-Chinese sentiment would work to their advantage, and it did.
(This is, incidentally, why I don't think this was the US. For a major operation, surely they would have been able to conjure up someone in the CIA with at least a rudimentary understanding of China to make this more plausible. Or maybe they're just more incompetent than I give them credit for.)
You cannot usually redistribute the whole git repository, and a git archive of a commit is not reproducible.
It is more important to protect against webhost compromise than upstream going rogue.
> Why not? Are you referring to size?
> Not sure I am following here.
> A bare shallow clone should not be significant different in size and you can immediately see whether your SHA-1 matches or not.
switching something else from autoreconf hell to CMake in the wake of XZ.
Arch Linux also would have been unaffected, since they carry neither the libsystemd patch nor SELinux. For Gentoo it likely would have depended on USE flags.
$ rpm -q --provides zstd |grep lib
$ rpm -q --requires zstd |grep ^lib |grep -vE 'lib(c|gcc|stdc)'
liblz4.so.1()(64bit)
liblzma.so.5()(64bit)
liblzma.so.5(XZ_5.0)(64bit)
libz.so.1()(64bit)
Then, most if not all of these big companies are members of the Linux Foundation, so how come this foundation is not funding and supporting these projects as suggested above?
If the corporation can shift the cost of repairing damage done by insecure software to other actors or free-ride on other people's efforts to fix it without spending their own resources, their incentive structure rewards that behavior.
> the business consequences of important computers being compromised are not something you can ever be fully compensated for.
CE marking is a regulatory requirement in the European Economic Area.
It signifies conformity with health, safety, and environmental protection standards.
[2] Not unlike what it takes to get a security clearance
[3] Government security/intelligence services routinely create fake "legends" for their operatives.
Entitlement issues with Free Software
Apple needs to stop shipping a 20-year-old version of bison.
Developers should not use CMake features only available in the latest version.
Red Hat should not have removed perl from the default install.
etc. etc.
ugly...
Sure, poke wouldn't be useful for some cases, but for the fuzzer generated blobs you could minimise them down to just the bits that trigger the bug and then convert that to poke. Those test files also don't have to be part of the source code tree. For the file format libraries I maintain, most of the test files aren't redistributable anyway, because they came from some bug reporter in private, or in public but not licensed.
2) Attackers would simply hide their payload in a logo.png instead, because nobody would think to check that for hidden data. Or they'd hide it in the middle of a 200KB file of Korean translations, which non-Korean-reading reviewers wouldn't notice, or in any other file that's similarly impractical to review.
I mean, I certainly couldn't (and wouldn't) recreate the fuzzer blobs on my own system. To find those blobs, I rented some expensive multi-core monster in the cloud, and I ran it for days. We're definitely looking at over $100 of CPU time here. If someone is going to try to reproduce that, I'd prefer they look for new bugs!
2. Take valid.xz and truncate it at, say, 64k, creating truncated.xz
3. Concatenate truncated.xz and obfuscated.o into evil.xz
4. Have someone file a "xz cannot unpack this file" report.
5. Analyze the bug report and say "file is damaged -- it seems to have been overwritten with random garbage half way in"
6. Commit evil.xz as a test case with reference to the bug
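The core of this recipe can be sketched with Python's standard lzma module: a valid .xz stream, truncated and padded with arbitrary bytes, decompresses partially and then fails with an ordinary corruption error, exactly as step 5 describes. The payload bytes here are a stand-in, not a real object file.

```python
# Sketch of steps 2-3 above: truncate a valid xz stream and append a
# payload; the result is indistinguishable from mundane corruption.
import lzma

valid_xz = lzma.compress(b"A" * 100_000)        # plays the role of valid.xz
truncated = valid_xz[: len(valid_xz) // 2]       # truncated.xz
payload = b"\x7fELF" + b"\x90" * 64             # stand-in for obfuscated.o
evil_xz = truncated + payload                    # evil.xz

try:
    lzma.decompress(evil_xz)
except lzma.LZMAError as e:
    # This is the error a bug triager would see and shrug at.
    print("xz cannot unpack this file:", e)
```

The point of the exercise is that the triager's diagnosis in step 5 ("overwritten with random garbage halfway in") is entirely accurate as far as the decompressor can tell; nothing about the failure reveals that the "garbage" is a crafted payload.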
And I can relate to Russ Allbery's quote. That's why I wrote this comment.
I earn my living with a job that is far removed from open source software.
There's a lot of fun in being involved with open source software, and then real life sneakily takes its toll. It's hard to pull out when you've put your heart and soul into it; lines are often blurred.
I stepped back 'a little' last year when I realized that it felt like a burden more often than not: maintaining build scripts mostly, helping out in a local users group, ... you catch my drift.
Life is simpler, yes, but I look back to those achievements fondly.
One needs to reflect upon that from time to time, I think, and draw conclusions.
Nobody talks about that early on. Real life is dynamic, and false priorities really can do harm.
Compare that with the attack on the closed-source software SolarWinds, which actually made it to the customers.
Reproducible builds
* Associate a git tag with that release.
* Figure out if any files are stripped from the release archive (e.g. gitignore, CI config).
* If the project uses autotools, deal with regenerating those files with the same version of autotools (xz used autotools, and there were hidden things in the autotools files that differed).
* If the project bundles anything else (submodules, data files from a separate repo, etc) deal with that.
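Once those steps are handled, the actual comparison boils down to a recursive diff between the git checkout and the unpacked tarball. A minimal local sketch, with made-up file names standing in for a real project, shows the kind of mismatch you would expect when the tarball carries generated files absent from git:

```shell
# Hypothetical sketch: "repo" plays the git checkout, "release" plays
# the unpacked tarball, which carries an extra generated file.
set -eu
mkdir -p repo release
echo 'int main(void){return 0;}' > repo/main.c
cp repo/main.c release/main.c
echo '# generated by autoreconf' > release/configure   # tarball-only file
if diff -r --brief repo release; then
    echo "tarball matches git tag"
else
    echo "tarball differs from tag"
fi
```

In the xz case, a difference of exactly this shape (a modified m4 file present only in the release tarball) is where the backdoor's build-time stage lived, which is why the comparison is worth the listed hassle.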