
LWN.net Weekly Edition for February 9, 2023

Welcome to the LWN.net Weekly Edition for February 9, 2023

This edition contains the following feature content:

  • Users and Python packaging: survey results and a renewed discussion of packaging strategy and unification.
  • Fedora packages versus upstream Flatpaks: a packager's plan to drop the Bottles RPM in favor of the upstream Flatpak.
  • Git archive generation meets Hyrum's law: a change to GitHub's tarball checksums breaks build systems worldwide.
  • Constant-time instructions and processor optimizations: should the kernel enable the new constant-time CPU modes?
  • A survey of free CAD systems: the state of free and open-source 2D and 3D CAD software.

This week's edition also includes these inner pages:

  • Brief items: Brief news items from throughout the community.
  • Announcements: Newsletters, conferences, security updates, patches, and more.

Please enjoy this week's edition, and, as always, thank you for supporting LWN.net.

Comments (none posted)

Users and Python packaging

By Jake Edge
February 8, 2023

Python packaging

A lot of digital ink has been expended in recounting the ongoing Python packaging saga, which is now in its fourth installment (earlier articles: landscape survey, visions and unification, and pip-conda convergence). Most of that coverage concerned conversations that took place in November; the discussion largely settled down over the holidays, but it picked up again with a packaging-strategy thread that started in early January. That thread was based on the results of a user survey about packaging that was meant to help guide the Python Packaging Authority (PyPA) and other interested developers, but the guidance provided was somewhat ambiguous—leading to lots more discussion.

Survey

The packaging survey was analyzed by Shamika Mohanan, who summarized the results in late November. Tzu-ping Chung noted that there seemed to be conflicting answers to two of the questions asked:

People seem to generally disagree with the statement "Python packaging deals well with edge cases and/or unique project requirements", but also feels that "Supporting a wider range of use cases" is the least impactful thing for the PSF [Python Software Foundation] and PyPA to focus on. I kind of sense there's some interesting thinking going on but can't quite identify what exactly.

Brett Cannon suggested a cause, perhaps somewhat cynically: "People want their problem solved, but don't care about others and their problems." Paul Moore had a different take, though; he thought that respondents wanted the packaging community to "continue focusing on the majority cases" and not spend time on supporting edge cases. "To me, that fits with the desire for an 'official' workflow and associated tool(s) - the edge cases would be outside the 'official' workflow, and hence not a priority." Somewhat in keeping with Cannon's comment, however, Moore said that the pip developers have found that "people really don't take kindly to being told their use case is unusual or unsupported". Dustin Ingram thought that by catering too much to the edge cases, everything suffers: it "leads to both the majority cases and edge cases being poorly supported".

At the end of December, Ralf Gommers announced the pypackaging-native site that is meant to collect "the key problems scientific, data science, ML/AI [machine learning/artificial intelligence] and other native-code-using projects & authors have with PyPI, wheels and Python packaging". But on the site he purposely does not offer "more than hints at solutions", since it is meant to foster discussion, larger projects, and Python Enhancement Proposals (PEPs). The site came up regularly in the strategy discussion as a fairly neutral recounting of the trials and tribulations associated with packaging, especially for scientific-Python projects (like NumPy and SciPy) where there are native dependencies, such as for compilers, build tools, and specialized libraries (e.g. for CUDA).

Strategy

In the strategy thread, Mohanan began by noting that several survey respondents had explicitly called for a single tool that would unify the various disparate efforts and eliminate the fragmentation. "In a nutshell, there are too many tools and users are not sure which ones to use." She asked if there is a path toward some kind of unification, and what it would entail.

It is not simply a matter of tools, Filipe Laíns said: "just unifying tools won't do any good, we need to do a full UX [user experience] analysis and then design something that fills the needs of users more naturally". pip already tries to be the single-tool solution, but it suffers from a need to continue supporting all of its historic use cases, which makes it hard to change. In addition, pip and other packaging tools all suffer from a lack of maintainer time.

Moore readily agreed, but noted that for some unified tool to be successful, "it will be at least as important to decide what workflows we won't support, as what we will". The lesson learned from pip is that trying to support every workflow just ends up in a mess, he said. But one problem is that respondents to the survey have likely been making an "implicit assumption that their preferred workflow will be supported" in whatever unified tool comes about. Ultimately, that is where the tension will lie.

But Moore is not so sure that a single, unified tool is worth the enormous amount of effort it would take to get there; it would be a substantial improvement for users, but it would take a lot of resources that might be better used in other ways. "Even as a funded project, with its own hired resources, it would consume a big chunk of attention from the packaging community [...]". He thinks there may be a better way:

One alternative that I think we should consider is continuing to work on the goal of splitting out the various "components" of packaging into reusable libraries. (In particular, I'd like to see an ecosystem where "pip is the only viable approach to installing packages" is no longer the self-evident truth that it is today.) Projects like installer, build, packaging and resolvelib are good examples of this. Using those libraries, and more like them, it will be a lot easier to build new workflow tools, any one of which has the potential to become the unified solution people want. It's definitely going to result in more complexity on the way to something simpler, and I'm pretty sure it's not what the survey respondents were imagining, but I feel that it might be a better trade off for the long term.

Cannon said that in his work on Python in Visual Studio Code (VS Code), he also encounters the problem of everyone expecting that their workflow will be supported. "People often don't have exposure to other workflows, so they innately think their workflow is 'normal', so if something doesn't work for them then it obviously is broken for everyone, right?" Changing to a more common workflow is not something that some people will be willing to do.

He agreed with Moore's suggested approach of reusable libraries for packaging; "If we can get everything backed by standards and then make using those standards easy by making sure they have backing libraries, then we can make it easier for people to experiment a bit as to what the proper UX should be for any front-end packaging tool." He also noted that the complexity in Python packaging is a double-edged sword: "While people bemoan Python's packaging story as being too complex, [its] flexibility is what helped it become the glue language of the programming world."
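
Those libraries are already usable on their own. As a rough illustration of the approach Moore and Cannon describe, here is a minimal sketch (not any official PyPA tool or workflow) of how a hypothetical front-end could lean on the packaging library for name, version, and wheel-tag handling instead of reimplementing that logic itself:

    # Sketch of a hypothetical front-end built on the reusable "packaging"
    # library; nothing here is an official workflow, it only shows the
    # building blocks such a tool could reuse.
    from packaging.requirements import Requirement
    from packaging.tags import sys_tags
    from packaging.utils import canonicalize_name
    from packaging.version import Version

    # Parse a dependency specification the same way pip or any other tool would.
    req = Requirement("NumPy>=1.24,<2.0; python_version >= '3.9'")
    print(canonicalize_name(req.name))    # normalized name: "numpy"
    print(req.specifier)                  # version constraints: <2.0,>=1.24
    print(req.marker.evaluate())          # does the marker match this interpreter?

    # Decide whether a candidate release satisfies the requirement.
    print(Version("1.26.4") in req.specifier)   # True

    # Enumerate the wheel tags the running interpreter can install; a resolver
    # uses this list to pick a compatible wheel from an index.
    for tag in list(sys_tags())[:3]:
        print(f"{tag.interpreter}-{tag.abi}-{tag.platform}")

The build, installer, and resolvelib projects cover the building, installing, and dependency-resolution pieces in the same composable spirit.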

Conda

The elephant in the room, though, is the scientific-Python world and the conda ecosystem that goes along with it. H. Vetinari worried that it was not really being considered in the discussion. "As long as the ecosystem currently being served by conda cannot be folded back into 'the One True Way', we have not actually solved the schisms in python packaging (i.e. everyone can just use the same tool)." From Vetinari's perspective, either the monumental task of solving most of the problems described at pypackaging-native needs to happen in the unified tool, or to "define large parts of the data science ecosystem as out of scope". Neither of those is particularly palatable, but the PyPA has generally considered the needs of the scientific-Python world as being out of scope for its tools because of how much work it would take to handle the non-Python dependencies, Vetinari said.

But Moore made a distinction between "use cases" and "workflows"; he is adamantly opposed to labeling any particular use case as "out of scope". "I do expect that we may need to ask users to do things in certain ways in order to address their use cases [...]". Beyond that, though, conda has to install its own Python build, which means that anyone who wants to get, say, NumPy through conda, has to suddenly switch away from the Python they are already using; pip and others are meant to work with any (and every) Python installation:

The PyPA focus is on supporting the "standard" builds of Python (python.org, Linux distro builds, Windows Store, self-built interpreters, Homebrew, ...) Solutions that require the user to switch to a different Python build don't fit that remit. I don't think that "declare all of those Python builds as out of scope" has any more chance of being acceptable to the SC [steering council] than "declare a big chunk of the user base" does.

Vetinari suggested that looking at the problem differently might help: "For brainstorming about those solutions, I'd really like us not to think in terms of 'python.org installers' or 'Windows store builds' or 'conda', but in terms of anything that can satisfy the same relevant set of requirements." Moore agreed with that, but noted there is some amount of tension on the boundary between the PyPA and the core language (in the form of the steering council):

But I think that "how do people get Python" is part of the question about a unified solution. In much the same way that the "rust experience" isn't just cargo, it's also rustup. But just as we can't assume the SC would be OK with ignoring a chunk of the user base, I don't think we can assume the SC will accept dropping those ways of getting Python. We can ask, but we can't assume.

This is where the boundary between packaging (the PyPA) and the SC blurs. I personally think that the SC's "hands off" approach to packaging puts us in a bad place as soon as we get close to areas where the SC does have authority. Distutils was dropped from the stdlib, and the packaging community had to pick up the slack. We have to address binary compatibility, but we don't have control over the distribution channels for the interpreter. We provide tools for managing virtual environments, but we don't control the venv mechanism. We install libraries, but we don't control the way import hooks work. Etc.

Vetinari "fully agreed" that solving all of the problems will eventually require support at the language level, thus steering council involvement.

Unification

The discussion topic mentions "unification", Pradyun Gedam said, but everyone seems to have a different idea of exactly what would be unified. He came up with a list of six different aspects of the problem space that might be amenable to unification:

  1. Unification of PyPI/conda models (i.e. the non-Python code dependency problem).
  2. Unification of the consumer-facing tooling (i.e. consuming libraries).
  3. Unification of the publisher-facing tooling (i.e. publishing libraries).
  4. Unification of the workflow setups/tooling (i.e. organising files, running tests, linters, etc).
  5. Unification/Consistency in the deployment processes (i.e. going from source code -> working application somewhere).
  6. Unification/Consistency in `python` usage experience. (i.e. the rustup/pyenv aspects of this, which is absolutely a thing that affects users' "Python Packaging" experience (think pip != python -m pip, or python -m pip vs py -m pip, or python being on PATH but not pip, etc.))

That list, which others in the thread added to, gives a nice, quick overview of where the complexity lies. These problems affect several kinds of users, including Python application users and developers, Python Package Index (PyPI) module developers, system administrators and operators, and so on. Each of those groups has its own set of constraints (e.g. operating system, architecture), habits, and prejudices, but they are all trying to share various global resources in a way that makes sense for them. It is perhaps not surprising that there is no unified solution that covers everyone; in fact, it is hard to see how all of the things on that list could be fully unified—at least in any kind of near-term time frame.

There is more to come on this Python-packaging journey for sure. Participants have been nothing if not verbose, with many good points being raised, as well as ideas for how to get where most everyone seems to want to go: a unified user experience for Python package installation. Something that just works for "everyone" and that every project can point to as its means of installation. Further unification could come later—if it ever comes about at all. Stay tuned ...

Comments (28 posted)

Fedora packages versus upstream Flatpaks

By Jake Edge
February 7, 2023

The Flatpak package format promises to bring "the future of apps on Linux", but a Linux distribution like Fedora already provides packages in its native format—and built to its specifications. Flatpaks that come from upstream projects may or may not follow the packaging guidelines, philosophy, and practices, so they exist in their own world, separate from the packages that come directly from Fedora. But those worlds have collided to a certain extent over the past year or two. Recently, a packager announced their plans to stop packaging the Bottles tool, used for running Windows programs in Wine-based environments on Linux, in favor of recommending that Fedora users install the upstream Flatpak.

Fedora packager "Sandro" posted about their intent to stop packaging Bottles, starting with Fedora 38, on the devel mailing list. In addition, the existing packages for Fedora 36 and 37 would not receive any updates unless security fixes were required. The reasons for doing so are twofold: first, the project "is moving fast and we have been struggling to keep up with upstream releases". That effort has been complicated by the introduction of some Rust-based dependencies, which is a problem that Fedora as a whole has also been struggling with. Second, an upstream developer approached the Fedora Bottles team with a request not to package the tool.

A bit more discussion around the upstream effort to persuade distributions to drop Bottles packages can be seen in a GitHub issue for Bottles. The Bottles developers also note that the project is fast-moving, and that the issue tracker only accepts bug reports for the official Flatpak version, both of which make it unfit for distribution packaging in their eyes. In early January, "orowith2os" sent an email to the Fedora Bottles packagers requesting that they drop the package, which got the ball rolling.

The reaction to Sandro's post was less than entirely positive, perhaps unsurprisingly. Vit Ondruch said that "the push towards (upstream) Flatpaks is unfortunate". Pete Walter agreed and suggested that the package be assigned to him rather than retiring it. "Vascom", meanwhile, thought that it was a "bad practice to drop packages by upstream request".

Upstream Flatpaks

Josh Boyer wondered why Ondruch was opposed to Flatpaks. The problem, Ondruch replied, is that he does not trust that upstream Flatpaks "follow any standard except standard of their authors". Beyond that, Flatpaks do not really live up to their billing:

And I don't like [Flatpaks], because their main advantage (their isolation) is also their biggest disadvantage. There can't be both without making compromises. If I am not mistaken, the isolation is also mostly myth, because it is disabled in most cases.

There are a few questions that come to mind for upstream Flatpaks, according to Richard W.M. Jones: "How do we know they don't contain non-free software? How do we ensure we can obtain and rebuild from source?" The answer to those questions is that "you have to trust that the maintainer of the upstream F/OSS project cares about and ensures those things", Adam Williamson said. However, he has experienced upstreams that have a different interpretation of "F/OSS" from his or Fedora's, so that path can be problematic.

Kevin Kofler complained that a package should not simply be retired; it should first be orphaned, which gives others an opportunity to package and maintain it going forward. But the idea of moving toward upstream binaries instead of Fedora packages is not a good one, he said:

IMHO, retiring a Fedora package in favor of an upstream binary of whatever kind (Flatpak, Snap, AppImage, RPM, binary tarball, whatever) is a major disservice to Fedora users and defeats the whole point of having a distribution to begin with.

But Patrick Griffis was concerned that orphaning the package would just lead to an outdated package that will degrade over time. "I think it is far more responsible, and respectful of users, to accept that some packages are better maintained elsewhere." Michael Catanzaro pointed out that an orphaned package will be automatically retired after a few weeks if no one takes it over; if someone does take it over, they will hopefully be able to maintain it well:

If the only problem with the package is the current Fedora maintainer isn't able to keep up with updates, as seems to be the case here, then orphaning gives a chance for somebody else to try to do better. Hopefully a new maintainer will only take the package if confident that they can keep it updated.

Flathub

The main source of Flatpaks that can be installed on any Linux distribution is Flathub, but it will probably come as no surprise that opinions vary on its quality. Jiri Eischmann said that he maintains packages for Fedora as well as Flatpaks for Flathub and has found that the "review to get an app to Flathub was as thorough as Fedora package review"; in fact, it was sometimes more strict. But Neal Gompa noted that some Flatpaks, notably Firefox and OBS Studio, do not have to follow those rules. Robert Marcano pointed out that many Flatpaks on Flathub are just repackaged from binaries built by other projects.

Eischmann acknowledged that Flathub allows exceptions, including allowing Mozilla to simply upload its Flatpaks directly to Flathub; he still sees value in the repository, though:

But Flathub is still a curated repo. If you want to deviate from standards you have to justify it, if you're doing something fishy your flatpak may be taken out. But [ultimately] you have to trust the author, but that applies to Fedora, too, just to lesser [extent].

The Firefox example is an interesting one, Williamson said, "because it's *exactly* a case where I trust the Fedora builds more than I trust upstream's". Mozilla has made some "sub-optimal choices in search of revenue", in his opinion, so he would prefer to get his Firefox from Fedora, which is not faced with the same quandary. Gompa noted that there are other differences in the Mozilla builds, like disabling address-space layout randomization (ASLR). "It's actually hard to figure out what upstreams are doing with their own builds, and sometimes they intentionally make it harder to figure it out."

Sandro was clearly surprised by the number of responses they received. It turns out that Sandro had only recently adopted Bottles after its last orphaning; they are not unwilling to turn it over to Walter but want to discuss it with the co-maintainers first. There is an open pull request to update Bottles to a version from October 2022, but since that time six more upstream releases have been made.

Part of the problem in keeping up with the Bottles upstream is its dependency on the Python orjson JSON library, which in turn depends on being able to build the Rust code inside it. "Maxwell G" said that he made a start on packaging orjson for Fedora but could not commit to maintaining the package; Ben Beasley volunteered to help. Fabio Valentini pointed out that if orjson had already been packaged for Fedora, the problem for Bottles would not have reared its head at all: "you'd probably not even notice that one of the many Python packages with 'native' modules in Bottles' dependency tree is actually implemented in Rust and not in C. :)"

It would seem that Sandro plans to bow out, but that others are likely to step up, so Bottles should have a future as a Fedora RPM. That does not resolve the disagreement with the Bottles upstream, however. As Maxwell G put it, the request to drop the RPM "feels inappropriate and somewhat antithetical to the tenets of OSS". For Bottles, at least, the situation seems more or less resolved at this point, with the upstream request being turned down. But this is not the last we will hear of the conflicts between Flatpaks and distribution-specific packages, especially now that unfiltered access to Flathub will be available in Fedora 38.

Comments (54 posted)

Git archive generation meets Hyrum's law

By Jonathan Corbet
February 2, 2023
On January 30, the GitHub blog carried a brief notice that the checksums of archives (such as tarballs) generated by the site had just changed. GitHub's engineers were seemingly unaware of the consequences of such a change — consequences that were immediately evident to anybody familiar with either packaging systems or Hyrum's law. Those checksums were widely depended on by build systems, which immediately broke when the change went live; the resulting impact of jawbones hitting the floor was heard worldwide. The change has been reverted for now, but it is worth looking at how GitHub managed to casually break vast numbers of build systems — and why this sort of change will almost certainly happen again.

One widely used GitHub feature is the ability to download an archive file of the state of the repository at an arbitrary commit; it is often used by build systems to obtain a specific release of a package of interest. Internally, this archive is created at request time by the git archive subcommand. Most build systems will compare the resulting archive against a separately stored checksum to be sure that the archive is as expected and has not been corrupted; if the checksum fails to match, the build will be aborted. So when the checksums of GitHub-generated tarballs abruptly changed, builds started failing.
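
The check itself is simple; the following is a rough sketch of what many packaging recipes effectively do (the URL and pinned digest are placeholders, not any particular project or build system):

    # Rough sketch of a build system's source-archive verification step;
    # the URL and the pinned digest below are placeholders.
    import hashlib
    import sys
    import urllib.request

    ARCHIVE_URL = "https://github.com/example/project/archive/refs/tags/v1.2.3.tar.gz"
    PINNED_SHA256 = "0123456789abcdef..."   # recorded when the dependency was added

    with urllib.request.urlopen(ARCHIVE_URL) as response:
        archive = response.read()

    digest = hashlib.sha256(archive).hexdigest()
    if digest != PINNED_SHA256:
        # This is the failure that suddenly appeared everywhere: the tarball's
        # contents were fine, but the compressed bytes (and so the hash) changed.
        sys.exit(f"checksum mismatch for {ARCHIVE_URL}: got {digest}")
    print("archive verified; proceeding with the build")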

Unsurprisingly, people started to complain. The initial response from GitHub employee (and major Git contributor) brian m. carlson was less than fully understanding:

I'm saying that policy has never been correct and we've never guaranteed stable checksums for archives, just like Git has never guaranteed that. I apologize that things are broken here and that there hasn't been clearer communication in the past on this, but our policy hasn't changed in over 4 years.

This answer, it might be said, was not received well. Wyatt Anderson, for example, said:

The collective amount of human effort it will take to break glass, recover broken build systems that are impacted by this change, and republish artifacts across entire software ecosystems could probably cure cancer. Please consider reverting this change as soon as possible.

The outcry grew louder, and it took about two hours for Matt Cooper (another GitHub employee) to announce that the change was being reverted — for now: "we're reverting the change, and we'll communicate better about such changes in the future (including timelines)". Builds resumed working, and peace reigned once again.

The source of the problem

The developers at GitHub did not wake up one morning and hatch a scheme to break large numbers of build systems; instead, all they did was upgrade the version of Git used internally. In June 2022, René Scharfe changed git archive to use an internal implementation of the gzip compression algorithm rather than invoking the gzip program separately. This change, which found its way into the Git 2.38 release, allowed Git to drop the gzip dependency, more easily support compression across operating systems, and compress the data with less CPU time.

It also caused git archive to compress files differently. While the uncompressed data is identical, the compressed form differs, so the checksum of the compressed data differs as well. Once this change landed on GitHub's production systems, the checksums for tarballs generated on the fly abruptly changed. GitHub backed out the change, either by reverting to an older Git or by explicitly configuring the use of the standalone gzip program, and the immediate problem went away.
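
The effect is easy to reproduce without Git at all; here is a small Python demonstration (standard library only) that two gzip streams of identical data need not hash the same, while the decompressed payload, and any hash taken over it, stays stable:

    # Compress the same data two different ways: the archives differ,
    # the contents do not.
    import gzip
    import hashlib

    payload = b"identical source tree contents\n" * 1000

    a = gzip.compress(payload, compresslevel=9, mtime=0)
    b = gzip.compress(payload, compresslevel=1, mtime=0)

    print(hashlib.sha256(a).hexdigest() == hashlib.sha256(b).hexdigest())  # False
    print(gzip.decompress(a) == gzip.decompress(b) == payload)             # True

    # Hashing the uncompressed payload instead gives a value that does not
    # depend on which compressor produced the archive.
    print(hashlib.sha256(gzip.decompress(a)).hexdigest() ==
          hashlib.sha256(gzip.decompress(b)).hexdigest())                  # True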

The resulting discussion on the Git mailing list has been relatively muted so far. Eli Schwartz started things off with a suggestion that Git should change its default back to using the external gzip program for now, then implement a "v2 archive format" using the internal compressor. Using a heuristic, git archive would always default to the older format for commits before some sort of cutoff date. That would ensure ongoing compatibility for older archives, but the idea of wiring that sort of heuristic into Git was not generally popular.

Ævar Arnfjörð Bjarmason, instead, suggested that the default could be changed to use the external gzip, retaining the internal implementation as an option or as a fallback should the external program not be found. The responsibility for output compatibility could then be shifted to the compression program: anybody who wants to ensure that their generated archive files do not change will have to ensure that their gzip does not change. Since the Git developers do not control that program, they cannot guarantee its forward compatibility in any case.

Carlson, though, argued for avoiding stability guarantees — especially implicit guarantees — if possible:

I made a change some years back to the archive format to fix the permissions on pax headers when extracted as files, and kernel.org was relying on that and broke. Linus yelled at me because of that.

Since then, I've been very opposed to us guaranteeing output format consistency without explicitly doing so. I had sent some patches before that I don't think ever got picked up that documented this explicitly. I very much don't want people to come to rely on our behaviour unless we explicitly guarantee it.

He went on to suggest that Git could guarantee the stability of the archive format in uncompressed form. That format would have to be versioned, though, since the SHA-256 transition, if and when it happens, will force changes in that format anyway (a claim that Bjarmason questioned). In general, carlson concluded, it may well become necessary for anybody who wants consistent results to decompress archive files before checking checksums. He later reiterated that, in his opinion, implementing a stable tar format is feasible, but adding compression is not: "I personally feel that's too hard to get right and am not planning on working on it".

Konstantin Ryabitsev said that, while he understands carlson's desire to avoid committing to an output format, "I also think it's one of those things that happen despite your best efforts to prevent it". He suggested adding a --stable option to git archive that was guaranteed to not change.

What next?

As of this writing, the Git community has not decided whether to make any changes as the result of this episode. Bjarmason argued that the Git community should accommodate the needs of its users, even if they came to depend on a feature that was never advertised as being stable:

That's unfortunate, and those people probably shouldn't have done that, but that's water under the bridge. I think it would be irresponsible to change the output willy-nilly at this point, especially when it seems rather easy to find some compromise everyone will be happy with.

He has since posted a patch set restoring the old behavior, but also documenting that this behavior could change in the future.

Committing to stability of this type is never a thing to be done lightly, though; such stability can be hard to maintain (especially when dealing with file formats defined by others) and can block other types of progress. For example, a replacement for gzip could yield better compression ratios with less CPU time; an inability to move beyond that algorithm would prevent Git from obtaining those benefits. Even if Git restores the use of an external gzip program by default, that program might, itself, change, or downstream users like GitHub may decide that they no longer want to support that format.

It would thus be unsurprising if this problem were to refuse to go away. The Git project is reluctant to add a stability guarantee to its maintenance load, and the same is true of its downstream users; GitHub has said that it would give some warning before a checksum change returns, but has not said that such a change would not happen. The developers and users of build systems may want to rethink their reliance on the specific compression format used by one proprietary service on the Internet. The next time problems turn up, they will not be able to say they haven't been warned.

Comments (93 posted)

Constant-time instructions and processor optimizations

By Jonathan Corbet
February 3, 2023
Of all the attacks on cryptographic code, timing attacks may be among the most insidious. An algorithm that appears to be coded correctly, perhaps even with a formal proof of its correctness, may be undermined by information leaked as the result of data-dependent timing differences. Both Arm and Intel have introduced modes that are intended to help defend against timing attacks, but the extent to which those modes should be used in the kernel is still under discussion.

Timing attacks

Timing attacks work by observing how much time is required to carry out an operation; if that time varies according to the data being operated on, it can be used to reconstruct that data. As a simplistic example, imagine a password-checking function that simply compares a provided string against a password stored in a (presumably) secure location. A logical implementation would be to start at the beginning, comparing characters, and return a failure status as soon as an incorrect character is found. That algorithm could be naively coded as:

    nchars = max(strlen(attempt), strlen(password));
    for (i = 0; i < nchars; i++)
        if (attempt[i] != password[i])
            return false;
    return true;

The time this check takes is thus a function of the number of correct characters at the beginning of the string. An attacker could use this information to reconstruct the password, one character at a time. Real-world timing attacks that can, for example, extract cryptographic keys have been demonstrated many times.

In response, security-oriented developers have learned to avoid data-dependent timing variations in their code. In the example above, the entire password string would be compared regardless of where the first wrong character is found. All of this careful work can be undermined, though, if the CPU this code runs on has timing artifacts of its own. It will surely come as a shock to LWN readers to learn that, in fact, CPUs do exhibit such behavior.
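
The usual source-level fix is to make the comparison do the same amount of work regardless of the data. Below is a minimal sketch of that pattern in Python; C implementations, such as the kernel's crypto_memneq(), are built around the same idea, and, as described below, even code written this way can still be betrayed by the hardware underneath:

    import hmac

    def constant_time_equal(a: bytes, b: bytes) -> bool:
        # The length itself is not treated as a secret here.
        if len(a) != len(b):
            return False
        diff = 0
        # OR together the XOR of every byte pair; there is no early return,
        # so the loop runs the same number of iterations no matter where
        # the first mismatch occurs.
        for x, y in zip(a, b):
            diff |= x ^ y
        return diff == 0

    # In practice, Python code should simply use the standard library's
    # hmac.compare_digest(), which exists for exactly this purpose.
    print(constant_time_equal(b"hunter2", b"hunter1"))   # False
    print(hmac.compare_digest(b"hunter2", b"hunter2"))   # True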

The CPU vendors are not unaware of this problem or its importance. But they are also not unaware of how beneficial some optimizations that introduce timing differences can be for certain benchmark results. The usual tension between security and performance objectives comes into play here, and the CPU vendors have taken the usual way out: make the users figure out which they want to sacrifice to gain the other.

Constant-time processor modes

Both Arm and Intel have thus introduced modes that guarantee that some instructions, at least, will execute in constant time regardless of what the operands are. In Arm's case, the mode is called Data Independent Timing (DIT); Intel calls its mode Data Operand Independent Timing (DOIT) mode, often called DOITM. Neither mode is enabled by default.

Back in August 2022, Eric Biggers asked whether these modes should be enabled by the kernel. As a result of that discussion, a patch by Ard Biesheuvel was merged to enable DIT for the arm64 architecture — but only when running in the kernel. DIT remains off for user space by default, but enabling it is an unprivileged (and cheap) operation on that architecture, so user-space developers can enable it easily when they feel the need. This feature will be part of the 6.2 kernel release; it has not, as of this writing, been backported to any stable updates.

The story for x86 is less clear. DOITM is controlled by a model-specific register (MSR) and cannot be changed by user space. The August discussion wound down despite an attempt by Biggers to restart it in October. He returned in January, asking once again whether DOITM should be enabled by default for x86 — and advocating that it should:

Cryptography algorithms require constant-time instructions to prevent side-channel attacks that recover cryptographic keys based on execution times. Therefore, without this CPU vulnerability mitigated, it's generally impossible to safely do cryptography on the latest Intel CPUs.

Dave Hansen was less enthusiastic about the idea, even though he concluded that it was "generally the right thing to do". He pointed to language in Intel's documentation stating that DOITM only adds value for code that was specifically written with timing differences in mind:

Translating from Intel-speak: Intel thinks that DOITM [is] purely a way to make the CPU run slower if you haven't already written code specifically to mitigate timing side channels. All pain, no gain.

The kernel as a whole is not written that way.

DOITM, he said, is only going to be useful for a small amount of carefully written cryptographic code in user space, and will only be a performance loss for everything else. He also noted that Intel explicitly warns that the performance impact of DOITM may be "significantly higher on future processors". The "Ice Lake" generation of processors is the first where DOITM makes any difference at all; constant-time operations are evidently the norm on earlier generations.

Biesheuvel argued that the value of DOITM extends beyond user-space cryptographic libraries, and that it is even more relevant to the kernel:

But for privileged execution, this should really be the other way around: the scope for optimizations relying on data dependent timing is exceedingly narrow in the kernel, because any data it processes must be assumed to be confidential by default (wrt user space), and it will probably be rather tricky to identify CPU bound workloads in the kernel where data dependent optimizations are guaranteed to be safe and result in a significant speedup.

Biggers questioned Hansen's focus on performance, saying that the kernel operations that benefit most from data-dependent optimizations are almost certainly the ones that most need protection, since they are the ones that will show the strongest timing differences. He concluded:

I think the real takeaway here is that the optimizations that Intel is apparently trying to introduce are a bad idea and not safe at all. To the extent that they exist at all, they should be an opt-in thing, not opt-out. The CPU gets that wrong, but Linux can flip that and do it right.

Hansen answered that the community has "looked at how bad the cure is compared to the disease for *every* one of these issues", referring to other types of hardware vulnerabilities. DOITM is different, he said, because it looks like the cure is likely to get worse over time rather than better; that makes it hard to come up with a reasonable policy in the kernel. He later added that, after discussions within Intel, he feels the kernel community should not jump to enable DOITM: "Suffice to say that DOITM was not designed to be turned on all the time. If software turns it on all the time, it won't accomplish what it was designed to do." Doing that, he said, would deprive systems of the "fancy new optimizations" that are coming in the future.

No conclusions have been reached this time either — at least, not yet. It has not helped that, so far, nobody has posted any benchmarks showing what the performance impact of DOITM is. Assuming that cost is not huge, though, it would be surprising if DOITM does not end up being enabled by default in the kernel, at least, with the ability for user space to enable it on demand. "Insecure by default" is rarely a way to impress users, after all.

Comments (132 posted)

A survey of free CAD systems

February 6, 2023

This article was contributed by Alexandre Prokoudine

Computer-aided design (CAD) software is expensive to develop, which is a good reason to appreciate the existing free and open-source alternatives to some of the big names in the industry. This article takes a bird's-eye view of free and open-source software for 2D drafting and 3D parametric solid modeling, its progress over the years, and its wins and ongoing challenges.

FreeCAD

FreeCAD, distributed under LGPLv2+, can incorporate virtually any set of tools related to design and engineering. The program is organized into workbenches — modes that provide job-specific tools to work with 2D or 3D geometry and its metadata. You can start with a 2D sketch of a part, complete it in 3D, do the finite element analysis, then create project documentation for submission, all by moving between workbenches and without ever leaving the program. That has been the project's goal from its inception at Daimler in 2001.

[FreeCAD screenshot] Drafting in 2D with FreeCAD is a little confusing as there are multiple workbenches that have 2D geometry tools. The choice of a workbench depends on the type of the project being worked on. The Sketcher workbench is best used when one has a 3D project in mind that starts as a 2D sketch. The Draft workbench is better suited for use cases that stick with 2D (e.g. floor plans). In addition, workbenches like BIM (building information modeling) also provide tools for 2D sketching. Creating documentation (placing views onto a sheet of paper) requires yet another workbench called TechDraw. There are a lot of additional workbenches that can be installed separately via a built-in add-ons manager. They are all tailored for specific tasks like designing boat hulls, working with sheet metal designs, etc.

FreeCAD is a capable system overall, good enough to have become the haven for former Fusion 360 users who jumped ship after the licensing policy change in late 2020. However, the project has its struggles. Users commonly cite a difficult learning curve, performance issues, crashes, and various glitches as major obstacles.

There are various attempts at fixing some of the worst problems; several developers who contribute to that effort rely on funding from the community, usually via Patreon. In early 2022, the project created the FreeCAD Project Association (FPA), a Belgium-based non-profit organization that holds various assets (such as domain names), serves as the project's representative in conversations with institutions, and issues grants to contributors.

QCAD

QCAD is a 2D CAD program with a focus on drafting without constraints. The project was started by Andrew Mustun in 1999 as a computer-aided manufacturing (CAM) tool. Because the tool is mature and 2D-only, changes over the past few years have come at a seemingly glacial pace. The program has a large set of design and modification tools and is currently popular with do-it-yourself people and tinkerers of all kinds. It's also extensible using ECMAScript, and that section of the forum is one of the more active ones.

[QCAD screenshot] Although the user interface is organized differently from that of other, similar programs in the industry, most features in the program are readily discoverable, so QCAD is easy to get started with. The program is stable; only eight crash bugs are marked as not fixed in the tracker (out of 481 open reports and feature requests overall), and none of them is older than two years. Mustun keeps a pretty tight grip on this.

As a project, QCAD is different from every other free/libre CAD program: there is a community version (GPLv3+), a commercial version with extra plugins like support for the proprietary DWG format, and a separate commercial version with a CAM module and support for nesting. A lot of new advanced features end up only in the proprietary editions. It's hard to say whether the project is sustainable in financial terms since it definitely is a one-man band; third-party contributions of code are rather scarce. However, QCAD has been around for over 20 years, releases are frequent, and the community is active.

LibreCAD

LibreCAD started off as a fork of QCAD v2.x in 2010. Originally, Ries van Twisk was just planning to create a CAM module for QCAD to use it with a Mechmate CNC router. However, at the time, QCAD was based on Qt3, which was rapidly becoming obsolete, and the availability of the community edition (under GPLv2) was not certain. So Van Twisk acted on this concern and did the Qt3-to-Qt4 port.

Just like the original project, LibreCAD focuses on drafting features and is commonly used as a generic 2D CAD program. Because QCAD has received a substantial internal redesign and a license change from GPLv2-only to GPLv3+ since work on LibreCAD started, LibreCAD v2.x continues to offer roughly the feature set of QCAD v2.x, but not that of newer QCAD versions.

In 2014, the team decided to rewrite the geometry kernel of the program for legal and technical reasons. The original QCAD v2.x kernel code was GPLv2-only, which prevented a change to GPLv3+, and they considered the code itself not good enough to justify adding new features. The v3 branch has been in development since then, mostly by Google Summer of Code (GSoC) students. It has a new geometry calculations kernel, hardware-accelerated rendering, an updated, Ribbon-like user interface, and more changes. However, at this time, it's far from being in a releasable state.

The project has been struggling with attracting regular active contributors for some years now. Only one GSoC student has stuck with the project to become a v3 branch maintainer and a GSoC mentor himself. At the same time, LibreCAD lost two of its most prolific contributors, Van Twisk and Rallaz. On a few occasions, LibreCAD contributors have stated that they are not ready for paid work on the program. But they do accept donations.

The current plan is to make just a few more releases in the v2.2.x series to include more contributions, then switch the focus to v3 entirely. The team is also hoping to attract more attention to LibreCAD v3 by starting to provide regular CI builds, which was the focus of their 2022 GSoC project.

SolveSpace

Originally a proprietary program with a rather unusual user interface, SolveSpace became free software in 2013. It is mainly a 3D drafting program that can be used in a 2D context to some extent. The entire project has been built around a sophisticated constraints solver.

Unlike QCAD, SolveSpace doesn't have dozens of 2D drafting tools. However, the combination of basic design tools (line, circle, rectangle, spline) and parametric modeling with constraints makes it powerful enough to design various parts (with NURBS, no less), which is why SolveSpace is commonly used for mechanical design of simple parts.

It's hard to beat the summary of SolveSpace recently written by user "somat" at Hacker News:

It feels like the program has captured the pure essence of parametric modeling and put it in program form. Lightweight and nimble, nothing extra, just plain fun to use. Until your models start exploding, which happens often in SolveSpace.

The two most problematic parts of SolveSpace, however, are the NURBS implementation and the user interface — 10% and 17% of all reported bugs and feature requests have been filed against them, respectively.

There is no formal roadmap in the project, but the v4.0 milestone page provides some insight into the project's future plans. There is also an experimental port of SolveSpace to WebAssembly via Emscripten available in the main development branch, although the goal to make the program work on the web is not being actively pursued.

Development is driven purely by fun, the current team does not even accept donations, and there is no formal organization such as a non-profit foundation.

CAD Sketcher for Blender

This is a relatively new project designed and built as a Blender add-on. CAD Sketcher makes use of SolveSpace's constraints solver (its port to Python called py-slvs, in fact).

CAD Sketcher was designed to improve Blender for technical use cases rather than turn it into a full-fledged CAD application. CAD Sketcher uses the main viewport in Blender to create 2D sketches on the user's plane of choice. The add-on supports drawing geometric shapes (lines, rectangles, circles, arcs), setting various constraints, creating bevels, etc. The resulting draft can then be used to create a 3D model of the design. Plugging into a vibrant, active project like Blender has certain benefits from a developer's standpoint: it's easy to both create the user interface and distribute the program (installing a Blender add-on is as easy as loading a ZIP archive from the Preferences dialog).

Like some other free/libre CAD software projects, CAD Sketcher is mostly a one-man band, where one person does the vast majority of development. On the other hand, given the project's intentionally limited scope, this appears to work for the community. According to the project's roadmap, CAD Sketcher v1.0 is not too far off, and there's no shortage of ideas for future work. Development of the project is currently financed to some extent by selling the plugin on Gumroad. The project is maintained in collaboration with the MakerTales YouTube channel, which has over 70,000 subscribers.

OpenSCAD

[OpenSCAD screenshot] OpenSCAD developers identify it as "The programmers solid 3D CAD modeler". Instructions are written in a simple programming language; the OpenSCAD compiler then reads that script, renders a 3D model, and exports it to STL (a 3D-printable file format) or sends it to OctoPrint (a web interface for 3D printers). You can learn more from an introduction to OpenSCAD at LWN.

The project was started in 2010 by Marius Kintel after he got involved with the RepRap project and felt the need for a design tool more suitable for 3D printing than what was available at the time. Over the years, OpenSCAD has become popular with the maker community, especially with users of platforms like Thingiverse.

Reviewers commonly highlight the benefits of OpenSCAD, such as making designs inherently parametric (which also makes it easy to create variations of a design) and being better suited to users who come to modeling with a programming background. However, most 3D CAD programs today are parametric and many (even free/libre ones) allow modeling with code in languages that have a wider use than OpenSCAD's own scripting language. So it's up to the user to decide whether the built-in scripting language is expressive enough for them.

The project's activity is currently lower than in its most active years (2014-2015); however, there are several active contributors, and the project is steadily moving toward the next release (the latest was in February 2021). There's also an online version of OpenSCAD, based on Dominick Schroer's port of OpenSCAD to WebAssembly.

BRL-CAD

[BRL-CAD screenshot] BRL-CAD is a 3D CAD system with support for both constructive solid geometry (CSG) and boundary representation (B-REP). The project originated in the US Army Ballistic Research Laboratory (hence BRL) in 1979, became an open-source project in 2004, and is still managed by US Army employees.

BRL has shaped the feature set and the user interface quite a lot. Here is one of the BRL-CAD developers providing his insight on the subject at Hacker News:

BRL-CAD has had more than 450 years of full-time effort invested, tens of millions with development spanning over four decades. However, that investment is heavily centered around features, integrations, and capabilities that are not as typically useful to the general public. [...] Primary paid focus is military vulnerability and lethality analyses where BRL-CAD is absolutely unparalleled... Still, general usability is not funded and is left to the auspices of the open source community.

This is probably the main reason why BRL-CAD is not popular in the industry despite its long track record. Nevertheless, the team is concerned with how usable the program is for non-army users. They've been participating in GSoC for many years now, and most of their GSoC projects focus on modernizing both the internals and the user interface. Arbalest, the new Qt-based user interface, is already a multi-year project primarily developed via the GSoC program.

The core BRL-CAD developers are currently funded by the US government, but there are also some unpaid active contributors.

Collaboration between projects

BRL-CAD has been serving as an umbrella organization in GSoC for several FOSS CAD projects since 2013. In 2014, BRL-CAD developers registered The OpenCAx Association, a US-based 501(c)(3) non-profit organization. BRL-CAD, LibreCAD, OpenSCAD, FreeCAD, STEPcode, and Slic3r are all members of the association. Several other projects, such as IfcOpenShell, LinuxCNC, and KiCad, are not (yet) part of it, although they participated in GSoC under the BRL-CAD umbrella in various years.

So far, there is not much code sharing between these projects. There is some potential in using the geometry kernel of BRL-CAD elsewhere, but this would, at the very least, require an entirely new public API. There is some ongoing work on that API: the prototype known as Modular Object-Oriented Solidity Engine, or MOOSE, is already being developed.

Another obvious project where collaboration could happen is STEPcode, a set of ISO 10303-conformant schemas and classes for reading and writing STEP files. BRL-CAD developers are also reportedly interested in creating "reusable geometry conversion infrastructure which includes AP242 and a couple dozen other formats".

Interestingly, two more projects that look like an obvious fit for the association — SolveSpace and libredwg — are not part of it. SolveSpace's constraints solver is now used in multiple projects, including FreeCAD's Assembly3 workbench and CAD Sketcher. And libredwg, also used by multiple projects, is a library for reading and writing the proprietary DWG format.

Summary

It's hard to overestimate just how important funding is in the CAD area; developing a sophisticated CAD program is expensive. It isn't surprising that FOSS projects with the best track record in terms of steady progress and regular releases are also projects that have funding more or less figured out. Overall, 2D/3D CAD projects appear to be moving toward better organization, even though there are still many improvements to make there.

While there's not much collaboration between projects apart from participating in GSoC together, it's interesting that both FreeCAD and Blender appear to be successfully building ecosystems so that new major features can be developed as add-ons and plugged into a larger host application. This simplifies both development and distribution.

One common weak spot is user interfaces. Drafting and solid modeling are surprisingly conservative fields; most of the available free/libre CAD programs have a 1990s/2000s user interface, which is both a blessing and a curse. There's the familiarity aspect that is hard to deny, but also not a lot of original research to find better ways to accomplish tasks.

The difference becomes especially apparent when you look at newer projects like Plasticity. It's a 2D/3D CAD program (LGPL, but based on a proprietary geometry-computation kernel) written mostly in TypeScript. The approach to UX design follows modern patterns, and the amount of praise the program gets from experienced users, even at the current alpha stage, shows just how much interest in a shake-up has built up in the community of engineers and tinkerers. This is something that developers of other existing projects will have to deal with one way or another.

Comments (16 posted)

Page editor: Jonathan Corbet

Inside this week's LWN.net Weekly Edition

  • Briefs: OpenSSH 9.2; Rustproofing Linux; glibc 2.37; LibreOffice 7.5; Rust Vulkan drivers; Quotes; ...
  • Announcements: Newsletters, conferences, security updates, patches, and more.

Copyright © 2023, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds