LWN.net Weekly Edition for March 5, 2026
Welcome to the LWN.net Weekly Edition for March 5, 2026
This edition contains the following feature content:
- The troubles with Boolean inversion in Python: the long-running debate over the behavior of Python's bitwise-inversion operator continues.
- The ongoing quest for atomic buffered writes: a solution for atomic buffered I/O may be coming into focus.
- The exploitation paradox in open source: Richard Fontana's talk about the history of attempting to close "loopholes" in open-source licenses, and ideas on keeping freedom alive.
- Magit and Majutsu: discoverable version-control: a look at Emacs interfaces for working with Git and Jujutsu.
- IIIF: images and visual presentations for the web: standards for serving, displaying, and reusing image data online.
- Free software needs free tools: Jan Ainali makes the case that open-source projects should be using open tools.
This week's edition also includes these inner pages:
- Brief items: Brief news items from throughout the community.
- Announcements: Newsletters, conferences, security updates, patches, and more.
Please enjoy this week's edition, and, as always, thank you for supporting LWN.net.
The troubles with Boolean inversion in Python
The Python bitwise-inversion (or complement) operator, "~", behaves pretty much as expected when it is applied to integers—it toggles every bit, from one to zero and vice versa. It might be expected that applying the operator to a non-integer, a bool for example, would raise a TypeError, but, because the bool type is really an int in disguise, the complement operator is allowed, at least for now. For nearly 15 years (and perhaps longer), there have been discussions about the oddity of that behavior and whether it should be changed. Eventually, that resulted in the "feature" being deprecated, producing a warning, with removal slated for Python 3.16 (due October 2027). That has led to some reconsideration and the deprecation may itself be deprecated.
The problem was reported in 2011 by Matt Joiner, who was surprised by the outcome of some tests that he ran:
>>> bool(~True)
True
>>> bool(~False)
True
>>> bool(~~False)
False
>>> ~True, ~~True, ~False, ~~False
(-2, 1, -1, 0)
That last example demonstrates how those unexpected results came about: True is effectively just an alias for one and False is zero. When those values are inverted, they do not really act in a Boolean kind of way. In Python, any non-zero value is treated as true in a Boolean sense, and the complement of one is -2, both of which evaluate to true. Python defines bitwise operations on its integers as if they used a two's-complement representation, so ~x is equivalent to -(x+1).
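To make the behavior concrete, here is a short interactive session (run on any recent Python 3) contrasting the bitwise complement with the logical not operator that confused users are usually looking for:
>>> x = True
>>> ~x             # bitwise complement of the underlying int 1, i.e. -(1 + 1)
-2
>>> not x          # logical negation, which is usually what is wanted
False
>>> ~int(x)        # the same complement, made explicit
-2
>>> bool(~x)       # -2 is non-zero, so it still evaluates as true
True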
History
The bool type, True, and False were not added to the language until Python 2.3 in 2003, though the feature was infamously backported to the 2.2.1 bug-fix release prior to 2.3. PEP 285 ("Adding a bool type") described the feature in some detail; it is clear that using an integer value was done purposefully, for backward compatibility, at least in part. The PEP abstract explains:
The bool type would be a straightforward subtype (in C) of the int type, and the values False and True would behave like 0 and 1 in most respects (for example, False==0 and True==1 would be true) [...]
The author of the PEP, Guido van Rossum, was the Python benevolent dictator for life (BDFL) at the time; the Review section of the PEP kind of foreshadows the problems that led him to step down from that role 16 years later:
I've collected enough feedback to last me a lifetime, so I declare the review period officially OVER. I had Chinese food today; my fortune cookie said "Strong and bitter words indicate a weak cause." It reminded me of some of the posts against this PEP... :-)
The PEP was silent about applying the complement operator to bool values, but the implementation allowed it. Joiner filed the bug in 2011 because he went looking for a C-like unary not operator ("!"), which is not present in the language, and ran into "~" instead. As Amaury Forgeot d'Arc pointed out, the logical not operator is what Joiner was seeking. The bug was closed the day after it was opened, because the behavior was deliberate.
But the problematic behavior popped up again in a 2019 bug report from Tomer Vromen, who noted that the bitwise and ("&") and or ("|") operators act as expected (i.e. like the logical equivalents), while complement does not. In fact, the bitwise versions of the and/or operators return a bool result when both operands are bools, while "~True" returns the int -2 (and not True, as the integer could be interpreted, or even False, as the caller might expect). The bug report linked to a fairly lengthy python-ideas thread from 2016 that also discussed the problem. Both the bug and the thread noted that NumPy has a Boolean type that behaves as expected (at least by some) and returns False for "~numpy.bool_(True)".
In the thread, Van Rossum seemed to lean toward changing the behavior, but wanted to do it with a quick change for Python 3.6, skipping a deprecation cycle, or not at all. Python behavior seems fairly inconsistent, as he described:
To be more precise, there are some "arithmetic" operations (+, -, *, /, **) and they all treat bools as ints and always return ints; there are also some "bitwise" operations (&, |, ^, ~) and they should all treat bools as bools and return a bool. Currently the only exception to this idea is that ~ returns an int, so the proposal is to fix that.
More recently
The idea seems to have just died out in 2016, and again in 2019, but was resurrected by Tim Hoffmann in a 2022 comment on the 2019 bug report. He proposed that ~ be deprecated for the bool type, which Van Rossum endorsed, suggesting that the deprecation be added for the then-upcoming 3.12 release. Earlier, Van Rossum clearly did not want to change the type of the result of ~bool to be a bool:
Because bool is embedded in int, it's okay to return a bool value that compares equal to the int from the corresponding int operation. Code that accepts ints and is passed bools will continue to work. But if we were to make ~b return not b, that makes bool not embedded in int (for the sake of numeric operations). Take for example
def f(a: int) -> int: return ~a
I don't think it's a good idea to make f(0) != f(False).
In 2022, though, he was in favor of deprecating the use of the complement operator on bool values, rather than switching to a bool return type for complement. In the discussions about the behavior over the years, the main downside to it is that it can be confusing to users and that there is seemingly no real use case for it. For those who do end up getting confused, it is clearly not the right tool for the job, but the fact that NumPy and other libraries have normalized using bitwise complement to mean not muddies the waters.
The deprecation warning was duly added to Python 3.12 in 2023 via a pull request from Hoffmann. It gives a lengthy explanation when the warning is issued:
DeprecationWarning: Bitwise inversion '~' on bool is deprecated and will be removed in Python 3.16. This returns the bitwise inversion of the underlying int object and is usually not what you expect from negating a bool. Use the 'not' operator for boolean negation or ~int(x) if you really want the bitwise inversion of the underlying int.
One of the problems with deprecations is the visibility of the warnings; at various points, the DeprecationWarning exception was hidden by default because it too often was only seen by end users who were unable to fix the underlying problem. That changed back in 2017 to increase the visibility of the warnings, in part so that users could request fixes from library developers—deprecation in Python pops up fairly frequently in discussions about development of the language.
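For illustration, the warning shows up like this in an interactive session on Python 3.12 or later (output abbreviated, and the exact formatting varies between versions); a test suite can also promote it to an error so that uses are caught early:
>>> ~True          # interactive sessions show DeprecationWarning by default
<stdin>:1: DeprecationWarning: Bitwise inversion '~' on bool is deprecated ...
-2
>>> import warnings
>>> warnings.filterwarnings("error", category=DeprecationWarning)
>>> ~True          # now the warning is raised as an exception
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
DeprecationWarning: Bitwise inversion '~' on bool is deprecated and will be removed in Python 3.16. ...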
In August 2024, though, Barry Warsaw saw a GitHub email notification about the deprecation, which surprised him because he could not remember a wider discussion about it. He posted to the core development category to have that discussion, but he also wanted to talk about changes like this that can sometimes fly under the radar, so he started a parallel discussion as well. The question of "change visibility" seemed to reach a consensus that there was a problem in need of addressing, but there was less clarity on what might be done. Too much bureaucracy, in the form of PEPs or a more formalized change-management process, may negatively impact contributions, which largely come from volunteers; too little can lead to surprises like the deprecation of ~bool.
On the question of whether it should be deprecated at all, no real consensus was found, which has been the case throughout its history; some were strongly pro-deprecation because it is confusing and generally a footgun, while others lamented the inconsistency of only disallowing bitwise complement for the bool type and allowing all of the other arithmetic and bitwise operators.
Oscar Benjamin noted that "use of ~ for logical negation is widespread" in NumPy and SymPy. Antoine Pitrou pointed out that this is because ~ can be overridden, unlike the logical not. Benjamin agreed, saying that PEP 335 ("Overloadable Boolean Operators") would have allowed NumPy and SymPy to take a different path, but it was eventually rejected in 2012. Neither Benjamin nor Pitrou thought ~bool was particularly useful; both were in favor of deprecation.
On the flipside, Bjorn Martinsson provided some examples of how he uses ~ on Boolean values. They are probably kind of obscure, but he has even publicized a use of the technique. A few others popped up in the thread with use cases as well.
Hoffmann summarized the arguments that led him to propose the deprecation and author the code change to effect it. Since he believed it made sense to rid the language of this footgun, only two paths presented themselves: changing the behavior to a logical negation or deprecating and eventually removing ~bool. He saw no good migration path for switching to negation, though, so he opted for deprecation. The discussion continued on for another month or so before winding down without any firm conclusion. There was talk of a PEP, but that did not come about either.
The thread sparked up again in October 2025 and Hoffmann responded to a query about the PEP, pointing to the bug discussion and his summary earlier in the thread. At that time, Tim Peters also posted about a change that he had to make to his code because of the deprecation; he thought it was far too late in the history of the language to be making breaking changes of that sort:
All computer languages have quirks. Python is, IMO, too mature and widely used now to risk changing much of any visible behaviors, short of screaming bugs, or (but less compellingly so) accidents of implementation that were never documented as "advertised" behavior. There's nothing surprising about ~bool to people who learn the language. bool is a subclass of int in Python, period. I don't give a hoot how it works in other languages. The time for that kind of argument was when Python's semantics were first crafted. It's too late for that now.
The present
Things went quiet again until mid-February 2026, when Hayden Welch posted a concern, but had misinterpreted what was being deprecated. It led to more discussion, naturally, much of it between Hoffmann and Peters, along with a reminder from Stefan Pochmann about his use case. That caused Hoffmann to start a parallel thread to gather real-world impacts of the deprecation, which currently just has a link to Pochmann's use case and a brief mention of the deprecation (or, really, someday elimination) of ~bool being a violation of the Liskov substitution principle (which had also come up elsewhere in the discussions). Essentially, if bool is to be a subtype of int, it has to be able to be used wherever an int can be and ~ surely qualifies.
In the main thread, though, Van Rossum said that the discussion made him cry. "The inconsistency of disallowing ~x when x is a bool while allowing it when x is an int trumps the lack of a use case here." That was, of course, a complete reversal of his earlier endorsement of the deprecation, and also different from his 2016 advocacy of a quick switch to a Boolean result for ~bool. In another message, he confirmed the reversal:
Right, I've changed my mind. Or maybe I wasn't thinking far enough ahead at the time. I would be okay if ~b where b is statically typed as bool might trigger a warning in linters or static type checkers.
Around the time of Van Rossum's change of heart, the thread seems to have picked back up, at least for a bit. In response to Peters's argument that people mistakenly using ~ for logical not are terribly confused, "H. Vetinari" claimed that they were not, since NumPy and the like have popularized the idea, but "that it only works for arrays". Peters was strongly convinced that the NumPy model would not be good for Python as a whole to follow, however. For one thing, it works on more than just arrays, "but the conceptual model is baffling". He provided a number of examples showing how NumPy is internally inconsistent in its handling of its bool type.
Everything Python does follows from that bool is a subclass of int. That's all you have to remember. numpy's bool stands as unique in its type system, and is not even "a numeric type" there - although various operations' special cases make it act like one in various ad hoc ways. It's simply incoherent, a grab-bag of special cases. The core language shouldn't budge the width of an electron to try to cater to any such stuff.
Matthew Barnett raised the seeming oddity of bitwise & and | returning a bool result, while ~ does not; that was inconsistent in his eyes, as it was in plenty of others' along the way. James Dow largely or completely demolished that argument with extensive references to the language documentation. The language reference pretty clearly shows that the existing behavior is required; an implementation is not actually Python without allowing ~bool. Since bool is an int, the bitwise and/or operators are consistent as well: "True | False must return an integer with a value of 1 (which True is) and True & False must return an integer with a value of 0 (which False is)." Tom Fryers also had a lengthy explanation that showed why the Liskov substitution principle matters, and that real breakage results from deprecating the ~bool operation, even though that operation is perhaps weird and unlikely.
Hoffmann seems amenable to reversing course on the deprecation. In the abstract, that should be easy enough to do; code that changed due to the warning will continue to function just fine if the warning goes away. It is not entirely clear how a decision like that would be made, but one guesses the steering council will be brought in at some point to make a pronouncement. There is no huge rush, at least until the time comes to turn the warning into an exception, which is a year or more off at this point.
Overall, the mood seems to be shifting away from deprecation. Using inversion on a bool is a bit of a dark corner of the language, for sure, and it may have been a mistake not to create a separate Boolean type; certainly some in the discussions believe so. The confusion comes to those who think the language does have a separate Boolean type, and it would be nice to find a way to warn them, but removing the feature altogether seems like a step too far.
The long journey for ~bool is probably not over, but perhaps some kind of ending will come before long. This episode demonstrates a number of aspects of the Python development process over the years, from its more freewheeling days 20 or more years ago through its more stodgy aspect these days. Throughout, we see the general cordiality and collegial nature of its discussions; one suspects we have not seen the last of this odd corner of the language, but that further discussion or development will proceed along the same genial lines. Both the language and the community are rather mature at this point—and it shows.
The ongoing quest for atomic buffered writes
There are many applications that need to be able to write multi-block chunks of data to disk with the assurance that the operation will either complete successfully or fail altogether — that the write will not be partially completed (or "torn"), in other words. For years, kernel developers have worked on providing atomic writes as a way of satisfying that need; see, for example, sessions from the Linux Storage, Filesystem, Memory Management, and BPF (LSFMM+BPF) Summit from 2023, 2024, and 2025 (twice). While atomic direct I/O is now supported by some filesystems, atomic buffered I/O still is not. Filling that gap seems certain to be a 2026 LSFMM+BPF topic but, thanks to an early discussion, the shape of a solution might already be coming into focus.
Pankaj Raghav started that discussion on February 13, noting that both ext4 and XFS now have support for atomic writes when direct I/O is in use, but that supporting atomic buffered I/O "remains a contentious topic". There are a couple of outstanding proposals to add this feature: this 2024 series from John Garry and a more recent patch set from Ojaswin Mujoo. These proposals have stalled, partly out of concern about the amount of complexity added to the I/O paths, and partly because of questions about whether there is really a need for atomic buffered writes.
A frequently mentioned potential user for this feature is the PostgreSQL database which, unlike many other database managers, uses buffered I/O. The PostgreSQL code often has to go out of its way to ensure that partial I/O operations do not corrupt the database, sometimes at a cost to performance. PostgreSQL is an important user, but not all developers are convinced that atomic buffered writes are the solution to its problems; Christoph Hellwig, for example, commented: "I think a better session would be how we can help postgres to move off buffered I/O instead of adding more special cases for them."
PostgreSQL developer Andres Freund responded that the project is indeed working on adding direct-I/O support, but its performance has not yet reached the level of the buffered-I/O method. But, he said, direct I/O will only ever be useful for some larger installations. Smaller systems, or those where the database is running as part of a larger application with its own memory needs, will still do better in a buffered-I/O setup where the kernel can manage the allocation of memory. Even when direct I/O becomes competitive as an option for PostgreSQL, he said, "well over 50% of users" will not be able to benefit from it. Most of the developers in the conversation seem to accept that there is a legitimate use case for atomic buffered I/O, though Hellwig remains a holdout.
An agreement that a solution would be nice to have does not, itself, create a solution, though. Atomic direct I/O was a complex problem to solve, requiring the kernel to keep I/O requests together all the way through to the eventual storage device. Buffered I/O adds complexity, since those operations have to go through the page cache, and the actual write operation is normally carried out at a different time, when the kernel gets around to it. Tracking atomicity requirements through the kernel in this way and preventing multiple operations from interfering with each other are not simple tasks.
Early in the discussion Mujoo suggested that one possible solution might be to use writethrough semantics for atomic buffered writes. In other words, when user space initiates a buffered write requesting atomic behavior (which would be done using pwritev2() with the RWF_ATOMIC flag), the kernel would immediately initiate the process of writing that data to disk. That would allow creating a short-term pin to keep the pages in memory (it is hard to do an atomic write if one of the pages full of data is pushed out to swap in the middle of the operation) and would let the kernel prevent any other changes to those pages while the operation is underway. There would be no need to find a way to track atomic writes for dirty data that is sitting in the page cache.
Jan Kara agreed that writethrough behavior could be interesting. It would allow much of the existing direct-I/O infrastructure to be reused, he said, making the solution much simpler. The real question, he said, was whether writethrough behavior would be useful for PostgreSQL. Freund answered that writethrough would indeed be useful, even in the absence of atomic behavior. He suggested implementing it by requiring that atomic buffered writes include a new RWF_WRITETHROUGH flag along with RWF_ATOMIC; that way, if the kernel ever implemented atomic buffered writes without writethrough, there would not be a behavior change seen by user space.
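As a rough sketch of what such an interface might look like from user space, the fragment below issues a write with the existing RWF_ATOMIC flag plus a hypothetical RWF_WRITETHROUGH flag; no such flag exists today, its value here is made up, and current kernels would reject the combination for buffered files:
import os

RWF_ATOMIC = 0x40            # from linux/fs.h; kernel 6.11+, direct I/O only so far
RWF_WRITETHROUGH = 0x1000    # hypothetical placeholder for the proposed flag

def atomic_buffered_write(fd: int, data: bytes, offset: int) -> int:
    # Ask for an untorn write: either all of data reaches the file, or none of it.
    # The writethrough flag would tell the kernel to start writeback immediately.
    return os.pwritev(fd, [data], offset, RWF_ATOMIC | RWF_WRITETHROUGH)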
Raghav asked about the difference between the proposed RWF_WRITETHROUGH flag and the existing RWF_DSYNC, saying that the former might (like most buffered writes) be asynchronous, while the latter is synchronous. Dave Chinner disagreed with that interpretation, though, saying that writethrough behavior is inherently synchronous so that errors can be immediately reported. The way to get asynchronous behavior, he said, is to use the asynchronous-I/O interface or io_uring. But RWF_WRITETHROUGH itself, he said, should behave identically to direct-I/O writes, allowing the existing I/O paths to be used to implement it. RWF_DSYNC, he said, would still be different in that it forces the storage device to commit the data to persistent media, while RWF_WRITETHROUGH would not take that extra step (meaning that data could remain in the device's write cache).
In an attempt to summarize the discussion, Raghav posted this set of proposed conclusions; the first step would be to implement the proposed writethrough behavior with immediate initiation of the requested operation. Writethrough alone, though, does not guarantee atomic behavior, so there will be more to be done. The next step will be to ensure that the data being written is not modified while the operation is underway. Fortunately, the kernel has long had a mechanism, stable pages, that can be brought into play here. By preventing modifications to a buffer that is being written, the kernel can prevent the data from being corrupted.
Later steps will include taking care to copy the full data range into the page cache before beginning the operation, and to make sure that the buffer is written in a single, atomic operation. There will inevitably be other details to deal with, such as specifying and enforcing alignment requirements for buffers used with atomic writes. But it would appear that the path toward atomic buffered writes is starting to become more clear. It shouldn't take more than another half-dozen or so LSFMM+BPF sessions before the problem is fully solved.
The exploitation paradox in open source
The free and open-source software (FOSS) movements have always been about giving freedom and power to individuals and organizations; throughout that history, though, there have also been actors trying to exploit FOSS to their own advantage. At Configuration Management Camp (CfgMgmtCamp) 2026 in Ghent, Belgium, Richard Fontana described the "exploitation paradox" of open source: the recurring pattern of crises when actors exploit loopholes to restrict freedoms or gain the upper hand over others in the community. He also talked about the attempts to close those loopholes as well as the need to look beyond licenses as a means of keeping freedom alive.
Fontana is a lawyer who is well-known as an expert on FOSS licenses. He has worked for Red Hat for much of his career, and now works directly for IBM since it absorbed Red Hat's legal department in early 2026. He said that this would be an unusual talk for CfgMgmtCamp, as it was not about configuration management—though he had provided legal support to people working on related projects such as Ansible and Foreman. He would not be speaking for Red Hat or IBM in his talk, however, though he said it did draw on his work experiences over the years. "I'm on vacation, seriously. I wanted to go to Ghent".
Infrastructure and freedoms
He said that he might look at open source differently than many in the audience, and that he had been struck by how there were periodic crises and disagreements related to "legal stuff going wrong". These periodic flashpoints are not totally random, he said; they have underlying features in common. The thing that varies over time is what he called the infrastructure. "I don't mean like 'servers', I mean the current state of play that software is situated in", from a technical, cultural, and social perspective. Basically, everything that shapes where power concentrates and how freedom can be exercised.
Our definitions of freedom are anchored to an earlier technological world, he said. For example, the Free Software Foundation's four essential freedoms (the ability to run, study, modify, and share software) all relate to the early days of software development. There is also "the other normative definition that doesn't use the word freedom", the Open Source Definition (OSD) by the Open Source Initiative (OSI). Those definitions can be thought of as sort of a constitutional foundation for open source.
Fontana observed that the "state of play that software is situated in", everything that is relevant from a technical, social, economic, and business perspective, keeps evolving. Each time that it does, there are new tensions and power dynamics that pop up; but the definitions that underlie our understanding of free software and open source stay the same. They have not been revised to change with the times. This is in part because the gatekeepers for those licenses ("and I've been one of these gatekeepers in the past") do not want to revise the definitions. In a sense, he said, open source is a conservative domain because it is tied to unchanging definitions even while other conditions do change.
When infrastructure changes, there are new opportunities to exploit open source—to exercise power, to create new business models, to make a profit—that did not exist previously. When that happens, people tend to reach for legal fixes to address the exploit, which in turn can create new control points. To illustrate, Fontana said he would walk through some of the history of open source to give examples, beginning with the first flashpoint: the invention of copyleft.
Copyright and copyleft
Originally, developers were able to share code because it was not obvious that copyright even applied to software. "All software was inherently free. It was a commons." And then it became clear in the late 1970s that copyright did apply to software after all. That was an infrastructure shift that made it possible to exert control over software by stopping people from making and distributing modifications to it.
Copyleft, in the form of the GPL, was a response to that new control point. "It, famously, uses copyright law to create a different type of license that tries to keep software free." It was a well-intentioned attempt to use a legal tool to improve conditions brought about by legal changes. But despite it being well-intentioned, it was controversial in software-developer communities, Fontana said. Even today there is still a schism between copyleft proponents and those who prefer permissive licenses, such as the BSD, MIT, and Apache licenses.
The GPL also opened up a new, unintended, control point in the form of the dual-licensing model. "And this is really interesting, because the GPL is designed to prevent software from being exploited through copyright." Dual licensing was used to make proprietary licensing effective by giving one party control over copyright, but not others. "You're the one copyright owner of a GPL-licensed code base and you provide a proprietary version for a fee." That, too, was controversial, but it took time for people to develop the vocabulary to explain why they were concerned about it, he said.
Instead of the motivations being to perpetuate the free software commons, you have people using the machinery of copyleft licensing in a certain sense to move code out of the commons. Even though, in a formal sense, it's still there, and there's nothing in the GPL that says this is wrong.
Dual licensing is the first example of "a phenomenon that repeats itself throughout the history of open source. This feature is asymmetry." Anyone can exercise the freedoms under the GPL, but only one actor has the freedom to use proprietary licensing. To implement this asymmetry, the copyright holder needs a copyright-assignment system or contributor-license agreements (CLAs) that give more power to the maintainer of the project.
SaaS loophole
The first attempt to use asymmetrical power in open source to make money "in a way that is somehow divorced from the ideals open source is founded on" was dual licensing, but it was not the last. Businesses continue to use the freedoms granted by open-source licenses to "introduce new forms of scarcity in some way or another".
Fontana said that the audience had probably heard of what he called the Software-as-a-Service (SaaS) loophole, which "kind of breaks open-source licensing". In particular, it breaks the GPL and copyleft licensing, because the legal foundations of those licenses rest on distribution, which does not happen when the code is used in a SaaS context. "You sort of escape the intended obligation under the GPL even though you're doing things that are sort of similar to what distributors do". Since there is no binary distributed, the requirements in the GPL are not triggered. In a SaaS context, "the copyleft GPL software becomes equivalent to permissive-license software".
Once again, some people responded to this change with concern about the integrity of open source and an attempt to fix the problem. In particular, it led to the creation of the Affero GPL (AGPL), "sort of an attempt to patch the GPL", so that deployment of a service becomes a trigger for releasing source code. "I would argue that the AGPL was well-intended, but I don't know if I would say that it was well-designed to combat the problem it was created to deal with."
The AGPL is another example of trying to make a fix to a license when a problem emerges, but licensing does not solve the problem very well. In fact, Fontana said, the AGPL is often used by businesses in a dual-licensing context.
Brand identity
The value of open source as a brand identity is another sort of infrastructure shift; there is value in labeling something "open source", but it is problematic for the community because there is no way to protect that brand. The Open Source Initiative tried to trademark the term "open source" but failed to do so. That has led to various parties stretching the definition of open source, often toward more restrictions, "really stretching the normative foundations [of open source] or kind of entering into public conflict with them". Those parties have taken advantage of the ambiguity around what open source is, and turned it into an asset that can be monetized.
Open source has become a misused term, without any clear way to combat its misuse. "Open source became this valuable brand, and in some ways it became more valuable than the substance it was supposed to represent." One form of this that Fontana described is the creation of source-available licenses "mostly used by startups that got built up around a popular open-source project". The familiar narrative, after a few years, is that the startup does not like the way that people are using the freedoms they were given through the open-source licenses. For example, cloud providers can often operate services based on open-source projects better than the startups can, which leads companies to decide to use licensing against their competitors.
The source-available licenses are designed to look like open-source licenses, and the projects are often hosted publicly and allow some of the freedoms that users expect. Those licenses do not comply with the OSD, though, because they discriminate against at least one class of users. "They're ultimately sort of aimed at competitors, without saying, 'if you compete with us, you can't use this software.' They're not honest, in that sense."
Fontana used the example of HashiCorp switching its license from the weak-copyleft Mozilla Public License (MPL) to the Business Source License (BUSL). That license "basically says 'you can use this, but not in production'", and then converts to an open-source license after several years.
The BUSL is not the worst kind of source-available license, he said, but he admitted that he does not like source-available licenses, in part because they exploit confusion about what "open" means. If a person is not "really clued into this stuff", then they might be confused and misled into thinking it was open source. Sometimes companies will even continue referring to the project as open source, even while using a restrictive license:
There's no question that part of what gives power to these licenses, and the business models enabled by these licenses, is the existing confusion it is exploiting around what 'open' means and what 'open source' means. So source-available licenses just exacerbate some of these problems we've seen historically around asymmetry and so forth.
Around the same time source-available licenses became a problem, he said, a "splinter movement in open source" started up as well: the ethical-source movement. He described that movement as believing that normative definitions of open source are flawed because "open source allows you to do all sorts of bad things". Fontana noted that the ethical-source movement did not fit exactly with the model of exploiting open source for profit, but it "sort of should, in a sense".
The concern that open-source software could be used for "nefarious purposes" has been around for a long time, of course. And it is true, he said, that it is morally neutral because the freedoms are available to everyone. "You can't discriminate against users, or you can't say the GPL is only available as long as you're a good person." The JSON license from 2002, which is basically the MIT license with a provision added that the software "shall be used for Good, not Evil", was a forerunner to the ethical-source licenses.
There are problems with the ethical-source licenses, too. They do not fit with the accepted definitions of open source, because they discriminate against specific use cases such as "you can't use the software for any use case that violates human-rights law", or similar. Though Fontana did not say this explicitly, enforcing such licenses would also be difficult, if not impossible. His slide described those licenses as "principled, but misdirected". (The full set of slides is available on the CfgMgmtCamp site.)
Open-source developers realize that bad things are happening with their software and feel they have to do something to stop it. But how? "You're not empowered to write new laws. You're just a software developer [...] so the only tools you know how to use are licenses" because those are the foundational tools of the whole system. Ethical licenses, he said, are their own infrastructure shift; they are designed to allocate power to certain people and deny it to other people. This time the attempt to create an asymmetry of power is not for profit, but to try to do good.
AI
The most recent infrastructure shift is AI. Fontana said that there are "all sorts of asymmetries around what we're calling AI now, and they're more extreme than anything we've seen before". He said he was tempted to say that AI has nothing to do with open source, but that isn't quite accurate. "AI in the modern sense is built on a foundation of lots of important open-source projects", which includes authentic open-source projects built up around the use of AI models.
But within the world of people creating AI models themselves, "the term 'open' is used extensively, but it's used meaninglessly. And then people using the technology repeat this problem". The ambiguity around open source just gets worse in the AI era; "open source" in the AI context just basically means that the model is public. "It is actually worse than what we have with source available, it's just a signal with no substance".
Misuse of "open" in this context, he said, was openwashing. The
models, if thought of as software, do not meet the normative
definition of open source. There is no source code, in this case
training data, published, and often even information about the training
data is not disclosed. "So there's this kind of extreme
non-transparency in a context where the term 'open source' is being
widely used
", which is unfortunate.
So you might say, "why can't we solve all this by creating a new license?" And you know by now my answer is that licenses are not good at solving these problems.
Some people are angry about AI and have proposed creating licenses that basically forbid using software to create a new model. Those licenses, Fontana said, would pretty clearly violate the OSD, and it is not even clear that they could solve the problems. Licenses are "very brittle tools" that can't do much. They were effective for the limited purpose they had in the 1980s and 1990s, but the problems of today are too complex for a single type of tool to solve.
Licenses aren't the solution
Fontana said that when he was discussing the talk with one of the organizers, he was asked to be inspirational: "I'm not used to doing that, I mostly just like to complain about stuff", he deadpanned. He was, however, willing to try.
The problem that he identified was that the way open source is conceptualized is rooted in the past, and it does not get updated for new problems. His suggestion is that we should try to reframe open-source freedoms "in a way that is more dynamic or adaptive or mobile". He displayed a slide showing first the classical freedoms and then his concepts for new freedoms: reproduce, verify, participate, exit, and stewardship.
He ran through the new freedoms quickly. The right to reproduce "is not an original idea in any sense, kind of a generalization of the work done on reproducible builds". The GPL is designed to allow users to rebuild software from source, but systems are more complex now and "being able to rebuild source code is not enough". There is a need for a more robust ability to rebuild and verify software. As an example, he said, someone claims to be running a service based on open-source software, but perhaps they've modified it in a substantial way without publishing the modifications. "How can you verify the claims they make about those things?"
He mapped the right to modify software to a new concept of a right to participate in development of software. "If you are dependent on a project, there's a sense in which you should have some way of ideally participating in its governance." Modification is a local freedom, whereas participation is more of a collective freedom. He said it was not a radical proposal for open-source development to become a free-for-all with no standards for contribution, "but it's sort of elevating participation to the level of the original freedoms."
Everybody talks about how the right to fork is a fundamental aspect of open source, but "it turns out in practice, and this has become increasingly true over time, you can't easily fork projects in most cases". It is actually too costly to practically exercise, so he felt that open source should explicitly state that it is built on "the right to compete", which could make it more practical for participants to exit a community that no longer serves their needs. That, of course, is directly in conflict with the source-available licenses.
Finally, stewardship "corresponds to the work you need to do to sustain projects and the community" and should be "elevated to the foundational level for what open source means". Open source is a human endeavor, Fontana said. The freedoms that he was articulating correspond to real human activities that are important to consider when thinking about the ideals that open source ought to meet.
So, the right to reproduce is based on curiosity. The right to verify is based on integrity. The right to participate is related to the notion of solidarity. The right to exit corresponds to the concept of courage. And stewardship, of course, corresponds to care. So these are all human forms of these kinds of reframed definitional freedoms.
He was not proposing, he said, to replace the existing freedoms or the notion of what an open-source license is. Those are still a foundational part of open source. But he felt that we need to have a bigger and more expansive sense of what open source means that is not simply rooted in a "static checklist of permissions of 1980s and 1990s kinds of concepts."
Asymmetry is inevitable in open source. It is a feature of infrastructure shifts; there will always be changes in the field of play that create new power relationships and leverage points. What we can do, Fontana said, is make sure that power does not become ossified, "and that's what this notion of mobile freedoms is sort of aimed at". We cannot eliminate asymmetry, he said, but we can continue to work around it.
There was time for one question. An audience member wanted to know if he was referring to the Open Source AI Definition (OSAID) in his talk. Fontana said that he had not mentioned the OSAID in the talk, but had been a critic of the definition. The OSI came up with something that was too complicated and impractical "and also didn't make anyone happy because it has this big compromise built into it". It tried to address the problem of undisclosed training data, but it does so in a way that has "kind of a hole in it". It was "sort of pointless, frankly" and maybe shows that trying to come up with a definition similar to the open-source definition is not the right approach to address the problem. "But I'd have to think about that more."
With that, time elapsed. The new freedoms proposed by Fontana seem interesting, and could do with more detail on how to implement them, but his point that licensing alone is insufficient is certainly valid. It would be useful for people and projects to be thinking beyond licensing to new ways to retain the ideals of open source as the world keeps changing.
[Thanks to the Linux Foundation, LWN's travel sponsor, for funding my travel to Ghent to attend CfgMgmtCamp.]
Magit and Majutsu: discoverable version-control
Jujutsu is an increasingly popular Git-compatible version-control system. It has a focus on simplifying Git's conceptual model to produce a smoother, clearer command-line experience. Some people already have a preferred replacement for Git's usual command-line interface, though: Magit, an Emacs package for working with Git repositories that also tries to make the interface more discoverable. Now, a handful of people are working to implement a Magit-style interface for Jujutsu: Majutsu.
Magit was started by Marius Vollmer in 2008; over time, the project grew organically to cover the users' needs for an intuitive Git interface. The current version is v4.5.0, and new releases come every few months. The project's statistics page shows that a majority of the code at this point has been written by Jonas Bernoulli, but many authors have contributed improvements for their specific workflows and use cases. The result is a startlingly comprehensive feature set, which Bernoulli calls "essentially complete", covering "about 90% of what can be done using git".
Majutsu is much younger: it was started in November 2025 by Brandon Olivier and has had six contributors so far, reaching version 0.6.0 on February 12. Its interface is already fairly comprehensive, however, owing both to Jujutsu's fewer corner cases and to the libraries written for Magit. Both projects are licensed under version 3 of the GPL, and Majutsu reuses Magit's interface design and libraries for handling transient windows. (Emacs predates most graphical interfaces, and calls the things everyone else calls windows "frames". It calls panels that subdivide a frame "windows".)
Discoverable design
Magit's transient windows are the core of its semi-graphical interface, allowing the package to combine keyboard-driven actions with text-based status display. When Magit is started for the first time (by typing "C-x g" or "M-x magit", depending on how good one is at remembering arcane Emacs incantations), it shows a status summary screen:
From that status screen, there are a number of keyboard shortcuts that can be used to perform Git operations. Hitting "d", for example, brings up the transient window for diffs, which lists all of the various things that can be done with diffs in Magit:
Continuing to type the characters shown in green (on my theme, at least) applies possible command-line flags — which are saved until they are reset. For example, my --stat and --no-ext-diff flags (which generate a diffstat and turn off external diffing programs, respectively) are already turned on, and will be applied to all diff-related commands I use until I turn them off again. Actually choosing an operation to perform requires typing one of the un-prefixed letters shown at the bottom of the transient window (not shown here because they've scrolled off the screen). Typing the same letter again, however, is always bound to "do what I mean", and does a reasonable thing based on where the cursor is in the status buffer. So, hitting "d" again (with my cursor in the status buffer on one of the listed recent commits) opens the diff associated with the chosen commit in a new window:
Typing "q" will close any of Magit's temporary windows and return to the status buffer. Typing "?" lists all of the keys that pop up a transient window to begin with, although they're mostly mnemonic, such as "l" for logs or "c" to commit.
Magit commands like this are context-sensitive. If the cursor in the status buffer is on a commit identifier (branch, tag, or hash), hitting "d d" shows the diff associated with that commit. If the cursor is on a file with unstaged changes, hitting "d d" shows the diff of that file against the staging area. Within that diff, placing the cursor on a particular hunk and typing "s" stages it. Normal Emacs navigation, such as clicking or arrow keys, suffices to navigate to any Git object, such as a file, commit, hunk, or tree. Once the cursor is on it, the default contextual commands will do something useful.
When updating something about Git's state using Emacs, such as staging or unstaging a hunk, all Magit's buffers remain automatically in sync. This includes editing a file in the repository with Emacs — saving the edited file will make Magit update the diffs if they are open in another window. If the repository is updated outside of Magit, typing "g" forces a manual refresh.
This design makes the use of Magit pleasingly discoverable — performing simple operations is intuitive, and all of the text on the status screen can be interacted with. Performing a more complex operation involves opening the appropriate transient window and then turning on and off options and selecting the appropriate operation. It doesn't require going to the Magit manual or the Git manual, because there is a handy short reference guide right there in Emacs. By the same token, for all that Magit calls itself an alternate Git porcelain, one's existing knowledge of the Git command line is not obviated: Magit commands can use almost all of the same flags and Git subcommands as the normal command-line interface.
Majutsu
Despite operating on top of a different version-control system, Majutsu looks fairly similar at first glance:
The main difference is that Jujutsu does away with the concept of a staging area: there is always a particular working commit (given the short name "@"), and one just edits that commit in-place, rather than staging changes and committing them only once they're finished. Consequently, Majutsu puts the graph of recent commits at the top, with expanded details about the state of the current working commit down below.
The interface works exactly the same way as Magit, which is unsurprising since it reuses Magit's libraries: start typing, and a transient window will pop up to show possible completions of the command. Majutsu does have fewer transient windows (25 vs. Magit's 37), but that is partially a result of Jujutsu having fewer commands than Git. The Majutsu manual goes into more detail about the available transient windows. It does have an overall more bare-bones feel than Magit, which is somewhat to be expected with many fewer years of contributions.
The graph in the main status window shows some differences from Git's model — for example, it shows commit identifiers (on the left side, starting with a pink or purple letter) instead of commit hashes. Jujutsu commit identifiers are stable through rebases and other history-modification operations, so they can be used to refer unambiguously to a commit even in the middle of a rebase. The colored letters at the beginning highlight the minimum prefix needed to refer to them unambiguously: Jujutsu will understand "r" to refer to commit "rkrmpkzv", at least until the repository gets another commit ID starting with "r". Jujutsu commits do still have cryptographic hashes — for signing, and for interoperability with Git — which can be seen at the bottom of the status window, starting with blue letters. These names, while normally quite helpful at the Jujutsu command-line, are less helpful in Majutsu, because commits are typically referred to by placing the cursor on them instead of referring to them by name.
An example of that is Majutsu's rebase interface, which is simplified compared to Magit's interface. In Magit, a rebase is started using "r r", whereupon one will have to select a starting revision, and go through Git's normal interactive-rebase workflow. In Majutsu, the experience is more visual. A rebase starts with "r", but then one can select which revisions should be picked, squashed, or rebased onto directly in the main status window or in the detailed log window. Once the correct commits have been selected, pressing return actually performs the rebase. The procedure is greatly streamlined compared to Magit, which makes sense given that Jujutsu's design encourages rebasing more frequently than Git does.
There are some rough edges with Majutsu. Using it to clone a new Git repository (with "M-x majutsu-git-clone") was a bit confusing — it warned me about being used outside a Jujutsu repository, and asked if I wanted to create one. When I did so and then cloned my target repository, it checked it out into a subdirectory, leaving me with two nested repositories. That's a fairly minor detail, however.
More annoying is the fact that Jujutsu's log command (and therefore Majutsu's status buffer) doesn't show commits from before importing a Git repository. This is despite the fact that Jujutsu supports operating colocated with Git, using both in the same repository. Subsequent commits made with Git in a colocated repository are shown, but it makes for an awkward transition. Other operations, such as committing and moving bookmarks (the equivalent of Git's branches) around, went smoothly.
Jujutsu is an interesting experiment in building a version-control system with a simplified design. After years of using Git, it can feel uncomfortable — but Majutsu makes it easy to explore. For a version-control system that has to wrestle with Git's dominance, having a discoverable interface feels like an important step toward making it easier for inveterate Git users to migrate. Majutsu has a ways to go before it reaches Magit's level of polish, but it's more than ready to help people curious about Jujutsu experiment without leaving the comfortable embrace of Emacs.
IIIF: images and visual presentations for the web
The International Image Interoperability Framework, or IIIF ("triple-eye eff"), is a small set of standards that form a basis for serving, displaying, and reusing image data on the web. It consists of a number of API definitions that compose with each other to achieve a standard for providing, for example, presentations of high-resolution images at multiple zoom levels, as well as bundling multiple images together. Presentations may include metadata about details like authorship, dates, references to other representations of the same work, copyright information, bibliographic identifiers, etc. Presentations can be further grouped into collections, and metadata can be added in the form of transcriptions, annotations, or captions. IIIF is most popular with cultural-heritage organizations, such as libraries, universities, and archives.
Collections and presentations can—and often do—link to images hosted at many different web sites. A key strength of the framework is standardizing complex, feature-rich image hosting with the explicit goal of interoperable referencing and grouping into combined presentations.
Audience and implementers
IIIF is mostly used by public-sector organizations that deal with heritage science and digital humanities, the core audience being the galleries, libraries, archives, and museums (GLAM) field. The greatest benefits of IIIF are gained when there are few (or no) legal or technical restrictions placed on the content being served, which in practice means works that are born in the public domain or whose copyright has expired.
Among IIIF users likely to be of interest to LWN readers are Wikimedia Commons and the Internet Archive. Wikimedia started tentative integration work in 2018 in the form of IIIF manifests in Wikidata properties, but it seems that further deployment is on hold. The Internet Archive started integrating IIIF in 2015 and officially adopted it in 2023.
The image server at the heart of it
The "How It Works" page on the IIIF website does a good job of explaining the basics of the framework's technical principles, but I will provide an overview of my own here. At the core of IIIF is the Image API, simply defined as a URL with this format:
https://example.org/{id}/{region}/{size}/{rotation}/{quality}.{format}
Here, id is a string that identifies an image, and region is an expression that crops a portion, for example "full" for the entire image or "50,100,200,300" for "50 pixels right, 100 down, 200 wide, 300 high". size specifies how much to downscale the image after cropping, if at all. rotation requests a rotation in degrees, quality can be "default", "gray", or one of a few more predefined keywords, and, finally, format is the file extension of the desired image format, usually jpg.
For example, let's look at a Map of Colorado from 1882 hosted by the Library of Congress. A direct URL to the full map, downscaled to 800 pixels wide, looks like this (split for readability):
https://tile.loc.gov/image-services/iiif
/service:gmd:gmd370m:g3700m:g3700m:gla00130:ca000120 (identifier)
/full/800,/0/default.jpg (region etc.)
And if we want to focus on Boulder County by cropping out the rest, the URL looks like:
https://tile.loc.gov/image-services/iiif
/service:gmd:gmd370m:g3700m:g3700m:gla00130:ca000120
/2910,1028,660,509/250,/0/default.jpg
The difference is in the last part of the URL, where a restricted region and lower resolution have been requested; the result appears on the right.
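As a small illustration of how little client code the Image API needs, here is a sketch in Python that reconstructs the two URLs above and downloads the Boulder County crop; it assumes the third-party requests package and reuses the Library of Congress prefix and identifier shown above:
import requests

PREFIX = "https://tile.loc.gov/image-services/iiif"
IMAGE_ID = "service:gmd:gmd370m:g3700m:g3700m:gla00130:ca000120"

def iiif_url(image_id, region="full", size="800,", rotation="0",
             quality="default", fmt="jpg"):
    # {prefix}/{id}/{region}/{size}/{rotation}/{quality}.{format}
    return f"{PREFIX}/{image_id}/{region}/{size}/{rotation}/{quality}.{fmt}"

full_map = iiif_url(IMAGE_ID)                                   # whole map, 800 pixels wide
boulder = iiif_url(IMAGE_ID, region="2910,1028,660,509", size="250,")  # Boulder County crop

with open("boulder.jpg", "wb") as out:
    out.write(requests.get(boulder, timeout=30).content)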
The specification requires another endpoint, of the form https://example.org/{id}/info.json, where the server must provide key characteristics about the image as well as which optional features of the API the server itself implements. Here is the info.json that the Library of Congress gives us for the above image. (This site still uses version 2 of the Image API.)
Using this API, IIIF viewer software can efficiently query the server for cropped and downscaled pieces of even extremely large image files (such as this panorama of the painting The Battle of Murten), to focus on an area of interest and to provide advanced, traffic-efficient, deep-zoom views, similar to applications like Google Maps or OpenStreetMap. (Keeping in mind that map services generally deal with rasterizing vector tiles, whereas IIIF deals with bitmaps.) Image servers are also required to advertise in the info.json which resolutions and tile sizes are available pre-rendered, so viewers may prioritize requesting those to minimize lag and the computation demands imposed on the server.
An image server with large assets can be susceptible to denial-of-service attacks. The protocol necessarily allows clients to repeatedly ask for computationally intensive image-processing operations and vary the parameters just a little bit to ensure cache misses. Servers must be deployed defensively, taking care to aggressively cache the most anticipated assets and their most requested tile sizes. Many implementors also try to fend off machine-learning bots with reverse proxies, web application firewalls, rate limits, or services like Cloudflare. It is a struggle.
Typically, when serving images over IIIF, the files are kept on disk in a format that supports storing multiple resolutions in a tiled representation, as opposed to linear rows of pixels. The most popular formats are TIFF, using a special internal layout called "Pyramid TIFF", and JPEG 2000.
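Producing such files is straightforward with, for example, libvips; assuming the pyvips binding is installed, a conversion to a tiled, pyramidal TIFF might look like this (file names are made up):

import pyvips

image = pyvips.Image.new_from_file("scan.jpg")
# Write a "Pyramid TIFF": JPEG-compressed, 256x256 tiles, with successively
# halved resolution layers stored in the same file.
image.tiffsave("scan-pyramidal.tif", tile=True, pyramid=True,
               compression="jpeg", tile_width=256, tile_height=256, Q=85)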
There are a handful of popular IIIF-compatible image servers; the best known are probably Cantaloupe (University of Illinois/NCSA Open Source License) and IIPImage (GPL v3.0). The latter can handle a few other, similar protocols in addition to IIIF, plus it has features like support for multispectral images, allowing sites to keep not only multiple resolutions of visible-light photography but also X-ray or ultraviolet representations of a subject in one TIFF file.
An IIIF-compatible backend can also be implemented completely statically by simply pre-computing all of the tiles one wishes to serve, along with the accompanying JSON metadata, as long as one sets the appropriate flag in that JSON to announce to clients that only those tiles may be requested.
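As a sketch of the metadata side of such a static, "level 0" deployment, the following Python writes out a minimal info.json advertising only pre-rendered sizes and tiles; the field names follow my reading of version 3.0 of the Image API, and the identifier, dimensions, and tile layout are invented.

import json

info = {
    "@context": "http://iiif.io/api/image/3/context.json",
    "id": "https://example.org/iiif/some-image",
    "type": "ImageService3",
    "protocol": "http://iiif.io/api/image",
    "profile": "level0",     # no on-the-fly processing is offered
    "width": 6000,
    "height": 4000,
    # Only these pre-rendered sizes and tiles exist on disk:
    "sizes": [{"width": 750, "height": 500}, {"width": 1500, "height": 1000}],
    "tiles": [{"width": 512, "scaleFactors": [1, 2, 4, 8]}],
}

with open("info.json", "w") as f:
    json.dump(info, f, indent=2)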
As the Image API is quite simple, some organizations do end up creating their own server software, and some complicated digital-asset-management systems (DAMS) also support the API.
Presentations and collections
The other leg that IIIF stands on is the Presentation API, which is quite a bit more complex. It is a definition for a document that points at one or more images and accompanies them with metadata. The Presentation API is what makes a digital object whole by, for example, stitching the individually digitized pages of a book into a linear viewing experience. The metadata usually contains information about the physical original's location, authorship, publication date, whether it is part of some series, its Uniform Resource Name, and so on.
A presentation can comprise images from multiple sources. For instance, if multiple museums in different countries have incomplete, digitized fragments of a manuscript, an IIIF presentation can combine them together into a virtual whole—without ever having to download, process, arrange, or in any way re-host the files. See this demo from Biblissima in France, which is a virtual reconstruction of a 15th-century manuscript that had its illuminations cut out long ago.
If a digitized manuscript is difficult to read due to outdated handwriting, faded text, or old-fashioned orthography, an IIIF presentation can be used to non-destructively add a set of annotations, which viewer software can then display alongside the image. Annotations can even make a centuries-old book searchable on a computer, provided its text has been thoroughly transcribed into the annotation metadata.
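For a rough idea of what an annotation looks like on the wire, here is a sketch, as a Python dictionary ready to be serialized to JSON-LD, of a single transcription annotation in the Presentation API 3.0 style; the URLs, coordinates, and text are invented, and a real manifest would wrap such annotations in annotation pages attached to a canvas.

annotation = {
    "id": "https://example.org/anno/page1-line1",
    "type": "Annotation",
    "motivation": "supplementing",    # i.e. a transcription of the target
    "body": {
        "type": "TextualBody",
        "value": "In the yere of oure Lorde...",
        "format": "text/plain",
        "language": "enm",            # Middle English
    },
    # Pin the text to a rectangle on one canvas using a media-fragment
    # selector: x,y,width,height in canvas coordinates.
    "target": "https://example.org/canvas/p1#xywh=120,340,900,60",
}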
The Internet Archive has IIIF manifests for its materials, but finding them is not entirely straightforward. For a quick demo, let's look at the digitized copy of the first issue of BYTE magazine, hosted at https://archive.org/details/byte-magazine-1975-09. Here, the string byte-magazine-1975-09 in the URL is the item ID. To get the item's presentation manifest, we need to plug the ID into another URL template to get https://iiif.archive.org/iiif/byte-magazine-1975-09/manifest.json. Note that the Internet Archive hosts both "items" and "collections"; for collections, the correct suffix is collection.json.
Next, we can grab the URL of the manifest and take it with us to any IIIF tool. For example, that magazine can be fed to the Mirador viewer, yielding a different interface to the same material.
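Programmatic access works the same way. The sketch below builds the manifest URL from the item ID, fetches it with the Python requests library, and lists the canvases (pages); the Archive appears to serve Presentation API 3.0 manifests, but the code falls back to the older 2.x "sequences" layout just in case.

import requests

ITEM = "byte-magazine-1975-09"
manifest = requests.get(
    f"https://iiif.archive.org/iiif/{ITEM}/manifest.json", timeout=30).json()

canvases = manifest.get("items")      # Presentation API 3.0 layout
if canvases is None:                  # fall back to the 2.x layout
    canvases = manifest["sequences"][0]["canvases"]

print(manifest.get("label"))          # a language map in 3.0, a string in 2.x
print(len(canvases), "canvases")
for canvas in canvases[:5]:
    print(canvas.get("label"))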
There is a handful of other IIIF APIs that are not as widely deployed. They deal with authorization, content search, and various machine-to-machine data-ingest concerns, for example.
The IIIF metadata formats have not been designed ad hoc, but rather they build on prior art established in existing W3C recommendations and semantic foundations, most notably "Architecture of the World Wide Web" (2004) and "JSON-based serialization for Linked Data (JSON-LD)" (first draft 2012, latest revision 2020).
While this article only talks about images, and the second I in IIIF stands for "image", in actuality the framework also supports audio-visual materials. A presentation or collection can include audio and video files as well; similarly, annotations can target spatially addressed areas of interest in a static image, but they can also target temporally addressed sections in audio or video. A popular feature enabled by this is the coupling of digitized sheet music with an audio recording, so both can be studied simultaneously.
An update to the Presentation API is expected in 2026 in the form of version 4.0, which, most notably, adds better support for 3D objects. 3D is already doable in the current 3.0 spec, but the next version comes with a major rework of some core concepts that puts 3D on an equal semantic footing with 2D images and audio-visual materials.
Client software
While the Image API is designed to be compatible with regular URLs displayable in any regular web context and browser, implementing a zoomable and pannable IIIF view or presentation display requires a client called an IIIF viewer; these are generally JavaScript programs embedded in web pages. Some popular ones are Mirador (Apache 2.0), The Universal Viewer (MIT license; not to be confused with another program with the same name), Clover IIIF (MIT), and TIFY (AGPL v3.0).
All of the viewers above are wrappers on top of OpenSeadragon (three-clause BSD) which actually handles fetching the tiles from the server, stitching them, and drawing them on the page. OpenSeadragon, which made its 6.0 release on February 18, is highly configurable, supporting many different rendering modes, tweaks to user controls as well as zooming and panning behaviors, and more. What the above viewers add on top is support for showing multiple images at once, overlaying annotations, displaying metadata, and implementing the behaviors specified in the presentation and collection manifests.
Not all client programs are simple viewers; some build more advanced applications on top of IIIF. One highly celebrated and quickly evolving tool is Allmaps, a platform for georeferencing IIIF-enabled digitized maps or aerial photographs.
Highlights from the 2026 Online Meeting
The 2026 IIIF Online Meeting took place January 27–29; there is a YouTube playlist of the plenary session and four rounds of lightning talks. The plenary covered IIIF Consortium and community news and provided an overview of new features in the upcoming Presentation API 4.0, followed by discussion.
To kick off the first set of lightning talks, Tom Cramer, chair of the IIIF Consortium Executive Committee, spoke about IIIF Content Commons, a project that he would like to see launched to enhance the discoverability of IIIF content. He began by outlining the technical success and broad adoption of IIIF at a wide range of institutions, but bemoaned the fact that IIIF content is still hard to find; he proposed a new initiative to develop content-aggregation solutions to remedy this.
In another talk, Tristan Roddis of Cogapp showed how the company built a new system for the British Library's Endangered Archives Programme, now incorporating images as well as audio files on the same site, using IIIF. This is part of the long recovery the British Library has been undergoing since its 2023 cyber attack.
Sonia Cook-Broen, a writer at TheTechMargin, gave a talk from the more esoteric, cyberpunk end of things, providing colorful visions of coupling IIIF and some artificial intelligence with the InterPlanetary File System (IPFS) and a decentralized storage platform called Storacha. She observed that data impermanence is a big problem on the Internet and that, while IIIF contributes to solving many problems, it does nothing for this one. She showed a demo of her prototype site, Codex Protocol, which integrates with Storacha to find cultural-heritage objects online and store them on IPFS.
Cook-Broen's site also contains the COLLECTION_EXPLORER, a search engine of sorts for discovering IIIF content.
Alexis Pantos from the Museum of Cultural History and the University of Oslo, Norway, demoed some impressive 3D visuals provided by the BItFROST platform, based on the 3D Heritage Online Presenter (3DHOP). He showed an in-progress use case in which archaeological artifacts had been 3D-scanned in context at excavations, then compiled into a research environment where the objects were annotated, linked to their archaeological context, and could be viewed individually or even virtually placed back in situ at the excavation site. To a lay person with no training in archaeology, at least, it was impressive. He said that, currently, too many manual steps go into combining the landscape-scale scans of excavations with models of artifacts, but his group hopes to build better tooling and to find suitable representations for this relationship in the IIIF metadata.
Governed by consortium
The IIIF Consortium, formed in 2015, steers the development of the framework, hosts meetings, and moderates an online community of contributors and implementors. There is an annual in-person conference; the next one is coming up in June 2026 in the Netherlands. The consortium comprises 71 members from all around the world. Most members are academic institutions or non-governmental organizations involved with digital-humanities and cultural-heritage subjects—libraries, universities, and ministries of culture—but there are a few corporations as well.
IIIF has been around for roughly a decade and has gone through quite a few revisions but, at the end of the day, it is just a framework and a toolkit. Frameworks live and die by the people and organizations applying their creativity to make the most of them. To see what's out there, the "awesome-iiif" GitHub repo is a nice place to start. Some highlights: Zooniverse is a crowdsourcing platform for annotations and transcriptions, Canopy IIIF is a static-site generator for building IIIF-based exhibitions, and IMMARKUS is an experimental annotation platform that currently only runs in Chromium due to its reliance on some cutting-edge browser features.
Free software needs free tools
One of the contradictions of the modern open-source movement is that projects which respect user freedoms often rely on proprietary tools that do not: communities often turn to non-free software for code hosting, communication, and more. At Configuration Management Camp (CfgMgmtCamp) 2026, Jan Ainali spoke about the need for open-source projects to adopt open tools; he hoped to persuade new and mature projects to switch to open alternatives, even if just one tool, to reduce their dependencies on tech giants and support community-driven infrastructure.
Ainali does contract work for the Swedish chapter of the Wikimedia Foundation, called Wikimedia Sverige, through his company Open By Default. Wikimedia, of course, provides the MediaWiki software, hosts Wikipedia, and much more. He said that all of the tooling, everything in production, the analytics, and so forth is open source. "There is a very strong ethos in the Wikimedia movement to do it like that."
However, that ethos weakens the farther away one gets from development. "When you step away from development to the more peripheral parts of the workflow, it gets less and less open source in the tooling." For example, Wikimedia uses the proprietary Figma software for design, and its annual conference uses Zoom to record talks and publishes them on YouTube. Even projects that have a strong drive to do something open, he said, struggle to do everything using only open-source software.
He emphasized that the presentation was not a rant against open-source projects using proprietary software. He said that he understood that it might be challenging to use more open-source tools and to move away from proprietary ones. It is particularly difficult, he said, for projects that have to work with other parties which have constraints or requirements in the tools they use. "Even though I am going to say a lot of things here, it is all coming from a place of love and a wish for change."
Tools shape culture
Proprietary tools come with many kinds of restrictions, he said. For example, perhaps users are limited in the ways they can export data, or customize tools to suit specific workflows. There are many things that would be possible with open source that are not possible with proprietary tools; a project cannot make a tool its own if it does not have the ability to modify the software.
The tools also shape a project's culture, Ainali said. First, someone suggests using a tool that is not open source. "It's never with a bad intent. It's often like, oh, it has this feature that I cannot find anywhere else." But that is a slippery slope, he said, a bad spiral. Once the decision is made to use one proprietary tool, it becomes easier to do it the next time. "If our design guide is already in a proprietary software; maybe the next thing in the toolchain also could be like that. You don't have the same incentive to stay open." That, in turn, leads to exclusion.
There are also instances where geopolitics come into play, such as the incident when the Organic Maps project lost access to GitHub, presumably because it had some contributors from Russia. "And this is not because GitHub has something against Russia, it's because where they are located and their local laws." The flip side of that is that some contributors may not want to provide data to a platform that might be required to hand over data to its government, especially when that platform is in another country.
Even in the absence of political interference, he cautioned that dependency on closed platforms posed other problems. "They try to lure you in, and then to lock you in, so that it will be difficult to leave." It is especially easy for open-source projects to be lured in, he said, because many platforms start out with free tiers or special deals for nonprofits and open projects.
And, of course, there's this lovely term from Cory Doctorow, "enshittification". He defined a couple of phases of how things get worse over time. First, you get lured in, and when they have a very large user base, they feel like they can extract more and more value out of you. It's not like they deliberately try to make it worse for you. It's just going to become worse for you as a user. Maybe it becomes more expensive. Maybe they extract more data out of you. Maybe they are trying to monetize on that data by selling targeting ads in the other end. So it's sort of like just working towards something getting worse.
At the same time, proprietary cloud platforms and services get value from open-source projects, he said. The company gets metrics, usage data, bug reports, and may be advertising that "this open-source system is using our product". Projects can also be victims of a company's whims. A platform can decide at any time to end its free tiers for open-source projects, or change its terms of service: "Suddenly it says in the new [terms] update that 'oh, now we're going to use your data for training AI'."
There are also scenarios where a company is not trying to take advantage of open-source projects; it simply makes a business decision to close down a platform or service that it no longer considers profitable. That leaves the open-source projects that depend on it in a bad spot; if a project had been using something that was open source, it could spin up a local deployment and maintain the software itself. With proprietary services, of course, that is not an option; once the plug is pulled, the party is over.
Losing contributors
Ainali said that choosing open tools is not "just a purity test". Some people will be discouraged from participating because they simply do not want to use proprietary tools. But if a project is requiring proprietary tools to participate, it may mean that some people literally can't participate. For example, if a tool requires using macOS, then it excludes participants that do not use that operating system. In some parts of the world, he observed, people may only have the option of running an open-source operating system because everything else is too expensive.
Accessibility is another consideration. Many proprietary tools "are very slick, but they may not have good accessibility". Open-source tools may not be beautiful, but they are functional, he said.
Even if the project does not fully lose contributions from a person, it may not get full participation. Perhaps a person continues to make code contributions, but they do not join video calls to discuss project direction, or participate in text chats because the project uses a proprietary product for those activities.
So you're losing an important voice in your community. They might stay on the trusted old mailing list. And these people that are often very experienced, and know very well how their data could be used. So they're happy voting with their feet and not going in there. They care a lot about their freedom and they care a lot about their data. On the flip side, they are also often very knowledgeable in open source too, because they've learned that they don't want to be locked in with proprietary tools.
He encouraged projects to invest in their own ecosystems by deploying open-source tools, adding features if necessary, improving documentation, submitting bug reports, and so on. He anticipated a counterargument: "'The proprietary tools are so much better', you scream. 'We cannot use these [open] ones.' Well, we're getting ourselves in a convenience Catch 22". Ainali acknowledged that, sometimes, proprietary tools were better from a technical perspective. "But they're bad for your resilience, for your project sustainability. And you could be helping to improve those open tools instead."
As long as open projects are using the proprietary tools, they're providing the metrics to improve them. If the projects are paying for proprietary tools, then they're funding the improvement of those tools. "So instead, you should try to help the community catch up and expand." It will be difficult to break free of proprietary tools, he said, if projects keep using them and giving them all the benefits of their use.
Ainali also predicted that people would object to leaving proprietary platforms because "everybody's already on this platform"; he did not say it, but that seemed to be primarily directed at proprietary code forges such as GitHub. The network effects of such platforms do not last, he said. "We have seen plenty of social media platforms rise and go and other tools come and go", because they need to make a profit or perish. "Whereas maybe open-source tools are more resilient because they don't need the same extraction of value from their users".
Start small
Projects should experiment, he said; perhaps try to mirror Git repositories onto a freer platform. When projects need to choose a new tool for something, they should choose an open tool. Above all, he said that a project should listen to its community and start moving to open tools where it makes the most sense for the community. "Don't wait. It really should have started already, but it's never too late to start now. It's never a bad time." It is also not all or nothing, he said. Projects do not have to become "pure" overnight, or try to switch everything all at once. There are many places a project can start.
You pick one tool. You make one change. You evaluate, "did that work?" If it doesn't work, if it turns out you really need that feature, don't be afraid to roll back. Often with your community it will go well if you radiate the intent why you're wanting to do this change. That everybody can see, "oh, this is coming for our sake in the long run". It will make us more sustainable. It is possible, and when it works, it's really contagious. And don't beat yourself up if you cannot do it all at once. It is okay to not be perfect.
He closed out his talk by arguing that projects had made a choice to be open source, and that choice should be reflected in the tools used by the projects as well. Open source, he said, is more than code and licenses: "it is a culture, it's a way of working, it's the community, and it's the freedom that these licenses allow us". Projects, he said, should not try to build that freedom on a foundation that they do not control.
[Thanks to the Linux Foundation, LWN's travel sponsor, for funding my travel to Ghent to attend CfgMgmtCamp.]
Page editor: Joe Brockmeier
Inside this week's LWN.net Weekly Edition
- Briefs: Ad tracking; firmware updates; TCP zero-copy; Motorola GrapheneOS phones; Gram 1.0; groff 1.24.0; Texinfo 7.3; Quotes; ...
- Announcements: Newsletters, conferences, security updates, patches, and more.
