
LWN.net Weekly Edition for October 8, 2020

Welcome to the LWN.net Weekly Edition for October 8, 2020

This edition contains the following feature content:

  • Collabora Online moves out of The Document Foundation: a dispute within The Document Foundation over LibreOffice Online leads Collabora to take its work elsewhere.
  • Fixing our broken internet: Mozilla's "Unfck the Internet" campaign against misinformation and surveillance.
  • From O_MAYEXEC to trusted_for(): a proposed system call to give administrators control over execution by interpreters.
  • Getting KDE onto commercial hardware: Nate Graham's Akademy 2020 talk on an official KDE distribution, hardware certification, and paid development.
  • Ruby 3.0 brings new type checking and concurrency features: a look at RBS, Ractors, and more in the first 3.0 preview.
  • Zig heading toward a self-hosting compiler: an introduction to the Zig language and its plans.

This week's edition also includes these inner pages:

  • Brief items: Brief news items from throughout the community.
  • Announcements: Newsletters, conferences, security updates, patches, and more.

Please enjoy this week's edition, and, as always, thank you for supporting LWN.net.

Comments (none posted)

Collabora Online moves out of The Document Foundation

By Jonathan Corbet
October 2, 2020
The Document Foundation (TDF) was formed in 2010 as a home for the newly created LibreOffice project; it has just celebrated its tenth anniversary. As it begins its second decade, though, TDF is showing some signs of strain. Evidence of this could be seen in the disagreement over a five-year marketing plan in July. More recently, the TDF membership committee sent an open letter to the board of directors demanding more transparency and expressing fears of conflicts of interest within the board. Now the situation has advanced with one of the TDF's largest contributing companies announcing that it will be moving some of its work out of the foundation entirely.

The dispute over the marketing plan has its roots in money, as is often the case. Developing a large system like LibreOffice requires the work of dozens of engineers, who need to be paid to be able to put a full-time effort into the project. Some of the companies employing those developers — Collabora in particular — think that TDF has succeeded too well; the free version of LibreOffice is solid enough that attempts to sell commercial support for it are running into a wall. The proposed marketing plan was designed to better differentiate "community-supported" LibreOffice from the professionally supported offerings from TDF member companies. This idea did not sit well with community members, who worried that LibreOffice was being pushed into a second-class citizen status.

The tension is at its highest around LibreOffice Online, which provides for collaborative editing of documents hosted on a central server. Evidently, what revenue does exist in the LibreOffice ecosystem is mostly focused on LibreOffice Online, which is a relatively hard service to set up and maintain without having somebody dedicated to the task. TDF has encouraged potential users to go with commercial offerings by, among other things, allowing the system to suggest commercial support to users and not offering binary builds of the LibreOffice Online server. Currently, if you want to establish a LibreOffice Online instance, you must start with the source and build it from there.

As Michael Meeks describes it in this announcement from Collabora, there are members of TDF that would like to change how LibreOffice Online is managed:

LibreOffice Online has been a source-only project: a place to collaborate around development, with own-branded products versions derived from that. Publicly available products have encouraged people to buy support when under heavy use.

Some TDF community, board and staff members have made it clear they don't accept this compromise, and want TDF to use the LibreOffice brand to distribute a competing gratis product in the marketplace driving the price to zero, (perhaps combined with nags for donations to TDF).

This is something that Collabora, in particular, finds hard to accept; according to Meeks, Collabora is responsible for about 95% of the development work going into LibreOffice Online. A turnkey, binary offering from TDF would put Collabora into a position of competing with its own work, which would be available without charge, carrying the LibreOffice name, and perhaps lacking even a suggestion that users might consider buying services from TDF member companies. It is not entirely surprising that the company does not see that as an attractive position to be in.

In response, and as a way of avoiding this situation, Collabora is taking its online work out of TDF and creating a new project called Collabora Online, hosted on GitHub. It remains free software, of course, but it seems clear that Collabora Online will be managed much more like a single-company project that is intended to drive revenue for that company:

By having a single Online project hosted by Collabora, a clear brand story easily establishes an appropriate credit for the origin of the product into every reference to it by those using it for free. Just like the other prominent OSS projects we integrate with.

Meeks expresses hope that, beyond solidifying the business case for Collabora, this move will ease some of the tensions within TDF. It will define "LibreOffice" as referring to the desktop application in particular, reduce the pressure on TDF to improve its support of its member companies, and establish LibreOffice as "a core Technology which can be built upon to create amazing things such as Collabora Online". It will, he hopes, bring an end to fraught discussions (many of which are evidently private) within TDF and allow its members to "return to positive mutual regard".

Whether that happens will depend partly on how the other members of TDF respond to this move. Given that Collabora Online will still be free software, there is little preventing other TDF members from simply merging Collabora's work back into LibreOffice and offering it free of charge anyway. Thus far, nobody has (publicly) expressed any interest in escalating the situation in this way, though.

The one response that is public is this message from Lothar Becker, the current chair of the TDF board of directors. He asked members to work toward "finding compromises for all sides, win-win solutions" and noted that it will now be necessary to revisit work on the TDF marketing plan:

Sadly, this current move comes in a period where we took steps to achieve a compromise with a marketing plan in discussion, the decision was postponed to have more time for it as the membership and the public asked for. This plan with proposed compromises is now obsolete in some parts and we have to see what this move now un-knots in decisions and activities in the sake of TDF and its underlying statutes.

Figuring out what TDF's course should be from here is not an enviable task. For all of the positive words, Collabora Online represents what is, in effect, a partial secession from the foundation — a statement of a lack of confidence that working within the TDF is in Collabora's interests. Such a move from one of an organization's principal members will be an unwelcome development at best, if not a threat to the organization as a whole.

What will happen next is not easy for an outsider to predict. Perhaps the TDF board will decide to make changes with the intent of attracting Collabora back fully into the organization; that could risk increasing tensions with other parts of the community, though. Or maybe other commercial members will begin to withdraw as well, pushing TDF (and possibly LibreOffice with it) into relative irrelevance. Or perhaps Collabora Online will thrive alongside a newly refocused TDF that retains responsibility for the overwhelming majority of the LibreOffice code.

Free software is a wonderful thing, but its creation is not free. Complex software projects that lack the support of paid developers tend to languish. Look at the state of Apache OpenOffice, which has not managed to produce a feature release since 2014 and struggles to even get security fixes out, for example. Some projects naturally attract company support, while others don't; each must find its own path to sustainability. LibreOffice has been struggling with this issue for a while; one can only hope that this crucial free-software project can find a solution that works for all of its stakeholders.

Comments (28 posted)

Fixing our broken internet

By Jake Edge
October 7, 2020

In unusually stark terms, Mozilla is trying to rally the troops to take back the internet from the forces of evil—or at least "misinformation, corruption and greed"—that have overtaken it. In a September 30 blog post, the organization behind the Firefox web browser warned that "the internet needs our love". While there is lots to celebrate about the internet, it is increasingly under threat from various types of bad actors, so Mozilla is starting a campaign to try to push back against those threats.

The effort is, to a certain extent, an attempt to raise the profile of Firefox, which does generally have a better track record on respecting privacy than its competitors. That should not come as a huge surprise since the other major browsers come from companies that stand to profit from surveillance capitalism. The Mozilla Foundation, on the other hand, is a non-profit organization that is guided by a pro-privacy manifesto. But beyond just pushing its browser, Mozilla is looking to try to fundamentally change things:

And in recent years we’ve seen those with power — Big Tech, governments, and bad actors — become more dominant, more brazen, and more dangerous. That’s a shame, because there’s still a lot to celebrate and do online. Whether it’s enjoying the absurd — long live cat videos — or addressing the downright critical, like beating back a global pandemic, we all need an internet where people, not profits, come first.

So it’s time to sound the alarm.

The internet we know and love is fcked up.

That is some of the background behind the "Unfck the Internet" campaign. The blog post gets more specific about exactly what kinds of abuses are being targeted:

Let’s take back control from those who violate our privacy just to sell us stuff we don’t need. Let’s work to stop companies like Facebook and YouTube from contributing to the disastrous spread of misinformation and political manipulation. It’s time to take control over what we do and see online, and not let the algorithms feed us whatever they want.

The plan consists of "five concrete and shareable ways to reclaim what’s good about life online by clearing out the bad", much of which revolves around Firefox add-ons that are intended to help combat some of the abuses. For example, Facebook and others target ads at particular groups, so it is hard for researchers to get a full view of all of the different ads being served. It is difficult to determine what abuses are being perpetrated in these ads if they cannot be seen. So the Ad Observer add-on collects the ads that are seen on Facebook and YouTube and shares them with the Online Political Transparency project at NYU.

Another entry revolves around the recent documentary The Social Dilemma, which, somewhat ironically, comes from Netflix. It is a much-talked-about look at the problems inherent in social media and our reliance upon it. The campaign is suggesting that people watch the movie and share it with their friends, but also that they take action to realign social media and their use of it. Beyond that, there is a suggested reading list to dig further into the topic of social media and its effects on society.

Two other Firefox add-ons are suggested. Facebook Container is meant to make it harder for Facebook to track users across the web by making use of Firefox Multi-Account Containers. The idea is that interaction with a site is done only in a color-coded tab that doesn't share identity information (and cookies) with other containers. Facebook Container ensures that links from Facebook pages are followed in a separate container so that Facebook cannot track the user; using Facebook "Share" buttons outside of the container will route them through the container as well.

Unfck the Internet also recommends the RegretsReporter extension to report on YouTube videos that were recommended but turned out to be objectionable. The idea is to try to crowdsource enough information about the YouTube recommendation system to better understand it—and the AI behind it.

The RegretsReporter extension gives you a way to report YouTube Regrets—videos that YouTube has recommended to you that you end up wishing you had never watched. This extension is part of a crowdsourced data campaign to learn more about what people regret watching, and how it was recommended to them. [...]

Insights from the RegretsReporter extension will be used in Mozilla Foundation's advocacy and campaigning work to hold YouTube and other companies accountable for the AI developed to power their recommendation systems. These insights will be shared with journalists working to investigate these problems and technologists working to build more trustworthy AI.

As might be guessed, there are some serious privacy implications from these add-ons, RegretsReporter in particular. Mozilla is clearly conscious of that; it specifically describes which data it is collecting and how users' privacy will be protected in the description of the add-on. The company has generally been seen as a beacon of pro-privacy actions over the years, but it did have a prominent stumble in late 2017 when it installed an add-on ("Looking Glass") without user consent. The privacy implications of that were probably not large, in truth, but the action certainly gave off a bad smell, which was acknowledged by Mozilla in a retrospective analysis of "Looking Glass". That the add-on was a tie-in to a television show, thus presumably done for monetary gain, only made things worse.

The final recommendation is to use more of what it calls "independent tech": products and projects that, like Firefox, are focused on protecting the privacy and security of their users. It lists a small handful of companies, Jumbo, Signal, Medium, ProtonMail, and Mozilla's own Pocket, that embody the attributes that Unfck the Internet would like to see:

Luckily, we aren't the only ones who believe that the internet works best when your privacy and security are protected. There are a number of us out there pushing for an internet that is powered by more than a handful of large tech companies, because we believe the more choice you have the better things are for you — and for the web. We vetted these companies for how they treat your data and for their potential to shake things up. In short: they’re solid.

Together, we have power. We all win when everyone supports indie tech. Here are just a few of the smaller, independent players whose services we think you should be using. If you help them, you help yourself. So go ahead and join the anti-establishment.

The intent, it would seem, is for this announcement to be a starting point. More recommendations and ideas will be forthcoming from the project down the road. Getting the word out more widely is another focus of effort, of course, so those interested are encouraged to spread the word—presumably via social media, ironically. The time is ripe: "It’s time to unfck the internet. For our kids, for society, for the climate. For the cats."

The problems being addressed are real enough, for sure; it would be great to see a grass-roots effort make some serious headway in solving them. Whether or not that is truly realistic is perhaps questionable, but it is hard to see other plausible ways to combat the problems. We humans have made this trap for ourselves by consistently choosing convenience and gratis over other, possibly more important, values. Companies that stand to gain from all of these problems are going to be fighting tooth and nail to retain their positions and prerogatives, so they can increase their profits, thus their shareholder value. Until and unless humanity, as a whole, wises up, things probably will not change all that much. In the meantime, though, we can find ways to better protect our own privacy and security—and help our friends, family, and neighbors do the same.

Comments (32 posted)

From O_MAYEXEC to trusted_for()

By Jonathan Corbet
October 1, 2020
The ability to execute the contents of a file is controlled by the execute-permission bits — some of the time. If a given file contains code that can be executed by an interpreter — such as shell commands or code in a language like Perl or Python, for example — there are easy ways to run the interpreter on the file regardless of whether it has execute permission enabled or not. Mickaël Salaün has been working on tightening up the administrator's control over execution by interpreters for some time, but has struggled to find an acceptable home for this feature. His latest attempt takes the form of a new system call named trusted_for().

Tightly locked-down systems are generally set up to disallow the execution of any file that has not been approved by the system's overlords. That control is nearly absolute when it comes to binary machine code, especially when security modules are used to enforce signature requirements and prevent techniques like mapping a file into some process's address space with execute permission. Execution of code by an interpreter, though, just looks like reading a file to the kernel so, without cooperation from the interpreter itself, the kernel cannot know whether an attempt is being made to execute code contained within a given file. As a result, there is no way to apply any kernel-based policies to that type of access.

Enabling that cooperation is the point of Salaün's work; it is, at its core, a way for an interpreter to inform the kernel that it intends to execute the contents of a file. Back in May 2020, the first attempt tried to add an O_MAYEXEC flag to be used with the openat2() system call. If system policy does not allow a given file to be executed, an attempt to open it with O_MAYEXEC will fail.

This feature was controversial for a number of reasons, but Salaün persisted with the work; version 7 of the O_MAYEXEC patch set was posted in August. At that point, Al Viro asked, in that special way he has, why this check was being added to openat2() rather than being made into its own system call. Florian Weimer added that doing so would allow performing checks on an already-open file; that would enable interpreters to indicate an intent to execute code read from their standard input, for example — something that O_MAYEXEC cannot do. Salaün replied that controlling the standard input was beyond the scope of what he was trying to do.

Nonetheless, he tried to address this feedback in version 8, which implemented a new flag (AT_INTERPRETED) for the proposed faccessat2() system call instead. That allowed the check to be performed on either a file or an open file descriptor. This attempt did not fly either, though, with Viro insisting that a separate system call should be provided for this feature. This approach also introduced a potential race condition if an attacker could somehow change a file between the faccessat2() call and actually opening the file. So Salaün agreed to create a new system call for this functionality.

Thus, the ninth version introduced introspect_access(), which would ask the kernel if a given action was permissible for a given open file descriptor. There comes a point in kernel development (and beyond) when one can tell that the substantive issues have been addressed: when everybody starts arguing about naming instead. That is what happened here; Matthew Wilcox didn't like the name, saying that checking policy on a file descriptor is not really "introspection". Various suggestions then flew by, including security_check(), interpret_access(), entrusted_access(), fgetintegrity(), permission(), lsm(), should_faccessat(), and more.

In the tenth version, posted on September 24, Salaün chose none of those. The proposed new system call is now:

    int trusted_for(const int fd, const enum trusted_for_usage usage,
                    const unsigned int flags);

The fd argument is, of course, the open file descriptor, while usage describes what the caller intends to do with that file descriptor; in this patch set, the only possible option is TRUSTED_FOR_EXECUTION, but others could be added in the future. There are no flags defined, so the flags argument must be zero. The return value is zero if the system's security policy allows the indicated usage, or EACCES otherwise. In the latter case, it is expected that the caller will refuse to proceed with executing the contents of the file.
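
Interpreters will have to opt into this mechanism explicitly. The following is a purely illustrative sketch: the system call is not in any released kernel, so the syscall number below is a placeholder, and the TRUSTED_FOR_EXECUTION value is an assumption based on the patch set. An interpreter might check a just-opened script like this:

    /* Hypothetical example; not part of any released kernel ABI. */
    #include <errno.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/syscall.h>
    #include <unistd.h>

    #define TRUSTED_FOR_EXECUTION 1     /* assumed value; see the patch set's UAPI header */
    #define __NR_trusted_for      443   /* placeholder; assigned when the patch is merged */

    int main(int argc, char **argv)
    {
        int fd;

        if (argc < 2)
            return 1;
        fd = open(argv[1], O_RDONLY | O_CLOEXEC);
        if (fd < 0)
            return 1;

        /* Ask the kernel whether policy allows executing this file. */
        if (syscall(__NR_trusted_for, fd, TRUSTED_FOR_EXECUTION, 0U)) {
            fprintf(stderr, "refusing to interpret %s: %s\n",
                    argv[1], strerror(errno));
            return 1;
        }
        /* ... hand fd to the interpreter proper ... */
        return 0;
    }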

The patch also adds a new sysctl knob called fs.trust_policy for setting a couple of policy options. Setting bit zero disables execution access for files located on a filesystem that was mounted with the noexec option; bit one disables execution for any file that does not have an appropriate permission bit set. Both of these are checks that are not made by current kernels. There are no extra security-module hooks added at this time, but that would appear to be the plan in the future; that will allow more complex policies and techniques like signature verification to be applied.
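
If the patch is merged in this form, enabling both checks would be a matter of setting the two low bits of that knob. The sketch below assumes the fs.trust_policy name from the patch set; it is, of course, not applicable to current kernels:

    # Bit 0: honor noexec mounts; bit 1: require an execute-permission bit.
    # A value of 3 enables both checks.
    sysctl -w fs.trust_policy=3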

This time around, even the name of the system call has avoided complaints — as of this writing, at least. So it may just be that this long-pending feature will finally make its way into the mainline kernel. That is not a complete solution to the problem, of course. Security-module support will eventually be needed, but also support within the interpreters themselves. That will require getting patches accepted into a variety of user-space projects. Fully locking down access to files by interpreters, in other words, is going to take a while yet.

Comments (31 posted)

Getting KDE onto commercial hardware

October 5, 2020

This article was contributed by Marta Rybczyńska


Akademy

At Akademy 2020, the annual KDE conference that was held virtually this year, KDE developer Nate Graham delivered a talk entitled "Visions of the Future" (YouTube video) about the possible future of KDE on commercial products. Subtitled "Plasma sold on retail hardware — lots of it", the session concentrated on ways to make KDE applications (and the Plasma desktop) the default environment on hardware sold to the general public. The proposal includes creating an official KDE distribution with a hardware certification program and directly paying developers.

Graham started by giving some context; the ideas to be presented were a followup on the KDE accessibility and productivity goals from 2017. One of the objectives was to get Plasma and KDE applications ready to work on all kinds of hardware. Graham thinks that this has been achieved and it is time to move to the next step: creating an official KDE operating system. He commented: "we have to, if we want to have direct relations with hardware vendors".

KDE already has an official distribution called neon, he said. Neon is, however, a "halfway product", because it showcases current KDE products on top of a distribution (Ubuntu 20.04 LTS) that may otherwise be outdated. On the other hand, it is good enough to be shipped on Slimbook laptops. A member of the audience requested an explanation of what has changed from the inception of neon, which was not called an official KDE OS at that time. Graham responded that there was a fear of harming the relationships between KDE and distributors, but agreed that a good way to go forward would be to call neon what it really is: the official KDE distribution.

Graham continued by presenting his list of requirements for such an OS. First, it needs the latest software, including a current Linux kernel, which is necessary for hardware enablement. The applications should be newer than those found in Ubuntu; they could be installed from Flatpaks or Snaps. The last requirement was to make it possible to rebase this system onto another distribution. With such features, this system "will be more awesome", he said.

Another member of the audience asked whether neon is a good reference platform, since there is not much development effort going into it right now. Graham said that neon is not perfect and it is up to KDE to create the official OS. It could be based on neon, but could also be something else like Fedora Silverblue or openSUSE, with a deeper partnership. The platform needs to be something that KDE controls, he emphasized, to get better relations with hardware vendors.

The next question related to whether a rolling distribution, like neon, is suitable for non-technical users. Graham answered that neon is, at the same time, both rolling (for the KDE software) and non-rolling (for the Ubuntu LTS part); this satisfies nobody. He imagines some kind of a user-facing switch to decide between a fully rolling distribution and one with packages that have been more fully tested, but said he has never seen an OS that works like that.

A bad idea?

Graham then asked: "Is this the worst idea anyone ever came up with?"; he elaborated that people may fear that an official KDE OS could destroy the ecosystem and alienate other distributions shipping KDE. He does not fear this outcome "because we already did it" with neon and "it worked". He thinks it would, instead, push the other distributors to improve their offerings. He also thinks that there is room for more than one solution, and that the end result will be beneficial. In the end, it would allow KDE to work with hardware vendors in a better way.

The next step, after the official distribution, is to create a hardware certification program, according to Graham. This would give hardware vendors a channel to introduce the changes they need. It also would bring more confidence to users, with "no more fear or guesswork"; institutions often rely on such programs to mitigate risk. He gave a number of ideas on how such certification could work. Part of it could be an official quality-assurance program, and part crowdsourced, like the Arch wiki "Hardware" pages. "We can basically do the same thing", he said, so that the KDE developers can learn what the hardware vendors need, and can adjust their work to make it more attractive for preinstallation.

Once the hardware certification program exists, the next logical step would be for KDE to partner with hardware vendors to create products — devices with KDE preinstalled. The project would want to contact companies selling Linux-enabled hardware now and convince them to enable KDE by default. This would mean, Graham said, "more hardware running Plasma" sold to consumers, so that more people would get it. That would make it easier for KDE to enter other markets, like embedded and automotive, where Qt (the library KDE software is based on) is already strong. This will be a virtuous cycle for everyone, he added, and pointed out that it is already happening with the Slimbook, Kubuntu Focus, and many other laptops using Plasma.

Graham then asked how this vision could be realized. His answer was that KDE needs to pay for development work. Currently there are important areas that do not have enough resources, including system administration and the web site. "We need more sustainable means", he added, and explained the problem: employed professionals are busy with their jobs and do not have the time to do the advanced work in KDE, even if they are capable of doing it. On the other hand, students have time, but do need guidance from professionals. Those professionals, however, do not have the time to give it.

Graham suggested that the funding for this work could come from KDE e.V. (the organization representing KDE when needed, especially in legal and financial matters). The association has a lot of money and encounters difficulties spending it all, which it is legally obliged to do, he explained. Currently KDE e.V. spends on everything except development. Starting to fund development is the next logical step, he said.

He then explained that this would not change the current situation, in which most KDE long-term contributors are paid anyway, but not by KDE. There is an ecosystem of companies around KDE that employs those developers. Some examples of projects led by paid developers are Krita (supported by the Krita Foundation), and Plasma (supported by Blue Systems, Enioka, and others).

With regard to choosing the projects to support, Graham proposed making the decisions democratically — by voting. Then it will be up to KDE e.V. to handle the hiring process, as it is already doing for contractors, he added. Ideally, the hiring would be done within the community. In addition, hiring may allow KDE e.V. to ask for bigger financial contributions from its members. Graham's recommendation would be to hire system administrators first, then developers for Plasma and basic applications like Dolphin, Gwenview, and Okular. This solution would create a career path for developers and reduce brain drain: KDE developers would move from being new to experienced, then to senior, and then work as mentors. This would mean, according to Graham, a more stable and professional community. In addition, it would give more assurance to hardware vendors.

In conclusion

Once again, he asked whether this idea might be a bad one. It could, some might worry, drain motivation from volunteer developers. His rapid response is that it would not, because "this is where we are". Some developers are already paid, and KDE e.V. is funding non-technical work as well. "There is nothing in free software that requires unpaid volunteerism; people have bills", he said, and suggested to try it out and see if it works. The community can expand the effort if it does, or retreat if it does not.

He finished by giving some examples of projects already having paid contributors, including five full-time contributors for Krita for about €20,000/month (a member of the audience corrected this: the actual number of developers is higher, but was not revealed), Blender with 20 full-time contributors and a €100,000/month budget, and Linux Mint, with three paid developers for $12-14,000/month. He added that more solutions exist and asked the audience to think about what KDE could do with paid developers.

Graham wrapped up early to allow for questions. One audience member wanted to know what types of hardware he considered. The answer was "the sky is the limit". Laptops are an obvious choice, Graham said, but the idea opens the door to other devices as well. "You can have Plasma on your TV", he said, or have it on a smart voice assistant. If vendors would install KDE by default, that would make such systems widely available.

There was also a discussion about the distribution choice. One suggestion was to contribute to existing distributions instead of creating a separate KDE OS. Graham responded that this kind of contribution is happening already. "Not saying it is a bad model", he added, but, according to him, there is room for KDE to be a distributor too. It would add vibrancy to the ecosystem, allowing distributions to learn from KDE, like KDE learns from them.

Another audience member noted that hardware vendors are already building their own distributions, so why not talk to them? Graham responded that he has one example in mind; he tried to contact the company, but got no answer. He has heard rumors that it may be open to the idea, but pursuing that would require a personal connection inside the company. He asked the audience to contact him if someone had such a contact. He is "very interested in pitching hardware vendors", he added.

The final question concerned the details of how to pay developers, suggesting per-feature or per-bug payments. Graham prefers hiring a person to work on a project rather than on specific features. The exact features to support can be discussed between KDE e.V. members. There might also be a prioritization system, including major bugs that need to be fixed for a release. The KDE product manager can help, he added.

Comments (9 posted)

Ruby 3.0 brings new type checking and concurrency features

By John Coggeshall
October 7, 2020

The first preview of Ruby version 3.0 was released on September 25. It includes better support for type checking, additional language features, and two new experimental features: a parallel execution mechanism called Ractor, and Scheduler, which provides concurrency improvements.

According to a 2019 keynote [YouTube] by Ruby chief designer Yukihiro "Matz" Matsumoto, type checking is a major focus of Ruby 3. In his presentation, he noted that Python 3, PHP, and JavaScript have all implemented some version of the feature. In fact, Ruby already has type-checking abilities in the form of a third-party project, Sorbet. For Ruby 3.0, type checking has been promoted into the core project, implemented as a new sub-language called Ruby Signature (RBS). This mirrors the approach taken by Sorbet, which implemented a sub-language called Ruby Interface (RBI). Sorbet allows annotations to exist within Ruby scripts, something that the community wanted to avoid, according to a presentation [YouTube] (slides [PDF]) by contributor Yusuke Endoh; by keeping RBS separate from Ruby, he explained, the project doesn't have to worry about conflicts in syntax or grammar between the two languages. In a recent blog post, the Sorbet project committed to supporting RBS in addition to its RBI format.

In a post introducing RBS, core developer Soutaro Matsumoto provided a detailed look at the feature. Conceptually, RBS files are similar to C/C++ header files, and currently are used in static code analysis with a project called Steep. As a part of the 3.0 release, Ruby will ship with a full collection of type annotations for the standard library.

Here is an example of an RBS file defining the types for an Author class:

    class Author
        attr_reader email: String
        attr_reader articles: Array[Article]

        def initialize: (email: String) -> void

        def add_article: (post: Article) -> void
    end

This declaration defines the types for two properties of the Author class: email (of type String) and articles (Array[Article]). attr_reader signifies that a property should provide an attribute accessor, which generates a method to read the property. The initialize() method is defined to take a single parameter, email, which is typed String; the method returns void. Finally, the add_article() method takes a single parameter, post, declared as an Article type; it also returns void.

Type unions, used when there are multiple valid types, are represented using the | operator (e.g. User | Guest). An optional value can be indicated by adding the ? operator to the end of the type declaration (e.g. Article?), which would allow either the specified type(s) or a nil value.
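
As a small illustration (the class and method names here are invented), a session whose user may be either a User or a Guest, and a lookup method that may return nil, could be declared like this:

    class Session
        # The user may be a logged-in User or a Guest.
        attr_reader user: User | Guest

        # Returns the matching Article, or nil if none exists.
        def find_article: (id: Integer) -> Article?
    end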

A new interface type enhances Ruby support for the duck typing design pattern. An interface in Ruby, as in other languages, provides a means to describe the methods that an object needs to implement to be considered compliant. In Ruby, classes are not explicitly declared to implement an interface. Instead, when an interface is used in RBS, it indicates that any object which implements the methods defined by that interface is allowed. Here is an example of an Appendable interface:

    interface Appendable
        def <<: (String) -> void
    end

As shown, Appendable defines an interface that requires an implementation of the << operator often used by classes like String as an append operation. This interface then can be used in other type definitions, such as this example of an RBS declaration for an AddToList class:

    class AddToList
        def append: (Appendable) -> void
    end

By specifying Appendable as the parameter type for the append() declaration shown above, any object which implements the << operator (for a String operand) can be used when calling the method.
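
To see the duck typing in action, here is a minimal Ruby sketch matching those declarations (the Logger class is invented for illustration); any object with a suitable << method, including a plain String, can be handed to append():

    class Logger
        def <<(line)          # quacks like Appendable
            puts line
        end
    end

    class AddToList
        def append(target)
            target << "new entry"
        end
    end

    AddToList.new.append(Logger.new)     # prints "new entry"
    AddToList.new.append("entries: ")    # a String responds to << as well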

Parallel execution

Another (currently experimental) addition coming in Ruby 3 is a parallel-execution feature called Ractor. According to the documentation, Ractors look a lot like processes, with no writable shared memory between them. Most Ruby objects cannot be shared across Ractors, save a few exceptions: immutable objects, class/module objects, and "special shareable objects" like the Ractor object itself.

Ractors communicate through "push" and "pull" messaging, with the mechanisms being implemented using a pair of sending and receiving ports for each Ractor. Below is an example of using the Ractor API to communicate between a Ractor instance and the parent program:

    # The code within the Ractor will run concurrently to the
    # main program

    r = Ractor.new do
        msg = Ractor.recv  # Receive a message from the incoming queue
        Ractor.yield msg   # Yield a message back
    end

    r.send 'Hello'       # Push a message to the Ractor
    response = r.take    # Get a message from the Ractor

Multiple Ractors can be created to produce additional concurrency or construct complex workflows; see the examples provided in the documentation for details.

While most Ruby objects cannot be shared between the main program and its Ractors, there are options available for moving an object between these contexts. When sending or yielding an object to or from a Ractor, an optional move boolean parameter may be provided in the API call. When set to true, the object will move into the appropriate context, making it inaccessible to the previous context:

    r = Ractor.new do
        object = Ractor.recv
        object << 'world'
        Ractor.yield object
    end

    str = 'hello '
    r.send str, move: true
    response = r.take

    str << ' again' # This raises a `Ractor::MovedError` exception

In the example above, we define a Ractor instance r that receives an object, uses the << operator to append the string "world", then yields that object back using the yield() method. In the program's main context, a String object is assigned a value of "hello "; this is then passed into the Ractor r with a call to send(), setting the move parameter to true and making str available to r as a mutable object. Conversely, str in the main context becomes inaccessible, so the attempt to modify it will raise a Ractor::MovedError exception. For now, the types of objects that can be moved between contexts are limited to the IO, File, String, and Array classes.

Other updates

The preview release included another experimental concurrency-related feature, the scheduler interface, designed to intercept blocking operations. According to the release notes, it "allows for light-weight concurrency without changing existing code." That said, the feature is designated "strongly experimental" and "both the name and feature will change in the next preview release." It appears that this feature is largely targeted to be a wrapper for gems like EventMachine and Async that provide asynchronous or concurrency libraries for the language.

Ruby 3 also includes some new syntax like rightward assignments using the => operator (e.g. 0 => x assigns 0 to x). The new release will also have several backward-compatibility breaks with Ruby 2.7. Per the release notes on backward compatibility, "code that prints a warning on Ruby 2.7 won't work"; see the provided compatibility documentation for a complete description of breaking changes.

A significant number of changes are being made to the default and bundled gems in Ruby 3.0, including the removal of two previously bundled gems: net-telnet and xmlrpc. Likewise, 25 gems were promoted to "default gems"; the core development team maintains these gems, and, unlike bundled ones, they cannot be removed from a Ruby installation. Many of the new default gems provide implementations of various protocols such as net-ftp, net-http, and net-imap, while others, like io-wait and io-nonblock, improve Ruby's I/O functionality; see the release notes for a complete listing.

Yukihiro Matsumoto recently confirmed [YouTube] that he expects Ruby 3.0 to be completed on December 25 this year. It will be interesting to see if that date holds; the project only released the first 3.0 preview a few days ago. With multiple features in the current preview release still designated experimental, his timetable looks aggressive. On the other hand, the project has been delivering significant releases on Christmas for many years; it seems likely, given that tradition, that they will find a way to make it this year too. Schedule aside, Ruby 3.0 is sure to have many new features fans of the language will enjoy when it is released.

Comments (4 posted)

Zig heading toward a self-hosting compiler

By Jake Edge
October 6, 2020

The Zig programming language is a relatively recent entrant into the "systems programming" realm; it looks to interoperate with C, while adding safety features without sacrificing performance. The language has been gaining some attention of late; in September, the project announced progress toward a Zig compiler written in Zig. That change will allow LLVM to become an optional component, which will be a big step forward for the "maturity and stability" of Zig.

Zig came about in 2015, when Andrew Kelley started a GitHub repository to house his work. He described the project and its goals in an introductory blog post in 2016. As he noted then, it is an ambitious project, with a goal to effectively supplant C; in part, that is done by adopting the C application binary interface (ABI) for exported functions and providing easy mechanisms to import C header files. "Interop with C is crucial. Zig embraces C like the mean older brother who you are a little afraid of but you still want to like you and be your friend."

Hello

The canonical "hello world" program in Zig might look like the following, from the documentation:

const std = @import("std");

pub fn main() !void {
    const stdout = std.io.getStdOut().outStream();
    try stdout.print("Hello, {}!\n", .{"world"});
}

The @import() function returns a reference to the Zig standard library, which gets assigned to the constant std. That evaluation is done at compile time, which is why it can be "assigned" to a constant. Similarly, stdout is assigned to the standard output stream, which then gets used to print() the string (using the positional formatting mechanism for "world"). The try simply catches any error that might get returned from print() and returns the error, which is a standard part of Zig's error handling functionality. In Zig, errors are values that can be returned from functions and cannot be ignored; try is one way to handle them.

As the documentation points out, though, the string being printed is perhaps more like a warning message; perhaps it should print to the standard error stream, if possible, and not really be concerned with any error that occurs. That allows for a simpler version:

const warn = @import("std").debug.warn;

pub fn main() void {
    warn("Hello, world!\n", .{});
}

Because this main() cannot return an error, its return type can be void, rather than !void as above. Meanwhile, the formatting of the string was left out in the example, but could be used with warn() as well. In either case, the program would be put into hello.zig and built as follows:

$ zig build-exe hello.zig
$ ./hello
Hello, world!

Compiler and build environment

The existing compiler is written in C++ and there is a stage-2 compiler written in Zig, but that compiler cannot (yet) compile itself. That project is in the works; the recent announcement targets the imminent 0.7.0 release for an experimental version. The 0.8.0 release, which is due in seven months or so, will replace the C++ compiler entirely, so that Zig itself will be the only compiler required moving forward.

The Zig build system is another of its distinguishing features. Instead of using make or other tools of that sort, developers build programs using the Zig compiler and, naturally, Zig programs to control the building process. In addition, the compiler has four different build modes that provide different tradeoffs in optimization, compilation speed, and run-time performance.

Beyond that, Zig has a zig cc front-end to Clang that can be used to build C programs for a wide variety of targets. In a March blog post, Kelley argues that zig cc is a better C compiler than either GCC or Clang. As an example in the post, he downloads a ZIP file of Zig for Windows to a Linux box, unzips it, runs the binary Zig compiler on hello.c in Wine targeting x86_64-linux, and then runs the resulting binary on Linux.

That ability is not limited to "toy" programs like hello.c. In another example, he builds LuaJIT, first natively for his x86_64 system, then cross-compiles it for aarch64. Both of those were accomplished with some simple changes to the make variables (e.g. CC, HOST_CC); each LuaJIT binary ran fine in its respective environment (natively or in QEMU). One of the use cases that Kelley envisions for the feature is as a lightweight cross-compilation environment; he sees general experimentation and providing an easy way to bundle a C compiler with another project as further possibilities.
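
The commands involved are short; here is a sketch of the idea (assuming a plain C hello.c alongside the Zig one; the musl target produces a static binary that QEMU can run directly, and zig targets lists the available triples):

$ zig cc -o hello hello.c
$ zig cc -o hello-arm64 -target aarch64-linux-musl hello.c
$ qemu-aarch64 ./hello-arm64
Hello, world!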

The Zig compiler has a caching system that makes incremental builds go faster by only building those things that truly require it. The 0.4.0 release notes have a detailed look at the caching mechanism, which is surprisingly hard to get right, due in part to the granularity of the modification time (mtime) of a file, he said:

The caching system uses a combination of hashing inputs and checking the fstat values of file paths, while being mindful of mtime granularity. This makes it avoid needlessly hashing files, while at the same time detecting when a modified file has the same contents. It always has correct behavior, whether the file system has nanosecond mtime granularity, second granularity, always sets mtime to zero, or anything in between.

The tarball (or ZIP) for Zig is around 45MB, but comes equipped with the cross-compilation and libc targets for nearly 50 different environments. Multiple architectures are available, including WebAssembly, along with support for the GNU C library (glibc), musl, and Mingw-w64 C libraries. A full list can be found in the "libc" section toward the end of the zig cc blog post.

Types

Types in Zig have first-class status in the language. They can be assigned to variables, passed to functions, and be returned from them just like any other Zig data type. Combining types with the comptime designation (to indicate a value that must be known at compile time) is the way to have generic types in Zig. This example from the documentation shows how that works:

fn max(comptime T: type, a: T, b: T) T {
    return if (a > b) a else b;
}
fn gimmeTheBiggerFloat(a: f32, b: f32) f32 {
    return max(f32, a, b);
}
fn gimmeTheBiggerInteger(a: u64, b: u64) u64 {
    return max(u64, a, b);
}

T is the type that will be compared for max(). The example shows two different types being used: f32 is a 32-bit floating-point value, while u64 is an unsigned 64-bit integer. That example notes that the bool type cannot be used, because it will cause a run-time error when the greater-than operator is applied. However, that could be accommodated if it were deemed useful:

fn max(comptime T: type, a: T, b: T) T {
    if (T == bool) {
        return a or b;
    } else if (a > b) {
        return a;
    } else {
        return b;
    }
}

Because the type T is known at compile time, Zig will only generate code for the first return statement when bool is being passed; the rest of the code for that function is discarded in that case.

Instead of null references, Zig uses optional types, and optional pointers in particular, to avoid many of the problems associated with null. As the documentation puts it:

Null references are the source of many runtime exceptions, and even stand accused of being the worst mistake of computer science.

Zig does not have them.

Instead, you can use an optional pointer. This secretly compiles down to a normal pointer, since we know we can use 0 as the null value for the optional type. But the compiler can check your work and make sure you don't assign null to something that can't be null.

Optional types are indicated by using "?" in front of a type name.

// normal integer
const normal_int: i32 = 1234;

// optional integer
const optional_int: ?i32 = 5678;

The value of optional_int could be null, but it cannot be assigned to normal_int. A pointer to an integer could be declared of type *i32, but that pointer can be dereferenced without concern for a null pointer:

    var ptr: *i32 = &x;
    ...
    ptr.* = 42;

That declares ptr to be a (non-optional) pointer to a 32-bit signed integer, the address of x here, and later assigns to where it points using the ".*" dereferencing operator. It is impossible for ptr to get a null value, so it can be used with impunity; no checks for null are needed.
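
Getting at the value inside an optional, by contrast, is always explicit. A short sketch (using the same warn() helper as above) shows the two usual forms: orelse to supply a fallback value, and if with payload capture:

const warn = @import("std").debug.warn;

pub fn main() void {
    const optional_int: ?i32 = 5678;

    // Unwrap with a fallback value to use if it is null ...
    const n: i32 = optional_int orelse 0;
    warn("n is {}\n", .{n});

    // ... or test for null and capture the payload.
    if (optional_int) |value| {
        warn("optional_int holds {}\n", .{value});
    } else {
        warn("optional_int is null\n", .{});
    }
}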

So much more

It is a bit hard to consider this article as even an introduction to the Zig language, though it might serve as an introduction to the language's existence and some of the areas it is targeting. For a "small, simple language", Zig has a ton of facets, most of which were not even alluded to above. It is a little difficult to come up to speed on Zig, perhaps in part because of the lack of a comprehensive tutorial or similar guide. A "Kernighan and Ritchie" (K&R) style introduction to Zig would be more than welcome. There is lots of information available in the documentation and various blog posts, but much of it centers around isolated examples; a coherent overarching view of the language seems sorely lacking at this point.

Zig is a young project, currently, but one with a seemingly active community with multiple avenues for communication beyond just the GitHub repository. In just over five years, Zig has made a good deal of progress, with more on the horizon. The language is now supported by the Zig Software Foundation, which is a non-profit that employs Kelley (and, eventually, others) via donations. Its mission is:

[...] to promote, protect, and advance the Zig programming language, to support and facilitate the growth of a diverse and international community of Zig programmers, and to provide education and guidance to students, teaching the next generation of programmers to be competent, ethical, and to hold each other to high standards.

It should be noted that while Zig has some safety features, "Zig is not a fully safe language". That situation may well improve; there are two entries in the GitHub issue tracker that look to better define and clarify undefined behavior as well as looking at ways to add even more safety features. Unlike with some other languages, though, Zig programmers manually manage memory, which can lead to memory leaks and use-after-free bugs. Kelley and other Zig developers would like to see more memory safety features, especially with respect to allocation lifetimes, in the language.

Rust is an obvious choice for a language to compare Zig to, as both are seen as potential replacements for C and C++. The Zig wiki has a page that compares Zig to Rust, C++, and the D language that outlines advantages the Zig project believes the language has. For example, neither flow control nor allocations are hidden by Zig; there is no operator overloading or other mechanisms where a function or method might get called in a surprising spot, nor is there support for new, garbage collection, and the like. It is also interesting to note that there is a project to use Zig to build Linux kernel modules, which is also an active area of interest for Rust developers.

One of the more interesting parts of the plan for a self-hosting Zig compiler is an idea to use in-place binary patching, instead of always rebuilding the binary artifact for a build. Since the Zig-based compiler will have full control of the dependency tracking and code generation, it can generate machine code specifically to support patching and use that technique to speed up incremental builds of Zig projects. It seems fairly ambitious, but is in keeping with Zig's overall philosophy. In any case, Zig seems like a project to keep an eye on in coming years.

Comments (56 posted)

Page editor: Jonathan Corbet

Inside this week's LWN.net Weekly Edition

  • Briefs: Plasma and systemd; Python 3.9; U-Boot 2020.10; SFC GPL enforcement; Quotes; ...
  • Announcements: Newsletters; conferences; security updates; kernel patches; ...

Copyright © 2020, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds