
LWN.net Weekly Edition for September 3, 2020

Welcome to the LWN.net Weekly Edition for September 3, 2020

This edition contains the following feature content:

  • Resource management for the desktop: putting control groups and systemd to work to make Linux desktops more responsive.
  • The winding road to PHP 8's match expression: three RFCs on the way to an accepted design.
  • Supporting Linux kernel development in Rust: a report from a Linux Plumbers Conference session on bringing Rust into the kernel.
  • Software and hardware obsolescence in the kernel: knowing when old code can be removed.
  • "Structural pattern matching" for Python, part 2: catching up on the continuing PEP 622 discussion.

This week's edition also includes these inner pages:

  • Brief items: Brief news items from throughout the community.
  • Announcements: Newsletters, conferences, security updates, patches, and more.

Please enjoy this week's edition, and, as always, thank you for supporting LWN.net.

Comments (none posted)

Resource management for the desktop

By Jonathan Corbet
August 27, 2020

LPC
For as long as we have had desktop systems, there have been concerns about desktop responsiveness and developers have been working to improve things in that area. Over the years, Linux has gained a number of capabilities — control groups in particular — that are applicable to the problem of improving desktop performance, but use of these features has lagged behind their availability. At the 2020 Linux Plumbers Conference, Benjamin Berg outlined some of the work that is being done by the Linux desktop projects to put recent kernel features to work.

His focus, he began, is on resource management for desktop systems, where the resources in question are CPU time, memory, and I/O bandwidth. Current systems leave applications to compete with each other for these resources, with little supervision. It is possible to favor some applications by adjusting nice (CPU-priority) levels, but such adjustments apply to individual processes. For the most part, Linux desktop systems are managed at the process level, which is a poor fit to the problem space.

It should be possible to do better than that by making use of the features provided by control groups. Rather than treating processes equally, it is possible to treat users or applications equally, regardless of the number of processes they run. Control groups can also help to make the desktop more responsive; a desktop manager can use them to implement decisions based on factors like the importance of a service, whether a given user is active at the moment, or whether any given application has the focus.

It may be 2020, but problems like thrashing and out-of-memory (OOM) handling still afflict desktop systems. There has been some recent work done to improve this situation; for example, Fedora adopted the earlyoom daemon for the Fedora 32 release. Earlyoom is an example of a "memory-available" approach to the problem; the core idea is to ensure that the system always has enough memory available to hold the files needed by the workload in cache. That prevents the system from paging out (and faulting back in) the executables that the user is currently running. Memory-available approaches have the advantage of being able to use current information; they can poll the amount of available memory multiple times per second if need be.

Another approach uses the pressure-stall information exported by relatively recent kernels. Tools like low-memory-monitor, nohang, and oomd use this information to manage system memory. In one way this is a better approach, Berg said; pressure-stall information is a more reliable way to tell whether the system is thrashing or not. But it's also relatively slow, being calculated every ten seconds or so; that is too slow for a desktop.
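
For the curious, the kernel exposes this information under /proc/pressure; the file format below is the real interface, though the values shown are made up for illustration:

    $ cat /proc/pressure/memory
    some avg10=0.00 avg60=0.00 avg300=0.00 total=0
    full avg10=0.00 avg60=0.00 avg300=0.00 total=0

The avg10, avg60, and avg300 fields are rolling averages of the share of time that tasks were stalled on memory over ten-second, one-minute, and five-minute windows, respectively.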

Some distributions are also experimenting with faster swapping as a way of improving memory availability; swapping to a zram device can improve the responsiveness of a system, for example. There are a lot of approaches being tried in this area, he said.

But there is a fundamental problem: none of these approaches are effective at protecting graphical sessions and ensuring that they have the resources they need. The real goal is to ensure that critical processes like the graphical shell are responsive at all times. Problem applications must be identified and either isolated or killed. Managing available memory on a system-wide basis doesn't get that done.

Using control groups

Control groups can, though. They were designed for just this sort of process containment and control. By using the memory, CPU, and I/O-bandwidth controllers, a desktop manager should be able to contain thrashing and ensure the availability of resources for important processes.

That idea leads to the next question: how should these control groups be managed to create the desired level of process isolation? It turns out that systemd, through its management of control groups, provides the needed capabilities. So the GNOME project, and other desktop environments too, are starting to use it. "Starting" because many of the needed capabilities have only been available for the last year.

Getting there has involved setting up a per-user session bus in the D-Bus subsystem. A lot of fixes have been applied for proper session detection. It was necessary to port GNOME utilities to the systemd API. Tools like gnome-terminal have learned to create a new scope for each tab, providing proper process isolation. The KDE project, among others, is also working on systemd integration.

A set of conventions for the management of desktop sessions within systemd is under development. User sessions are split into three separate control groups:

  • session.slice contains the essential session processes — the desktop shell, for example.
  • app.slice contains normal applications.
  • background.slice contains "irrelevant stuff" like indexing processes or backup operations.

Everything that the user runs ends up in one of those groups. Each application gets its own control group under one of the above; the application ID is then encoded into the associated systemd unit name.
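
As an illustration of that naming (the specific unit name here is hypothetical), a terminal application with the ID org.gnome.Terminal might show up in the hierarchy as:

    app.slice
    └─ app-gnome-org.gnome.Terminal-2232.scope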

This arrangement allows control-group attributes to be modified on a per-slice or per-application basis; in other words, it is now possible to control resource availability in a per-application way. KDE has taken advantage of this feature to implement a new task manager that shows applications rather than processes. An application may run many processes, but that is not a detail that users are usually concerned about, and the number of processes should not affect the resources available to the application. A similar tool is being created for GNOME, making information on per-application resource usage easily available.

This work is not done, though. More tools still need to be migrated over to the systemd API; xdg-autostart applications are still not handled properly for example. Applications often launch other applications; think about a chat application launching a browser when the user clicks on a link. That browser should end up in its own group, rather than being grouped with the chat application. Applications are "just spawning stuff now"; they need to start making the proper calls to ensure that the grouping is handled correctly. KDE has a working API for this now, he said; the GNOME glib library is being updated as well.

The move to systemd is not complete at this point, and applications often still end up in the wrong control group. But it is "working in most cases" and already quite useful.

uresourced

Getting applications into the correct control groups is a step in the right direction; it ensures that applications, rather than individual processes, compete against each other. But that is not enough to create a truly responsive desktop; that requires adjusting the resources available to those control groups to implement the desired policy. To that end, he has been working on a resource manager called uresourced that enables and manages the CPU, memory, and I/O controllers for applications.

For example, uresourced tracks who the active user is and allocates 250MB of memory (or 10% of the total available, whichever is lower) to that user; that allocation is given to the session.slice group in particular so that it's available for the most important applications. The active user also gets a larger share of the available CPU time and I/O bandwidth. So applications (rather than processes) are now competing for the CPU, and the active user gets a larger share. The core session processes should be protected from thrashing regardless of the system workload; he confessed that he has yet to find an effective way of testing this behavior, though.
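
As a rough sketch of the mechanism involved (uresourced applies these settings dynamically; the values and unit name here are illustrative), the same sort of policy can be expressed by hand using systemd's standard resource-control properties:

    # Guarantee memory to the core session processes and give them a
    # larger share of CPU time and I/O bandwidth.  MemoryLow, CPUWeight,
    # and IOWeight are standard systemd resource-control settings.
    $ systemctl --user set-property session.slice MemoryLow=250M
    $ systemctl --user set-property session.slice CPUWeight=500
    $ systemctl --user set-property session.slice IOWeight=500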

This work is not complete; among other things, the I/O controller is not fully configured. Systemd doesn't yet have all of the necessary options to do this configuration; it can't use the io.cost (formerly io.weight) controller, for example. There are also complications that come up when the system is using the logical volume manager (LVM), which distributions like Fedora do. Adding a new daemon for this task seems excessive as well; Berg expects uresourced to eventually be absorbed into something else. The logical home would be systemd-logind, but it doesn't quite fit there, he said.

For now, though, uresourced will be shipped as part of the Fedora 33 release; Fedora 32 users can also try it out by installing it separately.

Berg wound down with a few remaining questions, starting with whether systemd-oomd (the integration of oomd with systemd) will work well on desktop systems. He hasn't yet tried it, but it seems like it should be able to detect problematic workloads. The real question is how to react when that happens. Options include freezing the offending process and asking the user what to do, simply killing it off, or trying to actively contain it with control-group parameters. For now, there's little to do but to try it out and see how it works.

He wonders whether user sessions are being isolated sufficiently. The use of the CPU, memory, and I/O-bandwidth controllers should be enough to do the job. He worries, though, that the desktop may end up being crippled if the I/O controller does not work well enough. He mentioned again problems that arise when LVM is in use, and the inability to use the io.cost controller. The features provided by the kernel are good, he said, but they may not yet be usable by desktop environments.

What else could be done? One thing would be to prioritize the application that currently has the desktop focus. That should be easy to do, he said: simply find the right control group and set the priorities accordingly. Perhaps some work could be done to save power as well; finding and freezing tasks that perform too many wakeups, for example.

There is clearly a lot of work yet to be done in this area, but it seems that progress is being made. Look for desktop systems to become more responsive in the near future.

The slides from this session [PDF] are available.

Comments (13 posted)

The winding road to PHP 8's match expression

By John Coggeshall
September 2, 2020

New to the forthcoming PHP 8.0 release is a feature called match expressions, a construct designed to address several shortcomings in PHP's switch statement. While it took three separate request-for-comments (RFC) proposals to get there, the new expression eventually received broad support for inclusion.

The match expression story began at the end of March 2020 with an RFC suggesting changes to the switch statement. The proposal, authored by Ilija Tovilo and Michał Brzuchalski, highlighted four shortcomings of switch: the inability to return values from the statement, matches falling through to the next case, "inexhaustiveness", and type coercion.

The problem with switch

The PHP switch statement is one of the oldest constructs in the language, and shares a few common traits with the C variety; the proposal suggests that these common behaviors aren't ideal for PHP. For example, each case within a switch will fall through to the next case unless there is a break (or continue) statement. Also like C, switch is not an expression and therefore does not return any values. If a developer wants to, for example, assign a value from the logic of a switch construct to a variable, they need to do so explicitly as part of the corresponding case statement.

The RFC further suggests that the way switch handles types is incorrect for modern PHP. Since PHP is fundamentally a dynamically typed language, switch employs type coercion (called "type juggling" in the PHP manual). This means passing a string "0" into a PHP switch statement will match against an integer case 0: block, which might make less sense as PHP continues to embrace types in modern versions. Finally, the RFC points out that the switch statement is "inexhaustive", meaning that it is not an error if no case block was found to match.
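
As a short sketch of those last two pitfalls (the variable names here are invented for illustration):

    <?php
    // Type coercion: the string "0" is compared loosely, so it
    // matches the integer case below.
    $input = "0";
    switch ($input) {
        case 0:
            $result = 'matched integer 0';  // this branch runs
            break;  // without break, execution falls through to case 1
        case 1:
            $result = 'matched 1';
            break;
        // With no default arm, an unmatched value is not an error;
        // execution silently continues ("inexhaustiveness") and
        // $result is never set.
    }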

The path to a solution

To address these concerns, the RFC proposed a new "expression variant" of switch:

    $result = switch($condition) {
        1 => foo(),
        2 => bar(),
        3, 4, 5 => baz(),
    };

This expression would operate in a similar fashion to the switch statement; however, the evaluated code of a matched condition would be returned as an assignable value. The switch expression would also be more strict than its statement counterpart, eliminating fall-throughs and throwing an exception if a match was not found.

When Tovilo introduced the RFC (Brzuchalski did not participate in the discussion), it became clear that, as written, the proposal wasn't going to go far. In a limited discussion, Dan Ackroyd responded with several fundamental problems regarding the approach. While Ackroyd agreed there were issues with switch, he did not feel the RFC was a "good starting point" for that discussion. Ackroyd itemized his concerns; because the RFC did not perform type coercion when comparing values, for example, "it's way less interesting to me". He also believed that the proposed expression should employ a new keyword, avoiding the risk of confusing developers by reusing switch.

The initial response to the RFC prompted Tovilo to conduct a poll to try to find the best way forward for the concept. When sharing the informal vote with the community, Tovilo explained the reasoning:

There's been a fundamental disagreement on what the switch expression should actually be. Due to the conflicting feedback I no longer know how to proceed.

In response to the poll (and the justification for it), Rowan Tommins said "I think this confusion is there in your proposal, not just in people's responses to it", adding "I think changing the keyword to 'match', and possibly using strict comparison [not type coercion], would make a compelling feature."

In the end, the poll received five responses, not enough to build any consensus around the value of the proposal. The lack of appetite by the internals community for the RFC led Tovilo and Brzuchalski to withdraw it from consideration before a vote, replacing it the next day with a different proposal without Brzuchalski.

The new proposal introduced the match keyword, addressed type coercion, and fixed ambiguities in the original RFC structure to make the idea more clear; it yielded significantly more discussion than the original. Tommins seemed to appreciate the changes, yet still had several concerns. For example, Tovilo had suggested that the proposed match expression "allows you to use a match expression anywhere the switch statement would have been used previously." That was an idea Tommins took issue with:

I don't think it's necessary for the new keyword to be a replacement for every switch statement, any more than switch replaces every if statement or vice versa, and doing so adds a lot of complexity to the proposal.

Tovilo and Tommins, along with other community members, continued to debate the proposal at length, mostly with regard to the statement blocks being added for match. Here is an example of a statement block from the proposal:

    $y = match ($x) {
        0 => {
            foo();
            bar();
            baz(); // This value is returned
        },
    }

PHP currently has no concept of statement blocks, and many community members found it strange to see them included as part of the proposed match expression syntax. Tommins, in particular, felt strongly that it was a mistake, though he liked the proposal otherwise; he expanded on this problem later in the thread, suggesting that statement blocks could be added in the future with more consideration. Another community member, Larry Garfield, supported Tommins in his suggestion to move statement blocks to a different proposal:

I really feel like this is the best solution. Multi-line expressions are not a logical one-off part of match(). They may or may not be a good idea for PHP generally, in which case they should be proposed and thought through generally. If they do end up being a good move generally, they'll apply to match naturally just like everywhere else; if not, then they wouldn't confuse people by being a one-off part of match() and nowhere else.

Tovilo declined to remove the proposed feature. Tommins also suggested separating the statement block issue into another vote within the RFC. Still, Tovilo didn't agree, saying:

If we were to remove blocks we'd probably also reconsider other things (namely the optional semicolon and break/continue) and many examples in the RFC would become invalid and irrelevant. This would probably lead to even more confusion which is why I will most likely not move blocks to an additional vote.

However, I will definitely include a poll to find out why it failed. I am committed to getting this into the language, in some form or another.

When voting opened for the proposal in late April 2020, it was handily defeated in a 28 to 6 vote. A secondary vote within the RFC, "If you voted no, why?", indicated that one of the primary reasons that the proposal failed was the inclusion of statement blocks.

For a time, it appeared that was the end of the discussion regarding the match expression. However, in late May 2020, Tovilo returned with a third proposal for a match expression. This third RFC importantly did not include statement blocks and, unlike previous proposals, focused solely on the core features the match expression needed to have.

The policy on rejected RFCs within the PHP community states that six months must pass between the rejection of a proposal and its re-submission, unless the "author(s) make substantial changes to the proposal." Tovilo argued that this exception applied, as "many people have said without blocks they'd vote yes". This argument was accepted without much challenge from the community, allowing the proposal to move forward.

The final proposal addressed all of the fundamental concerns that Tovilo originally laid out, wrapped neatly into a new expression: match expressions use strict comparison (no type coercion), throw an exception if a match is not found, and eliminate fall-through entirely. Perhaps unsurprisingly, this less expansive version of Tovilo's proposal was received positively; most comments and discussion from the community were about using => or : in the syntax. The RFC passed in a 43 to 2 vote in time for inclusion into PHP 8.0. Below is a representative example of how match expressions in PHP 8 are used:

    $foo = getSomeValue();

    try {
        // this uses strict checking of types when looking
        // for a match (0 != "0").
        $result = match($foo) {
            404 => 'Page not found',
            Response::REDIRECT => 'Redirect',
            $client->getCode() => 'Client error',
        };
    } catch (UnhandledMatchError $ex) {
        print "Did not match!";
    }

While it took a few tries, the match expression looks to be a useful addition to the language. More examples of the match expression in PHP 8.0 are available for interested readers; it is also available in the latest PHP 8.0beta2 release.

Comments (11 posted)

Supporting Linux kernel development in Rust

August 31, 2020

This article was contributed by Nelson Elhage

LPC

The Rust programming language has long aimed to be a suitable replacement for C in operating-system kernel development. As Rust has matured, many developers have expressed growing interest in using it in the Linux kernel. At the 2020 (virtual) Linux Plumbers Conference, the LLVM microconference track hosted a session on open questions about and obstacles to accepting Rust upstream in the Linux kernel. The interest in this topic can be seen in the fact that this was the single most heavily attended session at the 2020 event.

This session built on prior work by many developers, including a talk last year by Alex Gaynor and Geoffrey Thomas [YouTube] at the Linux Security Summit. At that talk, they presented their work prototyping Rust kernel modules and made the case for adopting Rust in the kernel. They focused on security concerns, citing work showing that around two-thirds of the kernel vulnerabilities that were assigned CVEs in both Android and Ubuntu stem from memory-safety issues. Rust, in principle, can completely avoid this error class via safer APIs enabled by its type system and borrow checker.

Since then, Linus Torvalds and other core kernel maintainers have expressed openness in principle to supporting kernel development in Rust, so the session at Plumbers aimed to work through some of the requirements to eventually allowing Rust in-tree. The session was proposed and discussed on the linux-kernel mailing list, where some of the topics of discussion were previewed.

This session, too, featured Thomas and Gaynor, along with Josh Triplett — the Rust language team co-leader and a longtime Linux kernel developer — and a number of other interested developers. They briefly touched on their work so far and some of their initial thoughts and questions before opening the bulk of the time to discussion. They gave a brief example of what kernel-mode Rust code might look like (from Thomas and Gaynor's linux-kernel-module-rust project).

The speakers emphasized that they are not proposing a rewrite of the Linux kernel into Rust; they are focused only on moving toward a world where new code may be written in Rust. The ensuing conversation focused on three areas of potential concern for Rust support: making use of the existing APIs in the kernel, architecture support, and a question about ABI compatibility between Rust and C.

Binding to existing C APIs

In order to be useful for kernel development, it's not enough that Rust is able to generate code that can be linked into the kernel; there also needs to be a way for Rust to access the vast number of APIs used in the Linux kernel, which are all presently defined in C header files. Rust has good support for interoperating with C code, including support for both calling functions using the C ABI and for defining functions with C-compatible ABIs that can be called from C. Furthermore, the bindgen tool is capable of parsing C header files to produce the appropriate Rust declarations, so that Rust does not need to duplicate definitions from C, which also provides a measure of cross-language type checking.
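
As a sketch of what that looks like (the C declaration and the generated Rust here are illustrative; bindgen's real output carries additional annotations):

    // Given a C header containing:
    //
    //     int add(int a, int b);
    //
    // bindgen generates a Rust foreign-function declaration roughly like:
    extern "C" {
        pub fn add(a: ::std::os::raw::c_int,
                   b: ::std::os::raw::c_int) -> ::std::os::raw::c_int;
    }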

On the surface, these features make Rust well-equipped to integrate with existing C APIs, but the devil is in the details, and both the work to date and the conversation at the session revealed a handful of open challenges. For example, Linux makes heavy use of preprocessor macros and inline functions, which aren't easily supported by bindgen and Rust's foreign-function interface.

The ubiquitous kmalloc() function, for instance, is defined as __always_inline, meaning that it is inlined into all of its callers and no kmalloc() symbol exists in the kernel symbol table for Rust to link against. This problem can be easily worked around — one can define a kmalloc_for_rust() symbol containing an un-inlined version — but performing these workarounds by hand would result in a large amount of manual work and duplicated code. This work could potentially be automated by an improved version of bindgen, but such a tool does not yet exist.
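
A sketch of that workaround might look like the following; the kmalloc_for_rust() name comes from the example above, while the signatures are assumptions based on kmalloc()'s C prototype:

    // C side, compiled into the kernel (illustrative):
    //
    //     void *kmalloc_for_rust(size_t size, gfp_t flags)
    //     {
    //             return kmalloc(size, flags);
    //     }
    //
    // Rust side: bind to the now-linkable symbol.
    use core::ffi::c_void;

    extern "C" {
        fn kmalloc_for_rust(size: usize, flags: u32) -> *mut c_void;
    }

    // The caller must uphold the kernel's allocation rules, so the
    // call is unsafe from Rust's point of view.
    unsafe fn alloc_buffer(len: usize, gfp_flags: u32) -> *mut c_void {
        kmalloc_for_rust(len, gfp_flags)
    }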

The conversation also touched on a second question about API bindings: how much will C APIs need to be manually "wrapped" to present idiomatic Rust interfaces? A look at two existing Rust kernel module projects gives a flavor for some of the choices here.

In the linux-kernel-module-rust project, pointers into user space are wrapped into a UserSlicePtr type, which ensures appropriate use of copy_to_user() or copy_from_user(). This wrapper provides a level of safety in Rust code (these pointers can't be dereferenced directly), and also makes Rust code more idiomatic; writing to a user-space pointer looks something like

    user_buf.write(&kernel_buffer)?;

The ? here is part of Rust's error-handling machinery; this style of returning and handling errors is ubiquitous in Rust. Such wrappers make the resulting Rust more familiar to existing Rust developers, and enable Rust's type system and borrow checker to provide a maximum amount of safety. However, they must be carefully designed and developed for each API, which is a lot of work and creates distinct APIs for modules written in C and Rust.

John Baublitz's demo module, instead, binds the kernel's user-access functions more directly; the corresponding code there looks something like:

    if kernel::copy_to_user(buf, &kernel_buffer[0..count]) != 0 {
        return -kernel::EFAULT;
    }

This style is easy to implement — the bindings are largely autogenerated by bindgen — and would also be more comfortable for existing kernel developers who have to review or patch Rust code. However, the code is much less idiomatic for Rust developers, and potentially gives up a lot of the safety guarantees that Rust promises.

There was some agreement at the session that writing Rust wrappers will make sense for some of the most common and critical APIs, but that manually wrapping every kernel API would be infeasible and undesirable. Thomas mentioned that Google is working on automatically generating idiomatic bindings to C++ code, and pondered whether the kernel could do something similar, perhaps building on top of existing sparse annotations or some new annotations added to the existing C to guide the binding generator.

Architecture support

The next area of discussion was architecture support. At present, the only mature Rust implementation is the rustc compiler, which emits code via LLVM. The Linux kernel supports a wide range of architectures, several of which have no available LLVM backend. For a few others, an LLVM backend exists, but rustc does not yet support that backend. The presenters wanted to understand whether full architecture support was a blocker to enabling Rust in the kernel.

Several people said that it would be acceptable to implement drivers in Rust that would never be used on the more obscure architectures anyway. Triplett suggested that adding Rust into the kernel would help drive increased architecture support for Rust, citing his experience with the Debian project. He mentioned that introducing Rust software into Debian helped to motivate enthusiasts and users of niche architectures to improve Rust support, and he expected that adding support to the kernel would have a similar effect. In particular, he was confident that any architecture with an LLVM backend would quickly be supported in rustc.

The conversation also discussed alternate Rust implementations as a path toward broader architecture support. The mrustc project is an experimental Rust compiler that emits C code. Using mrustc would potentially let Rust be compiled via the same C compiler that was compiling the rest of the kernel.

In addition, Triplett cited some interest in — and work toward — a Rust front end for GCC, potentially enabling Rust to target any architecture GCC supports. This project is in an early stage, but it presents another avenue toward closing the architecture gap in the future. The conclusion from this section was a little uncertain, but there did not seem to be strong pushback against the idea of supporting Rust device drivers without waiting for broader architecture support.

ABI compatibility with the kernel

Gaynor also asked for advice on a question of ABI compatibility. Since Rust is (currently) compiled via LLVM, and the kernel is most commonly built with GCC, linking Rust code into the kernel may mean mixing code emitted by GCC and LLVM. Even though LLVM aims to be ABI-compatible with GCC, there has been some pushback based on concerns that this strategy created a risk of subtle ABI incompatibilities. The presenters wondered whether the kernel community would prefer to limit Rust support to kernels built with Clang in order to ensure compatibility.

Greg Kroah-Hartman confirmed that the current kernel rule was that compatibility is only guaranteed if all object files in the kernel are built with the same compiler, using identical flags. However, he also expressed comfort with linking LLVM-built Rust objects into a GCC-built kernel as long as the objects are built at the same time, with the appropriate options set, and the resulting configurations are fully tested. He did not feel the need for any additional restrictions until and unless actual problems arise. Florian Weimer clarified that ABI issues tend to be in obscure corners of the language — for instance, returning a struct containing a bitfield by value — and that he would expect that the core, commonly-used parts of the ABI should pose no compatibility problems.

Triplett emphasized that calling between GCC and Rust was routine and widespread in user space, and so from the Rust side he has no concerns about compatibility. It sounded like this concern should not, in the end, be an impediment to bringing Rust into the kernel.

Conclusions

The session ended without any further specific next steps, but it seems that, overall, there is enthusiasm for eventually supporting Rust modules along with increasing agreement on the broad requirements for that support. The next big step will likely be when someone proposes a real Rust driver for inclusion into the kernel. A concrete use case and implementation always helps to force clarity about any remaining contentious questions and design decisions.

Comments (66 posted)

Software and hardware obsolescence in the kernel

By Jonathan Corbet
August 28, 2020

LPC
Adding code to the kernel to support new hardware is relatively easy. Removing code that is no longer useful can be harder, mostly because it can be difficult to know when something is truly no longer needed. Arnd Bergmann, who removed support for eight architectures from the kernel in 2018, knows well just how hard this can be. At the 2020 Linux Plumbers Conference, he led two sessions dedicated to the topic of obsolete software and hardware. With a bit of effort, he said, it should be possible to have a better idea of when something can be removed.

The software side

Obsolete hardware, he said, can be defined as devices that are no longer being made, usually because they have been superseded by newer, cheaper, and better products. Obsolete hardware can still be useful, and often remains in use for a long time, but it's hard to know whether any specific device is still used. Obsolete code is a bit different; the hardware it enables might still be in production, but all of its users are running older kernels and are never going to upgrade. In such cases, the code can be removed, since nobody benefits from its ongoing maintenance.

Bergmann's proposal is to create a way of documenting which code in the kernel is there solely for the support of obsolete hardware; in particular, it would note the kernel configuration symbols associated with that hardware. For each symbol, the document would describe why it is still in use and for how long that situation is expected to continue. The consequences of removing this support (effects on other drivers that depend on it, for example) would be noted, as would the benefits that would come from removing it.

There are various groups that would be impacted by this change. The kernel retains support for a number of hobbyist platforms, for example; these include processor architectures with no commercial value but an ongoing hobbyist following. The kernel still supports a number of Sun 3 workstation models; he has no idea whether anybody is actually running such systems or not. Kernel developers generally like to keep hobbyist platforms alive as long as somebody is willing to maintain them.

Then there are platforms with few users, but those users may really need them. These include various types of embedded systems, industrial controllers, military systems, and more. There are also systems that are clearly headed toward obsolescence in the future. These include 32-bit architectures which, while still heavily used now, will eventually go away. Systems with big-endian byte order have declined 90% in the last ten years, and may eventually vanish entirely.

So where should this sort of information be documented? He proposed a few options, including a new file in the documentation directory, in the Kconfig files that define the relevant configuration symbols, somewhere on wiki.kernel.org, or somewhere else entirely. Responding to a poll in the conference system, just over half of the attendees indicated a preference for a file in the kernel tree.

At this point your editor had to jump in and ask how this idea compares to the kernel's feature-removal-schedule.txt file. This file was added in 2005 as a way of warning users about features that would go away soon; this file itself was removed in 2012 after Linus Torvalds got fed up with it. Why should the fate of this new file be different? Bergmann responded that this file would not be a schedule for removal of support; instead, it would be a way of documenting that support needs to be kept for at least a certain period of time. Users of the affected hardware could easily update the file at any time to assure the community that they still exist. As documentation for the reasons to keep support in the kernel, it would be more useful.

Florian Weimer asked what the effect would be on user space if this proposal were adopted; the answer was "none". User-space interfaces are in a different category, Bergmann said, with a much higher bar to be overcome before they can be removed. This file would cover hardware-specific code. Mike Rapoport added that it would be a way to know when users would be unaffected, once it becomes clear that nobody is running upstream kernels on the hardware in question.

Catalin Marinas suggested creating a CONFIG_OBSOLETE marker for code that supports obsolete hardware, but Will Deacon was nervous about that idea. He recently did some work on the page-table code for 32-bit SPARC machines; he got no comments on those changes, but he did get reports when various continuous-integration systems tested them. A CONFIG_OBSOLETE marker might be taken as a sign by the maintainers of such systems that the code no longer needs to be tested, reducing the test coverage significantly.
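
For concreteness, the marker Marinas suggested might look something like this in Kconfig terms (a hypothetical sketch; no such option exists in the kernel):

    config OBSOLETE
            bool "Include support for obsolete hardware"

    config SUN3
            bool "Sun 3 support"
            depends on OBSOLETE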

Bergmann added that 32-bit SPARC is an interesting case. People have been finding serious bugs in that code, he said, and System V interprocess communication isn't working at all. There is a lot of testing of 32-bit SPARC user space, but the tests all run on 64-bit kernels, where these problems do not exist. He is confident that this code has few — if any — remaining users.

Len Brown returned to the question of the old feature-removal-schedule.txt file, asking what had been learned from that experience. Bergmann replied that his proposed documentation is intended to help plan removal. It is, he said, a lot of work to try to figure out if a particular feature is being used by anybody; documenting users in this way would reduce that work considerably. Laurent Pinchart added that this information could also be useful for developers who would like to find users of a given piece of hardware to test a proposed change.

As the session came to a close, James Bottomley noted that this kind of problem arises often in the SCSI subsystem, which tends to "keep drivers forever". Eventually, though, internal API changes force discussions on the removal of specific drivers, but that is always a hard thing to do. It is easy to say that a removed driver can always be resurrected from the Git history if it turns out to be needed, but that doesn't work out well in practice.

Bergmann ended things by noting that the maintainer of a given driver is usually the person who knows that nobody is using a given device. But once that happens, the maintainer often goes away as well, taking that knowledge with them. At that point, it's nobody's job to remove the code in question, and it can persist for years.

System-on-chip obsolescence

Bergmann returned to this topic in another session dedicated to the life cycle of system-on-chip (SoC) products. Having spent a lot of time working on various aspects of architecture support in the kernel, he has learned a few things about how support for SoCs evolves and what that might mean for the architectures currently supported by the kernel.

There are, he said, five levels of support for any given SoC in the kernel:

  1. Full upstream support, with all code in mainline, all features working, and new kernel features fully supported.
  2. Minimal upstream support, but fixes and updates still make it into the stable kernel releases.
  3. Updates in mainline are sporadic at best; perhaps fixes go into the long-term-support kernels.
  4. No more upstream support; users are getting any updates directly from the vendor.
  5. The system runs, but there are no updates or ongoing support in the mainline kernel. There might still be code in the kernel, but it is not used by anybody.

The most important phase for SoC support is the bringup stage, when things are first made to work; if at all possible, that support should be brought all the way to the "full support" level. The level of support normally only goes down from there. People stop applying updates and, eventually, those updates stop appearing at all.

Problems at bringup tend to happen in fairly predictable areas, with GPU drivers being at the top of the list. That said, the situation has gotten a lot better in recent times, with increasing numbers of GPUs having upstream support. Android patches can be another sticking point; that, too, is improving over time. Short time to market and short product lifespan can both be impediments to full support as well.

Bergmann put up a diagram displaying the "CPU architecture world map" as of 2010; it can be seen on page 6 of his slides [PDF]:

[CPU architecture world map]

This map plots architectures used in settings from microcontrollers through to data-center applications on one dimension, and their affinity to big-endian or little-endian operation on the other. These architectures were spread across the map, with IBM Z occupying the big-endian, data-center corner, and numerous architectures like Blackfin and unicore32 in the little-endian, microcontroller corner.

There were a lot of architectures available at that time, he said, and the future looked great for many of them. The Arm architecture was "a tiny thing" only used on phones, not particularly significant at the time. But phones turned out to be the key to success for Arm; as it increased its performance it was able to eliminate most of the others.

The SoC part of the market, in particular, is represented by the middle part of the map: systems larger than microcontrollers, but smaller than data-center processors. There are three generations of these that are important to the kernel. The first, typified by the Armv5 architecture, came out around 2000 and is still going strong; these are uniprocessor systems with memory sizes measured in megabytes. The Armv7-A generation launched in 2007 with up to four cores on an SoC and memory sizes up to 2GB; this generation is completely dominated by Arm processors. Finally, the Armv8-A (and x86-64) generation, beginning in 2014, supports memory sizes above 2GB and 64-bit processors.

He discussed memory technologies for a while, noting that DDR3 memory tends to be the most cost-effective option for sizes up to 2-4GB, but it is not competitive above that. That's significant because middle-generation processors cannot handle DDR4 memory.

The only reason to go with first-generation processors, he said, is if extremely low cost is the driving factor. For most other applications, 64-bit systems are taking over; they are replacing 32-bit SoCs from a number of vendors. The middle, Armv7-A generation is slowly being squeezed out.

Kernel support implications

So what are the implications for kernel support? He started with a plot showing how many machines are currently supported by the kernel; most of those, at this point, are described by devicetree files. There are a few hundred remaining that require board files (compiled machine descriptions written as C code). He suggested that the time may be coming when all board-file machines could be removed; if those machines were still in use, he said, somebody would have converted them to devicetree.

By 2017, it became clear that many architectures were approaching the end of their lives; that led to the removal of support for eight of them in 2018. Some remaining architectures are starting to look shaky; there will probably be no new products for the Itanium, SPARC M8, or Fujitsu SPARC64 processors, for example. The industry is coalescing mostly on the x86 and Arm architectures at this point.

Those architectures clearly have new products coming out in 2020 and beyond, so they will be around for a while. There are some others as well. The RISC-V architecture is growing quickly. The IBM Power10 and Z15 architectures are still being developed. Kalray Coolidge and Tachyum Prodigy are under development without in-kernel support at this point. There is a 64-bit version of the ARC architecture under development with no kernel support yet. There are still MIPS chips coming out from vendors like Loongson and Ingenic and, perhaps surprisingly, still SoCs based on the 20-year-old Armv5 core being introduced.

Big-endian systems are clearly on their way out, he said. There were a number of architectures that supported both; most are moving to little-endian only. SPARC32 and OpenRISC are holdouts, but their users are expected to migrate to RISC-V in the relatively near future. About the only architecture still going forward with big-endian is IBM Z.

There are some new SoC architectures in the works. The most significant one is RISC-V, with numerous SoCs from various vendors. Expectations for RISC-V are high, but there are still no products supported in the kernel. The ARC architecture has been around for 25 years and remains interesting; it sees a lot of use in microcontrollers. There is not much support for 32-bit ARC SoCs in the kernel, and no support yet for the upcoming 64-bit version. That support is evidently under development, though.

Where does all this lead? Bergmann concluded with a set of predictions for what the situation will be in 2030. The market will be split among the x86-64, Armv8+, and RISC-V architectures, he said; it will be difficult for any others to find a way to squeeze in. The upstreaming of support for these architectures in the kernel will continue to improve. IBM Z mainframes will still be profitable.

The last Armv7 chips, meanwhile, have already been released, but they will still be shipping in 2030 (and in use for long after that). So 32-bit systems will still need to be supported well beyond 2030. For those reasons and more, he is not expecting to see further removals of architecture support from the kernel for at least the next year.

At the other end, 128-bit architectures, such as CHERI, will be coming into their own. That is likely to be a huge challenge to support in the kernel. The original kernel only supported 32-bit systems until the port to the Alpha architecture happened; that port was only feasible because the kernel was still quite small at the time. The (now much larger) kernel has the assumption that an unsigned long is the same size as a pointer wired deeply into it; breaking that assumption is going to be a painful and traumatic experience. Fixing that may be a job for a new generation of kernel hackers.

Comments (124 posted)

"Structural pattern matching" for Python, part 2

By Jake Edge
September 1, 2020

We left the saga of PEP 622 ("Structural Pattern Matching") at the end of June, but the discussion of a Python "match" statement—superficially similar to a C switch but with extra data-matching features—continued. At this point, the next steps are up to the Python steering council, which will determine the fate of the PEP. But there is lots of discussion to catch up on from the last two months or so.

As a quick review, the match statement is meant to choose a particular code block to execute from multiple options based on conditions specified for each of the separate case entries. The conditions can be simple, as in:

    match x:
        case 1:
            print('1')
        case 2:
            print('2')

That simply compares the value of x to the numeric constants, but match can do far more than that. It is a kind of generalized pattern matching that can instantiate variables in the case condition when it gets matched, as the following example from the PEP shows:
    # The target is an (x, y) tuple
    match point:
        case (0, 0):
            print("Origin")
        case (0, y):
            print(f"Y={y}")
        case (x, 0):
            print(f"X={x}")
        case (x, y):
            print(f"X={x}, Y={y}")
        case _:
            raise ValueError("Not a point")

In the second case, for example, y will be instantiated to the value of the second element in the point tuple if the first is zero; that allows its value to be printed. This example also shows the "_" value being used as the match-all wildcard, though that character was not an entirely popular choice; it is, however, commonly used by other languages for that purpose. Even that example barely scratches the surface of the power and reach of the proposed match statement.

Anti-PEP

On the other hand, there are some cognitive hurdles that folks will need to clear in order to understand this new syntax—and that's what is driving many of the complaints about the feature. Within a day after Guido van Rossum posted the first version of the PEP, he asked that commenters refrain from piling onto the huge thread so that the authors could have some time to hash things out. For the most part, that is what happened, but there was some concern that perhaps the opposition's arguments against the PEP would not get a full airing in the PEP itself. Mark Shannon, who had earlier questioned the need for the feature, proposed an "Anti-PEP" as a mechanism for opponents to marshal their arguments:

When deciding on PEP 484 ["Type Hints"], I had to decide between a formally written PEP on one hand, and a mass of emails and informal polls I had done at conferences, on the other. I hope I made the right decision.

Whether the ultimate decision is made by the steering committee or by a PEP delegate, it is hard to make a decision between the pros and cons, when the pros are in a single formal document and the cons are scattered across the internet.

An Anti-PEP is a way to ensure that those opposed to a PEP can be heard and, if possible, have a coherent voice. Hopefully, it would also make things a lot less stressful for PEP authors.

Notably, an Anti-PEP would not propose an alternative, it would just collect the arguments against a proposed PEP. But Brett Cannon (and others) thought that there is another mechanism to be used:

[...] that's what the Rejected Ideas section is supposed to capture. If a PEP is not keeping a record of what is discussed, including opposing views which the PEP is choosing not to accept, then that's a deficiency in the PEP and should be fixed. And if people feel their opposing view was not recorded properly, then that should be brought up.

Most other commenters agreed that the "Rejected Ideas" section (or perhaps a new "Objections" section) is the right place to record such arguments, though Raymond Hettinger agreed with Shannon about the need for a more formalized opposition document: "The current process doesn't make it likely that a balanced document is created for decision making purposes." They were in the minority, however, so a formal anti-PEP process seems improbable.

On July 1, Van Rossum announced a match statement "playground" that can be used to test the proposed feature. It is a Binder instance that runs a Jupyter Python kernel that has been modified with a "complete implementation of the PEP". But the playground and other aspects of the process made Rob Cliffe uneasy; he was concerned that the PEP was being "railroaded through".

However, PEP 622 seems to have been presented to the Python community only *after* a well-developed (if not finalised) implementation was built.  A fait accompli.  So there will inevitably be resistance from the developers to accept changes suggested on python-dev.  And since the PEP has Guido's authority behind it, I think it is likely that it will eventually be accepted pretty much as it was originally written.

Cliffe was under the impression that PEPs are never implemented until they have been accepted, but several people in the thread pointed out that is not the case. Chris Angelico said:

Speaking with my PEP Editor hat on, I would be *thrilled* if more proposals came with ready-to-try code. Only a very few have that luxury, and a lot of the debating happens with nothing but theory - people consider what they *think* they'd do, without actually being able to try it out and see if it really does what they expect. Having a reference implementation isn't necessary, of course, but it's definitely a benefit and not a downside. Also, there HAVE been proposals with full reference implementations that have ended up getting rejected; it's not a guarantee that it'll end up getting merged.

Round 2

The second version of the PEP was announced on July 8. Van Rossum noted that the __match__() protocol (for customizing a class's matching behavior) had been removed at the behest of Daniel Moisset, who was added as the sixth author of the PEP (with Van Rossum, Brandt Bucher, Tobias Kohn, Ivan Levkivskyi, and Talin). The protocol is not required for the feature; "postponing it will allow us to design it at a later time when we have more experience with how `match` is being used". Moisset was added in part for his contribution of new text that "introduces the subject matter much more gently than the first version did".

The other big change was to drop the leading dot for constants (e.g. .CONST) in the case statements. The problem addressed by that feature is that identifier strings in the patterns to be matched can either be a variable to be stored into if there is a match or a constant value to be looked up in order to match it (sometimes called "load-and-compare" values); Python cannot determine which it is without some kind of syntactical element or convention (e.g. all uppercase identifiers are constants). Consider the following example from the announcement:

    USE_POLAR = "polar"
    USE_RECT = "rect"
    [...]
    match t:
        case (USE_RECT, real, imag):
            return complex(real, imag)
        case (USE_POLAR, r, phi):
            return complex(r * cos(phi), r * sin(phi))

Python cannot distinguish the USE_RECT constant, which should cause it to only match tuples with "rect" as the first element, from the real and imag variables that should be filled in with the values from the match. Since the original choice of prepending a dot to the constants was quite unpopular, that was removed. It is a problem that other languages with this kind of match have struggled with and the PEP authors have as well; it was the first issue in their bug tracker. Adding a sigil for the to-be-stored variables, as has been suggested (e.g. ?real), "makes this common case ugly and inconsistent". In the end, they have decided to only allow constants that come from a namespace:

No other language we’ve surveyed uses special markup for capture variables; some use special markup for load-and-compare, so we’ve explored this. In fact, in version 1 of the PEP our long-debated solution was to use a leading dot. This was however boohed off the field, so for version 2 we reconsidered. In the end nothing struck our fancy (if `.x` is unacceptable, it’s unclear why `^x` would be any better), and we chose a simpler rule: named constants are only recognized when referenced via some namespace, such as `mod.var` or `Color.RED`.
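
As a sketch of that rule in action (adapting the example above; the Use class and to_complex() function are invented for illustration):

    from math import cos, sin

    class Use:
        POLAR = "polar"
        RECT = "rect"

    def to_complex(t):
        # Use.RECT and Use.POLAR are referenced through a namespace, so
        # they are load-and-compare constants; real, imag, r, and phi
        # are bare names, so they are capture variables.
        match t:
            case (Use.RECT, real, imag):
                return complex(real, imag)
            case (Use.POLAR, r, phi):
                return complex(r * cos(phi), r * sin(phi))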

If that part of the proposal is a "deal-breaker", Van Rossum said, then, when any other problems have been resolved, that decision could be reconsidered. He also outlined the other outstanding items, with the authors' position on them:

There’s one other issue where in the end we could be convinced to compromise: whether to add an `else` clause in addition to `case _`. In fact, we probably would already have added it, except for one detail: it’s unclear whether the `else` should be aligned with `case` or `match`. If we are to add this we would have to ask the Steering Council to decide for us, as the authors deadlocked on this question.

Regarding the syntax for wildcards and OR patterns, the PEP explains why `_` and `|` are the best choices here: no other language surveyed uses anything but `_` for wildcards, and the vast majority uses `|` for OR patterns. A similar argument applies to class patterns.

That post predictably set off yet another mega-thread on various aspects of the new syntax. The alignment of else (if added) was one such topic; Stefano Borini suggested that the code to be executed should not be indented two levels in from the match, but should instead look more like an if statement. Glenn Linderman took that further:

That would also sidestep the dilemma of whether else: (if implemented) should be indented like case: or like match: because they would be the same.

    match:
        t
    case ("rect", real, imag):
        return complex(real, imag)
    case ("polar", r, phi):
        return complex(r * cos(phi), r * sin(phi))
    else:
        return None

He compared that syntax favorably with that of try/except/else blocks, in addition to it resolving the else-alignment question. The PEP authors had addressed the idea, noting that there are two possibilities: either the match expression is in its own single-statement block (as Linderman has it), which is unique in the language, or the match and its expression are introducing a block with a colon, but that block is not indented like every other block after a colon in Python.

    match:
        expression
    case ...
    
    # or
    
    match expression:
    case ...

Either would violate a longstanding expectation in Python, so the authors rejected both possibilities.

Larry Hastings wondered about the special treatment being given to the "_" wildcard match. That symbol acts like a regular identifier, except in case statements, where it does not get bound (assigned to) for a match; it can also be used more than once in a case, which is not allowed for other match variables:

    match x:
        case (_, _):  # match any tuple
            print(_)  # _ will not have a value
        case (x, x):  # ILLEGAL

Hastings argued that if the same variable can be used more than once and that underscore does get bound, the special case disappears. The cost is an extra store for the binding, which could be optimized away as a dead store if that was deemed important. Moisset pointed out a few technical hurdles, and Van Rossum added more, but also thought it really did not make sense to do the binding for a value that is being explicitly described as a "don't care":

When I write `for x, _, _ in pts` the main point is not that I can write `print(_)` and get the z coordinate. The main point is that I am not interested in the y or the z coordinates (and showing this to the reader up front). The value assigned to `_` is uninteresting [...]

The need for a wildcard pattern has already been explained -- we really want to disallow `Point(x, y, y)` but we really need to allow `Point(z, _, _)`. Generating code to assign the value to `_` seems odd given the clear intent to *ignore* the value.

Compelling case?

As he was with the first version, Shannon is far from convinced that Python needs match and that the PEP actually lays out a compelling case for it. He returned to that question in a mid-July post that pointed out several flaws that he saw in the justification for the feature. In general, the examples in the PEP do not show any real-world uses of the new syntax that he finds compelling. As he said in a followup message: "I worry that the PEP is treating pattern matching as an ideal which we should be striving towards. That is a bad thing, IMO."

Others disagreed with that view, however. Kohn, one of the PEP authors who participated in that discussion, started a thread of his own with a different way to look at the feature. It is not, he said, adding a switch statement to Python; instead, it is providing a mechanism for doing function overloading in various guises:

Indeed, pattern matching is much more closely related to concepts like regular expressions, sequence unpacking, visitor patterns, or function overloading.  In particular, the patterns themselves share more similarities with formal parameters than with expressions.

Kohn went through a few detailed examples of function overloading and the visitor design pattern, showing how the proposed match feature would bring multiple benefits to the code. It is worth a read, especially for those who may not fully see the scope of what the feature can do.
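
To give a flavor of those examples, here is a simple, hypothetical expression-tree evaluator (it is not taken from Kohn's post); a traditional implementation would use a visitor class with one method per node type, while match dispatches on the node's class and destructures it in one step:

    from dataclasses import dataclass

    @dataclass
    class Num:
        value: float

    @dataclass
    class Add:
        left: object
        right: object

    @dataclass
    class Mul:
        left: object
        right: object

    def evaluate(node):
        match node:
            case Num(value=v):
                return v
            case Add(left=l, right=r):
                return evaluate(l) + evaluate(r)
            case Mul(left=l, right=r):
                return evaluate(l) * evaluate(r)
            case _:
                raise ValueError(f"unknown node: {node!r}")

    # evaluate(Add(Num(2), Mul(Num(3), Num(4)))) returns 14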

On the flip side, though, Shannon posted a link to his lengthy deconstruction of PEP 622, which was met with skepticism, at best; several responders felt that it was simply a rehash of arguments that had already been made. Van Rossum effectively dismissed it entirely: "[...] it just repeats Mark's own arguments, which are exclusively focused on the examples in the PEP (it's as if Mark read nothing *but* the examples)".

But Shannon (and others) pointed to his analysis of code in the standard library, which is linked in the critique. He concluded that, in the roughly 600,000 lines of Python code in CPython, there were only three or four places where the match statement made things clearer. While Stephen J. Turnbull is in favor of the PEP, he agreed that Shannon's analysis of the standard library was useful. Unlike Shannon, Turnbull thought that most of the pattern-matching rewrites were clearer, but there were not many of them; he did see a missing piece in the analysis, however:

In the Analysis Mark argues that idioms amenable to pattern matching in existing stdlib code are quite rare (a couple of dozen instances, and as I wrote above I think his query was reasonably representative). While that certainly is a useful analysis, it cannot account for the fact that pattern matching is a *paradigm* that has not been available in Python in the past. *Availability of a pleasant syntax for a new paradigm causes code to be designed differently.* That is something that none of us can claim to be able to quantify accurately, although the experience Guido and others have with "async" may be relevant to guesstimating it.

To the steering council

At this point, the PEP is in the hands of the steering council, which could approve it as written, ask for changes, or reject it entirely. An outright rejection would likely kill the idea forever, though some pieces of it could perhaps survive in other guises; Shannon had some suggestions along those lines.

If modification is requested, else seems like a logical addition, though there is no real consensus on how it should be aligned, either within the author group or the Python community at large. The requirement that constants be referred to via a namespace (e.g. color.RED) is another area that could draw attention from the council. Doing things that way requires that match variables not be referred to via a namespace, which has a few downsides, including the inability to assign to self.x in a match pattern. While it was deemed "ugly" (at best), the original idea of requiring constants to have a dot prepended to them (e.g. .RED) would at least remove that restriction on match variables.
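
As a brief, hypothetical illustration of that rule (the example is not from the PEP): a dotted name in a pattern is looked up and compared as a value, while a bare name is a capture variable that matches anything and gets bound to it:

    from enum import Enum

    class Color(Enum):
        RED = 0
        GREEN = 1

    def describe(c):
        match c:
            case Color.RED:     # dotted name: compared against the constant
                return "red"
            case Color.GREEN:
                return "green"
            case other:         # bare name: a capture pattern; it matches
                return f"unknown: {other}"  # anything and binds it to other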

The council is in an unenviable position here; one suspects that Van Rossum knows how its members feel after the "PEP 572 mess" that led to his resignation as benevolent dictator for life (BDFL), and thus to the creation of the council. Contentious decisions are draining, and the aftermath can be painful as well. In this case, the opposition does not seem as strong as it was to the "walrus operator" from PEP 572, but the change here is far more fundamental, and its effects are more far-reaching.

The council has not really tipped its hand in the multiple, typically long, threads discussing the feature—most of its members have not participated at all. Clearly a lot of work has gone into the PEP, which may make it somewhat harder to reject outright. The opinion of the language's inventor and former BDFL also likely helps tip the scales somewhat in the PEP's favor; most of the community seemed to like the idea overall, while there were (lots of) quibbles about details. It would not be surprising to find that Python 3.10 (slated for October 2021) shows up with a match statement.

Comments (51 posted)

Building a Flutter application (part 2)

By John Coggeshall
August 28, 2020

Our previous article explored the fundamentals of Flutter, a cross-platform open-source user-interface (UI) toolkit. We complete our introduction of Flutter by returning to the simple LWN RSS feed headline viewer that was introduced in part one. We will be adding several new features to that application in part two, including interactive elements to demonstrate some of the UI features of Flutter.

The LWNRssService class introduced in part one is responsible for fetching and processing the RSS feed from LWN using the http and dart_rss packages. In the first version of the application, this class was used in a way that blocked the UI from starting until the feed was loaded; the result was an empty window in the interim, which is not an ideal user experience. There are several ways to solve that problem in Flutter, but here we incorporate the FutureBuilder widget into the LWNHeadlineScreen class (which provides the main interface). For reference purposes, below is the class we used for part one:

    class LWNHeadlineScreen extends StatelessWidget
    {
      final Rss1Feed feed;

      LWNHeadlineScreen(this.feed);

      @override
      Widget build(BuildContext context)
      {
        return Scaffold(
          appBar: AppBar(
            title: Text(feed.title)
          ),
          body: ListView.builder(
            itemCount: feed.items.length,
            itemBuilder: (BuildContext ctxt, int index) {
              final item = feed.items[index];
              return ListTile(
                title: Text(item.title),
                contentPadding: EdgeInsets.all(16.0),
              );
            }
          )
        );
      }
    }

The FutureBuilder widget is a wrapper widget for asynchronous data that "builds itself based on the latest snapshot of interaction with a Future". We will use this widget in the Scaffold class's body parameter above, replacing the call to ListView.builder(). The change will allow the application to choose the widget that will render in our main view, depending on the state of the asynchronous LWNRssService.getFeed() operation. Below we will walk through the changes needed to implement FutureBuilder in the LWNHeadlineScreen class step-by-step (full class available here):

    class LWNHeadlineScreen extends StatelessWidget
    {
        final Future<Rss1Feed> feed;

This version of the class has changed the data type of feed from Rss1Feed to Future<Rss1Feed>. In the application from part one, LWNHeadlineScreen used a concrete instance of Rss1Feed for feed, requiring the data to exist before an instance of the interface could be created. The new version, using a Future<Rss1Feed> for feed, allows LWNHeadlineScreen to be instantiated prior to feed being fetched. This Future<Rss1Feed> is provided to LWNHeadlineScreen via the constructor defined in (shorthand) form below:

    LWNHeadlineScreen(this.feed);

The build() method is responsible for returning the widget to be rendered. In this case, the widget will consist of a Scaffold with an application bar type of AppBar:

    @override
    Widget build(BuildContext context)
    {
      return Scaffold(
        appBar: AppBar(
            title: Text("LWN RSS Reader")
        ),

This defines the "application bar" at the top of our UI, where toolbars or other interface elements can be added if desired. In this example, we assign only a Text widget to give the application a title. The body of the Scaffold class is the most important part of the application; it contains the widgets that build the primary user interface, and it is where FutureBuilder comes in:

    body: FutureBuilder<Rss1Feed>(
        future: feed,
        builder: (BuildContext context, AsyncSnapshot snapshot) {

Here we create a FutureBuilder instance for a Rss1Feed type (matching the type of the feed property). The instance of the class is provided two parameters: future, which is the Future<Rss1Feed> variable we are working with, and builder, which takes a function with two parameters as a value. We will be focusing on the snapshot parameter, which is an instance of AsyncSnapshot and offers an "immutable representation of the most recent interaction with an asynchronous operation". In practical terms, snapshot provides several properties that allow a developer to choose the rendered widget based on the current state of the asynchronous operation. Here is how snapshot is used:

    if(snapshot.hasData) {
        return ListView.builder(
            itemCount: snapshot.data.items.length,
            itemBuilder: (BuildContext ctxt, int index) {
                final item = snapshot.data.items[index];
                return ListTile(
                    title: Text(item.title),
                    contentPadding: EdgeInsets.all(16.0),
                );
            }
        );
    } else if(snapshot.hasError) {
        return Text("${snapshot.error}");
    }

    return Center(
        child: CircularProgressIndicator()
    );

In the example above, we use the hasData and hasError properties of snapshot to render a ListView widget once the asynchronous operation has completed successfully (or a Text widget if an error occurred), and a centered CircularProgressIndicator widget while a request is in progress. The result is a better UI that displays a spinner while an HTTP request is in progress, replacing it with the original list view upon successful completion.

Making things clickable

To add interactivity to the application, we will be working exclusively with the code contained within the ListView.builder method, and more specifically with the ListTile instances it creates. For reference, here is how we defined the UI for each news item in part one:

    return ListTile(
        title: Text(item.title),
        contentPadding: EdgeInsets.all(16.0),
    ); 

To improve upon this implementation, we start by incorporating the item.description property (a summary of an article taken from the RSS feed). The description provided by the RSS feed is in HTML format and must be rendered to make the embedded links work. To convert HTML to a Flutter widget, the pub.dev community has provided the flutter_html package. This package accepts a block of HTML code and creates the necessary widget components for it to be rendered in the UI. The flutter_html package provides the Html class used below, which is later assigned to the subtitle parameter of ListTile:

    Html(
        data: "<p>" + item.description.trim() + "</p>",
        onLinkTap: (String url) async {
            if(await canLaunch(url)) {
                await launch(url);
            } else {
                throw 'Could not launch $url';
            }
        }
    )

The Html widget is created using two parameters: data, the HTML content to render, and onLinkTap, which is called when a user clicks a generated link. The data parameter is straightforward, taking the HTML content directly from the item.description property with minor modifications.

The function passed to onLinkTap uses another Flutter package, url_launcher, which provides a reliable way to open a URL in a browser without worrying about platform specifics. By passing the URL given to onLinkTap along to url_launcher, we direct all links rendered by flutter_html to open in a browser.

To make an entire list item clickable to open the full article, we return to the ListTile class and add the onTap parameter: a function that is called when a user clicks anywhere on a list item outside of an embedded link. The onTap implementation is not as straightforward as onLinkTap, because the links provided by the LWN RSS feed point to the RSS view of an item rather than the full article; the RSS version of an article's URL on LWN simply has rss appended to it. For example, here is a representative link we might find in the feed:

    https://lwn.net/Articles/826124/rss

To make list items clickable and take users to a full article on the site, we need a direct link, which can be achieved by manipulating an item's RSS feed URL. Here is the implementation of the function provided to the onTap method to do so:

    
    () async {
        Uri rssLink = Uri.parse(item.link);
        String linkUrl = "https://lwn.net/Articles/${rssLink.pathSegments[1]}/";

        if(await canLaunch(linkUrl)) {
            await launch(linkUrl);
        } else {
            throw 'Could not launch $linkUrl';
        }
    }

We use the Uri class to parse the provided link and create a new String, linkUrl, to store the full article URL. To extract the article ID, we use the pathSegments list provided by the Uri class, which contains an entry for each component of the parsed URI path. Since the article ID is the second item in the list, we reference it as rssLink.pathSegments[1] when building the linkUrl string; that URL is then opened in a browser when the list item is clicked.

With a few more slight modifications to the user interface (such as adjusting each list item's margins and bolding the title of the article), our enhancements to the application are complete. Readers who would like to see the full application code with all of the changes can do so in the project repository. Here is how the application now looks running on Ubuntu 20.04 (with the security updates item highlighted):

[Enhanced LWN RSS Reader]

As described, each embedded hyperlink can be clicked to open the relevant content in an appropriate web browser for the platform, and clicking on a list item opens the full article. In all, the first basic implementation of a simple headline reader discussed in part one was roughly 60 lines of code; this more useful version doubles that, at around 120 lines. Overall, Flutter was reasonably easy to use for this project, and I found the documentation to be an excellent reference. The ability to execute small chunks of code in a web browser using DartPad was also a great help for testing snippets as needed. For Linux (and mobile) application development, Flutter provides an experience worth checking out.

Comments (none posted)

Page editor: Jonathan Corbet

Inside this week's LWN.net Weekly Edition

  • Briefs: Case-insensitive ext4; LXD 4.5; ChromeOS graphics; Rust 1.46; Quotes; ...
  • Announcements: Newsletters; conferences; security updates; kernel patches; ...

Copyright © 2020, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds