
LWN.net Weekly Edition for September 9, 2021

Welcome to the LWN.net Weekly Edition for September 9, 2021

This edition contains the following feature content:

  • Applying PEP 8: two python-ideas discussions explore how far the advice in Python's style guide should be stretched.
  • 5.15 Merge window, part 1: the first set of changes pulled for the 5.15 kernel development cycle.
  • Not-so-anonymous virtual memory areas: a patch set that attaches names to anonymous memory.
  • More IOPS with BIO caching: a block-layer cache that speeds up I/O on fast devices.
  • FOSS for amateur radio: free software for ham radio operators.

This week's edition also includes these inner pages:

  • Brief items: Brief news items from throughout the community.
  • Announcements: Newsletters, conferences, security updates, patches, and more.

Please enjoy this week's edition, and, as always, thank you for supporting LWN.net.

Comments (none posted)

Applying PEP 8

By Jake Edge
September 8, 2021

Two recent threads on the python-ideas mailing list have overlapped to a certain extent; both referred to Python's style guide, but the discussion indicates that the advice in it may have been stretched further than intended. PEP 8 ("Style Guide for Python Code") is the longstanding set of guidelines and suggestions for code that is going into the standard library, but the "rules" in the PEP have been applied in settings and tools well outside of that realm. There may be reasons to update the PEP—some unrelated work of that nature is ongoing, in fact—but Pythonistas need to remember that the suggestions in it are not carved in stone.

Emptiness

On August 21, Tim Hoffmann posted his idea for an explicit emptiness test (e.g. isempty()) in the language; classes would be able to define an __isempty__() member function to customize its behavior. Currently, PEP 8 recommends using the fact that empty sequences are false, rather than any other test for emptiness:

# Correct:
if not seq:
if seq:

# Wrong:
if len(seq):
if not len(seq):
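
Hoffmann's proposal would instead let each class decide what emptiness means. A minimal sketch of how such a protocol might look follows; isempty() and __isempty__() are the (hypothetical) names from his proposal, and the fallback to len() is only an assumption about how it might be specified:

    # Hypothetical sketch of the proposed protocol; isempty() is not a real
    # built-in and the fallback behavior here is an assumption.
    def isempty(obj):
        special = getattr(type(obj), "__isempty__", None)
        if special is not None:
            return special(obj)      # let the object define "empty"
        return len(obj) == 0         # otherwise fall back to the length

    class Basket:
        def __init__(self, items=()):
            self.items = list(items)

        def __isempty__(self):
            return not self.items

    print(isempty(Basket()))         # True
    print(isempty([1, 2, 3]))        # False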

But Hoffmann said that an isempty() test would be more explicit and more readable, quoting entries from PEP 20 ("The Zen of Python"). He also pointed to a video of a talk by Brandon Rhodes, where Rhodes suggested that the second ("Wrong") version of the test was more explicit, thus a better choice. Effectively Hoffmann wanted to take that even further, but Steven D'Aprano said that Python already has an explicit way to test collections for emptiness:

We do. It's spelled:
    len(collection) == 0
You can't get more explicit than that.

He perhaps should have known that the last line would be too absolute for other Python developers to resist; Serhiy Storchaka and others came up with "more explicit" tests that D'Aprano laughingly acknowledged. But, perhaps more to the point, Chris Angelico wondered what actual problems isempty() would solve. Testing a collection in a boolean context (e.g. in an if statement or using bool()), as suggested in the PEP, works for many types, he said; "Are there any situations that couldn't be solved by either running a type checker, or by using len instead of bool?"

But, as Thomas Grainger pointed out, both NumPy arrays and pandas DataFrames have a different idea about what constitutes emptiness; evaluating those types as booleans will not produce the results expected. NumPy and pandas are popular Python projects for use in scientific and data-analysis contexts, so their behavior is important to take into account. Grainger also mentioned the "false" nature of time objects set to midnight, which was addressed back in 2014, as another example.
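
The difference is easy to see in a quick session (a sketch, not taken from the thread, that assumes NumPy and pandas are installed):

    import numpy as np
    import pandas as pd

    a = np.array([0])
    print(len(a) == 0)     # False -- the array holds one element
    print(bool(a))         # False -- truthiness follows the element's value

    b = np.array([1, 2])
    # bool(b) raises ValueError: the truth value of a multi-element array
    # is ambiguous

    df = pd.DataFrame({"x": [1, 2]})
    # bool(df) raises ValueError as well; pandas expects df.empty or len(df)
    print(df.empty)        # False
    print(len(df))         # 2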

While the wisdom of treating zero as false in Python in general was questioned by Christopher Barker, Angelico said that the real problem with the false midnight was in treating midnight as zero (thus false). In any case, Hoffmann believes that objects should be able to decide whether they are empty: "It's a basic concept and like __bool__ and __len__ it should be upon the objects to specify what empty means." In a later message, he conceded that adding a new emptiness protocol (i.e. __isempty__()) may well be overkill, however.

Several commenters asked about use cases where emptiness-test problems manifest; Hoffmann said that SciPy and Matplotlib both have functions that can accept NumPy arrays or Python lists and need to decide if they are empty at times. Using len() works, but:

We often can return early in a function if there is no data, which is where the emptiness check comes in. We have to take extra care to not do the PEP-8 recommended emptiness check using `if not data`.

He suggested that having two different ways to test for emptiness depending on the types of the expected data was "unsatisfactory"; "IMHO whatever the recommended syntax for emptiness checking is, it should be the same for lists and arrays and dataframes." But Paul Moore objected to the rigid adherence to PEP 8:

You can write a local isempty() function in matplotlib, and add a requirement *in your own style guide* that all emptiness checks use this function.

Why do people think that they can't write project-specific style guides, and everything must be in PEP 8? That baffles me.
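
Such a project-local helper could be as small as the following sketch (the name and placement are hypothetical, following Moore's description):

    def isempty(data):
        """Emptiness test that works uniformly for lists, NumPy arrays,
        and pandas DataFrames by asking for the length rather than
        relying on truthiness."""
        return len(data) == 0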

But the inconsistency for using the object in a boolean context versus checking its len() led Hoffmann to suggest that PEP 8 needs changing, "because 'For sequences, (strings, lists, tuples), use the fact that empty sequences are false:' is not a universal solution". While Moore was not opposed to changing the wording in PEP 8, he said that things are not as clear cut as Hoffmann seems to think:

PEP 8 is a set of *guidelines* that people should use with judgement and thought, not a set of rules to be slavishly followed. And in fact, I'd argue that describing a numpy array or a Pandas dataframe as a "sequence" is pretty inaccurate anyway, so assuming that the statement "use the fact that empty sequences are false" applies is fairly naive anyway.

But if someone wants to alter PEP 8 to suggest using len() instead, I'm not going to argue. I *would* get cross, though, if the various PEP 8 inspired linters started complaining when I used "if seq" to test sequences for emptiness.

Hoffmann eventually decided not to pursue either a language change or one for PEP 8. There are some differences of opinion within the thread, but, by and large, the Python core developers do not see anything that requires much in the way of change. Meanwhile, PEP 8 popped up again right at the end of August.

is versus ==

Nick Parlante posted a lengthy message about a problem he has encountered when teaching a first course in programming using Python. Unlike other languages (e.g. Java), Python has a much simpler rule for how to do comparisons:

To teach comparisons in Python, I simply say "just use ==" - it works for ints, for strings, even for lists. Students are blown away by how nice and simple this is. This is how things should work. Python really gets this right.

The problem is that PEP 8 has an entry in the "Programming Recommendations" section that says: "Comparisons to singletons like None should always be done with is or is not, never the equality operators." Singletons are classes that only have one instance—all references to None in Python are to the same object. Parlante calls the entry in the PEP the "mandatory-is rule" and said that it complicates teaching the language unnecessarily; tests like "x == None" generally work perfectly well.

Students often first encounter is in a warning from code that tests a variable for equality to None, Parlante said. Integrated development environments (IDEs) will typically complain about violations of PEP 8, he said, which is usually "very helpful". But there is an exception: "Having taught thousands of introductory Python students, the one PEP8 rule that causes problems is this mandatory-is rule." He suggested making the "rule" less ironclad by adding language about it being optional to the PEP.

Angelico said that the two operators are asking different questions, however; it is important to eventually understand the difference, but "just use ==" is a fine place to start. He also pointed to the specific language in the PEP and noted, again, that "EVERYTHING in that document is optional for code that isn't part of the Python standard library". He suggested turning off the specific warning in the IDE if it was causing problems. Ultimately, it is up to the instructor to determine the best approach for their course—including the style guide.

Parlante pushed back a bit on the correctness of using is as specified in the PEP, but Angelico provided several examples of where the "x == None" test will not work. Perhaps unsurprisingly, NumPy was used in one of the examples; the point is that equality is not the right question to ask because some objects have odd views on what it means—or, like NumPy, are unwilling to even attempt to decide. NumPy raises ValueError when its multi-element arrays are tested using ==, for example.
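
The NumPy behavior is straightforward to reproduce (a sketch that assumes NumPy is installed; it is not one of Angelico's exact examples):

    import numpy as np

    arr = np.array([1, 2, 3])

    print(arr == None)     # elementwise: [False False False], not a bool
    print(arr is None)     # False -- identity is unambiguous

    # "if arr == None:" raises ValueError: the truth value of a
    # multi-element array is ambiguous.
    if arr is None:
        print("no data")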

Barker noted that he also teaches Python to beginners, but that he does teach about the difference between is and == early on. There are benefits to that approach, he said:

I have seen all too many instances of code like:
if c is 0:
    ...
Which actually DOES work, for small integers, on cPython -- but it is a very bad idea. (though I think it raises a Warning now)

And your students are almost guaranteed to encounter an occasion where using == None causes problems at some point in their programming careers -- much better to be aware of it early on!
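
The fragility Barker describes is easy to demonstrate. Integer caching is a CPython implementation detail, so the sketch below is illustrative rather than guaranteed behavior:

    a = int("5")
    b = int("5")
    print(a is b)      # True on CPython: small integers are cached

    c = int("1000")
    d = int("1000")
    print(c is d)      # False: two distinct objects with equal values
    print(c == d)      # True: equality is what was actually meant

    # CPython 3.8 and later also emit "SyntaxWarning: 'is' with a literal"
    # for code like: if c is 0: ...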

Barker suggested that Parlante leave the "mandatory is" warning turned on in the IDE, but D'Aprano had a different take. He is "not fond of linters that flag PEP 8 violations" and agreed with Angelico's configuration suggestion. As a practical matter, D'Aprano said, changing PEP 8 in order to affect the IDEs is likely to be a slow way to go about fixing this problem—if there even is one.

But "PEP-8 zealots" (as D'Aprano called them) are actually acting as a force for good, Parlante said. Students naturally pick up good habits by seeing complaints from the IDE and fixing them, even though they come from the completely optional guidelines in PEP 8. "I hope people who care about PEP8 can have a moment of satisfaction, appreciating how IDEs have become a sort of gamified instrument to bring PEP8 to the world at low cost."

He has something of an ulterior motive to get to a more "== tolerant" world, but few, if any, commenters see things his way; as "Todd" put it:

Using "==" is a brittle approach that will only work if you are lucky and only get certain sort of data types. "is" works no matter what. The whole point of Python in general and pep8 in particular is to encourage good behavior and discourage bad behavior, so you have an uphill battle convincing people to remove a rule that does exactly that.

Furthermore, as David Mertz pointed out, there are some important concepts that may be getting swept under the rug:

Moreover, I would strongly discourage any instructor from papering over the difference between equality and Identity. I guess in a pure functional language there's no difference. But in Python it's of huge importance.

As noted REPEATEDLY, this isn't just about 'is None'. As soon as you see these, it is a CRUCIAL distinction:

a = b = []
c, d = [], []
Nary a None in sight, yet the distinction is key.

In the example, all four variables are assigned to an empty list, but a and b are assigned to the same list. So:

    >>> a == c
    True
    >>> a is b
    True
    >>> a is c
    False
    >>> c is d
    False
Adding elements to a will add them to b and vice versa, which is decidedly not the case for the other two lists.
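
Continuing the example makes the point concrete:

    a = b = []
    c, d = [], []

    a.append(1)
    print(a, b)    # [1] [1] -- a and b name the same list
    print(c, d)    # [] []   -- c and d are independent lists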

D'Aprano considers the dangers of using an equality test for None to be a bit overblown, but using is is still beneficial:

There are a bunch of reasons, none of which on their own are definitive, but together settle the issue (in my opinion).
  1. Avoid rare bugs caused by weird objects.
  2. Slightly faster and more efficient.
  3. Expresses the programmer's intent.
  4. Common idiom, so the reader doesn't have to think about it.

It looks rather unlikely that we will see any changes to PEP 8 for either of the ideas raised in these two threads. It is important to recognize what PEP 8 is (and is not)—no matter what IDEs and linters do by default. Hopefully the PEP's goals and intent were reinforced in the discussions. Meanwhile, Barker has been working on changes to the PEP to remove Python-2-specific language from it.

Other communities might not appreciate this kind of discussion, some of which can question the foundations of the language at times. But Python (and the python-ideas mailing list in particular) seems to welcome it for the most part. Over the years, those sorts of discussions have led to PEPs of various kinds—some adopted, others not—and to a better understanding of the underpinnings of the language and its history.

Comments (19 posted)

5.15 Merge window, part 1

By Jonathan Corbet
September 2, 2021
As of this writing, 3,440 non-merge changesets have been pulled into the mainline repository for the 5.15 development cycle. A mere 3,440 patches may seem like a slow start, but those patches are densely populated with significant new features. Read on for a look at what the first part of the 5.15 merge window has brought.

Architecture-specific

  • The s390 architecture has gained support for the KFENCE and KCSAN development tools.

Core kernel

  • It is now possible to place entire control groups into the SCHED_IDLE scheduling class — something that could only be done at the task level before. The group as a whole will only run when there is nothing else for the CPU to do, but tasks within the group will still have their relative weights.
  • After something like 17 years of development effort, the realtime preemption locking code has been merged. This work began in 2004 and has fundamentally changed many parts of the core kernel. With this pull, the sleepable locks that make deterministic realtime response possible have finally joined all of that other work (though the kernel must be built with the REALTIME configuration option to use them). This merge log describes the major changes that this code brings.
  • The io_uring subsystem now supports opening files directly into the fixed-file table without the use of a file descriptor. This can yield some significant performance improvements for certain types of workloads; it also is a significant break from the Unix tradition of using file descriptors for open files.
  • Also new in io_uring is a "BIO recycling" mechanism that cuts out some internal memory-management overhead; the result, it is claimed, is a 10% increase in the number of I/O operations per second that io_uring can sustain.
  • Finally, io_uring has gained support for the mkdirat(), symlinkat(), and linkat() system calls.
  • BPF programs can now request and respond to timer events. The timer API is severely undocumented; some terse information is available in this commit and this one, and there is a test program that contains an example.
  • Core scheduler support for scheduling on asymmetric systems has been merged. There is another piece to make use of this functionality on Arm processors that is presumably coming later in the merge window.

Filesystems and block I/O

  • The fanotify API has a new option, FAN_REPORT_PIDFD, which causes a pidfd to be returned as part of the event metadata. This (privileged) feature allows race-free identification of processes accessing monitored files.
  • A set of hole-punching fixes should eliminate a class of subtle race conditions that could lead to file corruption.
  • Support for mandatory file locking has been deprecated for years; it works poorly and is little used (if at all). As of 5.15, support for mandatory locking has been removed altogether.
  • The LightNVM subsystem, which provided direct access to solid-state storage without an emulation layer, has been removed. According to the commit message, LightNVM has been superseded by newer NVMe standards and is no longer needed.
  • The kernel finally has an in-kernel server for the SMB filesystem protocol family. According to the merge message:

    ksmbd is a new kernel module which implements the server-side of the SMB3 protocol. The target is to provide optimized performance, GPLv2 SMB server, and better lease handling (distributed caching). The bigger goal is to add new features more rapidly (e.g. RDMA aka "smbdirect", and recent encryption and signing improvements to the protocol) which are easier to develop on a smaller, more tightly optimized kernel server than for example in Samba.

    The Samba project is much broader in scope (tools, security services, LDAP, Active Directory Domain Controller, and a cross platform file server for a wider variety of purposes) but the user space file server portion of Samba has proved hard to optimize for some Linux workloads, including for smaller devices.

    This is not meant to replace Samba, but rather be an extension to allow better optimizing for Linux, and will continue to integrate well with Samba user space tools and libraries where appropriate. Working with the Samba team we have already made sure that the configuration files and xattrs are in a compatible format between the kernel and user space server.

  • The Btrfs filesystem has gained support for fs-verity file integrity assurance and ID-mapped mounts.
  • The move_mount() system call (described in this article) has been extended to allow adding a mount to an existing sharing group. This relatively obscure new feature evidently solves a lot of problems for the Checkpoint/Restore in Userspace developers; see this commit for more information.

Hardware support

  • Miscellaneous: Richtek RTQ6752 TFT LCD voltage regulators, Richtek RTQ2134 SubPMIC regulators, Rockchip serial flash controllers, Arm SMCCC random-number generators, and Aquacomputer D5 Next watercooling pumps.
  • Networking: MediaTek Gigabit Ethernet PHYs, MHI WWAN MBIM interfaces, and LiteX Ethernet interfaces.
  • Power supply: ChromeOS EC based peripheral chargers and Mediatek MT6360 chargers.
  • Virtual: Virtio I2C adapters.

Networking

Security-related

  • There is a new prctl() operation called PR_SPEC_L1D_FLUSH. If a process turns this on, the kernel will flush the L1D (level-1 data) cache whenever that process is scheduled out of the CPU. This should help to mitigate a number of potential speculative-execution vulnerabilities that can cause data to be leaked from the L1D cache — at a significant performance cost. Note that this feature will not protect against a hostile process running on an SMT sibling processor; a feature like core scheduling must be used to protect against that case. The new prctl() can be disabled by the administrator; see this documentation patch for details.
  • The device mapper has gained support for remote attestation using the kernel's integrity measurement architecture. See Documentation/admin-guide/device-mapper/dm-ima.rst for details.

The 5.15 merge window can be expected to stay open until September 12, assuming that the usual schedule holds. LWN will be back with coverage of the remainder of the merge window immediately after it closes; it seems likely that there is quite a bit of work yet to be pulled for this development cycle.

Comments (none posted)

Not-so-anonymous virtual memory areas

By Jonathan Corbet
September 3, 2021
Computing terminology can be counterintuitive at times, but even a longtime participant in the industry may have to look twice at the notion of named anonymous memory. That, however, is just the concept that this patch set posted by Suren Baghdasaryan proposes to add. There are, it seems, developers who find the idea useful enough to not only overcome the initial cognitive dissonance that comes with it, but also to resurrect an eight-year-old patch to get it into the kernel.

Memory used by user space is divided into two broad categories: file-backed and anonymous. A file-backed page of memory has a direct correspondence to a page in a file in persistent storage; when the page is clean, its contents are identical to what is found on disk. An anonymous page, instead, is not associated with a file in the filesystem; these pages hold a process's data areas, stacks, and so on. If an anonymous page must be written to persistent storage (to reclaim the page for another user, usually), space must be allocated in the swap area to hold its contents.

Whether a given process's memory use is dominated by file-backed or anonymous pages varies from one workload to the next. In many cases, the bulk of a process's pages will be anonymous; this, it seems, is more likely in workloads with a lot of cloud-computing clients, which tend not to use many local files. Android devices are one place where this sort of behavior can be found. If one is trying to optimize the memory usage of such a workload, anonymous pages can pose a challenge; since the pages are anonymous, with no information about how they were created, it is difficult to know what any given anonymous page is being used for.

That situation can be improved by making anonymous pages just a bit less anonymous. If it were possible to know which user-space subsystem or library created a given page, it would become easier to figure out who the biggest users are. Information on, say, how many anonymous pages in the system were created by the jemalloc library, for example, could help determine whether jemalloc users are the best target for optimization efforts. Linux systems, however, do not make it easy (or even possible) to get that sort of information.

Making things better requires obtaining some cooperation from user space, since the kernel cannot know which subsystem is allocating any given page. To that end, at the core of the patch set is this patch from Colin Cross, which was originally posted in 2013. It adds a new prctl() operation:

    prctl(PR_SET_VMA, PR_SET_VMA_ANON_NAME, start, len, name);

This operation will cause the given name to be associated with the len bytes of anonymous memory beginning at start. In truth, the name is associated with the virtual memory area (VMA) structure describing a range of memory. Thus, what actually happens is that all pages that are part of the VMAs in the given range will have the name assigned to them, even if the pages themselves are not within that range. Each mmap() call usually creates a VMA (though there are complications), so all pages associated with any given VMA will normally have been created in the same way.

The maps and smaps files in each process's /proc directory already contain a lot of information about that process's VMAs. With this patch set applied, those files will also contain the name that has been associated with the anonymous VMAs, if any; the name is duly checked for printability before being accepted. Using that information, system tools can associate pages with those names and, from there, with the subsystems that created them.
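
On a kernel carrying the patch set, the effect can be observed from user space. The sketch below uses Python's ctypes; the PR_SET_VMA constant values are taken from the patch set (they are not in mainline headers as of this writing), and the "[anon:...]" annotation is how the patches mark up /proc/PID/maps, so treat those details as assumptions:

    import ctypes, mmap

    # Constants from the patch set; not (yet) in the mainline uapi headers.
    PR_SET_VMA = 0x53564d41
    PR_SET_VMA_ANON_NAME = 0

    libc = ctypes.CDLL(None, use_errno=True)
    libc.prctl.argtypes = [ctypes.c_int, ctypes.c_ulong, ctypes.c_ulong,
                           ctypes.c_ulong, ctypes.c_char_p]

    region = mmap.mmap(-1, 4096)     # an anonymous mapping
    addr = ctypes.addressof(ctypes.c_char.from_buffer(region))

    # Name the VMA; this fails with EINVAL on kernels without the feature.
    ret = libc.prctl(PR_SET_VMA, PR_SET_VMA_ANON_NAME, addr, 4096,
                     b"demo-buffer")
    print("prctl returned", ret)

    # The name should appear as "[anon:demo-buffer]" in the maps file.
    with open("/proc/self/maps") as maps:
        for line in maps:
            if "demo-buffer" in line:
                print(line.rstrip())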

Assigning a name to a VMA does not seem like a difficult endeavor, but it has proved to be the trickiest part of this patch. A system can have a lot of processes, each of which can have a lot of VMAs, so the management of these names needs to scale reasonably well. Previous versions of the patch set have tried just pointing to the provided names in user space; this avoids the need to allocate memory in the kernel but, as Kees Cook pointed out, it presents some interesting security problems as well. At the time, Cook suggested simply copying the strings into kernel space.

While copying the strings works, there is still a little problem: when a process forks, its VMAs are copied for the new child. Now all of those name strings must be copied too. Baghdasaryan ran a worst-case test, with a process creating 64,000 VMAs, assigning a long name to each, then calling fork(); the result was a nearly 40% performance regression. Even if such numbers will not be seen in real-world workloads, a slowdown of that magnitude is sure to raise eyebrows.

As a way of avoiding excessive eyebrow elevation, Baghdasaryan added a mechanism to use shared, reference-counted names. A fork() call now need only increase the reference counts rather than allocate memory and copy a string. With this added machinery in place, the performance cost is "reduced 3-4x" in the worst case, and is said to not be measurable for more reasonable test cases.

This functionality is evidently useful; Android has been using it for years, having kept the original patch going for all of that time. Thus far, the review comments have focused on relatively minor issues — which characters should be allowed in names, for example. So there would not appear to be a lot of obstacles to overcome before this work can be merged. For this feature, it seems, eight years of waiting on the sidelines should be enough, and anonymous pages may soon lose a bit of their anonymity.

Comments (18 posted)

More IOPS with BIO caching

By Jonathan Corbet
September 6, 2021
Once upon a time, block storage devices were slow, to the point that they often limited the speed of the system as a whole. A great deal of effort went into carefully ordering requests to get the best performance out of the storage device; achieving that goal was well worth expending some CPU time. But then storage devices got much faster and the equation changed. Fancy I/O-scheduling mechanisms have fallen by the wayside and effort is now focused on optimizing code so that the CPU can keep up with its storage. A block-layer change that was merged for the 5.15 kernel shows the kinds of tradeoffs that must be made to get the best performance from current hardware.

Within the block layer, an I/O operation is represented by struct bio; an instance of this structure is usually just called a "BIO". Contained within a BIO are a pointer to the relevant block device, a description of the buffer(s) to be transferred, a pointer to a function to call when the operation completes, and a surprising amount of ancillary information. A BIO must be allocated, managed, and eventually freed for every I/O operation executed by the system. Given that a large, busy system with fast block devices can generate millions of I/O operations per second (IOPS), huge numbers of BIOs will be going through this life cycle in a constant stream.

The kernel's slab allocator is optimized for the task of repeatedly allocating and freeing structures of a uniform size; it seems like it should be well suited as a source of BIOs for the block subsystem. It turns out, though, that the slab allocator is not fast enough; it has become a bottleneck slowing down block I/O. So block maintainer Jens Axboe has put together a set of patches to circumvent the problem.

The result is a simple cache of BIO structures. It is built as a set of linked lists, one for each CPU in the system. When a new BIO is needed (and when some other conditions are met — see below), the linked list for the current CPU is checked; if a free BIO is found there, it can be removed from the list and used without having to call into the slab allocator. If the list is empty, a slab call must be made as usual, of course. When the time comes to free a BIO, it is put onto the current CPU's list. Should the list grow too large (more than 576 cached BIOs), 64 BIOs will be handed back to the slab allocator.
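
The strategy is simple enough to model in a few lines. This is only a conceptual sketch of the caching scheme described above (a bounded, per-CPU free list), not the kernel code itself:

    SHRINK_THRESHOLD = 576    # start returning BIOs past this many
    SHRINK_BATCH = 64         # how many go back to the slab allocator

    class BioCache:
        """Toy model of one CPU's BIO free list."""
        def __init__(self):
            self.free_list = []

        def alloc(self, slab_alloc):
            if self.free_list:
                return self.free_list.pop()   # reuse a cached BIO (LIFO)
            return slab_alloc()               # fall back to the slab allocator

        def free(self, bio, slab_free):
            self.free_list.append(bio)
            if len(self.free_list) > SHRINK_THRESHOLD:
                for _ in range(SHRINK_BATCH):
                    slab_free(self.free_list.pop())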

It is a simple mechanism, which is the source of its speed. Rather than calling into the slab allocator, the block layer can just grab an available BIO directly off of the appropriate per-CPU list without any function calls at all. The use of a per-CPU list eliminates the need for locking, speeding things further. The lists are managed like a stack, maximizing the chance that an allocated BIO will already be present in the CPU cache. The end result is a significant improvement in performance.

At least, that is the case for some workloads. The BIO cache is, as noted, simple; one of the things it doesn't bother with is interrupt safety. A per-CPU data structure is only safe for lockless access if the kernel cannot be preempted while executing the critical section; interrupts, being the definitive form of preemption, violate that rule. If a block-driver interrupt handler tries to allocate or free a BIO while some other kernel code is doing the same, the results are likely to be unpleasant and users will be remarkably unappreciative of the improved performance.

The BIO cache could be made interrupt-safe, of course, and someday that might just have to happen. But disabling interrupts has a performance cost as well, so there are good reasons for avoiding it. The cost associated with leaving interrupts enabled is that the BIO cache can only be used in situations where concurrent access in interrupt handlers is not a possibility. The good news is that one such situation is well defined: when block-layer I/O polling is in use. Polling turns off interrupts from the storage device in favor of simply looping until an I/O request is completed; this can actually be a reasonable thing to do with fast devices. In settings where getting the highest I/O rates possible is important, administrators are likely to have polling enabled anyway; targeting this additional performance improvement at that use case thus makes some sense.

Getting the slab allocator out of the loop improves performance considerably, but there is one other bottleneck to overcome. The block layer has a function called bio_init(), the core of which reads:

    memset(bio, 0, sizeof(*bio));

One might think that memset() would be the fastest way to initialize a moderately sized structure like this, but that turns out not to be the case. So Axboe added a patch that replaces the memset() call with a series of statements explicitly setting each BIO field to zero. The changelog notes that this change halves the time it takes to allocate and initialize a BIO (when using the BIO cache, of course).

With these changes in place, Axboe said, the block layer's performance increased by about 10%; it can now execute over 3.5 million IOPS on each CPU core on his test system. That is a lot of blocks moving back and forth, which will surely please the managers of storage servers. This series shows what can be (and must be) done to optimize I/O throughput on current hardware; it also suggests, though, that it may be time to put some more optimization effort into the (already highly optimized) slab allocator. If the kernel starts to fill up with per-subsystem object caches as a way of bypassing the allocator, performance overall will suffer. Meanwhile, though, few are likely to argue that this effort to improve block-I/O performance is anything but well placed.

Comments (31 posted)

FOSS for amateur radio

September 7, 2021

This article was contributed by Sam Sloniker

Amateur ("ham") radio operators have been experimenting with ways to use computers in their hobby since PCs became widely available—perhaps even before then. While many people picture hams either talking into a microphone or tapping a telegraph key, many hams now type on a keyboard or even click buttons on a computer screen to make contacts. Even hams who still prefer to talk or use Morse code may still use computers for some things, such as logging contacts or predicting radio conditions. While most hams use Windows, there is no shortage of ham radio software for Linux.

Utilities

HamClock, as its name implies, has a primary function as a clock, but it has several other features as well. It shows a world map, and the user can click anywhere on the map to see the current time and weather conditions at that location. It also shows radio-propagation predictions, which indicate the probability that a ham's signals will be received at any particular location on Earth. These predictions are available in numerical form and as map overlays. In addition to propagation predictions, HamClock provides graphs and images indicating solar activity such as sunspots, which strongly affect radio propagation.

[HamClock]

Most hams keep logs of all contacts they have made over the radio; this was (and still may be) required by law in some countries. Historically, hams have kept logs on paper, but many now use electronic logging programs. There are several Linux-based, FOSS logging programs, such as FLLog (documentation/download) and Xlog. One logging-related program that is designed to work with other logging software is TQSL, which cryptographically signs confirmations of contacts and sends them to the Logbook of the World (LoTW). The American Radio Relay League (ARRL) uses LoTW verification to issue awards for certain achievements, such as contacting 100 countries or all 50 US states; previously, qualifying for those awards required submitting postcards (called QSL cards) received from a contacted station in each country or state. Collecting QSL cards is still popular, and they can still be used for awards, as LoTW is completely optional.

Communication tools

Traditionally, in order to communicate, hams have used either continuous wave (CW) to send Morse code or any of a variety of "phone" (voice) modes. The different phone modes all allow two or more radio operators to talk to each other, but they convert audio signals to radio waves in different ways. However, many hams now use digital modes. One of the main benefits of these modes is that they can be decoded from weak signals, allowing more reliable long-range communication compared to CW or phone.

FT8 is the most popular digital mode for ham radio. It is used for structured contacts, typically exchanging call signs, locations, and signal strength reports. FT8 sends short, encoded messages such as "CQ KJ7RRV CN72". In that message, CQ means "calling all stations", KJ7RRV is my call sign, and CN72 is my location on the southern Oregon coast, encoded using the Maidenhead Locator System.
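
The encoding itself is easy to compute; here is a small function for the four-character grid square (the coordinates in the example are approximate ones for the southern Oregon coast, chosen only for illustration):

    def grid_square(lat, lon):
        """Return the four-character Maidenhead locator for a position."""
        lon += 180.0
        lat += 90.0
        field = (chr(ord('A') + int(lon // 20)) +
                 chr(ord('A') + int(lat // 10)))
        square = str(int((lon % 20) // 2)) + str(int(lat % 10))
        return field + square

    print(grid_square(42.1, -124.3))    # "CN72"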

FT8 is much slower than most other digital modes—sending the message above takes about 13 seconds—but its slow speed allows it to be extremely reliable, even under poor radio conditions. Radio propagation conditions over the last few years have been relatively poor (although they are currently improving) due to a minimum in the 11-year solar cycle. FT8 has been usable under all but the worst conditions, though it is certainly easier to make contacts when conditions are good. WSJT-X, the original and most popular program for FT8, is FOSS and is available for Linux.

Fldigi is another program used for digital modes. Unlike FT8, most of the modes in fldigi can transfer free-form text. The most popular mode included, PSK31, is designed for conversational contacts over long distances. Some other modes are primarily used for transferring files, which is supported well by fldigi. Flamp is a separate program that connects to fldigi; it is used to transfer files over radio by encoding them into a plain-text format that can be decoded by flamp on another computer. If an error occurs in transmission, flamp can detect the error and determine which portion of the file it is in, so the sender can resend only the portion that failed.

Flmsg is a program that allows email-like forms to be used with fldigi and (optionally) flamp. A form allows structured spreadsheet-like data to be transferred efficiently, by avoiding the need to transmit common information with every message. Many forms are intended for disaster response; for example, there is an "ICS-216 MEDICAL PLAN" form which is specifically designed for sending information about available ambulances, hospitals, and other emergency medical care resources. Some other forms, such as "ICS-213 GENERAL MESSAGE," mostly contain free-form text and are intended for use when no more-specific form is available.

Fldigi, flamp, and flmsg (with its forms), along with a few related programs, are all available at W1HKJ's web site or from SourceForge.

Radio modems

WSJT-X and fldigi use modems that are completely software-based; they use a computer's sound card to transmit and receive audio signals. These signals are sent and received by a radio using a special device called a radio-sound-card interface. There are schematics available online for these interfaces, although most hams purchase a pre-built one. The SignaLink USB is a popular model that also has a built-in USB sound card, allowing the user to continue using the computer's internal sound card for other purposes. Although the manufacturer does not officially support Linux, many people have successfully used the device without needing to install extra drivers.

Another digital mode that is commonly used is packet radio. Most packet networks use AX.25, which is a modified version of X.25 that is designed to be used over ham radio. Linux works well for packet radio, because the kernel's networking stack has native support for AX.25. Although external hardware modems can be used, it is now common to use a computer sound card for packet radio. Dire Wolf, a FOSS packet-radio program for Linux, includes a sound-card modem, as well as some routing features that are not provided by the kernel.

Winlink, which is a radio-based email system, is another popular digital radio system. Pat is a Linux-compatible, FOSS Winlink client with a web-based GUI. The sound-card modem for Amateur Radio Digital Open Protocol (ARDOP), which is one of the modes for connecting to Winlink, is available for Linux. Winlink can also be used (over shorter distances) with packet radio. There are other modes for Winlink, as well, but most of them are either deprecated or proprietary, and some are Windows-only.

FreeDV is a new digital mode with a different purpose than the ones previously mentioned, most of which transfer text in some form. FreeDV is a digital voice mode. It requires two sound cards; as the user speaks into a microphone connected to one, FreeDV uses an open codec called Codec 2 to compress the digital audio, then uses a sound-card modem on the second one to transmit the encoded audio signal over the radio. At the receiver, the same process runs in reverse. FreeDV, under many circumstances, allows more reliable communication than traditional analog phone modes. With analog voice, a weak signal can still be heard, but is difficult to understand. With digital voice, the signal is either clearly audible and intelligible, or it is not heard at all. This means that when a signal is neither strong nor weak, digital voice will usually be clearer and easier to understand.

One radio-related device that works with Linux is the RTL-SDR. This is a low-cost software-defined radio receiver, which can be used to receive most radio signals, including some AM broadcast stations, most ham radio signals, shortwave radio stations, marine and aviation communications, many police radios, and more. (Some digital signals can also be received, but most of those outside the ham bands are encrypted.) Some RTL-SDRs cost less than $15, but it is worth spending around $30-40 for a good-quality device. I recommend the ones available at RTL-SDR.com, because some others may not be able to receive AM broadcast, shortwave, and certain ham signals.

Becoming a ham

For many, getting a ham license is a good way to learn about and experiment with radio technology. At least in the US, any licensed ham can design their own digital mode and use it on the air, as long as it meets certain restrictions and is publicly documented. For others, becoming a ham is a way to help with disaster response. Organizations like the American Red Cross depend on ham radio for communication when internet and cellular infrastructure fails. Yet another reason is simply as a way to meet new people. While the possibility for this is somewhat limited with modes like FT8, which are more "computer-to-computer" than "person-to-person", many hams do publish their email addresses on sites such as QRZ.com, and most are happy to receive emails from people they have contacted on the air.

For those interested in getting a ham radio license, there are several resources available. The ARRL's Licensing, Education, and Training web page would be a good place to start if you live in the US. HamStudy.org is an excellent resource for both studying for the test and finding an exam session; it provides study guides for the US and Canadian tests, though its exam finder only lists US sessions. Finally, an Internet search for "ham radio club in [your city/town]" will most likely find a club's web site, which will probably either have contact information or more info on getting a license, or both.

Comments (21 posted)

Page editor: Jonathan Corbet

Inside this week's LWN.net Weekly Edition

  • Briefs: OSS-Fuzz on 100+ projects; OpenWrt 21.02; Firefox 92; OpenSSL 3; Quotes; ...
  • Announcements: Newsletters; conferences; security updates; kernel patches; ...

Copyright © 2021, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds