LWN.net Weekly Edition for July 16, 2015
Mozilla and Pocket
Starting in version 38.0.5, Firefox includes a built-in integration with the bookmarking service Pocket. Although the Pocket service has been available in Firefox through an extension for several years, the integrated feature sparked an outcry among some users. Critics raised a variety of perceived problems with the feature, but most of the backlash focused on the proprietary nature of the Pocket service or on the perception that the feature resulted from a secret deal between the company and Mozilla—a deal that, presumably, did not take the community's best interests into account.
Recent history teaches that Mozilla should probably expect blowback whenever it adds a Firefox feature that involves cooperation with a closed-source service or company—implementing the W3C Encrypted Media Extensions (EME) API or H.264 support, for example. Then again, perhaps blowback should be expected for every new Firefox feature (see the controversy about signed extensions, for example). In any case, although the past week has seen a rise in public debate about the Pocket feature (with blog posts critical of Mozilla from Benjamin Kerensa and Julien Voisin, among others), the feature itself is more than a month old, so it is worth examining in its historical context.
The Firefox 38.0.5 release landed on June 2. Pocket integration adds a button to the toolbar; clicking on it essentially allows the user to store the URL in a per-user Pocket account, from which it can be looked up and read later. In that sense, Pocket is no different than a traditional bookmark, except that a user's Pocket list is accessible from non-Firefox browsers (unlike bookmarks synchronized with Firefox Sync).
The addition of the feature was mentioned in the release notes and accompanying blog post, but some users seemed to find that degree of communication insufficient. For one thing, 38.0.5 is a "point-point" release, which is not the normal place one expects to find a significant new feature introduced. For another, the feature evidently landed in Firefox 38 without first spending the usual amount of time in the Nightly channel—which, again, is not the expected behavior. Many users—including Nightly channel testers—were taken by surprise when the feature appeared.
Questions
The most detailed critique of the feature, though, took place on the Mozilla Governance mailing list. Tucker McKnight filed a bug report about the move, in which he listed several issues. Shortly thereafter, McKnight was told to take the topic to the mailing list instead—which he did, there reiterating his concerns. McKnight focused on implementation details, starting with the fact that the Pocket integration is not implemented as a Firefox extension, but as native code. This, he said, raises three concerns:
- Extensions can be removed entirely, but Pocket support can only be disabled.
- Pocket support can only be disabled through the about:config page, which is not user friendly, "and therefore not in line with Mozilla's mission. In the past, Mozilla has been very good about showing the user what new features have been added to the interface and explaining any privacy implications that may come with them."
- Pocket support uses the user's existing Firefox Account to sign in to the Pocket web site. "It may also not be clear to some users that, even when signing in with your Firefox account, you are still giving your email address to a third party whose privacy policy is different than Mozilla's."
Adam Porter replied, raising the lack-of-public-discussion issue, and also pointed out that the move gives favored status to a proprietary service at the expense of similar free-software projects (like wallabag). A more appropriate approach, he said, would have been to define a "save for later" API that Pocket and other projects could hook into.
The ensuing back-and-forth was, at times, overly heated—in ways that will sound familiar to those experienced in Internet discourse. A number of community members chimed in just to express outrage and announce that they were switching to Chrome, and some Mozilla employees lashed out at the critics to accuse them of being uninformed.
If one takes away the emotion, though, a few key points remain.
Some critics objected to the Pocket feature because Mozilla has historically resisted adding functionality to the core Firefox code that could easily be implemented in extensions (ad blocking, for example). That philosophy was one of the original justifications for splitting Firefox off from the old Mozilla application suite, so changing it now seems like a policy shift.
Similarly, others pointed out that "back in the day, Mozilla implemented Mozilla Weave (now Firefox Sync) exactly because existing alternatives were proprietary." Thus, partnering with a proprietary vendor is an about-face, one that is particularly noticeable given that Mozilla dropped its Pocket-like "Reading List" feature at the same time.
Finally, a few critics raised specific objections to the privacy policy and terms of service (TOS) for Pocket. At the very least, the language of both documents is written to apply to an installable software project (as the Pocket extension was), while the new Pocket-integration feature is implemented as a set of web API calls. Those API calls use a pocket namespace, which adds some additional fuel to the argument that the feature favors one vendor to the exclusion of all others.
Most critics seemed to feel that Pocket, as a commercial entity, should not be implicitly trusted with user data, and many worried that the privacy policy allows Pocket to change its mind and begin commercializing the submitted data—leaving little recourse to users. Others raised concerns about the US-centric language in the policies and about prohibitions on using the service commercially or with objectionable (to some) links.
Answers
For their part, Mozilla representatives have provided responses to most of the core criticisms. Gijs Kruitbosch, a Mozilla engineer who worked on the Pocket feature, answered both the lack-of-discussion and "playing favorites" criticisms. The feature landed late in the development cycle, he said, so the API and preference names were written specifically for Pocket for the sake of speed—but the plan is to generalize them in future releases. Furthermore, Mozilla is using the Pocket implementation to gather usage data that will lead to a more open API once the use patterns and requirements are better understood. Mozilla's Mike Connor added that the same approach was taken for the first versions of search-engine integration and Firefox's Social API.
As to the concern that Pocket is a closed-source service, Mozilla's Gervase Markham replied that Mozilla has partnered with closed-back-end services in the past without raising ire—most notably "the bundled search engines, safe browsing and (until recently) our location service". He did, however, agree that the UI's perceived ambiguity about the fact that user data is being sent to a third party is a valid complaint.
Ultimately, though, Mozilla could not provide easy answers to every question—in particular, to the privacy and TOS concerns. Dan Stillman called the comparison to search-engine integration invalid, given that Firefox already had a bookmark-sync feature that did offer privacy safeguards:
Connor noted that Mozilla's bookmark-saving web service, Firefox Sync, was designed with strong cryptography and strong privacy protections in mind, and that it failed to catch on. "The vast majority of users didn't understand or care about the added security. It was more of a liability than an asset. Firefox Accounts make a different tradeoff as a result, and it's unsurprisingly more popular (and _useful_) as a result."

Meanwhile, he said, Pocket has already proven itself popular—both as a browser extension and on other platforms (such as e-readers and mobile OSes without Firefox).
On June 10, Markham volunteered to get clarification on the Pocket TOS and privacy policy as they apply to the Firefox-integration feature. On July 14, Urmika Devi from the Mozilla legal team joined the discussion and gave a blanket answer to the policy questions:
It remains to be seen how Devi's response (which also addressed some of the specific, recurring concerns) will be interpreted, but the legal team has agreed to follow up on any additional questions.
Nevertheless, there remain other unanswered questions, too. For example, Stillman, McKnight, and several others requested more information (and even a timeline) about when and how the "save for later" feature now used only by Pocket would be opened up to additional participants, as Kruitbosch suggested it would. Others have asked whether or not the Pocket deal provides revenue to Mozilla. There has not yet been a reply on either point. Whatever else Mozilla may have in mind for the feature, this debate indicates that one thing it certainly needs is improved clarity and communication with the community.
A split within the Tox project
Key developers from the Tox encrypted-chat project recently parted ways with the project's titular leader, after a dispute over the misuse of donated funds. The team has set up a new web site and source-code repository, but the old sites remain in place, too. In addition, the conflict over the project's funds could have a serious impact on Tox's Google Summer of Code (GSoC) projects—not to mention the project's future.
The root of the disagreement is an accusation that Sean Qureshi, the sole board member of the Tox Foundation, withdrew funds from the Tox Foundation's bank account for personal use, and has refused subsequent calls to rectify the situation. How the Tox project collects and manages donations has, evidently, been a point of contention for quite some time—prompting other developers to depart. But this incident appears to have been the breaking point, which led to an acrimonious split between the parties.
The Tox project was started in mid-2013 by users on the 4chan message board. At its heart, it provides a peer-to-peer encrypted messaging service. The main project, however, develops only the core library and protocol; outside contributors are responsible for all of the client software. Qureshi has long acted as project leader (maintaining the domain registrations, managing infrastructure, coordinating activities, and so on) but has never been lead developer—that role belongs to an individual who chooses to go only by "irungentoo" online.
In June of 2015, rumors began to circulate that Qureshi had withdrawn funds from the Tox Foundation bank account and used them on personal expenses. Those funds included both individual donations and money collected for mentoring GSoC students. As word of these events spread, several contributors to the project announced their departure, criticizing Qureshi as well as irungentoo—the latter for not taking action.
On July 5, GitHub user "rr44" opened up a bug against Tox asking for an explanation. Irungentoo replied, saying:
I think everyone (Including myself) has learnt from this and we will make sure that something like this won't ever happen again.
On July 11, he followed up by re-launching the Tox project at a new site: tox.chat, rather than the previous tox.im, and by making an announcement on the new site's blog. There, irungentoo described the events in more detail, saying that Qureshi admitted he "'took a loan against the Tox Foundation', and used the entirety of the foundation’s funds on personal expenses completely unrelated to the project."
Irungentoo issued an apology to the community, saying "we certainly could have handled our finances in a more responsible and transparent manner" and "we can blame no one but ourselves for this."
He also said that one reason for the lengthy delay between the first rumors of misconduct and the formal split was that project members had spent a lot of time trying to privately rectify matters with Qureshi. In the end, though, no such reconciliation was achieved, so the development team felt forced to disassociate itself from Qureshi and the Tox Foundation, and begin again with new servers and new domains. As for finances, the announcement said only that "we will not be taking any official donations until we have set up a proper organization with an emphasis on transparency and protection of assets (more details on this at a future date)". It also noted, though, that many developers would accept individual donations.
The nebulous status of the Tox Foundation seems to play a major role in the story. Several commenters, including rr44, complained that the Tox Foundation, despite its name, was not set up as a non-profit charity. Unfortunately, the online records available are rather limited; the Tox Foundation is listed as a California-registered corporation with Qureshi as its acting agent, but that does not preclude it being a non-profit entity. Complicating matters is the fact that, while Qureshi may be the Chief Financial Officer of the Tox Foundation company, irungentoo is supposedly the Chief Executive Officer.
This, too, is hard to verify online. Developer David Lohle left the project several months ago (following arguments about the management and transparency of finances). He posted an image of the forms allegedly filed to register the Tox Foundation, but admitted he had no way of knowing if the copy of the form he saw (sent to him by Qureshi) matched what was filed. In any case, blog posts and comments repeatedly refer to a Tox Foundation "board" that may not exist in any formal sense at all.
Concern over the status of the Tox Foundation has come up before—another former developer, Mark Winter, also cited disagreements about it in late 2014 as a reason he withdrew from the project. It is also clear that several contributors and community members have been upset that, whatever funds were collected (from various sources), money was not spent to improve the Tox code—such as by paying for a security audit.
For his part, Qureshi posted a message on the Tox.im blog on July 9. His account of the recent events claims that "the project" was unwilling to work with "The Tox Foundation" on infrastructure matters:
We wanted to peacefully transition things and share data, working together to move things from 1 owner to the next while ensuring everything continued to work and operate without issue.
Unfortunately rather than work together to move everything and its data when I tried to discuss how to do so I was told to fuck off.
He concludes by noting that the tox.im wiki and mailing list will be put into read-only mode, that development and maintenance will continue for several Tox components, and that in two years he will reassess whether to continue or suspend the project.
It is always troubling to see a free-software project experience an acrimonious falling out, and it is even more troubling when accusations of financial impropriety are at its core. Perhaps the only good news in this series of events is that irungentoo's relaunched Tox project seems to have the support of virtually the entire Tox community.
But, even there, the "entire Tox community" seems to have shrunk in recent months with the departures of Lohle, Winter, and a few other long-time contributors over the very lack of financial transparency that would appear to have made the current situation possible. The new project will likely have an uphill climb ahead of it to re-establish trust, and it may be in a precarious position with its 2015 class of GSoC students. Irungentoo claimed in his blog post to have learned a hard lesson about project governance and finances; wherever it is that Tox goes from here, its story will certainly be a lesson to many other software projects.
The Tox developers may have placed too much trust in a particular individual, but they also failed to set up a formal governance structure, complete with a well-known decision-making board composed of multiple people; instead, they relied on one person to file some sort of paperwork, the nature and specific contents of which remain mysterious even now. Furthermore, the Tox Foundation may have been set up to accept donations and payment, but there does not seem to be any formal agreement on the ownership of domain names and other pieces that make up the project infrastructure.
It is understandable that irungentoo and others felt out of their depth on these issues, but that is also why umbrella organizations like the Software Freedom Conservancy and Software in the Public Interest exist. They can help projects protect themselves against such conflicts, well before they reach the crisis stage.
Python 3.5 is on its way
It has been nearly a year and a half since the last major Python release, which was 3.4 in March 2014—that means it is about time for Python 3.5. We looked at some of the new features in 3.4 at the time of its first release candidate, so the announcement of the penultimate beta release for 3.5 seems like a good time to see what will be coming in the new release. Some of the bigger features are new keywords to support coroutines, an operator for matrix multiplication, and support for Python type checking, but there are certainly more. Python 3.5 is currently scheduled for a final release in September.
As usual, the "What's New In Python 3.5" document is the place to start for information about the release. At this point, it is still in draft form, with significant updates planned over the next couple of months. But there is lots to digest in what's there already.
Type hints
One of the more discussed features for 3.5 only showed up in the first beta back in May: type hints (specified by PEP 484). It provides an optional syntax for annotating the types of Python function arguments, return values, and variables so that various tools can make use of them. Type hints have been championed by Python benevolent dictator for life (BDFL) Guido van Rossum and came about rather late in the 3.5 development cycle, so there was always a bit of a question about whether the feature would make the feature freeze that came with the first beta release.
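For example, annotations in the PEP 484 style look like the following (the scale() function here is purely illustrative, not from the PEP itself):

from typing import List

def scale(values: List[float], factor: float) -> List[float]:
    # The annotations are ignored at run time; type checkers and
    # IDEs are the intended consumers.
    return [v * factor for v in values]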
But, just prior to that deadline, BDFL-Delegate Mark Shannon accepted the PEP on May 22, with a nod to some of the opposition to the feature that has arisen:
Shannon continued by noting that other languages have an operator-overloading feature that often gets used in ugly ways, but that Python's is generally used sensibly. That's because of Python's culture, which promotes the idea that "readability matters". He concluded: "Python is your language, please use type-hints responsibly :)".
Matrix multiplication
A new binary operator will be added to Python to support matrix multiplication (specified by PEP 465). "@" is a new infix operator that shares its precedence with the standard "*" operator for regular multiplication. The "@=" operator will perform the matrix multiplication and assignment, as the other, similar operators do.
The @ operator will be implemented with two "dunder" (double underscore) methods on objects: __matmul__() and __rmatmul__() (the latter is for right-side matrix multiplication when the left-side object does not support @). Unsurprisingly, @= is handled with the __imatmul__() method. That is all the new feature defines, since Python does not have a matrix type, nor does the language define what the @ operator actually does. That means no standard library or builtin types are being changed to support matrix multiplication, though a number of different scientific and numeric Python projects have reached agreement on the intended use and semantics of the operator.
Currently, developers of libraries like NumPy have to make a decision about how to implement Python's multiplication operator (*) for matrices. All of the other binary operators (e.g. +, -, /) can only reasonably be defined element-wise (i.e. applying the operation to corresponding elements in each operand), but multiplication is special in that both the element-wise and the specific 2D matrix varieties are useful. So, NumPy and others have resorted to using a method for 2D matrix multiplication, which detracts from readability.
The only other use of @ in Python is at the start of a decorator, so there will be no confusion in parsing the two different ways of using it. Soon, users of NumPy will be able to multiply matrices in a straightforward way using statements like:
z = a @ b
a @= b
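For library authors, supporting the new operator is just a matter of implementing the dunder methods. Here is a toy sketch; the Vec2 class is purely hypothetical and has nothing to do with NumPy's implementation:

class Vec2:
    def __init__(self, x, y):
        self.x, self.y = x, y

    def __matmul__(self, other):
        # For this toy type, treat "matrix multiplication" as a dot product.
        return self.x * other.x + self.y * other.y

    def __rmatmul__(self, other):
        return self.__matmul__(other)

print(Vec2(1, 2) @ Vec2(3, 4))    # prints 11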
async and await
Another fast-moving feature adds new keywords for coroutine support to the language. PEP 492 went from being proposed in early April to being accepted and having its code merged in early May. The idea is to make it easier to work with coroutines in Python, as the new async and await keywords don't really add new functionality that couldn't be accomplished with existing language constructs. But they do provide readability improvements, which is one of the main drivers of the addition.
Essentially, functions, for loops, and with statements can be declared as asynchronous (i.e. may suspend their execution) by adding the async keyword:
async def foo(...):
    ...

async for data in cursor:
    ...

async with lock:
    ...
    await foo()
    ...
The await statement that can be seen in the with example is similar to the yield from statement that was added to Python 3.3 (described in PEP 380). It suspends execution until the awaitable foo() completes and returns its result.
While async and await will eventually become full-fledged keywords in Python, that won't happen until Python 3.7 in roughly three years. The idea is to allow programmers time to switch away from using those two names as variables, which is not allowed for Python keywords. As it turns out, await can only appear in async constructs, so the parser will keep track of its context and treat those strings as variables outside of those constructs. It is a clever trick that allows the language to "softly deprecate" the old uses of those names.
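In practice, the new syntax will most often be seen with asyncio. A minimal sketch, assuming a 3.5 interpreter and with a hypothetical fetch() coroutine standing in for real I/O:

import asyncio

async def fetch(delay):
    # await suspends this coroutine; the event loop runs others meanwhile.
    await asyncio.sleep(delay)
    return delay

async def main():
    results = await asyncio.gather(fetch(0.1), fetch(0.2))
    print(results)    # [0.1, 0.2]

loop = asyncio.get_event_loop()
loop.run_until_complete(main())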
Zip applications
Over the last few years there has been a lot of discussion on python-dev, python-ideas, and elsewhere about how to easily distribute Python programs; the zipapp module (specified in PEP 441) is a step along that path. Partly, the PEP authors simply want to promote a feature that was added back in the Python 2.6 days. It provides the ability to execute a ZIP format file that contains a collection of Python files, including a __main__.py, which is where the execution starts.
Besides just publicizing a "relatively unknown" language feature, the zipapp module would be added to provide some tools to help create and maintain "Python Zip Applications", which is the formal name for these application bundles. The bundles will have a .pyz extension for console applications, while windowed applications (i.e. those that won't require a console on Windows) will end in .pyzw. The Windows installer will associate those extensions with the Python launcher. It is hoped that .pyz will be connected with Python on Unix-like systems (by way of the MIME-info database), but that is not directly under the control of the Python developers.
The appropriate interpreter (i.e. Python 2 or Python 3) can be associated with the file by prepending a "shebang" line to the ZIP format (which ZIP archivers will simply skip). For example (from the PEP):
#!/usr/bin/env python3
# Python application packed with zipapp module
(binary contents of archive)
In addition, the zipapp module can be directly accessed from the command line. It provides two helper functions (create_archive() and get_interpreter()) but will likely be used as follows:
$ python -m zipapp dirname

That creates a dirname.pyz file that contains the files in the directory (which must include a __main__.py). There are also command-line options that govern the archive name, the interpreter to use, or the main function to call from a __main__.py that zipapp will generate.
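For scripted use, the helper functions mentioned above can be called directly. A short sketch, assuming a myapp directory that contains a __main__.py:

import zipapp

# Roughly equivalent to "python -m zipapp myapp", plus an explicit interpreter.
zipapp.create_archive('myapp', target='myapp.pyz',
                      interpreter='/usr/bin/env python3')

print(zipapp.get_interpreter('myapp.pyz'))    # '/usr/bin/env python3'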
Formatting bytes
The distinction between bytes and strings in Python 3 is one of the defining characteristics of the language. That was done to remove some of the ambiguities and "magic" from handling Unicode that are present in Python 2. "Bytes" are simply immutable sequences of integers, each of which is in the range 0–255, while "bytearrays" are their mutable counterpart. There is no interpretation placed on the byte types; strings, on the other hand, always contain Unicode.
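A quick interactive session illustrates the distinction:

>>> b'abc'[0]          # indexing bytes yields an integer
97
>>> ba = bytearray(b'abc')
>>> ba[0] = 90         # bytearrays can be modified in place
>>> ba
bytearray(b'Zbc')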
But there are a number of "wire protocols" that combine binary and textual data (typically ASCII), so when writing code to deal with those, it would be convenient to be able to use the Python string formatting operations to do so. Today, though, trying to interpolate an integer value into a bytes type does not work:
>>> b'%d' % 5
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for %: 'bytes' and 'int'
The adoption of PEP 461 will change that. All of the numeric formatting codes, when applied to bytes, will be treated as if they were:
('%x' % val).encode('ascii')

Where x is any of the dozen or so numeric formatting codes. In addition, %c will insert a single byte (either from an integer in the 0–255 range or from a byte type of length 1) and %b can be used to interpolate a series of bytes. Neither of those will accept string types since converting a string to bytes requires an encoding, which Python 3 will not try to guess.
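Once PEP 461 lands, sessions like the following should work (these examples are based on the behavior the PEP describes):

>>> b'%c' % 65
b'A'
>>> b'%d bytes follow: %b' % (2, b'\x00\xff')
b'2 bytes follow: \x00\xff'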
In addition, bytes and bytearrays will be getting a hex() method. That will allow turning bytes into hexadecimal strings:
>>> bytes([92, 83, 80, 255]).hex()
'5c5350ff'
The grab bag
There are, of course, lots more features coming in Python 3.5—way more than we can cover here—but there are some additional ones that caught our eye and we will look at briefly. For example, the ordered dictionary in collections.OrderedDict has been reimplemented in C, which provides a 4–100x performance increase. PEP 485, which has been accepted for 3.5, adds a function to test approximate equality to both the math and cmath modules: math.isclose() and cmath.isclose().
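A quick example of the new function, using the PEP's default relative tolerance of 1e-09:

>>> import math
>>> math.isclose(0.1 + 0.2, 0.3)
True
>>> math.isclose(100.0, 101.0)
False
>>> math.isclose(100.0, 101.0, rel_tol=0.02)
True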
The "system call interrupted" error code on POSIX systems (EINTR, which gets turned into the Python InterruptedError exception) is often unexpected by applications when they are making various I/O calls, so those exceptions may not be handled correctly. It would make more sense to centralize the handling of EINTR. So, in the future, the low-level system call wrappers will automatically retry operations that fail with EINTR (which is a feature that is described in PEP 475).
The process of initializing extension modules (which are implemented as shared libraries), built-in modules, and imported modules is subtly different in ways that can lead to import loops and other problems. PEP 489 was proposed to regularize this process so that all of the different "import-like" mechanisms behave in the same way; it will be part of Python 3.5.
Eliminating .pyo files is the subject of PEP 488 and it will be implemented for the upcoming release. Those files were meant to hold optimized versions of the Python bytecode (unoptimized bytecode is usually placed into .pyc files), but the extension was used for multiple different optimization levels, which meant manually removing all .pyo files to ensure they were all at the same level. In the future, these __pycache__ file names will all end in .pyc, but will indicate both the optimization level and interpreter used in the name, for example: foo.cpython-35.opt-2.pyc.
Lastly, at least for our look, Python 3.5 will add an os.scandir() function to provide a "better and faster directory iterator". The existing os.walk() does unnecessary work that results in roughly 2x the number of system calls required. os.scandir() will return a generator that produces file names as needed, rather than as one big list, and os.walk() will be implemented using the new function. That will result in performance increases of 2–3x on POSIX systems and 8–9x for Windows.
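Using the new call is straightforward; a small sketch (the directory name is arbitrary) might look like:

import os

for entry in os.scandir('/tmp'):
    # Each entry is a DirEntry object that carries cached type information,
    # so is_file() usually avoids an extra stat() call.
    if entry.is_file():
        print(entry.name)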
As can be seen, there is lots of good stuff coming in Python 3.5. Over the next few months, testing of the betas and release candidates should hopefully shake out all but the most obscure bugs, leading to a solid 3.5 release in mid-September.
Security
Thwarting the "evil maid"
The "evil maid" attack against disk encryption has been known for quite a few years now. We looked at it back in 2009, shortly after Joanna Rutkowska (who first described the attack) announced a proof-of-concept "evil maid" attack on the popular TrueCrypt disk-encryption tool. In 2011, she also came up with an "Anti Evil Maid" tool; more recently, Matthew Garrett has come up with some refinements that he calls "Anti Evil Maid 2 Turbo Edition". These methods for stopping evil maids (and others with physical access to our systems) are worth a look.
The "evil maid" attack got its name from a scenario where the maid at a hotel (or someone pretending to be) would access a guest's laptop while they were out of their room. The attacker would install a kind of malware in the master boot record (MBR) of the system that would record the decryption password. Another visit is all it would take for the attacker to get the password (and perhaps to decrypt some data). More sophisticated attacks might send the password to the attacker over the network. Some way to access the disk's data, now that it can be decrypted, would need to be arranged (e.g. copy the disk at installation time, steal the laptop later, etc.).
Ensuring the code being run by the system is the same as what is expected is the only mechanism to thwart many of these kinds of attacks. Using the Trusted Platform Module (TPM) hardware that is present on many systems these days can provide a way to ensure that the system's firmware, bootloader, and other code involved in the disk encryption have not changed. Garrett described how the TPM can be used:
If any of those components has been modified, the TPM measurement by the previous component will detect it. In addition, the TPM can be used to encrypt some data that it will only decrypt if all of the PCR values match their expected values. So an encrypted key for the disk could be stored in such a way that it can only be decrypted if the system has not been modified. So far, so good.
But, either the disk decryption key is applied automatically at boot time, which leaves the system open to simply being stolen and booted, or a password can be applied to the boot process. However, that leaves another problem behind, as Garrett outlined:
This is where Rutkowska's Anti Evil Maid comes into play. Users can encrypt a secret phrase with the TPM, which can be used in one of two forms. It can be stored on a USB stick that is consulted whenever the user believes there is a reason to check the integrity of their system. If the TPM can decrypt the phrase, then all is well. That does, however, require the user to decide when to check, which is not fully reliable.
An alternative is to do it on every boot, but that has its flaws as well. The attacker could simply boot the system, see the phrase, and modify their malware to simply print the proper phrase. That can also be handled by password-protecting the TPM, but that results in the scenario where a fake password prompt is offered, the password is stored, the malware removes itself, and the system is rebooted. As Garrett noted, users can be trained to recognize the attack: "if the system reboots without the user seeing the secret, the user must assume that their system has been compromised and that an attacker now has a copy of their TPM password."
The usability of that mechanism is not all that good, though. Garrett has come up with his "Turbo Edition" that adds a dynamic element into the mix, so that both the user and the computer can independently agree on a password that changes frequently. As he pointed out, many already use a one-time password (OTP) for two-factor authentication. In that scenario, users prove to the computer that they can generate the proper OTP, while Garrett's mechanism would reverse that: the computer would prove that it can generate the proper OTP to show that it hasn't been compromised.
Garrett has created a prototype that uses the time-based OTP (TOTP) algorithm to generate the passwords. TOTP takes a secret and the time of day to generate the OTP. That secret can be encrypted by the TPM so it will only be available if the system has not been tampered with. Enrolling the secret into a TOTP smartphone app allows the user to generate the same OTP. So instead of a static secret phrase that gets decrypted and printed to the screen (which a physically present attacker could learn), the boot process calculates an OTP and prints that to the screen, which the user verifies on their smartphone. An attacker who learns an OTP will have no advantage as long as the user is diligent about verifying the OTP on every boot.
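The TOTP calculation itself is not complicated. The sketch below shows the standard RFC 6238 arithmetic in Python; it is only an illustration of the value that both the boot-time code and the phone app would compute from the shared secret, not Garrett's actual tool:

import base64, hashlib, hmac, struct, time

def totp(secret_b32, interval=30, digits=6):
    # HMAC-SHA1 over the number of 30-second intervals since the epoch.
    key = base64.b32decode(secret_b32)
    counter = struct.pack('>Q', int(time.time()) // interval)
    digest = hmac.new(key, counter, hashlib.sha1).digest()
    # Dynamic truncation, as specified by RFC 4226.
    offset = digest[-1] & 0x0f
    code = struct.unpack('>I', digest[offset:offset + 4])[0] & 0x7fffffff
    return format(code % 10 ** digits, '0%dd' % digits)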
It is a clever combination of two existing security technologies that should work well, once all of the pieces are in place. As Garrett pointed out, there are caveats:
There is an alternative to all of these complicated "Anti Evil Maid" techniques, of course: maintaining physical control of laptops (or other systems) at all times. That is a tall order for most people; for the truly paranoid (or targeted), though, it is the safest course. But, for that to be effective, any loss of control, even for a short time, has to result in the system being discarded as "compromised". Obtaining a new, trusted system and retrieving the data from backups will be required, though now the system used for backups is the big target—and may be far harder to prevent physical access to.
Brief items
Security quotes of the week
The NSA would have had to weigh its collection programs against the possibility of public scrutiny. Sony would have had to think about how it would look to the world if it paid its female executives significantly less than its male executives. HBGary would have thought twice before launching an intimidation campaign against a journalist it didn't like, and Hacking Team wouldn't have lied to the UN about selling surveillance software to Sudan. Even the government of Saudi Arabia would have behaved differently.
A new OpenSSL vulnerability
The OpenSSL project has disclosed a new certificate validation vulnerability. "During certificate verification, OpenSSL (starting from version 1.0.1n and 1.0.2b) will attempt to find an alternative certificate chain if the first attempt to build such a chain fails. An error in the implementation of this logic can mean that an attacker could cause certain checks on untrusted certificates to be bypassed, such as the CA flag, enabling them to use a valid leaf certificate to act as a CA and 'issue' an invalid certificate." This is thus a client-side, man-in-the-middle vulnerability.
Note that the affected versions of OpenSSL were released in mid-June; anybody with an older release should not be vulnerable.
NSA releases Linux-based open source infosec tool (ITNews)
ITNews reports that the US National Security Agency is in the process of releasing its systems integrity management platform - SIMP. "SIMP helps to keep networked systems compliant with security standards, the NSA said, and should form part of a layered, "defence-in-depth" approach to information security. NSA said it released the tool to avoid duplication after US government departments and other groups tried to replicate the product in order to meet compliance requirements set by US Defence and intelligence bodies." Currently only RHEL and CentOS versions 6.6 and 7.1 are supported.
Jones: Future development of Trinity
Here's a discouraging blog post from Dave Jones on why he will no longer be developing the Trinity fuzz tester. "It’s no coincidence that the number of bugs reported found with Trinity have dropped off sharply since the beginning of the year, and I don’t think it’s because the Linux kernel suddenly got lots better. Rather, it’s due to the lack of real ongoing development to 'try something else' when some approaches dry up. Sadly we now live in a world where it’s easier to get paid to run someone else’s fuzzer these days than it is to develop one."
Bruce Schneier: IT Teams Need Cyberattack Response Planning More Than Prevention (Linux.com)
Linux.com has an interview with Bruce Schneier. "Schneier: The most important takeaway is that we are all vulnerable to this sort of attack. Whether it's nation-state hackers (Sony), hactivists (HB Gary Federal, Hacking Team), insiders (NSA, US State Department), or who-knows-who (Saudi Arabia), stealing and publishing an organization's internal documents can be a devastating attack. We need to think more about this tactic: less how to prevent it -- we're already doing that and it's not working -- and more how to deal with it. Because as more people wake up and realize how devastating an attack it is, the more we're going to see it."
The Core Infrastructure Initiative census project
The Core Infrastructure Initiative (a Linux Foundation effort to direct resources to critical projects in need of help) has announced a census project to identify the development projects most in need of assistance. "Unlike the Fed’s stress tests, which are opaque, all of the census data and analysis is open source. We are eager for community involvement. We encourage developers to fork the project and experiment with different data sources, different parameters, and different algorithms to test out the concept of an automated risk assessment census. We are also eager for input to help sanitize and complete the data that was used in this first iteration of the census."
New vulnerabilities
java: multiple vulnerabilities
Package(s): java-1.7.0-openjdk
CVE #(s): CVE-2015-2590 CVE-2015-2601 CVE-2015-2621 CVE-2015-2625 CVE-2015-2628 CVE-2015-2632 CVE-2015-4731 CVE-2015-4732 CVE-2015-4733 CVE-2015-4748 CVE-2015-4749 CVE-2015-4760
Created: July 15, 2015
Updated: January 14, 2016
Description: From the Red Hat advisory:

Multiple flaws were discovered in the 2D, CORBA, JMX, Libraries and RMI components in OpenJDK. An untrusted Java application or applet could use these flaws to bypass Java sandbox restrictions. (CVE-2015-4760, CVE-2015-2628, CVE-2015-4731, CVE-2015-2590, CVE-2015-4732, CVE-2015-4733)

A flaw was found in the way the Libraries component of OpenJDK verified Online Certificate Status Protocol (OCSP) responses. An OCSP response with no nextUpdate date specified was incorrectly handled as having unlimited validity, possibly causing a revoked X.509 certificate to be interpreted as valid. (CVE-2015-4748)

It was discovered that the JCE component in OpenJDK failed to use constant time comparisons in multiple cases. An attacker could possibly use these flaws to disclose sensitive information by measuring the time used to perform operations using these non-constant time comparisons. (CVE-2015-2601)

Multiple information leak flaws were found in the JMX and 2D components in OpenJDK. An untrusted Java application or applet could use this flaw to bypass certain Java sandbox restrictions. (CVE-2015-2621, CVE-2015-2632)

A flaw was found in the way the JSSE component in OpenJDK performed X.509 certificate identity verification when establishing a TLS/SSL connection to a host identified by an IP address. In certain cases, the certificate was accepted as valid if it was issued for a host name to which the IP address resolves rather than for the IP address. (CVE-2015-2625)

It was discovered that the JNDI component in OpenJDK did not handle DNS resolutions correctly. An attacker able to trigger such DNS errors could cause a Java application using JNDI to consume memory and CPU time, and possibly block further DNS resolution. (CVE-2015-4749)
java: two vulnerabilities
Package(s): java-1.8.0-openjdk
CVE #(s): CVE-2015-2659 CVE-2015-3149
Created: July 15, 2015
Updated: July 21, 2015
Description: From the Red Hat advisory:

It was discovered that the GCM (Galois Counter Mode) implementation in the Security component of OpenJDK failed to properly perform a null check. This could cause the Java Virtual Machine to crash when an application performed encryption using a block cipher in the GCM mode. (CVE-2015-2659)

Multiple insecure temporary file use issues were found in the way the Hotspot component in OpenJDK created performance statistics and error log files. A local attacker could possibly make a victim using OpenJDK overwrite arbitrary files using a symlink attack. Note: This issue was originally fixed as CVE-2015-0383, but the fix was regressed in the RHSA-2015:0809 advisory. (CVE-2015-3149)
kernel: two remote denial of service vulnerabilities
Package(s): kernel
CVE #(s): CVE-2015-5364 CVE-2015-5366
Created: July 13, 2015
Updated: June 14, 2016
Description: From the CVE assignment email, which was evidently prompted by a tweet from grsecurity:

It appears that you are primarily asking for a CVE ID for the issue involving the absence of a cond_resched call. Use CVE-2015-5364.

However, the presence of "return -EAGAIN" may also have been a security problem in some realistic circumstances. For example, maybe there's an attacker who can't transmit a flood with invalid checksums, but can sometimes inject one packet with an invalid checksum. The goal of this attacker isn't to cause a system hang; the goal is to cause an EPOLLET epoll application to stop reading for an indefinitely long period of time. This scenario can't also be covered by CVE-2015-5364. Is it better to have no CVE ID at all, e.g., is udp_recvmsg/udpv6_recvmsg simply not intended to defend against this scenario?
libcapsinetwork: denial of service
Package(s): libcapsinetwork
CVE #(s): CVE-2015-0841
Created: July 13, 2015
Updated: July 15, 2015
Description: From the Gentoo advisory:

An off-by-one buffer overflow in libcapsinetwork network handling code is discovered. A remote attacker could send a specially crafted request to application, that is linked with libcapsinetwork, possibly resulting in a Denial of Service condition.
libunwind: buffer overflow
Package(s): libunwind
CVE #(s): CVE-2015-3239
Created: July 13, 2015
Updated: October 5, 2015
Description: From the Debian LTS advisory:

Invalid dwarf opcodes can cause references beyond the end of the array.
openssl: certificate verification botch
Package(s): openssl
CVE #(s): CVE-2015-1793
Created: July 10, 2015
Updated: July 15, 2015
Description: From the OpenSSL advisory:

During certificate verification, OpenSSL (starting from version 1.0.1n and 1.0.2b) will attempt to find an alternative certificate chain if the first attempt to build such a chain fails. An error in the implementation of this logic can mean that an attacker could cause certain checks on untrusted certificates to be bypassed, such as the CA flag, enabling them to use a valid leaf certificate to act as a CA and "issue" an invalid certificate.
perl: denial of service
Package(s): perl
CVE #(s): CVE-2013-7422
Created: July 10, 2015
Updated: July 15, 2015
Description: From the Gentoo advisory:

S_regmatch() function lacks proper checks before passing arguments to atoi()

A remote attacker could send a specially crafted input, possibly resulting in a Denial of Service condition.
portage: certificate verification botch
Package(s): portage
CVE #(s): CVE-2013-2100
Created: July 10, 2015
Updated: July 15, 2015
Description: From the Gentoo advisory:

Portage does not verify X.509 SSL certificate properly if HTTPS is used. A remote attacker can spoof servers and modify binary package lists via specially crafted certificate.
python-django: two vulnerabilities
Package(s): python-django
CVE #(s): CVE-2015-5143 CVE-2015-5144
Created: July 9, 2015
Updated: October 22, 2015
Description: From the Debian advisory:

CVE-2015-5143: Eric Peterson and Lin Hua Cheng discovered that a new empty record used to be created in the session storage every time a session was accessed and an unknown session key was provided in the request cookie. This could allow remote attackers to saturate the session store or cause other users' session records to be evicted.

CVE-2015-5144: Sjoerd Job Postmus discovered that some built-in validators did not properly reject newlines in input values. This could allow remote attackers to inject headers in emails and HTTP responses.
rubygem-moped: denial of service
Package(s): rubygem-moped
CVE #(s): CVE-2015-4411
Created: July 14, 2015
Updated: July 15, 2015
Description: From the Red Hat bugzilla:

The following Denial of Service issue was discovered in Moped Ruby gem: If a crafted value will be passed to Moped::BSON::ObjecId.legal? method, this will cause Moped to think MongoDB is down, and ping it 39 more times with intervals. In other words, Moped will keep a worker busy for 5 seconds and make x40 requests to MongoDB.

This covers an incomplete fix for CVE-2015-4410.
virtuoso-opensource: multiple unspecified vulnerabilities
Package(s): virtuoso-opensource
CVE #(s): (none)
Created: July 9, 2015
Updated: July 15, 2015
Description: From a message to the oss-security list from Florian Weimer of Red Hat:

A long time ago, we looked at the low-level data marshaling code in the database server, and found quite a few memory safety issues. We also encountered server crashes and problems which looked like race conditions, affecting server stability. [...] We have not assigned CVE identifiers because the number of different crashes we saw was fairly large, and we could not completely understand how the RPC implementation is pieced together.

More information can also be found in the Mageia bug.
Page editor: Jake Edge
Kernel development
Brief items
Kernel release status
The current development kernel is 4.2-rc2, released on July 12. Linus said: "This is not a particularly big rc, and things have been fairly calm. We definitely did have some problems in -rc1 that bit people, but they all seemed to be pretty small, and let's hope that -rc2 ends up having fewer annoying issues."
Stable updates: 4.1.2, 4.0.8, 3.14.48, and 3.10.84 were released on July 10. Note that 4.0.9, when it is released, will be the last of the 4.0.x series.
Quotes of the week
Kernel development news
Making kernel pages movable
A longtime aspect of the kernel's memory-management subsystem is that it tends to fragment memory over time. After a system has been running for a while, finding groups of physically contiguous pages can be difficult; that is why kernel code will often go to great lengths to avoid the need for contiguous blocks of pages. But there are times when larger blocks are needed; among other things, the transparent huge pages feature requires them. Memory can be defragmented on the fly with the kernel's memory-compaction mechanism, but compaction is easily thwarted by kernel-allocated pages that cannot be moved out of the way.

User-space pages are easily migrated; they are accessed via the page tables, so relocating a page is just a matter of changing the appropriate page-table entries. Pages in the system's page cache are also accessed via a lookup, so they can be migrated as well. Pages allocated by a random kernel subsystem or driver, though, are not so easy to move. They are accessed directly using kernel-space pointers and cannot be moved without changing all of those pointers. Because kernel pages are so hard to move, the memory-management subsystem tries to separate them from pages that can be moved, but that separation can be hard to maintain, especially when memory is in short supply in general. A single unmovable page can foil compaction for a large block of memory.
Solving this problem in any given subsystem will require getting that subsystem's cooperation in the compaction process; that is just what Gioh Kim's driver-page migration patch series sets out to do. It builds on some special-case code (introduced in 2012) that makes balloon-driver pages movable; the patches generalize that code so that it may be used in other subsystems as well.
To make a driver (or other kernel subsystem) support page migration (and, thus, compaction), the first step is to allocate an anonymous inode to represent those pages:
#include <linux/anon_inodes.h>

struct inode *anon_inode_new(void);
The only real purpose of this inode appears to be to hold a pointer to an address_space_operations structure containing a few migration-related callbacks. The relevant methods are:
bool (*isolatepage) (struct page *page, isolate_mode_t mode);
void (*putbackpage) (struct page *page);
int  (*migratepage) (struct address_space *space, struct page *page,
                     struct page *newpage, enum migrate_mode mode);
migratepage() has been in the kernel (in various forms) since 2.6.16; the other two are new with Gioh's patch. To support compaction of its pages, a kernel subsystem should provide all three of these operations. Once the anonymous inode has been allocated, its i_mapping->a_ops field should be set to point to the address_space_operations structure containing the above methods.
Needless to say, only whole pages can be supported in the page-compaction system; memory allocated from slab caches will remain immobile. To make a page movable by the compaction code, a kernel subsystem needs to (1) mark the page as being "mobile" and (2) set its mapping field to that of the anonymous inode:
__SetPageMobile(page);
page->mapping = anon_inode->mapping;
Once that is done, the kernel may consider that page for migration if it turns out to be in the way. The first step will be a call to isolatepage() to disconnect any internal mappings and ensure that the page can, indeed, be moved. The mode argument doesn't appear to be relevant for most code outside of the memory-management subsystem; the function should return true if the page can be migrated. Note that it's not necessary to cease use of the page at this point, but it is necessary to retain its ability to be moved.
The actual migration may or may not happen, depending on whether other nearby pages turn out to be movable. If it does happen, the migratepage() callback will be invoked. It should do whatever work is needed to copy the page's contents, set the new page's flags properly, and update any internal pointers to the new page. It should also perform whatever locking is needed to avoid concurrent access to the pages while the migration is taking place. The return code should be MIGRATEPAGE_SUCCESS if the operation worked, or a negative error code otherwise. If the migration succeeds, the old page should not be touched again after migratepage() returns.
The final step is a call to putbackpage(); its job is to replace the page in any internal lists and generally complete the migration process. If isolatepage() has been called on a given page, there will eventually be a putbackpage() call, regardless of whether the page is actually migrated in between the two calls.
As can be seen, there is a fair amount of work required to support compaction in an arbitrary kernel subsystem. As a result, this support is likely to be confined to a relatively small number of subsystems that use substantial amounts of memory. Gioh's patch adapts the balloon driver subsystem in this way; on systems employing virtualization, balloon devices can (by their nature) use large amounts of memory, so making it movable makes some sense. Other possible use cases include long-lived I/O buffers or drivers (such as graphics drivers) that need to store large amounts of data. Fixing just a few of these drivers should go a long way toward making more large, physically contiguous regions of memory available even after the system has been up for some time.
A walk among the symlinks
This is the third and final article in a series on pathname lookup in the Linux kernel; the first two were an introduction to pathname lookup and a look at RCU-walk. Thus far, the discussion has carefully avoided the complex subject of symbolic links, but that is about to change. Linux 4.2 will contain a substantial rewrite of much of the code for handling symbolic links in pathname lookup, which was part of the motivation for writing this series. Now we finally have enough background understanding to explore how this new symlink handling works.

Symbolic links were first introduced into Unix with 4.1c-BSD in the early 1980s, but were not uniformly heralded as a good idea. Questions arose concerning whether they should be "as obvious as possible" (Dennis Ritchie's position) or whether they should be largely transparent. This particularly related to how ".." should be handled when the kernel is following a symlink. David Korn — author of the Korn Shell — made a fairly concrete proposal for the kernel to track which path was used to reach the "current working directory" so that ".." could lead back along that path. This never made it into any released Unix kernel, but does explain the behavior of the pwd built-in to ksh and related shells such as bash.
Other concerns were raised over what permission bits should mean and whether hard links to symlinks made sense. Such discussions have long since died down and POSIX came to define a set of semantics which, if not ideal, are at least uniformly implemented and fairly well understood. The task for pathname lookup in Linux is not to debate the meaning or value of symbolic links, but only to implement those semantics, correctly handling the various corner cases.
There are two changes of note that happened in the recent rewrite. First, the recursive function calls were removed. There is still a recursive element to the algorithm because the problem itself is recursive, but it is now implemented using iteration and an explicit stack. This allows the symlink stack to be allocated separately from the system stack and so reduces pressure on what is often a limited resource. One concrete benefit of this is that code in the "lustre" filesystem that places extra limits on symlink recursion due to stack space concerns can be removed.
Second, the new code allows symlinks to be followed while in RCU-walk mode, at least some of the time. Previously this was not possible, partly because there are some awkward cases and partly because no one had bothered to do the work.
The effort needed to understand the particular needs of symlinks in order to address these issues has resulted in some significant cleaning up of the code and simplifying of interfaces. The cleanup should remove at least one source of confusion that surfaced recently.
There are several basic issues that we will examine to understand the handling of symbolic links: the symlink stack, together with cache lifetimes, will help us understand the overall recursive handling of symlinks and lead to the special care needed for the final component. Then a consideration of access-time updates and summary of the various flags controlling lookup will finish the story.
The symlink stack
There are only two sorts of filesystem objects that can usefully appear in a path prior to the final component: directories and symlinks. Handling directories is quite straightforward: the new directory simply becomes the starting point at which to interpret the next component on the path. Handling symbolic links requires a bit more work.
Conceptually, symbolic links could be handled by editing the path. If a component name refers to a symbolic link, then that component is replaced by the body of the link and, if that body starts with a '/', then all preceding parts of the path are discarded. This is what the "readlink -f" command does, though it also edits out "." and ".." components.
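As a rough user-space illustration of that conceptual model (this is not how the kernel does it, and the edit_path() helper below is purely hypothetical), a component that turns out to be a symlink can be spliced out of the path text, with an absolute link body discarding everything before it:

/*
 * Illustrative sketch only: textually "editing" a path when one
 * component turns out to be a symlink.  The kernel avoids this kind
 * of string editing; edit_path() is a hypothetical helper.
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Replace the component that was a symlink with the link body. */
static char *edit_path(const char *prefix, const char *link_body,
		       const char *rest)
{
	char *out = malloc(strlen(prefix) + strlen(link_body) +
			   strlen(rest) + 3);

	if (link_body[0] == '/')	/* absolute: discard the prefix */
		sprintf(out, "%s/%s", link_body, rest);
	else				/* relative: splice it in */
		sprintf(out, "%s/%s/%s", prefix, link_body, rest);
	return out;
}

int main(void)
{
	/* Suppose /usr/local/bin is a symlink to "../bin" ... */
	char *p1 = edit_path("/usr/local", "../bin", "gcc");
	/* ... or to the absolute path "/opt/tools". */
	char *p2 = edit_path("/usr/local", "/opt/tools", "gcc");

	printf("%s\n%s\n", p1, p2);
	free(p1);
	free(p2);
	return 0;
}

Running it prints /usr/local/../bin/gcc and /opt/tools/gcc, the same substitution that readlink -f performs before it also cleans up "." and ".." components.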
Directly editing the path string is not really necessary when looking up a path, and discarding early components is pointless as they aren't looked at anyway. Keeping track of all remaining components is important, but they can of course be kept separately; there is no need to concatenate them. As one symlink may easily refer to another, which in turn can refer to a third, we may need to keep the remaining components of several paths, each to be processed when the preceding ones are completed. These path remnants are kept on a stack of limited size.
There are two reasons for placing limits on how many symlinks can occur in a single path lookup. The most obvious is to avoid loops. If a symlink referred to itself either directly or through intermediaries, then following the symlink can never complete successfully — the error ELOOP must be returned. Loops can be detected without imposing limits, but limits are the simplest solution and, given the second reason for restriction, quite sufficient.
The second reason was outlined recently by Linus:
Linux imposes a limit on the length of any pathname: PATH_MAX, which is 4096. There are a number of reasons for this limit; not letting the kernel spend too much time on just one path is one of them. With symbolic links you can effectively generate much longer paths so some sort of limit is needed for the same reason. Linux imposes a limit of at most 40 symlinks in any one path lookup. It previously imposed a further limit of eight on the maximum depth of recursion, but that was raised to 40 when a separate stack was implemented, so there is now just the one limit.
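Both limits are easy to observe from user space. The sketch below (the file names under /tmp are arbitrary, and it assumes /etc/hostname exists) creates a self-referential symlink and a chain of 40 symlinks; opening the self-loop, or a path that needs 41 symlinks, should fail with ELOOP, while a path that needs only 40 still resolves.

/*
 * User-space demonstration of the ELOOP limits described above.
 * All of the names and the /etc/hostname target are arbitrary choices.
 */
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
	char from[64], to[64];
	int i;

	/* A symlink that refers to itself can never resolve. */
	unlink("/tmp/self");
	symlink("/tmp/self", "/tmp/self");
	if (open("/tmp/self", O_RDONLY) < 0)
		printf("self-loop: %s\n", strerror(errno));	/* ELOOP */

	/* Build a chain: link0 -> /etc/hostname, linkN -> linkN-1. */
	unlink("/tmp/link0");
	symlink("/etc/hostname", "/tmp/link0");
	for (i = 1; i <= 40; i++) {
		snprintf(to, sizeof(to), "/tmp/link%d", i - 1);
		snprintf(from, sizeof(from), "/tmp/link%d", i);
		unlink(from);
		symlink(to, from);
	}
	/* link39 needs 40 symlinks: within the limit. */
	if (open("/tmp/link39", O_RDONLY) >= 0)
		printf("40 symlinks deep: resolves\n");
	/* link40 needs 41 symlinks: one too many. */
	if (open("/tmp/link40", O_RDONLY) < 0)
		printf("41 symlinks deep: %s\n", strerror(errno));	/* ELOOP */
	return 0;
}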
The nameidata structure that we met in an earlier article contains a small stack that can be used to store the remaining part of up to two symlinks. In many cases this will be sufficient. If it isn't, a separate stack is allocated with room for 40 symlinks. Pathname lookup will never exceed that stack as, once the 40th symlink is detected, an error is returned. It might seem that the name remnants are all that needs to be stored on this stack, but we need a bit more. To see that, we need to move on to cache lifetimes.
Storage and lifetime of cached symlinks
Like other filesystem resources, such as inodes and directory entries, symlinks are cached by Linux to avoid repeated costly access to external storage. It is particularly important for RCU-walk to be able to find and temporarily hold onto these cached entries, so that it doesn't need to drop down into REF-walk.
While each filesystem is free to make its own choice, symlinks are typically stored in one of two places. Short symlinks are often stored directly in the inode. When a filesystem allocates a struct inode it typically allocates extra space to store private data (a common object-oriented design pattern in the kernel). This will sometimes include space for a symlink. The other common location is in the page cache, which normally stores the content of files. The pathname in a symlink can be seen as the content of that symlink and can easily be stored in the page cache just like file content.
When neither of these are suitable, the next most likely scenario is that the filesystem will allocate some temporary memory and copy or construct the symlink content into that memory whenever it is needed.
When the symlink is stored in the inode, it has the same lifetime as the inode which, itself, is protected by RCU or by a counted reference on the dentry. This means that the mechanisms that pathname lookup uses to access the dcache and icache (inode cache) safely are quite sufficient for accessing some cached symlinks safely. In these cases, the i_link pointer in the inode is set to point to wherever the symlink is stored and it can be accessed directly whenever needed.
When the symlink is stored in the page cache or elsewhere, the situation is not so straightforward. A reference on a dentry or even on an inode does not imply any reference on cached pages of that inode, and even an rcu_read_lock() is not sufficient to ensure that a page will not disappear. So, for these symlinks, the pathname lookup code needs to ask the filesystem to provide a stable reference and, significantly, needs to release that reference when it is finished with it.
Taking a reference to a cache page is often possible even in RCU-walk mode. It does require making changes to memory, which is best avoided, but that isn't necessarily a big cost and it is better than dropping out of RCU-walk mode completely. Even filesystems that allocate space to copy the symlink into can use GFP_ATOMIC to often successfully allocate memory without the need to drop out of RCU-walk. If a filesystem cannot successfully get a reference in RCU-walk mode, it must return -ECHILD and unlazy_walk() will be called to return to REF-walk mode in which the filesystem is allowed to sleep.
The place for all this to happen is the i_op->follow_link() inode method. In the present mainline code this is never actually called in RCU-walk mode as the rewrite is not quite complete. It is likely that in a future release this method will be passed an inode pointer when called in RCU-walk mode so it both (1) knows to be careful, and (2) has the validated pointer. Much like the i_op->permission() method we looked at previously, ->follow_link() would need to be careful that all the data structures it references are safe to be accessed while holding no counted reference, only the RCU lock. Though getting a reference with ->follow_link() is not yet done in RCU-walk mode, the code is ready to release the reference when that does happen.
This need to drop the reference to a symlink adds significant complexity. It requires a reference to the inode so that the i_op->put_link() inode operation can be called. In REF-walk, that reference is kept implicitly through a reference to the dentry, so keeping the struct path of the symlink is easiest. For RCU-walk, the pointer to the inode is kept separately. To allow switching from RCU-walk back to REF-walk in the middle of processing nested symlinks we also need the seq number for the dentry so we can confirm that switching back was safe.
Finally, when providing a reference to a symlink, the filesystem also provides an opaque "cookie" that must be passed to ->put_link() so that it knows what to free. This might be the allocated memory area, or a pointer to the struct page in the page cache, or something else completely. Only the filesystem knows what it is.
In order for the reference to each symlink to be dropped when the walk completes, whether in RCU-walk or REF-walk, the symlink stack needs to contain, along with the path remnants:
- the struct path to provide a reference to the inode in REF-walk
- the struct inode * to provide a reference to the inode in RCU-walk
- the seq to allow the path to be safely switched from RCU-walk to REF-walk
- the cookie that tells ->put_link() what to put.
This means that each entry in the symlink stack needs to hold five pointers and an integer instead of just one pointer (the path remnant). On a 64-bit system, this is about 40 bytes per entry; with 40 entries it adds up to 1600 bytes total, which is less than half a page. So it might seem like a lot, but is by no means excessive.
Note that, in a given stack frame, the path remnant (name) is not part of the symlink that the other fields refer to. It is the remnant to be followed once that symlink has been fully parsed.
Following the symlink
The main loop in link_path_walk() iterates seamlessly over all components in the path and all of the non-final symlinks. As symlinks are processed, the name pointer is adjusted to point to a new symlink, or is restored from the stack, so that much of the loop doesn't need to notice. Getting this name variable on and off the stack is very straightforward; pushing and popping the references is a little more complex.
When a symlink is found, walk_component() returns the value 1 (0 is returned for any other sort of success, and a negative number is, as usual, an error indicator). This causes get_link() to be called; it then gets the link from the filesystem. Provided that operation is successful, the old path name is placed on the stack, and the new value is used as the name for a while. When the end of the path is found (i.e. *name is '\0'), the old name is restored off the stack and path walking continues.
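To get a feel for that control flow, here is a deliberately simplified user-space model of the same idea, using lstat() and readlink() plus an explicit stack of name remnants. It mirrors only the pushing and popping of names described above; it ignores ".", "..", mount points, permissions, and all of the locking and RCU concerns, and every name and limit in it is illustrative rather than taken from the kernel.

/*
 * A toy, user-space model of the "stack of path remnants" idea.
 * It resolves an absolute path component by component, pushing the
 * unwalked remainder whenever a component is a symlink.
 */
#include <limits.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

#define MAXLINKS 40

int main(int argc, char **argv)
{
	char stack[MAXLINKS][PATH_MAX];	/* saved path remnants */
	char name[PATH_MAX];		/* the remnant being walked now */
	char cur[PATH_MAX] = "";	/* directory reached so far */
	char link[PATH_MAX];
	int depth = 0, nlinks = 0;

	if (argc != 2 || argv[1][0] != '/') {
		fprintf(stderr, "usage: %s /absolute/path\n", argv[0]);
		return 1;
	}
	strcpy(name, argv[1] + 1);	/* skip the leading '/' */

	for (;;) {
		while (*name) {
			char comp[PATH_MAX], *slash = strchr(name, '/');
			struct stat st;
			ssize_t n;

			/* Split off the next component. */
			if (slash) {
				memcpy(comp, name, slash - name);
				comp[slash - name] = '\0';
				memmove(name, slash + 1, strlen(slash + 1) + 1);
			} else {
				strcpy(comp, name);
				*name = '\0';
			}
			snprintf(cur + strlen(cur), sizeof(cur) - strlen(cur),
				 "/%s", comp);

			if (lstat(cur, &st) < 0) {
				perror(cur);
				return 1;
			}
			if (!S_ISLNK(st.st_mode))
				continue;	/* an ordinary component */

			/* A symlink: save the remnant, walk the link body. */
			if (++nlinks > MAXLINKS) {
				fprintf(stderr, "ELOOP\n");
				return 1;
			}
			n = readlink(cur, link, sizeof(link) - 1);
			if (n < 0) {
				perror(cur);
				return 1;
			}
			link[n] = '\0';
			strcpy(stack[depth++], name);
			if (link[0] == '/') {
				strcpy(name, link + 1);
				cur[0] = '\0';		/* restart at the root */
			} else {
				strcpy(name, link);
				*strrchr(cur, '/') = '\0';  /* drop the link component */
			}
		}
		if (depth == 0)
			break;
		strcpy(name, stack[--depth]);	/* pop the saved remnant */
	}
	printf("%s\n", cur[0] ? cur : "/");
	return 0;
}

Run on something like /usr/bin/cc, it prints the path with each symlink substituted, which is roughly what link_path_walk() achieves internally.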
Pushing and popping the reference pointers (inode, cookie, etc.) is more complex in part because of the desire to handle tail recursion. When the last component of a symlink itself points to a symlink, we want to pop the symlink-just-completed off the stack before pushing the symlink-just-found to avoid leaving empty path remnants that would just get in the way.
It is most convenient to push the new symlink references onto the stack in walk_component() immediately when the symlink is found; walk_component() is also the last piece of code that needs to look at the old symlink as it walks that last component. So it is quite convenient for walk_component() to release the old symlink and pop the references just before pushing the reference information for the new symlink. It is guided in this by two flags: WALK_GET, which gives it permission to follow a symlink if it finds one, and WALK_PUT, which tells it to release the current symlink after it has been followed. WALK_PUT is tested first, leading to a call to put_link(). WALK_GET is tested subsequently (by should_follow_link()), leading to a call to pick_link(), which sets up the stack frame.
Symlinks with no final component
A pair of special-case symlinks deserve a little further explanation. Both result in a new struct path (with mount and dentry) being set up in the nameidata, and result in get_link() returning NULL.
The more obvious case is a symlink to "/". All symlinks starting with "/" are detected in get_link(), which resets the nameidata to point to the effective filesystem root. If the symlink only contains "/" then there is nothing more to do, no components at all, so NULL is returned to indicate that the symlink can be released and the stack frame discarded.
The other case involves things in /proc that look like symlinks but aren't really.
$ ls -l /proc/self/fd/1
lrwx------ 1 neilb neilb 64 Jun 13 10:19 /proc/self/fd/1 -> /dev/pts/4
Every open file descriptor in any process is represented in /proc by something that looks like a symlink. It is really a reference to the target file, not just the name of it. When you readlink() these objects you get a name that might refer to the same file — unless it has been unlinked or mounted over. When walk_component() follows one of these, the ->follow_link() method in "procfs" doesn't return a string name, but instead calls nd_jump_link(), which updates the nameidata in place to point to that target. ->follow_link() then returns NULL. Again there is no final component and get_link() reports this by leaving the last_type field of nameidata as LAST_BIND.
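The difference between these /proc entries and an ordinary symlink is easy to demonstrate from user space: the entry keeps working even after the name that readlink() reports has gone away (the /tmp/demofile name below is arbitrary).

/*
 * Show that /proc/self/fd/N is a reference to the open file itself,
 * not merely the name that readlink() reports.
 */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
	char proc[64], target[256];
	ssize_t n;
	int fd = open("/tmp/demofile", O_RDWR | O_CREAT, 0600);

	if (fd < 0) {
		perror("open");
		return 1;
	}
	snprintf(proc, sizeof(proc), "/proc/self/fd/%d", fd);

	unlink("/tmp/demofile");	/* the name is now gone ... */
	n = readlink(proc, target, sizeof(target) - 1);
	if (n >= 0) {
		target[n] = '\0';
		printf("readlink: %s\n", target);  /* "... (deleted)" */
	}
	/* ... but opening via /proc/self/fd still reaches the file. */
	if (open(proc, O_RDWR) >= 0)
		printf("reopen via %s succeeded\n", proc);
	return 0;
}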
Following the symlink in the final component
All this leads to link_path_walk() walking down every component, and following all symbolic links it finds, until it reaches the final component. This is just returned in the last field of nameidata. For some callers, this is all they need; they want to create that last name if it doesn't exist or give an error if it does. Other callers will want to follow a symlink if one is found, and possibly apply special handling to the last component of that symlink, rather than just the last component of the original file name. These callers potentially need to call link_path_walk() again and again on successive symlinks until one is found that doesn't point to another symlink.
This case is handled by the relevant caller of link_path_walk(), such as path_lookupat(), using a loop that calls link_path_walk(), and then handles the final component. If the final component is a symlink that needs to be followed, then trailing_symlink() is called to set things up properly and the loop repeats, calling link_path_walk() again. This could loop as many as 40 times if the last component of each symlink is another symlink.
The various functions that examine the final component and possibly report that it is a symlink are lookup_last(), mountpoint_last(), and do_last(), each of which uses the same convention as walk_component() of returning 1 if a symlink was found that needs to be followed. Of these, do_last() is the most interesting as it is used for opening a file. Part of do_last() runs with i_mutex held and this part is in a separate function: lookup_open().
Explaining do_last() completely is beyond the scope of this article, but a few highlights should help those interested in exploring the code.
- Rather than just finding the target file, do_last() needs to open it. If the file was found in the dcache, then vfs_open() is used for this. If not, then lookup_open() will either call atomic_open() (if the filesystem provides it) to combine the final lookup with the open, or will perform the separate lookup_real() and vfs_create() steps directly. In the latter case, the actual "open" of this newly found or created file will be performed by vfs_open(), just as if the name were found in the dcache.
- vfs_open() can fail with -EOPENSTALE if the cached information wasn't quite current enough. Rather than restarting the lookup from the top with LOOKUP_REVAL set, lookup_open() is called instead, giving the filesystem a chance to resolve small inconsistencies. If that doesn't work, only then is the lookup restarted from the top.
- An open with O_CREAT does follow a symlink in the final component, unlike other creation system calls (like mkdir). So the sequence:
ln -s bar /tmp/foo
echo hello > /tmp/foo
will create a file called /tmp/bar. This is not permitted if O_EXCL is set but otherwise is handled for an O_CREAT open much like for a non-creating open: should_follow_link() returns 1, and so does do_last(), so that trailing_symlink() gets called and the open process continues on the symlink that was found.
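The same sequence can be reproduced from C. The sketch below reuses the /tmp/foo and bar names from the shell example above; the O_CREAT open follows the trailing symlink and creates its target, while adding O_EXCL makes the open fail even when the target does not exist.

/*
 * An O_CREAT open follows a symlink in the final component, but
 * O_CREAT|O_EXCL refuses to, as described above.
 */
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
	int fd;

	unlink("/tmp/foo");
	unlink("/tmp/bar");
	symlink("bar", "/tmp/foo");		/* ln -s bar /tmp/foo */

	fd = open("/tmp/foo", O_WRONLY | O_CREAT, 0644);
	if (fd >= 0 && access("/tmp/bar", F_OK) == 0)
		printf("O_CREAT followed the symlink: /tmp/bar now exists\n");
	if (fd >= 0)
		close(fd);

	unlink("/tmp/bar");	/* remove the target again ... */
	fd = open("/tmp/foo", O_WRONLY | O_CREAT | O_EXCL, 0644);
	if (fd < 0)		/* ... yet O_EXCL still refuses the symlink */
		printf("O_CREAT|O_EXCL: %s\n", strerror(errno));	/* EEXIST */
	return 0;
}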
Updating the access time
We previously said of RCU-walk that it would "take no locks, increment no counts, leave no footprints." We have since seen that some "footprints" can be needed when handling symlinks as a counted reference (or even a memory allocation) may be needed. But these footprints are best kept to a minimum.
One other place where walking down a symlink can involve leaving footprints in a way that doesn't affect directories is in updating access times. In Unix (and Linux) every filesystem object has a "last accessed time", or "atime". Passing through a directory to access a file within is not considered to be an access for the purposes of atime; only listing the contents of a directory can update its atime. Symlinks are different it seems. Both reading a symlink (with readlink()) and looking up a symlink on the way to some other destination can update the atime on that symlink.
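That behavior can be observed from user space, although whether the timestamp actually advances depends on mount options such as noatime and relatime, discussed below. A small sketch (the /tmp/atimelink name and /etc/hostname target are arbitrary):

/*
 * Observe the symlink-atime behavior described above.  Whether the
 * atime actually advances depends on the filesystem's mount options.
 */
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

static long atime_of(const char *path)
{
	struct stat st;

	if (lstat(path, &st) < 0)	/* lstat: the link itself, not the target */
		return -1;
	return (long)st.st_atime;
}

int main(void)
{
	unlink("/tmp/atimelink");
	symlink("/etc/hostname", "/tmp/atimelink");

	printf("before: %ld\n", atime_of("/tmp/atimelink"));
	sleep(2);			/* get past one-second granularity */
	/* Traversing the link during lookup may update its atime. */
	close(open("/tmp/atimelink", O_RDONLY));
	printf("after:  %ld\n", atime_of("/tmp/atimelink"));
	return 0;
}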
It is not clear why this is the case; POSIX has little to say on the subject. The clearest statement is that, if a particular implementation updates a timestamp in a place not specified by POSIX, this must be documented "except that any changes caused by pathname resolution need not be documented". This seems to imply that POSIX doesn't really care about access-time updates during pathname lookup.
An examination of history shows that, prior to Linux 1.3.87, the ext2 filesystem, at least, didn't update atime when following a link. Unfortunately we have no record of why that behavior was changed.
In any case, access time must now be updated and that operation can be quite complex. Trying to stay in RCU-walk while doing it is best avoided. Fortunately it is often permitted to skip the atime update. Because atime updates cause performance problems in various areas, Linux supports the relatime mount option, which generally limits the updates of atime to once per day on files that aren't being changed (and symlinks never change once created). Even without relatime, many filesystems record atime with a one-second granularity, so only one update per second is required.
It is easy to test if an atime update is needed while in RCU-walk mode and, if it isn't, the update can be skipped and RCU-walk mode continues. Only when an atime update is actually required does the path walk drop down to REF-walk. All of this is handled in the get_link() function.
A few flags
A suitable way to wrap up this tour of pathname walking is to list the various flags that can be stored in the nameidata to guide the lookup process. Many of these are only meaningful on the final component, while others reflect the current state of the pathname lookup. And then there is LOOKUP_EMPTY, which doesn't fit conceptually with the others. If this is not set, an empty pathname causes an error very early on. If it is set, empty pathnames are not considered to be an error.
Global state flags
We have already met two global state flags: LOOKUP_RCU and LOOKUP_REVAL. These select between one of three overall approaches to lookup: RCU-walk, REF-walk, and REF-walk with forced revalidation.
LOOKUP_PARENT indicates that the final component hasn't been reached yet. This is primarily used to tell the audit subsystem the full context of a particular access being audited.
LOOKUP_ROOT indicates that the root field in the nameidata was provided by the caller, so it shouldn't be released when it is no longer needed.
LOOKUP_JUMPED means that the current dentry was chosen not because it had the right name but for some other reason. This happens when following "..", following a symlink to "/", crossing a mount point or accessing a "/proc/$PID/fd/$FD" symlink. In this case the filesystem has not been asked to revalidate the name (with d_revalidate()). In such cases the inode may still need to be revalidated, so d_op->d_weak_revalidate() is called if LOOKUP_JUMPED is set when the lookup completes — which may be at the final component or, when creating, unlinking, or renaming, at the penultimate component.
Final-component flags
Some of these flags are only set when the final component is being considered. Others are only checked for when considering that final component.
LOOKUP_AUTOMOUNT ensures that, if the final component is an automount point, then the mount is triggered. Some operations would trigger it anyway, but operations like stat() deliberately don't. statfs() needs to trigger the mount but otherwise behaves a lot like stat(), so it sets LOOKUP_AUTOMOUNT, as do quotactl() and the handling of "mount --bind".
LOOKUP_FOLLOW has a similar function to LOOKUP_AUTOMOUNT but for symlinks. Some system calls set or clear it implicitly, while others have API flags such as AT_SYMLINK_FOLLOW and UMOUNT_NOFOLLOW to control it. Its effect is similar to WALK_GET that we already met, but it is used in a different way.
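From user space, the effect of LOOKUP_FOLLOW on the final component shows up through the familiar pairs of interfaces; a short sketch (the /tmp/lnk name and /etc/hostname target are arbitrary):

/*
 * User-visible effects of following (or not) a symlink in the final
 * component: stat() follows it, AT_SYMLINK_NOFOLLOW does not, and
 * O_NOFOLLOW makes open() fail on a trailing symlink.
 */
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void)
{
	struct stat st;

	unlink("/tmp/lnk");
	symlink("/etc/hostname", "/tmp/lnk");

	if (stat("/tmp/lnk", &st) == 0)
		printf("stat(): %s\n",
		       S_ISLNK(st.st_mode) ? "a symlink" : "followed to the target");
	if (fstatat(AT_FDCWD, "/tmp/lnk", &st, AT_SYMLINK_NOFOLLOW) == 0)
		printf("fstatat(AT_SYMLINK_NOFOLLOW): %s\n",
		       S_ISLNK(st.st_mode) ? "a symlink" : "followed to the target");
	if (open("/tmp/lnk", O_RDONLY | O_NOFOLLOW) < 0)
		printf("open(O_NOFOLLOW): %s\n", strerror(errno));  /* ELOOP */
	return 0;
}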
LOOKUP_DIRECTORY insists that the final component is a directory. Various callers set this and it is also set when the final component is found to be followed by a slash.
Finally LOOKUP_OPEN, LOOKUP_CREATE, LOOKUP_EXCL, and LOOKUP_RENAME_TARGET are not used directly by the VFS but are made available to the filesystem and particularly the ->d_revalidate() method. A filesystem can choose not to bother revalidating too hard if it knows that it will be asked to open or create the file soon. These flags were previously useful for ->lookup() too but with the introduction of ->atomic_open() they are less relevant there.
End of the road
Despite its complexity, all this pathname lookup code appears to be in good shape — various parts are certainly easier to understand now than even a couple of releases ago. But that doesn't mean it is "finished". As already mentioned, RCU-walk currently only follows symlinks that are stored in the inode so, while it handles many ext4 symlinks, it doesn't help with NFS, XFS, or Btrfs. That support is not likely to be long delayed.
There is also room for new enhancements. Having a single mutex to serialize all changes and uncached lookups in a directory can cause problems in some scenarios. As Linus said while discussing the issue: "anyway, just grepping for 'i_mutex' made me almost cry." There is no immediate solution apparent, but it is likely that something could be done if sufficient motivation were found.
A much simpler change that has been suggested is to add new lookup flags for "no symlinks" and "no dotdot". These could possibly be used by Samba, or by the Apache web server to handle lookup more efficiently when the "FollowSymlinks" directive is not in effect. This would need little more than an agreement on the correct API — so maybe not so easy after all.
But these are all issues for the future. For now it is good to have something that works, that handles all the corner cases, that is really very efficient, and that is even documented.
Patches and updates
Kernel trees
Architecture-specific
Core kernel code
Device drivers
Device driver infrastructure
Documentation
Filesystems and block I/O
Janitorial
Memory management
Networking
Security-related
Virtualization and containers
Miscellaneous
Page editor: Jonathan Corbet
Distributions
Why Debian returned to FFmpeg
Slightly less than one year ago, the Debian community had an extended discussion on whether the FFmpeg multimedia library should return to the distribution. Debian had followed the contentious libav fork when it happened in 2011, but some community members were starting to have second thoughts about that move. At the time, the discussion died out without any changes being made, but the seeds had evidently been planted; on July 8, the project's multimedia developers announced that not only was FFmpeg returning to Debian, but it would be replacing libav.
Chances are that many Debian (and Ubuntu) users are not more than peripherally aware of which multimedia library is running on their system. Libav has been in use for some years and has generally filled the bill, so it is natural to wonder what drove the project to make a change that will require a lot of work and which seems certain to prove disruptive while it is underway. Getting to the answers is an interesting study in how distributions try to ensure that the code they ship comes from healthy upstream projects.
Security and more
The Debian project is not normally known for being afraid to ship multiple packages providing the same function, so one might wonder why it can't just ship both FFmpeg and libav, letting users decide which one they want to use. The big sticking point in 2014 was security support. Both projects have had more than their share of security issues, and the Debian security team didn't think it could keep up with patching both of them. At that time, FFmpeg seemed to have a better record of responding to vulnerabilities than libav, but that still was not enough to convince the security team to support both libraries.
One year later, security issues remained at the top of the list, but it would appear that FFmpeg has pulled well ahead of libav in this regard. Debian security team member Moritz Muehlenhoff made it clear that he saw FFmpeg as being more responsive to security reports. A rather stronger argument came from Mateusz “j00ru” Jurczyk who, in his security-oriented role at Google, has been doing extensive fuzz testing of both projects and reporting the problems that come up:
The notion of hundreds of open security issues is generally unappealing. FFmpeg's lead does not appear to be limited to security fixes, though; by all accounts, it now supports a far wider variety of codecs and containers than libav does, and there is an increasing range of formats that FFmpeg can play but libav cannot. As that feature gap grows, the Debian project's desire to stay with libav wanes.
The libav maintainer's perspective
When Alessio Treglia restarted the discussion at the end of April, the above points were quickly expressed. Even so, the conversation did not appear to be heading toward any sort of consensus. Arguably, the turning point was when Debian libav maintainer Reinhard Tartler entered the discussion. Reinhard argued forcefully for the advantages he saw in libav, but, in the end, could not bring himself to say that he was sure libav was the better choice.
With regard to security issues, Reinhard attributed the difference in fix rates to a difference in how the two projects approach development ("Michael" is Michael Niedermayer, the lead developer of FFmpeg):
Reinhard initially asserted that, even so, libav had parity with FFmpeg when it came to fixing security-related bugs, but he later backed down on that.
In Reinhard's view, the two projects are managed differently, with different goals; that difference makes libav appealing in a number of ways. Libav, he said, is trying to improve the state of the code and come up with something better than the "horrible" APIs it inherited from FFmpeg. He summarized the differences between the projects this way:
Even though he seems to like the libav approach more, Reinhard, in the end, was not able to argue against the change; his position came down to: "I still have some concerns with this move, but I can't claim Libav to be superior to FFmpeg at this point". With the project's libav maintainer taking that position (and also, importantly, saying that he no longer has the time to maintain libav at the same level as he has in the past), the decision seemed to settle out fairly quickly.
Other concerns
A desire that was expressed more than once in this discussion was that the two projects would stop fighting and join back into a single, well-supported effort. There is, however, no real indication that any such reconciliation is in the cards. There is another way that the community could go back to having a single project, though: if one of them were to simply fail. Dmitry Smirnov suggested that a switch to FFmpeg by Debian could maybe bring that about:
Opinions vary on how much "life support" Debian actually provides to libav, but the loss of Debian and Ubuntu seems certain not to do the project any good. There aren't a lot of distributions out there that carry libav anymore; without Debian, that list will be short indeed. It might just be that libav is not sustainable without Debian.
That said, there are some concerns about the sustainability of FFmpeg as well. By all accounts, Michael is a highly productive developer; he accounts for, by far, the largest share of the patches going into FFmpeg. Reinhard asked whether FFmpeg is a one-developer project that would find itself in trouble should Michael stop working on it. "To me, this constitutes a serious bus-factor: Without Michael, (probably) nobody is able to replace him." He went on to suggest, though, that Michael's departure could do a lot to bring an end to the fork.
As an argument against the "one-man show" concern, Andreas Cadhalpun posted some commit statistics for both projects, covering the period since September 2014:
Commits (libav) | Developer (libav) | Commits (FFmpeg) | Developer (FFmpeg) |
---|---|---|---|
294 | Vittorio Giovara | 1831 | Michael Niedermayer |
253 | Martin Storsjö | 294 | Vittorio Giovara |
206 | Anton Khirnov | 252 | Martin Storsjö |
131 | Luca Barbato | 197 | Anton Khirnov |
72 | Diego Biurrun | 179 | Clément Bœsch |
46 | Michael Niedermayer | 155 | James Almer |
32 | Rémi Denis-Courmont | 150 | Carl Eugen Hoyos |
21 | Andreas Cadhalpun | 114 | Andreas Cadhalpun |
17 | Hendrik Leppkes | 113 | Luca Barbato |
16 | Gabriel Dume | 98 | Lukasz Marek |
16 | Himangi Saraogi | 93 | Paul B Mahol |
16 | wm4 | 85 | Ronald S. Bultje |
14 | Federico Tomassetti | 83 | wm4 |
12 | Peter Meerwald | 66 | Christophe Gisquet |
11 | Janne Grunau | 48 | Benoit Fouet |
At a first glance, the table shows that (1) FFmpeg appears to have a much higher commit traffic than libav, and (2) Michael, while being the largest contributor, is certainly not the only contributor. But, as Reinhard pointed out, there is a bit more to this story. Changes to libav are routinely merged into FFmpeg, but the flow of patches in the other direction is quite low. If the libav changes are subtracted out of the FFmpeg numbers, the result is that Michael very much stands alone; no other developer is even close.
The Debian multimedia developers decided to make the switch to FFmpeg even though nobody really had an answer to Reinhard's concern. For now, FFmpeg appears to be going strong, but there is a single-developer risk there that could come to the fore in the future. Given that nearly the entire distribution ecosystem now depends on FFmpeg, chances are that a way would be found to keep the project going if Michael were to decide he had better things to do. But the process of getting there might prove to be a little rough.
The Debian project was faced with a difficult choice: given that it was not possible to support both libraries in the distribution, which one offers the most to Debian's users while presenting the least long-term sustainability and security risk? The developers involved chose to move away from a project that many of them see as lacking the resources needed to be truly healthy. That choice will result in a lot of work, but, assuming the choice was the correct one, Debian users should benefit in the long term.
Brief items
Distribution quotes of the week
rBuilder Open Source
SAS purchased technology from rPath, including rBuilder. SAS has re-licensed rBuilder to Apache 2 and released it on GitHub. "We have now completed the process of separating out non-open-source third-party dependencies and have released the full source code for the current rBuilder technology under open source licenses (Apache 2 wherever possible). We have also released an installable image from which an rBuilder can be installed, as well as instructions to build it from source. It includes the ability to build systems and images based on CentOS 6, with CentOS 7 support currently being developed."
SUSE Linux Enterprise 11 SP 4 is out
SUSE has released the fourth service pack for SUSE Linux Enterprise 11. "SUSE Linux Enterprise SP4 includes a lot of updates like new versions for openSSH and zypper. OpenSSH is constantly improving and gaining new and more secure cipher suites. The newest SUSE Linux Service pack ships with version 6.6p1 of openSSH which includes modern elliptic curve ciphers based on the elliptic curve Curve25519, resulting in public key types Ed25519. Also the new transport cipher "chacha20-poly1305@openssh.com" was added, using the ChaCha20 stream cipher and Poly1305 MAC developed by Dan Bernstein." There are release notes for server and desktop.
FSF endorses embedded GNU/Linux distro ProteanOS as fully free
The Free Software Foundation has announced the addition of ProteanOS to its list of recommended GNU/Linux distributions. "ProteanOS is a new, small, and fast distribution that primarily targets embedded devices, but is also being designed to be part of the boot system of laptops and other devices. The lead maintainer of ProteanOS is P. J. McDermott, who is working closely with the Libreboot project and hopes to have ProteanOS be part of the boot system of Libreboot-compatible devices."
Distribution News
Debian GNU/Linux
Bits from the DPL - July
Neil McGovern shares a few bits about what he's been up to over the last couple of months. Topics include press and articles, conference invites, funding, TOs, legal issues, and DebConf.
Newsletters and articles of interest
Distribution newsletters
- DistroWatch Weekly, Issue 618 (July 13)
- 5 things in Fedora this week (July 9)
- Launchpad news (April-June)
- Ubuntu Weekly Newsletter, Issue 425 (July 12)
Microservices 101: The good, the bad and the ugly (ZDNet)
ZDNet has an interview about "microservices" with Red Hat VP of engineering for middleware, Dr. Mark Little. Microservices are a relatively recent software architecture that relies on small, easily replaced components and is an alternative to the well-established service-oriented architecture (SOA)—but it is not a panacea: "'Just because you adopt microservices doesn't suddenly mean your badly architected ball of mud is suddenly really well architected and no longer a ball of mud. It could just be lots of distributed balls of mud,' Little said. 'That worries me a bit. I've been around service-oriented architecture for a long time and know the plus points and the negative points. I like microservices because it allows us to focus on the positive points but it does worry me that people see it as the answer to a lot of problems that it's never going to be the answer for.'"
Mangaka Is an Artful Blend of Simplicity and Style (LinuxInsider)
LinuxInsider takes a look at Mangaka. "The new Mangaka release is based on Ubuntu 14.04 LTS, or Trusty Tahr. Actually, Mangaka is much more of a hybrid than its Ubuntu core underpinnings suggest. It is built around elements of the ElementaryOS, which also is designed around the Ubuntu core. Both run the Pantheon desktop. A newcomer to LinuxLand, this desktop is not found in many mainstream distros."
Brooks: Docker, CentOS 6, and You
Jason Brooks discusses the challenges of running Docker on CentOS 6. "Docker and CentOS 6 have never been a terrific fit, which shouldn’t be surprising considering that the version of the Linux kernel that CentOS ships was first released over three years before Docker’s first public release (0.1.0). The OS and kernel version you use matter a great deal, because with Docker, that’s where all your contained processes run. With a hypervisor such as KVM, it’s not uncommon or problematic for an elder OS to host, through the magic of virtualization, all manner of bleeding-edge software components. In fact, if you’re attached to CentOS 6, virtualization is a solid option for running containers in a more modern, if virtual, host."
Page editor: Rebecca Sobol
Development
Flash blocking, exploits, and replacements
On July 13, Mozilla added the Adobe Flash Player plugin to its realtime blocklist for Firefox, citing unresolved security vulnerabilities that would enable remote attackers to compromise the user's system. Although Mozilla has blocked vulnerable versions of Flash on several occasions in the past, this most recent incident happened to coincide with a public call from a major web service provider to deprecate Flash entirely. That coincidence led to a lot of general-news (and some tech-news) outlets reporting that Mozilla had decided to block Flash on a long-term basis. Such was not the case, but the incident has once again raised awareness and public debate over what the right approach is to solving the "Flash problem."
The most recent block (which is detailed on Mozilla's blocklist site) was prompted by the discovery of three zero-day vulnerabilities in Flash that are known to be exploited in the wild by several exploit kits. The vulnerabilities (CVE-2015-5119, CVE-2015-5122, and CVE-2015-5123) were discovered in the Hacking Team leak. Hacking Team is an Italian spyware-for-hire firm that has been accused of doing contract work for, among other clients, several national governments.
In early July, Hacking Team itself was compromised, and 400GB of the company's internal data was leaked out over BitTorrent. The three vulnerabilities in Flash were revealed to be key entry points through which Hacking Team gained access to its victims' machines. Adobe issued a patch fixing CVE-2015-5119 on July 8, around two days after the leak was disclosed. By July 10, Mozilla decided that the other two vulnerabilities warranted implementing a block on the Flash plugin, and a bug report was opened.
The Firefox blocklist mechanism works by disabling the vulnerable plugin in running browsers, typically by putting the plugin into "click to activate" mode, which guards against sites automatically leveraging the exploit. Mozilla's standard procedure has been to initiate a block only when there was an update available, so that users would be able to upgrade to the latest, hopefully secure version of the plugin in question. A look at the realtime blocklist history shows that Mozilla has used the blocklist feature on Flash six times in 2015 already.
As luck would have it, though, the July 13 block happened to coincide with some other public commentary about Flash's security problems on social media. First, Mozilla's Mark Schmidt announced the block on Twitter, calling it "BIG NEWS!!" and including an "Occupy Flash" image beneath the text. Judging by the replies to Schmidt's news, more than a few people mistook that announcement to be a long-term policy decision; Schmidt later clarified that the block was temporary "for now", pending updates from Adobe.
But Schmidt's tweet came just one day after Facebook's security chief Alex Stamos said: "It is time for Adobe to announce the end-of-life date for Flash and to ask the browsers to set killbits on the same day."
Subsequently, word spread that Mozilla's move was a response to Stamos's call to action. News outlets ranging from gadget blog Gizmodo to the BBC linked the events—although most eventually sorted out the nuances and updated their coverage.
" Still, the question remains open whether or not protecting Firefox
users from Flash exploits is worth the effort. The Flash plugin is
subject to dozens of CVEs every year and, after each incident, Mozilla
must wait for Adobe to release another update. Mozilla's realtime blocklist
feature has been in place since 2008 and, while Flash competes with
the Java plugin for the title of most-frequently blocked, the total
number of blocklist incidents does not seem to be decreasing.
At the same time, Flash replacements still have not made the binary
Adobe plugin obsolete. While a number of top-tier video sites support
HTML5 media playback—which was once cited as the main reason
Flash persisted—many others do not. Quite a few sites with no
media-playback functionality at all still use Flash—GitHub, for
example, uses Flash to provide its one-click "copy to clipboard"
button (a feature it added in 2013).
In 2012, Mozilla announced Shumway, an open-source
Flash runtime
implemented in JavaScript that would eventually be able to replace
Adobe's binary Flash plugin. We looked
at Shumway shortly after the announcement, and again in 2013 when Shumway was merged
into Firefox (as a feature that users could enable in the
about:config screen).
But Shumway's progress has been slow. Its most recent status
update marks it as a possible feature for Firefox 42 (scheduled
for November 2015). But that status report also notes that the target
use case is replacing Flash-based ads, and suggests that the feature
might be pulled from Firefox in favor of providing a Flash-compatible
JavaScript library to ad-delivery networks.
While providing an alternative ad-building tool to online
advertisers might reduce the amount of Flash content delivered as a
whole, that alone would hardly obviate the need for end users to have
the binary Flash plugin installed on their computers. Despite
Stamos's understandable dissatisfaction with Flash, Facebook still
uses it for embedded videos—and there are still countless sites
like GitHub that use Flash as a workaround for some limitation in the
traditional HTML document object model (DOM). There is a working draft from the
W3C for a Clipboard API that could replace GitHub's copy-to-clipboard
Flash snippet, but W3C specifications are not fast-moving entities.
For now, it is welcome news to hear that Stamos and others
understand the risky nature of supporting Flash—and to hear them
speak out in favor of doing away with it. But those calls to action
are nothing new. It has been five years since Steve Jobs famously
lambasted Flash in public and two years since Mozilla merged Shumway
into Firefox, but Flash does not appear to be that much closer to
disappearing. While it would be nice to think that the Hacking Team
exploits would catalyze the industry to do away with Flash, the format
has proven more than a little resilient over the years.
Brief items
Quote of the week
* it's almost certainly not urgent
* tolerating the occasional design decision we dislike won't ruin our lives
* a red bikeshed will still shelter our bikes, even if we'd have preferred blue
* it's just software, so we can put a blue wrapper around the red bikeshed if we prefer it
Coreboot 4.1 available
After more than five years of development, version 4.1 of coreboot is now available. The new release adds support for many more architectures (ARM, ARM64, MIPS, RISC-V) and numerous architectural changes, "like access to non-memory mapped SPI flash, or better insight about the internals of coreboot at runtime through the cbmem console, timestamp collection, or code coverage support." Maintainer Patrick Georgi promises that subsequent releases will appear on a more frequent basis.
An end to XULRunner builds from Mozilla
At his blog, Ben Hearsum notes that Mozilla will stop performing automated builds of its XULRunner application runtime package, starting with the Firefox 41 development cycle in September. Automated builds have persisted in recent years only "because its build process also happens to build the Gecko SDK, which we do support and maintain. This will change soon, and we'll start building the Gecko SDK from Firefox instead". Developers using XULRunner will still be able to build the package themselves, although the preferred solution is to migrate away from XULRunner.
Newsletters and articles
Development newsletters from the past week
- What's cooking in git.git (July 10)
- What's cooking in git.git (July 15)
- Git Rev News (July 8)
- LLVM Weekly (July 13)
- OCaml Weekly News (July 14)
- OpenStack Community Weekly Newsletter (July 10)
- Perl Weekly (July 13)
- PostgreSQL Weekly News (July 12)
- Python Weekly (July 9)
- Ruby Weekly (July 9)
- This Week in Rust (July 13)
- Tor Weekly News (July 10)
- Tor Weekly News (July 15)
- Wikimedia Tech News (July 13)
An interview with Larry Wall (LinuxVoice)
LinuxVoice has an interview with Perl creator Larry Wall. "So I was the language designer, but I was almost explicitly told: 'Stay out of the implementation! We saw what you did made out of Perl 5, and we don’t like it!' It was really funny because the innards of the new implementation started looking a whole lot like Perl 5 inside, and maybe that’s why some of the early implementations didn’t work well."
Page editor: Nathan Willis
Announcements
Brief items
FSF and SFC work with Canonical on an "intellectual property" policy update
The Free Software Foundation (FSF) and Software Freedom Conservancy (SFC) have both put out statements about a change to the Canonical, Ltd. "intellectual property" policy that was negotiated over the last two years (FSF statement and SFC statement). Effectively, Canonical has added a "trump clause" that clarifies that the licenses of the individual packages override the Canonical policy when there is a conflict. Though, as SFC points out: "While a trump clause is a reasonable way to comply with the GPL in a secondary licensing document, the solution is far from ideal. Redistributors of Ubuntu have little choice but to become expert analysts of Canonical, Ltd.'s policy. They must identify on their own every place where the policy contradicts the GPL. If a dispute arises on a subtle issue, Canonical, Ltd. could take legal action, arguing that the redistributor's interpretation of GPL was incorrect. Even if the redistributor was correct that the GPL trumped some specific clause in Canonical, Ltd.'s policy, it may be costly to adjudicate the issue." While backing the change made, both FSF and SFC recommend further changes to make the situation even more clear.
LF launches Spanish-Language SysAdmin Course
The Linux Foundation has announced the availability of its LFS201 – Essentials of System Administration online course in Spanish. "A Portuguese version will be available in the coming months. The course comes bundled with a Linux Foundation Certified System Administrator (LFCS) exam, enabling students to demonstrate their expertise to potential employers."
SPI 2015 Annual Report available
Software in the Public Interest has announced its 2015 Annual Report [PDF], covering the period from July 1, 2014 to June 30, 2015. The annual report covers SPI's finances, elections, board members, committees, associated projects, and other significant changes throughout the year.
2015 SPI Board Election Results
Dimitri John Ledkov and Michael Schultheiss have been elected to the SPI board. "As there were 2 available board positions available this means, per 2004-08-10.dbg.2, that all 2 are duly elected to the SPI board for a 3 year term."
Articles of interest
How to win the copyleft fight—without litigation (Opensource.com)
Opensource.com has an interview with Bradley Kuhn. "I continued on in my professional career, which included developing and supporting proprietary software, but I found that the lack of source code and/or the ability to rebuild it myself constantly hampered my ability to do my job. Proprietary software companies today are more careful to give "some open source"; thus, many technology professionals don't realize until it's too late how crippling proprietary software can be when you rely on it every day. In the mid 1990s, hardly any business software license gave us software freedom, so denying our rights to practice our profession (i.e, fix software) made many of us hate our jobs. I considered leaving the field of software entirely because I disliked working with proprietary software so much. Those experiences made me a software freedom zealot. I made a vow that I never wanted any developer or sysadmin to feel the constraints of proprietary software licensing, which limits technologists by what legal agreements their company's lawyers can negotiate rather than their technical skill."
Calls for Presentations
Call for proposals for PyCon Spain is open
PyConES will take place November 21-22 in Valencia, Spain. The call for proposals is open through August 31.
CFP Deadlines: July 16, 2015 to September 14, 2015
The following listing of CFP deadlines is taken from the LWN.net CFP Calendar.
Deadline | Event Dates | Event | Location |
---|---|---|---|
July 17 | October 2 - October 3 | Ohio LinuxFest 2015 | Columbus, OH, USA |
July 19 | September 25 - September 27 | PyTexas 2015 | College Station, TX, USA |
July 24 | September 22 - September 23 | Lustre Administrator and Developer Workshop 2015 | Paris, France |
July 31 | October 26 - October 28 | Kernel Summit | Seoul, South Korea |
July 31 | October 24 - October 25 | PyCon Ireland 2015 | Dublin, Ireland |
July 31 | November 3 - November 5 | EclipseCon Europe 2015 | Ludwigsburg, Germany |
August 2 | February 1 - February 5 | linux.conf.au | Geelong, Australia |
August 2 | October 21 - October 22 | Real Time Linux Workshop | Graz, Austria |
August 2 | August 22 | FOSSCON 2015 | Philadelphia, PA, USA |
August 7 | October 27 - October 30 | PostgreSQL Conference Europe 2015 | Vienna, Austria |
August 9 | October 8 - October 9 | GStreamer Conference 2015 | Dublin, Ireland |
August 10 | September 2 - September 6 | End Summer Camp | Forte Bazzera (VE), Italy |
August 10 | October 26 | Korea Linux Forum | Seoul, South Korea |
August 14 | November 7 - November 9 | PyCon Canada 2015 | Toronto, Canada |
August 16 | November 7 - November 8 | PyCON HK 2015 | Hong Kong, Hong Kong |
August 17 | November 19 - November 21 | FOSSETCON 2015 | Orlando, Florida, USA |
August 19 | September 16 - September 18 | X.org Developer Conference 2015 | Toronto, Canada |
August 24 | October 19 - October 23 | Tcl/Tk Conference | Manassas, VA, USA |
August 31 | November 21 - November 22 | PyCon Spain 2015 | Valencia, Spain |
August 31 | October 19 - October 22 | Perl Dancer Conference 2015 | Vienna, Austria |
August 31 | November 5 - November 7 | systemd.conf 2015 | Berlin, Germany |
August 31 | October 9 | Innovation in the Cloud Conference | San Antonio, TX, USA |
August 31 | November 10 - November 11 | Open Compliance Summit | Yokohama, Japan |
September 1 | October 1 - October 2 | PyConZA 2015 | Johannesburg, South Africa |
September 6 | October 10 | Programistok | Białystok, Poland |
September 12 | October 10 | Poznańska Impreza Wolnego Oprogramowania | Poznań, Poland |
If the CFP deadline for your event does not appear here, please tell us about it.
Upcoming Events
EuroPython 2015 Keynote: Mandy Waite
The EuroPython team has announced that Mandy Waite will be giving a keynote on July 24. The conference takes place July 20-26 in Bilbao, Spain. "Mandy works at Google as a Developer Advocate for Google Cloud Platform and to make the world a better place for developers building applications for the Cloud." Her speech is titled "So, I have all these Docker containers, now what?"
EuroPython 2015: Recruiting Offers
There will be a sponsor job board at EuroPython. "Many of our sponsors are looking for new employees, so EuroPython 2015 is not only an exciting conference, but may very well also be your chance to find the perfect job you’ve always been looking for."
Tracing Summit Schedule is now online
The schedule for the Tracing Summit is available. The Summit will be held August 20 in Seattle, WA, co-located with LinuxCon.
Events: July 16, 2015 to September 14, 2015
The following event listing is taken from the LWN.net Calendar.
Date(s) | Event | Location |
---|---|---|
July 15 - July 19 | Wikimania Conference | Mexico City, Mexico |
July 18 - July 19 | NetSurf Developer Weekend | Manchester, UK |
July 20 - July 24 | O'Reilly Open Source Convention | Portland, OR, USA |
July 20 - July 26 | EuroPython 2015 | Bilbao, Spain |
July 25 - July 31 | Akademy 2015 | A Coruña, Spain |
July 27 - July 31 | OpenDaylight Summit | Santa Clara, CA, USA |
July 30 - July 31 | Tizen Developer Summit | Bengaluru, India |
July 31 - August 4 | PyCon Australia 2015 | Brisbane, Australia |
August 7 - August 9 | GUADEC | Gothenburg, Sweden |
August 7 - August 9 | GNU Tools Cauldron 2015 | Prague, Czech Republic |
August 8 - August 14 | DebCamp15 | Heidelberg, Germany |
August 12 - August 15 | Flock | Rochester, New York, USA |
August 13 - August 17 | Chaos Communication Camp 2015 | Mildenberg (Berlin), Germany |
August 15 - August 16 | Conference for Open Source Coders, Users, and Promoters | Taipei, Taiwan |
August 15 - August 16 | I2PCon | Toronto, Canada |
August 15 - August 22 | DebConf15 | Heidelberg, Germany |
August 16 - August 23 | LinuxBierWanderung | Wiltz, Luxembourg |
August 17 - August 19 | LinuxCon North America | Seattle, WA, USA |
August 19 - August 21 | KVM Forum 2015 | Seattle, WA, USA |
August 19 - August 21 | Linux Plumbers Conference | Seattle, WA, USA |
August 20 - August 21 | Linux Security Summit 2015 | Seattle, WA, USA |
August 20 - August 21 | MesosCon | Seattle, WA, USA |
August 20 | Tracing Summit | Seattle, WA, USA |
August 21 | Golang UK Conference | London, UK |
August 21 | Unikernel Users Summit at Texas Linux Fest | San Marcos, TX, USA |
August 21 - August 22 | Texas Linux Fest | San Marcos, TX, USA |
August 22 - August 23 | Free and Open Source Software Conference | Sankt Augustin, Germany |
August 22 | FOSSCON 2015 | Philadelphia, PA, USA |
August 28 - September 3 | ownCloud Contributor Conference | Berlin, Germany |
August 29 | EmacsConf 2015 | San Francisco, CA, USA |
September 2 - September 6 | End Summer Camp | Forte Bazzera (VE), Italy |
September 10 - September 12 | FUDcon Cordoba | Córdoba, Argentina |
September 10 - September 13 | International Conference on Open Source Software Computing 2015 | Amman, Jordan |
September 11 - September 13 | vBSDCon 2015 | Reston, VA, USA |
If your event does not appear here, please tell us about it.
Page editor: Rebecca Sobol