
LWN.net Weekly Edition for April 16, 2015

Report from the Python Language Summit

By Jake Edge
April 14, 2015

PyCon 2015

The Python Language Summit is an annual event that is held in conjunction with the North American edition of PyCon. Its mission is to bring together core developers of various Python implementations to discuss topics of interest within that group. The 2015 meeting was held April 8 in Montréal, Canada. I was happy to be invited to attend the summit so that I could bring readers a report on the discussions there.

[Larry Hastings & Barry Warsaw]

The summit was deemed the "Barry and Larry show" by some, since it was co-chaired by Barry Warsaw and Larry Hastings (seen at right in their stylish fezzes). Somewhere around 50 developers sat in on the talks, which focused on a number of interesting topics, including atomicity guarantees for Python operations, possible plans to make Python 3 more attractive to developers, infrastructure changes for development, better measurement for Python 3 adoption, the /usr/bin/python symbolic link, type hints, and more.

[Group photo]


Atomicity of operations

By Jake Edge
April 14, 2015

Python Language Summit

At the 2015 Python Language Summit, Matt Messier was first up to talk about Skython, which is an alternative Python implementation that he has been working on in stealth mode for the last two or three years. It has largely been created in a vacuum, he said, since it has just been him working on it. He has removed the global interpreter lock (GIL) from Python in Skython, which is its big feature.

He said there were lots of technical details he could go into about Skython, but that he had a limited amount of time, so he wanted to focus on one particular issue: the atomicity of operations on objects like lists and dicts in Python. Is appending to a list an atomic operation? Or can multiple threads operating on the same list interfere with each other?
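
A minimal way to pose that question in code is sketched below. Under CPython, the GIL makes each individual list.append() effectively atomic, so the final length is predictable; an implementation that allows races on user-level data structures could lose appends. This is only an illustration of the question, not of Skython's behavior.

    import threading

    shared = []

    def worker(count):
        for _ in range(count):
            shared.append(1)    # is this single operation atomic?

    threads = [threading.Thread(target=worker, args=(100000,)) for _ in range(4)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    # CPython's GIL makes each append atomic, so this prints 400000;
    # an implementation that races on user data structures could print less.
    print(len(shared))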

There have been other attempts to remove the GIL, but they tend to slow down single-threaded operation with lots of fine-grained locking. The approach he has taken with Skython is to maintain data integrity for the interpreter itself, but to allow races when updating the same data structures in multiple threads of the program.

[Matt Messier]

For example, if two threads are operating on the same list and both append to it at more or less the same time, both appends might complete, in an undefined order, but one or both of the operations could also get lost and not be reflected in the list. He wondered if the Python core developer community wanted to specify that operations are atomic and, if it did, what that specification would be.

Jython (Python on the Java virtual machine) developer Jim Baker noted that Jython is using Java data structures that ensure the same atomicity guarantees that the standard Python interpreter (i.e. Python in C or CPython) uses. Those are not specified as part of the language, at least yet, but Baker said that there is lots of existing Python code that expects that behavior.

IronPython (Python targeting the .NET framework) developer Dino Viehland agreed with Baker. He said that, like Jython, IronPython is a Python without the GIL, but that it makes the same guarantees that CPython does. It uses lock-free reads for Python data structures, but takes a lock for update operations. Essentially, it has the same approach that Jython does, but with a different implementation.

Baker and others referenced a thread on the python-dev mailing list from many years ago (that appears to be related to this article from Fredrik Lundh). There was also a draft of a Python Enhancement Proposal (PEP) from the (now defunct) Unladen Swallow project that Alex Gaynor brought up. It was suggested that relying on old mailing list posts, articles, and PEP drafts was probably not the right approach and that either a new PEP or an update to the language reference to clarify things was probably in order.

Messier recognized that not handling concurrent append (or other) operations atomically may not be what users expect but, for performance reasons, it is important for Skython to bypass fine-grained locking. While Skython is not open source, it is planned for it to be released under the Python Software Foundation (PSF) license soon. He had hoped that would happen on the day of the summit, but it appears to still be a week off.

The name "Skython" came from the name of the company where development started: SilverSky, which has since been acquired by BAE Systems. The target domain for Skython, which someone asked about, is "Python for the Cloud", which was characterized as a "cop out answer". It is intended to be highly scalable for back-end servers for web services, Messier continued. He was asked how it compares to today's solution using lots of Tornado worker processes. The idea is that handling a bunch of separate processes can be problematic, so putting them all into one multi-threaded process may simplify some things.

Brett Cannon asked about performance, but Messier said that has not been a focus of his efforts so far. He has been trying to get the Python unit tests to pass. The performance is not good right now, but he believes there is "a lot of room" for optimization.

Skython is based on Python 3.3.6, he said, which was met with applause in the room. C extensions can be written (or ported) but the C API is not the same as that provided by CPython. Most of the standard library just works with Skython. In addition, it uses a mark-and-sweep garbage collector, rather than the reference-counting implementation used by CPython.


Making Python 3 more attractive

By Jake Edge
April 14, 2015

Python Language Summit

Larry Hastings was up next at the summit with a discussion of what it would take to attract more developers to use Python 3. He reminded attendees of Matt Mackall's talk at last year's summit, where the creator and project lead for the Mercurial source code management tool said that Python 3 had nothing in it that the project cares about. That talk "hit home for me", Hastings said, because it may explain part of the problem with adoption of the new Python version.

The Unicode support that comes with Python 3 is "kind of like eating your vegetables", he said. It is good for you, but it doesn't really excite developers (perhaps because most of them use Western languages, like English, someone suggested). Hastings is looking for changes that would make people want to upgrade.

He wants to investigate features that might require major architectural changes. The core Python developers may be hungry enough to get people to switch that they may be willing to consider those kinds of changes. But there will obviously be costs associated with changes of that sort. He wanted people to keep in mind the price in terms of readability, maintainability, and backward compatibility.

[Larry Hastings]

The world has changed a great deal since Python was first developed in 1990. One of the biggest changes is the move to multi-threading on multicore machines. It wasn't until 2005 or so that he started seeing multicore servers, desktops, and game consoles, then, shortly thereafter, laptops. Since then, tablets and phones have gotten multicore processors; now even eyeglasses and wristwatches are multicore, which is sort of amazing when you stop to think about it.

The perception is that Python is not ready for a multicore world because of the global interpreter lock (GIL). He said that he would eventually get to the possibility of removing the GIL, but he had some other ideas he wanted to talk about first.

For example, what would it take to have multiple, simultaneous Python interpreters running in the same process? It would be a weaker form of a multicore Python that would keep the GIL. Objects could not be shared between the interpreter instances.

In fact, you can do that today, though it is a bit of a "party trick", he said. You can use dlmopen() to open multiple shared libraries, each in its own namespace, so that each interpreter "runs in its own tiny little world". It would allow a process to have access to multiple versions of Python at once, though he is a bit dubious about running it in production.

Another possibility might be to move global interpreter state (e.g. the GIL and the small-block allocator) into thread-local storage. It wouldn't break the API for C extensions, though it would break extensions that are non-reentrant. There is some overhead to access thread-local storage because it requires indirection. It is "not as bad as some other things" that he would propose, he said with a chuckle.

A slightly cleaner way forward would be to add an interpreter parameter to the functions in the C API. That would break the API, but do so in a mechanical way. It would, however, use more stack space and would still have the overhead of indirect access.

What would it take to have multiple threads running in the same Python interpreter? That question is also known as "remove the GIL", Hastings said. In looking at that, he considered what it is that the GIL protects. It protects global variables, but those could be moved to a heap. It also enables non-reentrant code as a side effect. There is lots of code that would fail if the assumption that it won't be called simultaneously in multiple threads is broken, which could be fixed but would take a fair amount of work.

The GIL also provides the atomicity guarantees that Messier brought up. A lock on dicts and lists (and other data structures that need atomic access) could preserve atomicity. Perhaps the most important thing the GIL does, though, is to protect access to the reference counts that are used to do garbage collection. It is really important not to have races on those counts.
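
A rough sketch of what that could look like, expressed at the Python level for illustration (inside the interpreter it would be done in C for the built-in types): a hypothetical wrapper that takes a per-object lock around mutating operations.

    import threading

    class LockedList:
        """Hypothetical sketch: a list whose mutating operations take a
        per-object lock, roughly the guarantee a GIL-free interpreter
        would need to provide internally for the built-in list type."""

        def __init__(self):
            self._lock = threading.Lock()
            self._items = []

        def append(self, item):
            with self._lock:
                self._items.append(item)

        def pop(self, index=-1):
            with self._lock:
                return self._items.pop(index)

        def __len__(self):
            with self._lock:
                return len(self._items)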

The interpreter could switch to using the atomic increment and decrement instructions provided by many of today's processors. That doesn't explicitly break the C API as the change could be hidden behind macros. But, Hastings said, Antoine Pitrou's experiments with using those instructions resulted in 30% slower performance.

Switching to a mark-and-sweep garbage collection scheme would remove the problem with maintaining the reference counts, but it would be "an immense change". It would break every C extension in existence, for one thing. For another, conventional wisdom holds that reference counting and "pure garbage collection" (his term for mark and sweep) are roughly equivalent performance-wise, but the performance impact wouldn't be known until after the change was made, which might make it a hard sell.

PyPy developer Armin Rigo has been working on software transactional memory (STM) and has a library that could be used to add STM to the interpreter. But Rigo wrote a toy interpreter called "duhton" and, based on that, said that STM would not be usable for CPython.

Hastings compared some of the alternative Python implementations in terms of their garbage-collection algorithm. Only CPython uses reference counting, while Jython, IronPython, and PyPy all use pure garbage collection. It would seem that the GIL and reference counting go hand in hand, he said. He also noted that few other scripting languages use reference counting, so the future of scripting may be with pure garbage collection.

Yet another possibility is to turn the C API into a private API, so extensions could not call it. They would use the C Foreign Function Interface (CFFI) for Python instead. Extensions written using Cython might be another possible approach to hide the C extension API.
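
For illustration, here is a minimal CFFI example in ABI mode on a POSIX system, calling a C function without touching the CPython C API; the only C function declared is printf() from the standard C library.

    from cffi import FFI

    ffi = FFI()
    ffi.cdef("int printf(const char *format, ...);")  # declare the C function we want
    C = ffi.dlopen(None)                               # load the standard C library

    C.printf(b"hello from C, without the CPython C API\n")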

What about going "stackless" (à la Stackless Python)? Guido van Rossum famously said that Python would never merge Stackless, so that wasn't Hastings's suggestion. Instead, he looked at the features offered by Stackless: coroutines, channels, and pickling the interpreter state for later resumption of execution. Of the three, only the first two are needed for multicore support.

The major platforms already have support for native coroutines, though some are better than others. Windows has the CreateFiber() API that creates "fibers", which act like threads, but use "cooperative multitasking". Under POSIX, things are a little trickier.

There is the makecontext() API that does what is needed. Unfortunately, it was specified by POSIX in 2001, obsoleted in 2004, and dropped in 2008, though it is still mostly available. It may not work for OS X, however. When makecontext() was obsoleted, POSIX recommended that developers use threads instead, but that doesn't solve the same set of problems, Hastings said.

For POSIX, using a combination of setjmp(), longjmp(), sigaltstack(), and some signal (e.g. SIGUSR2) will provide coroutine support, though it is "pretty awful". While it is "horrible", it does actually work. He concluded his presentation by saying that he was mostly interested in getting the assembled developers to start thinking about these kinds of things.

One attendee suggested looking at the GCC split stack support that has been added for the Go language, but another noted that it is x86-64-only. Trent Nelson pointed to PyParallel (which would be the subject of the next slot) as a possible model. It is an approach that identifies the thread-sensitive parts of the interpreter and has put in guards to stop multiple threads from running in them.

But another attendee wondered if removing the GIL was really the change that the Mercurial developers needed in order to switch. Hastings said that he didn't think GIL removal was at all interesting to the Mercurial developers, as they are just happy with what Python 2.x provides for their project.

Though there may be solutions to the multi-threading problem that are architecture specific, it may still be worth investigating them, Nick Coghlan said. If "works on all architectures" is a requirement to experiment with ways to better support multi-threading, it is likely to hold back progress in that area. If a particular technique works well, that may provide some impetus for other CPU vendors to start providing similar functionality.

Jim Baker mentioned that he is in favor of adding coroutines. Jython has supported multiple interpreters for a while now. Java 10 will have support for fibers as well. He would like to see some sort of keyword tied to coroutines, which will make it easier for Jython (and others) to recognize and handle them. Dino Viehland thought that IronPython could use fibers to implement coroutines, but would also like to see a new language construct to identify that code.

The main reason that Van Rossum is not willing to merge Stackless is because it would complicate life for Jython, IronPython, PyPy, and others, Hastings said (with Van Rossum nodding vigorously in agreement). So having other ways to get some of those features in the alternative Python implementations would make it possible to pursue that path.

Viehland also noted that there is another scripting language that uses reference counting and is, in fact, "totally single threaded": JavaScript. People love JavaScript, he said, and wondered if just-in-time (JIT) compiling should be considered as the feature to bring developers to Python 3. That led Thomas Wouters to suggest, perhaps jokingly, that folks could be told to use PyPy (which does JIT).

Hastings said that he has been told that removing the GIL would be quite popular, even if it required rewriting all the C extensions. Essentially, if the core developers find a way to get rid of the GIL, they will be forgiven for the extra work required for C extensions. But Coghlan was not so sure, saying that the big barrier to getting people to use PyPy has generally been because C extensions did not work in that environment. Someone else noted that the scientific community (e.g. NumPy and SciPy users) has a lot of C extensions.


PyParallel

By Jake Edge
April 14, 2015

Python Language Summit

PyParallel is an alternative version of Python that is aimed at removing the global interpreter lock (GIL) to provide better performance through parallel processing. Trent Nelson prefaced his talk by saying that he hadn't made much progress on PyParallel since he presented it at PyCon 2013. He did give a few talks in the interim that were well-received, however. He got started back working on the code in December 2014, with a focus on making it stable while running the TechEmpower Frameworks Benchmark, which "bombs the server" with lots of clients making simple requests that the server responds to with JSON or plaintext. The benchmark has lots of problems, he said, but it is better than nothing.

Because it focuses on that benchmark, PyParallel performs really well when running it, Nelson said. So it is really good at stateless HTTP and maintains low latency even under a high load. It will saturate all CPUs available, with 98% of that in user space and just 2% in the Windows kernel.

The latency is low, and it also has low variance. On a normal run of the benchmark with clients attempting to make 50,000 requests per second, PyParallel shows a fairly flat graph, with relatively few outliers. Nelson displayed graphs comparing the servers: Tornado and Node.js on the same hardware showed a lot more variance in latency (as well as higher latency than PyParallel overall). Node.js performed better than Tornado, but had some outliers that were seven times the size of the mean (Tornado and PyParallel had worst-case latencies less than three times their mean). Both the Tornado and Node.js benchmarks were run on Linux, since they are targeted at that operating system, while PyParallel was run on Windows for the same reason.

Nelson is working on another test that is more complicated than the simple, stateless HTTP benchmark. It is an instantaneous search feature for 50GB of Wikipedia article data, but it is not working yet.

PyParallel is running on Python 3.3.5. He plans to use the customizable memory allocators that are provided by Python 3.4 and would like to see that API extended so that the reference-count-management operations could also be customized.

Effectively, PyParallel tests to see if an operation is happening in parallel and, if so, performs a thread-safe version. In the common case where the operations have naturally been serialized, it takes the faster normal path. Minimizing the overhead of that test is one of the best ways to increase performance.
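
The pattern can be expressed as a rough Python-level analogy; PyParallel does this inside the interpreter, in C, and the check it actually uses is its own, so the test below is only illustrative.

    import threading

    _fallback_lock = threading.Lock()

    def append_item(lst, item):
        # Fast path: no other threads are running, so the plain
        # (non-thread-safe) operation is fine and carries no locking overhead.
        if threading.active_count() == 1:
            lst.append(item)
        else:
            # Parallel case: fall back to a thread-safe version.
            with _fallback_lock:
                lst.append(item)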

In the process of his work, he broke generators and exceptions, at least temporarily. He purposely disabled importing and trace functions. He also "destroyed the PyObject structure" by adding a bunch of pointers to it. Most of those pointers are not needed, so he plans to clean it all up.

People can get the code at download.pyparallel.org. "At the very least, it is very very fast", he said. He has also hacked up the CPython code to such an extent that it makes a good development testbed for others.


Python core development infrastructure

By Jake Edge
April 14, 2015

Python Language Summit

The core development infrastructure for Python, which includes things like version-control systems and repository hosting, is the subject of two current PEPs. The PEPs offer competing views for how to move forward, Nick Coghlan said. He noted that Brett Cannon made a comment at one point that Python changes the way its development processes and systems work periodically, then leaves them alone for a few years. That is the enterprise approach, he said with a chuckle, which "sucks as much for us as it does for them".

Two PEPs

Coghlan has one proposed plan involving Kallithea (PEP 474), while Donald Stufft (who was not present at the summit) has a proposal involving GitHub and Phabricator (PEP 481). Coghlan's PEP is not focused on CPython development directly (his deferred PEP 462 is, however) but is instead looking at the infrastructure for many of the parts and pieces that surround CPython (e.g. the PEPs and the developer guide). We looked at some of the early discussion of the issue back in December.

Coghlan's interest in the issue stems from the fact that he already works on process and infrastructure for development as part of his day job at Red Hat. His job entails figuring out how to get code from desktops into the continuous-integration (CI) system and then into production. It is a job that people don't want to work on for free, he said. Instead they will find "something more entertaining to do".

He said that one idea behind his proposal is to try to respect the time people are putting into the patches they are contributing to Python. He would also like to minimize the time between when a developer makes a contribution and when they see it land in the Python mainline. The changes he wants to see would still allow people to use their current workflow, he said. It would create a new workflow that existing developers could cherry pick pieces from if they liked them. New projects could default to the new workflow.

One of the complaints about his proposal, which is based on free-software solutions hosted on Python Software Foundation (PSF) infrastructure, is that there would be no commercial support available, unlike with GitHub. But Red Hat allows him to spend 20% of his time working on the Python infrastructure, which will provide some of the support.

The pull-request model of GitHub is a good one, Coghlan said. The pull request becomes the main workflow element. For relatively simple projects with fairly small development teams, it works well. For those kinds of projects, you can live without an issue tracker, wiki, and mailing lists as what is needed is a way to propose changes and to discuss them, which pull requests do. That distills the development problem down to its minimal core.

[Nick Coghlan]

Both proposals will accept GitHub pull requests as part of the workflow, though only Stufft's uses GitHub itself directly (and keeps a read-only copy of any repositories in Phabricator). Someone asked, facetiously, why not move fully to GitHub. Coghlan said it is partly a matter of risk management. Outsourcing the infrastructure to a proprietary tool is too risky in the long run.

In addition, GitHub only provides "all or nothing" access to its repositories. That makes it harder to build a good on-ramp for new developers. If Python uses its own service, it can use or create a fine-grained access control mechanism to allow new developers some access without giving them access to everything. That is currently done, to some extent, in the bug tracker, which is based on Roundup.

His intent is for the Python infrastructure (what he is calling "forge.python.org") to interface with a number of different services and tools, such as GitHub, Gerrit, and GitLab, in order to try to mesh with the workflow of contributors while still supporting the existing process for current core developers.

GitHub is well-suited for a small, central development team, whereas other tools are a better fit for how Python development is done, he said. "'Just use GitHub' is the answer for an awful lot of projects, but I don't think it's right for Python." OpenStack uses Gerrit for its code review because it is better suited to a large, distributed development team. There are some good ideas in the OpenStack workflows that might be applied to CPython development.

Brett Cannon said that Coghlan has "strong opinions" on the matter, which is part of why Cannon has been put in charge of making the decision between the two PEPs. He hopes to make a decision by the beginning of May, so those with an opinion should be trying to convince him of the right approach before then. With a grin, Coghlan agreed, "I am thoroughly biased, which is why I don't decide" the issue. Cannon said that he doesn't particularly care that GitHub is closed-source, so long as Python can get its data out if it ever needs to.

It is important to be able to accept GitHub pull requests, Coghlan said, as well as those from Bitbucket, GitLab, and potentially others. Mozilla has also adopted this practice. Mozilla agrees that open source needs open infrastructure, but projects have to go where their developers are.

Jacob Kaplan-Moss said he would try to "channel Donald [Stufft]" a bit. He noted that Django switched to GitHub and has tried to optimize it for the Django workflow as well as community contributions. GitHub is better for the contributors, though; it is just "OK" for the core developers. It can be made to work, but is not optimal. There is an existential question, he said, about whether a project focuses on the workflow for its core developers or for its contributors. Both are perfectly valid choices, but a choice needs to be made.

Patch backlog

Barry Warsaw is concerned that turning more toward contributor workflows will cause a loss of core developers. Coghlan noted that there are 2000 unmerged patches on bugs.python.org. The bottleneck is not on contributions, he said, but on review. It doesn't make sense to make it easier for someone to send a patch if the project is going to essentially ignore it, Thomas Wouters said.

The review process is being held back by the current workflow options, though, Coghlan said. You can't just take five minutes to review a patch, see that it passes the CI tests, and then say go ahead and merge it. Stufft's original proposal was GitHub-only, but he has added Phabricator into the mix since then, which addresses Coghlan's concerns about PEP 481. Coghlan would prefer his option, but can live with Stufft's.

The choice of Phabricator is a good one from a workflow perspective; if you are looking for good workflow design, Facebook (which created Phabricator) is a good place to look, Coghlan said. He personally doesn't want to work in PHP, which is what Phabricator is written in, but he can work on other parts of the problem if that is the direction chosen.

Cannon asked Kaplan-Moss about the bug backlog in Django after the switch to GitHub. At first, he said that Django had a huge backlog before the switch and still does today. After looking a little deeper, though, he noted that the bug backlog had been cut by a third since the switch, but he is "not sure if they are related".

The huge patch backlog indicates that the workflow for core developers needs to be fixed before the contributor workflow, Cannon pointed out. Contributors may not like it, but they seem to be willing to deal with the existing Python workflow, which is completely different than that of any other project. Once code can get reviewed and merged easily, other changes can follow. "As of now, no one is happy", he said. One important piece is to not lose the "things that we like" about the current workflow, Warsaw said, though he didn't go into any detail.

Reusing Python's choice

Jython developer Jim Baker asked about other projects that might want to piggyback on the choice made. It would be great if Jython could simply use the infrastructure and workflow that CPython decides on, he said. Cannon said that he is "just trying to be the guy that makes the decision", but that any choice will be one that other projects can pick up if they wish.

Coghlan expanded on that, noting that the containerization choices that have become available relatively recently will make that all a lot easier. There is a lot of hype and "marketing rubbish" around Docker, but there is some good technology too. Docker has put a nice user experience on top of a bunch of Linux tools, which provides a packaging solution for Linux that application developers don't hate. It will make it easier for anyone who wants to run their own version of the infrastructure that is adopted. His goal is to help make it so that "open source projects don't have to live with crappy infrastructure anymore". Cannon pointed out that members of the PSF board have told him that there would be money available to help make some of these things happen once a decision is made.

One of the advantages of a pull request is that it identifies unambiguously which version of the code a patch will apply to, an attendee said. Is it possible to automate turning the existing patch backlog into pull requests or to at least give hints on the version a patch is targeting, he wondered. Perhaps a Google Summer of Code (GSoC) project could be aimed at this problem.

Another problem is the lack of a test farm, which is at least partly due to the fragility of the tests themselves. If there were a build and test farm, the systems in the farm would never agree on the test results, or at least not reliably.

Coghlan and another attendee said that there are some efforts to get GSoC students involved in solving some of these problems. One project is to put a REST API on Roundup, which may help doing some of the automated processing of the patch backlog.

An OpenStack developer said that he was in favor of fixing the core developer workflow as a way to make things better for the whole community. While it is important to consider GitHub because everyone uses it, the most important thing is to try to ensure that patches land in the mainline within a reasonable time frame. Both Kallithea and Phabricator are good tools, but neither existed when OpenStack was looking for something, so it chose Gerrit. The project is making headway on making its review and CI systems more reusable by others, as well.

The final comment was that whatever happens, some people will complain about it. But that shouldn't make the project afraid to make a change.


Python 3 adoption

By Jake Edge
April 14, 2015

Python Language Summit

"If you can't measure it, you can't migrate it" was the title of the next presentation, which came from Glyph Lefkowitz, who is the maintainer of the Twisted event-driven network framework. "I famously have issues with Python 3", he said, but what he mainly wants is for there to be one Python. If that is Python 2.8 and Python 3 is dropped, that would be fine, as would just having Python 3.

Nobody is working on Python 2 at this point. Interested developers cannot just go off and do the work themselves, since the core developers (which he and others often refer to as "python-dev" after the main CPython-development mailing list) are actively resisting any significant improvements to Python 2 at this point. That is because of a "fiat decision" by python-dev, not because there are technical reasons why it couldn't be done.

Beyond that, "nobody uses Python 3", at least for some definition of "nobody". There are three versions of Python in use at this point: 2.6, 2.7, and everything else. Based on some Python Package Index (PyPI) data that he gathered (which was a few months old; he also admitted the methodology he used was far from perfect), Python 2.7 makes up the majority of the downloads, while 2.6 has a significant but far smaller chunk, as well. All the other Python versions together had a smaller slice than even 2.6.

[Glyph Lefkowitz]

He pointed to the "Can I use Python 3?" web site, which showed 9,900 of 55,000 packages ported. But he typed in one he uses (Sentry) and it is blocked by nine dependencies that have not yet been ported to Python 3. There is a "long tail problem", in that there are lots of packages that need porting and, because of interdependencies, plenty of things won't work until most of those packages are ported. But there are lots of small packages that are essentially unmaintained at this point; they work fine for Python 2 so the developers aren't putting out updates, much less porting them to Python 3.

Lefkowitz said he spends a "lot of time worrying about Python 3", but other maintainers are just giving up. New programmers who are learning Python 3 get "really mad about packages that don't work". They go on Reddit and Hacker News to scream about it. That causes maintainers to just quietly drop out of the community sometimes. He talked to several who did not want to be identified that were burnt out by the continual harassment from those who want Python 3 support. The problem is "not being healed from the top", he said.

There are a number of examples where the project is not communicating well about deprecated features or packages, he said. PIL (Python Imaging Library) is still mentioned in official documentation, even though it has been officially supplanted by Pillow for years. That fact is not unambiguously communicated to users, who then make their projects dependent on an outdated (and unavailable for Python 3) package.

He also has some ideas on things that could be done to make Python users happier. To start with, there needs to be a better story for build artifacts. The Go language has great support for building and sharing programs. Basically, any Go binary that is built can be copied elsewhere and just run. But static linking as the solution for binary distribution is an idea that has been around for a long time, one attendee said.

Lefkowitz noted that 6th graders who are building a game using Pygame just want to be able to share the games they make with their friends. Users don't want a Python distribution, they just want some kind of single file they can send to others. Guido van Rossum asked if these 6th-grade Pygame programmers were switching to Go, and Lefkowitz said that they weren't. Mostly they were switching to some specialized Java teaching tool that had some limited solution to this problem (it would allow sharing with others using that same environment). "We could do that for Python", he said, so that users can share a Pygame demo without becoming experts in cross-platform dynamic linking.

Performance is another area where Python could improve. People naively think that when moving to Python 3 they will suddenly get 100x the performance, which is obviously not the case. He suggested making PyPy3 the default for Python 3 so that people "stop publishing benchmarks that make Python look terrible".

Better tools should be on the list as well, he said. The Python debugger (pdb) looks "pretty sad compared to VisualVM".

But it doesn't actually matter if the Python community adopts any of those ideas. There is no way to measure if any of them have any impact on adoption or use. To start with, python-dev needs to admit that there is a problem. In addition, the harassment of library maintainers needs to stop; it would be good if some high-profile developers stepped in once in a while to say that on Reddit and elsewhere. In terms of measurement, the project needs to decide on what "solved" looks like (in terms of metrics) then drive a feedback loop for that solution.

Another thing the project should do is to release at least ten times as often as it does now (which is 18-24 months between major releases and around six months between minor releases). The current release cadence comes from a "different geologic era". Some startups are releasing 1000x as often as the current Python pace.

The problem with too few reviewers may be alleviated by a faster release cycle, Lefkowitz said. Twisted went from one release every two years to one release per quarter, so he has direct experience with increasing the frequency of releases. What Twisted found was that people move more quickly from contributors to reviewers to core developers when those cycles are shorter.

It requires more automation and more process, in faster, lighter-weight forms. The "boring maintenance fixes" will come much faster under that model. That allows new contributors to see their code in a release that much more quickly. The "slower stuff" (new features and so on) can still come along at the same basic rate.

He offered up a few simple metrics that could be used to measure and compare how Python 3 is doing. He would like to see python-dev come to some consensus on which metrics make sense and how they should be measured. For example, the PyPI numbers might be a reasonable metric, though they may be skewed by automated CI systems constantly downloading particular versions.

Another metric might be to measure the average number of blockers as reported by caniusepython3.com. The number of projects ported per month might be another. The project could even consider user satisfaction surveys to see if people are happy with Python 3. He would like to see further discussion of this on the python-dev mailing list.

Coghlan noted that one other factor might be where users are getting their Python. Since the Linux distributions are not shipping Python 3 by default (yet, mostly), that may be holding Python 3 back some in the Linux world.

Several others wanted to discuss the packaging issue. Thomas Wouters noted that there is a place for python-dev to do something about packaging, but that any such effort probably ought to include someone who is teaching 6th graders so that their perspective can be heard. Brett Cannon pointed to the Education Summit that was scheduled for the next day as a possible place to find out what is needed. Lefkowitz said that was important to do, because many have ideas on how to create some kind of Python executable bundle, but it requires knowledge from core developers to determine which of those ideas are viable.

That is the essence of the problem, Van Rossum said. The people who know what needs to be done and the people who can do it are disjoint sets. That is as true for Language Summit attendees as it will be for Education Summit attendees. Beyond that, the Distutils special interest group (SIG) is "the tar pit of SIGs".

People are already doing similar things using tools like py2exe and others, Lefkowitz said. It would be good to get them together to agree that there is a distribution problem for Python programs. Each of the solutions has its own way of tracking imports, collecting them up, and copying them around, so it would be good to come up with something common.

Barry Warsaw described a Twitter tool and format called PEX that takes a "well-written setup.py" and turns that into a kind of executable. It contains the Python interpreter, shared libraries, and imported modules needed to run the program. It "seems like the right direction" for packaging and distributing Python programs.

Łukasz Langa said that Facebook has something similar. It is "hacky but it works". It collects all of the shared library files into a single file, collects the imported modules, zips all of that up, and prepends a Bash script onto the front so that it executes like any other program. Startup time is kind of long, however. Google also has a tool with the same intent, Wouters said.
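
The common core of the zip-based tools can be sketched in a few lines, because Python itself can execute a zip archive that contains a __main__.py. The hypothetical helper below just zips a source tree and prepends a shebang line so the result runs like any other program; unlike PEX or the Facebook tool, it does not bundle the interpreter or any shared libraries.

    import os
    import stat
    import zipfile

    def build_bundle(src_dir, out_path):
        """Zip a directory containing __main__.py and prepend a shebang so
        the archive can be executed directly (e.g. ./myapp)."""
        tmp = out_path + '.zip'
        with zipfile.ZipFile(tmp, 'w', zipfile.ZIP_DEFLATED) as zf:
            for root, _, files in os.walk(src_dir):
                for name in files:
                    full = os.path.join(root, name)
                    zf.write(full, os.path.relpath(full, src_dir))
        with open(out_path, 'wb') as out, open(tmp, 'rb') as zin:
            out.write(b'#!/usr/bin/env python\n')   # the zip format tolerates a prefix
            out.write(zin.read())
        os.remove(tmp)
        os.chmod(out_path, os.stat(out_path).st_mode | stat.S_IEXEC)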

Lefkowitz concluded by saying that he thought python-dev should provide some leadership or at least point a finger in the right direction. Getting a widely adopted solution could drive the adoption of Python 3, he said. Van Rossum suggested that someone create an informational PEP to start working on the problem.


The Python symbolic link

By Jake Edge
April 15, 2015

Python Language Summit

Many Python scripts are written to be executed with /usr/bin/python, which they expect to give them the proper version of the Python interpreter. In a session at the Python Language Summit—held April 8 in conjunction with PyCon—the future of the program at that path was discussed. At this point, /usr/bin/python is a symbolic link that points to Python 2 in most cases, but there are arguments for pointing it elsewhere—or removing it entirely. Several different developers offered their ideas on what should happen with that link in the future: move it, eliminate it, or something else entirely. Perhaps surprisingly, "something else entirely" won a straw poll at the end of the discussion.

Nick Coghlan was first up in the discussion. He noted that the symbolic link will be gone from Fedora soon. Prior to that, programs that wanted Python 2 would use either "/usr/bin/python" or "/usr/bin/python2", while those wanting Python 3 would always use "/usr/bin/python3". Enough progress has been made in the Fedora tools that most installation images will only have Python 3—and no symbolic link for /usr/bin/python will be installed. Installing the Python 2 package will create the symbolic link (and point it at Python 2) but that will not be the default.

Coghlan wondered whether the "upstream recommendation" about the symbolic link (in the form of PEP 394) should change. It currently recommends that "python" point to Python 2, but that will eventually need to change. By the time Python 3.6 is released in early 2017, the unqualified symbolic link will have been gone from Fedora for more than a year, Coghlan said. The question is whether Fedora (and others) will want to bring the symbolic link back in the Python 3.6 time frame, but point it to Python 3, rather than 2, which is his preferred solution. Over that year, anything that refers to the unqualified symbolic link will break in Fedora, which should force it to get fixed. So more conservative platforms could potentially upgrade directly from the link pointing to Python 2 to pointing at Python 3.

[Matthias Klose]

Up next was Matthias Klose, who does not think the symbolic link should be changed at all. He noted that distributions have been dealing with upgrades like Python 2 to 3 for a long time with a variety of mechanisms. For GCC, for example, Debian and Ubuntu manually handled a symbolic link for GCC 4.9. The "alternatives" mechanism is another possibility, but that makes more sense for things like choosing an editor than it does for choosing a version of Python. "Diversions", where the program gets renamed and replaced by a newer version, can also be used, but that is not done often, he said.

There is a parallel to switching the Python symbolic link in the switch of the /bin/sh link from Bash to dash. That change was made in Ubuntu in 2006 and in Debian in 2009 but there are still complaints about shell scripts that won't run on Ubuntu (because they contain Bash-isms). He showed that there are still unresolved bugs in the Debian bug tracker from the switch. That change was made more than eight years ago and problems are still trickling in, he said.

The /bin/sh program has a "concrete meaning" as a POSIX shell, but the Python symbolic link lacks that. Some distributions have already switched the link to Python 3, which has caused "breakage in the wild", Klose said. It will take years to track down all of the breakage and fix it, so it is just easier not to change the symbolic link at all. Programs that care should simply specify Python 2 or 3.

Barry Warsaw said that he was aligned with Klose. PEP 394 should clearly state that programs should be explicit and choose either python2 or python3. Coghlan asked, what about programs that don't care what version they get? Warsaw said that programs shipped by distributions should care, thus should be explicit.

There is a different problem for what users get when they sit down at the terminal and type "python", though. The Bash "command not found" functionality could be used to suggest python3 to users, Warsaw said. For distribution-supplied programs, though, the "#!/usr/bin/python" or "#!/usr/bin/env python" lines (also known as "shebang" lines) should be changed to be explicit on which version the program needs. If they don't care, they should "use Python 3".
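
For a distribution-supplied script, being explicit is as simple as qualifying the shebang line; a trivial illustration:

    #!/usr/bin/python3
    # Explicit shebang: this script always gets a Python 3 interpreter,
    # no matter where (or whether) the unqualified /usr/bin/python link points.
    print("hello from an explicitly Python 3 script")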

[Monty Taylor]

Monty Taylor is "somewhere in between" Coghlan and Klose/Warsaw. He would like to see the symbolic link continue to exist until Python 2 fades away. That would come in something like five years, he said, not six months. He would like not to have to care about Python 2.6, which is now past its end of life but, because Red Hat is still supporting it for certain RHEL releases, he still needs to support it for his packages. Those kinds of situations are likely to persist. Someday, there will be only one Python, but that is not true yet.

Thomas Wouters asked about PyPy. Does the symbolic link always mean CPython? No one seemed interested in switching the link in that direction, however.

The version a user gets when they type python is an important question that should be decoupled from the question about the shebang line, Glyph Lefkowitz said. Having an unqualified python on the shebang line should give a warning now, but for the command-line case, something different could be done. He suggested creating some sort of tool that gives users a menu of choices when they type "python". Warsaw suggested that some kind of configuration parameter could be used to govern whether users got the menu or a particular version of Python. That is what Apple does for programs of this sort, Lefkowitz said.

The various tutorials and other documentation typically just specify "python" for the command line (or the shebang line), so distributions will need to provide something there, one attendee noted. Users are likely to just want whatever the default is, which is Python 2 for now, but that will change.

Larry Hastings conducted a straw poll to see which of the four options was most popular. It was an informal poll that explicitly allowed people to vote more than once, but the outcome was interesting. Seven developers thought that python should point to Python 3 in the 3.6 time frame; 11 thought the symbolic link should not be changed; 19 thought it should be switched at the point where there is only one Python; and 27 agreed with Lefkowitz's idea of a new program that would get run when users type python.


Type hints

By Jake Edge
April 15, 2015

Python Language Summit

One of the headline features targeted at Python 3.5 (which is due in September) is type hinting (or type hints). Guido van Rossum gave an introduction to the feature at a Python Language Summit session. Type hints are aimed at static analysis, so several developers of static analysis tools were involved in the discussion as well.

The current proposal for type hints for Python is laid out in PEP 484. It uses "function annotations as I originally envisioned them", Van Rossum said. Those annotations were proposed back in 2000 with the idea that they would provide information to the interpreter to "generate super good code". But the annotations don't really help with code generation, so they are not meant to help the interpreter at this point.

Instead, type hints are designed to be ignored by the interpreter and to "not slow it down in most cases". The feature is targeted at being used by a "lint on steroids", he said.

He put up a slide with example code that he said showed some of the problems with the annotation syntax, but gave a reasonable flavor of how it would work. Here is an excerpt:

    from typing import List, Tuple, Callable

    def zip(xx: List[int], yy: List[int]) -> List[Tuple[int, int]]:
        ...
    def zipmap(f: Callable[[int, int], int], xx: List[int],
               yy: List[int]) -> List[Tuple[int, int, int]]:
        ...

In the example, zip() takes two arguments that are lists of integers and returns a list of two-integer tuples. zipmap() takes a function that takes two integer arguments and returns an integer along with two lists of integers; it returns a list of three-integer tuples. There is also support for generic types, so that the annotations can go "beyond concrete types", he said.
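
Generics in PEP 484 are expressed with type variables; a minimal sketch (not from the slide shown at the summit) might look like this:

    from typing import List, TypeVar

    T = TypeVar('T')

    def first(items: List[T]) -> T:
        """Whatever element type the list has, the same type comes back out."""
        return items[0]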

[Guido van Rossum]

Stub files are "boring stuff that is nevertheless important", Van Rossum said. They provide a mechanism to annotate functions in C extension modules without becoming a burden on Argument Clinic, which is a domain-specific language for specifying arguments to Python built-ins. Stubs are also useful for things that you can't or don't want to annotate. The stubs are stored in .pyi files corresponding to the Python extension (e.g. base64.pyi) using the same function annotation syntax. There is one addition, though, of an @overload decorator for overloaded functions.
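
A stub file uses the same annotation syntax with the function bodies elided; a hypothetical excerpt (the module name and signatures here are made up for illustration) might look like:

    # example.pyi -- hypothetical stub for an extension module named "example"
    from typing import overload

    @overload
    def parse(data: bytes) -> int: ...
    @overload
    def parse(data: str) -> int: ...

    def checksum(data: bytes, initial: int = ...) -> int: ...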

For 3.5, he is hoping to get typing.py added to the standard library. That is the entirety of the changes needed for this proposal, as there are no changes to CPython or to the Python syntax. The addition is "pure Python", but there are "a lot of metaclasses" and other scary stuff in typing.py. There are no plans for annotations for the standard library in 3.5, though he does anticipate some third-party stubs for standard library modules. The mypy tool that served as inspiration for the PEP already has some stubs for the standard library.

Putting typing.py into the standard library sends a signal that this is what the core Python developers want in terms of type hints. It encourages everyone who thinks that type hints are a good thing to use the same syntax. For example, the PyCharm IDE has its own notion of stubs and Google has a bunch of tools that it has released as open source (or will); both of those could benefit from a single standard type hint syntax.

There are no plans to force this feature on anyone that doesn't want to use it. He would like to get it into 3.5 before the feature freeze that accompanies the first beta release (due in May). That target will help "focus the PEP design". The typing.py module would be added as a provisional package, which means that it can still evolve as needed during the rest of the release cycle.

Some have wondered why there isn't a new syntax being designed for type hints. One reason is that typing.py will still work with earlier versions of Python 3, Van Rossum said. Those who are interested can just install it from the Python Package Index (PyPI). New syntax is also a "tar pit of bikeshedding". For 3.6, core developers "might muster up the courage" to add syntax for variable types (rather than use comments as is proposed with PEP 484).
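
The comment form for variable types proposed by PEP 484 looks like this:

    from typing import Dict, List

    names = []       # type: List[str]
    counts = {}      # type: Dict[str, int]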

There is a problem with forward references right now that the PEP solves by using string literals rather than types:

    class C:
        def foo(self) -> 'List[C]':
            ...

Łukasz Langa is working on a way to get around that problem using a __future__ import:

    from __future__ import annotations

That would turn all annotations into string values as they are parsed, which would neatly avoid the problem that the CPython parser can't handle the forward references, while the static analyzers can.

[Łukasz Langa & Mark Shannon]

At that point, Mark Shannon from Semmle, which is a company that makes static analyzers for Python and other languages, stepped up to talk about the proposal. He had a number of questions and concerns about the PEP, though syntax was not among them. Shannon said that he didn't care what the syntax was, his worries were about the semantics of the annotations.

Shannon is concerned about the lack of a distinction between classes and types. Also, the scope of type declarations is not well-defined. There is not much support for duck typing, either. Van Rossum admitted that duck typing is not supported, mostly because it doesn't fit well with static type analysis. The intended scope of type declarations is clear in Van Rossum's mind, but it may not be in the PEP, he said.

Shannon said that it was important to stop thinking about programs as code to run. Instead, for static analysis purposes, they should be looked at as a bit of text to analyze. He also suggested that any tools have two modes: "linting" mode to report when the type being used is not the same as what is required and "strict" mode that reports when the tool is unable to prove that the proper type is being used.

Van Rossum invited Shannon to co-author the PEP with him if he was willing to commit to the 3.5 time frame. Shannon said he was willing to work on it under that constraint.

The hope is that there will be new syntax for variable types in the 3.6 time frame, Van Rossum said. Jython developer Jim Baker was in favor of that. It would allow access to the variable annotations from the standard abstract syntax tree (ast) module.

Larry Hastings wondered why the PEP was trying to avoid using Argument Clinic. It is, he said, the perfect place to put this kind of information. Van Rossum said that there must have been some kind of misunderstanding at one point, so he apologized and agreed that Argument Clinic should be used.

The basic idea behind PEP 484 is to create a common notation, Van Rossum said. He was mostly hoping that the assembled developers would not be too unhappy with that notation, which seemed to be true. Thomas Wouters noted that he had not mentioned exceptions, which Van Rossum acknowledged. He has heard about some bad experiences with Java exception checking, so he avoided dealing with that for now. Langa, who is another co-author of the PEP, agreed that exceptions are "out of scope for now".

A PyCharm developer spoke up to note that the project has been doing type inference on Python programs for four years or so. The type system in the PEP is similar to what PyCharm uses, so "we feel it fits the development needs well". PyCharm can infer types for 50-60% of variables in user code, but can't get further than that without getting type annotations for function parameters.

Steve Dower said that the PEP should work well with Visual Studio, though there were still some issues to think about. It currently works by inferring types from the docstrings but could take advantage of the annotations. Other projects and companies also seemed happy with the changes.

Langa noted that at Facebook, at least, having optional typing available (as the Hack language does) eventually led to a cultural shift at the company. At some point, not having the annotations became a red flag during code review, so the company's code is moving toward type annotations everywhere.


Page editor: Jonathan Corbet

Security

Direct onion services over Tor

By Nathan Willis
April 15, 2015

Although it is best known for safeguarding the anonymity of Internet users on the client side, the Tor project has long supported hidden services as well. A hidden service is a mechanism that lets administrators run an Internet server entirely within the Tor network—thus protecting the server owner's anonymity as well as the client's. Now the project is exploring another service option that would be tailored to a different use case. Direct Onion Services, as the idea is currently (and, indications are, temporarily) known, would offer the client-side user the privacy-protecting features already available with hidden services, but with reduced overhead on the server side. The scheme would mean that the server gives up its anonymity, but in doing so it gains improved speed and ease-of-use.

Traditionally, Tor hidden services provide anonymization to both the client and the server during a session. The client must connect over the Tor network, and the server is listening only on a virtual interface that is also connected to Tor (and which is reachable only through a .onion domain name). Originally, the reason for this design was that the server could remain just as anonymous as the client user. No one could determine the owner of an anonymous dissident's blog by tracing the traffic of a Tor hidden service, since that traffic is routed through multiple relays.

The trouble with anonymous .onion services

But experience has shown that there is a downside to hidden services. Configuration is hardly trivial (although the project is doing what it can to simplify the process) and, more importantly, routing the hidden service's traffic through multiple relays—three hops, by default—means increased latency and reduced bandwidth. And, as it turns out, there are quite a few hidden-service providers that care little about their own anonymity, and run their service over Tor primarily for the benefit of their users—allowing those users to access the service over a secure, anonymous connection free from prying eyes.

The main example of this scenario is a public Internet service that maintains a separate Tor entry point as an end-user convenience, such as the Facebook hidden service at https://facebookcorewwwi.onion/. The fact that the server belongs to Facebook is not secret in the least; the .onion site is there to give users an encrypted and authenticated (due to the self-authenticating design of .onion URLs) way to access the site when using a network that might block or intercept a normal web connection. For sites like Facebook, the multi-hop routing of traffic adds network overhead, but no anonymity.

Consequently, the Tor project has been exploring ways to offer a better solution for "public" .onion services. George Kadianakis raised the question in a March 30 blog post that solicited ideas from the public about how hidden services could be improved. On April 9, Kadianakis sent a proposal to the Tor development list outlining what he called "direct onion services."

The proposal highlights the aforementioned well-known-public-site use case, but it offers a few other possibilities as well. Wikileaks, for example, uses a .onion service for whistleblowers to submit information, despite (like Facebook) not attempting to anonymize itself in the process. Rather, Wikileaks's use case is that the .onion entry point is a "succeed or fail hard" proposition—meaning that users can either connect to the service and know that their Tor-based connection is authentic and encrypted, or they cannot connect at all. It is impossible for a user to unknowingly connect to the upload site by insecure means.

Another example is applications that lack authentication or encryption at the protocol level. The proposal cites a plan by Freenode to offer IRC access over a .onion service, which would grant users security and anonymity that the IRC protocol itself lacks.

Public .onion services

The essence of the proposal itself is straightforward. Normally, a hidden .onion service establishes two types of entry-point connections on Tor. The first type consists of introduction points: randomly chosen Tor nodes to which the service distributes its public key at start-up time. That key is then added to Tor's distributed hash table (DHT) from multiple sources to further evade tracing the server's location; users wanting to reach the service grab the key from the DHT and hash it, and the hash serves as the hostname component of the service's .onion URL. The second type of entry point is the rendezvous point—a randomly chosen Tor node that the client selects to connect to the service. The client and the service each create their own circuit to the rendezvous point, rather than connecting directly.

The proposal states that a non-anonymous service needs a way to establish one-hop Tor circuits for both types of entry points, and that it must not connect to guard nodes (a special class of Tor entry node). Ideally, there will be a way for users to enable these configuration parameters in a simple manner, such as by setting a specific option in the .torrc configuration file.

In the original hidden-service design, each circuit between the server and an entry point can be multiple hops long. Reducing those excess hops decreases round-trip time and, in the case of high-traffic services, it also reduces the amount of overall network traffic sent over Tor.

The guard-node issue is slightly different. Nodes are assigned the "guard" flag by Tor's bandwidth-monitoring servers; a guard is a high-bandwidth node that is designated as a good entry point to the Tor network for clients. When other Tor nodes see that a node has been designated a guard, they reduce the number of intermediary connections they establish through it. Thus, a high-traffic .onion service could have an undue crippling effect on multiple Tor users if it sends its higher-than-average traffic through a guard.

Kadianakis based the proposal on an earlier, unimplemented idea from Roger Dingledine. Dingledine's idea did not address guard nodes, and it posited doing away with rendezvous points, but the end goal remains essentially the same.

Kadianakis also asked whether or not the project should provide special Tor builds tailored for public .onion services (since it already provides special builds for Tor-to-web gateways). David Goulet replied that this would likely not be useful, since it would limit the ability of service operators to choose between .onion service types on the fly.

Jacob Haven addressed a more fundamental issue, noting that, if the public .onion service operator is not concerned about its own anonymity, the introduction points and rendezvous points themselves may be unnecessary. The service could advertise itself in some simpler manner and users could connect to it directly, thus reducing Tor network load even further.

Kadianakis replied that such simplifications would indeed be likely to provide additional speed improvements, but that they would require changes to the hidden-service codebase. There is also a downside, he added, in that the rendezvous-point connection protocol is able to punch through NAT, while listening for direct connection requests would potentially be blocked by NAT.

On the whole, though, there appears to be rough consensus that the idea is well worth pursuing, and there has indeed been some preliminary development work by Alec Muffett. Amusingly enough, the big unresolved question at this point appears to be what to call the new feature. Kadianakis cautioned in his original email that the name "direct onion service" would likely need revisiting—it is not particularly descriptive, and the acronym DOS has an unfortunate name collision with "denial of service." His follow-up suggestion "dangerous direct onion service" has problems of its own, as do several of the ideas proposed in the discussion thread (such as Matt "Speak Freely"'s suggestion "peeled onion service").

Then again, the name Tor itself has never been especially new-user-friendly either. In reality, what matters most is that Tor can provide the anonymity and privacy safeguards that its users—client or server—depend on. This proposal looks like a further step toward meeting the needs of users in both categories.

Comments (none posted)

Brief items

Security quotes of the week

Criticizing [J. Alex] Halderman and [Vanessa] Teague for identifying security flaws in an Internet voting system is like criticizing your friend for pointing out that the lock on your front door doesn’t work. While moving to Internet voting may sound reasonable to folks who haven't paid any attention to the rampant security problems of the Internet these days, it's just not feasible now. As Verified Voting notes: "Current systems lack auditability; there’s no way to independently confirm their correct functioning and that the outcomes accurately reflect the will of the voters while maintaining voter privacy and the secret ballot." Indeed, the researchers' discovery was not the first indication that New South Wales was not ready for an Internet voting system. Australia’s own Joint Standing Committee on Electoral Matters concluded last year, “Australia is not in a position to introduce any large-scale system of electronic voting in the near future without catastrophically compromising our electoral integrity.”
Farbod Faraji for the Electronic Frontier Foundation

We show that, while the attack infrastructure is co-located with the Great Firewall, the attack was carried out by a separate offensive system, with different capabilities and design, that we term the “Great Cannon.” The Great Cannon is not simply an extension of the Great Firewall, but a distinct attack tool that hijacks traffic to (or presumably from) individual IP addresses, and can arbitrarily replace unencrypted content as a man-in-the-middle.

The operational deployment of the Great Cannon represents a significant escalation in state-level information control: the normalization of widespread use of an attack tool to enforce censorship by weaponizing users. Specifically, the Cannon manipulates the traffic of “bystander” systems outside China, silently programming their browsers to create a massive DDoS attack. While employed for a highly visible attack in this case, the Great Cannon clearly has the capability for use in a manner similar to the NSA’s QUANTUM system, affording China the opportunity to deliver exploits targeting any foreign computer that communicates with any China-based website not fully utilizing HTTPS.

Citizenlab

As long as our leaders are scared of the terrorists, they're going to continue the security theater. And we're similarly going to accept whatever measures are forced upon us in the name of security. We're going to accept the National Security Agency's surveillance of every American, airport security procedures that make no sense and metal detectors at baseball and football stadiums. We're going to continue to waste money overreacting to irrational fears.

We no longer need the terrorists. We're now so good at terrorizing ourselves.

Bruce Schneier

Comments (none posted)

New vulnerabilities

apache: information leak

Package(s):apache CVE #(s):CVE-2014-5704
Created:April 13, 2015 Updated:April 16, 2015
Description: From the CVE entry:

The DISH Anywhere (aka com.sm.SlingGuide.Dish) application 3.5.10 for Android does not verify X.509 certificates from SSL servers, which allows man-in-the-middle attackers to spoof servers and obtain sensitive information via a crafted certificate.

Alerts:
Gentoo 201504-03 apache 2015-04-11

Comments (1 posted)

apport: privilege escalation

Package(s):apport CVE #(s):CVE-2015-1318
Created:April 14, 2015 Updated:April 17, 2015
Description: From the Ubuntu advisory:

Stéphane Graber and Tavis Ormandy independently discovered that Apport incorrectly handled the crash reporting feature. A local attacker could use this issue to gain elevated privileges.

Alerts:
Ubuntu USN-2569-2 apport 2015-04-16
Ubuntu USN-2569-1 apport 2015-04-14

Comments (none posted)

asterisk: SSL server spoofing

Package(s):asterisk CVE #(s):CVE-2015-3008
Created:April 15, 2015 Updated:July 21, 2015
Description: From the CVE entry:

Asterisk Open Source 1.8 before 1.8.32.3, 11.x before 11.17.1, 12.x before 12.8.2, and 13.x before 13.3.2 and Certified Asterisk 1.8.28 before 1.8.28-cert5, 11.6 before 11.6-cert11, and 13.1 before 13.1-cert2, when registering a SIP TLS device, does not properly handle a null byte in a domain name in the subject's Common Name (CN) field of an X.509 certificate, which allows man-in-the-middle attackers to spoof arbitrary SSL servers via a crafted certificate issued by a legitimate Certification Authority.

Alerts:
Debian DSA-3700-1 asterisk 2016-10-25
Debian-LTS DLA-455-1 asterisk 2016-05-03
Fedora FEDORA-2015-5948 asterisk 2015-07-21
Mandriva MDVSA-2015:206 asterisk 2015-04-27
Mageia MGASA-2015-0153 asterisk 2015-04-15

Comments (none posted)

chrony: multiple vulnerabilities

Package(s):chrony CVE #(s):CVE-2015-1821 CVE-2015-1822 CVE-2015-1853
Created:April 13, 2015 Updated:December 22, 2015
Description: From the Debian advisory:

CVE-2015-1821: Using particular address/subnet pairs when configuring access control would cause an invalid memory write. This could allow attackers to cause a denial of service (crash) or execute arbitrary code.

CVE-2015-1822: When allocating memory to save unacknowledged replies to authenticated command requests, a pointer would be left uninitialized, which could trigger an invalid memory write. This could allow attackers to cause a denial of service (crash) or execute arbitrary code.

CVE-2015-1853: When peering with other NTP hosts using authenticated symmetric association, the internal state variables would be updated before the MAC of the NTP messages was validated. This could allow a remote attacker to cause a denial of service by impeding synchronization between NTP peers.

Alerts:
Scientific Linux SLSA-2015:2241-3 chrony 2015-12-21
Oracle ELSA-2015-2241 chrony 2015-11-23
Red Hat RHSA-2015:2241-03 chrony 2015-11-19
Gentoo 201507-01 chrony 2015-07-05
Fedora FEDORA-2015-5809 chrony 2015-04-24
Mageia MGASA-2015-0163 chrony 2015-04-23
Fedora FEDORA-2015-5816 chrony 2015-04-22
Debian-LTS DLA-193-1 chrony 2015-04-12
Debian DSA-3222-1 chrony 2015-04-12

Comments (1 posted)

das-watchdog: privilege escalation

Package(s):das-watchdog CVE #(s):CVE-2015-2831
Created:April 13, 2015 Updated:April 15, 2015
Description: From the Debian advisory:

Adam Sampson discovered a buffer overflow in the handling of the XAUTHORITY environment variable in das-watchdog, a watchdog daemon to ensure a realtime process won't hang the machine. A local user can exploit this flaw to escalate his privileges and execute arbitrary code as root.

Alerts:
Debian-LTS DLA-194-1 das-watchdog 2015-04-12
Debian DSA-3221-1 das-watchdog 2015-04-12

Comments (none posted)

dpkg: integrity-verification bypass

Package(s):dpkg CVE #(s):CVE-2015-0840
Created:April 10, 2015 Updated:June 15, 2015
Description:

From the Debian advisory:

Jann Horn discovered that the source package integrity verification in dpkg-source can be bypassed via a specially crafted Debian source control file (.dsc). Note that this flaw only affects extraction of local Debian source packages via dpkg-source but not the installation of packages from the Debian archive.

Alerts:
Fedora FEDORA-2015-7342 dpkg 2015-05-12
Fedora FEDORA-2015-7296 dpkg 2015-05-12
Mageia MGASA-2015-0197 dpkg 2015-05-06
Debian-LTS DLA-220-1 dpkg 2015-05-15
Ubuntu USN-2566-1 dpkg 2015-04-09
Debian DSA-3217-1 dpkg 2015-04-09
openSUSE openSUSE-SU-2015:1058-1 dpkg, 2015-06-12

Comments (none posted)

drupal7-webform: unspecified vulnerability

Package(s):drupal7-webform CVE #(s):
Created:April 9, 2015 Updated:April 15, 2015
Description:

Update to drupal7-webform 4.7 (notes) that may or may not include a security fix. The Fedora advisory includes a bug report reference from the 4.4 series. Whether the update fixes this older bug or another from the 4.7 release cycle is not specified.

Alerts:
Fedora FEDORA-2015-4994 drupal7-webform 2015-04-09
Fedora FEDORA-2015-5055 drupal7-webform 2015-04-09

Comments (none posted)

echoping: denial of service

Package(s):echoping CVE #(s):
Created:April 10, 2015 Updated:April 16, 2015
Description:

From the Red Hat bug report:

echoping segfaults all the time.

[ Which is evidently due to a bad build back in 2013. ]

Alerts:
Fedora FEDORA-2015-2600 echoping 2015-04-10
Fedora FEDORA-2015-2584 echoping 2015-04-10

Comments (1 posted)

firefox: multiple vulnerabilities

Package(s):firefox CVE #(s):CVE-2015-0798 CVE-2015-0799
Created:April 9, 2015 Updated:April 22, 2015
Description:

From the CVE entries:

CVE-2015-0798: The Reader mode feature in Mozilla Firefox before 37.0.1 on Android, and Desktop Firefox pre-release, does not properly handle privileged URLs, which makes it easier for remote attackers to execute arbitrary JavaScript code with chrome privileges by leveraging the ability to bypass the Same Origin Policy.

CVE-2015-0799: The HTTP Alternative Services feature in Mozilla Firefox before 37.0.1 allows man-in-the-middle attackers to bypass an intended X.509 certificate-verification step for an SSL server by specifying that server in the uri-host field of an Alt-Svc HTTP/2 response header.

Alerts:
Gentoo 201512-10 firefox 2015-12-30
Mageia MGASA-2015-0342 iceape 2015-09-08
Fedora FEDORA-2015-5723 firefox 2015-04-21
Fedora FEDORA-2015-5702 firefox 2015-04-09

Comments (none posted)

icecast: denial of service

Package(s):icecast CVE #(s):CVE-2015-3026
Created:April 13, 2015 Updated:August 19, 2015
Description: From the Arch Linux advisory:

The bug can only be triggered if "stream_auth" is being used. This means, that all installations that use a default configuration are NOT affected. The default configuration only uses <source-password>. Neither are simple mountpoints affected that use <password>. A workaround, if installing an updated package is not possible, is to disable "stream_auth" and use <password> instead. As far as we understand the bug only leads to a simple remote denial of service. The underlying issue is a null pointer dereference. For clarity: No remote code execution should be possible, server just segfaults.

An attacker could kill the icecast server by triggering it with a special URL, due to a null pointer dereference.

The problem has been fixed upstream in version 2.4.2.

Alerts:
Fedora FEDORA-2015-13083 icecast 2015-08-19
Fedora FEDORA-2015-13077 icecast 2015-08-19
Gentoo 201508-03 icecast 2015-08-15
Debian DSA-3239-1 icecast2 2015-04-29
openSUSE openSUSE-SU-2015:0728-1 icecast 2015-04-16
Arch Linux ASA-201504-12 icecast 2015-04-11

Comments (none posted)

java: multiple vulnerabilities

Package(s):java-openjdk CVE #(s):CVE-2005-1080 CVE-2015-0460 CVE-2015-0469 CVE-2015-0477 CVE-2015-0478 CVE-2015-0480 CVE-2015-0488
Created:April 15, 2015 Updated:January 14, 2016
Description: From the Oracle CVE entries:

CVE-2005-1080: A directory traversal flaw was found in the way the jar tool extracted JAR archive files. A specially crafted JAR archive could cause jar to overwrite arbitrary files writable by the user running jar when the archive was extracted.

CVE-2015-0460: A flaw was found in the way the Hotspot component in OpenJDK handled phantom references. An untrusted Java application or applet could use this flaw to corrupt the Java Virtual Machine memory and, possibly, execute arbitrary code, bypassing Java sandbox restrictions.

CVE-2015-0469: An off-by-one flaw, leading to a buffer overflow, was found in the font parsing code in the 2D component in OpenJDK. A specially crafted font file could possibly cause the Java Virtual Machine to execute arbitrary code, allowing an untrusted Java application or applet to bypass Java sandbox restrictions.

CVE-2015-0477: A flaw was discovered in the Beans component in OpenJDK. An untrusted Java application or applet could use these flaws to bypass certain Java sandbox restrictions.

CVE-2015-0478: It was found that the RSA implementation in the JCE component in OpenJDK did not follow recommended practices for implementing RSA signatures.

CVE-2015-0480: A directory traversal flaw was found in the way the jar tool extracted JAR archive files. A specially crafted JAR archive could cause jar to overwrite arbitrary files writable by the user running jar when the archive was extracted.

CVE-2015-0488: A flaw was found in the way the JSSE component in OpenJDK parsed X.509 certificate options. A specially crafted certificate could cause JSSE to raise an exception, possibly causing an application using JSSE to exit unexpectedly.

Alerts:
SUSE SUSE-SU-2016:0113-1 java-1_6_0-ibm 2016-01-13
Gentoo 201603-11 oracle-jre-bin 2016-03-12
SUSE SUSE-SU-2015:2168-2 java-1_7_1-ibm 2015-12-14
SUSE SUSE-SU-2015:2216-1 java-1_7_0-ibm 2015-12-07
SUSE SUSE-SU-2015:2182-1 java-1_7_1-ibm 2015-12-03
SUSE SUSE-SU-2015:2192-1 java-1_6_0-ibm 2015-12-03
SUSE SUSE-SU-2015:2168-1 java-1_7_1-ibm 2015-12-02
SUSE SUSE-SU-2015:2166-1 java-1_6_0-ibm 2015-12-02
Debian DSA-3316-1 openjdk-7 2015-07-25
SUSE SUSE-SU-2015:1161-1 java-1_6_0-ibm 2015-06-30
SUSE SUSE-SU-2015:1086-4 java-1_7_0-ibm 2015-06-27
SUSE SUSE-SU-2015:1086-3 Java 2015-06-24
SUSE SUSE-SU-2015:1138-1 IBM Java 2015-06-24
SUSE SUSE-SU-2015:1086-2 IBM Java 2015-06-22
SUSE SUSE-SU-2015:1086-1 IBM Java 2015-06-18
SUSE SUSE-SU-2015:1085-1 IBM Java 2015-06-18
Red Hat RHSA-2015:1007-01 java-1.7.0-ibm 2015-05-13
Red Hat RHSA-2015:1006-01 java-1.6.0-ibm 2015-05-13
SUSE SUSE-SU-2015:0833-1 java-1_7_0-openjdk 2015-05-07
Red Hat RHSA-2015:1020-01 java-1.7.1-ibm 2015-05-20
Debian-LTS DLA-213-1 openjdk-6 2015-04-30
Mandriva MDVSA-2015:212 java-1.7.0-openjdk 2015-04-27
openSUSE openSUSE-SU-2015:0773-1 java-1_8_0-openjdk 2015-04-27
openSUSE openSUSE-SU-2015:0774-1 java-1_7_0-openjdk 2015-04-27
Debian DSA-3235-1 openjdk-7 2015-04-24
Debian DSA-3234-1 openjdk-6 2015-04-24
Ubuntu USN-2574-1 openjdk-7 2015-04-21
Ubuntu USN-2573-1 openjdk-6 2015-04-21
Arch Linux ASA-201504-23 jre8-openjdk-headless 2015-04-20
Arch Linux ASA-201504-22 jre8-openjdk 2015-04-20
Arch Linux ASA-201504-21 jdk8-openjdk 2015-04-20
Arch Linux ASA-201504-17 jre7-openjdk-headless 2015-04-17
Arch Linux ASA-201504-16 jre7-openjdk 2015-04-17
Arch Linux ASA-201504-15 jdk7-openjdk 2015-04-17
Red Hat RHSA-2015:0857-01 java-1.7.0-oracle 2015-04-20
Red Hat RHSA-2015:0858-01 java-1.6.0-sun 2015-04-20
Red Hat RHSA-2015:0854-01 java-1.8.0-oracle 2015-04-17
Scientific Linux SLSA-2015:0809-1 java-1.8.0-openjdk 2015-04-15
Scientific Linux SLSA-2015:0806-1 java-1.7.0-openjdk 2015-04-15
Scientific Linux SLSA-2015:0807-1 java-1.7.0-openjdk 2015-04-15
Scientific Linux SLSA-2015:0808-1 java-1.6.0-openjdk 2015-04-15
Mageia MGASA-2015-0158 java-1.7.0-openjdk 2015-04-15
Red Hat RHSA-2015:0809-01 java-1.8.0-openjdk 2015-04-15
Red Hat RHSA-2015:0806-01 java-1.7.0-openjdk 2015-04-15
Red Hat RHSA-2015:0807-01 java-1.7.0-openjdk 2015-04-15
Red Hat RHSA-2015:0808-01 java-1.6.0-openjdk 2015-04-15
CentOS CESA-2015:0809 java-1.8.0-openjdk 2015-04-15
CentOS CESA-2015:0809 java-1.8.0-openjdk 2015-04-15
CentOS CESA-2015:0807 java-1.7.0-openjdk 2015-04-15
CentOS CESA-2015:0806 java-1.7.0-openjdk 2015-04-15
CentOS CESA-2015:0806 java-1.7.0-openjdk 2015-04-15
CentOS CESA-2015:0808 java-1.6.0-openjdk 2015-04-15
CentOS CESA-2015:0808 java-1.6.0-openjdk 2015-04-15
CentOS CESA-2015:0808 java-1.6.0-openjdk 2015-04-15
Oracle ELSA-2015-0809 java-1.8.0-openjdk 2015-04-15
Oracle ELSA-2015-0806 java-1.7.0-openjdk 2015-04-15
Oracle ELSA-2015-0808 java-1.6.0-openjdk 2015-04-15
Red Hat RHSA-2015:1021-01 java-1.5.0-ibm 2015-05-20

Comments (none posted)

kernel: information leak

Package(s):kernel CVE #(s):CVE-2015-2041
Created:April 9, 2015 Updated:April 15, 2015
Description:

From the Ubuntu advisory:

An information leak was discovered in the Linux kernel's handling of userspace configuration of the link layer control (LLC). A local user could exploit this flaw to read data from other sysctl settings.

Alerts:
openSUSE openSUSE-SU-2016:0301-1 kernel 2016-02-01
openSUSE openSUSE-SU-2015:1382-1 kernel 2015-08-14
SUSE SUSE-SU-2015:1376-1 kernel-rt 2015-08-12
SUSE SUSE-SU-2015:1478-1 kernel 2015-09-02
SUSE SUSE-SU-2015:1224-1 kernel 2015-07-10
Mageia MGASA-2015-0219 kernel-tmb 2015-05-13
Debian-LTS DLA-246-1 linux-2.6 2015-06-17
SUSE SUSE-SU-2015:0812-1 kernel 2015-04-30
Debian DSA-3237-1 kernel 2015-04-26
Ubuntu USN-2561-1 linux-ti-omap4 2015-04-08
Ubuntu USN-2564-1 linux-lts-utopic 2015-04-09
Ubuntu USN-2562-1 linux-lts-trusty 2015-04-08
Ubuntu USN-2565-1 kernel 2015-04-09
Ubuntu USN-2563-1 kernel 2015-04-08
Ubuntu USN-2560-1 kernel 2015-04-08
SUSE SUSE-SU-2015:1071-1 kernel 2015-06-16
Debian-LTS DLA-246-2 linux-2.6 2015-06-17

Comments (none posted)

libdbd-firebird-perl: buffer overflow

Package(s):libdbd-firebird-perl CVE #(s):CVE-2015-2788
Created:April 13, 2015 Updated:April 20, 2015
Description: From the Debian advisory:

Stefan Roas discovered a way to cause a buffer overflow in DBD-FireBird, a Perl DBI driver for the Firebird RDBMS, in certain error conditions, due to the use of the sprintf() function to write to a fixed-size memory buffer.

Alerts:
Fedora FEDORA-2015-5601 perl-DBD-Firebird 2015-04-18
Fedora FEDORA-2015-5552 perl-DBD-Firebird 2015-04-18
Mageia MGASA-2015-0159 perl-DBD-Firebird 2015-04-18
Debian DSA-3219-1 libdbd-firebird-perl 2015-04-11

Comments (none posted)

libx11: code execution

Package(s):libx11 CVE #(s):CVE-2013-7439
Created:April 13, 2015 Updated:April 15, 2015
Description: From the Debian advisory:

Abhishek Arya discovered a buffer overflow in the MakeBigReq macro provided by libx11, which could result in denial of service or the execution of arbitrary code.

Alerts:
Debian-LTS DLA-199-1 libx11 2015-04-14
Ubuntu USN-2568-1 libx11, libxrender 2015-04-13
Debian DSA-3224-1 libx11 2015-04-12

Comments (none posted)

mediawiki: multiple vulnerabilities

Package(s):mediawiki CVE #(s):CVE-2015-2931 CVE-2015-2932 CVE-2015-2933 CVE-2015-2934 CVE-2015-2935 CVE-2015-2936 CVE-2015-2937 CVE-2015-2938 CVE-2015-2939 CVE-2015-2940 CVE-2015-2941 CVE-2015-2942
Created:April 10, 2015 Updated:April 20, 2015
Description:

From the Arch Linux advisory:

CVE-2015-2931 (cross-site scripting) It was discovered that MIME types were not properly restricted, allowing a way to circumvent the SVG MIME blacklist for embedded resources. This allowed an attacker to embed JavaScript in an SVG file.

CVE-2015-2932 (cross-site scripting) The SVG filter to prevent injecting JavaScript using animate elements was incorrect. The list of dangerous parts of HTML5 is supposed to include all uses of 'animate attributename="xlink:href"' in SVG documents.

CVE-2015-2933 (cross-site scripting) A persistent XSS vulnerability was discovered due to the way attributes were expanded in MediaWiki's HTML class, in combination with LanguageConverter substitutions.

CVE-2015-2934 (cross-site scripting) It was discovered that MediaWiki's SVG filtering could be bypassed with entity encoding under the Zend interpreter. This could be used to inject JavaScript.

CVE-2015-2935 (external resource loading) A way was discovered to bypass the style filtering for SVG files to load external resources. This could violate the anonymity of users viewing the SVG. This issue exists because of an incomplete fix for CVE-2014-7199.

CVE-2015-2936 (denial of service) It was discovered that MediaWiki versions using PBKDF2 for password hashing (the default since 1.24) are vulnerable to DoS attacks using extremely long passwords.

CVE-2015-2937 (denial of service) It was discovered that MediaWiki is vulnerable to "Quadratic Blowup" denial of service attacks.

CVE-2015-2938 (cross-site scripting) It was discovered that the MediaWiki feature allowing a user to preview another user's custom JavaScript could be abused for privilege escalation. This feature has been removed.

CVE-2015-2939 (cross-site scripting) It was discovered that function names were not sanitized in Lua error backtraces, which could lead to XSS.

CVE-2015-2940 (cross-site request forgery) It was discovered that the CheckUser extension did not prevent CSRF attacks on the form allowing checkusers to look up sensitive information about other users. Since the use of CheckUser is logged, the CSRF could be abused to defame a trusted user or flood the logs with noise.

CVE-2015-2941 (cross-site scripting) It was discovered that XSS is possible in the way API errors were reflected under HHVM versions before 3.6.1. MediaWiki now detects and mitigates this issue on older versions of HHVM.

CVE-2015-2942 (denial of service) It was discovered that MediaWiki's SVG and XMP parsing running under HHVM was susceptible to "Billion Laughs" DoS attacks.

Alerts:
Gentoo 201510-05 mediawiki 2015-10-31
Fedora FEDORA-2015-5569 mediawiki 2015-04-18
Fedora FEDORA-2015-5570 mediawiki 2015-04-18
Mandriva MDVSA-2015:200 mediawiki 2015-04-10
Mageia MGASA-2015-0142 mediawiki 2015-04-10
Arch Linux ASA-201504-11 mediawiki 2015-04-10

Comments (none posted)

mysql: unspecified vulnerabilities

Package(s):mysql CVE #(s):CVE-2015-0385 CVE-2015-0409
Created:April 13, 2015 Updated:April 15, 2015
Description: From the CVE entries:

Unspecified vulnerability in Oracle MySQL Server 5.6.21 and earlier allows remote authenticated users to affect availability via unknown vectors related to Pluggable Auth. (CVE-2015-0385)

Unspecified vulnerability in Oracle MySQL Server 5.6.21 and earlier allows remote authenticated users to affect availability via unknown vectors related to Optimizer. (CVE-2015-0409)

Alerts:
Gentoo 201504-05 mysql 2015-04-11

Comments (none posted)

powerpc-utils-python: code execution

Package(s):powerpc-utils-python CVE #(s):CVE-2014-8165
Created:April 9, 2015 Updated:November 3, 2016
Description:

From the CVE entry:

scripts/amsvis/powerpcAMS/amsnet.py in powerpc-utils-python uses the pickle Python module unsafely, which allows remote attackers to execute arbitrary code via a crafted serialized object.

Alerts:
Red Hat RHSA-2016:2607-02 powerpc-utils-python 2016-11-03
Fedora FEDORA-2015-4201 powerpc-utils-python 2015-04-09
Fedora FEDORA-2015-4143 powerpc-utils-python 2015-04-09

Comments (none posted)

qemu: denial of service

Package(s):qemu CVE #(s):CVE-2015-1779
Created:April 13, 2015 Updated:October 28, 2015
Description: From the Red Hat bugzilla:

It was found that QEMU's websocket frame decoder processed incoming frames without limiting the resources used to process the header and payload. An attacker able to access a guest's VNC console could use this flaw to trigger a denial of service on the host by exhausting all available memory and CPU.

Alerts:
SUSE SUSE-SU-2016:1318-1 xen 2016-05-17
openSUSE openSUSE-SU-2016:0995-1 xen 2016-04-08
SUSE SUSE-SU-2016:0955-1 xen 2016-04-05
openSUSE openSUSE-SU-2016:0914-1 xen 2016-03-30
SUSE SUSE-SU-2016:0873-1 xen 2016-03-24
Gentoo 201602-01 qemu 2016-02-04
CentOS CESA-2015:1943 qemu-kvm 2015-10-28
Scientific Linux SLSA-2015:1943-1 qemu-kvm 2015-10-27
Oracle ELSA-2015-1943 qemu-kvm 2015-10-27
Red Hat RHSA-2015:1943-01 qemu-kvm 2015-10-27
Ubuntu USN-2608-1 qemu, qemu-kvm 2015-05-13
SUSE SUSE-SU-2015:0870-1 kvm 2015-05-13
Debian DSA-3259-1 qemu 2015-05-13
SUSE SUSE-SU-2015:0896-1 qemu 2015-05-18
Mandriva MDVSA-2015:210 qemu 2015-04-27
Mageia MGASA-2015-0149 qemu 2015-04-15
Fedora FEDORA-2015-5482 qemu 2015-04-13

Comments (none posted)

ruby: man-in-the-middle attack

Package(s):ruby CVE #(s):CVE-2015-1855
Created:April 14, 2015 Updated:May 19, 2015
Description: From the Arch Linux advisory:

After reviewing RFC 6125 and RFC 5280, multiple violations were found of matching hostnames and particularly wildcard certificates.

Ruby’s OpenSSL extension will now provide a string-based matching algorithm that follows stricter behavior, as recommended by these RFCs. In particular, matching of more than one wildcard per subject/SAN is no longer allowed, and comparisons of these values are now case-insensitive.

This change affects the behavior of Ruby’s OpenSSL::SSL#verify_certificate_identity.

Specifically:

  • Only one wildcard character in the left-most part of the hostname is allowed.
  • IDNA names can now only be matched by a simple wildcard (e.g. ‘*.domain’).
  • Subject/SAN should be limited to ASCII characters only.

A remote attacker can make use of the overly permissive hostname matching during certificate verifications to perform a man-in-the-middle attack by spoofing SSL servers via a crafted certificate.

Alerts:
Mandriva MDVSA-2015:224 ruby 2015-05-04
Mageia MGASA-2015-0178 ruby 2015-05-03
Debian DSA-3247-1 ruby2.1 2015-05-02
Debian DSA-3246-1 ruby1.9.1 2015-05-02
Debian DSA-3245-1 ruby1.8 2015-05-02
Fedora FEDORA-2015-6377 ruby 2015-04-28
Arch Linux ASA-201504-13 ruby 2015-04-14
Debian-LTS DLA-224-1 ruby1.8 2015-05-18
Debian-LTS DLA-235-1 ruby1.9.1 2015-05-30

Comments (none posted)

socat: denial of service

Package(s):socat CVE #(s):CVE-2015-1379
Created:April 15, 2015 Updated:April 15, 2015
Description: From the Mageia advisory:

In socat before 2.0.0-b8, signal handler implementations are not async-signal-safe and can cause a crash or freeze of socat processes. Mostly this issue occurs when socat is in listening mode with the fork option and a couple of child processes terminate at the same time.

Alerts:
Mageia MGASA-2015-0144 socat 2015-04-15

Comments (none posted)

varnish: heap buffer overflow

Package(s):varnish CVE #(s):
Created:April 13, 2015 Updated:April 15, 2015
Description: From the Red Hat bugzilla:

A heap-based buffer overflow flaw was reported (including a reproducer) in varnish, a high-performance HTTP accelerator:

http://seclists.org/oss-sec/2015/q1/776

Alerts:
Fedora FEDORA-2015-4079 varnish 2015-04-11

Comments (none posted)

wesnoth: information leak

Package(s):wesnoth-1.10 CVE #(s):CVE-2015-0844
Created:April 13, 2015 Updated:April 27, 2015
Description: From the Debian advisory:

Ignacio R. Morelle discovered that missing path restrictions in the "Battle of Wesnoth" game could result in the disclosure of arbitrary files in the user's home directory if malicious campaigns/maps are loaded.

Alerts:
Fedora FEDORA-2015-6280 wesnoth 2015-04-26
Fedora FEDORA-2015-6295 wesnoth 2015-04-26
Debian-LTS DLA-202-1 wesnoth-1.8 2015-04-17
Mageia MGASA-2015-0154 wesnoth 2015-04-15
Debian DSA-3218-1 wesnoth-1.10 2015-04-10

Comments (none posted)

xen: multiple vulnerabilities

Package(s):xen CVE #(s):CVE-2015-2752 CVE-2015-2756 CVE-2015-2751
Created:April 13, 2015 Updated:April 15, 2015
Description: From the CVE entries:

The XEN_DOMCTL_memory_mapping hypercall in Xen 3.2.x through 4.5.x, when using a PCI passthrough device, is not preemptable, which allows local x86 HVM domain users to cause a denial of service (host CPU consumption) via a crafted request to the device model (qemu-dm). (CVE-2015-2752)

QEMU, as used in Xen 3.3.x through 4.5.x, does not properly restrict access to PCI command registers, which might allow local HVM guest users to cause a denial of service (non-maskable interrupt and host crash) by disabling the (1) memory or (2) I/O decoding for a PCI Express device and then accessing the device, which triggers an Unsupported Request (UR) response. (CVE-2015-2756)

Xen 4.3.x, 4.4.x, and 4.5.x, when using toolstack disaggregation, allows remote domains with partial management control to cause a denial of service (host lock) via unspecified domctl operations. (CVE-2015-2751)

Alerts:
Debian-LTS DLA-479-1 xen 2016-05-18
Mageia MGASA-2016-0098 xen 2016-03-07
SUSE SUSE-SU-2015:1479-2 xen 2015-09-02
SUSE SUSE-SU-2015:1479-1 xen 2015-09-02
openSUSE openSUSE-SU-2015:1094-1 xen 2015-06-22
openSUSE openSUSE-SU-2015:1092-1 xen 2015-06-22
Ubuntu USN-2608-1 qemu, qemu-kvm 2015-05-13
Debian DSA-3259-1 qemu 2015-05-13
openSUSE openSUSE-SU-2015:0732-1 xen 2015-04-20
Gentoo 201504-04 xen 2015-04-11
Fedora FEDORA-2015-5402 xen 2015-04-11
Fedora FEDORA-2015-5208 xen 2015-04-11
SUSE SUSE-SU-2015:0923-1 xen 2015-05-21

Comments (none posted)

xterm: denial of service

Package(s):xterm CVE #(s):
Created:April 9, 2015 Updated:April 15, 2015
Description:

From the Red Hat bug report:

Buffer overflow leading to application crash.

Alerts:
Fedora FEDORA-2015-3218 xterm 2015-04-09
Fedora FEDORA-2015-3201 xterm 2015-04-09

Comments (none posted)

Page editor: Jake Edge

Kernel development

Brief items

Kernel release status

The 4.0 kernel was released on April 12. In the announcement, Linus noted:

Feature-wise, 4.0 doesn't have all that much special. Much have been made of the new kernel patching infrastructure, but realistically, that not only wasn't the reason for the version number change, we've had much bigger changes in other versions. So this is very much a 'solid code progress' release.

Beyond the (incomplete) live-patching mechanism, this release includes the removal of the remap_file_pages() system call, improved persistent memory support, the lazytime mount option, and the kernel address sanitizer.

The 4.1 merge window is now open; see the article below for a summary of what has been merged thus far.

Stable updates: 3.19.4, 3.14.38, and 3.10.74 were released on April 13.

Comments (none posted)

Quotes of the week

Every time a vendor supplies a 32-bit time_t OS to a 32-bit computer user, it creates a potential circumstance someone might be running that close to one of the numerous issues coming close to 2038 (there are a number of them, read the wikipedia page, and pay close attention to NTP as well). The potential is low today, but increases as time_t ticks...
Theo de Raadt

YOU’RE PING. WHY DO YOU EVEN CARE WHAT KERNEL VERSION IS RUNNING.
Dave Jones

Next, the police stops the bus and wants to know who's on it. As the programmers usually don't respond when being spoken to, especially if it is the police, the bus conductor hands out a list of all the passport copies he gathered. That is called introspection that is not backed by cooperative bus members. The conductor makes a copy of each OF THE PASSPORT of the people entering the bus, to help the police (i.e. debuggers) determine who is on the bus.
— Daniel Mack (paraphrased by Greg Kroah-Hartman) explains kdbus

This thread has made me realize that even as I am able to carve out more time to work on things like IB maintainership, I no longer have the desire to spend my time maintaining the IB subsystem. Since my current level of activity is clearly hurting the community, I've decided to step down as maintainer.
— InfiniBand maintainer Roland Dreier moves on (thanks to Yann Droneaud)

Comments (5 posted)

A mailing list for year-2038 issues

A new mailing list has been set up for developers working on preventing the year-2038 apocalypse. "Please join if you are interested in discussing these or want to send patches. The intention at the moment is very much to keep this list explicitly open to newbies, so we will get a lot of incorrect patches there along with other patches. Patches can get sent here for an initial review before they get sent to the real maintainers, or you can put the list on Cc when sending a patch that should be applied and you already know what you are doing."

Full Story (comments: none)

Obstacles for kdbus

By Jonathan Corbet
April 15, 2015
The kdbus patch set, which adds a D-Bus-like messaging facility to the kernel, has been through several rounds of review over the course of the last year. The number of comments has been dropping with each review cycle, and the code seemed like it could be on track for a relatively easy merge into the 4.1 kernel. A closer look, though, reveals that there was some residual unhappiness from the last rounds that was always likely to flare up into active opposition when an attempt to merge kdbus was made. And, indeed, that is exactly what happened when Greg Kroah-Hartman sent a pull request to Linus on April 13.

This conversation is in full swing as of this writing, so an attempt to fully summarize it would be futile. In brief, though, the complaints take a number of forms. There is unhappiness with the performance of kdbus — a bit surprising, since performance is one of the motivating factors behind this development. There are a number of security-related concerns, especially around how the bus collects and transmits metadata about connected processes. Kdbus is still said to not play well with containers. Some developers find the complexity daunting. And so on.

The core of the disagreement, arguably, can be found in this message from Greg. There, he agreed that the design was "unfortunate" (though he later retracted that statement), and said that kdbus needed to be taken in its current form even if it is not ideal:

D-Bus is a specification that has been out there for over a decade, and we are not designing anything new here, but rather implementing it as designed. We have to be compatible to the existing users of the DBus system, and don't have the luxury of being able to change core things like this and expect the world to be able to change just because the design is not as clean as it should/could be.

Again, just like getting horrid hardware to work properly, sometimes we have to write odd code. Or having to implement a network protocol that doesn't seem to be designed "perfectly", yet is used by a few hundred million systems so we have to remain compatible. This is all that we are doing here for stuff like this.

Remember, this is called kDBUS, not kGENERICIPC, no matter how much we would have liked that to happen from a kernel standpoint. :)

It is probably fair to say that those who are opposed to kdbus in its current form would rather that it were, indeed, kGENERICIPC. They seem to feel that it should be able to support what is needed to implement D-Bus efficiently, but the D-Bus-specific parts, perhaps, should go into user space. After all, there are only so many interprocess communication mechanisms that can be merged into the kernel; the one that goes in, many developers think, should be free of known flaws and should be able to do more than reimplement the D-Bus protocol.

It is hard to say at this point how this discussion will play out or what Linus will decide to do in the end. The chances are good, though, that enough high-profile developers have expressed opposition to derail the merging of kdbus in this development cycle. Complete consensus is not always required to get code into the kernel, but getting code merged when there is serious opposition is still quite hard. This story, it seems, may go on for a while yet.

Comments (58 posted)

Kernel development news

4.1 Merge window, part 1

By Jonathan Corbet
April 15, 2015
Linus started merging changes for the 4.1 development cycle on April 13; as of this writing, a total of 3,643 non-merge changesets have been pulled into the mainline. In other words, things are just getting started. Still, some interesting changes have found their way in, though many of them will be of interest mainly to kernel developers.

Some of the more interesting, user-visible changes merged so far include:

  • Basic support for live kernel patching has been added to the S/390 architecture. What has been removed from S/390, instead, is support for the 31-bit mode, once needed to get past that pesky 16MB memory limit.

  • KVM virtualization on the MIPS architecture has gained support for the floating-point unit and the SIMD mode. KVM on ARM now supports interrupt injection via irqfd().

  • Load tracking in the CPU scheduler has been reworked to make the calculated process loads independent of CPU speeds. That will enable better load-balancing decisions in the presence of frequency scaling and improve support for asymmetric systems like big.LITTLE, where different types of CPUs are found in the same package.

  • New hardware support includes:

    • I2C: Digicolor I2C controllers, Ingenic JZ4780 I2C controllers, and Broadcom XLP9xx/XLP5xx I2C controllers.

    • IIO: Capella CM3323 color light sensors and Measurement Specialties MS5611 pressure sensors.

    • Input: Broadcom keypad controllers, MAXIM MAX77843 haptic controllers, iPAQ h3100/h3600/h3700 buttons, Semtech SX8654 I2C touchscreens, Qualcomm PM8941 performance management IC (PMIC) power keys, Broadcom IPROC touchscreens, and ChipOne icn8318 I2C touchscreen controllers.

    • Miscellaneous: Nuvoton NCT7904 hardware-monitoring chips, Broadcom IPROC SD/MMC and PCIe controllers, Dialog DA9150 charger and fuel-gauge controllers, X-Powers AXP288 fuel gauges, Nokia modems implementing the CMT speech protocol, Silicon Motion SM750 framebuffers, Ilitek ILI9163 LCD controllers, and Freescale Management Complex buses.

    • Multi-function device: Wolfson Microelectronics WM8280/WM8281 controllers, MediaTek MT6397 PMICs, Maxim Semiconductor MAX77843 PMICs, Intel Quark controllers, and Skyworks Solutions SKY81452 controllers.

    • Pin control: Marvell Armada 39x pin controllers, NVIDIA Tegra210 pinmux controllers, Broadcom Cygnus IOMUX controllers, Mediatek mt8135 pin controllers, AMD platform pin controllers, and Intel Sunrisepoint pin controllers.

    • USB: AltusMetrum ChaosKey random-number generators, TI dm816x USB PHYs, and Allwinner sun9i USB PHYs.

Changes visible to kernel developers include:

  • The kernel self-test code has gained an install target that installs test binaries into a special directory in the kernel tree. There is also a new set of timer self tests in the test suite.

  • The new efi=debug boot option causes extra information to be printed at boot time on systems with EFI firmware.

  • The long-deprecated IRQF_DISABLED interrupt flag has finally been removed from the kernel.

  • The "tracefs" virtual filesystem has been added. Tracefs contains the usual set of directories and files to control tracing, but it has the advantage that it can be mounted independently of debugfs. It thus allows system administrators to enable tracing without bringing in the other, potentially dangerous knobs found in debugfs. By default, tracefs will be mounted in the usual place (/sys/kernel/debug/tracing) when debugfs is mounted.

  • The new TRACE_DEFINE_ENUM() macro can be used to output values from enum types in tracepoints.

  • As usual, the perf tool has seen a long list of additions and improvements; see the top-level merge commit for details. Some of the more significant features include the ability to attach BPF programs to kernel probes, support for Intel's upcoming processor trace functionality ("a hardware tracer on steroids"), support for Intel's upcoming cache quality-of-service monitoring feature, and more.

  • The I2C subsystem can now function in "slave" mode, responding to a master controller elsewhere on the bus; see Documentation/i2c/slave-interface for details. The I2C layer has also gained a new quirk mechanism that can be used to describe the limitations of specific controllers.

Unless something surprising happens, the merge window can be expected to stay open through April 27. There will likely be a lull in the middle while Linus travels, but that has tended to not slow things down too much in the past. As usual, we will continue to track and report on the significant changes merged for the 4.1 development cycle.

Comments (none posted)

Persistent memory support progress

By Jonathan Corbet
April 15, 2015
Persistent memory (or non-volatile memory) has a number of nice features: it doesn't lose its contents when power is cycled, it is fast, and it is expected to be available in large quantities. Enabling proper support for this memory in the kernel has been a topic of discussion and development for some years; it was, predictably, an important topic at this year's Linux Storage, Filesystem, and Memory Management Summit. The 4.1 kernel will contain a new driver intended to improve support for persistent memory, but there is still a fair amount of work to be done.

At first glance, persistent memory looks like normal RAM to the processor, so it might be tempting to simply use it that way. There are, though, some good reasons for not doing that. The performance characteristics of persistent memory are still not quite the same as RAM; in particular, write operations can be slower. Persistent memory may not wear out as quickly as older flash arrays did, but it is still best to avoid rewriting it many times per second, as could happen if it were used as regular memory. And the persistence of persistent memory is a valuable feature to take advantage of in its own right — but, to do so, the relevant software must know which memory ranges in the system are persistent. So persistent memory needs to be treated a bit differently.

The usual approach, at least for a first step, is to separate persistent memory from normal RAM and treat it as if it were a block device. Various drivers implementing this type of access have been circulating for a while now. It appears that this driver from Ross Zwisler will be merged for the 4.1 release. It makes useful reading as it is something close to the simplest possible example of a working block device driver. It takes a region of memory, registers a block device to represent that memory, and implements block read and write operations with memcpy() calls.
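For readers who want a feel for what such a driver looks like, here is a rough, much-simplified sketch of the central request function, loosely modeled on the block-layer APIs of that era. The pmem_device structure and its virt_addr field are illustrative stand-ins for the driver's own bookkeeping, and all error handling and device setup have been omitted:

    #include <linux/bio.h>
    #include <linux/blkdev.h>
    #include <linux/highmem.h>
    #include <linux/string.h>

    /* Illustrative per-device structure: virt_addr points at the
       already-mapped persistent-memory region. */
    struct pmem_device {
        void *virt_addr;
    };

    /* Service a bio by copying between its pages and the persistent-memory
       region with memcpy(); no error handling or bounds checking. */
    static void pmem_make_request(struct request_queue *q, struct bio *bio)
    {
        struct pmem_device *pmem = q->queuedata;
        sector_t sector = bio->bi_iter.bi_sector;
        struct bio_vec bvec;
        struct bvec_iter iter;

        bio_for_each_segment(bvec, bio, iter) {
            void *mem = kmap_atomic(bvec.bv_page);
            void *pmem_addr = pmem->virt_addr + (sector << 9);

            if (bio_data_dir(bio) == READ)
                memcpy(mem + bvec.bv_offset, pmem_addr, bvec.bv_len);
            else
                memcpy(pmem_addr, mem + bvec.bv_offset, bvec.bv_len);

            kunmap_atomic(mem);
            sector += bvec.bv_len >> 9;
        }
        bio_endio(bio, 0);    /* signal completion with no error */
    }

The rest of the real driver is mostly setup and teardown: allocating a gendisk with alloc_disk(), creating a queue with blk_alloc_queue(), wiring in a function like the one above with blk_queue_make_request(), and calling add_disk().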

In his pull request to merge this driver, Ingo Molnar noted that a number of features that one might expect, including mmap() and execute-in-place, are not supported yet, and that persistent-memory contents would be copied in the page cache. What Ingo had missed is that the DAX patch set providing direct filesystem access to persistent memory was merged for the 4.0 release. If a DAX-supporting filesystem (ext4 now, XFS soon) is built in a persistent memory region, file I/O will avoid the page cache and operations like mmap() will be properly supported.

That said, there are a few things that still will not work quite as expected. One of those is mlock(), which, as Yigal Korman pointed out, may seem a bit strange: data stored in persistent memory is almost by definition locked in memory. As noted by Kirill Shutemov, though, supporting mlock() is not a simple no-op; the required behavior depends on how the memory mapping was set up in the first place. Private mappings still need copy-on-write semantics, for example. A perhaps weirder case is direct I/O: if a region of persistent memory is mapped into a process's address space, the process cannot perform direct I/O between that region and an ordinary file. There may also be problems with direct memory access (DMA) I/O operations, some network transfers, and the vmsplice() system call, among others.

Whither struct page?

In almost all cases, the restrictions with persistent memory come down to the lack of page structures for that memory. A page structure represents a page of physical memory in the system memory map; it contains just about everything the kernel knows about that page and how it is being used. See this article for the gory details of what can be found there. These structures are used with many internal kernel APIs that deal with memory. Persistent memory, lacking corresponding page structures, cannot be used with those APIs; as a result, various things don't work with persistent memory.
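For a sense of what that bookkeeping contains, here is a heavily simplified sketch; the real definition, found in include/linux/mm_types.h, packs most of these fields into unions and varies with the kernel configuration:

    struct page {
	unsigned long flags;		/* page status bits: locked, dirty, ... */
	struct address_space *mapping;	/* the file or anonymous mapping using the page */
	pgoff_t index;			/* the page's offset within that mapping */
	atomic_t _count;		/* reference count */
	struct list_head lru;		/* linkage into the LRU (reclaim) lists */
    };

Keeping a few dozen bytes of this bookkeeping for every 4KB page is cheap for ordinary RAM, but, as described next, it adds up quickly for terabyte-scale persistent-memory arrays.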

Kernel developers have hesitated to add persistent memory to the system memory map because persistent-memory arrays are expected to be large — in the terabyte range. With the usual 4KB page size, 1TB of persistent memory would need 256 million page structures which would occupy several gigabytes of RAM. And they do need to be stored in RAM, rather than in the persistent memory itself; page structures can change frequently, so storing them in memory that is subject to wear is not advisable. Rather than dedicate a large chunk of RAM to the tracking of persistent memory, the development community has, so far, chosen to treat that memory as a separate type of device.

At some point, though, a way to lift the limitations around persistent memory will need to be found. There appear to be two points of view on how that might be done. One says that page structures should never be used with persistent memory. The logical consequence of this view is that the kernel interfaces that currently use page structures need to be changed to use something else — page-frame numbers, for example — that works with both RAM and persistent memory. Dan Williams posted a patch removing struct page usage from the block layer in March. It is not for the faint of heart: just over 100 files are touched to make this change. That led to complaints from some developers that getting rid of struct page usage in APIs would involve a lot of high-risk code churn and remove a useful abstraction while not necessarily providing a lot of benefit.

The alternative would be to bite the bullet and add struct page entries for persistent memory regions. Boaz Harrosh posted a patch to that end in August 2014; it works by treating persistent memory as a range of hot-pluggable memory and allocating the memory-map entries at initialization time. The patch is relatively simple, but it does nothing to address the memory-consumption issue.

In the long run, the solution may take the form of something like a page structure that represents a larger chunk of memory. One obvious possibility is to make a version of struct page that refers to a huge page; that has the advantage of using a size that is understood by the processor's memory-management unit and would integrate well with the transparent huge page mechanism. An alternative would be a variable-size extent structure as is used by more recent filesystems. Either way, the changes required would be huge, so this is not something that is going to happen in the near future.

What will happen is that persistent memory devices will work on Linux as a storage medium for the major filesystems, providing good performance. There will be some rough edges with specific features that do not work, but most users are unlikely to run into them. With 4.1, the kernel will have a level of support for persistent-memory devices to allow that hardware to be put to good use, and to allow users to start figuring out what they actually want to do with that much fast, persistent storage.

Comments (3 posted)

Load tracking in the scheduler

April 15, 2015

This article was contributed by Preeti U Murthy.

The scheduler is an essential part of an operating system, tasked with, among other things, ensuring that processes get their fair share of CPU time. This is not as easy as it may seem initially. While some processes perform critical operations and have to be completed at high priority, others are not time-constrained. Processes in the former category expect a bigger share of CPU time than those in the latter, so that they can finish as quickly as possible. But how big a share should the scheduler allocate to them?

Another factor that adds to the complexity in scheduling is the CPU topology. Scheduling on uniprocessor systems is simpler than scheduling on the multiprocessor systems that are more commonly found today. The topology of CPUs is only getting more complex, with hyperthreading and heterogeneous processors like big.LITTLE taking the place of symmetric processors. Scheduling a process on the wrong processor can adversely affect its performance. Thus, designing a scheduling algorithm that can keep all processes happy with the computing time allocated to them can be a formidable challenge.

The Linux kernel scheduler has addressed many of these challenges and matured over the years. Today there are different scheduling algorithms (or "scheduling classes") in the kernel to suit processes having different requirements. The Completely Fair Scheduling (CFS) class is designed to suit a majority of today's workloads. The realtime and deadline scheduling classes are designed for latency-sensitive and deadline-driven processes respectively. So we see that the scheduler developers have answered a range of requirements.

The Completely Fair Scheduling class

The CFS class is the class to which most tasks belong. In spite of the robustness of this algorithm, an area that has always had scope for improvement is process-load estimation.

If a CPU is associated with a number C that represents its ability to process tasks (let's call it "capacity"), then the load of a process is a metric that is expressed in units of C, indicating the number of such CPUs required to make satisfactory progress on its job. This number could also be a fraction of C, in which case it indicates that a single such CPU is good enough. The load of a process is important in scheduling because, besides influencing the time that a task spends running on the CPU, it helps to estimate overall CPU load, which is required during load balancing.

The question is how to estimate the load of a process. Should it be set statically or should it be set dynamically at run time based on the behavior of the process? Either way, how should it be calculated? There have been significant efforts at answering these questions in the recent past. As a consequence, the number of load-tracking metrics has grown significantly and load estimation itself has gotten quite complex. This landscape appears quite formidable to reviewers and developers of CFS. The aim of this article is to bring about clarification on this front.

Before proceeding, it is helpful to point out that the granularity of scheduling in Linux is at a thread level and not at a process level. However, the scheduling jargon for thread is "task." Hence throughout this article the term "task" has been used to mean a thread.

Scheduling entities and task groups

The CFS algorithm defines a time duration called the "scheduling period," during which every runnable task on the CPU should run at least once. This way no task gets starved for longer than a scheduling period. The scheduling period is divided among the tasks into time slices, which are the maximum amount of time that a task runs within a scheduling period before it gets preempted. This approach may seem to avoid task starvation at first. However it can lead to an undesirable consequence.

Linux is a multi-user operating system. Consider a scenario where user A spawns ten tasks and user B spawns five. Using the above approach, every task would get ~7% of the available CPU time within a scheduling period. So user A gets 67% and user B gets 33% of the CPU time during their runs. Clearly, if user A continues to spawn more tasks, he can starve user B of even more CPU time. To address this problem, the concept of "group scheduling" was introduced in the scheduler, where, instead of dividing the CPU time among tasks, it is divided among groups of tasks.

In the above example, the tasks spawned by user A belong to one group and those spawned by user B belong to another. The granularity of scheduling is at a group level; when a group is picked to run, its time slice is further divided between its tasks. In the above example, each group gets 50% of the CPU's time and tasks within each group divide this share further among themselves. As a consequence, each task in group A gets 5% of the CPU and each task in group B gets 10% of the CPU. So the group that has more tasks to run gets penalized with less CPU time per task and, more importantly, it is not allowed to starve sibling groups.
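To spell out the arithmetic in that example, here is a purely illustrative user-space calculation (not kernel code) of the per-task shares with and without group scheduling:

    #include <stdio.h>

    int main(void)
    {
        int tasks_a = 10, tasks_b = 5;

        /* Flat scheduling: all 15 tasks split the period equally. */
        double per_task = 100.0 / (tasks_a + tasks_b);
        printf("flat:    each task %.1f%%, user A %.0f%%, user B %.0f%%\n",
               per_task, per_task * tasks_a, per_task * tasks_b);

        /* Group scheduling: each user's group gets half of the period,
           which is then divided among that group's tasks. */
        printf("grouped: user A tasks %.1f%% each, user B tasks %.1f%% each\n",
               50.0 / tasks_a, 50.0 / tasks_b);
        return 0;
    }

It prints 6.7% per task (67% for user A, 33% for user B) in the flat case, and 5% versus 10% per task once each user's tasks are placed in a group of their own.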

Group scheduling is enabled only if CONFIG_FAIR_GROUP_SCHED is set in the kernel configuration. A group of tasks is called a "scheduling entity" in the kernel and is represented by the sched_entity data structure:

    struct sched_entity {
	struct load_weight load;	/* weight of this entity, used for load balancing */
	struct sched_entity *parent;	/* entity of the parent group, if nested */
	struct cfs_rq *cfs_rq;		/* run queue on which this entity is queued */
	struct cfs_rq *my_q;		/* run queue owned by this (group) entity */
	struct sched_avg avg;		/* tracked load-average statistics */
	/* ... */
    };

Before getting into the details of how this structure is used, it is worth considering how and when groups of tasks are created. This happens under two scenarios:

  1. Users may use the control group ("cgroup") infrastructure to partition system resources between tasks. Tasks belonging to a cgroup are associated with a scheduling group (if the CPU controller is attached to that cgroup).

  2. When a new session is created through the setsid() system call. All tasks belonging to a specific session also belong to the same scheduling group. This feature is enabled when CONFIG_SCHED_AUTOGROUP is set in the kernel configuration.

Outside of these scenarios, a single task becomes a scheduling entity on its own. A task is represented by the task_struct data structure:

    struct task_struct {
	struct sched_entity se;
	/* ... */
    };

Scheduling is always done at the granularity of a sched_entity; that is why every task_struct has a sched_entity embedded in it. CFS also accommodates nested groups of tasks. Each group scheduling entity owns a run queue of its own, represented by:

    struct cfs_rq {
	struct load_weight load;
	unsigned long runnable_load_avg;
	unsigned long blocked_load_avg;
	unsigned long tg_load_contrib;
	/* ... */
    };

Each scheduling entity may, in turn, be queued on a parent scheduling entity's run queue. At the lowest level of this hierarchy, the scheduling entity is a task; when the scheduler needs to pick a task to run on the CPU, it traverses this hierarchy from the top down until it reaches one.

The parent run queue on which a scheduling entity is queued is represented by cfs_rq, while the run queue that it owns is represented by my_rq in the sched_entity data structure. The scheduling entity gets picked from the cfs_rq when its turn arrives, and its time slice gets divided among the tasks on my_rq.
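
To make the traversal concrete, here is a self-contained sketch; it is not kernel code, and pick_next_entity() here is just a placeholder for the real selection of the leftmost entity on a queue:

    #include <stddef.h>

    /* Minimal stand-ins for the structures discussed above. */
    struct cfs_rq;
    struct sched_entity {
        struct cfs_rq *my_rq;           /* non-NULL only for group entities */
    };
    struct cfs_rq {
        struct sched_entity *leftmost;  /* placeholder for the rbtree's leftmost node */
    };

    /* Placeholder for the real per-queue selection logic. */
    static struct sched_entity *pick_next_entity(struct cfs_rq *cfs_rq)
    {
        return cfs_rq->leftmost;
    }

    /* Descend through group entities until a task-level entity is found. */
    struct sched_entity *pick_task_entity(struct cfs_rq *cfs_rq)
    {
        struct sched_entity *se;

        do {
            se = pick_next_entity(cfs_rq);
            cfs_rq = se->my_rq;
        } while (cfs_rq != NULL);

        return se;
    }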

Let us now extend the concept of group scheduling to multiprocessor systems. Tasks belonging to a group can be scheduled on any CPU, so a single scheduling entity per group is not sufficient; instead, every group must have one scheduling entity for each CPU. Tasks belonging to a group must move only between the run queues of these per-CPU scheduling entities, so that a task's footprint stays associated with its group even when it migrates. The data structure that represents the scheduling entities of a group across CPUs is:

    struct task_group {
	struct sched_entity **se;
	struct cfs_rq **cfs_rq;
	unsigned long shares;
	atomic_long_t load_avg;
	/* ... */
    };

For every CPU c, a given task_group tg has a sched_entity se and a run queue associated with it. They are related as follows:

    tg->se[c] = &se;
    tg->cfs_rq[c] = se.my_rq;

So when a task belonging to tg migrates from CPUx to CPUy, it will be dequeued from tg->cfs_rq[x] and enqueued on tg->cfs_rq[y].

Time slice and task load

The concept of a time slice was introduced above as the amount of time that a task is allowed to run on a CPU within a scheduling period. Any given task's time slice depends on its priority and on the number of tasks on the run queue. The priority of a task is a number representing its importance; in the kernel it is a value between 0 and 139, and the lower the value, the higher the priority. A task with stricter timing requirements needs a higher priority than others.

But the priority value by itself is not useful to the scheduler, which also needs to know the task's load to calculate its time slice. As mentioned above, the load is the multiple (or fraction) of a standard CPU's capacity that the task needs to make satisfactory progress, so the priority number must be mapped to such a value; this mapping is done by the prio_to_weight[] array.

A priority of 120, the priority of a normal task, maps to a load of 1024, which is the value the kernel uses to represent the capacity of a single standard CPU. The remaining values in the array are arranged such that the ratio between two successive entries is roughly 1.25. That number was chosen so that lowering a task's priority number by one level gives it about a 10% larger share of CPU time, while raising it by one level gives it about a 10% smaller share.

Let us consider an example to illustrate this. If there are two tasks, A and B, running at a priority of 120, the portion of available CPU time given to each task is calculated as:

    1024/(1024*2) = 0.5

However if the priority of task A is increased by one level to 121, its load becomes:

    (1024/1.25) = ~820

(Recall that the higher the priority number, the lower the load.) Task A's portion of the CPU then becomes:

    820/(1024+820) = ~0.45

while task B will get:

    (1024/(1024+820)) = ~0.55

This amounts to roughly a 10% decrease in the CPU-time share for task A.

The load value of a process is stored in the weight field of the load_weight structure (which is, in turn, found in struct sched_entity):

    struct load_weight {
	unsigned long weight;
    };

A run queue (struct cfs_rq) also has a weight value, which is the sum of the weights of all the tasks queued on it.

The time slice can now be calculated as:

    time_slice = (sched_period() * se.load.weight) / cfs_rq.load.weight;

where sched_period() returns the scheduling period, which is scaled by the number of running tasks on the CPU. We see that the higher a task's weight, the larger the fraction of the scheduling period it gets to run on the CPU.
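
As a worked example (plain arithmetic, not scheduler code; the 12ms scheduling period is an arbitrary value chosen for illustration), here are the time slices for the two tasks from the example above, with weights 820 and 1024:

    #include <stdio.h>

    int main(void)
    {
        unsigned long long period_ns = 12 * 1000 * 1000;    /* assumed 12 ms period */
        unsigned long long weight_a = 820;                   /* task A at priority 121 */
        unsigned long long weight_b = 1024;                  /* task B at priority 120 */
        unsigned long long queue_weight = weight_a + weight_b;

        printf("task A: %llu ns\n", period_ns * weight_a / queue_weight);  /* ~5.3 ms */
        printf("task B: %llu ns\n", period_ns * weight_b / queue_weight);  /* ~6.7 ms */
        return 0;
    }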

Runtime and task load

We have seen how long a task runs on a CPU when picked, but how does the scheduler decide which task to pick? Tasks are kept in a red-black tree, ordered by the amount of time that they have spent running on the CPU, which is accumulated in a variable called vruntime. The lowest vruntime on the queue is stored in cfs_rq.min_vruntime. When a new task is to be run, the leftmost node of the red-black tree is chosen, since that task has had the least running time on the CPU. Whenever a task forks or wakes up, its vruntime is set to the maximum of its last updated value and cfs_rq.min_vruntime. Without this adjustment, the vruntime of a task that has not run for a long time (or at all) would be very small; it would then hold on to the CPU for an unacceptably long time while catching up to the vruntime of its siblings, starving them of CPU time.

Every periodic tick, the vruntime of the currently-running task is updated as follows:

    vruntime += delta_exec * (NICE_0_LOAD/curr->load.weight);

where delta_exec is the time the task has run since vruntime was last updated, NICE_0_LOAD is the load of a task with normal priority, and curr is the currently-running task. We see that vruntime advances more slowly for tasks of higher priority. It has to, since these tasks have larger time slices and are not preempted until those slices are exhausted.
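
The effect of the weighting can be seen with some simple arithmetic (again not kernel code; the weights and the 10ms of execution time are made-up values): both tasks run for the same amount of wall-clock time, but the heavier task's vruntime advances half as fast:

    #include <stdio.h>

    #define NICE_0_LOAD 1024ULL

    int main(void)
    {
        unsigned long long delta_exec_ns = 10 * 1000 * 1000;    /* 10 ms of CPU time */
        unsigned long long weights[] = { 1024, 2048 };          /* nice 0 vs. a heavier task */

        for (int i = 0; i < 2; i++)
            printf("weight %4llu: vruntime advances by %llu ns\n",
                   weights[i], delta_exec_ns * NICE_0_LOAD / weights[i]);
        return 0;
    }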

Per-entity load-tracking metrics

The load of a CPU could simply be the sum of the loads of all the scheduling entities on its run queue; indeed, that was once all there was to it. That approach has a disadvantage, though: tasks are associated with load values based only on their priorities, without taking the nature of a task into account, such as whether it is bursty or steady, CPU-intensive or I/O-bound. While that distinction does not matter for scheduling within a CPU, it does matter for load balancing across CPUs, since knowing the nature of tasks allows the CPU load to be estimated more accurately. The per-entity load-tracking metric was therefore introduced to capture the nature of a task numerically; it computes task load as the amount of time that the task was runnable relative to the time that it has been alive. This is tracked in the sched_avg data structure (stored in the sched_entity structure):

    struct sched_avg {
	u32 runnable_sum, runnable_avg_period;
	unsigned long load_avg_contrib;
    };

Given a task p, if the sched_entity associated with it is se and the sched_avg of se is sa, then:

    sa.load_avg_contrib = (sa.runnable_sum * se.load.weight) / sa.runnable_avg_period;

where runnable_sum is the amount of time that the task was runnable and runnable_avg_period is the period during which the task could have been runnable.

Therefore load_avg_contrib is the task's weight scaled by the fraction of time that it was ready to run; the higher the priority, and the more time spent runnable, the higher the load.

So tasks showing peaks of activity after long periods of inactivity, and tasks that are blocked on disk access (and thus non-runnable) most of the time, have a smaller load_avg_contrib than CPU-intensive tasks such as code doing matrix multiplication. In the former case, runnable_sum would be a small fraction of runnable_avg_period. In the latter, the two numbers would be equal (i.e. the task was runnable throughout the time that it was alive), identifying it as a high-load task.
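
A small numeric sketch (the runnable times are invented, not measured) shows how differently the two kinds of tasks contribute, assuming both are normal-priority tasks with a weight of 1024:

    #include <stdio.h>

    int main(void)
    {
        unsigned long long weight = 1024;
        unsigned long long period_ms = 1000;      /* time both tasks have been alive */
        unsigned long long runnable_io = 100;     /* I/O-bound: runnable 10% of the time */
        unsigned long long runnable_cpu = 1000;   /* CPU-bound: runnable the whole time */

        printf("I/O-bound task: load_avg_contrib = %llu\n",
               runnable_io * weight / period_ms);     /* 102 */
        printf("CPU-bound task: load_avg_contrib = %llu\n",
               runnable_cpu * weight / period_ms);    /* 1024 */
        return 0;
    }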

The load on a CPU is the sum of the load_avg_contrib of all the scheduling entities on its run queue; it is accumulated in a field called runnable_load_avg in the cfs_rq data structure. This is roughly a measure of how heavily contended the CPU is. The kernel also tracks the load associated with blocked tasks. When a task gets blocked, its load is accumulated in the blocked_load_avg metric of the cfs_rq structure.

Per-entity load tracking in presence of task groups

Now what about the load_avg_contrib of a scheduling entity, se, when it is a group of tasks? The cfs_rq that the scheduling entity owns accumulates the load of its children in runnable_load_avg as explained above. From there, the parent task group of cfs_rq is first retrieved:

    tg = cfs_rq->tg;

The load contributed by this cfs_rq is added to the load of the task group tg:

    cfs_rq->tg_load_contrib = cfs_rq->runnable_load_avg + cfs_rq->blocked_load_avg;
    tg->load_avg += cfs_rq->tg_load_contrib;

The load_avg_contrib of the scheduling entity se is now calculated as:

    se->avg.load_avg_contrib =
	  (cfs_rq->tg_load_contrib * tg->shares / tg->load_avg);

where tg->shares is the maximum allowed load for the task group. In other words, each of the group's per-CPU scheduling entities gets a load that is the fraction of the group's shares proportional to the load contributed by its own run queue.

tg->shares can be set by users to indicate the importance of a task group. As should be clear by now, both runnable_load_avg and blocked_load_avg are needed to estimate the load contributed by the task group.
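
A short sketch (with invented shares and per-CPU load figures) illustrates how a group's shares are divided between its per-CPU scheduling entities according to the formula above:

    #include <stdio.h>

    int main(void)
    {
        unsigned long long shares = 1024;              /* tg->shares */
        unsigned long long contrib[2] = { 300, 700 };  /* tg_load_contrib of each CPU's cfs_rq */
        unsigned long long tg_load = contrib[0] + contrib[1];

        for (int cpu = 0; cpu < 2; cpu++)
            printf("CPU%d group entity: load_avg_contrib = %llu\n",
                   cpu, contrib[cpu] * shares / tg_load);    /* 307 and 716 */
        return 0;
    }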

There are still drawbacks to this load tracking. The load metrics currently used are not CPU-frequency invariant: if the CPU frequency increases, the load of the currently running task may appear smaller than it otherwise would, which can upset load-balancing decisions. The current algorithm also falls apart in places on big.LITTLE processors, either underestimating or overestimating their capacity. Efforts to fix these problems are ongoing, so there is good scope for improving the load-tracking heuristics in the scheduler. Hopefully this article has laid out the basics needed to understand and review the improvements underway on this front.

Comments (3 posted)

Patches and updates

Kernel trees

Ima Sheep Linux 4.0 released
Greg KH Linux 3.19.4
Greg KH Linux 3.14.38
Kamal Mostafa Linux 3.13.11-ckt19
Jiri Slaby Linux 3.12.40
Greg KH Linux 3.10.74

Architecture-specific

Core kernel code

Development tools

Device drivers

Device driver infrastructure

Filesystems and block I/O

Janitorial

Richard Weinberger Remove execution domain support

Memory management

Networking

Virtualization and containers

Miscellaneous

Page editor: Jonathan Corbet

Distributions

DNF replacing yum

By Nathan Willis
April 15, 2015

Fedora has been planning to replace the yum package manager for quite some time. The designated replacement, DNF, is a rewrite that offers several advantages, including faster and better dependency resolution. But users (arguably Linux users in particular) tend to get attached to their tools, so the prospect of switching over all at once can be a hard sell. The Fedora Engineering Steering Committee (FESCo) decided in June 2014 that the switch would finally take place in Fedora 22. Now that Fedora 22 is on the verge of release, though, the practical difficulties involved in changing such a low-level program have bubbled up to the surface again.

On April 1, FESCo member Kevin Fenzi outlined the changes in package management that would arrive in Fedora 22, noting that there had previously been some confusion about the plan. DNF will be installed by default, as will the transitional package dnf-yum, which turns /usr/bin/yum into a script that redirects to dnf. Nevertheless, users can still install yum manually, and the yum package will still be installed if the user installs anything that lists yum as a dependency. In addition, as Jan Silhan added, a migrate plugin has been developed that will migrate the machine's yum history and metadata to DNF.

But there is a bit more to it than that. The Fedora 22 "yum" package will also install dnf-yum as a dependency, and the yum package's executable will be /usr/bin/yum-deprecated (with /usr/bin/yum still just wrapping DNF). Furthermore, running yum will issue a warning message to the user:

    Yum command has been deprecated, use dnf instead.
    See 'man dnf' and 'man yum2dnf' for more information.
    To transfer transaction metadata from yum to DNF, run 'dnf migrate'

Below the warning, it will also notify the user that the yum command entered is being redirected to the corresponding DNF command.

Not everyone embraced this transition plan with enthusiasm, however. In reply to Fenzi, Nico Kadel-Garcia said the plan was "unnecessarily confusing" and that adding hooks for the /etc/alternatives framework would be better than renaming or redirecting binaries and making the two packages forcibly compete with each other. "This approach is old and effective for Java, for /usr/sbin/sendmail, and for numerous other tools. Why would you want to reinvent the wheel and prevent people from having either as needed?"

Jan Zelený argued that /etc/alternatives was meant to choose between active alternative tools, whereas yum is being deprecated and will be phased out. What followed next was a lengthy debate that focused largely on the relative merits of yum and DNF. Bruno Wolff questioned DNF's compatibility with yum (specifically with respect to yum's --skip-broken flag), saying that he would prefer to keep using yum than to undertake the workarounds that are necessary to make DNF behave like yum.

To that, Zelený replied that full compatibility between the two tools had never been promised:

From the very beginning, we were sending a clear message that we will be as compatible as possible in terms of CLI but we never wanted to have just another yum.

The ensuing discussion made a few viewpoints clear. First, not everyone involved has the same definition of "as compatible as possible." Zelený agreed only that the most common package-management use cases will be covered; some less-frequently-needed yum commands may never be implemented in DNF at all (in the FESCo ticket that deals with the transition plan, Josh Boyer pointed to a number of yum features that the DNF team had no plans to implement). While this is perhaps sensible from an engineering standpoint, others argued that this stance results in mixed messages for the user. As Przemek Klosowski said:

The updating problem is complex enough so that yum ecosystem is not handling it well. You are making an argument that dnf reimplemented this ecosystem in a qualitatively different, more correct way, that is so different that it merits a clean break, justifying the project name change. I am fine with that, but this leads to confusion when there are observable changes in behavior. The implication is that dnf is better and more correct, but on the other hand it's clear that dnf is still in development and exhibits faults. So, now we have a problem: if dnf behaves differently from yum, is it an improvement or a regression? We don't know, and it's not clear to me how to tell; every divergence is a potential bug in dnf, and therefore should be reported as such.

I would venture a comment that it was a mistake to declare dnf a separate project, because it leads to a different approach to such differences. If it was an evolutionary change in yum, it would be natural to expect it to behave in a compatible way, and consequently detect and explain in more detail the divergent behavior. By declaring a clean break, you are basically saying that there is no need to explain the diffs, but the flip side of it is that unless I can clearly understand why the difference is for the better, I must suspect this to be a regression and report it.

The --skip-broken flag, first brought up in the thread by Wolff, is indicative in many respects of the irritation some users feel about the yum-to-DNF transition, but it has wider implications as well. A yum update operation that used --skip-broken would skip all updates for packages with broken dependencies. DNF skips such broken updates—silently—by default. As several in the thread pointed out, there are ways to get DNF to report the existence of broken packages, such as running dnf check-update after running an update.

To Steve Clark, however, that simply means that DNF requires more work to use. To Vít Ondruch, it means that dnf update is an unreliable command. Moreover, although the DNF team seemed to regard --skip-broken as a kludge used to work around situations that are best avoided in the first place (such as mixing packages from several different repositories or package maintainers pushing broken updates), a number of users found it naive to expect that users will not have to deal with broken packages. Tom Hughes noted that feedback about broken packages from yum is one of the main ways packagers learn about packaging problems.

Zelený capped off the discussion—at least for now—by inviting community members to work on DNF plugins to implement the functionality that they want. Perhaps predictably, that answer did not sit well with everyone who participated in the thread. But regardless of how the masses may feel about the differences between DNF and yum, the long-awaited transition appears to finally be underway.

In some respects, the push to migrate Fedora from yum to DNF, even through a transition that some will find painful, is just the most recent example of a recurring issue for free-software projects. Whether it is a compiler, a compositor, or a desktop panel, making a clean break from the past is rarely easy, but sometimes it must be done. This migration may prove to be a contentious one for Fedora, but eventually the distribution will emerge on the other side.

Comments (20 posted)

Brief items

Debian project leader election results

This year's Debian project leader election has concluded, with Neil McGovern winning by a conclusive margin.

Full Story (comments: 1)

Linux Mint LMDE 2

Linux Mint Debian Edition 2 "Betsy" is available in Cinnamon and MATE editions. "Life on the LMDE side can be exciting. There are no point releases in LMDE 2, except for bug fixes and security fixes base packages stay the same, but Mint and desktop components are updated continuously. When ready, newly developed features get directly into LMDE 2, whereas they are staged for inclusion on the next upcoming Linux Mint 17.x point release. Consequently, Linux Mint users only run new features when a new point release comes out and they opt-in to upgrade to it. LMDE 2 users don’t have that choice, but they also don’t have to wait for new packages to mature and they usually get to run them first. It’s more risky, but more exciting." The release notes contain more details.

Comments (none posted)

Newsletters and articles of interest

Distribution newsletters

Comments (none posted)

SuperX OS Greases the Classic Linux Wheel (LinuxInsider)

LinuxInsider has a review of SuperX, a Debian/Ubuntu derivative with a customized KDE desktop. "What caught my eye in this KDE variant distro is its desktop responsiveness. Grace [SuperX 3.0] gives more priority to application performance. The Grace engine compresses unused memory pages within RAM rather than swapping them out to the swap partition. This makes the OS responsive even when the system memory is low. Commonly used applications are preloaded and cached in memory for faster startup of favorite applications."

Comments (none posted)

The five biggest changes in Ubuntu 15.04, Vivid Vervet (ZDNet)

Steven J. Vaughan-Nichols takes Ubuntu 15.04 for a test drive. He notes that most of the changes are under-the-hood and not user visible. "Ubuntu's developers decided that, even though they were a few days past the feature release freeze date, they would switch 15.04's default to systemd. The change will affect "Ubuntu desktop/server/cloud and the flavors like Kubuntu, but *NOT* ubuntu-touch."

Comments (2 posted)

Page editor: Rebecca Sobol

Development

Plotting tools for networks, part I

April 15, 2015

This article was contributed by Lee Phillips

In the first two installments in this series on plotting tools (which covered gnuplot and matplotlib), we introduced tools for creating plots and graphs, and used the terms interchangeably to refer to the typical scientific plot relating one set of quantities to another. In this article we use the term "graph" in its mathematical, graph-theory context, meaning a set of nodes connected by edges. There is a strong family resemblance among graph-theory graphs, flowcharts, and network diagrams—so much so that some of the same tools can be coerced into creating all of them. We will now survey several mature free-software systems for building these types of visualizations. At least one of these tools will likely be useful if you are ever in need of an automated way to diagram source-code interdependencies, make an organizational chart, visualize a computer network, or organize a sports tournament. We will start with a graphical charting tool and a flexible graphing system that can easily be called by other programs.

Flowcharting with Dia

A flowchart is a diagram of a process, algorithm, workflow, or something similar. Flowcharts for different fields often employ a specialized graphical language of symbols that represent entities common to the field. For example, a circuit diagram is a type of flowchart that uses special symbols for diodes, resistors, and other circuit elements. There are flowchart languages for logic circuits, chemical engineering, software design, and much more.

Dia, a free (in all senses) diagram editor for Linux and other systems, comes with symbol libraries encompassing all of these examples, plus many others, both common and exotic. And, if that's not sufficient, the program allows you to make your own symbols.

[Dia screenshot]

Dia is a GUI program that uses the GTK+ libraries. You use it somewhat like Inkscape or other drawing programs. However, to make effective use of the program you should remember that you are not creating a drawing, but, rather, defining a set of relationships between entities. These relationships are represented by lines and curves (perhaps with arrowheads or labels), and the entities take the forms of the various symbolic shapes we mentioned above, often with their own text labels.

The trick to defining these relationships through the graphical interface is to make the connections in the right way. Since it takes a while to extract these techniques from the documentation, we'll outline the steps here.

After selecting the desired shape from the panel and dragging it out in the canvas to the approximate size you think it should be, immediately press Return and type the text label for the shape. The label will be properly centered, the shape will grow as required to accommodate it, and the label will be permanently attached to the shape and move with it. To connect two entities with a line, draw the line between the centers of the entities; you know you've hit the correct spot when the shape glows yellow.

To attach a text label to a line (such as the "yes" and "no" labels in the screenshot) you need to follow a different procedure: with object snapping turned on, create a text entity using the text tool that looks like a "T", then drag it by its handle, connecting it to the line's attachment point. This point is indicated by a small "x" and is at the center of the line. A red glow will signal that you've made the attachment.

If you've defined all your labels, entities, and connections using these techniques, then you'll be able to move the nodes around at will on the canvas until the chart is neat and easy to follow. The topology of the graph, which carries the actual information in the flowchart, won't change but, by moving things around, you can change a tangle of crossed lines into a neat diagram where the flow is clear.

Dia saves your work in an XML file (a compressed one by default, though there is also an option to save it uncompressed), and can export it into a wide variety of image formats, including vector formats such as SVG.

The program should be available in your package manager. Development is steady but moves at a slow pace, so it's likely you'll get the current version even from a conservative distribution. If you need to, however, you can download sources or binaries from Dia headquarters.

Graphviz: infrastructure for graphs

Dia is a useful and versatile tool for creating and laying out a graph by hand. Sometimes, however, we begin with a (possibly large) set of data that we want to visualize as a network-style graph or flowchart. We may also want to experiment with different types of visualizations or to produce different graph styles that present the same data for different purposes.

Graphviz solves these problems by providing a declarative language, called "dot," that represents nodes and the connections between them as text. The dot language can accommodate a large set of visual and logical attributes of many types of nodes, their relationships, and their interconnections. Nevertheless, it's intuitive, with an easy-to-remember and readable syntax. Here is an almost-minimal example of a dot file that defines a simple graph:

      strict digraph "example" {
        A -> {B C};
        D [shape = box];
        C -> D;
        D -> C [color = blue];
      }

The keyword strict at the beginning means that no redundant edges are allowed; a digraph is a directed graph (meaning that the edges have a direction, often represented by an arrowhead at the end and, perhaps, at the beginning). The second line says that node A is connected with both nodes B and C, in the direction starting from A. The next line declares a new node called D and defines an attribute that specifies how D should be drawn. Then, we declare that C is connected to D, and that D is connected back to C. This last edge has an attribute specifying its color.

The Graphviz infrastructure also comes with several layout engines that interpret dot files and produce the actual graphs. Some of the engines are for directed graphs, some are for undirected graphs, and some handle both types. The problem of taking a graph specification—with perhaps thousands of nodes—and producing a usable visual representation is not trivial, and is the subject of continuing research. Each of Graphviz's engines has a mathematical theory [PDF] behind it, and each will generate a different type of graph.

For simple directed graphs such as the one represented in the dot file above, the engine called "dot" is usually best. We invoke it on the command line:

    dot -o simpledot.png -Tpng simpledot.dot

This generates a PNG output file (one of many choices), using simpledot.dot as the graph specification. If we store the code snippet above into this file, we get the output shown here:

[simple directed graph]

It's clear how the definitions of nodes and edges have been translated into a picture. If we apply a different layout engine to the same dot file, for example fdp:

    fdp -o simpledot.png -Tpng simpledot.dot

we get the same information, but depicted in a different style:

[simple directed graph using fdp]

A brief summary of the various layout engines that come with the system is provided in the dot man page. The dot engine produces a simple, hierarchical layout, whereas fdp, sfdp, and neato all treat the edges as springs. That is, they attempt to arrive at a neat arrangement by starting with a random layout and allowing the system to relax to a minimum energy configuration. The different engines will produce distinct results, as they are all based on different algorithms.

The use of a language to define graphs means that Graphviz can serve as the graphical engine for other systems or programs; they merely need to format their output in the dot language. There are many examples of this. Snakefood is a program that analyzes Python programs and determines their dependencies. You can point it at a directory of Python files and it will return the interdependencies among the files and to external modules, in its own format, which is a collection of Python tuples.

This output can be piped to a Snakefood utility that translates it into dot, which can then be processed by any of the Graphviz engines that can handle directed graphs—usually the dot engine is the best choice. Here is the result of applying the process to a directory holding the files for the bsddb package, an interface to the Berkeley DB library:

[bsddb dependency graph]

The dot file corresponding to this graph is available for further exploration.

You can also use Graphviz without using the dot language directly, by using one of its programming language interfaces. For example, the pygraphviz library for Python allows you to define a graph object, add nodes and edges to it, and create a graph image by calling the draw() method on the object. The Graphviz layout engine is selected with an argument passed to the draw() method.
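
Graphviz can also be driven directly from C through its libcgraph and libgvc libraries. The following minimal sketch builds a two-node directed graph and renders it with the dot engine; the output file name and the build command (something like gcc graph.c $(pkg-config --cflags --libs libgvc)) are illustrative:

    #include <graphviz/gvc.h>

    int main(void)
    {
        GVC_t *gvc = gvContext();                     /* rendering context */
        Agraph_t *g = agopen("example", Agdirected, NULL);

        Agnode_t *a = agnode(g, "A", 1);              /* create (or look up) nodes */
        Agnode_t *d = agnode(g, "D", 1);
        agsafeset(d, "shape", "box", "");             /* node attribute */

        Agedge_t *e = agedge(g, a, d, NULL, 1);       /* directed edge A -> D */
        agsafeset(e, "color", "blue", "");            /* edge attribute */

        gvLayout(gvc, g, "dot");                      /* run the dot layout engine */
        gvRenderFilename(gvc, g, "png", "example.png");
        gvFreeLayout(gvc, g);
        agclose(g);
        return gvFreeContext(gvc);
    }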

We've barely scratched the surface of what Graphviz can do. Facilities for subgraphs, record nodes, adding labels in HTML, and more make this a general-purpose powerhouse for any type of automated graph creation. Graphviz is free software (as is Snakefood) and should be available through your package manager.

Stay tuned

In part 2 of this article we'll continue our survey of network graphing tools by looking at two sophisticated libraries. One is used within the LaTeX document-processing system to allow you to embed diagrams into your books and papers. The other is used with Python to allow you to explore the properties of general networks and plot the results.

Comments (31 posted)

Brief items

Quotes of the week

I've heard some complain of how long the GPL is but this PDF of Apple's proprietary Software License Agreement for iOS 8.1 is 471 pages: http://images.apple.com/legal/sla/docs/iOS81.pdf
Jason Self

Well, in their defense, it takes a lot of words to restrict your freedoms and tell you all the ways they will sell and use your private information.
Charles Stanhope, in reply to Jason.

Comments (2 posted)

KDE Ships Plasma 5.3 Beta

A beta version of Plasma 5.3 has been released. This release features enhanced power management, better Bluetooth capabilities, improved Plasma widgets, a tech preview of Plasma Media Center, big steps towards Wayland support, and lots of bug fixes.

Comments (19 posted)

Kallithea 0.2 available

Version 0.2 of the Kallithea source-code-management framework has been released. Many changes are included. "Notably, pull requests system have been improved, making contributing changes more robust. The visual appearance has also been refined: modern font-based symbolic icons from FontAwesome and GitHub Octicons have replaced the previously used bitmap icons, and revision graphs are now drawn with HiDPI display support."

Full Story (comments: none)

Geoclue 2.2.0 released

GeoClue 2.2.0 has been released. Among the many changes are speed and heading reporting on each location update, GPS fixes, and fixes for GeoIP location on machines without WiFi hardware.

Full Story (comments: none)

KDE Frameworks 5.9.0 released

KDE Frameworks version 5.9.0 is now available. Included are a new ModemManagerQt module, new features for KActivities, KIO, and KIconThemes, and bugfixes for many other modules.

Full Story (comments: none)

Newsletters and articles

Development newsletters from the past week

Comments (none posted)

Turon: Fearless Concurrency with Rust

Aaron Turon has posted a lengthy introduction to concurrency in the Rust programming language. "Every data type knows whether it can safely be sent between or accessed by multiple threads, and Rust enforces this safe usage; there are no data races, even for lock-free data structures. Thread safety isn't just documentation; it's law."

Comments (23 posted)

Hubička: Link time and inter-procedural optimization improvements in GCC 5

Jan Hubička has posted a lengthy discussion of the optimization improvements found in the upcoming GCC 5.0 release. "Identical code folding is a new pass (contributed by Martin Liška, SUSE) looking for functions with the same code and variables with the same constructors. If some are found, one copy is removed and replaced by an alias to another where possible. This is especially important for C++ code bases that tend to contain duplicated functions as a result of template instantiations."

Comments (10 posted)

The Document Liberation, one year after

The Document Foundation's project Document Liberation looks at its progress during the past year. "During 2014, members of the project released a new framework library, called librevenge, which contains all the document interfaces and helper types, in order to simplify the dependency chain. In addition, they started a new library for importing Adobe PageMaker documents, libpagemaker, written as part of Google Summer of Code 2014 by Anurag Kanungo. Existing libraries have also been extended with the addition of more formats, like libwps with the addition of Microsoft Works Spreadsheet and Database by Laurent Alonso. He is now working on adding support for Lotus 1-2-3, which is one of the most famous legacy applications for personal computers. Laurent has also added support for more than twenty legacy Mac formats to libmwaw."

Full Story (comments: 4)

Page editor: Nathan Willis

Announcements

Brief items

Linux Foundation to host Let's Encrypt

The Linux Foundation (LF) has announced that it will serve as host of the Let's Encrypt project, as well as the Internet Security Research Group (ISRG). Let's Encrypt is the free, automated SSL/TLS certificate authority that was announced in November 2014 by the Electronic Frontier Foundation (EFF) to provide TLS certificates for every domain on the web. ISRG is the non-profit organization created to spearhead efforts like Let's Encrypt (which, as of now, is ISRG's only public project). In the LF announcement, executive director Jim Zemlin notes that "by hosting this important encryption project in a neutral forum we can accelerate the work towards a free, automated and easy security certification process that benefits millions of people around the world."

Comments (18 posted)

Hate DRM? Tell the world on May 6th

The Free Software Foundation's Defective By Design campaign has announced the International Day Against DRM on May 6. "We'll be gathering, protesting, making, and sharing, showing the world and the media that we insist on a future without DRM."

Full Story (comments: none)

OIN Expands the Linux System Definition

Open Invention Network (OIN) has announced that it has updated its Linux System patent non-aggression coverage. "For this update, 115 new packages will be added to the Linux System, out of almost 800 proposed by various parties. Key additions are the reference implementations of the popular Go and Lua programming languages, Nginx, Openshift, and development tools like CMake and Maven. This update will represent an increase of approximately 5% of the total number of packages covered in the Linux System, a reflection of the incremental and disciplined nature of the update process."

Full Story (comments: 3)

X.org election results

As was discussed in this LWN article, the X.Org Foundation recently held an election to choose four board members and decide whether to change the organization's by-laws to enable it to become a member of Software in the Public Interest (SPI). The results are now available. The board members elected are Peter Hutterer, Martin Peres, Rob Clark, and Daniel Vetter. The measure to change the by-laws did not pass, though, despite receiving only two "no" votes, because the required two-thirds majority was not reached.

Comments (1 posted)

New Books

The GNU Make Book -- New from No Starch Press

No Starch Press has released "The GNU Make Book" by John Graham-Cumming.

Full Story (comments: none)

Calls for Presentations

EuroPython 2015: Call for Proposals has been extended

EuroPython will take place July 20-26 in Bilbao, Spain. The call for papers deadline has been extended until April 28.

Full Story (comments: none)

CFP Deadlines: April 16, 2015 to June 15, 2015

The following listing of CFP deadlines is taken from the LWN.net CFP Calendar.

Deadline    Event dates    Event (location)
April 17    June 11-June 12    infoShare 2015 (Gdańsk, Poland)
April 28    July 20-July 26    EuroPython 2015 (Bilbao, Spain)
April 30    August 7-August 9    GNU Tools Cauldron 2015 (Prague, Czech Republic)
May 1    August 17-August 19    LinuxCon North America (Seattle, WA, USA)
May 1    September 10-September 13    International Conference on Open Source Software Computing 2015 (Amman, Jordan)
May 1    August 19-August 21    KVM Forum 2015 (Seattle, WA, USA)
May 1    August 19-August 21    Linux Plumbers Conference (Seattle, WA, USA)
May 2    August 12-August 15    Flock (Rochester, New York, USA)
May 3    August 7-August 9    GUADEC (Gothenburg, Sweden)
May 3    May 23-May 24    Debian/Ubuntu Community Conference Italia - 2015 (Milan, Italy)
May 8    July 31-August 4    PyCon Australia 2015 (Brisbane, Australia)
May 15    September 28-September 30    OpenMP Conference (Aachen, Germany)
May 17    September 16-September 18    PostgresOpen 2015 (Dallas, TX, USA)
May 17    August 13-August 17    Chaos Communication Camp 2015 (Mildenberg (Berlin), Germany)
May 23    August 22-August 23    Free and Open Source Software Conference (Sankt Augustin, Germany)
May 23    May 23-May 25    Wikimedia/MediaWiki European Hackathon (Lyon, France)
May 31    October 2-October 4    PyCon India 2015 (Bangalore, India)
June 1    November 18-November 22    Build Stuff 2015 (Vilnius, Lithuania)
June 1    July 3-July 5    SteelCon (Sheffield, UK)
June 5    August 20-August 21    Linux Security Summit 2015 (Seattle, WA, USA)
June 6    September 29-September 30    Open Source Backup Conference 2015 (Cologne, Germany)
June 11    June 25-June 28    Linux Vacation Eastern Europe 2015 (Grodno, Belarus)

If the CFP deadline for your event does not appear here, please tell us about it.

Upcoming Events

2nd Swiss Postgres Conference

There will be a Swiss PostgreSQL Conference June 25-26 in Rapperswil, Switzerland. Registration is open and early bird rates are available until May 15.

Full Story (comments: none)

Events: April 16, 2015 to June 15, 2015

The following event listing is taken from the LWN.net Calendar.

Date(s)    Event (location)
April 13-April 17    SEA Conference (Boulder, CO, USA)
April 13-April 17    ApacheCon North America (Austin, TX, USA)
April 16-April 17    Global Conference on Cyberspace (The Hague, Netherlands)
April 17-April 19    Dni Wolnego Oprogramowania / The Open Source Days (Bielsko-Biała, Poland)
April 21    pgDay Paris (Paris, France)
April 21-April 23    Open Source Data Center Conference (Berlin, Germany)
April 23    Open Source Day (Warsaw, Poland)
April 24    Puppet Camp Berlin 2015 (Berlin, Germany)
April 24-April 25    Grazer Linuxtage (Graz, Austria)
April 25-April 26    LinuxFest Northwest (Bellingham, WA, USA)
April 29-May 2    Libre Graphics Meeting 2015 (Toronto, Canada)
May 1-May 4    openSUSE Conference (The Hague, Netherlands)
May 2-May 3    Kolab Summit 2015 (The Hague, Netherlands)
May 4-May 5    CoreOS Fest (San Francisco, CA, USA)
May 6-May 8    German Perl Workshop 2015 (Dresden, Germany)
May 7-May 9    Linuxwochen Wien 2015 (Wien, Austria)
May 8-May 10    Open Source Developers' Conference Nordic (Oslo, Norway)
May 12-May 13    PyCon Sweden 2015 (Stockholm, Sweden)
May 12-May 14    Protocols Plugfest Europe 2015 (Zaragoza, Spain)
May 13-May 15    GeeCON 2015 (Cracow, Poland)
May 14-May 15    SREcon15 Europe (Dublin, Ireland)
May 16-May 17    11th Intl. Conf. on Open Source Systems (Florence, Italy)
May 16-May 17    MiniDebConf Bucharest 2015 (Bucharest, Romania)
May 18-May 22    OpenStack Summit (Vancouver, BC, Canada)
May 18-May 20    Croatian Linux User Conference (Zagreb, Croatia)
May 19-May 21    SAMBA eXPerience 2015 (Goettingen, Germany)
May 20-May 22    SciPy Latin America 2015 (Posadas, Misiones, Argentina)
May 21-May 22    ScilabTEC 2015 (Paris, France)
May 23-May 24    Debian/Ubuntu Community Conference Italia - 2015 (Milan, Italy)
May 23-May 25    Wikimedia/MediaWiki European Hackathon (Lyon, France)
May 30-May 31    Linuxwochen Linz 2015 (Linz, Austria)
June 1-June 2    Automotive Linux Summit (Tokyo, Japan)
June 3-June 5    LinuxCon Japan (Tokyo, Japan)
June 3-June 6    Latin American Akademy (Salvador, Brazil)
June 5-June 7    PyCon APAC 2015 (Taipei, Taiwan)
June 8-June 10    Yet Another Perl Conference 2015 (Salt Lake City, UT, USA)
June 10-June 13    BSDCan (Ottawa, Canada)
June 11-June 12    infoShare 2015 (Gdańsk, Poland)
June 12-June 14    Southeast Linux Fest (Charlotte, NC, USA)

If your event does not appear here, please tell us about it.

Page editor: Rebecca Sobol


Copyright © 2015, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds