Leading items
Lightbeam for Firefox
In early 2012, Mozilla released Collusion, an experimental extension for Firefox that recorded third-party web tracking and rendered its information in a graph-like visualization. Since then, the add-on has undergone further development, and Mozilla announced its first major update on October 25. In a move that will come as no shock to Mozilla watchers, the project has been renamed, but it also now offers additional ways to dig into the tracking data that the extension collects, and users can now choose to contribute their tracking statistics to a data survey Mozilla is conducting on web privacy.
The rebranded extension is now called Lightbeam, and is compatible with Firefox 18 and newer. Lightbeam works by recording third-party HTTP requests in the pages visited with the browser, noting requests that match a list of web tracking services and advertisers. The list itself originally came from privacychoice.org. Each tracker encountered is saved in a local record that notes which domain linked to the tracker, whether cookies were set by the request, whether the request was made via HTTP or HTTPS, whether the user has actually visited the tracker domain in question, and a few other details about the connection.
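The per-request record described above can be pictured with a small sketch. This is not Lightbeam's actual code or storage format: the `Connection` fields, the `record_request` helper, and the `KNOWN_TRACKERS` set are hypothetical names chosen to mirror the details the article lists, and comparing bare hostnames is a simplification of whatever domain matching the extension really does.

```python
from dataclasses import dataclass
from typing import Optional, Set
from urllib.parse import urlsplit

# Hypothetical stand-in for the tracker list; the real list is not reproduced here.
KNOWN_TRACKERS = {"tracker.example", "ads.example"}


@dataclass(frozen=True)
class Connection:
    source: str           # domain of the page that triggered the request
    target: str           # third-party domain that was contacted
    cookie_set: bool      # whether the request set a cookie
    https: bool           # True for HTTPS, False for plain HTTP
    target_visited: bool  # has the user visited the target domain directly?
    known_tracker: bool   # does the target appear on the tracker list?


def record_request(page_url: str, request_url: str, cookie_set: bool,
                   visited: Set[str]) -> Optional[Connection]:
    """Return a Connection for a third-party request, or None if first-party."""
    page = urlsplit(page_url)
    req = urlsplit(request_url)
    if req.hostname == page.hostname:
        return None  # same host as the page: not a third-party request
    return Connection(
        source=page.hostname,
        target=req.hostname,
        cookie_set=cookie_set,
        https=(req.scheme == "https"),
        target_visited=req.hostname in visited,
        known_tracker=req.hostname in KNOWN_TRACKERS,
    )
```

A real implementation would presumably group hosts by registered domain rather than exact hostname, which this sketch does not attempt.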
You're on candid camera
The first visualization of this data (and the only option in the Collusion-branded release) was a connected graph, with edges connecting the domains that a user has intentionally visited to all of the tracker domains that come along for the ride. Not only does that visualization immediately show which sites are the worst third-party tracking offenders, but it also reveals how many independent sites are sending data back to the same tracking service. This is an educational experience; most users are aware that large web service companies like Google and Facebook do web tracking, but seeing the full list of third-party, business-to-business web-tracking sites without household names is another thing altogether.
The graph is updated in real time, which makes for interesting (and arguably creepy) viewing in a separate window while one goes about the day's surfing. For example, over the past 48 hours, Lightbeam indicates that I have visited 94 sites, which have in turn collectively reached out to 177 third-party sites. Not all of those third-party sites are trackers, of course: those that are known to be tracking services are rendered as blue-outlined nodes, while others are rendered with white outlines. The graph shows visited sites as circles and third-party sites as triangles.
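The observation that many independent sites report to the same third party falls out of the graph structure directly. The following is a minimal sketch, assuming the recorded data has been reduced to (visited site, third-party site) pairs; the function name and sample domains are made up for illustration and are not taken from Lightbeam.

```python
from collections import defaultdict


def shared_third_parties(connections):
    """Map each third-party domain to the visited sites that contacted it,
    keeping only third parties reached from more than one visited site."""
    contacted_by = defaultdict(set)
    for site, third_party in connections:
        contacted_by[third_party].add(site)
    return {tp: sites for tp, sites in contacted_by.items() if len(sites) > 1}


# Hypothetical edges: (visited site, third-party site)
edges = [
    ("news.example", "tracker.example"),
    ("shop.example", "tracker.example"),
    ("shop.example", "cdn.example"),
]
shared = shared_third_parties(edges)
```

In this sample, only `tracker.example` is reached from more than one visited site, so it is the only key in `shared`.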
![[Lightbeam]](https://static.lwn.net/images/2013/10-lightbeam-graph-sm.png)
Interested users should note, though, that Lightbeam is still on the buggy side; the project's GitHub issue tracker indicates that quite a few users are encountering compatibility problems with other extensions—particularly those that also deal with third-party web tracking or privacy protection. There also appear to be several Firefox privacy preference settings that break Lightbeam. Ironically, of course, the users most interested in Lightbeam are likely to also be the most interested in tweaking privacy settings and in installing other web-tracking countermeasures, so these unresolved issues greatly impact Lightbeam's usefulness.
When it does work, the graph visualization allows one to focus in on each individual site, which shuffles around the graph connections so that the site of interest is in the center. Clicking on a node also opens a side pane showing the server location and the list of third-party sites that have been contacted. While the graph is revealing from a big-picture perspective, it is not necessarily the easiest way to study the data in depth. Luckily, Lightbeam offers two other views on the same data. The "clock" provides a time-based look at the visits and third-party connections made, and there is also a straightforward list.
Users can also save the extension's current data locally, or reset it to begin a new capture session. That capability would be most useful for something like comparing the effect that changing the "Do Not Track" preference has on third-party trackers but, in my tests, "Do Not Track" was one of the preferences that stopped Lightbeam from working altogether when enabled.
Spies like us
Better visualization features are nice, but arguably the biggest change in the revamped extension is the fact that Lightbeam users can opt in to send their data to Mozilla. That might sound a tad oxymoronic, but Mozilla insists that the data collected from users will be anonymized and will only be published in aggregate. For instance, only the domain and subdomain of visited sites are recorded, not the path component of URLs, so in most cases usernames and other personally identifiable data will not be included.
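As a rough illustration of what recording only the domain and subdomain means, the sketch below reduces a URL to its host with Python's standard `urllib.parse`. The `host_only` helper and the sample URL are hypothetical; this shows the idea, not how Lightbeam or Mozilla actually process submissions.

```python
from urllib.parse import urlsplit


def host_only(url: str) -> str:
    """Keep just the host (domain and subdomain); drop scheme, port,
    credentials, path, query string, and fragment."""
    return urlsplit(url).hostname or ""


reduced = host_only("https://social.example/users/alice?session=123#profile")
```

Here `reduced` is `"social.example"`: the username in the path and the session identifier in the query string are gone. A username that appears in the subdomain itself would survive, which is presumably why the claim is limited to "most cases".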
![[Lightbeam]](https://static.lwn.net/images/2013/10-lightbeam-clock-sm.png)
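In September 2012, Mozilla's David Ascher wrote about the data-collection effort, saying that it would look at the "different kinds of uses of shared third-party HTTP requests", so that users can better understand what types of request are in their interest and which are not. He also said that Mozilla intended to use the data to work with site publishers. The blog post announcing the October Lightbeam release followed up on this idea, albeit with few specifics. The post says only that "Once the open data set has time to mature, we’ll continue to explore how publishers can benefit from additional insights into the interaction of third parties on their sites."
There indeed may be a lot about third-party tracking (both commercial tracking services and tracking that is performed surreptitiously) that site owners are generally in the dark about. Still, it would be nice to have more detail available about exactly what the data collection effort at Mozilla will look like before opting in to it. In the meantime, though, Lightbeam definitely does "pull back the curtain" (as the blog post puts it) on web tracking for individual users. Hopefully as Mozilla pursues a broader effort, the results will be enlightening for users of the web in general, most of whom know that "web tracking" exists in some form, but for whom its full extent and methods remain a mystery.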
The kernel maintainer gap
Wolfram Sang is worried that the number of kernel maintainers is not
scaling with the number of patches flowing into the mainline. He has
collected some statistics to quantify the problem and he reported those
findings at the 2013 Embedded
Linux Conference Europe. There is not an imminent collapse in the
cards, according to his data, but he does show that the problem is already
present and he forecasts that it will only get worse.
Sang's slides
[PDF] had numerous graphs (some of which are reproduced here). The
first simply showed the number of patches that went into each kernel from
3.0 to 3.10. As one would guess, the trend is an increase in the number of
patches. Companies are working more with the upstream kernel than they
have in the past, he said, which is great, but leads to more patches.
There are also, unsurprisingly, more contributors. Using the tags in Git
commits, Sang counted the number of authors and contrasted that with the
number of "reviewers" (using the "Committer" tag) in his second graph.
Over the 3.0–3.10 period, the number of authors rose by around 200, which
is, coincidentally, roughly the (largely static) number of reviewers. That
means that the gap between authors and reviewers is getting larger over
time, which is a basic outline of the scaling problem, he said. If we want
to maintain Linux with the quality we have come to expect, it is a problem
we need to pay attention to.
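The author-versus-committer comparison can be approximated with a few lines of scripting. The sketch below is not Sang's tooling: it assumes the input comes from `git log --format=%ae%x09%ce <range>` (author email, a tab, committer email), and the `count_roles` and `git_roles` helpers are hypothetical names. Whether Sang counted by name or email, or excluded merge commits, is not stated in the talk report.

```python
import subprocess


def count_roles(log_text: str):
    """Count distinct authors and committers in tab-separated log output."""
    authors, committers = set(), set()
    for line in log_text.splitlines():
        if not line.strip():
            continue
        author, _, committer = line.partition("\t")
        authors.add(author)
        committers.add(committer)
    return len(authors), len(committers)


def git_roles(rev_range: str, repo: str = "."):
    """Run git over a revision range such as 'v3.9..v3.10' and count roles."""
    out = subprocess.run(
        ["git", "-C", repo, "log", "--format=%ae%x09%ce", rev_range],
        check=True, capture_output=True, text=True,
    ).stdout
    return count_roles(out)


# Hypothetical log excerpt: three authors funnel through two committers.
sample = "a@dev.example\tm1@kernel.example\n" \
         "b@dev.example\tm1@kernel.example\n" \
         "c@dev.example\tm2@kernel.example\n"
n_authors, n_committers = count_roles(sample)
```

Running `git_roles` over successive release ranges and plotting the two counts would give a picture comparable to the gap described above, under the stated assumptions.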
His statistics are based on accepted patches and don't include superseded
patches or bogus patches that might require a lengthy explanation to the
author. There is also a fair amount of education that maintainers often do
for new developers. All of that takes additional time beyond what
a raw number of patches will show, he said. While his graphs start at 3.0, he
does not want to give the impression that the maintainer workload was in a
good state at that time, "it was challenging already" and it is getting worse.
Trond Myklebust gave the best definition of a maintainer that Sang
knows of. According to that definition, the job is made of five separate
roles: software
architect, software developer, patch reviewer, patch committer, and
software maintainer. It is not easy to rip out any of those tasks to
distribute them to other people. The "number one rule is to get more
maintainers who like to do all of these jobs at once".
The right maintainer will enjoy all of those jobs, which makes good candidates
fairly
rare. Sang suggested that developers remember that maintainers have all
those roles when interacting with them. He doesn't mean that developers
should obey maintainers all
the time, he said, but they should keep in mind that the maintainer may be
wearing their
architect hat so they may be looking beyond the direct problem the
developer is trying to solve.
The "Reviewed-By" and "Tested-By" tags are quite helpful to him as a maintainer
because they indicate that the patch is useful to others beyond just its
author. That led him to look at the stats for those tags. He plotted the
number of reviewers and testers who were not also committers to try to
gauge that pool. That graph appears above at right, and shows that there
are around 200 reviewers and 200 testers for each kernel cycle as well.
The trend is much like that of maintainers, so there is still an increasing
gap. The reviewers and testers "are doing great work", but more of them
are needed as well.
Using a diagram of the different pieces in a typical ARM system on chip
(SoC), Sang showed that there are many different subsystems that go into a
kernel for a particular SoC. He wanted to look at those subsystems to see
how well they are functioning in terms of how quickly patches are being
merged. He also wanted to compare the i2c subsystem that he maintains to
see how it measured up.
Using the "AuthorDate" and "CommitDate" tags on patches from 3.0 to 3.10, he
measured the latency of patches for several different subsystems (the
combined graph is shown at right). That metric can be inaccurate as
a true measure of latency if there are a lot of merge commits (as the
CommitDate may not reflect when the maintainer added the patch), but that
was not really a problem for the subsystems he looked at, he said.
He started with the
drivers/net/ethernet subsystem, which had some 5000 patches over the
kernel releases measured. It has a fairly low latency, with 85% of the
patches being merged within 28 days. This is what developers want, he
said, a prompt response to their patches. In fact, 70% of patches were
merged within one week for that subsystem.
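The latency metric is simple to reproduce in outline. This sketch assumes input from `git log --format=%at%x09%ct <range> -- <path>`, which prints the author and committer dates as Unix timestamps separated by a tab; the `merged_within` helper and the sample data are made up for illustration. As noted above, CommitDate is only a proxy for when the maintainer applied the patch.

```python
def merged_within(log_text: str, days: int) -> float:
    """Fraction of commits whose CommitDate is at most `days` after AuthorDate."""
    limit = days * 86400  # seconds per day
    total = within = 0
    for line in log_text.splitlines():
        if not line.strip():
            continue
        authored, committed = (int(field) for field in line.split("\t"))
        total += 1
        if committed - authored <= limit:
            within += 1
    return within / total if total else 0.0


# Hypothetical data: four patches merged after 1, 6, 20, and 40 days.
base = 1_000_000
sample = "\n".join(f"{base}\t{base + d * 86400}" for d in (1, 6, 20, 40))
week = merged_within(sample, 7)
four_weeks = merged_within(sample, 28)
```

For the sample data, half of the patches land within a week and three quarters within 28 days; the same two thresholds are the ones quoted for the Ethernet drivers.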
Looking at the mtd subsystem shows a different story. Sang was
careful to point out that he was not "bashing" any subsystem or maintainer
as they all do "great work" and do what they can to maintain their trees.
After 28 days, mtd had merged just over 50% of its patches. Those
might be more complicated patches so they take more review time, he said.
That is difficult to measure. After one three-month kernel cycle, about
80% of the patches were merged.
Someone from the audience spoke up to say that many of the network driver
patches get merged without much review because there is no dedicated
maintainer. That makes the latency lower, but things get merged with lots
of bugs. Sang said that is one way to deal with a lack of reviewers: if
the network driver patch looks "halfway reasonable" and no one complains,
it will often just be merged. If anyone is unhappy with that state of
affairs, they should volunteer to help, another audience member suggested.
Sang agreed, and said that taking on the maintenance of a single driver is
a good way to learn.
His subsystem is somewhere in between net/ethernet and
mtd. That is "not bad", he said, for someone doing the
maintenance in their spare time. But for a vendor trying to get SoC
support upstream, it may not be quick enough.
The dream, he said, would be for all subsystems to be more like the
Ethernet drivers without accepting junk patches. His belief is that over
time the latency in most subsystems is getting worse. In fact, his
"weather forecast" is that we will see more and more problems over time,
either with increased latency or questionable patches going into the trees.
So, what can be done to help out maintainers? For users, by which he means
people who are building kernels for customers, not developers, necessarily,
or regular users, he recommends giving more feedback to subsystem
maintainers. Commenting on patches, testing them, and describing any
problems found will help. Add a Tested-By tag if you have done
that, as well. If there is no reaction to a patch on the mailing list and
seemingly no interest, it makes his job of deciding whether the
patch is worthwhile difficult. If you are using a patch that hasn't been merged,
consider resending it, but check to see if there are open issues from when
the patch was posted. Sometimes there are simple style changes needed that
can be easily fixed.
For developers, he recommends trying to get the patch right the first time
and thus reducing the number of superseded patches. Not knowing the subsystem and
making mistakes that way is reasonable, but sloppy patches are not. In
addition, if you know a patch is a suboptimal solution to the problem, be
honest about it. Don't try to sell him something that you know is bad.
Sometimes a lesser solution is good enough, but a straight explanation
should accompany it.
He also recommends that developers take part in the patch QA process by
reviewing other patches. In fact, he said, you should also review your own
patches as if they came from someone else—it is surprising what can be
found that way. Taking part in the mailing list discussions, especially
those that are about the architecture of the subsystem, is important as
well. It is difficult to determine which way to go, at times, without
people stating their opinions.
Maintainers should not necessarily work harder, Sang said, as most are
working hard already. It is important to watch out for burnout as no one
wins if that happens. SoC vendors are "constantly pressing the
fast-forward button" by releasing hardware faster and faster, so you may reach a
point where you simply can't keep up. That may be time to look for a
co-maintainer.
Having the right tools is an important part of being a maintainer. There
is no "ready-made toolbox" that is handed out to new maintainers, but if
you talk to other maintainers, they may have useful tools. Keyboard
shortcuts, Git hooks for doing auto-testing, tools to handle and send out
email, and so on are all time savers. "Pay attention to the boring and
repetitive tasks" and try to automate them.
Organizations like the Linux Foundation (LF), Linaro, SoC makers, and others
have a role to play as well. If they already have developers, it is
important to allow those developers to review patches and otherwise
participate in kernel QA. That will improve the developers' skills, which
will help the organization, and it will improve the kernel too.
It is important to educate new kernel developers internally about the
basics of kernel submissions. He is much more lenient with someone who he
knows is working on his own than he is with those working at companies
where there are multiple folks who already know about submitting patches
and could have passed the knowledge on.
Increasing the number of maintainers would help as well. It might be
easier for people to take on maintainer responsibilities if it were part or
all of their job to do so. Sang believes that being a maintainer should
ideally be a full-time paid position, but that is often not the case. He
does it on his own time, as do others, and some do it as part, but usually not
all, of their job. A neutral party like the LF might be desirable as the
employer of (more) maintainers, but other organizations or companies could
also help out. In his mind, it is the single most important step that
could be taken to improve the kernel maintainer situation.
He went back to the SoC diagram he showed early on, but this time colored
the different subsystems based on whether the maintainer was being paid to do
that work. Red meant that the maintainer was doing it in their spare time,
and there were quite a few subsystems in that state. That is somewhat risky for
SoC vendors trying to get their code upstream. Ideally, most or all of the
diagram would be green (maintainer paid to do it) or yellow (part of the
maintenance time is paid for). Sang ended by saying that having full-time
maintainers was really something whose time had come and he is optimistic
that more of that will be happening soon.
[I would like to thank the Linux Foundation for travel assistance to
Edinburgh for the Embedded Linux Conference Europe.]
Automotive Linux projects getting in gear
The third Automotive Linux Summit (ALS) was held in Edinburgh,
Scotland, concurrently with the Embedded Linux Conference and Kernel
Summit. As has been the case with the previous ALS events, there was
a lot of talk from automakers about developing Linux-based in-vehicle
infotainment (IVI) and embedded control systems—talk that, for
the most part, dealt with efforts that are still several years away
from reaching the showroom floor. Such is simply the nature of the
car business; new car models require a multi-year development cycle
for numerous reasons. But what was more interesting in the schedule
of the Edinburgh event was tracking the progress of the many
sub-projects on which a Linux-based IVI system depends. Over the past
two years, the community has seen the topics move steadily from what
an IVI system should include to active projects with working
code.
GENIVI opening up
The leading example of this progress comes from GENIVI, the car-industry alliance
working on a Linux-based IVI middleware layer. In 2012, GENIVI launched its first open source projects:
an audio routing manager, a graphics layer manager, and a diagnostic
logging tool. Since that time, the stable of GENIVI projects has
grown to fifteen. GENIVI's Philippe Gicquel delivered one of the keynote talks on the
first day of ALS, starting off with a description of how GENIVI
sees itself fitting into the expanding community of Linux-driven
automotive software projects. The alliance's target, he said, is the
non-differentiating components of the IVI stack: those bits on which car
makers and tier-one suppliers would rather not compete head-to-head
since they are largely invisible to consumers. In particular, though,
GENIVI is pursuing a specific set of domains: multimedia and graphics,
connectivity with consumer electronics devices (e.g., phones),
location-based services, and integrating Linux with existing
automotive standards (such as diagnostics). Each domain has its own
working group at GENIVI, which are collectively coordinated by
GENIVI's System Architecture and Baseline Integration teams.
Gicquel then briefly discussed the active software projects.
GENIVI's standard practice is to work upstream, he said, and it
contributes to Wayland and other projects, but it does maintain its
own IVI-specific branches of projects when a full merge upstream is
not possible. For example, he said, the IVI Layer
Manager project (one of the first three)
maintains a patch set that makes Wayland GENIVI compliant, since its
use case is not important to 95% of Wayland systems.
The newer GENIVI projects include several that build on upstream
work, such as AF_BUS,
which is a latency-reducing optimization of D-Bus. Node
Startup Controller (NSC) and Node State
Manager (NSM) are components used to manage application lifecycles in
the vehicle. NSC extends systemd to handle rapid startup and shutdown
of applications and system services (where often the maximum allowable
time is a legal requirement). NSM provides a state machine to track
applications' lifecycles.
Other projects are original. Persistence
Management is a library to handle storage of data that needs to
persist across reboots, and, like NSC, is designed to cope with the rapid
system shutdowns expected in automotive environments. IPC CommonAPI C++
is a set of C++ language bindings that abstracts several
inter-process communication APIs into one (the "common API").
Currently D-Bus, SOME/IP, and Controller Area
Network (CAN) bus are the IPC mechanisms targeted, but support for
others may be added later. Smart Device
Link (SDL) is a framework for smartphone applications to run
remote user interfaces on a car's dash head unit.
There are also several developer tools in GENIVI's projects. YAMACIA is a plugin for
the Eclipse IDE that supports the IPC CommonAPI and the Franca
interface definition language used by the CommonAPI, while LXCBench is an
analysis tool that benchmarks the performance of applications run in
Linux containers.
Finally, there are several "proof of concept" (POC) projects that
implement basic functionality to demonstrate APIs or features that
GENIVI expects vehicle OEMs to replace. The browser POC is a
Qt-and-WebKit based browser component. Similarly, the Tuner Station
Manager demonstrates the use of the radio tuner API, and the Point-Of-Interest
Service is a demonstration of the location-based services group's
point-of-interest (POI) API. Web API
Vehicle is an HTML5 application interface toolkit designed as a
proof-of-concept to show application developers how to access various
W3C web APIs in a vehicle.
Gicquel estimated that there were about 75 active contributors to
the various GENIVI projects at present, more than 20 code
repositories, and that they had written 500,000 lines of code so far.
The alliance recently appointed longtime GENIVI developer Jeremiah
Foster as the developer community manager, and is looking to open up
even more of its development processes in the coming months. There
are mailing lists for every project, he said, but the alliance is
still in the process of "moving from silos to a community."
Automotive Grade Linux
The other major development community showing off new work at ALS
was the Linux Foundation's Automotive Grade
Linux (AGL) workgroup, which builds its software stack on top of
the IVI flavor of Tizen.
How AGL, GENIVI, and Tizen IVI fit together (or don't) was
certainly a source of confusion in years past, but as each of the
projects has released more code, the picture has become clearer.
Intel's Brett Branch presented a session on the first day of ALS that
dealt with that very question (among others); as he explained, GENIVI
provides a middleware layer, but vehicle OEMs will use GENIVI code in
conjunction with a Linux base system to build a final
product—and will likely include other components as well, such
as AUTOSAR-compliant
components for safety-critical systems.
Indeed, while GENIVI now offers
two distinct baseline Linux distributions built on different platforms
(an
x86-based system built on Baserock
and an ARM
system built with Yocto), Tizen IVI rolls a
single release, which includes components from GENIVI projects, code
from elsewhere, and code written within the Tizen project itself.
At ALS 2013, there were a variety of talks about Tizen IVI
components. Intel's Ossama Othman discussed the project's work
implementing several new standards for mobile device integration. The
driving force, he said, is consumers' desire to seamlessly move
between their smartphone screen and their IVI unit for common
applications like music playback or mapping. Othman's primary focus
is implementing the display integration, where, he said, there are
three main standards: SDL, MirrorLink, and Miracast. MirrorLink and Miracast both work by cloning the smartphone's
display to the IVI head unit, but they differ considerably otherwise.
First, MirrorLink is a closed specification created by the Car
Connectivity Consortium (CCC), while Miracast is an open standard created by
the WiFi Alliance. Second, MirrorLink explicitly handles sending
input events from the IVI head unit (e.g., gestures and touch input)
back to the smartphone, but Miracast is a one-way, display-only
standard. Miracast applications are thus left on their own to
send data from the head unit back to the mobile device through some
other means. Third,
MirrorLink requires a USB connection between the device and head unit,
whereas Miracast (as one might expect) is wireless—although it
consumes a lot of bandwidth. Miracast also mandates use of the
non-free H.264 video codec, which makes a "clean" free software
implementation impossible. But MirrorLink requires every
MirrorLink-compatible application to go through a certification
process.
But car makers might be willing to foot the bill for H.264 licenses
or MirrorLink certification, so Tizen IVI explores both. At the
moment, Othman has implemented as much of Miracast as he can; many of
the necessary pieces are out there already (such as support for
WiFi Direct in the kernel and in wpa_supplicant), but for others he is still in the process of persuading
people to release the necessary code under an open license. The Tizen
IVI releases do ship with hardware support for H.264 playback, he
said, and they have support for some WiFi Direct hardware.
The MirrorLink situation is far more bleak; there appear to be no
open source implementations at all, but Othman said he is still
looking. He also noted that it was unclear to him whether there were
ways to get access to MirrorLink documentation without paying for
membership in the CCC.
The third standard, SDL, is the most feasible to implement. The
open source code (which is hosted at GENIVI) was a donation from Ford,
but Othman said it was difficult to integrate. So far, Ford has not
participated in the project since making the initial code
contribution, and the existing code is "packaged weirdly."
For example, it includes hardcoded links to
libraries not included in Tizen and to specific
executables (like Google Chrome), it statically links all of the
libraries it uses, and it generates a number of symbol conflicts and
other errors when compiled. Othman said he has not been able to
figure out which version of g++ Ford used to compile it internally,
but that it is clearly an out-of-date one. Nevertheless, he was able to patch
it and integrate it into the latest Tizen releases.
The good news is that SDL is the most functional of the three; it
implements a full remote display (not simply cloning the device
screen), handles input events, and allows application developers to
implement separate interfaces for the device and IVI screens.
Patrick Ohly gave a related presentation (which Othman encouraged
his audience to attend) about the work he has done in Tizen IVI to
integrate the synchronization of personal information (PIM) data like
calendars and contacts databases between smartphones and IVI head
units.
The Tizen IVI PIM stack is based on GNOME tools: Evolution
Data Server, Folks, and
SyncEvolution. This is a different stack than the one used in the
Tizen smartphone releases, which does not address synchronization.
Adapting the GNOME stack to Tizen IVI did require several changes, he
said, such as storing all data in an SQLite database, and he has added
several other useful tools, such as libphonenumber, an
Android library for normalizing phone numbers. SyncEvolution is
configured to support a configurable set of address books on different
devices; it can thus present a unified address book for all phones
synced to the IVI unit, but can keep them synchronized separately. He
is currently working on CardDAV and CalDAV support (including
supporting Google's PIM services), as well as support for the
Bluetooth Phone Book Access Profile.
Down the road
To the smartphone development community, address book
synchronization may sound like old news. But it and the other
projects on display at ALS are significant because they reveal how
much progress has been made in recent months. At the 2012 ALS, the
biggest news was the availability of AGL Demonstrator, a demo IVI
system that sported many UI mock-ups that would be filled by real
applications today. Indeed, between the Tizen IVI milestone releases
and GENIVI's base systems, there were three Linux-based IVI systems
available this year.
There were still carmakers on stage discussing IVI systems that
they had built but that were not yet available, but the tenor of the
discussion has changed. In his keynote address, Jaguar Land Rover's
Matt Jones commented that being in the IVI business meant that the
company was expected to pony up membership fees for a wide assortment
of industry alliances: AUTOSAR, CEA, DLNA, Bluetooth SIG, ERTICO, and
many more. Together the membership fees add up to $800,000 or more,
he said, but through its involvement with GENIVI and Tizen he has seen
far more return for the money that the company has put into open source
projects. He ended that discussion by inviting anyone with an
automotive open source project to come talk to him about it, since
that is what he would prefer to invest his budget in. It will be
interesting to see just how much bigger the playing field is at the
next ALS, whether from Jones's investment or elsewhere.
[The author would like to thank the Linux Foundation for travel assistance to Edinburgh for ALS.]
Page editor: Jonathan Corbet