
Leading items

Michael Meeks talks about LibreOffice and the Document Foundation

By Jake Edge
September 28, 2010

A group of OpenOffice.org developers has announced the creation of an independent foundation - called the Document Foundation - to guide the further development of the office suite, which is provisionally named LibreOffice. At the heart of this effort is longtime OpenOffice.org developer Michael Meeks. We had the good fortune to discuss the LibreOffice effort with Michael; read on for his comments on this new initiative.

LWN: Probably the first question that will come to mind for most of our readers is "Why?" — why fork OpenOffice.org? And why now?

Well, it has been ten years since a foundation was promised as part of the original OpenOffice.org announcement, and there is now a confluence of circumstances to realise that goal. We want a vendor neutral body that lots of companies and non-profits can contribute to as peers. That foundation is called the Document Foundation, and for trademark reasons our product will be called LibreOffice.

LWN: What do you see as the advantages of LibreOffice for OpenOffice.org users? developers? distributions?

For developers, we are open for business - we have a realistic view of the code-base and as such we are interested in including people's fixes and improvements quickly. When we can get people working to clean up the code, translate German comments, remove dead code, fix ergonomic nits, write unit tests and so on - we are optimistic that we can produce a far better product, and one that (as developers) we can be proud of.

Linux distributions should find LibreOffice easier to package, as the development team has a vast amount of Linux distribution experience.

All of that of course leads to getting a better, more stable, and featureful office suite into users' hands.

LWN: Do you plan to require copyright assignment or contributor agreements? If so, what would those entail? And if not, why not?

There are no plans to require copyright assignment. Clearly it is important to determine the origin of all code, so we will use a clear signing-off / attribution trail, and familiar git tooling to make that easy.

Having to sign formal paperwork before contributing code is clearly a formidable barrier to entry, even if the rights end up with a well-governed non-profit. In contrast I believe LibreOffice needs an "All Contributions Welcome and Valued" sign outside, that says come in and help, there is a place for you here.

LWN: What are the near-term technical and community goals for the project? What about the longer-term?

In the near term, we expect to clean-up the code; we have a set of janitorial tasks that require (in some cases) no previous programming experience whatsoever, e.g. removing commented out code that was just left lying around (presumably due to a lack of faith in revision control). If you want to get the eternal glory of having your name in the LibreOffice code-base, now is a great time to get involved.

We also want to tackle many of the problems that have traditionally made the code hard to develop with, such as the arcane and monolithic build system.

Finally - there are a lot of ergonomic nits in OpenOffice, that individually are easy to fix but collectively add up to a big problem. We want to start tackling these in the short term.

Longer term - we are developing a plan, but somehow our press experts persuaded us to delay announcing it; expect to hear more around the Linux Plumbers Conference.

LWN: When might we expect the first LibreOffice release? Presumably it will incorporate the patches that go-oo has been maintaining, but are there patches from elsewhere that might make their way into the first release or two? Any exciting features on the horizon that we haven't seen in go-oo yet?

We have already released a beta. It is a distinct piece of code from go-oo for several reasons, the most important being that we don't want to maintain patches anymore. Go-oo was maintained as a set of patches, such that features could be disabled per-platform or per-distribution simply by not applying them - but this brings maintenance and development problems of its own.

Instead with LibreOffice we will have several flat git repositories, such that the git diff output will be your patch, and committing is as simple as a git push. Of course many of the go-oo features have been merged, some are still pending review, and going forward go-oo will be obsoleted by LibreOffice.

LWN: Does LibreOffice plan to track OpenOffice development and incorporate changes from that code base or does it plan to go completely in its own direction? Or will there be a gradual shift from one to the other?

Clearly we are going to merge all (suitably licensed) code into the project from anywhere we can get it. Previously we would work from whatever Oracle released, but in future we will pick and choose the best changes and features from wherever they come.

LWN: Are you at all concerned about maintaining such a large body of code without the resources of a large company like Sun or Oracle behind the effort?

Clearly Oracle's contribution is real and substantial, and we would dearly like them to participate in the Document Foundation; a warm welcome is extended to them. Nevertheless - both Novell and Red Hat have support capabilities around OpenOffice.org and are confident that we can fix and improve the code. Clearly, having dependence on any single company to support or drive the project is a huge risk factor. There is a perception out there that the code is terribly tangled and impossible to develop with, but the reality is that it is just code. Sure, you have to read some parts quite carefully, and empathise deeply with the authors before altering them, but this is true of all large pieces of code.

LWN: There have been occasional hints that Sun had patents on some StarOffice/OpenOffice components and we have seen that Oracle is not terribly shy about patent litigation; does the project have any concerns about patents or patented technology in the codebase?

The OpenOffice.org code-base that LibreOffice is derived from is licensed under the LGPLv3 - which gives us all a strong explicit patent license, and a good copyright license, so no. Clearly for new code we would want a plus ["or any later version"] license, so we are considering recommending an LGPLv3+ / MPL combination for entirely new code.

LWN: Who is involved with this new LibreOffice project? Undoubtedly there were individuals besides yourself, along with companies, and perhaps other groups; what can you tell us about who they are and what their roles will be?

Oh certainly - I, and Novell, are only a small part of this effort. A large proportion of the non-Oracle OpenOffice.org community is of like mind, and has been instrumental in helping to create LibreOffice. I anticipate the Foundation we create ultimately looking more like the GNOME Foundation than the Mozilla Foundation, i.e. with only a small staff for co-ordination, rather than for central development. I hope we will have similar elections of contributors for representatives and so on.

There is a list of people behind the foundation on the LibreOffice web-site; if I start naming them all we will run out of space pretty quickly. Of course, there are also a good number of heroes who managed somehow to get their code and fixes into an OpenOffice product in the past, who should find it a pleasure to contribute in future.

LWN: Have you had any discussions with Oracle about any of this? You are inviting them to join forces with the new project; have they expressed any interest, either formally or informally?

Clearly we have informed Oracle's StarDivision management ahead of time, as is only polite. As to their reaction - I have many developer friends in StarDivision whom I respect and have loved collaborating with in the past. My hope is that we will work together again.

[ We would like to thank Michael for taking the time to answer our questions. ]


The impact of the HDCP master key release

September 29, 2010

This article was contributed by Nathan Willis

On September 13, a file appeared on the Pastebin clipboard-sharing site claiming to contain the "master key" for the High-bandwidth Digital Content Protection (HDCP) encryption system used to restrict digital audio and video content over participating HDMI (High-Definition Multimedia Interface), DisplayPort, and other connections. Intel, which developed the HDCP system internally and now sells licenses to it through its subsidiary Digital Content Protection (DCP), confirmed to the press on the 17th that the key is legitimate. What the development means for open source software is not clear, however. It stands as yet another example of how digital content-restriction schemes consistently promise "protection" that they cannot deliver, but it is not an open door for free access to media that comes in encrypted formats, such as Blu-ray discs.

Primarily this is because HDCP is not the encryption scheme used to scramble content delivery — either on optical disc or delivered to the home via satellite or cable. Rather, HDCP is used exclusively to encrypt the video output signal from the playback source (such as an optical disc player or a cable converter box) to the display. HDCP "protects" the signal both by encrypting it during transmission, and by allowing each device to perform an authentication check against the device on the other end of the connection. A side effect of the scheme is that home theater enthusiasts complain of sometimes lengthy delays when switching from one HDCP-compliant video source to another while the devices step through the HDCP handshake process.

HDCP under the hood

Computer scientist Edward Felten posted an explanation of the HDCP security model on Princeton's Freedom to Tinker blog shortly before Intel verified that the key was indeed genuine. In a nutshell, the HDCP handshake process begins with a key exchange protocol using Blom's scheme. Each licensed HDCP device has a public key and a private key; all of the private keys are generated (in advance) from the public key combined with a secret master key kept by DCP.

That key was the array posted to Pastebin on September 13. It allows anyone to generate a perfectly valid private key at their leisure. Therefore, anyone can correctly perform the handshake, exchange keys with a licensed HDCP device, and decrypt the video signal sent over the cable. No "key revocation" or blacklisting scheme can prevent such an attack, since would-be attackers can now generate as many fresh, valid keys as they like.
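The mechanics are easy to demonstrate with a toy version of Blom's scheme. The sketch below is illustrative only: the parameters are made up (real HDCP uses 40-element key vectors and 56-bit arithmetic), but it shows why possession of the master matrix lets anyone mint keys that every legitimate device will happily accept.

    import random

    # Toy parameters; real HDCP uses 40-element key vectors and 56-bit values.
    N = 5
    MOD = 2**16

    def make_master():
        # The secret symmetric matrix held by the licensing authority (DCP).
        S = [[0] * N for _ in range(N)]
        for i in range(N):
            for j in range(i, N):
                S[i][j] = S[j][i] = random.randrange(MOD)
        return S

    def issue_device(S):
        # A device gets a public 0/1 vector and a private vector derived
        # from it using the master matrix.
        pub = [random.randint(0, 1) for _ in range(N)]
        priv = [sum(S[i][j] * pub[j] for j in range(N)) % MOD for i in range(N)]
        return pub, priv

    def shared_key(my_priv, their_pub):
        # Each end combines its own private vector with the peer's public one;
        # symmetry of the master matrix guarantees both ends derive the same key.
        return sum(p * b for p, b in zip(my_priv, their_pub)) % MOD

    S = make_master()
    a_pub, a_priv = issue_device(S)
    b_pub, b_priv = issue_device(S)
    assert shared_key(a_priv, b_pub) == shared_key(b_priv, a_pub)

    # With S in hand, an attacker can mint an endless supply of "legitimate"
    # devices, so no finite revocation list can keep them all out.
    evil_pub, evil_priv = issue_device(S)
    assert shared_key(evil_priv, a_pub) == shared_key(a_priv, evil_pub)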

The fact that the secret master key was exposed does not necessarily mean that some ne'er-do-well stole it, however. As far back as 2001, three years before HDCP received regulatory approval from the FCC, two teams of cryptographers announced that the system was fatally flawed, and that an attacker could discover the master key simply by capturing the public keys — something that all HDCP-compliant devices freely report — from as few as 40 legitimate devices.

One researcher, Niels Ferguson, declined to publish his findings, citing the threat of prosecution under the US Digital Millennium Copyright Act (DMCA). The other group, Scott Crosby et al., did publish their paper [PDF], which also notes the amusing property that reverse-engineering the secret master key can be done with no prior knowledge of the algorithm used to generate keys.

Ferguson noted on his site in 2001, however, that "someday, someone, somewhere will duplicate my results. This person might decide to just publish the HDCP master key on the Internet. Instead of fixing HDCP now before it is deployed on a large scale, the industry will be confronted with all the expense of building HDCP into every device, only to have it rendered useless." On September 14, he updated his HDCP page, saying: "My only question is: what took them so long?"

Practicality

Now that HDCP's authentication requirements and content encryption are irrevocably broken, the question many in the open source software community are asking is whether free software media projects will now have an easier time working around HDCP's restrictions. The short answer is that there is little to no practical advantage gained from a broken HDCP, because it is an encryption measure applied only on the raw video signal sent to the display — i.e., over HDMI, DVI, or DisplayPort cabling.

At that stage, the original source media has been decompressed from its delivery format into an audio stream and a sequence of full-resolution video frames. The bandwidth requirements for the current generation of high-definition content are very high (1920 by 1080 pixels, 24 bits per pixel, 30 frames per second, or approximately 1.49 Gbps for video alone). The open source projects that include video capture, such as MythTV, VLC, VDR, and Freevo, focus either on the capture of standard MPEG-based broadcasts or on supporting embedded hardware that performs MPEG-conversion or other compression of analog signals via a dedicated chip.
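For readers who want to check the bandwidth figure above, the arithmetic is straightforward (the resolution, bit depth, and frame rate are the assumptions quoted above):

    width, height = 1920, 1080
    bits_per_pixel = 24
    frames_per_second = 30

    bits_per_second = width * height * bits_per_pixel * frames_per_second
    print(bits_per_second / 1e9)        # roughly 1.49 Gbps of raw video
    print(bits_per_second / 8 / 2**30)  # roughly 0.17 GiB of storage per second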

One of those embedded capture devices, the Hauppauge HD PVR, does capture full-resolution, high-definition raw video over component inputs. In theory it would be possible to build a similar device that accepted HDCP-locked HDMI input instead, but such a device would either perform the same hardware compression the current devices do (in which case the "bit perfect" copy is lost), or have extremely large, extremely fast storage attached. MythTV's Robert McNamara described the possibility as infeasible.

Doing the same thing with generic PC hardware would not be much easier; there are a few HDMI video capture devices on the market, but the only manufacturer with any Linux driver support at the moment, Blackmagic Design, supplies only binary drivers that do not allow capturing HDCP-copy-protected content.

More importantly, the ability to capture full-resolution, uncompressed video from the HDMI output of a high-definition video player is a moot point considering that the content scrambling schemes employed on the compressed contents of optical discs like HD DVD and Blu-ray are broken as well.

The initial scheme deployed on HD DVD and Blu-ray is called Advanced Access Content System (AACS), and it has suffered numerous key discoveries that allow its decryption. AACS incorporates a key revocation scheme that can lock up new releases with new keys, and which is currently believed to be in either the 17th or 18th round of revocation and key replacement.

Some newer Blu-ray discs are encrypted with a different system called BD+, centered around a small virtual machine in the player which runs VM code included on the disc. The VM code can perform integrity checks to make sure that the player has not been tampered with, force player firmware upgrades, and carry out other security tasks. Nevertheless, at least two proprietary companies sell BD+-stripping software, and there is an open source effort to reverse-engineer the BD+ VM, spearheaded by developers at the Doom9 forums.

High-definition cable and satellite transmissions are protected by other schemes sold by proprietary vendors, including DigiCipher 2, PowerVu, and Nagravision. There appears to be no large-scale interest in reverse-engineering any of these schemes in open source software.

Legal threats

When it verified publicly that the Pastebin key was in fact the HDCP secret master key, Intel spokesman Tom Waldrop levied ominous-sounding threats of legal action against anyone who incorporated the master key into a product, saying "There are laws to protect both the intellectual property involved as well as the content that is created and owned by the content providers, [...] Should a circumvention device be created using this information, we and others would avail ourselves, as appropriate, of those remedies."

Exactly which laws those are was not specified. The key itself could probably be considered a trade secret under US law, and if anyone with access to it disclosed it, he or she could face a civil breach-of-contract lawsuit. Both Waldrop and independent cryptographer Paul Kocher have publicly opined that the key was probably calculated through reverse engineering as Ferguson and Crosby predicted, however.

Nevertheless, any hardware manufacturer that currently produces HDCP equipment has purchased a license from DCP, which would presumably prohibit it from producing a competing product using the leaked master key. What remains unclear at this stage is whether DCP asserts any patents on HDCP, which could be used to mount a legal challenge to any HDCP-bypassing device even from a non-licensee. DCP's web site and the license agreements offered there mention patents among other broad "intellectual property" claims, but do not specify any particular patent grants. The opacity of patent filings and the difficulties of performing an adequate patent search are but two of the flaws in the US patent system already familiar to most readers.

The anti-circumvention provisions of the DMCA are yet another possible legal avenue; section 103 states that "No person shall circumvent a technological measure that effectively controls access to a work protected under this title." Whether or not the completely broken HDCP scheme would be ruled as "effectively" controlling access to a work is a matter of speculation. In recent years, the copyright office has expanded the regulatory list of allowed exceptions to section 103, including specific examples of copying CSS-protected DVD content, but individual court cases continue to rule both ways on whether fair use permits circumvention.

The future

Many people are speculating that the broken-beyond-repair HDCP scheme will lead to new hardware devices, perhaps monitors or video switches that can connect to HDCP content sources but ignore the restrictions imposed from the other end of the cable. That is certainly a possibility, though it could be a while before such products reach the market, and they may initially come from overseas suppliers far from the reach of DCP's legal threats.

From the software angle, however, it is difficult to come up with a scenario in which sidestepping HDCP constitutes a major gain. For video capture applications, it occurs way too close to the final display to be valuable — working around the on-disc scrambling schemes is far faster, and the raw output that might be captured over HDMI must immediately be compressed again to be practically stored. Given that no content sources (cable, satellite, or optical disc) originate in uncompressed formats, this would be a "recompression" anyway, not likely to provide any discernible quality improvement. Perhaps playback applications could fake being a licensed HDCP source, but what good is that, when HDCP is broken? In addition, display devices are all considered downstream from content sources; adding HDCP encryption would not make a signal more widely viewable, only less. Nevertheless, on September 29, two developers posted some BSD-licensed code implementing HDCP in software, so time will tell if the global software community finds it useful.

In conclusion, as the world says goodbye to HDCP, it is probably worth noting that the technology did little or nothing to actually prevent the unauthorized copying of digital audio and video content, so it is logically befitting that its passing will probably have little effect either. Whether the consumer electronics and entertainment industries learn a lesson from its brief lifespan or not is another matter entirely. DCP is already promoting a newer product called HDCP 2.0, which it advertises as being based on public key RSA authentication and AES 128 encryption, targeting wireless audio/video transmission standards. I have not yet found any serious cryptanalysis of HDCP 2.0 (there are several white papers promoting the standard, however), but then again the technologies that implement it — Digital Interface for Video and Audio (DiiVA), NetHD, Wireless Home Digital Interface (WHDI), and Wireless HD (WiHD) — have yet to reach the mass market.


GSM security testing: where the action is

By Jonathan Corbet
September 27, 2010
Over the years, there has been a lot of interest in the security of the TCP/IP protocol suite. But there is another set of protocols - the GSM mobile telephony suite - which is easily as widely deployed as TCP and for which security is just as important, yet far fewer people have ever taken a deep look at it. Harald Welte, along with a small group of co-conspirators, is out to change that; in a fast-paced Linux-Kongress talk (slides [PDF]), he outlined what they have been up to and how far they have gotten.

While they may be hard to find, the specifications for the GSM protocols are available. But the industry around GSM is very closed, Harald says, and closed-minded as well. There are only about four implementations (closed, naturally) of the GSM stack; everybody else licenses one of them. There are also no documents released for GSM hardware - at least, none which have been released intentionally. There are very few companies making GSM or 3G chips, and they buy their software from elsewhere. Only the biggest of handset manufacturers get to buy these chips directly, and even they don't get comprehensive documentation or source code.

On the network side, there are, once again, just a few companies making GSM-related equipment. Beyond the major manufacturers, there are a couple of nano/femtocell companies, and a shadowy group of firms making equipment for law-enforcement agencies. These companies have a small number of customers - the cellular carriers - and the quantities sold are low. So, in other words, prices for this equipment are very high. That means that anybody wanting to do GSM protocol research needs to set up a network dedicated to that purpose, and that is an expensive proposition.

Even the cellular operators don't know all that much about what is going on under the hood; they outsource almost everything they do to others. These companies, Harald says, are more akin to banks than technology companies; the actual operation of the network equipment is outsourced to the companies which sold that equipment in the first place. As a result, there are very few people who know much about the protocols or the equipment which implements them.

This state of affairs has some significant implications. Protocol knowledge is limited to the small number of manufacturers out there. There is almost no protocol-level security research happening; most of what is being done is very theoretical and oriented around cryptographic technology. The only other significant research is at the application level, which is several layers up the stack from the area that Harald is interested in. There are also no open-source protocol implementations, which is a problem: these implementations are needed to help people learn about the protocols. The lack of open reference implementations also restricts innovation in the GSM space to the manufacturers.

So how should an aspiring GSM security researcher go about it? One possibility is to focus on the network side, but, as was mentioned before, that is an expensive way to go. The good news is that the protocols on the network side are relatively well documented; that has helped the OpenBSC and OpenBTS projects to make some progress in this area. If, instead, one wanted to look at GSM from the handset side, there is a different set of obstacles to deal with. The firmware and protocol code used in handset baseband processors is, naturally, closed and proprietary. The layer-1 and signal-processing hardware and software is equally closed. There is also a complete lack of documented interfaces between these layers; we don't even know how they talk to each other. There have been some attempts to make things better - the TSM30 and MADos projects were mentioned - but things are still in an early state.

Nonetheless, the handset side is where Harald and company decided to work. The bootstrap process was a bit painful; it involved wading through over 1000 documents (full documents - not pages) to gradually learn about the protocols and how they interact with each other. Then it's necessary to get some equipment and start messing with it.

Harald gave a whirlwind tour of the protocols and acronyms found in cellular telephony. On the network side, there is the BTS (the cell tower), which talks with the base station controller (BSC), which can handle possibly hundreds of towers. The BSC, in turn, talks to the network subsystem (NSS), which is responsible for most of the details of making mobile telephony work. The protocol for talking with the handsets is called Um. It breaks down into several layers, starting with layer 1 (the radio layer, TS 04.04), up to layer 2 (LAPDm, TS 04.06), and layer 3, with names like "radio resource," "mobility management," and "call control." The layer 3 specification is TS 04.08 - the single most important spec, Harald says, for people interested in how mobile telephony works.
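As an illustration of how approachable layer 3 actually is, here is a minimal sketch of decoding the common message header that TS 04.08 defines: the low nibble of the first octet carries the protocol discriminator, the high nibble the transaction identifier (or skip indicator), and the second octet the message type. The field names, the small discriminator table, and the example values are recalled from the spec and intended only as illustration, not as a complete decoder.

    # A few of the protocol discriminator values defined by the spec.
    PROTOCOL_DISCRIMINATORS = {
        0x3: "call control",
        0x5: "mobility management",
        0x6: "radio resource management",
    }

    def decode_l3_header(msg: bytes) -> dict:
        # Split the two-octet header off a layer 3 message.
        if len(msg) < 2:
            raise ValueError("a layer 3 message needs at least two octets")
        return {
            "protocol": PROTOCOL_DISCRIMINATORS.get(msg[0] & 0x0F, "other"),
            "transaction_id": (msg[0] >> 4) & 0x0F,
            "message_type": msg[1],
            "payload": msg[2:],
        }

    # 0x05 0x08 ... is a mobility management message; 0x08 is the type used
    # by a LOCATION UPDATING REQUEST.
    print(decode_l3_header(bytes([0x05, 0x08, 0x00])))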

Various people, looking at the specifications, have already turned up a few security problems. There is, for example, no mutual authentication between the handset and the cellular tower, making tower-in-the-middle attacks possible. Cryptographic protocols are weak - and optional at that - and there is no way for the user to know what kind of encryption, if any, is in use. And so on.

On the handset side, these protocols are handled by a dedicated baseband processor; it is usually some sort of ARM7 or ARM9 processor running a real-time operating system. Evidently L4 microkernels are in use on a number of these processors. The CPU has no memory protection, and the software is written in C or assembly. There are no security features like stack protection, non-executable memory, or address-space layout randomization. It's a huge amount of software running in an unforgiving environment; Harald has written up a description of how this processor works in this document [PDF].

What an aspiring GSM security researcher needs is a baseband processor under his or her control. There are a couple of approaches which could be taken to get one of those, starting with simply building one from generic components. With a digital signal processor and a CPU, one would eventually get there, but it would be a lot of work. The alternative is to use an existing baseband chipset, working from information gained from reverse engineering or leaked documentation. That approach might be faster, but it still leads to working with custom, expensive hardware.

So the OsmocomBB hackers took neither of those approaches, choosing instead the "alternative, lazy approach" of repurposing an existing handset. There is a clear advantage to working this way: the hardware is already known to work. There is still a fair amount of reverse engineering to be done, and hardware drivers need to be written, but the job is manageable. The key is to find the right phone; a good candidate would be as cheap as possible, readily available, old and simple, and, preferably, contain a baseband chipset with at least some leaked information.

The team settled on the TI Calypso chipset, which actually has an open-source GSM stack available for it. Actually, it's not open source, but it did sit on SourceForge for four years until TI figured out it was there; naturally, the code is still available for those who look hard enough. The chipset is past the end of its commercial life, but phones built on this chipset are easy to find on eBay. As an added bonus, the firmware is not encrypted, so there are no DRM schemes to bypass.

With these devices in hand, the OsmocomBB project started in January of 2010 with the goal of creating a GSM baseband implementation from scratch. At this point, they have succeeded, in that they have almost everything required to run the phone. Their current approach involves running as little code as possible on the phone itself - debugging is much easier when the code is running on a normal computer. So the drivers and layer 1 code run on the phone; everything else is on the PC. Eventually, most of the rest of the code will move to the handset, but there seems to be no hurry in that regard.

The firmware load currently has a set of hardware drivers for the radio, screen, and other parts of the phone. The GSM layer 1 code runs with no underlying operating system - there really is no need for one. It is a relatively simple set of event-driven routines. The OsmocomBB developers have created a custom protocol, called l1ctl, for talking with the layer 1 code. Layers 2 and 3 run on the host computer, using l1ctl to talk to the phone; they handle tasks like cell selection, SIM card emulation, and various "applications" like making calls.
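To make the division of labor concrete, here is a small host-side sketch. The framing, message types, and device path are hypothetical - this is not the actual l1ctl wire format - but it illustrates the shape of the arrangement: layer 2/3 code on the PC sends requests to, and reads indications from, the layer 1 code running on the phone over a serial link.

    import struct

    import serial  # pyserial

    # The phone's layer 1 is reached over a serial link; the device path here
    # is just an example.
    link = serial.Serial("/dev/ttyUSB0", 115200, timeout=1.0)

    def send_request(msg_type: int, payload: bytes = b"") -> None:
        # Frame a request as [length][type][payload] and ship it to layer 1.
        link.write(struct.pack(">HB", len(payload) + 1, msg_type) + payload)

    def read_indication() -> tuple[int, bytes]:
        # Read one framed indication (e.g. a received burst) back from layer 1.
        (length,) = struct.unpack(">H", link.read(2))
        body = link.read(length)
        return body[0], body[1:]

    # Hypothetical message types and exchange: ask layer 1 to camp on a
    # carrier, then hand every downlink data indication to the LAPDm code.
    CAMP_ON_ARFCN, DATA_IND = 0x01, 0x02
    send_request(CAMP_ON_ARFCN, struct.pack(">H", 17))
    while True:
        msg_type, payload = read_indication()
        if msg_type == DATA_IND:
            pass  # feed payload into the layer 2 (LAPDm) state machine here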

The actual phones used come from the Motorola C1xx family, with the C123 and C155 models preferred for development and testing. One nice feature of these phones is that they contain the same GSM modem as the OpenMoko handset; that made life a lot easier. These phones also have a headset jack which can, under software control, be turned into an RS-232 port; this jack is how software is loaded onto the phone.

At this point, the hardware drivers for this phone are complete; the layer 1-3 implementations are also "quite complete." The OsmocomBB stack is now able to make voice calls, working with normal cellular operators. The user interface is not meant for wider use - tasks like cellular registration and dialing are command-line applications - but it all works. The code is also nicely integrated with wireshark; there are dissectors for the protocols in the wireshark mainline now.

Things which are not working include reading SIM cards, automatic power control (the phone always operates with fixed transmit power), and data transmission with GPRS. Getting GPRS working is evidently a lot of work, and there does not seem to be anybody interested in doing it, so Harald thinks there is "not much of a future" for GPRS support. Also not supported is 3G, which is quite different from GSM and which will not be happening anytime soon. There is also, naturally enough, no official approval for the stack as a whole. Even so, it's a capable system at this point; it is, Harald says, "an Ethernet card for GSM." With OsmocomBB, developers who want to build something on top of GSM have a platform they can work with.

The developers have already discovered a few "wild things" which can be done. It turns out, for example, that there is no authentication of deregistration messages. So it is possible to kick any other phone off the cellular network. There are some basic fuzzing tools available for those who would like to stress the protocols; their usefulness is limited, though, by the fact that the developers can't get any feedback from the cellular carriers.

The GSM industry, Harald says, is making security analysis difficult. So it should not be surprising that the security of existing GSM stacks is quite low. Things are going to have to change in the future; Harald hopes that OsmocomBB will help to drive that change. It is, however, up to the security community to make use of the tools which have been created for them. He hopes that community will step up to the challenge. At this point, TCP/IP security is a boring area; GSM security is where the interesting action is going to be.


Page editor: Jonathan Corbet


Copyright © 2010, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds