Changing any widely used network protocol is a risky affair—more so with a protocol as ubiquitous as HTTP. But the venerable web transport protocol is expected to see its first update of the century later this year, when HTTP2 is slated for release. The new revision can be hard to characterize, since it changes the way HTTP is sent over the wire but does not alter the existing HTTP semantics. In essence, the revisions are designed to speed up how HTTP traffic is relayed between clients, servers, proxies, and other intermediaries, while keeping intact the format and meaning of HTTP requests and responses. The upshot will be faster throughput, even if few people outside of network operators will see the actual changes to the protocol.
HTTP has not been updated since 1999, when the Internet Engineering Task Force (IETF) published version 1.1 in RFC 2616. Version 1.1, notably, added support for persistent, pipelined TCP connections between clients and servers. Thus, it was no longer necessary to open and close a separate TCP connection for each HTTP request, and clients could send multiple requests without waiting for each request's response before sending the next.
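As a rough sketch of what pipelining buys, the requests below are concatenated into a single buffer that a client could send in one write over a persistent connection, instead of paying a connection setup per request. The host name and paths are illustrative, not from any real deployment.

```python
def build_pipelined_requests(host, paths):
    """Concatenate GET requests so they can be sent in a single write.

    Over a persistent HTTP 1.1 connection, the client may send all of
    these before reading the first response; the server still returns
    the responses in order.
    """
    requests = []
    for path in paths:
        requests.append(
            f"GET {path} HTTP/1.1\r\n"
            f"Host: {host}\r\n"
            "Connection: keep-alive\r\n"
            "\r\n"
        )
    return "".join(requests).encode("ascii")

buf = build_pipelined_requests("example.com", ["/", "/style.css"])
print(buf.count(b"GET"))  # prints 2: both requests share one connection
```

The catch, as the next revision of the protocol recognizes, is that the responses must still come back in order.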
Those changes did a lot to reduce latency in HTTP, and the concepts implemented arose from the real-world problems experienced with HTTP 1.0. The HTTP2 situation is similar; the vendors that make web software and networking hardware—as well as large-scale web service providers—identified a new set of pain points as the World Wide Web continued to evolve. Several players (including Google and Microsoft) started their own research projects to develop improvements, and by 2007 there was enough interest for the IETF to charter a new working group to address a new revision.
The working group is called the HTTPbis group—"bis" being a designation meaning "twice" that is intended to communicate that the group's charter is not to invent new methods or headers, but simply to improve on existing HTTP functionality. The notion that HTTP2 is semantically compatible with HTTP 1.1 is clearly important to the project. The group has recently taken to calling the revised protocol HTTP2 or HTTP/2, rather than "HTTP 2.0," just to make that distinction more prominent, although there is not yet universal agreement on the point.
The most recent HTTP2 draft is version 10, which was published on February 13. The first draft was based directly on Google's SPDY (which we looked at in 2009), although as one would expect, it has evolved in a number of ways since then. The basic approach, however, remains the same.
HTTP2 is backward compatible with all of HTTP 1.1's request methods, responses, and headers. But it attempts to decrease the overall latency of serving a page to a client by multiplexing requests, compressing HTTP headers, and allowing the server to prioritize some requests over others.
Multiplexing requests is in many ways an extension of the pipelining feature of HTTP 1.1. Pipelining allows the client to send request #2 before request #1 has been fulfilled by the server, but it still has a weakness: the server must return its responses in order, so the response to request #2 cannot be delivered until request #1 has been answered. If request #1 takes a long time, it still introduces latency. This is called "head-of-line blocking." HTTP 1.1 is also hampered by the requirement that clients not open more than two simultaneous connections to the same server. HTTP2 does away with that limit, and averts head-of-line blocking by multiplexing several HTTP request streams over the same underlying TCP connection (testing of current implementations seems to indicate that between four and eight streams is the best number on average). Since these multiplexed HTTP streams are independent of each other, a slow request is far less likely to block the delivery of the entire page.
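The mechanics of multiplexing can be sketched as frames tagged with a stream identifier, interleaved on one connection and regrouped per stream on the receiving end. The frame payloads below are invented for illustration; a real implementation works with the protocol's binary frames rather than tuples.

```python
def demultiplex(frames):
    """Group (stream_id, payload) frames back into per-stream data.

    Because each frame carries its stream ID, a slow stream's frames
    can simply arrive later without delaying the other streams.
    """
    streams = {}
    for stream_id, payload in frames:
        streams.setdefault(stream_id, []).append(payload)
    return {sid: b"".join(chunks) for sid, chunks in streams.items()}

# Stream 1 (a slow HTML response) interleaves with stream 3 (CSS):
wire = [(1, b"<html>"), (3, b"body{"), (1, b"</html>"), (3, b"}")]
print(demultiplex(wire))  # {1: b'<html></html>', 3: b'body{}'}
```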
HTTP2 also compresses the HTTP headers, which typically contain a lot of redundancy—particularly between related requests for the same page. The earliest implementations used gzip for compression, and built a shared Huffman code dictionary for all of the requests on a single connection. But that scheme was found to be vulnerable to the CRIME (Compression Ratio Info-leak Made Easy) attack, in which an attacker can discover the bytes of a session cookie by injecting test bytes into HTTP requests—if the compressed result is the same length as the unmodified request, then the attacker knows the test byte is indeed part of the cookie. The HTTP Bis group has subsequently moved to a new compression scheme called HPACK.
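The length oracle behind CRIME can be demonstrated in a few lines: when an attacker-injected guess already appears in the headers being compressed, the compressor's back-references keep the output from growing. The cookie value and request shape here are invented for illustration, and this uses zlib directly rather than any particular HTTP implementation.

```python
import zlib

# Hypothetical secret header the attacker wants to recover:
SECRET = b"Cookie: session=s3cret\r\n"

def compressed_len(injected):
    """Length of the compressed request with attacker-chosen bytes."""
    body = b"GET /?q=" + injected + b" HTTP/1.1\r\n" + SECRET
    return len(zlib.compress(body, 9))

# A correct guess for the next cookie byte extends an existing
# back-reference, so it compresses at least as well as a wrong one:
right = compressed_len(b"Cookie: session=s3")
wrong = compressed_len(b"Cookie: session=sZ")
print(right <= wrong)
```

HPACK avoids this class of attack by replacing the general-purpose LZ-style compressor with static and dynamic header tables plus Huffman coding, so cross-header matches of attacker-chosen bytes no longer leak lengths the same way.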
There is little opposition to header compression as an idea, but the same cannot be said of another HTTP2 optimization: binary encoding. As befits a compressed format, HTTP2's frame format is binary, both for the header and the payload. This decision has its critics, such as John Graham-Cumming, who said that he wondered "whether it was really necessary to add a complex, binary protocol to the HTTP suite." But, in response, Mozilla's Patrick McManus commented that the orderly message boundaries of the binary format make it considerably more straightforward—and, therefore, faster—to parse than HTTP 1.1's ASCII. Of course, it could also be argued that the binary format is not human-readable, but the counter-argument is that it is merely the binary encoding of existing ASCII request/response semantics, rather than something new.
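The "straightforward to parse" argument is easy to see in code. The sketch below follows the eight-octet frame header described in the draft (a 14-bit length with two reserved bits, an 8-bit type, 8-bit flags, and a 31-bit stream identifier with a reserved bit); the layout is still subject to change before the final specification, and the type and flag values used in the example are drawn from the draft.

```python
import struct

def parse_frame_header(data):
    """Decode an 8-octet frame header into its fields.

    Fixed-width binary fields mean the parser knows the message
    boundary up front—no scanning for CRLF as in HTTP 1.1's ASCII.
    """
    length_and_r, ftype, flags, stream = struct.unpack(">HBBI", data[:8])
    return {
        "length": length_and_r & 0x3FFF,   # top 2 bits are reserved
        "type": ftype,
        "flags": flags,
        "stream_id": stream & 0x7FFFFFFF,  # top bit is reserved
    }

# A HEADERS frame (type 0x1, END_HEADERS flag 0x4) on stream 1
# carrying a 16-byte payload:
header = struct.pack(">HBBI", 16, 0x1, 0x4, 1)
print(parse_frame_header(header))
```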
HTTP2 adds several other mechanisms that can speed up the process of fulfilling requests, but which (unlike compression and multiplexing) require some deployment-time tweaks to produce optimized results. For instance, servers can prioritize one request stream over another. The assumption is that some content (such as page text) is more critical to deliver to the client than others (such as CSS or decorative images), but, naturally, the decision of which resources to prioritize and which to leave untouched cannot be made in the abstract. Similarly, HTTP2 allows servers to effectively "push" some resources, initiating the transfer before the client makes the actual request. The most effective implementation might begin sending linked-in CSS or image resources after the main page is requested, but the protocol itself does not dictate how servers decide what to push.
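A server-side scheduler of the sort described above might look something like this sketch, which simply drains higher-priority streams first. The priority values and resource names are invented; as the text notes, choosing them well is exactly the per-deployment tuning the protocol leaves open.

```python
import heapq

def schedule(responses):
    """Serve queued responses in priority order.

    responses: list of (priority, stream_id, resource) tuples;
    lower priority numbers are served first.  stream_id breaks ties.
    """
    heap = list(responses)
    heapq.heapify(heap)
    order = []
    while heap:
        _, _, resource = heapq.heappop(heap)
        order.append(resource)
    return order

# Page text before stylesheets, stylesheets before decorative images:
queued = [(2, 5, "hero.jpg"), (0, 1, "index.html"), (1, 3, "site.css")]
print(schedule(queued))  # ['index.html', 'site.css', 'hero.jpg']
```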
Work continues on HTTP2, and the final draft is not expected until mid-2014. Most recently, discussion has included whether (and how) the new protocol should deal with encryption. At one time or another, dropping support for unencrypted http:// URIs was considered, as was dropping HTTPS as a separate protocol—in favor of requiring that HTTP2 run over TLS. For now, no such large steps have been taken. But the group has introduced HTTP Alternate Services (ALTSVC), a new HTTP header through which a server can move a URI request to a different host, port, or even protocol, without tearing down the connection and starting a new one.
This technique could be used to "upgrade" an http:// URI request from HTTP 1.1 to HTTP2, and thus move the connection to TLS. ALTSVC is a comparatively recent draft, which is still subject to change. Among other things, upgrading a connection to TLS relies on the presence of some specific TLS extensions, such as Server Name Indication (SNI), so while the idea of "opportunistic encryption" sounds appealing to many, there may be a fair amount of tire-kicking left before it is finalized.
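To make the alternate-services idea concrete, a client would parse the advertised alternatives out of the header value and decide whether to switch. The header syntax is still in draft form, so the `h2=":443"` style of value used below is an assumption for illustration, not the finalized format.

```python
def parse_alt_svc(value):
    """Split a 'proto="host:port"' alternatives list into tuples.

    An empty host means "same host as the current connection", which
    is how a server can offer the same origin over a new protocol.
    """
    services = []
    for alt in value.split(","):
        proto, _, hostport = alt.strip().partition("=")
        hostport = hostport.strip('"')
        host, _, port = hostport.rpartition(":")
        services.append((proto, host or None, int(port)))
    return services

# Same host, port 443, speaking HTTP2 (hypothetical syntax):
print(parse_alt_svc('h2=":443"'))  # [('h2', None, 443)]
```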
The working group maintains a list of issues on GitHub and a publicly archived mailing list at the W3C site, which makes it easy to follow the development of the protocol. Just as importantly, the group tracks working implementations of HTTP2. Firefox and Chromium already support HTTP2, and Google, Twitter, and a few other large-scale application services support it on the server side. Since many of the benefits of the new protocol rely on implementers testing and getting the optimization details right, this is welcome news, and in keeping with the IETF mantra of "rough consensus and working code." End users, of course, will notice only reduced page load times if they notice anything at all—but perhaps that is a good goal for a protocol update to set.
Copyright © 2014, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds