Toward better handling of hardware vulnerabilities
The definitive story of the response to Meltdown and Spectre has not yet been written, but a fair amount of information has shown up in bits and pieces. Intel was first notified of the problem in July 2017, but didn't get around to telling anybody in the Linux community about it until the end of October. When that disclosure happened, Intel did not allow the community to work together to fix it; instead each distributor (or other vendor) was mostly left on its own and not allowed to talk to the others. Only at the end of December, right before the disclosure (and the year-end holidays), were members of the community allowed to talk to each other.
The results of this approach were many, and few were good. The developers charged with responding to these problems were isolated and under heavy stress for two months; they still have not been adequately thanked for the effort they put in. Many important stakeholders, including distributions like Debian and the "tier-two" cloud providers, were not informed at all prior to the general disclosure and found themselves scrambling. Different distributors shipped different fixes, many of which had to be massively revised before entry into the mainline kernel. When the dust settled, there was a lot of anger left simmering in its wake.
By all accounts, the handling of the recently disclosed L1TF vulnerability was much better. As Thomas Gleixner, who has been heavily involved in dealing with both sets of vulnerabilities, put it:
So it would appear that something has been learned in the last year. That said, the kernel community, under the reasonable impression that the last hardware vulnerability has not yet been seen, would like to find ways to make the process work even better. As is often the case, that seems to involve convincing vendors to stop sticking with procedures developed for a proprietary world and deal with the community on something closer to its own terms.
One sticking point is non-disclosure agreements (NDAs). They tend to be a fact of life in the technology industry, but NDAs do tend to run counter to how the community actually works. Resolving a complex hardware vulnerability can require bringing in more developers late in the game as the implications of the problem become clear, but the NDA process can make that difficult or impossible, even if the developers involved are willing to sign such an agreement. As Gleixner explained, that complicated the response to L1TF:
There were suggestions that a community-wide NDA should be negotiated, but actually making that happen is a daunting prospect. Even if the vendors could be convinced to join such a scheme, many developers are likely to resist. Both Greg Kroah-Hartman and Linus Torvalds have made it clear that they will not be signing any NDAs for this purpose.
What may be more likely to happen is the approach taken by Torvalds now: "I've had a gentleman's agreement with companies - nothing legally binding, but over the years people have come to realize that the leaks don't come from me". The "gentleman's agreement" (perhaps better called something like "gentlehacker's agreement") idea has some appeal in both the community and the industry, it seems. It has been shown to work with various developers over time, and it is evidently well understood that the power of an NDA to actually compel non-disclosure is nearly nonexistent. Dave Hansen noted that what companies really want is to feel that the information is under control, and that some companies have managed to find ways to get that "warm and fuzzy feeling" without using NDAs. It would be good for companies that have made progress in this area to share their experience, he said:
With luck, over time, vendors will figure out the best ways to work with the community, much like they have with contributions of software. And, for those that don't, Torvalds said in his usual blunt style, the community should just refuse to deal with them.
One issue that remains difficult to address is backporting large security fixes to the stable kernels. As a kernel release gets older, this kind of backport gets harder, to the point that nobody is willing to do it. Thus, Kroah-Hartman explained, the 4.4 kernels do not have the full L1TF fixes, and the Spectre fixes for 32-bit ARM processors do not go back past 4.18. When backports do happen, they tend to bring regressions with them. It's not clear that there will ever be a satisfactory solution to this problem, which is why developers tend to advise upgrading to more recent kernels.
One thing developers have said repeatedly is that, as noted in the introduction, hardware vulnerabilities look an awful lot like software vulnerabilities to kernel developers. Both must be fixed in software, and the community has some well-exercised procedures for fixing vulnerabilities in software. Those procedures include finding the developers whose expertise is needed to address a specific problem and telling them what they need to know. Many kernel developers have helped to fix vulnerabilities under this system, with minimal leakage of sensitive information prior to the official disclosure. Once vendors realize that the community's process works and can be trusted, the response to this type of problem will go much more smoothly for everybody involved.