Inside the mind of a Coccinelle programmer
Coccinelle developer Julia Lawall started her keynote at the Linux Security Summit in Toronto by saying that she didn't know much about security. But the tool she developed, and the patches it has generated, have clearly fixed various patterns of bugs in the kernel, some of which have had security implications. In the talk, Lawall related some of what she has learned by using the tool, while also introducing its capabilities to the roughly 90 attendees.
![Julia Lawall](https://static.lwn.net/images/2016/lss-lawall-sm.jpg)
The principle behind Coccinelle is to "find once, fix everywhere". It is a static analyzer to find patterns in C code. So when someone spots a problem in a large code base, Coccinelle can be used to find other examples of the same kind of bug. It can also then apply transformations to the code to create patches to fix those other occurrences.
Semantic patches
Coccinelle is user-scriptable using "semantic patches", which are based on the patch notation familiar to developers. Ideally, she said, there is not much new that someone has to learn to start using the tool. The goal from the outset was to make it accessible to C developers.
Lawall showed a classic example of where Coccinelle can be used. A commit message by Al Viro ("wmi: (!x & y) strikes again") pointed to a recurring problem with tests that look like:

if (!block->flags & ACPI_WMI_METHOD)

The proper fix is to parenthesize the expression:

if (!(block->flags & ACPI_WMI_METHOD))

The problem is that the logical negation (!) operator binds more tightly than the bitwise AND (&), leading to an incorrect test. Given the message, it is clear that this problem occurs with some frequency, but it is difficult to use grep to find this kind of pattern. There are too many ! and & characters in the kernel, and the test may stretch over multiple lines, which will also defeat simple searches.
But a simple Coccinelle semantic patch will find (and fix by way of a generated patch) these kinds of problems:
@@
expression E;
constant C;
@@
- !E & C
+ !(E & C)

Lawall then showed an example of the same problem (spanning two lines) elsewhere in the kernel and how the code was changed to fix it.
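A rule like this is normally saved in a file with a .cocci extension and run with the spatch tool that comes with Coccinelle; something like spatch --sp-file notand.cocci --dir drivers (the file name here is made up for the example) prints an ordinary diff of the proposed changes for everything under drivers/, while adding --in-place applies the changes to the files directly.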
History
Coccinelle began when Lawall was on sabbatical in 2004. At that time, the 2.6 kernel had recently been released, but many drivers were still targeting 2.4. She wanted to automate the porting of those drivers to 2.6, but that "turned out to be unrealistic". However, certain changes, which she called "collateral evolution", could be made automatically, such as changes to function calls like adding new parameters.
To solve the problem of collateral evolution, Coccinelle was initially developed by Lawall and three others at the University of Copenhagen between 2005 and 2007. The first patches to the kernel based on Coccinelle output were submitted in 2007; they addressed places where kmalloc() calls were followed by a memset() and replaced those with calls to kzalloc(). Two papers followed, in 2008 and 2011.
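The rule behind those kmalloc()-to-kzalloc() conversions would have had roughly the following shape; this is a simplified sketch rather than the semantic patch that was actually used, and it ignores the casts and intervening error checks that a real rule has to cope with:

```
@@
expression x, size, flags;
@@
- x = kmalloc(size, flags);
+ x = kzalloc(size, flags);
  ...
- memset(x, 0, size);
```

The ... matches the code between the allocation and the memset(); the allocator is switched and the memset() deleted only when the whole pattern is found.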
Currently, Coccinelle is developed by Lawall and three others at Inria. It is entirely implemented in OCaml, which may reduce the contributor base by a substantial amount, she said to some audience laughter. On the other hand, though, that does reduce the number of patches she has to review.
The tool has had a fairly large impact on the kernel since the first patches were posted in 2007. Over 4,500 patches in the kernel mention Coccinelle; 3,000 of those are from more than 500 developers outside of the Coccinelle project. There are also 56 semantic patches used as tests that are part of the kernel (accessible via make coccicheck). There are more Linux-related semantic patches available at coccinellery.org.
The semantic patches in the kernel are meant to catch various types of errors that can crop up. There are tests for generic C errors (such as testing an unsigned variable for a value less than zero, or dereferencing a null pointer), generic Linux problems (double locks or using the iterator index after a loop), API-specific errors (using free() on a devm allocation), and API modernization (using kmemdup() rather than kmalloc() and memcpy()). Contributions of new semantic patches are welcome; contributors can either write the semantic patch themselves or describe the pattern to the team so that it can create one.
Applications of Coccinelle
Lawall then presented some more complex applications of Coccinelle to the kernel. The first was an effort at "devmification"—switching drivers to use the devm framework to manage their memory to avoid memory leaks. A driver's probe function often allocates memory with kzalloc() that then gets freed in the remove function. Switching to devm_kzalloc() means that the devm library will manage the memory instead.
There were three steps to the process; the first was to find the names of the probe and remove functions. For that, a regular expression could have been used, but that "seems unappealing", she said. Pointers to the functions are put into the platform_driver (and similar) structures, so their names can be extracted with a Coccinelle rule. The second step was to find the definition of the probe function and to transform the kzalloc() calls to devm_kzalloc(), while also removing the kfree() calls from any error paths. Lastly, the semantic patch removes the kfree() call from the remove function.
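A much-simplified sketch of the first two steps (not her actual rules) might look like this; the structure-matching idiom is standard Coccinelle practice, and the error-path and remove-function cleanup are omitted here:

```
@platform@
identifier p, probefn, removefn;
@@
struct platform_driver p = {
  .probe = probefn,
  .remove = removefn,
};

@@
identifier platform.probefn;
identifier pdev;
expression e, e1, e2;
@@
probefn(struct platform_device *pdev, ...) {
  <+...
- e = kzalloc(e1, e2);
+ e = devm_kzalloc(&pdev->dev, e1, e2);
  ...+>
}
```

The first rule binds the names of the probe and remove functions from the platform_driver structure; the second inherits probefn and rewrites any kzalloc() assignment found inside that function.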
Lawall worked on this semantic patch in 2012 and ended up submitting 39 patches to the kernel to make that change. Her semantic patch finds over 170 opportunities to make the change, but they can't just be applied blindly; there is a responsibility to look at the patches that are generated, she said. The whole process takes just 30 seconds to run (using GLIMPSE indexing), so it changes the problem from finding the pattern to checking the fixes.
Another application of Coccinelle was to detect calls that access user space, and thus might block due to a page fault (e.g. copy_*_user(), get_user()), being made under a spinlock. Blocking is not allowed while a spinlock is held, so those calls should not be made under the lock. The semantic patch simply reported when it found one of the candidate functions between a spin_lock() (or one of its variants) and the corresponding spin_unlock(). Normally, Coccinelle handles a single function at a time, but iteration can be used to scale it up to look for inter-function and inter-file patterns.
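In Coccinelle's context mode, where lines of interest are marked with * rather than transformed, a stripped-down version of such a check could look like the sketch below; it only catches copy_to_user() called as a standalone statement after a plain spin_lock(), while the actual check also has to cover the other user-space accessors and the spin_lock() variants mentioned above:

```
@r exists@
expression lock;
@@
* spin_lock(lock);
  ... when != spin_unlock(lock)
* copy_to_user(...);
```

The when != annotation restricts the match to paths on which the lock has not been released before the user-space access is reached.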
One confirmed bug has been found with copy_from_user(). The copy_to_user() checking found a problem but there was a "FIXME" comment next to it, which might indicate the developer plans to get back to it at some point, she said. When that was met with laughter, she replied: "I understand your lack of confidence." That checking also found a false positive. She noted that she is not searching for perfection with the semantic patches, just trying to find things of interest, so she is not interested in complicating the semantic patch to eliminate all false positives. The get_user() and put_user() checking has found one occurrence of each, but she has not yet evaluated them.
She also worked on a semantic patch for "constification"—adding the const keyword to some structures in the kernel. She used a four-step process that included two manual steps. She used Coccinelle to identify structures that only contain function pointers. Those are not the only ones that need const, she said, but they are often good candidates. In addition, protecting those structures from getting overwritten is important to avoid code execution in the kernel. From the list of such structures, she chooses one manually, then uses Coccinelle to change all uses of the type to add const. The last step is to build the kernel using GCC to see if it all works.
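For a structure made up of function pointers, the const-adding step can be done with a rule along these lines; file_operations is used purely as an illustration, and disabling the optional_qualifier isomorphism keeps the rule from matching (and re-annotating) definitions that are already const:

```
@r disable optional_qualifier@
identifier ops;
@@
 static
+const
 struct file_operations ops = { ... };
```

A sketch like this only touches static definitions with an initializer; as described above, the other declarations and uses of the type have to be updated as well.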
She submitted 115 patches in 2015 for constification. She does not pay attention to whether they actually get merged. None have been turned down that she remembers, but some of them may not have gotten picked up. Detecting structures of all function pointers is slow—she runs it overnight—but adding const to a type is instantaneous.
Lessons learned
Lawall has learned some lessons along the way that she wanted to pass along. The most important one is to start with something simple. The semantic patches in the kernel tree are more complex and are not representative of where to start with the tool. Start with a common case, even if it produces some false positives; perhaps you get 100 results, two of which turn out to be real bugs. Making the rule more complex to cut down the noise is possible, of course, but there is a tradeoff there.
For example, during the devmification process it was hard to figure out how to get rid of the right kfree() in the remove function. She chose a simple solution by just assuming that the kernel developers would use the same name for the variable to be freed that was used in the probe function. It is "completely unsafe, but it works fairly well". The worst case, though, would be false negatives—not removing a kfree() that should have been removed. So she did a test where she used Coccinelle to delete all of the kfree() calls in the remove functions, which turns out to be the same set of files as with her rule, so it "seems probable that the rule is OK" and "simple was good enough".
Semantic patches should be developed incrementally, she said. The devmification rules matched in 171 files, which might take some time to analyze. In addition, devm frees the memory it manages in a last-in-first-out order, which might cause differences. So she rewrote her semantic patch to skip probe functions that make value-returning calls before the kzalloc() and remove functions that make any calls after the kfree(). That reduced the search space to 51 files.
Similarly, the checking for blocking functions under a spinlock produced lots of false positives. Conditionals in the code may cause the lock status to be inaccurate in the analysis. To avoid that, she changed the test to look for places where the blocking calls appear after the lock release, then looked closely at places where that wasn't true.
The last lesson she imparted was to exploit information from other tools. The constification effort used GCC to determine if the change would build. But, ideally, there would be ways to warn users about potentially difficult cases. In order to do that, you might want to gather information about the structures (are some already const, which might indicate that making them all const might work?) and about the kernel build with regard to the structure (are all instances of the structure in files that are built with the current configuration?). Those would indicate structures that might be easier or harder to change and verify to help prioritize which ones to tackle. That kind of information can be gathered by Python or OCaml programs to help users narrow in on good candidates.
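Coccinelle itself can help with that kind of data gathering, since a match rule can feed an embedded script rule; a minimal sketch (again using file_operations purely as an illustration) that just reports where candidate structures are defined might look like:

```
@r@
identifier ops;
position p;
@@
struct file_operations ops@p = { ... };

@script:python@
ops << r.ops;
p << r.p;
@@
print("%s:%s: candidate structure %s" % (p[0].file, p[0].line, ops))
```

Output from a rule like this can then be cross-checked against the kernel configuration to decide which structures are worth attempting first.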
In conclusion she said that the tool is useful for a number of different kinds of problems. Coccinelle "compromises on both soundness and completeness, but it is still useful in practice". Developers would be well-served by adding it to their arsenal.
[I would like to thank the Linux Foundation for travel support to attend the Linux Security Summit in Toronto.]
| Index entries for this article | |
|---|---|
| Kernel | Development tools/Coccinelle |
| Conference | Linux Security Summit/2016 |
Posted Sep 1, 2016 9:07 UTC (Thu)
by marcH (subscriber, #57642)
[Link] (24 responses)
In other words, if Academia got a clue it would start by researching itself and then move away from fiction and back to knowledge.
Posted Sep 1, 2016 13:31 UTC (Thu)
by paulj (subscriber, #341)
[Link] (23 responses)
Posted Sep 1, 2016 14:09 UTC (Thu)
by marcH (subscriber, #57642)
[Link] (22 responses)
I think it's much harder to blame politicians[*] this time. In my experience the issue is more of a self-perpetuating system where the people who play the game the best get on top and have no interest in changing it.
[*] about as much as the people who elect them.
Posted Sep 1, 2016 15:10 UTC (Thu)
by paulj (subscriber, #341)
[Link] (21 responses)
Also, "impact" outside of academia increasingly *is* becoming a criteria that funding bodies take into account.
Note, the academics don't per se themselves like the system, but they're trapped in it and have to play within it to progress their career. E.g., the ridiculous publishing system, where academics do nearly all the hard work of reviewing papers and the editorial work of setting direction, etc., all for free on time paid for largely by tax-payer money, and then for publishing companies to charge _huge_ amounts of money to university libraries and ridiculous per-article fees to casual users - they nearly all hate it. Yet, many still submit their papers to such venues, because those venues have the prestige in their field and that's the place to publish if you want your career to go somewhere.
That system is slowly cracking, and the fact that funding bodies started to _demand_ open access to the resulting publications was the impetus for it.
Now, the publishing companies responded by simply charging ridiculous amounts to the academics to publish (which the funder has to meet), but that's another battle.
The funding bodies are the best placed to change things.
Posted Sep 1, 2016 15:45 UTC (Thu)
by deater (subscriber, #11746)
[Link] (16 responses)
There are way too many who are wed to the status quo and the ivory tower where theory reigns and actual results aren't worthy of anything more than a "footnote in Wikipedia" (as a recent anonymous reviewer told me about my research).
I've actually given up trying to get an academic publication out of my perf_fuzzer work as academic peer-reviewers seem to think that fuzzing is a solved topic and not worth investigating.
What we really need is a reputable Linux conference with a peer-reviewed publication attached. Since the demise of OLS we haven't had one, and a lot of interesting Linux related papers just aren't getting published. It would be nice if one of the tech related Linux conferences could team up with ACM/IEEE/USENIX or similar to archive the documents, which would really help with the perceived stature of the publications.
Also, while I'm making impossible suggestions, it would be nice if the Linux Foundation would offer grants to cover grad student research projects (with the stipulation that all results are open-source). That would also do wonders about getting more academics into mainstream Linux development.
Posted Sep 1, 2016 16:28 UTC (Thu)
by marcH (subscriber, #57642)
[Link] (14 responses)
How much effort to either reform or start afresh a new "Association for Computer Science" which would not accept any publication unless the results are actually *reproducible*; including open-sourced. As if it were, you know,... Science.
> Also, while I'm making impossible suggestions,
...
Posted Sep 2, 2016 8:30 UTC (Fri)
by eternaleye (guest, #67051)
[Link]
Hm? I've had a great deal of fun reading everything I found intriguing in the hundreds of pages of USENIX archives, which not only contain papers and slides, but (more recently) audio and video as well.
Posted Sep 2, 2016 9:02 UTC (Fri)
by paulj (subscriber, #341)
[Link] (12 responses)
Note: *Requiring* that a result be reproduced before publication would harm science though. And it is hard to establish whether a paper is reproducible until someone tries. Next, whether anyone will have the motivation to reproduce is not something the publication venue can control. That - again - is determined by the reward/compensation structures within academia, and those are determined by the funding bodies. The publication venues are downstream consumers of that money ultimately.
Again, open access science publishing became viable because funding bodies made it a requirement. Making reproduction of results a priority similarly really needs funding bodies to demand it.
Posted Sep 2, 2016 16:13 UTC (Fri)
by marcH (subscriber, #57642)
[Link]
Posted Sep 7, 2016 19:41 UTC (Wed)
by nybble41 (subscriber, #55106)
[Link] (2 responses)
Instead of requiring papers to actually be reproduced before they can be published, require that they only *refer* to previous results which are supported by two or more papers. Thus anything can be published, but one can only build on the results which have been proven to be reproducible. (The current paper can count as the second source if it's reproducing the result of a prior study.)
Posted Sep 7, 2016 23:14 UTC (Wed)
by karkhaz (subscriber, #99844)
[Link] (1 responses)
The problem is that the evaluation is incredibly time-consuming, and not at all rewarding---thus encouraging superficial reviews. What often happens is that some member of the programme committee (an academic) (or a member of a separate artefact evaluation committee) gets the artefact to review, and they generally pass it off to unfortunate students or postdocs. It is a difficult, thankless job to review somebody else's artefact...even more so than evaluating their papers.
http://popl15-aec.cs.umass.edu/about/
http://conf.researchr.org/track/pldi2015/PLDI+2015+Artifa...
http://i-cav.org/2016/artifact-evaluation/
Posted Sep 8, 2016 3:47 UTC (Thu)
by deater (subscriber, #11746)
[Link]
If research isn't being checked because it's "too hard" then it's incredibly easy to get ahead by just making up results and being vague in your methodology.
What is probably needed is a new top-tier venue (conference, journal, etc) that will only accept work that's been reproduced by at least two independent teams. All three groups would get equal credit for the final publication, and papers published in such a venue should be considered the best there is.
Posted Sep 8, 2016 3:47 UTC (Thu)
by marcH (subscriber, #57642)
[Link] (7 responses)
Whether engineering would harm science or not, it would certainly put the cart before the horses.
It's neither easy nor cheap to reproduce results, however it's also easy to see when current publications can't be reproduced since... they don't even *try* to publish source and instructions!
It's like testing: testing can only prove the existence of bugs and not their absence. Yet almost everyone runs tests.
Posted Sep 8, 2016 8:19 UTC (Thu)
by paulj (subscriber, #341)
[Link] (5 responses)
Source code and instructions can give one confidence of a very basic level of reproducibility - that the data in the paper is reproducible from the system they claim to have implemented. However, it can not verify that they have implemented the claims, or implemented them correctly, or that the claims are even implementable. Source code and runnable systems (e.g. VM with the software) just are not a magic bullet that solves the reproducibility issue.
There was some discussion in an ACM article on reproducibility, and on classifying the different levels of reproducibility - amongst a number of articles on the issue, and a task force. Many people are concerned about it, but it's not as simple as just requiring publishing of source, and job done.
Posted Sep 8, 2016 14:18 UTC (Thu)
by marcH (subscriber, #57642)
[Link]
Posted Sep 8, 2016 16:20 UTC (Thu)
by farnz (subscriber, #17727)
[Link]
Is there a public link to the ACM article on reproducibility? I'd be interested in reading up on their classifications.
Posted Sep 8, 2016 19:01 UTC (Thu)
by excors (subscriber, #95769)
[Link] (2 responses)
How well do proper sciences handle this? I assume that if you want to reproduce an experimental physics paper, you don't ask the authors to ship their original apparatus to you, then connect up all the power and cooling and other environmental dependencies, find replacements for the bits they forgot to ship or weren't allowed to, fix all the bits that decayed in the years since they originally performed the experiment or that broke in transit, and once you've got it all set up, hit the "go" button and check that the numbers coming out the other end match the original paper's results, and say you've successfully reproduced it.
Hmm, maybe the important difference is that physics has to generally go top-down - we don't know the universe's underlying rules, and we're trying to figure them out from observations of complex high-level systems; whereas computer science can be understood bottom-up - we already know exactly how a computer works at the fundamental level of bytes and cycles, and we're building them up into ever-more-complex systems that provide some benefit to users and trying to understand those systems' behaviour.
That means a good experimental result in physics tells us about a fundamental truth of the universe, which can be approached from many different directions to confirm it, and then is universally applicable. But a good experimental result in computer science is about the behaviour of a specific complex system, which can only be reproduced by someone with the same complex system, and is often irrelevant outside of that system.
Because they have such limited scope, I think experimental results in CS aren't inherently valuable - what's valuable is that the research is developing some clever new idea as part of its complex system, and either that system is useful by itself (i.e. it's software with users, or may be in the future), or the idea can inspire someone else working on a different system that's useful (which will have different constraints that might make the idea more or less appropriate, so it'll always need careful testing and tweaking in its new context).
The experiments in the original research are useful merely to filter out ideas that are impossible to implement or don't work at all. It doesn't actually matter if they can't be reproduced or are occasionally wrong, because nobody else is going to depend on those results anyway; it just matters that the ideas are interesting and clearly expressed so that other people can adapt and test them in their own systems.
Posted Sep 8, 2016 19:37 UTC (Thu)
by sfeam (subscriber, #2841)
[Link]
The state of affairs is much more satisfactory in computational biology. Of course it depends on the publication venue, but a paper announcing a new computational tool or method generally requires demonstrating that it works on a reasonable set of test data. NIH funding generally comes with a requirement for "dissemination" of the program, although this does not map onto any particular license. In my experience both in submitting and reviewing such papers, the code and the test data are made available to the reviewers on request, and the requirements for demonstrating usability outside the originating group can be quite strict.
Here, for example, are the guidelines for submission to the annual "web server" issue of Nucleic Acids Research. The same journal has an annual "database" issue with similar strict requirements for reproducible use of new code.
Here are the guidelines from another journal I have worked with (Journal of Applied Crystallography), short enough to reproduce here:
2.7. Computer Programs
A description of the purpose, strategy, computer language, machine requirements, input requirements and type of results obtained should be included. A Computer Program article should make clear the novelty of the program and its scientific context. Articles on significant updates to existing programs will also be considered, as will objective contributions discussing commercial packages. Authors would usually be expected to be those who developed the program. The utility of the program and adequacy of the documentation should normally have been proved by the successful use of the program by at least two independent users. A web page about the program (including, where applicable, documentation, contacts to developers and links to download the software itself) should also be available. For review purposes, authors should make provision for referees to be able to access the program anonymously.
Posted Sep 12, 2016 15:40 UTC (Mon)
by Wol (subscriber, #4433)
[Link]
Please define "proper sciences"!!!
It is a well established school of thought that things like Chemistry, Medicine and Biology do not qualify.
My view (which seems a minority) is that things like Theoretical Physics do not qualify.
I'm inclined to agree with a previous poster, though, in that things should not be referenced as "truth" unless they have been successfully replicated. It's okay to refer to them as hypotheses, though.
That's my beef with all this stuff about obesity and low-fat diets. To the best of my knowledge NONE of it has been replicated, and to a sceptical observer pretty much all of it belongs in the Journal of Irreproducible Results, aka the Journal of Fake Science.
Cheers,
Wol
Posted Sep 8, 2016 16:19 UTC (Thu)
by farnz (subscriber, #17727)
[Link]
I'm not convinced that requiring source and instructions is helpful; most (all?) software written as part of producing a CS paper is sufficiently poor that it won't run away from the author's systems, as it comes with too many assumptions about the environment around it. I've also seen what happens when academics get embedded in places like Google, where they'll code with the assumption that the entire Google codebase is available to them.
In spherical cow land, the methodology in the paper is all you need to reproduce the result - you don't need the code. Granted, many papers don't meet that standard, but it's the standard that papers should be aiming for, using code snippets, formal methods etc to ensure that you can reproduce in your environment.
Given what I've seen, I would expect that the consequences of demanding "corresponding source" (to steal the GPL's wording) would be to reduce the number of published results; when you can't even build something without Google's complete CitC infra behind you, and you don't understand what, of the many things provided for you by the Google environment, you need for your incremental result on top to be worthy of publishing, how are you going to publish if you have to rewrite from scratch in a minimal environment? If I'm allowed to publish just the source I wrote, how are you going to test it when step 1 of the instructions is "ensure you have access to Google's CitC infra"?
Posted Sep 2, 2016 9:19 UTC (Fri)
by paulj (subscriber, #341)
[Link]
Science and research is a bargain between society and academia, and there need to be conversations about their respective responsibilities and interests, and tweak things to keep the bargain working sufficiently well for everyone. E.g., science can say "Actually, reproducibility would be good, so could you make that something rewarded by the funding bodies you fund to fund us?" and society (in various ways) can influence academia through how it funds science. Etc.
The ACM/IEEE idea is a good one (UseNIX is a good venue too, but maybe not quite enough stature on the academic side?). ACM dues are relatively affordable, and you get access to a wealth of information via the library - I like ACM.
Posted Sep 1, 2016 16:14 UTC (Thu)
by raven667 (subscriber, #5198)
[Link] (3 responses)
You are absolutely right though that there is a problem with academia right now where the incentives for what research is done and how careers get built need major reform, but I don't think the mechanism you describe will get the job done, it will only make things worse. Across many of the sciences there are problems where funding mostly goes to safe and cheap research where you are most likely to get a positive result and papers with negative results or replication studies don't get published or don't get funded. The fact there is no prestige in replicating someone else's work, or in failure, is a change in attitude which is going to take a long time to fix and not something which can be dictated from above.
We can learn from the scientific fields which have already taken on some of these reforms and put pressure on the funding agencies and publishers to enforce higher standards for science.
Posted Sep 2, 2016 9:12 UTC (Fri)
by paulj (subscriber, #341)
[Link] (2 responses)
E.g., the increasing importance of demonstrating "Impact" in things like Horizon 2020 applications is, I gather, driven (directly or indirectly) by civil-service / political needs to see some real-world results of the funding that goes into research. In the UK, and I think EU projects too, there are also policies to encourage (even require, for larger projects I think) funding applications for projects to include some type of collaboration with industry. That, I believe, is a politically (inc civil service) set policy.
And yes, the issue is you can't publish negative results (directly) - though you can perhaps have those added as comments on the original article or follow-up letters in some venues, and there's no career gain (indeed probably harm) in work that just reproduces results. I don't see how you can fix this from below alone either. Few people can swim against the (funding body priorities) stream for long.
Posted Sep 2, 2016 9:59 UTC (Fri)
by jll (guest, #97941)
[Link] (1 responses)
Coccinelle is developed in France, where the key to getting funding is support from industry.
Posted Sep 2, 2016 16:16 UTC (Fri)
by marcH (subscriber, #57642)
[Link]
Posted Sep 2, 2016 0:19 UTC (Fri)
by lacos (guest, #70616)
[Link] (1 responses)
QEMU uses Coccinelle too:
$ git log | grep -c -i coccinelle
114
and there's even a scripts/coccinelle/ directory in the git tree.
Example series:
https://github.com/qemu/qemu/compare/96165b9eb420...a2c5e...
Posted Sep 2, 2016 9:55 UTC (Fri)
by jnareb (subscriber, #46500)
[Link]
> $ git log | grep -c -i coccinelle
> 114
Better use
$ git log -i --grep coccinelle --oneline | wc -l
It would avoid (possible) double counting, and I think it is faster