LibreOffice leverages Google’s OSS-Fuzz to improve quality of office suite
Posted May 23, 2017 19:58 UTC (Tue) by welinder (guest, #4699)
Parent article: LibreOffice leverages Google’s OSS-Fuzz to improve quality of office suite
LO's native file format is xml-inside-zip. If you fuzz a zip file directly, you are going to trip up either the checksum or the compression in the zip layer. I.e., you are fuzzing the zip library and very little of LO. And no finite amount of coverage-based mutation is going to change that. I tried.
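The checksum point can be seen with a minimal Python sketch. It uses the standard `zipfile` module and an uncompressed (stored) member so that only the CRC-32 check is in play; with compression enabled, a corrupted stream can also fail in the decompressor. The member name `content.xml`, the tiny payload, and the helpers `make_zip` and `try_read` are made up for illustration and have nothing to do with how LO or OSS-Fuzz are actually set up.

```python
import io
import zipfile


def make_zip(payload, name="content.xml"):
    """Build a one-member zip archive in memory, stored without compression."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_STORED) as zf:
        zf.writestr(name, payload)
    return buf.getvalue()


def try_read(blob, name="content.xml"):
    """Return the member's bytes, or None if the zip layer rejects the archive."""
    try:
        with zipfile.ZipFile(io.BytesIO(blob)) as zf:
            return zf.read(name)
    except Exception:
        # BadZipFile on a CRC mismatch; other errors are possible for other damage.
        return None


# Hypothetical stand-in for a document's XML payload.
payload = b"<doc><cell ref='A1'>42</cell></doc>"
good = make_zip(payload)

# Flip a single bit inside the stored payload, the way a byte-level fuzzer might.
offset = good.find(payload)
mutated = bytearray(good)
mutated[offset + 5] ^= 0x01

print("original readable:", try_read(good) is not None)
print("mutated readable: ", try_read(bytes(mutated)) is not None)
```

The mutated archive never reaches the XML parser: the zip layer rejects it as soon as the stored CRC-32 no longer matches the data.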
If you fuzz the xml and stuff it inside a well-formed zip container you will get further, but you will mostly be testing the xml library. If the xml library does a full validation first, then possibly you will be testing only the xml library because almost any mutation will lead to a malformed xml file. Little fuzzed data will make it into the guts of the program.
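A rough sketch of the "fuzz the xml, then repack" approach, again in Python with only the standard library. The document, the member name `content.xml`, and the helpers `pack` and `unpack_and_parse` are hypothetical; the parser here checks well-formedness only, not any schema.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def pack(xml_bytes, name="content.xml"):
    """Wrap (possibly mutated) XML in a freshly built, well-formed zip."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        zf.writestr(name, xml_bytes)
    return buf.getvalue()


def unpack_and_parse(blob, name="content.xml"):
    """Return the parsed root element, or None if the XML layer rejects it."""
    with zipfile.ZipFile(io.BytesIO(blob)) as zf:
        data = zf.read(name)
    try:
        return ET.fromstring(data)
    except ET.ParseError:
        return None


# Hypothetical stand-in for a document's XML payload.
original = b"<doc><cell ref='A1'>42</cell></doc>"

# A mutation inside text content survives and reaches the application.
text_mutation = original.replace(b"42", b"43")
# A mutation inside a tag name breaks nesting and dies in the XML parser.
tag_mutation = original.replace(b"</cell>", b"</cel1>")

for label, xml_bytes in [("text", text_mutation), ("tag", tag_mutation)]:
    root = unpack_and_parse(pack(xml_bytes))
    print(label, "mutation:", "parsed" if root is not None else "rejected by XML layer")
```

Because the zip is rebuilt after every mutation, the container is always valid; what is left to get past is the XML parser, and only mutations that keep the markup intact do.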
Contrast this with file formats that are basically sequences of binary records. Think of the outdated xls format. Or some image format. For those you really will manage to get fuzzed data into the guts of the program.
Perhaps someone in the know could tell us how they get around these obstacles.
(And, boy, is that article full of PR speak.)
Posted May 23, 2017 20:36 UTC (Tue)
by khim (subscriber, #9252)
You could read the tutorial. Yes, fuzzing is an art: you need some clever ideas about what to fuzz and at what level. I don't think they started with XML or, even worse, ZIP files. I hope they used some functions which are beyond the protection offered by the "this must be a valid zip" or "this should be a valid XML" layers. Although I'm not sure that XML fuzzing is as hopeless as you describe it: if you start with valid XML and alter it slightly, chances are great that the result would still be an XML file, but with some internal logic broken (IDs of objects which don't exist are referenced, etc). "Great" here means: "do a thousand tries and you'll get one candidate which triggers a new path in the code".
Posted May 24, 2017 13:31 UTC (Wed)
by welinder (guest, #4699)
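I have done the xml fuzzing experiment -- with Gnumeric, not LO. I have done it both with "binary" fuzzers like American Fuzzy Lop and specialized fuzzers that know a good deal about xml. While the latter is getting you further, faster, the coverage is still depressingly shallow.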
Fuzzing is a numbers game. You really cannot afford to have all but one in a million trials fail early due to consistency tests. If some xml attribute needs to contain a reference to a spreadsheet cell, then random mutation isn't very likely to produce a different, still-valid reference. Contrast that with xls where just about any replacement bit pattern will do.
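To put a rough number on the cell-reference case, here is a small Python sketch that tries every single-byte substitution of a textual reference and counts how many still look like a reference. The pattern `CELL_REF` (one to three uppercase letters followed by a row number without a leading zero) is a simplifying assumption, not the grammar Gnumeric or LO actually use, and `surviving_mutations` is a made-up helper.

```python
import re

# Assumed, simplified syntax for a cell reference such as "B12".
CELL_REF = re.compile(rb"[A-Z]{1,3}[1-9][0-9]*")


def surviving_mutations(original):
    """Count single-byte substitutions of `original` that still match CELL_REF."""
    survived = 0
    total = 0
    for pos in range(len(original)):
        for value in range(256):
            if value == original[pos]:
                continue  # not a mutation
            mutated = original[:pos] + bytes([value]) + original[pos + 1:]
            total += 1
            if CELL_REF.fullmatch(mutated):
                survived += 1
    return survived, total


survived, total = surviving_mutations(b"B12")
print(f"{survived} of {total} single-byte mutations are still a valid reference")
```

Under this assumed syntax fewer than one in ten substitutions survive, and that is for a single attribute value in isolation, before the mutation also has to avoid the quotes and angle brackets around it. A fixed-width binary row or column field has no comparable syntax to violate.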
Note that the xml used by LO and Gnumeric compresses so well that compression is built into the file formats. It compresses so well precisely because there are so many syntactic rules that the files must satisfy, both at the xml level -- tags must nest properly -- and at the application level -- some attribute must be an integer.
Posted May 24, 2017 14:57 UTC (Wed)
by epa (subscriber, #39769)
Posted May 23, 2017 20:40 UTC (Tue)
by xtifr (guest, #143)
Posted May 24, 2017 9:31 UTC (Wed)
by epa (subscriber, #39769)
Posted May 25, 2017 7:39 UTC (Thu)
by shiftee (subscriber, #110711)
Posted May 25, 2017 20:12 UTC (Thu)
by spaetz (guest, #32870)
"binary" fuzzers like American Fuzzy Lop and specialized fuzzers that know a good deal about xml. While the latter is getting you further, faster, the coverage is still depressingly shallow.
The only good news here is that they get precisely the same outer-layer bugs that anyone else doing an automated scan will get.
It's useful for scripting