Troubles with triaging syzbot reports
A report from the syzbot kernel fuzz-testing robot does not usually spawn a vitriolic mailing-list thread, but that is just what happened recently. While the invective is regrettable, the underlying issue is important. The dispute revolves around how best to report bugs to affected subsystems and, ultimately, how not to waste maintainers' time.
Al Viro was apparently fed up with syzbot reports that involved the ntfs3 filesystem but that were not copied (CCed) to the maintainers of ntfs3. The syzbot message was sent to the kernel mailing list, but Viro shouted his reply that "ANY BUG REPORTS INVOLVING NTFS3 IN REPRODUCER NEED TO BE CCED TO MAINTAINERS OF NTFS3". That complaint had been relayed several times in the past, he indicated, without the problem getting fixed, so he was planning to stop looking at the reports. In fact, they will be "getting triaged straight to /dev/null here".
After an ... impenetrable reply from Hillf Danton, Viro followed up with more details of the problems he sees. He pointed to a post from September where he made a similar request and said that others had also reported these kinds of problems to the maintainers of syzbot. The issue is that the mail sent by syzbot does not contain enough useful information for someone to quickly determine if it pertains to their area of interest:
It's really a matter of triage; as it is, syzkaller folks are expecting that any mail from the bot will be looked into by everyone on fsdevel, on the off-chance that it's relevant for them. What's more, it's not just "read the mail" - information in the mail body is next to useless in such situations. [...] What really pisses me off is that on the sending side the required check is trivial - if you are going to fuzz a filesystem, put a note into report, preferably in subject. Sure, it's your code, you get to decide what to spend your time upon (you == syzkaller maintainers). But please keep in mind that for [recipients] it's a lot of recurring work, worthless for the majority of those who end up bothering with it. Every time they receive a mail from that source.
Ignore polite suggestions enough times, earn a mix of impolite ones and .procmailrc recipes, it's that simple...
Danton misunderstood what Viro was complaining about, but Matthew Wilcox tried to explain. The complaint is not that the linux-fsdevel list is being copied on the mail, but that the ntfs3 maintainers are not. Wilcox said: "So this is just noise. And enough noise means that signal is lost."
Viro agreed and painstakingly described exactly how he (and any other interested recipient of a syzbot report) would triage it, which eventually ends up at the syzkaller dashboard entry for the bug and its syzkaller reproducer. That file, which resembles "line noise", as Viro noted, does contain enough information to see that it was an ntfs3 filesystem that was being fuzzed. But that information is not in the email (or, better still, email subject), nor is it used to direct the report to the right people to look at it. The underlying problem is that the syzkaller/syzbot maintainers are not providing the relevant data, which should be easily obtained:
From what I've seen in various discussions, the assumption of syzkaller folks seems to be that most of the relevant information is in stack trace and that's sufficient for practical purposes - anything beyond that is seen as unwarranted special-casing. [...] Face it, the underlying assumption is broken - for a large class of reports the stack trace does not contain the relevant information. It needs to be augmented by the data that should be very easy to get for the bot. Sure, your code, your priorities, but reports are only useful when they are not ignored and training people to ignore those is a bad idea...
Ted Ts'o agreed, noting that he has been asking for improvements of this sort for several years. Syzbot "is not doing things that really could be done automatically --- and cloud VM time is cheap, and upstream maintainer time is expensive". In effect, the syzbot developers are not being respectful of upstream maintainers' time, he said. Things have been improving, but not in this particular area:
Now, to be fair to the Syzbot team, the Syzbot console has gotten much better. You can now download the syzbot trace, and download the mounted file system, when before, you had to do a lot more work to extract the file system (which is stored in separate constant C array's as compressed data) from the C reproducer. So have things have gotten better.
Marco Elver reported that the problem is being worked on by the syzbot project. He pointed to a bug report comment from syzkaller (and syzbot) creator Dmitry Vyukov that was posted at the end of November. It linked to yet another message from Viro complaining about the problem. Looking further at the bug comment thread makes it clear that progress is being made on identifying what to search for and on adding tags to email subject lines to identify which filesystem is being fuzzed.
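The sending-side check that Viro calls trivial can be illustrated with a short Python sketch. This is not syzbot's actual implementation; it assumes that a reproducer mounts its fuzzed image through a call named `syz_mount_image$<filesystem>`, and the function names `fuzzed_filesystems()` and `tag_subject()` are hypothetical, chosen only for this example.

```python
import re

# Assumption: a reproducer mounts a fuzzed image with a call of the form
# syz_mount_image$<fsname>(...). The real syzbot logic may key on other data.
MOUNT_CALL = re.compile(r"syz_mount_image\$(\w+)")


def fuzzed_filesystems(reproducer: str) -> list[str]:
    """Return the distinct filesystem names mounted by a reproducer, in order."""
    seen: list[str] = []
    for name in MOUNT_CALL.findall(reproducer):
        if name not in seen:
            seen.append(name)
    return seen


def tag_subject(subject: str, reproducer: str) -> str:
    """Prefix the subject with one [fs] tag per fuzzed filesystem, if any."""
    tags = "".join(f"[{fs}]" for fs in fuzzed_filesystems(reproducer))
    return f"{tags} {subject}" if tags else subject
```

With a tag like `[ntfs3]` at the front of the subject, a recipient could decide whether a report is relevant without opening the dashboard, and the same list of names could in principle be used to pick the maintainers to copy.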
The thread eventually went completely off the rails, including a message that seems likely to draw a response from the kernel code of conduct committee. The overall tone of the thread was unfortunate, at least in spots, but both Ts'o and Viro (especially the latter) spent a fair amount of time patiently reiterating the problems that have been raised multiple times along the way, albeit at a lower volume. Those requests did not go far, so, as Ts'o put it, "maybe something a bit more.... assertive by Al [Viro] is something that will inspire them to prioritize this feature request".
Fuzz testing generates a huge number of reports; for the testing to be effective, and thus useful, those reports have to be acted upon. Since that is the goal, it obviously makes sense to create reports that can be quickly routed to the right people. This is not the first time we have seen complaints about fuzzing reports, nor the first in a filesystem context, but hopefully we are on track to see improvements soon.
Index entries for this article | |
---|---|
Kernel | Development model/Bug reporting |
Kernel | Filesystems/Fuzzing |
Posted Dec 14, 2022 19:29 UTC (Wed)
by warrax (subscriber, #103205)
[Link] (18 responses)
Incredibly disrespectful.
Posted Dec 14, 2022 21:38 UTC (Wed)
by Kamiccolo (subscriber, #95159)
[Link]
Posted Dec 14, 2022 23:48 UTC (Wed)
by patrick_thomson (guest, #152863)
[Link]
Automated bug reports are only as good as the routing-to-humans procedure they undergo. Maintainers live and die by the signal/noise ratio in various project fora, and reducing that ratio irritates people, justifiably. While it’s not always great praxis to say “why don’t you just fix $BEHAVIOR with $STRATEGY,” it’s perfectly valid for maintainers to outline what kind of strategy would be useful for their purposes without being expected to fix the fuzzers themselves.
Posted Dec 15, 2022 0:02 UTC (Thu)
by linuxrocks123 (subscriber, #34648)
[Link] (15 responses)
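-------
Calm downnnnnn Sir even if this is not the east ender style.
Frankly no interest here at all wasting any network bandwidth just to get you interrupted if it would take less than 72 hours to discover one of the beatles you created. And actually more than double check is needed to ensure who did that.
-------
Seems likely to mean something like:
-------
Please calm down: aren't British people supposed to be polite?
I'd have no interest in bothering you if it would take less than 3 days to trace one of your bugs to you, but that's not the case, and it would take even more effort to ensure we'd actually traced the bug to the right person if we tried to do that analysis ourselves.
-------
Even "translated" it doesn't show the best attitude, but it makes a little more sense and it seems clear that Hillf didn't really understand what was being asked of him.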
Posted Dec 15, 2022 0:35 UTC (Thu)
by NYKevin (subscriber, #129325)
[Link] (14 responses)
(I have no idea whether Hillf is actually a Google engineer. syzkaller appears to be a Google-owned GitHub repo, so I'm guessing that at least some of the people who work on it are probably Googlers?)
Posted Dec 15, 2022 0:49 UTC (Thu)
by space (subscriber, #157761)
[Link] (1 responses)
Posted Dec 16, 2022 0:49 UTC (Fri)
by WolfWings (subscriber, #56790)
[Link]
Absolute bafflement that they're not part of any of the components in question and still acting like that to kernel devs.
Posted Dec 15, 2022 1:24 UTC (Thu)
by viro (subscriber, #7872)
[Link] (1 responses)
Marco Elver is one of the syzkaller developers, so's Alexander Potapenko. AFAICS, nobody else in that thread is.
Hillf sounds like a Team OS/2 refugee or an Amiga fanboy on a bad flashback, TBH...
Posted Dec 15, 2022 5:48 UTC (Thu)
by NYKevin (subscriber, #129325)
[Link]
You'd be surprised. I've encountered quite a few (secondhand) horror stories of upstreams behaving in exactly that fashion.
(My usual attitude towards these kinds of disputes tends to look pro-upstream, but it's really more pro-the-people-who-do-the-work-call-the-shots; if upstream doesn't want to make a change, they don't have to, but neither does downstream have to use/package/triage/etc. upstream's work. It would obviously be preferable if everyone got along.)
Posted Dec 15, 2022 13:39 UTC (Thu)
by Wol (subscriber, #4433)
[Link] (5 responses)
But you're imposing your workflow on other people ...
I make no judgement as to whether the tools are "good" or "bad" - different people have different definitions, and personally I'd dump both the browser and gmail in the "bad" category, but just because you are quite happy with your tools, doesn't mean that other people can function efficiently with them.
For grey-beards, it's likely that the console and MUA are the same tool (emacs).
I have to work with gmail, slack, and Excel/VBA. When I'm doing "real work" it's almost invariably in Excel's VBA window, and even the context switch to Excel proper is a productivity-damaging one. gmail and slack pretty much get ignored.
I don't think it really matters what your tools are, what matters is that (if possible) you are working with a favourite tool, and that you can minimise the number of times you switch between tools.
And another thing the young bucks don't realise, is that the reason greybeards stick to their favourite tools is that OLDER PEOPLE DON'T LEARN SO FAST. As a greybeard myself, I don't give a monkeys what tools other people prefer (so long as it doesn't impact on me), but I can do the job with MY tools of choice in a fraction of the time it would take if *I* used *SOMEONE*ELSE'S* tools of choice. I like to think I could do it faster than them with their choice, but that's arrogance :-)
(And all the evidence says that people "who can multi-task" are actually much less productive than people who can sit and concentrate without interruption. Case in point, we had something kick off at work yesterday morning, by yesterday evening I had digested the problem, worked out a fix, and rolled it out for testing. My bosses were shocked how fast it was done. But I was allowed to *concentrate*!)
Cheers,
Wol
Posted Dec 15, 2022 17:58 UTC (Thu)
by SLi (subscriber, #53131)
[Link] (4 responses)
That's not to say that I, too, wouldn't find the problem complained of here very real and the response completely impossible to understand or accept.
Posted Dec 15, 2022 19:14 UTC (Thu)
by viro (subscriber, #7872)
[Link] (2 responses)
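-----------------------------------
Gmail (Web GUI)
***************
Does not work for sending patches.
Gmail web client converts tabs to spaces automatically.
At the same time it wraps lines every 78 chars with CRLF style line breaks although tab2space problem can be solved with external editor.
Another problem is that Gmail will base64-encode any message that has a non-ASCII character. That includes things like European names.
-----------------------------------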
Might or might not be accurate these days, but that pretty much means "non-starter for kernel development, use their IMAP interface with a real MUA if you are forced to use a gmail account". Has nothing to do with the age, etc. - see the mentioned files for the real reasons. If gmail web client has solved these problems nowadays, a patch to Documentation/process/email-clients.rst along the lines of "here's how to set it up so it would do the right thing" would be welcome...
Posted Dec 15, 2022 21:33 UTC (Thu)
by SLi (subscriber, #53131)
[Link] (1 responses)
Look, I'm no fan of all the new ways of doing things either. Rather, I just want to point out that "it disrupts my/our way of working" only goes so far. It goes a lot further, of course, in communities where lots of people want to do it like that (or there's people with sufficient power to just say this is how we do it), but even that has its limits...
Posted Dec 16, 2022 4:27 UTC (Fri)
by neilbrown (subscriber, #359)
[Link]
True. But when "you" are asking "me" to respond to your bug report, "you" need to make that worth my while. So either a REALLY important bug, or a report that is REALLY easy to work with (or lots of dollars).
Posted Dec 15, 2022 23:25 UTC (Thu)
by linuxrocks123 (subscriber, #34648)
[Link]
It uses Aaron Swartz's html2text library to convert it to plaintext. It actually works pretty well.
Posted Dec 16, 2022 2:03 UTC (Fri)
by rgmoore (✭ supporter ✭, #75)
[Link] (3 responses)
My reading says the problem with the different tools is only the tip of the iceberg. Yes, the need to switch tools is annoying, but the underlying problem is that relevant information isn't being included up-front. The fuzz testers know which filesystem they were testing, and the information is present in the dashboard entry that has the more detailed information. But instead of including that very basic information in the title of the email, they make maintainers dig through the dashboard entry to figure it out. It's just massively inefficient and makes people much less inclined to pay attention. If they would just do the absolute most basic thing, like saying in the email title they generated the bug while testing fuzzed NTFS images, it would make life much easier.
Posted Dec 16, 2022 2:34 UTC (Fri)
by NYKevin (subscriber, #129325)
[Link] (2 responses)
(This is not meant to excuse it, merely to commiserate with other people being subjected to it.)
Posted Dec 16, 2022 17:44 UTC (Fri)
by rgmoore (✭ supporter ✭, #75)
[Link]
An email that points you to the information might be OK if you can at least have some confidence it's about your project. It's far worse when it's an email to a whole mailing list, most of whom aren't involved. Expecting people to dig through a convoluted process just to figure out if the email is even relevant to them is just ridiculous.
Posted Dec 16, 2022 19:19 UTC (Fri)
by Wol (subscriber, #4433)
[Link]
And if it's work, I just dash off a reply saying "please provide the following extra info ..." - if I don't get a response it just disappears down the priorities :-) And if I do get a response well, once you've actually got buy-in from the other end, things usually end up well :-)
Cheers,
Wol
Posted Dec 15, 2022 3:25 UTC (Thu)
by flussence (guest, #85566)
[Link] (1 responses)
I've got my RSS reader pointed at my distro's bugzilla because that turns out to be a good way of staying ahead of upcoming problems, and on a good day it's very low volume. On a *bad* day... a flood of hundreds of automated, mostly identical, and dubious-usefulness QA bugs from two different users' buildbots (and an enormous amount of them end up as wontfix landfill after developers waste their time triaging them).
Posted Dec 20, 2022 8:56 UTC (Tue)
by thoeme (subscriber, #2871)
[Link]