LWN: Comments on "How to ruin Linus's vacation" https://lwn.net/Articles/452117/ This is a special feed containing comments posted to the individual LWN article titled "How to ruin Linus's vacation". Sat, 01 Nov 2025 09:55:09 +0000 How to ruin Linus's vacation https://lwn.net/Articles/453236/ https://lwn.net/Articles/453236/ pebolle <div class="FormattedComment"> No, it's not related. That bug has to do - as far as I can tell right now - with lockdep noticing there's recursive locking going on but not being told that this is a case of expected nested locking. Well, I think the nesting is expected ...<br> <p> Anyhow, the message lockdep prints, includes the neat line:<br> *** DEADLOCK ***<br> <p> It took me a few days before I noticed that the kernel actually still was running quite OK after that disturbing message.<br> <p> By the way, the reason you ran into this recently seems to be that systemd-30 is (apparently) the first program in wide use that triggers this.<br> </div> Thu, 28 Jul 2011 17:12:40 +0000 How to ruin Linus's vacation https://lwn.net/Articles/452841/ https://lwn.net/Articles/452841/ Lennie <div class="FormattedComment"> I just knew the "janitor effect".<br> <p> I presume the name comes from working late because you can not figure out a problem. 
Then the janitor comes and he can see you are kind of frustrated and asks what the problem is.<br> <p> You try to explain it to the janitor in simple terms and then you understand your problem and can fix it.<br> <p> Then you go home happy I guess.<br> <p> The ducky is just a way to try and induce that state of mind with an inanimate object where you are trying to explain the problem to someone else in simple terms.<br> </div> Mon, 25 Jul 2011 18:08:39 +0000 How to ruin Linus's vacation https://lwn.net/Articles/452812/ https://lwn.net/Articles/452812/ bronson <div class="FormattedComment"> It was obviously a joke.<br> </div> Mon, 25 Jul 2011 16:23:16 +0000 How to ruin Linus's vacation https://lwn.net/Articles/452783/ https://lwn.net/Articles/452783/ mlankhorst <div class="FormattedComment"> <font class="QuotedText">&gt; It's all Hugh's fault. </font><br> Blame the messenger, not the person who originally introduced the bug?<br> </div> Mon, 25 Jul 2011 08:57:14 +0000 How to ruin Linus's vacation https://lwn.net/Articles/452709/ https://lwn.net/Articles/452709/ neilbrown <div class="FormattedComment"> I hadn't heard of it either .. but search engines are our friends.<br> <p> <a href="http://lmgtfy.com/?q=Rubber+Duckie+Test">http://lmgtfy.com/?q=Rubber+Duckie+Test</a><br> <p> Pick the wikipedia link on Rubber Duck Debugging.<br> <p> </div> Sat, 23 Jul 2011 21:38:57 +0000 How to ruin Linus's vacation https://lwn.net/Articles/452672/ https://lwn.net/Articles/452672/ smurf <div class="FormattedComment"> *familiar. 
*grumble*<br> </div> Sat, 23 Jul 2011 09:45:58 +0000 How to ruin Linus's vacation https://lwn.net/Articles/452634/ https://lwn.net/Articles/452634/ smurf <div class="FormattedComment"> IMHO you mean "This is *why …".<br> <p> Please enlighten us what this test means; not everybody is damiliar with the idiom.<br> </div> Fri, 22 Jul 2011 19:58:22 +0000 How to ruin Linus's vacation https://lwn.net/Articles/452507/ https://lwn.net/Articles/452507/ fuhchee <i>"there is an arbitrarily complex state Y which is transformed into the almost-identical state Y', and the only relevant difference between Y and Y' is X"</i><p> That sounds like the AI Frame Problem. Thu, 21 Jul 2011 23:34:52 +0000 How to ruin Linus's vacation https://lwn.net/Articles/452464/ https://lwn.net/Articles/452464/ nas <div class="FormattedComment"> This is way the "Rubber Duckie Test" is effective.<br> </div> Thu, 21 Jul 2011 19:19:13 +0000 How to ruin Linus's vacation https://lwn.net/Articles/452367/ https://lwn.net/Articles/452367/ dgm <div class="FormattedComment"> I have another quote for you:<br> <p> "Oh Lord it's hard to be humble<br> when you're perfect in every way.<br> I can't wait to look in the mirror<br> cause I get better looking each day.<br> To know me is to love me<br> I must be a hell of a man.<br> Oh Lord it's hard to be humble<br> but I'm doing the best that I can."<br> <p> -- Mac Davis<br> <p> ;-)<br> </div> Thu, 21 Jul 2011 08:34:55 +0000 How to ruin Linus's vacation https://lwn.net/Articles/452336/ https://lwn.net/Articles/452336/ bfields Yeah. Especially for CS students, something like the classic first point-set topology course might give good experience with that kind of proof-or-counterexample mode of problem solving. I think that's rare, unfortunately, at least outside a few countries with very rigorous math programs? <p>(Like some others, I'm a refugee from mathematics, coming late to this after getting a PhD (commutative algebra and some algebraic topology). 
Not a particularly smart career path, but fun in its own way.) Wed, 20 Jul 2011 23:02:47 +0000 How to ruin Linus's vacation https://lwn.net/Articles/452243/ https://lwn.net/Articles/452243/ viro <div class="FormattedComment"> Yeah, well... you forgot to add "and actually read" to conditions... Exhibit A: people adding hardlinks to directories or equivalents thereof, despite the aforementioned example of documentation ;-/<br> <p> We do need such writeups, of course. If nothing else, writing them tends to find holes - see e.g. -&gt;d_lock mess discussion on fsdevel lately. There the locking order had been fscked in head (not transitive, for one thing), but locks outside of that set had mostly avoided bad trouble. Trying to write the proof of correctness hadn't been fun (and what I've got still relies on unverified assumptions about the things filesystem code does not do; verifying those has already caught a bunch of really broken things), but it helped to catch rather nasty stuff. Simply by reasoning about the properties of counterexample - i.e. "what would a deadlock have to look like". It's math, like any other...<br> <p> FWIW, I wonder what backgrounds people have - in my case, it's geometry and topology and _that_ has certainly helped to acquire many mental habits useful for that kind of work...<br> </div> Wed, 20 Jul 2011 04:05:55 +0000 How to ruin Linus's vacation https://lwn.net/Articles/452240/ https://lwn.net/Articles/452240/ bfields <p>Well, it's also true that you can't write purely "formal" proofs for most mathematical theorems. And yet, mathematics gets done, because people can write perfectly good proofs in ordinary language. <p>And in fact anyone that writes non-trivial code probably does form in their head at least a hand-wavy proof of its correctness. If those actually got written down, it would probably help clarify thinking and avoid some bugs. But that doesn't happen for the same reason that nobody writes documentation. 
<p>An example of an exception: <a href="http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=blob;f=Documentation/filesystems/directory-locking;h=ff7b611abf330d11b8a5aef416e06106a39abbca;hb=3a5c3743f15f27237ab025736a981e2d0c9fdfed">Documentation/filesystems/directory-locking</a>. Wed, 20 Jul 2011 03:24:49 +0000 How to ruin Linus's vacation https://lwn.net/Articles/452238/ https://lwn.net/Articles/452238/ jzbiciak <div class="FormattedComment"> Indeed.<br> <p> This is also why I find debugging with print and assert statements to be much more helpful than single stepping, in the vast majority of cases. Single stepping was more useful to me when I didn't necessarily understand all the language constructs. That was 20-25 years ago, though.<br> <p> Nowadays, if I have some weird bug, it's because something I think is true is not, or some other "invisible-to-me" error. A judiciously placed print statement (perhaps guarded by an 'if' to filter out the noise) makes it easy for me to prove or disprove my assumptions about what's happening in the code, though, without disrupting any of the code around it. I can compare working cases to non-working cases easily and in a batch-wise manner after the fact. Very useful.<br> </div> Wed, 20 Jul 2011 02:44:51 +0000 How to ruin Linus's vacation https://lwn.net/Articles/452237/ https://lwn.net/Articles/452237/ elanthis <div class="FormattedComment"> This is the same reason why authors of prose have editors. Your internalized assumptions make it so your eyes can pass right over "obviously" bad constructs and not notice.<br> <p> It's also why "check your assumptions" is the first rule of debugging I teach to new coders. :)<br> </div> Wed, 20 Jul 2011 02:25:38 +0000 How to ruin Linus's vacation https://lwn.net/Articles/452231/ https://lwn.net/Articles/452231/ neilbrown <div class="FormattedComment"> I have a lot of experience hunting bugs that make no sense, sometimes in my own code and sometimes in other people's. 
I usually find it harder to find those bugs in my own code, and it certainly isn't because my code is particularly "clever".<br> <p> The common pattern that I find in "bugs that make no sense" is that I am believing something that isn't true. Once I identify and challenge that belief, the bug becomes obvious. The reason that these bugs are harder to find in my own code is because I believe my code is correct (in general, and in lots of specifics), so I don't question it so closely. When I'm reading other people's code I have less belief that it is correct, so less hindrance to finding the bugs.<br> <p> Put another way: finding a bug in my code is both a success and an admission of failure. Finding a bug in someone else's code is pure success. This sounds like a strong case for pair-programming!<br> <p> </div> Wed, 20 Jul 2011 00:26:54 +0000 How to ruin Linus's patch https://lwn.net/Articles/452227/ https://lwn.net/Articles/452227/ viro <div class="FormattedComment"> No, you are right - the check is BUG_ON() misspelled. umount *can't* happen<br> at that point - we are holding vfsmount_lock all the way through RCU<br> walk.<br> </div> Tue, 19 Jul 2011 23:47:52 +0000 How to ruin Linus's patch https://lwn.net/Articles/452211/ https://lwn.net/Articles/452211/ neilbrown <div class="FormattedComment"> I'm trying to understand Linus' patch.<br> <p> Moving the assignment to *inode from the start to the end makes lots of sense.<br> <p> The other mucking about with seqcount doesn't make any sense to me at all.<br> <p> What exactly is the read_seqcount_retry on path-&gt;dentry-&gt;d_seq trying to protect? 
That dentry is the mounted-on dentry so it is pinned and cannot be renamed or deleted.<br> <p> About the only thing I can think of that might need to be protected against at this point is the mount being unmounted - but if that were the goal of the code I would have expected a comment to that effect, and it doesn't seem like the right place for it anyway..<br> <p> Any ideas?<br> <p> </div> Tue, 19 Jul 2011 23:18:30 +0000 How to ruin Linus's vacation https://lwn.net/Articles/452219/ https://lwn.net/Articles/452219/ Ben_P <div class="FormattedComment"> It's also important to note that when working in languages with the freedom that C gives you, determining accurate lifetimes and reference counts can be hellish if very strict coding conventions are not adhered to.<br> </div> Tue, 19 Jul 2011 23:03:04 +0000 How to ruin Linus's vacation https://lwn.net/Articles/452217/ https://lwn.net/Articles/452217/ dgm <div class="FormattedComment"> Maybe. But when you find yourself hunting that bug that makes no sense, you should question if you were as clever as you thought when writing the code.<br> </div> Tue, 19 Jul 2011 22:55:37 +0000 How to ruin Linus's vacation https://lwn.net/Articles/452210/ https://lwn.net/Articles/452210/ neilbrown <div class="FormattedComment"> Yes, it is a great quote. But experience shows that it is wrong ... so where is the flaw?<br> <p> I think the reality is that writing really clever code actually makes you smarter - by solving a difficult problem you can learn something useful. So having written (or just read and understood) really clever code, you become smart enough to at least make an attempt at debugging it.<br> <p> Maybe.<br> <p> </div> Tue, 19 Jul 2011 22:33:31 +0000 How to ruin Linus's vacation https://lwn.net/Articles/452209/ https://lwn.net/Articles/452209/ dgm <div class="FormattedComment"> While reading the article I had this quote in my mind all the time:<br> <p> "Debugging is twice as hard as writing the code in the first place. 
Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it."<br> <p> — Brian W. Kernighan and P. J. Plauger in The Elements of Programming Style.<br> </div> Tue, 19 Jul 2011 22:22:39 +0000 How to ruin Linus's vacation https://lwn.net/Articles/452207/ https://lwn.net/Articles/452207/ smoogen <div class="FormattedComment"> I wonder if this is anything related to a set of "crashes" I have had with the 3.0.0 kernels (the 2.6.39 works fine so maybe not)<br> <p> <a href="https://bugzilla.redhat.com/show_bug.cgi?id=722472">https://bugzilla.redhat.com/show_bug.cgi?id=722472</a><br> <p> I noticed that at least in rc7 it came out with a new error message versus slowing down until the deadlock indicator starts spewing "hard-locks"<br> <p> <p> </div> Tue, 19 Jul 2011 22:13:46 +0000 How to ruin Linus's vacation https://lwn.net/Articles/452205/ https://lwn.net/Articles/452205/ tshow <div class="FormattedComment"> The trouble with formal proofs is that the logical errors and bad assumptions made when writing the code tend to be propagated into the formal proofs, where they typically remain equally unnoticed. Or worse, the proof is fine, but the code deviates in subtle ways due to abstraction mismatches and model simplification.<br> <p> At least as I understand the state of the art in formal code proofs, the only place they work is places where they aren't particularly useful; places where things like timing, instruction reordering, hardware errors and concurrency are not considerations.<br> </div> Tue, 19 Jul 2011 21:57:38 +0000 How to ruin Linus's vacation https://lwn.net/Articles/452197/ https://lwn.net/Articles/452197/ kleptog <div class="FormattedComment"> I'm not sure about complete proofs, but we do need better tools for testing certain assertions with respect to race conditions.<br> <p> It happens frequently that a cache introduced for performance actually contains some race condition. 
One way of checking this is to make the cache throw away entries almost immediately, this has a way of stress testing certain failure modes. I wonder if that would have helped here.<br> <p> In any case, testing assertions in the face of race conditions is something we could all use. There is software proving software that tries but the false positive rate is still too high. I truly hope that in the future we will have good tools for this, because complexity is only getting worse.<br> </div> Tue, 19 Jul 2011 20:54:09 +0000 How to ruin Linus's vacation https://lwn.net/Articles/452192/ https://lwn.net/Articles/452192/ smurf <div class="FormattedComment"> Unfortunately, any VFS operation needs to be formally described not as in "X happens", but as in "there is an arbitrarily complex state Y which is transformed into the almost-identical state Y', and the only relevant difference between Y and Y' is X". (There may be non-relevant differences, e.g. cache state.)<br> <p> Add in the fact that the kernel is reentrant (formal descriptions for concurrent processes? Dream on) and has the aforementioned caching and RCU and whatnot (so for each file there are multiple valid pre- and postconditions), and you're in for a _real_ treat.<br> <p> I very much doubt that anybody can manage this for any specific non-trivial testcase, much less in general.<br> </div> Tue, 19 Jul 2011 20:48:16 +0000 How to ruin Linus's vacation https://lwn.net/Articles/452186/ https://lwn.net/Articles/452186/ cesarb <div class="FormattedComment"> <font class="QuotedText">&gt; The behavior of the dentry cache is, at this point, so subtle that even the combined brainpower of developers like Linus, Al, and Hugh has a hard time figuring out what is going on. [...] 
But if we reach a point where almost nobody can understand, review, or fix some of our core code, we may be headed for long-term trouble.</font><br> <p> I wonder if we are reaching the point where we would need to write formal proofs for parts of the kernel code.<br> </div> Tue, 19 Jul 2011 20:00:22 +0000