LWN: Comments on "The pernicious USB-stick stall problem" https://lwn.net/Articles/572911/ This is a special feed containing comments posted to the individual LWN article titled "The pernicious USB-stick stall problem". en-us Wed, 08 Oct 2025 09:25:29 +0000 Wed, 08 Oct 2025 09:25:29 +0000 https://www.rssboard.org/rss-specification lwn@lwn.net The pernicious USB-stick stall problem https://lwn.net/Articles/938015/ https://lwn.net/Articles/938015/ juliano_vs <div class="FormattedComment"> 10 years later, this is still causing headaches for many users.<br> I keep imagining a new user coming to Linux and facing this problem, having to wait 2 hours to unmount a USB 2.0 stick and then having to go looking for a manual solution on Google.<br> In my case I am not a new user, but I had this problem and ended up finding a solution; now I can unmount the pendrive as soon as the copy progress bar ends (just like in Windows).<br> The fix:<br> create the file: <br> /etc/udev/rules.d/60-usb-dirty-pages-udev.rules<br> <p> with the following content:<br> ACTION=="add", KERNEL=="sd[a-z]", SUBSYSTEM=="block", ENV{ID_USB_TYPE}=="disk", RUN+="/usr/bin/bash -c 'echo 1 &gt; /sys/block/%k/bdi/strict_limit; echo 16777216 &gt; /sys/block/%k/bdi/max_bytes'"<br> <p> then restart your machine!<br> <p> After so many years this should already have a definitive solution in the kernel and not need manual intervention by the user.<br> It's this kind of thing that keeps new users away from the system.<br> </div> Wed, 12 Jul 2023 19:43:21 +0000 The pernicious USB-stick stall problem https://lwn.net/Articles/866970/ https://lwn.net/Articles/866970/ pizza <div class="FormattedComment"> <font class="QuotedText">&gt; Do I have to become a Linux Dev so I can fix this?</font><br> <p> Patches welcome!<br> </div> Sat, 21 Aug 2021 01:25:11 +0000 The pernicious USB-stick stall problem https://lwn.net/Articles/866944/ https://lwn.net/Articles/866944/ xmready <div class="FormattedComment"> 8 years
later this is still an issue. Do I have to become a Linux Dev so I can fix this?<br> </div> Fri, 20 Aug 2021 20:10:41 +0000 The pernicious USB-stick stall problem https://lwn.net/Articles/853500/ https://lwn.net/Articles/853500/ flussence <div class="FormattedComment"> Windows neatly sidesteps the entire problem by making file IO CPU-bound via malware scanners.<br> <p> More serious answer: has anyone benchmarked the bufferbloat in writing directly to a USB stick compared to spinning up a VM, handing the USB device to that, exposing the stick as an NFS share and writing that way? I honestly wouldn&#x27;t be surprised if the latter works better.<br> </div> Tue, 20 Apr 2021 00:42:48 +0000 The pernicious USB-stick stall problem https://lwn.net/Articles/853226/ https://lwn.net/Articles/853226/ LaurentD <div class="FormattedComment"> Thanks for sharing the issue and all the information.<br> <p> That said, I concur. 2021 already and still seeing the issue here as well. The below does not seem to help much.<br> <p> echo $((16*1024*1024)) &gt; /proc/sys/vm/dirty_background_bytes<br> echo $((48*1024*1024)) &gt; /proc/sys/vm/dirty_bytes<br> <p> I wonder: how have other OSs addressed the issue?<br> </div> Sun, 18 Apr 2021 13:19:12 +0000 The pernicious USB-stick stall problem https://lwn.net/Articles/848264/ https://lwn.net/Articles/848264/ kolay.ne <div class="FormattedComment"> Hello from 2021, lol<br> Still having this issue<br> </div> Fri, 05 Mar 2021 14:23:23 +0000 The pernicious USB-stick stall problem https://lwn.net/Articles/816833/ https://lwn.net/Articles/816833/ abdulla95 <div class="FormattedComment"> Running the lines:<br> <p> echo $((16*1024*1024)) &gt; /proc/sys/vm/dirty_background_bytes<br> echo $((48*1024*1024)) &gt; /proc/sys/vm/dirty_bytes<br> <p> Didn't help me. The performance did improve but it would still lag. (I have a 1TB HDD and 8GB RAM)<br> <p> My question is, is using a hack to go around this a good thing? Like `ionice`, `rsync`, `pv`? 
I have seen these being thrown around on the internet. And I have used rsync and it works.<br> </div> Sun, 05 Apr 2020 16:10:35 +0000 The pernicious USB-stick stall problem https://lwn.net/Articles/771011/ https://lwn.net/Articles/771011/ sourcejedi <p>"The entire system proceeds to just hang" - I think this is misleading :-(. Artem didn't report this, and I don't see any other evidence presented for it. <p>I am hopeful that it is prevented, or at least mitigated, by the <a rel="nofollow" href="https://lwn.net/Articles/456904/">"No-I/O dirty throttling" code</a> that you reported on in 2011 :-). This throttles write() calls to control both the size of the overall writeback cache, and the amount of writeback cache *for the specific backing device*. <p>Artem did not report the entire system hanging while it flushes cached writes to a USB stick. His report only complained that the "sync" command could take up to "dozens of minutes". <p>In his followup message, Artem reported "the server almost stalls and other IO requests take a lot more time to complete even though `mysqldump` is run with `ionice -c3`". But this was not the USB-stick problem. It happened after creating a 10GB file on an *internal* disk. <p>I'm not saying there isn't a bufferbloat-style problem. But I can't find any evidence here that excessive writeback cache on one BDI is delaying writes to other BDIs, at least in the simple case you described. <p>I wrote a StackExchange post about this <a rel="nofollow" href="https://unix.stackexchange.com/questions/480399/why-were-usb-stick-stall-problems-reported-in-2013-why-wasnt-this-problem-al/">here</a>.
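<p>To make the "for the specific backing device" part concrete, here is a deliberately simplified toy model, not the kernel's actual code. It assumes each backing device gets a slice of the global dirty limit in proportion to its recent share of completed writeback; the function name, the device names and the numbers are all hypothetical.

```python
def bdi_dirty_limits(global_limit, writeout_rates):
    """Split a global dirty limit among devices by recent writeout share.

    Toy model only: the real kernel logic also applies per-device
    min/max ratios and smooths the writeout fractions over time.
    """
    if not writeout_rates:
        return {}
    total = sum(writeout_rates.values())
    if total == 0:
        # No recent writeback anywhere: split evenly (arbitrary choice here).
        share = global_limit // len(writeout_rates)
        return {name: share for name in writeout_rates}
    return {
        name: global_limit * rate // total
        for name, rate in writeout_rates.items()
    }


# Hypothetical numbers: a 1600 MiB global limit, an internal disk that
# completed 95 units of writeback recently and a USB stick that did 5.
limits = bdi_dirty_limits(1600, {"sda": 95, "usb": 5})
print(limits)
```

In this model the slow stick is only allowed a small slice of the dirty pages, so a writer to the stick is throttled long before it can fill the whole cache.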
Wed, 07 Nov 2018 21:11:45 +0000 The pernicious USB-stick stall problem https://lwn.net/Articles/648079/ https://lwn.net/Articles/648079/ paulj <div class="FormattedComment"> How does that work?<br> </div> Sat, 13 Jun 2015 08:04:14 +0000 The pernicious USB-stick stall problem https://lwn.net/Articles/648060/ https://lwn.net/Articles/648060/ evultrole <div class="FormattedComment"> I've been fighting with this problem for the last year, and I had no luck with changing any of these. Came across this article many times while seeking an answer, so thought I'd leave what eventually worked for me. <br> <p> Got the problem fixed with a custom udev rule.<br> <p> /usr/lib/udev/rules.d/81-udisks_maxsect.rules<br> <p> SUBSYSTEMS=="scsi", ATTR{max_sectors}=="240", ATTR{max_sectors}="32678"<br> <p> My hangs disappeared after a reboot.<br> </div> Fri, 12 Jun 2015 20:41:01 +0000 The pernicious USB-stick stall problem https://lwn.net/Articles/579580/ https://lwn.net/Articles/579580/ jospoortvliet <div class="FormattedComment"> because when the IO subsystem gets clogged, other processes don't get anything either. 
It just becomes a huge mess with everything stalling.<br> </div> Wed, 08 Jan 2014 10:26:22 +0000 The pernicious USB-stick stall problem https://lwn.net/Articles/574087/ https://lwn.net/Articles/574087/ mathstuf <div class="FormattedComment"> Two other situations I've seen which may be related:<br> <p> - Heavy output on any local terminal (switching workspaces or to another tmux window makes it work better, so closer to the X side of the I/O pipeline; the switch might not take effect for 20–30 seconds though)<br> - rdesktop (I have to hide Windows' TTY window(s) during a build since its speed affects my local machine even when on another, hidden, desktop locally)<br> </div> Sun, 17 Nov 2013 04:25:26 +0000 The pernicious USB-stick stall problem https://lwn.net/Articles/573976/ https://lwn.net/Articles/573976/ HenrikH <div class="FormattedComment"> I wonder if this also can be responsible for the problems that I see with XTerm+SSH when lots and lots of lines of text are sent from a server to my ssh client. When that happens (often due to a grep done wrong) my whole machine freezes until all rows have been transferred.<br> </div> Fri, 15 Nov 2013 21:02:26 +0000 The pernicious USB-stick stall problem https://lwn.net/Articles/573919/ https://lwn.net/Articles/573919/ Nagilum <div class="FormattedComment"> That's right I'm not a full time developer.<br> Anyway if nothing else is waiting for IO on that disk then it still wouldn't bother you very much since it won't block anything.<br> Anyhow you have your use-case solved.<br> </div> Fri, 15 Nov 2013 14:03:01 +0000 The pernicious USB-stick stall problem https://lwn.net/Articles/573917/ https://lwn.net/Articles/573917/ Wol <div class="FormattedComment"> <font class="QuotedText">&gt; There is usually only a very slim chance that data will be written that will be deleted right away again</font><br> <p> You're obviously not a developer (or gentoo user). 
I have a huge (20-30GB) ramdisk for temp precisely because I quite often have gigs of data that get created and deleted pretty quickly. What's the point of writing it to disk when my system has plenty of RAM?<br> <p> Cheers,<br> Wol<br> </div> Fri, 15 Nov 2013 13:40:57 +0000 The pernicious USB-stick stall problem https://lwn.net/Articles/573908/ https://lwn.net/Articles/573908/ jezuch <div class="FormattedComment"> My question is this: why is writing to one slow device blocking the entire system and not just the process that is doing the writing?<br> <p> I remember there was an issue with page locking, I think, some time ago that was solved with stable pages. But it still doesn't make much sense.<br> </div> Fri, 15 Nov 2013 11:09:12 +0000 The pernicious USB-stick stall problem https://lwn.net/Articles/573849/ https://lwn.net/Articles/573849/ hummassa <div class="FormattedComment"> The problem here is "the knob works well, but the default value is not good for desktops", the solution being "when installing a desktop, remember to set the knob value to something saner" and you get the perfect case (as I have), that is, fast USB copies without hogging the CPU.<br> <p> Ah, and in my job I see hundreds of USB drives becoming totally hosed every year, by way of users writing to them on a Windows machine and removing them without unmounting. Don't do that, even on Windows.<br> </div> Thu, 14 Nov 2013 21:48:21 +0000 The pernicious USB-stick stall problem https://lwn.net/Articles/573836/ https://lwn.net/Articles/573836/ khim This is a classic case of "perfect is the enemy of good". Yes, Windows commits everything to the USB stick right away, yes it's slow and inefficient, but it also means that you can actually use USB sticks without worry! With Linux small operations are extremely fast, but try to copy a few gigs of data to a USB stick and be ready to use a different computer for a few minutes (or hours) because the system will be totally hosed.
Thu, 14 Nov 2013 19:51:09 +0000 The pernicious USB-stick stall problem https://lwn.net/Articles/573833/ https://lwn.net/Articles/573833/ apoelstra <div class="FormattedComment"> My understanding is that Windows basically commits everything to disk ASAP under the assumption that the user may pull the drive at any second and expect all the transferred data to be in place. (At least, all the data that the GUI says was transferred successfully.)<br> <p> One result of this is that average-case data transfer on Windows is noticeably slower than on Linux, which is probably why the kernel folks are loath to copy such a solution.<br> </div> Thu, 14 Nov 2013 19:34:15 +0000 The pernicious USB-stick stall problem https://lwn.net/Articles/573832/ https://lwn.net/Articles/573832/ chojrak11 <div class="FormattedComment"> Is this problem really *that* hard to solve? Does it need tweaking instead of proper solving? Well... then how is it possible that Windows doesn't experience it *at all*? One time I had 5 different USB sticks plugged into my laptop and copied things all over in various directions, while working with other programs at the same time with no noticeable slowdowns. When reading about such simple problems (the other example is mtime update for mmap'ed files), it is obvious that Windows is still light years ahead...<br> <p> </div> Thu, 14 Nov 2013 19:29:54 +0000 The pernicious USB-stick stall problem https://lwn.net/Articles/573766/ https://lwn.net/Articles/573766/ callegar <div class="FormattedComment"> I wonder if this can be the reason why Linux software RAID occasionally fails when external USB disks are used for the RAID and large data transfers are started. Any clue?<br> </div> Thu, 14 Nov 2013 10:02:37 +0000 The pernicious USB-stick stall problem https://lwn.net/Articles/573459/ https://lwn.net/Articles/573459/ jzbiciak <div class="FormattedComment"> It turns out that bufferbloat also applies to on-chip switch fabrics and the buffering you can find there.
<br> </div> Mon, 11 Nov 2013 05:32:53 +0000 COW filesystems and the stall problem https://lwn.net/Articles/573454/ https://lwn.net/Articles/573454/ giraffedata Yes, I was not responding to the point. <p> The actual point lost me, because I don't see the connection between the high-volume writing and reading you described and copy-on-write and the stalling of the playback. But I also don't know anything about mythtv or btrfs specifically. <p> I don't think there's anything inherent in COW that means that if you flush a large file write sooner you make more copies of tree data or some group of blocks, but I may have just totally missed the scenario you have in mind. Mon, 11 Nov 2013 04:16:18 +0000 COW filesystems and the stall problem https://lwn.net/Articles/573451/ https://lwn.net/Articles/573451/ jhhaller <div class="FormattedComment"> The point was not how mythtv handles this, but that using more aggressive memory pressure may cause COW filesystems to have lower performance by forcing large file writes to be flushed sooner than they currently are. The COW filesystem then has to create copies of at least the tree data and the last group of blocks originally written when the next bout of memory pressure forces another write. For my case, I could adjust the configuration of the directory holding the video files to disable the COW behavior, but that's harder to do for the general case. I suspect msync would cause the same problem for a COW filesystem as fsync.<br> </div> Mon, 11 Nov 2013 01:59:52 +0000 COW filesystems and the stall problem https://lwn.net/Articles/573450/ https://lwn.net/Articles/573450/ giraffedata fsync is a rather primitive way to cause the system not to keep memory needlessly dirty. <P> Lots of fadvise and madvise flags have been developed over the years; doesn't one of them get the writeback happening immediately?
I created a TV recorder based on a ca 2004 Linux kernel that used msync(MS_ASYNC) for that purpose (yes, I made it write the file via mmap just so I could use msync). <p> Of course, if the issue is that you think the recorder might not actually be able to keep up with the data arriving, and want to make sure when that happens it just drops TV data and doesn't cripple the rest of the system too, then you do need something synchronous like fsync. Mon, 11 Nov 2013 00:45:58 +0000 The pernicious USB-stick stall problem https://lwn.net/Articles/573449/ https://lwn.net/Articles/573449/ giraffedata I don't see how that will make things noticeably better. In fact, it will probably make it worse. <p> With the defaults, your system can write up to .2M (where M is the size of your memory) to the USB stick before the system slows to a crawl. Remember that not just the process writing to the stick must wait for writeback before it can dirty more pages - all processes must. With your proposed numbers, the crawl happens for writes to the stick as small as .05M. <p> Lowering dirty_background_ratio will give your system a probably imperceptible head start (and thus earlier finish) on writing all that data to the USB stick. The headstart will be the amount of time it takes to buffer .08M of writes (10% - 2%). <p> These workarounds using global parameters can help only in carefully constructed cases. To prevent a slow write to USB from affecting non-USB-writing processes, the kernel would need some kind of memory allocation scheme that distinguishes processes or distinguishes write speed of backing devices. 
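<p>The arithmetic above can be checked with a small sketch. It is a simplification: it applies the ratios to a single memory figure M, whereas the kernel applies them to dirtyable (free plus reclaimable) memory, and it uses a hypothetical machine with 8000 MiB of such memory. The function name is made up for illustration.

```python
def dirty_thresholds(mem_mib, background_ratio, dirty_ratio):
    """Return (background, hard) dirty thresholds in MiB.

    background: level at which background writeback starts.
    hard: level at which writers are made to wait.
    Simplified: treats mem_mib as the memory the ratios apply to.
    """
    background = mem_mib * background_ratio // 100
    hard = mem_mib * dirty_ratio // 100
    return background, hard


MEM_MIB = 8000  # hypothetical machine, chosen for round numbers

# Defaults discussed in the thread (10 and 20) versus the proposal (2 and 5).
default_bg, default_hard = dirty_thresholds(MEM_MIB, 10, 20)
tuned_bg, tuned_hard = dirty_thresholds(MEM_MIB, 2, 5)

# The "head start" is how much earlier, in buffered data, background
# writeback begins with the lower dirty_background_ratio (10% - 2% = 8%).
head_start = default_bg - tuned_bg

print(default_hard, tuned_hard, head_start)
```

With these numbers the hard limit drops from 1600 MiB (.2M) to 400 MiB (.05M), while the head start is only 640 MiB (.08M) worth of buffering time.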
Mon, 11 Nov 2013 00:34:39 +0000 The pernicious USB-stick stall problem https://lwn.net/Articles/573388/ https://lwn.net/Articles/573388/ raven667 <div class="FormattedComment"> This has become one of my new pet points, queuing theory as it applies to IO has a ton of overlap between Disk IO and Net IO and there should be frequent reminders to make sure that new research and techniques cross pollinate between the two different communities. Storage doesn't have the option to just drop requests and return IO errors like IP does to indicate contention but it can certainly block writers until latencies improve. The same algorithms with different tunables might be useful across subsystems.<br> </div> Sat, 09 Nov 2013 15:30:01 +0000 The pernicious USB-stick stall problem https://lwn.net/Articles/573375/ https://lwn.net/Articles/573375/ marcH <div class="FormattedComment"> <font class="QuotedText">&gt; It's a storage equivalent to the bufferbloat problem.</font><br> <p> Good analogy. It stops at the cure though: with TCP/IP it's all about dropping packets! Back-pressure is extremely rare in networking because of Head Of Line blocking.<br> <p> Speaking of Head Of Line blocking, I suspect the queues involved in this article don't make the difference between users, do they? In other words, someone writing a lot to a slow device will considerably slow other users, correct?<br> <p> (Yes, I do realize USB sticks don't tend to have a lot of concurrent users :-)<br> <p> </div> Sat, 09 Nov 2013 08:54:33 +0000 The pernicious USB-stick stall problem https://lwn.net/Articles/573369/ https://lwn.net/Articles/573369/ Nagilum <div class="FormattedComment"> This has been bugging me for years and my solution has almost always been:<br> vm.dirty_background_ratio=0<br> or some other very low number (0..5). Personally I see no reason to delay starting to write dirty data out other than power saving. 
There is usually only a very slim chance that data will be written that will be deleted right away again so delaying starting to flush the data to disk makes very little sense to me.<br> If you have multiple writers it may also help with the performance if you have a higher value here but high for me is something like 5.<br> <p> </div> Sat, 09 Nov 2013 06:58:28 +0000 The pernicious USB-stick stall problem https://lwn.net/Articles/573364/ https://lwn.net/Articles/573364/ naptastic <div class="FormattedComment"> As a "good" and not "perfect" solution, why not make dirty_background_ratio and dirty_ratio a percentage of some fixed value, and allow arbitrarily large* percentages?<br> <p> * - dirty_background_ratio is a signed 32-bit value. If our "fixed value" were 1MiB, you could specify up to 2TiB of buffer space in the future. When we get to 128-bit architectures, we might want to increase the size of dirty_background_ratio to accommodate larger buffers.<br> </div> Sat, 09 Nov 2013 04:25:09 +0000 The pernicious USB-stick stall problem https://lwn.net/Articles/573363/ https://lwn.net/Articles/573363/ naptastic <div class="FormattedComment"> Close! 
Put the values in /etc/sysctl.conf.<br> </div> Sat, 09 Nov 2013 04:07:25 +0000 The pernicious USB-stick stall problem https://lwn.net/Articles/573315/ https://lwn.net/Articles/573315/ ssam <div class="FormattedComment"> So does this mean if you have the issue of everything being slow and laggy while writing large files to USB you should look at the values in<br> <p> cat /proc/sys/vm/dirty_background_ratio<br> cat /proc/sys/vm/dirty_ratio<br> <p> (I get 10 and 20 on fedora), and then change them to smaller numbers with something like<br> <p> echo 2 &gt; /proc/sys/vm/dirty_background_ratio<br> echo 5 &gt; /proc/sys/vm/dirty_ratio<br> <p> in your /etc/rc.d/rc.local<br> </div> Fri, 08 Nov 2013 16:12:32 +0000 The pernicious USB-stick stall problem https://lwn.net/Articles/573290/ https://lwn.net/Articles/573290/ neilbrown <div class="FormattedComment"> <font class="QuotedText">&gt; Seriously? I bet that no one (especially the sysctl tool) gets this right. </font><br> <p> Guess what. The kernel doesn't even get it right!! Almost but not quite.<br> <p> There is a global variable "ratelimit_pages" which is effectively a granularity - we only do the expensive tests every "ratelimit_pages" pages.<br> <p> This gets updated whenever you set dirty_ratio or dirty_bytes. 
It is set to dirty_thresh / (num_online_cpus() * 32)<br> <p> However if you set "dirty_bytes" and then "dirty_ratio", the second calculation of ratelimit_pages will be based on the old "dirty_bytes" value, not the new "dirty_ratio" value.<br> <p> It's a minor bug, but it confirms your assertion that this is an easy interface to get wrong.<br> <p> [dirty_ratio_handler() should set "vm_dirty_bytes = 0" *before* the call to writeback_set_ratelimit()]<br> <p> </div> Fri, 08 Nov 2013 06:42:38 +0000 The pernicious USB-stick stall problem https://lwn.net/Articles/573288/ https://lwn.net/Articles/573288/ eru <i>I'm sure that some users have such an HDD plugged in all the time and they would object to the performance degradation.</i> <p> I am myself one of those, finding that the easiest way to expand storage on my old home PC. But I think that is still exceptional, and could be solved by passing a parameter when mounting (in fstab? I have those drives mounted the traditional way, instead of hot-plugging), or later via some tuning command. Fri, 08 Nov 2013 03:39:48 +0000 COW filesystems and the stall problem https://lwn.net/Articles/573280/ https://lwn.net/Articles/573280/ jhhaller <div class="FormattedComment"> This change may have interesting effects on COW filesystems like btrfs. I have been playing with an interesting combination of mythtv and btrfs. When recording HDTV, mythtv writes about 6GB of data per hour per channel being recorded, or about 1.7MB/s (roughly 13Mbit/s). Because of the cache/stall effects, mythtv calls fsync once per second, to avoid filling the cache and causing the problems mentioned in the article. While watching live TV, the program is written to disk by one process, and read by another, in a multimedia version of less+tail -f. On some occasions, when fsync runs, there is more than just a little bit of disk I/O; my guess is that the COW semantics are causing more than just the recently written data to be written.
I wouldn't otherwise notice this, but the reading process hangs (causing the displayed TV program to hang) until the fsync completes, which can take seconds.<br> <p> My question is whether writing dirty pages back more quickly will have a big effect on COW filesystem performance. Copying a large file to a COW filesystem may trigger more COW actions than before.<br> </div> Thu, 07 Nov 2013 23:43:52 +0000 The pernicious USB-stick stall problem https://lwn.net/Articles/573260/ https://lwn.net/Articles/573260/ nybble41 <div class="FormattedComment"> Exactly. I would have expected that both limits apply, with the actual limit determined by whichever is lower. In that case the fix would be easy: set a sensible default for the byte limit for large-memory systems, and a sensible percentage for small-memory ones.<br> <p> In any case the percentage should be based on the same numbers--total RAM, not low memory--regardless of the word size.<br> </div> Thu, 07 Nov 2013 21:43:11 +0000 The pernicious USB-stick stall problem https://lwn.net/Articles/573236/ https://lwn.net/Articles/573236/ luto <blockquote>The one that applies is simply the one that was set last; so, for example, setting dirty_background_bytes to some value will cause dirty_background_ratio to be set to zero and ignored.</blockquote> Seriously? I bet that no one (especially the sysctl tool) gets this right. Thu, 07 Nov 2013 16:17:24 +0000 The pernicious USB-stick stall problem https://lwn.net/Articles/573200/ https://lwn.net/Articles/573200/ jzbiciak <BLOCKQUOTE><I>a device which goes from above-SSD speeds to USB-key speeds on the fly</I></BLOCKQUOTE> <P>I guess it depends on whether there are devices that have their own in-built caching and can absorb quite a few writes until they slow down dramatically. They could exhibit rather bimodal behavior based on the size of the incoming writes. Also, where does NFS fit in the picture? 
There, performance may fluctuate quite a bit as well, although I don't know if it's affected by this particular set of knobs. (Seems like it ought to be.) With NFS, you have the combined effects of the buffering on the NFS server as well as all the other people on the network vying for the same bandwidth.</P> <P>My gut feeling tells me any simple-minded rate estimator should also have a fairly quick adaptation rate so it tracks any workload-dependent behavior and other variations in media performance. That is, it should probably represent recent history (the last several seconds) more than long-term history.</P> Thu, 07 Nov 2013 14:48:56 +0000 The pernicious USB-stick stall problem https://lwn.net/Articles/573184/ https://lwn.net/Articles/573184/ renox <div class="FormattedComment"> <font class="QuotedText">&gt;&gt;but writethrough makes sense for slow devices</font><br> &gt;<br> <font class="QuotedText">&gt; And also for all removable devices, because with them it is common you want to flush all pending writes and unmount.</font><br> <p> *all* removable devices?? What you describe is common for USB keys, sure, but what about USB HDDs?<br> I'm sure that some users have such an HDD plugged in all the time and they would object to the performance degradation.<br> </div> Thu, 07 Nov 2013 13:38:02 +0000 The pernicious USB-stick stall problem https://lwn.net/Articles/573164/ https://lwn.net/Articles/573164/ johannbg <div class="FormattedComment"> Can't the tuning be based on the detected device? <br> </div> Thu, 07 Nov 2013 08:43:40 +0000