Disk encryption

Posted Sep 10, 2024 18:01 UTC (Tue) by cloehle (subscriber, #128160)
In reply to: Disk encryption by Heretic_Blacksheep
Parent article: The trouble with iowait

Note that, unless there's something in the stack that's not on my radar right now, not having/using AES-NI as opposed to having it should actually make iowait boosting *less* effective, not more. Taken to the extreme, if the workload is CPU-bound, we should already be driving the CPU at maximum frequency.
It's a bit counter-intuitive, but the problem that iowait boosting originally set out to solve was the CPU being driven at a low frequency because it appears mostly idle.
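To make the boosting mechanism concrete, here is a toy model of schedutil-style iowait boosting (a sketch of the idea, not kernel code): each wakeup from in_iowait doubles a frequency boost up to the maximum, so an IO-bound task that looks mostly idle still ramps the CPU up; wakeups without in_iowait decay the boost again. The frequency values are illustrative.

```python
def next_boost(boost, f_min, f_max, woke_from_iowait):
    """Return the iowait frequency boost (kHz) after one task wakeup.

    Toy model: doubling on iowait wakeups up to f_max, halving (and
    eventually vanishing) on ordinary wakeups.
    """
    if woke_from_iowait:
        return min(max(boost * 2, f_min), f_max)
    return boost // 2 if boost // 2 >= f_min else 0

# A few consecutive iowait wakeups drive the request toward f_max:
boost, f_min, f_max = 0, 400_000, 3_000_000  # kHz, illustrative values
for _ in range(4):
    boost = next_boost(boost, f_min, f_max, woke_from_iowait=True)
print(boost)  # capped at f_max
```

The point of the exponential ramp is exactly the one made above: without it, a task sleeping on IO most of the time contributes almost no utilization, and the governor would keep the CPU at a low frequency.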



Disk encryption

Posted Sep 10, 2024 18:52 UTC (Tue) by Heretic_Blacksheep (guest, #169992)

OK, I didn't think that through, then; thanks for the explanation. So... this sounds like something that would have been necessary in the Sandy Bridge era, when it was possible to have really slow IO buses, like USB storage with encrypted media attached to a USB 2 bus. The Core i5-2xxx series had AES-NI, but some of the motherboards it ended up on didn't have USB 3 controllers, so the CPU would be going into an idle state because the IO bus was so slow. Am I in the right mental model?

Reason for needing I/O wait

Posted Sep 10, 2024 19:36 UTC (Tue) by farnz (subscriber, #17727)

If I've understood your descriptions properly, the point of I/O wait isn't about being CPU-bound; it's about slow I/O devices (hard drives instead of SSDs, for example). An HDD can easily have an average seek time over 8ms, and a Haswell-era Intel CPU (4th-generation Core i7, for example) has, according to static struct cpuidle_state hsw_cstates[] in intel_idle.c, a C-state with a 7.7ms target residency and a 2.6ms exit latency (note that the numbers in the struct are in µs).

Without iowait, it's plausible that the system would determine that it can comfortably sleep for 7.7ms after a single read triggers a seek, and enter that deep sleep state; it then takes 2.6ms to wake up before it even processes the completion interrupt. With iowait, the process can contribute to the "wakefulness" of the CPU, ensuring that it doesn't enter the deep C-state and pay that exit latency.
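The decision being described can be sketched as a minimal residency-based state-selection model (illustrative only, not the kernel's menu or teo governor). The 7700µs residency and 2600µs exit latency are the Haswell values quoted above; the other state and its numbers are made up for the example.

```python
STATES = [  # (name, target_residency_us, exit_latency_us)
    ("C1",     2,    2),
    ("C6",   600,  133),   # illustrative mid-level state
    ("C10", 7700, 2600),   # the deep Haswell state discussed above
]

def pick_state(predicted_sleep_us, iowait_pending=False):
    """Pick the deepest state whose target residency fits the
    predicted sleep; a pending iowait caps the search at the
    shallowest state, modeling the 'wakefulness' contribution."""
    candidates = STATES[:1] if iowait_pending else STATES
    best = STATES[0]
    for state in candidates:
        if state[1] <= predicted_sleep_us:
            best = state
    return best[0]

# An 8 ms HDD seek looks like a long sleep, so without the iowait
# hint the model picks the deep state and pays 2.6 ms on wakeup:
print(pick_state(8000))                       # deep state
print(pick_state(8000, iowait_pending=True))  # stays shallow
```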

Thus, I'd expect that you'd see the worst case with a CPU that has long exit latencies from the deep C-state and an HDD as the I/O device: some of the time, you'll hit long exit latencies because cpuidle (correctly) predicts that you're going to sleep for a long time, and those exit latencies will hurt.

Reason for needing I/O wait

Posted Sep 11, 2024 9:43 UTC (Wed) by cloehle (subscriber, #128160)

You're correct, but let me expand. I need to point out the difference between cpufreq and cpuidle regarding iowait behavior: one solves the IO-utilization problem by boosting frequency, while the other selects shallower idle states on the CPU where tasks went to sleep with in_iowait set.
You're referring to cpuidle. In that case the long IO latency is a problem: as you said, with 8ms IO latency the cpuidle governor will have made a correct decision when choosing a state with a target residency ("how long do we have to sleep for this state to be worth it?") of <8ms, and it doesn't care about the exit latency. (I should note that the exit latency is treated as something of a worst case, so you will probably observe significantly less than the 2.6ms in your example.)
The theoretical worst case is an IO device whose latency sits just above the target residency of the deepest state; if the IO device is slower than that, the cost of the exit latency diminishes relative to the IO latency.
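A back-of-envelope calculation shows why the worst case sits just above the target residency, using the Haswell numbers from earlier in the thread: the exit latency is a fixed cost per IO round trip, so its relative weight shrinks as the IO gets slower.

```python
# Haswell deep-state numbers quoted above, in microseconds.
TARGET_RESIDENCY_US = 7700
EXIT_LATENCY_US = 2600

def exit_latency_overhead(io_latency_us):
    """Fraction of each IO round trip spent waking up, assuming the
    governor (correctly) chose the deep state for this sleep."""
    assert io_latency_us >= TARGET_RESIDENCY_US
    return EXIT_LATENCY_US / io_latency_us

print(f"{exit_latency_overhead(8_000):.1%}")    # ~8 ms seek: large hit
print(f"{exit_latency_overhead(100_000):.1%}")  # 100 ms IO: small hit
```

At 8ms IO latency the 2.6ms wakeup is roughly a third of every round trip; at 100ms it is under 3%, which is the "diminishes in the IO latency" point above.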
If you are in that theoretical worst case and the exit latency does hurt you, you're much better off not relying on the governor at all, and instead using the PM QoS API or disabling the state(s) while you're doing IO. menu (with its iowait heuristic) wasn't always doing a good job there either; see the cover letter of my patch.
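For reference, the userspace side of the PM QoS API mentioned here is the /dev/cpu_dma_latency special file: writing a latency bound (a native 32-bit integer, in microseconds) keeps the kernel out of idle states whose exit latency exceeds the bound, for as long as the file descriptor stays open. A minimal sketch (requires root; error handling kept deliberately thin):

```python
import os
import struct

def hold_cpu_latency(max_exit_latency_us):
    """Open /dev/cpu_dma_latency and request a latency cap.

    Returns the fd; the cap stays in effect until the fd is closed,
    at which point deep C-states are allowed again.
    """
    fd = os.open("/dev/cpu_dma_latency", os.O_WRONLY)
    os.write(fd, struct.pack("=i", max_exit_latency_us))
    return fd

# Usage (as root): cap exit latency at 100 us around a burst of
# latency-sensitive IO, then drop the cap.
#
#   fd = hold_cpu_latency(100)
#   ...do the IO...
#   os.close(fd)
```

Disabling individual states via /sys/devices/system/cpu/cpu*/cpuidle/state*/disable achieves a similar effect per state, but the PM QoS fd has the advantage that the cap is released automatically if the process dies.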

Unfortunately, even if we did say that iowait was a good metric to base cpuidle heuristics on (I have outlined in the cover letter why it isn't), for how long should we keep selecting shallower states? Tasks can be in_iowait for seconds (i.e. aeons), and whether you suffer an end-to-end throughput decrease from the exit latency (or, for cpufreq, from the low CPU frequency), and whether that decrease might be a good trade-off for the power saved, is practically impossible for the kernel to tell.
Similarly to the iowait-boost comment: if you're maximizing IO performance, the IO request that just completed is hopefully either large enough that completions don't happen often, or just one of many queued requests for the IO device.


Copyright © 2025, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds