Amdahl's law, 55 years later
Posted Nov 1, 2025 2:55 UTC (Sat) by jreiser (subscriber, #11027)
In reply to: Better than forcing it by WolfWings
Parent article: Ubuntu introduces architecture variants
AVX-512 is not worth it for the vast majority of packages or users. AVX-512 is worth it if the computation mix is at least 60% linear algebra or crypto, but otherwise AVX-512 is not worth the effort and the cost in storage space, build time, and administrative morass.
Posted Nov 1, 2025 6:05 UTC (Sat)
by WolfWings (subscriber, #56790)
The BMI sub-extensions that arrived around AVX2 added a TON of fine-grained data-manipulation instructions down to the bit level (thus the name), and AVX512 added more advanced masking features plus selective packing on write with VPCOMPRESS, which lets you do a variable-length memory write of just the selected, non-contiguous bytes of the 512-bit register, packed in order.
So even just dealing with 32-byte blocks of data, on something as simple as adding escape backslashes to a string or doing colorspace conversion, can take nearly full advantage of it.
AVX512 really straddles the line between CPU SIMD and what you'd expect from GPU compute shaders.
Posted Nov 1, 2025 7:27 UTC (Sat)
by epa (subscriber, #39769)
Posted Nov 1, 2025 16:53 UTC (Sat)
by fishface60 (subscriber, #88700)
Posted Nov 1, 2025 22:16 UTC (Sat)
by WolfWings (subscriber, #56790)
https://www.intel.com/content/www/us/en/docs/intrinsics-g...
For a simple but sufficient example of the escaping-strings idea, including how you can POPCNT the mask used for VPCOMPRESS to get the number of bytes written, https://lemire.me/blog/2022/09/14/escaping-strings-faster... is a pretty decent point of reference.
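To make the mask-then-popcount idea concrete, here is a rough scalar model in Python. It is not the code from the linked post and it uses no real intrinsics: `compress`, `escape_block`, and `escape` are hypothetical names, the 32-byte block size just mirrors the figure mentioned above, and the choice to escape only `"` and `\` is an assumption for illustration. Each input byte is paired with a candidate backslash, a mask selects which lanes survive, and the popcount of that mask is the number of bytes written.

```python
BACKSLASH = 0x5C
QUOTE = 0x22


def compress(lanes: bytes, mask: int) -> bytes:
    """Scalar stand-in for a masked compress: keep lanes whose mask bit is set,
    packed contiguously in their original order."""
    return bytes(b for i, b in enumerate(lanes) if (mask >> i) & 1)


def escape_block(block: bytes) -> bytes:
    """Escape one block by building interleaved lanes plus a keep-mask."""
    lanes = bytearray()
    mask = 0
    for i, b in enumerate(block):
        # Lane 2*i is a candidate backslash, lane 2*i+1 is the input byte.
        lanes.append(BACKSLASH)
        lanes.append(b)
        if b in (QUOTE, BACKSLASH):
            mask |= 1 << (2 * i)      # keep the backslash only when needed
        mask |= 1 << (2 * i + 1)      # always keep the input byte
    out = compress(bytes(lanes), mask)
    # The popcount of the mask is exactly the number of bytes written.
    assert len(out) == bin(mask).count("1")
    return out


def escape(data: bytes, block_size: int = 32) -> bytes:
    """Process the input in fixed-size blocks, as a vector loop would."""
    return b"".join(
        escape_block(data[i:i + block_size])
        for i in range(0, len(data), block_size)
    )
```

In a real vector version the output pointer would advance by that popcount after each block, which is what makes the variable-length write cheap.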
Posted Nov 1, 2025 14:32 UTC (Sat)
by khim (subscriber, #9252)
AVX512 would have been great if Intel hadn't bombed its introduction so badly. Today you can expect consistent AVX512 support from AMD, but not from Intel. This is extremely stupid, but hey, that's Intel for you.
Posted Nov 1, 2025 22:18 UTC (Sat)
by WolfWings (subscriber, #56790)
AMD's implementation? 1 VP2INTERSECT per clock cycle as of Zen5, where Intel's was over 25 clock cycles.
Posted Nov 1, 2025 19:15 UTC (Sat)
by thoughtpolice (subscriber, #87455)
Amdahl's law doesn't really mean anything here, because the most basic way of applying it is measuring a _single_ enhancement versus the system baseline at a single point in time. But adding features that make these instructions more useful, making them more widely applicable, and improving their speed all expand the number of cases where they can be applied beneficially. Thus, the overall proportion of the system where improvements are possible has increased. This fact is not captured by the basic application of the law.
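For reference, the basic form of the law being argued about is speedup = 1 / ((1 - p) + p / s), where p is the fraction of run time the enhancement applies to and s is the speedup of that fraction. A minimal Python sketch follows; `amdahl_speedup` is a hypothetical helper name, and the values of p and s in the loop are made-up illustrations, not measurements of any AVX-512 workload.

```python
def amdahl_speedup(p: float, s: float) -> float:
    """Overall speedup when a fraction p of run time is accelerated by factor s."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a fraction between 0 and 1")
    if s <= 0.0:
        raise ValueError("s must be positive")
    return 1.0 / ((1.0 - p) + p / s)


if __name__ == "__main__":
    # Illustrative numbers only: hold the vector speedup fixed and let the
    # applicable fraction grow as the instructions become usable in more code.
    s = 4.0
    for p in (0.1, 0.3, 0.6, 0.9):
        print(f"p={p:.1f} s={s:.0f} -> overall speedup {amdahl_speedup(p, s):.2f}x")
```

The point above corresponds to p itself moving over time: a single evaluation at a fixed p says nothing about how the result changes once more of the workload becomes vectorisable.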
The reality is that AVX-512 is extremely nice to use but Intel completely fucked up delivering it to client systems, from what I can tell, due to their weird dysfunction and total addiction to product segmentation. We could have already been long past worrying about it if not for that.
An example of vectorisation helping string operation
