
offload to ssd? neat

Posted Apr 1, 2025 21:33 UTC (Tue) by dankamongmen (subscriber, #35141)
Parent article: Updates on storage standards

https://nvmexpress.org/wp-content/uploads/NVM-Express-Com...

this is pretty cool! i'd worked with some experimental computational-memory hardware, but hadn't seen any kind of standard for offloading computation to the memory hierarchy. anyone know of devices supporting this?



offload to ssd? neat

Posted Apr 1, 2025 22:06 UTC (Tue) by willy (subscriber, #9762) [Link] (5 responses)

When I was at Intel, it was a fairly common request from customers: "Can we offload $feature to the SSD? There's this great paper from $ResearchGroup showing improvements."

The reply from our SSD group was always the same: We have designed our SSD to fit in a certain power/performance/cost envelope. We don't have "spare cycles" on the drive's CPU to process the data. Indeed, we go out of our way to avoid touching the user data with the drive's CPU.

I don't expect this effort to go anywhere unless something has fundamentally changed. Certainly not on consumer devices. Maybe you'll find a research device, or cloud vendors will offer it as part of their virtualized storage devices.

offload to ssd? neat

Posted Apr 1, 2025 22:36 UTC (Tue) by andresfreund (subscriber, #69562) [Link]

Funnily enough, I've been repeatedly in the reverse position. Various storage vendors trying to convince us (various postgres services companies) that we really need to offload parts of postgres storage to their fancy new drives.

offload to ssd? neat

Posted Apr 2, 2025 3:40 UTC (Wed) by kpmckay (subscriber, #134608) [Link]

I tend to agree with the SSD group. In a $/GB or Watts/GB dogfight, it's hard to justify spending extra die area on something without well-understood value. Even if there is a $feature that's a net positive for some use case, where's the 2nd or 3rd source going to come from, and will $feature behave the same way across vendors?

I think there are a handful of compute functions that make sense to do within a storage/NVMe controller, but they have to be essentially invisible to applications. Nobody thinks of encryption as a "computational storage" function, but I think it's a good example of a widely deployed compute function in storage devices that makes sense.

IMO, DPU-like devices are probably the right place to do any real heavy lifting with storage offloads, because their resources/functions are amortized/applied over a number of drives, those drives can come from multiple vendors, and they're not necessarily bound to the block device abstraction.

Device → device copy offload?

Posted Apr 3, 2025 7:25 UTC (Thu) by DemiMarie (subscriber, #164188) [Link] (1 responses)

What about offloading device-to-device copies? The device already needs to be able to do such copies for GC.

Device → device copy offload?

Posted Apr 3, 2025 13:31 UTC (Thu) by willy (subscriber, #9762) [Link]

I assume you mean intra-device copying (as opposed to one device sending data to another device, which is functionality that exists).

Funnily enough, it's a completely different operation from the device's point of view. The GC operation copies the data block intact and updates the FTL so that lookups of LBA 45678 now point to the new location on flash. An offloaded copy needs to read in the data block, decrypt it, update the tags, re-encrypt it, write it out, and update the FTL entry for the destination LBA. That's because both the encryption and the tag verification use the LBA as the seed, not the location on the flash.

This is why I was never able to get the REMAP command into NVMe. It looks cheap from the host point of view, but it's very expensive for the drive. It saves PCIe bandwidth, but that's not generally the limiting factor.
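To make the LBA-seeding point concrete, here is a toy, self-contained C sketch. It is nothing like real firmware: the XOR "cipher" keyed by the LBA merely stands in for the LBA-seeded XTS tweak and protection tags described above. It shows why copying ciphertext verbatim is fine for a GC move that keeps the same LBA, but decodes to garbage if the data is supposed to reappear at a different LBA, which is why the drive has to decrypt and re-encrypt on an offloaded copy.

    /* Toy illustration only; crypt_block() is a stand-in for the
     * LBA-seeded encryption/tag machinery in a real drive. */
    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    #define BLOCK 16

    static void crypt_block(uint8_t *buf, uint64_t lba)
    {
        /* keystream derived from the LBA: XOR twice with the same LBA
         * round-trips, XOR with a different LBA produces garbage */
        for (int i = 0; i < BLOCK; i++)
            buf[i] ^= (uint8_t)((lba * 0x9E3779B97F4A7C15ULL) >> (8 * (i % 8)));
    }

    int main(void)
    {
        uint8_t plain[BLOCK] = "hello, storage!";
        uint8_t flash[BLOCK], buf[BLOCK];

        /* host writes LBA 45678: what lands on flash is tied to that LBA */
        memcpy(flash, plain, BLOCK);
        crypt_block(flash, 45678);

        /* GC move: ciphertext copied verbatim, FTL repointed, still read as LBA 45678 */
        memcpy(buf, flash, BLOCK);
        crypt_block(buf, 45678);
        printf("GC move, read as 45678:         %s\n",
               memcmp(buf, plain, BLOCK) ? "garbled" : "intact");

        /* remap-only "copy" to LBA 99999: reads at the new LBA decode wrong */
        memcpy(buf, flash, BLOCK);
        crypt_block(buf, 99999);
        printf("remap-only copy, read as 99999: %s\n",
               memcmp(buf, plain, BLOCK) ? "garbled" : "intact");

        /* what the drive actually has to do for an offloaded copy */
        memcpy(buf, flash, BLOCK);
        crypt_block(buf, 45678);            /* decrypt with the old tweak */
        crypt_block(buf, 99999);            /* re-encrypt with the new tweak */
        crypt_block(buf, 99999);            /* later read back at LBA 99999 */
        printf("full copy, read as 99999:       %s\n",
               memcmp(buf, plain, BLOCK) ? "garbled" : "intact");
        return 0;
    }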

offload to ssd? neat

Posted Apr 3, 2025 20:14 UTC (Thu) by kbusch (guest, #171715) [Link]

You don't really need the computation and storage to coexist on the same device. I have some gpu type devices that look remarkably like nvme, but they use a vendor specific command set (and don't have flash). Not sure how closely you're tracking nvme driver happenings, but the uring_cmd support with the "nvme-generics" (ex: /dev/ng0n1) created some interesting ways to leverage the protocol. For some extra spice, add device direct io queues (Stephen Bates' lsfmm talk), and you can get peer-to-peer communication among many devices all talking nvme.
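For anyone curious what that path looks like from userspace, here is a minimal sketch, under several assumptions: a 5.19+ kernel, liburing 2.2+, an NVMe namespace exposed at /dev/ng0n1 with 512-byte LBAs, nsid 1, and enough privilege for passthrough. It issues a plain NVM Read (opcode 0x02) through uring_cmd; a vendor-specific opcode of the kind described above would go in the same slot.

    #include <fcntl.h>
    #include <liburing.h>
    #include <linux/nvme_ioctl.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    int main(void)
    {
        struct io_uring ring;
        void *buf;
        int fd = open("/dev/ng0n1", O_RDONLY);

        if (fd < 0 || posix_memalign(&buf, 4096, 4096))
            return 1;

        /* uring_cmd needs the big SQE/CQE formats */
        if (io_uring_queue_init(8, &ring,
                                IORING_SETUP_SQE128 | IORING_SETUP_CQE32))
            return 1;

        struct io_uring_sqe *sqe = io_uring_get_sqe(&ring);
        memset(sqe, 0, 2 * sizeof(*sqe));      /* one 128-byte SQE */
        sqe->opcode = IORING_OP_URING_CMD;
        sqe->fd = fd;
        sqe->cmd_op = NVME_URING_CMD_IO;

        /* the NVMe command itself lives in the second half of the SQE */
        struct nvme_uring_cmd *cmd = (struct nvme_uring_cmd *)sqe->cmd;
        cmd->opcode = 0x02;                    /* NVM Read */
        cmd->nsid = 1;
        cmd->addr = (__u64)(uintptr_t)buf;
        cmd->data_len = 4096;
        cmd->cdw10 = 0;                        /* starting LBA, low 32 bits */
        cmd->cdw11 = 0;                        /* starting LBA, high 32 bits */
        cmd->cdw12 = 7;                        /* NLB - 1: 8 x 512B = 4KB */

        io_uring_submit(&ring);

        struct io_uring_cqe *cqe;
        io_uring_wait_cqe(&ring, &cqe);
        printf("status: %d\n", cqe->res);      /* 0 on success */
        io_uring_cqe_seen(&ring, cqe);

        io_uring_queue_exit(&ring);
        free(buf);
        return 0;
    }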

offload to ssd? neat

Posted Apr 2, 2025 10:35 UTC (Wed) by kurogane (subscriber, #83248) [Link] (1 responses)

I was very excited about this class of SSD, say, about 4 years ago. For the main database engine types it's the only thing in any credible research that can deliver the next 10x performance improvement, and reduce latency volatility too. The volatility point is especially important: the more optimized a software-only db engine gets, the more horribly the gears get jammed when the I/O channels become saturated.

But when I tried to get my hands on one of them, no luck. I had some great phone calls and was promised access to a datacenter with some SSDs fresh off the factory line that implemented some of the earlier specs discussed in the article. At the time I represented a company with hundreds of customers of our own, too. But they never came through.

A comment from the SSD sales guy later was that they were mainly aiming for hyperscaler orders. But that strategy depends on hyperscalers buying into something that will _reduce_ revenue in their DBaaS services and oblige them to provision more network IO around users' database servers.

Why offload to SSD?

Posted Apr 6, 2025 19:37 UTC (Sun) by DemiMarie (subscriber, #164188) [Link]

Why is this such a huge performance win? Reducing round-trips?


Copyright © 2025, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds