Ushering out strlcpy()
Posted Aug 26, 2022 15:48 UTC (Fri) by wtarreau (subscriber, #51152)
In reply to: Ushering out strlcpy() by tialaramex
Parent article: Ushering out strlcpy()
Yep, absolutely, but I thought about it because it was a perfectly valid real-world example of something that can be done extremely efficiently when you manipulate bytes, and that one cannot afford to process as individual strings, whether through allocations or with memmove() and the like.
> I assume you don't consider the behaviour if hdr is in fact pointing at something that is not a zero-terminated string (ie a buffer overflow with undefined behaviour) to be desirable, and so we don't need Rust's unsafe which is the only way to duplicate that.
Absolutely. These are final implementation details. Likewise, it would be fine to require the length to be known upfront and to use strnchr() if needed.
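To illustrate the "length known upfront" variant, here is a minimal sketch. It uses memchr() rather than strnchr(), since strnchr() is not part of ISO C (the Linux kernel provides one). The helper name header_name_len() is made up for the example, and I have not run it against real traffic.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: return the offset of the first ':' in hdr[0..len),
 * or len if there is none. The buffer does not need to be zero-terminated
 * because the search never looks past len bytes.
 */
static size_t header_name_len(const char *hdr, size_t len)
{
    const char *colon;

    assert(hdr != NULL || len == 0);

    /* Skip the call for empty input so memchr() never sees a null pointer. */
    colon = len ? memchr(hdr, ':', len) : NULL;

    return colon ? (size_t)(colon - hdr) : len;
}
```

The caller then knows that hdr[0..n) is the name, and that a return value equal to len means no colon was found.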
> https://gist.github.com/rust-play/59883fd0aecbfa0c988f2bf...
> [ I have not tested this code, but I believe it does what your C does ]
Thanks! [ I have not tested mine either :-) ]
So overall, once presented as a byte array like this, it looks similar and should be of equivalent complexity. It may miss some of the optimizations that can be done for strchr(), for example those that skip 8 bytes at a time (or possibly even more using vector instructions), but overall it looks similar.
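For readers wondering what "8 bytes at once" can look like, here is a rough sketch of the classic word-at-a-time zero-byte trick applied to a bounded search. It is only an illustration, not what any particular C library does: find_byte_swar() is a made-up name, real implementations typically add alignment handling and vector instructions, and I'm assuming a platform that provides uint64_t.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: find the first occurrence of c in buf[0..len),
 * testing 8 bytes per iteration. Returns NULL if c is not present.
 */
static const unsigned char *find_byte_swar(const unsigned char *buf,
                                           size_t len, unsigned char c)
{
    const uint64_t ones    = 0x0101010101010101ULL;
    const uint64_t highs   = 0x8080808080808080ULL;
    const uint64_t pattern = ones * c;   /* c replicated into every byte */
    size_t i = 0;

    assert(buf != NULL || len == 0);

    for (; i + 8 <= len; i += 8) {
        uint64_t word, x;

        memcpy(&word, buf + i, 8);       /* safe unaligned load */
        x = word ^ pattern;              /* bytes equal to c become 0 */

        /* Non-zero if and only if at least one byte of x is zero. */
        if ((x - ones) & ~x & highs)
            break;                       /* pinpoint it bytewise below */
    }

    /* Handles both the tail and the chunk that contained a match. */
    for (; i < len; i++)
        if (buf[i] == c)
            return buf + i;

    return NULL;
}
```

Chunks without a match cost one load and a handful of ALU operations instead of eight compares, which is where the gain comes from.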
> One very obvious difference is that Rust doesn't have pointer arithmetic, so we're writing the Rust in terms of indexing into the slice
I'm fine with this, I'm used to both forms in C as well, it's just a matter of preference. a[b] is exactly the same as b[a], *(a+b) or *(b+a) in C, so writing (-1)[ptr] is valid but only useful to confuse the reader :-)
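A small sketch of that equivalence, for the curious. The function names are made up for the example. One caveat worth knowing: postfix [] binds tighter than unary minus, so the negative index needs parentheses; without them, -1[ptr] parses as -(1[ptr]).

```c
#include <assert.h>

/* Hypothetical helper: the byte just before ptr, written index-first. */
static char prev_byte(const char *ptr)
{
    return (-1)[ptr];                    /* same as ptr[-1] */
}

static void index_forms_demo(void)
{
    const char buf[] = "abc";
    const char *ptr = buf + 1;           /* points at 'b' */

    /* All four spellings designate the same element. */
    assert(ptr[1] == 1[ptr]);
    assert(ptr[1] == *(ptr + 1));
    assert(ptr[1] == *(1 + ptr));

    /* With parentheses, this is ptr[-1]. */
    assert((-1)[ptr] == 'a');

    /* Without them, it is the negation of ptr[1]. */
    assert(-1[ptr] == -'c');
}
```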
> Rust doesn't think that "any slice of bytes" is a string, but, on the other hand, it also thinks most of C's "string" features are reasonable things to do to a slice of bytes, so this is mostly a matter of terminology. You can for example call make_ascii_lowercase() on a slice.
Makes sense. The only things you'll miss then are the machine-specific optimizations that went into a number of C libraries for byte search, fill or move (which can be significant for strchr() or memset()).
> If it wasn't actually ASCII text before then what this does is well defined but probably not very useful.
Sure, but the point of such protocols is precisely that you don't care, as they're byte-oriented and very fast to process when done right.
Thanks!
