C considered dangerous
C considered dangerous
Posted Aug 29, 2018 22:44 UTC (Wed) by iabervon (subscriber, #722)Parent article: C considered dangerous
Also, chances seem slim that the caller will do something secure in the case where memcpy stops early, given that they didn't think to compare lengths and do something particular in that situation.
I could see having an "end of allocation" value that you get from your allocator and pass around along with your pointers, which has the advantage that getting it wrong is likely to be extremely obvious, it's hard to synthesize it yourself (which you might get wrong), and you don't need to adjust it (ditto).
Posted Aug 30, 2018 1:12 UTC (Thu)
by balkanboy (guest, #94926)
[Link] (1 responses)
Posted Sep 2, 2018 11:23 UTC (Sun)
by xav (guest, #18536)
[Link]
Posted Aug 30, 2018 4:44 UTC (Thu)
by quotemstr (subscriber, #45331)
[Link] (14 responses)
It's surprisingly useful. I wish C11 Annex K (the standard that describes the *_s functions) had taken off outside the Windows world. The annex K functions have much nicer properties than the BSD-style "safe" ones (like strlcpy). In particular...
> I'm not sure another argument to memcpy would help; unlike with strcpy, you have the number of bytes that would be copied anyway
Sometimes, you use only part of a larger buffer. Passing in the buffer size can catch situations in which your partial-buffer size accidentally grows larger than the buffer.
> Also, chances seem slim that the caller will do something secure in the case where memcpy stops early
You're right: there's nothing secure that can be done generically in this situation. That's why memcpy_s just aborts. In the kernel, that would be a panic. A panic is just a DoS and is much better than an escalation of privilege attack.
Posted Aug 30, 2018 16:48 UTC (Thu)
by epa (subscriber, #39769)
[Link] (11 responses)
Posted Aug 30, 2018 21:49 UTC (Thu)
by zlynx (guest, #2285)
[Link] (10 responses)
memcpy and memmove are *NOT IDENTICAL*.
That extra comparison to determine copy direction costs time and branch prediction failure. The only reason everyone seems to think it's free is that common CPU types now run ahead and prime the branch prediction.
I had evidence from oprofile in 2005 that showed memmove was most definitely slower than memcpy. That was on Pentium 3 and 4. Pipeline bubbles on P4 were painful.
Just like all the other sharp edges in C like pointer aliasing, programmers need to *read the documentation* and *know what they're doing* when using memcpy. Having it blow up their program is entirely fair.
Posted Aug 31, 2018 6:07 UTC (Fri)
by epa (subscriber, #39769)
[Link] (1 responses)
Posted Aug 31, 2018 6:40 UTC (Fri)
by iabervon (subscriber, #722)
[Link]
Posted Sep 4, 2018 5:19 UTC (Tue)
by ncm (guest, #165)
[Link] (7 responses)
Complicating the interface for this and similar functions amounts to piling deck chairs on the Titanic into an obstacle course.
It is more than a little stupid to write a new program in C. All the typical pitfalls of pointer manipulations just don't arise in C++. C++ does still have integer overflow UB, as does Rust, and anyway unsigned integers wrapping around is nowhere near so benign as many like to believe.
There would be no problem with C++ in the Linux kernel if not for mindless bigotry, with many fewer opportunities for silly-bugger mistakes, and both faster and shorter code. It will happen eventually. It might happen first in FreeBSD, as the blinkered ancients there are closer to retirement.
Posted Sep 4, 2018 6:46 UTC (Tue)
by quotemstr (subscriber, #45331)
[Link] (5 responses)
That's absurd nonsense. You must be talking about different implementations of memcpy and memmove. so it's not an apples-to-apples comparison. You can always implement memcpy by taking memmove and baking in the memory-direction branches, which means that no matter how efficient your memmove, you can generate an even _more_ efficient memcpy out of it.
I can imagine memmove and memcpy being equally fast, but memmove being faster, by doing more work? Something is clearly wrong with the comparison.
Posted Sep 5, 2018 23:03 UTC (Wed)
by jreiser (subscriber, #11027)
[Link] (4 responses)
Posted Sep 5, 2018 23:45 UTC (Wed)
by quotemstr (subscriber, #45331)
[Link] (3 responses)
Posted Sep 6, 2018 3:42 UTC (Thu)
by jreiser (subscriber, #11027)
[Link] (2 responses)
Posted Sep 6, 2018 6:55 UTC (Thu)
by liw (subscriber, #6379)
[Link]
I tried writing a benchmark. GCC optimised it away.
Posted Sep 6, 2018 7:14 UTC (Thu)
by quotemstr (subscriber, #45331)
[Link]
Posted Sep 21, 2018 18:14 UTC (Fri)
by mathstuf (subscriber, #69389)
[Link]
No, Rust defines integer overflow as twos-complement (though in debug builds it will panic). You can use an always-panicking addition however, but that's not the default.
Posted Sep 1, 2018 16:12 UTC (Sat)
by rurban (guest, #96594)
[Link] (1 responses)
Wrong. memcpy_s does not abort, it returns an error code. With the safeclib (which can be used with the kernel) it returns ESNOSPC: https://github.com/rurban/safeclib/blob/master/src/mem/me...
The kernel code can then decide what to do. It doesn't need to be a panic.
Regarding overlap: memcpy_s detects EOVERFLOW and ESOVRLP. The slow variant memmove_s does allow overlapping regions, but is much slower. People demand a fast memcpy. The safeclib with clang5+ even has a faster memcpy_s than the optimized glibc assembler variants. clang can do these optimizations much better than the handwritten assembler. gcc not so.
Posted Sep 1, 2018 17:31 UTC (Sat)
by quotemstr (subscriber, #45331)
[Link]
Wrong. Annex K functions call the global error handler on constraint violation, which almost always aborts, at least in any sane implementation. Yes, you can set it to do something else, and if the error handler returns, memcpy_s will return an error code. I would not recommend using the functions this way.
Posted Aug 30, 2018 6:59 UTC (Thu)
by anton (subscriber, #25547)
[Link] (10 responses)
Better give him a library function with two length parameters, and an appropriate reaction (probably reporting an error) if the destination buffer is too small.
Posted Aug 30, 2018 7:47 UTC (Thu)
by rschroev (subscriber, #4164)
[Link]
Posted Aug 31, 2018 20:46 UTC (Fri)
by rweikusat2 (subscriber, #117920)
[Link] (8 responses)
I mean, your fictional programmer who can't be trusted to get something as simple as that right suddenly can be trusted to always pass another parameter reflecting the actual size of the buffer correctly? How so? Why shouldn't he just pass (size_t)-1 because he knows "the buffer will be big enough", IOW, doesn't care?
Posted Sep 1, 2018 14:04 UTC (Sat)
by epa (subscriber, #39769)
[Link] (7 responses)
Posted Sep 3, 2018 15:49 UTC (Mon)
by rweikusat2 (subscriber, #117920)
[Link] (6 responses)
The interesting question is "Does one gain anything tangible by passing a destination buffer size to a block memory copy routine which already takes a number-of-bytes-to-copy parameter" and the answer is no: The application programmer still has to ensure that the buffer whose address is passed is large enough so that the intended number of bytes can be copied into it. Ie, the only change is that the programmer now has to track two lengths instead of one. Assuming that tracking one length correctly is already considered too difficult for people to do reliably, how can the situation possibly be improved by introducing another length value application programmers have to worry about?
Posted Sep 5, 2018 8:56 UTC (Wed)
by anton (subscriber, #25547)
[Link] (5 responses)
The tangible benefit that one gets with the memcpy_s variant is that we do not get a buffer overflow even if there is a mistake in tracking whether the assumption is true. True, we do not get that benefit if we write the wrong value as the second parameter of memcpy_s, but that's a less likely mistake in this case, and probably in many others.
You can make the assumption explicit by writing
Posted Sep 5, 2018 9:08 UTC (Wed)
by johill (subscriber, #25196)
[Link] (3 responses)
memcpy_s(buf,sizeof(buf),src,len);
in this case, I guess.
Posted Sep 5, 2018 9:23 UTC (Wed)
by anton (subscriber, #25547)
[Link]
Posted Sep 5, 2018 10:12 UTC (Wed)
by excors (subscriber, #95769)
[Link]
Posted Sep 5, 2018 10:21 UTC (Wed)
by pizza (subscriber, #46)
[Link]
(Personally, most uses of 'memcpy' in my code involve assembling chunks of data into a buffer. Nearly all of the memcpy()s' destinations are at nonzero offsets of the original buffer, so memcpy_s() really doesn't do me any good...)
Posted Sep 5, 2018 20:12 UTC (Wed)
by rweikusat2 (subscriber, #117920)
[Link]
memcpy_s(buf, BUFLEN, src, len)
is changed to
memcpy_s(src, len, buf, BUFLEN)
now, there's no buffer overflow but likely a very serious memory corruption problem.
There's exactly no reason for this assumption except that's it's another way to break the working code but there wasn't any reason for your assumption, either.
C considered dangerous
C considered dangerous
C considered dangerous
They look useful, but...
C considered dangerous
If the source and destination overlap, the behavior of memcpy_s is undefined.
How hard would it have been to fix that sharp edge at the same time as the other things? It could just be defined as equivalent to memmove_s() in that case. Hopefully a future version of the standard will tidy that up.
C considered dangerous
C considered dangerous
C considered dangerous
C considered dangerous
C considered dangerous
memmove faster than memcpy for small N
memmove faster than memcpy for small N
memmove faster than memcpy for small N
memmove faster than memcpy for small N
memmove faster than memcpy for small N
C considered dangerous
C considered dangerous
C considered dangerous
C considered dangerous
I'm not sure another argument to memcpy would help; unlike with strcpy, you have the number of bytes that would be copied anyway.
Yes. So now the programmer has two numbers, and only one parameter. What will he do? Will he write a long sequence that takes both numbers into account? Or will he just call memcpy() with the source length, because he knows that the destination buffer is bug enough? Maybe he is right in his knowledge, and maybe that knowledge does not become wrong during maintenance. But would you bet your program's security on that?
C considered dangerous
> ... because he knows that the destination buffer is bug enough
Well played, sir :)
C considered dangerous
C considered dangerous
C considered dangerous
Let's take a look at an example:
C considered dangerous
char buf[BUFLEN];
And we want to copy len bytes from src to buf, and we know that len<=BUFLEN. Doing the copy with
memcpy(buf,src,len);
would be correct, as would be
memcpy_s(buf,BUFLEN,src,len);
Now, during maintenance, there are changes that can cause len>BUFLEN. Now the first variant has a buffer overflow, while the second does not (but, as written, does not report the error). Tracking BUFLEN and len is not the problem here; it's tracking whether the assumption implied in the memcpy-using code is true.
assert(len<BUFLEN);
before the memcpy, but then you have to "track two lengths" again, and in addition, assertion checking can be disabled, or the length in the memcpy can be changed without changing the assertion.
C considered dangerous
Yes, that's even better. I was actually thinking about the maintenance programmer changing the len parameter of memcpy (to, say, len+a-b), and forgetting to change the assertion; there are often stale comments, so I would not be surprised about stale assertions (unless the tests cause them to trigger).
C considered dangerous
C considered dangerous
C considered dangerous
C considered dangerous
