The shrinking role of ETXTBSY
Posted Aug 19, 2021 20:47 UTC (Thu) by willy (subscriber, #9762)
Parent article: The shrinking role of ETXTBSY
Well, hmm, no?
Assuming we're on a local filesystem (ie not NFSv3 or something), the write() goes directly into the page cache. Even if the application has used MAP_PRIVATE, that covers how to handle a store from the mmapper, not a write() from somebody else.
So the code does change under you. Now, I don't think we necessarily flush the CPU instruction cache at that point, so you might continue to execute some old instructions for a while, but at some point the CPU is going to notice that the i$ is out of date.
Unless you O_TRUNC, of course. Then, umm ... we get rid of all those pages immediately and your program segfaults straight away.
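A minimal sketch of the first point, assuming Linux and a local filesystem: a read-only MAP_PRIVATE mapping is created, then "somebody else" overwrites the file with an ordinary write through a separate descriptor. The function name is made up for illustration. POSIX leaves it unspecified whether the mapping sees the change; on Linux the untouched pages are expected to show the new bytes, because both paths go through the same page cache. The O_TRUNC case is deliberately left out, since it would kill the demonstrating process.

```python
import mmap
import os
import tempfile


def private_mapping_sees_write():
    """Return (before, after): bytes read through an untouched MAP_PRIVATE mapping."""
    page = mmap.PAGESIZE
    with tempfile.NamedTemporaryFile() as tmp:
        tmp.write(b"A" * page)
        tmp.flush()

        # The "mmapper": a private, read-only mapping that is never stored to.
        fd = os.open(tmp.name, os.O_RDONLY)
        try:
            mapping = mmap.mmap(fd, page, flags=mmap.MAP_PRIVATE, prot=mmap.PROT_READ)
            before = mapping[:4]

            # "Somebody else": a plain write() via a different descriptor.
            wfd = os.open(tmp.name, os.O_WRONLY)
            try:
                os.pwrite(wfd, b"B" * page, 0)
            finally:
                os.close(wfd)

            after = mapping[:4]
            mapping.close()
        finally:
            os.close(fd)
        return before, after
```

If `after` differs from `before`, the mapped content changed under the mapping without the mapper ever storing to it.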
Posted Aug 19, 2021 23:20 UTC (Thu)
by NYKevin (subscriber, #129325)
[Link] (3 responses)
You mean to tell me that I can instantly segfault an entire system by just running sudo truncate libc.so?
I mean, I'm not that surprised, there are loads of ways a malicious or stupid root can break the system. I'm just impressed by the "segfault every userspace process at the same time" angle.
Posted Aug 20, 2021 0:33 UTC (Fri)
by Karellen (subscriber, #67644)
[Link] (1 responses)
Well, you can instantly kill every userspace process and panic the kernel at the same time on a system with sudo kill -9 -1 1 *
* Probably. I have not just tried this.
Posted Aug 23, 2021 12:17 UTC (Mon)
by anselm (subscriber, #2796)
[Link]
You can't kill -9 1; the init process is protected from signals for which it doesn't have an explicitly installed signal handler.
Posted Aug 20, 2021 8:32 UTC (Fri)
by taladar (subscriber, #68407)
[Link]
Posted Aug 22, 2021 22:02 UTC (Sun)
by Paf (subscriber, #91811)
[Link] (1 responses)
So, the interpreter doesn’t read in a copy or anything? Unless it’s mmapping, it’s going to have a copy of at least part of the file. That’s how read() works.
User space doesn’t work from the page cache unless it’s mmapping.
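To make the read() side concrete, here is a small sketch (the function name is hypothetical): the bytes returned by read() live in the process's own memory, so a later overwrite of the file does not touch them. It does not model an interpreter that reads a script incrementally, where only the part already read is a stable copy.

```python
import os
import tempfile


def read_copy_is_stable():
    """Return (copy, current): data read before an overwrite, and the file afterwards."""
    with tempfile.NamedTemporaryFile() as tmp:
        tmp.write(b"echo old\n")
        tmp.flush()

        # read() copies the data out of the page cache into our own buffer.
        with open(tmp.name, "rb") as f:
            copy = f.read()

        # Overwrite the file in place (same length, same inode).
        with open(tmp.name, "r+b") as f:
            f.write(b"echo new\n")

        with open(tmp.name, "rb") as f:
            current = f.read()

        return copy, current
```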
Posted Aug 22, 2021 22:49 UTC (Sun)
by willy (subscriber, #9762)
[Link]
My system shows the 'cat' binary mapped five times. One is executable.
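For anyone who wants to check this on their own system, a rough sketch that counts the mappings of one file in text taken from /proc/<pid>/maps. The helper name is made up, and the parsing assumes the usual six-column layout (address range, permissions, offset, device, inode, pathname).

```python
def mappings_of(maps_text, path):
    """Return the permission strings of every mapping backed by `path`."""
    perms = []
    for line in maps_text.splitlines():
        # Columns: address perms offset dev inode [pathname]
        # Anonymous mappings have no pathname column and are skipped.
        fields = line.split(None, 5)
        if len(fields) == 6 and fields[5] == path:
            perms.append(fields[1])
    return perms
```

Feeding it the contents of /proc/self/maps together with the real path of the running binary gives one entry per mapping; the ones containing "x" are the executable mappings.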
But thanks for explaining to me how the read() system call and the page cache work.
Posted Aug 23, 2021 9:06 UTC (Mon)
by anton (subscriber, #25547)
[Link] (2 responses)
So that is what could be done on writing to an executed text file: in every affected process, make private copies of the pages of the whole original text (as if on copy-on-write) and populate the mapping with them. There is an opportunity for sharing between several processes that run the same changed binary, but I guess that the benefit is too small and too rare to make that effort. OTOH, the benefit of not having ETXTBSY and not having processes crash when their binary changes is more substantial IMO.
Actually, I would like that also for interpreters (I have had a number of shell scripts crash when I edited them while they were running), maybe by making MAP_PRIVATE|MAP_POPULATE behave that way, or with an additional flag to mmap().
Posted Aug 23, 2021 20:18 UTC (Mon)
by nybble41 (subscriber, #55106)
[Link] (1 responses)
For that matter, any process that could rewrite or truncate a file while it's in use could also corrupt the data beforehand. ETXTBSY only protects against *accidentally* corrupting a file by updating it while it's in use, by forcing the update to fail. However, since we don't want the update to fail anyway, the solution which doesn't risk data corruption *or* an ETXTBSY error is to write the new data to a temporary file and rename it over the original. This does require write access to the parent directory, but that doesn't seem unreasonable to me since logically you are modifying the directory to point to a new file. Any attempt to atomically update the content without replacing the file will run into the issue that mappings follow the file, not the content.
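A minimal sketch of the write-then-rename approach, with a hypothetical helper name. It assumes the caller can create files in the target's directory, and it does not carry over the old file's mode or ownership, which a real tool replacing an executable would need to do.

```python
import os
import tempfile


def replace_atomically(path, data):
    """Write `data` to a temporary file, then rename it over `path`."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file must live in the same directory (same filesystem),
    # otherwise the rename cannot be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())
        # The directory entry now points at a new inode; existing mappings
        # and open descriptors keep following the old one.
        os.replace(tmp_path, path)
    except BaseException:
        # Only reached if the write or the rename failed, so the
        # temporary file still exists and should be cleaned up.
        os.unlink(tmp_path)
        raise
```

Anything that already had the old file open or mapped keeps seeing the old content, while new opens of the path get the new content.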
Posted Aug 23, 2021 21:28 UTC (Mon)
by anton (subscriber, #25547)
[Link]
The approach of writing a new file and renaming it over the old one would be a good one, but despite the inconvenience of ETXTBSY, linkers don't use this approach, so maybe the problem with unwritable directories is more relevant than we think.
Even if the application has used MAP_PRIVATE, that covers how to handle a store from the mmapper, not a write() from somebody else.
That's disturbing. Still, if a MAP_PRIVATE page is written to (e.g., with its original content) in one place, it is copied on that write, and later changes to the original don't affect it.
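The copy-on-write behavior can be tried from user space. The following is only a sketch under assumptions (Linux, a non-empty file, hypothetical function name): it maps the file MAP_PRIVATE and stores one byte of the original content back into every page, so each page is copied before anybody else writes to the file. Nothing stops a writer from getting in between the mmap() and the touching loop, so this does not close the race.

```python
import mmap
import os


def snapshot_by_touching(path):
    """Map `path` privately and force a private copy of every page."""
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size  # must be non-zero; mmap rejects empty files
        mapping = mmap.mmap(
            fd, size, flags=mmap.MAP_PRIVATE, prot=mmap.PROT_READ | mmap.PROT_WRITE
        )
    finally:
        os.close(fd)  # the mmap object keeps its own duplicate of the descriptor

    for offset in range(0, size, mmap.PAGESIZE):
        # Storing the byte that is already there still counts as a write,
        # so the kernel copies the page into private memory.
        mapping[offset] = mapping[offset]
    return mapping
```

After this, later write() calls on the file are not expected to show up in the mapping, at the price of the whole file having been copied.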
What would be the benefit of doing that with mmap(MAP_PRIVATE|MAP_POPULATE) vs. just reading the entire file into the process's anonymous private memory?
Zero-copy unless someone tries to write to the file. The way I imagine it, the write would block until the copying is completed, so this race condition would not exist.
