LWN: Comments on "Architecture emulation containers with binfmt_misc" https://lwn.net/Articles/679308/
This is a special feed containing comments posted to the individual LWN article titled "Architecture emulation containers with binfmt_misc". Thu, 04 Sep 2025 10:39:47 +0000

https://lwn.net/Articles/679676/ jejb
<div class="FormattedComment"> The problem with both of those is that they contaminate the container image, which makes it hard to handle non-native pure images. Secondly, with hard linking, the emulator has to be on the same mount as the image, which usually isn't the case for Docker-style images; and for bind mounting, you need the support of the container orchestration system to perform the bind mount. None of this is unsolvable, but it's certainly a lot easier to have the emulation just work.<br> </div>
Thu, 10 Mar 2016 22:51:27 +0000

https://lwn.net/Articles/679667/ eternaleye
<div class="FormattedComment"> Well, there's also the fact that it's executed from a pre-opened FD, so these may be relevant:<br> <p>
- Possibly less overhead, as it doesn't need to do FS traversals to get to the binary<br>
- Doesn't break if the user accidentally unlinks it (as was called out as a potential failure mode of what you suggest)<br>
- Reduces the divergence between an emulated container and a native container on that arch (as far as the emulated container can see)<br>
- Avoids any need to make changes _within_ the container to boot it on another host arch<br>
- Likely more I haven't thought of<br> </div>
Thu, 10 Mar 2016 21:05:13 +0000

https://lwn.net/Articles/679664/ eternaleye
<div class="FormattedComment"> It depends on your 
flags - in particular, the "O" flag says to pass a pre-opened FD, and pass the FD number as an argument. There's also "C", which implies "O" and calculates credentials according to the binary rather than the interpreter.<br> <p> However, you are correct that no mode of operation seems to pass it on stdin.<br> </div>
Thu, 10 Mar 2016 21:01:08 +0000

https://lwn.net/Articles/679659/ smurf
<div class="FormattedComment"> Unfortunately, this solution still requires the emulator to be statically linked.<br> Hard-linking or bind-mounting a single emulator binary isn't that difficult, so I don't think this feature helps much.<br> </div>
Thu, 10 Mar 2016 20:47:34 +0000

https://lwn.net/Articles/679625/ itvirta
<div class="FormattedComment"> The filename of the script, yes. It couldn't be through stdin, since the script might need that for user input.<br> </div>
Thu, 10 Mar 2016 17:20:17 +0000

https://lwn.net/Articles/679556/ raven667
<div class="FormattedComment"> I'm sure you are probably right; I'm a sysadmin and not much of a developer, but I have an unformed suspicion that there is some kernel syscall or resource commonly presented inside containers that would treat the open fd from outside the contained environment as an access token proving the program should be allowed to perform operations outside the container, which could be leveraged to exit the container. 
I don't know of a mechanism to do this, so you are probably right and it's not possible; my lack of confidence reflects my lack of deep knowledge of this area more than any real problem.<br> </div>
Thu, 10 Mar 2016 15:39:33 +0000

https://lwn.net/Articles/679537/ RobSeace
<div class="FormattedComment"> <font class="QuotedText">&gt; When such a file is recognized, the name of the interpreter for the script will be read from the first line of the file; the interpreter will then be run with the file as its standard input. </font><br> <p> Actually, it just passes the script as a command-line parameter, not as stdin, doesn't it?<br> </div>
Thu, 10 Mar 2016 12:40:56 +0000

https://lwn.net/Articles/679536/ jejb
<div class="FormattedComment"> Actually, that's not possible. The emulator runs inside the container, not outside of it. What is poked through into the container from the outside is a file descriptor, opened in the host OS, which is then mapped and executed inside the container, so any fault in the emulator faults inside the container, not outside of it. It also means that, except for the fd of the emulator binary, the emulator has no access to any resources outside of the container (that's why, as I explained in the 0/3 patch, the emulator has to be static ... 
it can't resolve dynamic libraries outside of the container)<br> </div>
Thu, 10 Mar 2016 12:34:12 +0000

https://lwn.net/Articles/679500/ raven667
<div class="FormattedComment"> I'm sure James did good work, but I worry that somewhere along the line the interpreter would retain access to some resource from outside the contained environment, such as the mmap of the interpreter binary outside the container, as James notes, that would allow privilege escalation. He doesn't think that is possible, but it would be great if someone who understands this better than either of us could comment on it.<br> </div>
Thu, 10 Mar 2016 04:14:26 +0000
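
The flag behaviour discussed in the thread ("O" passes a pre-opened FD, "C" implies "O") can be illustrated with a minimal Python sketch. It assumes the usual binfmt_misc registration layout `:name:type:offset:magic:mask:interpreter:flags` with `:` as the separator. The helper names, the rule name `demo`, and the interpreter path are hypothetical. The code only builds the registration string; actually writing it to `/proc/sys/fs/binfmt_misc/register` (which needs root) is deliberately left out.

```python
def binfmt_rule(name, magic, interpreter, flags="", kind="M", offset="", mask=""):
    """Build a binfmt_misc registration string, assuming ':' as the separator.

    All argument values here are illustrative; nothing is registered.
    """
    for field in (name, magic, interpreter):
        if ":" in field:
            raise ValueError("fields must not contain the ':' separator")
    return f":{name}:{kind}:{offset}:{magic}:{mask}:{interpreter}:{flags}"


def passes_open_fd(flags):
    """Return True if the interpreter is handed a pre-opened FD.

    'O' requests this directly; 'C' implies 'O', as noted in the thread.
    """
    return "O" in flags or "C" in flags


# Hypothetical rule: match files starting with "MZ", use credentials of the binary.
rule = binfmt_rule("demo", "MZ", "/usr/local/bin/demo-interp", flags="C")
print(rule)
print(passes_open_fd("C"))
```

With no flags at all, `passes_open_fd("")` is false, which corresponds to the legacy behaviour of passing the binary's path as a command-line argument.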