LWN: Comments on "Application-friendly kernel interfaces" https://lwn.net/Articles/227818/ This is a special feed containing comments posted to the individual LWN article titled "Application-friendly kernel interfaces". Sun, 12 Oct 2025 12:58:14 +0000

liblinux https://lwn.net/Articles/229438/ slamb I'm not sure about "would always be in sync". If you require that, people who have multiple kernels on their box would need some mechanism such that the correct liblinux for whatever kernel they happened to boot is dynamically loaded. Seems possible, but it's a step beyond "maintained from the same source". <p>This is one of those areas where the BSDs have an easier time. They do "make world", and it's just inconceivable that an actual end user would mix'n'match kernel and userspace from different versions of FreeBSD. They got away with things like top assuming layout of kernel structures and accessing /dev/kmem for a long time. On Linux, that sort of mutt system is considered normal, so stuff has to be carefully versioned. Fri, 06 Apr 2007 01:06:56 +0000

Why add anything? https://lwn.net/Articles/229403/ joib One problem is that the number of large page TLB entries is quite limited. E.g. on current Opterons, while you have a 512-entry data TLB for the normal 4K pages, for the 2M large pages you only have 8 entries. So if you have a loop kernel reading/writing from more than 8 big arrays you're going to have TLB thrashing.<br> <p> I would presume that for non-HPC applications these non-streaming, irregular access patterns are even more common. Though supposedly AMD is fixing this issue with the upcoming 'Barcelona' by having 128 2M TLB entries, and additionally supporting 1G pages (don't know how many TLB entries for those).<br> <p> For comparison, the Intel Woodcrest has 256 4K and 32(?)
2M TLB entries.<br> Thu, 05 Apr 2007 18:57:44 +0000

Application-friendly kernel interfaces https://lwn.net/Articles/229402/ joib For large pages, there's <a rel="nofollow" href="http://sourceforge.net/projects/libhugetlbfs">libhugetlbfs</a>, so you can use large pages via LD_PRELOAD without changing the application itself. Thu, 05 Apr 2007 18:38:43 +0000

Why add anything? https://lwn.net/Articles/229330/ farnz Might be worth looking at <a href="http://linux-mm.org/HugePages">Linux-mm.org on huge pages</a>. In particular, there's a link to <a href="http://lwn.net/Articles/188056/">an LWN article on transparent use of huge pages</a>. The "holy grail" is very definitely transparent use, so that whenever possible, all applications gain; anything that makes it easier to move that way is helpful. <p>One thought: if your mmap parameter is simply a hint that the block will be used in a particular granularity, it's easy to implement. Current mmap sets the parameter to 1 byte (no granularity needed), unless mmapping in hugetlbfs pages, when it sets the parameter to (e.g.) 16M. The kernel then just rounds up to the next highest available page size when possible, or down if not. Thu, 05 Apr 2007 14:15:13 +0000

Why add anything? https://lwn.net/Articles/228702/ giraffedata I can see the value of using mmap() for this, but I don't think you want mmap() guessing based on the size of the request what page size is best. <p> It's quite possible that 16M of memory will consist of 100 scattered 4K pages of working set and the rest rarely used or even vacant. You wouldn't want to page the whole 16M in and out in that case. <p> Page granularity seems like a perfectly sensible parameter of an mmap, though. Fri, 30 Mar 2007 23:19:16 +0000

Why add anything? https://lwn.net/Articles/228615/ mjr I'm wondering the same myself.
I'm not much for low level hacking, but I fail to see what benefits one would reap from yet another interface.<br> <p> If a separate interface is really necessary for some reason, I'd put the same functionality behind regular libc malloc(); it already does brk() for small allocations and mmap() for large ones I believe, so it could just as well do extra-large allocations via the hugetlb API. (Putting this in malloc instead of mmap would get rid of the partial-munmap issue on the libc end.)<br> Fri, 30 Mar 2007 12:59:16 +0000

Why add anything? https://lwn.net/Articles/228585/ ncm Why should this need a file system, or a device, or a library at all?<br> <p> It should suffice to call mmap() and ask for an anonymous chunk of 16M, and the kernel can simply recognize that a hugetlb would serve, and use it. If, later, the process unmaps pages within it, the remaining pages can be switched over to the regular mapping scheme; most processes won't. Then it would be easy, safe, and backward-compatible for libc to switch malloc over to allocating hugetlb chunks by default, benefitting everybody.<br> <p> I would also like to see a flag added to mmap() to require that the mapped block be aligned to match its size; e.g. ask for 16M and the bottom 24 bits of the returned address are 0. (Anybody else remember when 68K chips shipped with only 24 address pins, and Apple stuck annotations in the top 8 bits of addresses because the hardware ignored those bits?)<br> Fri, 30 Mar 2007 06:16:30 +0000

Really? https://lwn.net/Articles/228581/ IkeTo <font class="QuotedText">&gt; I mean, you design a hard-to-use interface, then write your own code which</font><br> <font class="QuotedText">&gt; presents a friendly interface to userspace -- and you write it in</font><br> <font class="QuotedText">&gt; userspace.
Well, why not present a friendly interface in the kernel in the</font><br> <font class="QuotedText">&gt; first place?</font><br> <p> Perhaps the whole hugetlb thing illustrates one of the possible reasons. The original /dev/hshm interface is actually more general than the /dev/hugetlb interface: it allows multiple processes unrelated in ancestry to share the same piece of huge page. It is probably preferable for the kernel API to offer only the general interface rather than having to implement both, since every time the interface changes, there needs to be a "global search" for libraries/applications using the interface, and enough time has to be left for those libraries/applications to change (if Linus does not say "no" to the change right away). So it might be preferable to implement just the general interface, hoping that it will never change at all, and have another library "cast" it to various "more friendly" forms like the hugetlb interface. What is unclear to me is why one would expect the new library to be exempt from the global search if it needs to be changed.<br> <p> I think instead of a general liblinux, we should be content with the tested solutions of, e.g., pthread (futex) and libfam (dnotify): if the functionality suits a general audience, the easier interface is implemented in libc, and if it does not, the easier interface is implemented in a functionality-specific library. That way, when the generic interface is changed, the kernel developers have fewer places to search for direct users of it; and the specific interface is usable (and thus relied upon) by a narrower set of end-user applications.<br> <p> Fri, 30 Mar 2007 05:40:33 +0000

Really? https://lwn.net/Articles/228554/ cpeterso <blockquote>Is it just because kernel->userspace interfaces are set in stone and have to be maintained forever?
For that would feel a bit like medieval astronomers -- weaving layer over layer of epicycles so that their spheres would match the real planet trajectories. Here we would have a kernel interface set in stone, then some library code -- which once people use it would again be set in stone, only to add a new glue layer... again and again. Waiting a few iterations might be a better course of action, and I gather from LWN that it is often taken by kernel devs.</blockquote> I think the kernel API <i>can</i> change, so user programs should use the "friendly" userspace library APIs. Fri, 30 Mar 2007 01:16:20 +0000

Really? https://lwn.net/Articles/228523/ man_ls Do you really think it is a great idea? Pardon my lack of knowledge about kernel development, but why is it so great? I mean, you design a hard-to-use interface, then write your own code which presents a friendly interface to userspace -- and you write it in userspace. Well, why not present a friendly interface in the kernel in the first place? <p> Is it just because kernel->userspace interfaces are set in stone and have to be maintained forever? For that would feel a bit like medieval astronomers -- weaving layer over layer of epicycles so that their spheres would match the real planet trajectories. Here we would have a kernel interface set in stone, then some library code -- which once people use it would again be set in stone, only to add a new glue layer... again and again. Waiting a few iterations might be a better course of action, and I gather from LWN that it is often taken by kernel devs. <p> If the purpose of this scheme is to have a more powerful interface, I much prefer our editor's suggestion: <blockquote> A separate library for developers trying to do obscure and advanced things with the kernel might be the right solution. </blockquote> I have seen too many complex interfaces that nobody uses because they are so complex, and everyone uses the simplified version.
Better start simple, and then add complexity as needed. Thu, 29 Mar 2007 22:21:09 +0000

Application-friendly kernel interfaces https://lwn.net/Articles/228419/ vmole <p>Strike 2! <i>[Steve readies for the next pitch.]</i> <p>Yeah yeah, I know that code written for forums like this is at best pseudo-code. Hell, I <a href="http://lwn.net/Articles/226285/">blew it</a> just the other day, so I'm hardly the one to be picking on you, but I was amused by the "Show me the code" - "huh?" sequence. <p>Perhaps we can get away with claiming "Well, it was actually a debugging test for the reader". Right, that's it. Thu, 29 Mar 2007 16:02:23 +0000

Application-friendly kernel interfaces https://lwn.net/Articles/228415/ ebiederm Yea yea.<br> <p> snprintf(buffer, sizoef(buffer), ....);<br> Thu, 29 Mar 2007 15:38:45 +0000

Application-friendly kernel interfaces https://lwn.net/Articles/228403/ vmole <p><i>snprintf(buffer, "%s/XXXXXX", PATH_TO_HUGETLBFS);</i> <p>So much for working code... ;-) Thu, 29 Mar 2007 15:16:24 +0000

liblinux https://lwn.net/Articles/228339/ jospoortvliet Indeed, it really sounds like a great idea. This way, systems like GTK/glibc and Qt/kdelibs could link to this library, or even use it only when available to speed some things up, while using workarounds on other OSes like the BSDs, Solaris, Mac OS X, etc.<br> Thu, 29 Mar 2007 11:54:52 +0000

liblinux https://lwn.net/Articles/228333/ hummassa This would also permit the kernel devs to further experiment in yanking <br> functionality out of the kernel... things that _could_ be done in <br> userspace without performance penalties _should_ be done in userspace :-) <br> linux + liblinux would be maintained from the same source -- so they would <br> always be in sync -- and this would be really great.
<br> Thu, 29 Mar 2007 11:30:32 +0000

Application-friendly kernel interfaces https://lwn.net/Articles/228301/ ms I think this is a great idea. This allows for greater decoupling between glibc and the Linux kernel and is, IMHO, the proper abstraction. Plus, if the authors of the kernel interfaces are subsequently charged with writing liblinux entries, then there could well be cases where the authors would rather return to the drawing board and rethink the kernel interface if it's just too damn hard to use from userspace.<br> Thu, 29 Mar 2007 08:21:13 +0000

Application-friendly kernel interfaces https://lwn.net/Articles/228283/ ebiederm Huh?<br> <p>
#define PATH_TO_HUGETLBFS "/dev/hshm"<br> <p>
void *map_anon_hugetlb(size_t size)<br>
{<br>
char buffer[PATH_MAX];<br>
int fd;<br>
snprintf(buffer, "%s/XXXXXX", PATH_TO_HUGETLBFS);<br>
fd = mkstemp(buffer);<br>
if (fd &lt; 0)<br>
return MAP_FAILED;<br>
unlink(buffer);<br>
ftruncate(fd, size);<br>
return mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);<br>
}<br> <p> Thu, 29 Mar 2007 06:02:46 +0000

Application-friendly kernel interfaces https://lwn.net/Articles/228276/ orospakr liblinux, eh?<br> <p> now that's an interesting idea.<br> Thu, 29 Mar 2007 04:32:09 +0000

Application-friendly kernel interfaces https://lwn.net/Articles/228270/ jreiser <i>It's not possible to do normal reads and writes from this filesystem [hugetlbfs] ...</i> <p>and that makes hugetlbfs <b>less</b> than a filesystem. Hugetlbfs is a hack, and it is hard to use. Hugetlbfs is so hard to use that our editor could not find an actual working example to cite. Show me the code! Thu, 29 Mar 2007 03:07:26 +0000
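
In the spirit of "Show me the code!", here is a minimal sketch in C of the mkstemp/unlink/ftruncate/mmap sequence posted in the thread, with the buffer size passed to snprintf() and the return values checked. It is an illustration under stated assumptions, not a tested hugetlbfs recipe: the function name `map_anon_in_dir` and its `dir` parameter are hypothetical, the caller is assumed to pass a directory on a mounted hugetlbfs (the /dev/hshm path in the thread is only one poster's example), and `size` is assumed to be a multiple of the huge page size in that case.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Map `size` bytes backed by an unlinked temporary file created in `dir`.
 * Assumption: when `dir` is a hugetlbfs mount point, the mapping is backed
 * by huge pages and `size` should be a multiple of the huge page size.
 * Returns MAP_FAILED on any error.
 * `map_anon_in_dir` is a hypothetical helper, not an existing library call.
 */
void *map_anon_in_dir(const char *dir, size_t size)
{
    char path[PATH_MAX];
    void *addr;
    int fd, n;

    assert(dir != NULL);

    /* snprintf() takes the buffer size as its second argument. */
    n = snprintf(path, sizeof(path), "%s/XXXXXX", dir);
    if (n < 0 || (size_t)n >= sizeof(path))
        return MAP_FAILED;          /* template did not fit in the buffer */

    fd = mkstemp(path);             /* replaces XXXXXX with a unique name */
    if (fd < 0)
        return MAP_FAILED;
    unlink(path);                   /* the name goes away; the file stays open */

    if (ftruncate(fd, (off_t)size) < 0) {
        close(fd);
        return MAP_FAILED;
    }

    addr = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
    close(fd);                      /* an established mapping outlives the fd */
    return addr;                    /* MAP_FAILED if mmap() itself failed */
}
```

Pointing `dir` at an ordinary directory such as /tmp runs the same sequence with normal pages, which is a convenient way to check the plumbing on a machine that has no hugetlbfs mounted.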