LWN: Comments on "Killing off /dev/kmem" https://lwn.net/Articles/851531/ This is a special feed containing comments posted to the individual LWN article titled "Killing off /dev/kmem". en-us Sat, 20 Sep 2025 06:36:58 +0000 Sat, 20 Sep 2025 06:36:58 +0000 https://www.rssboard.org/rss-specification lwn@lwn.net Killing off /dev/kmem https://lwn.net/Articles/880731/ https://lwn.net/Articles/880731/ aCOSwt <div class="FormattedComment"> &quot;The occasional user-space device driver still needs /dev/mem to function, but it&#x27;s otherwise unused.&quot;<br> <p> AFAIU lilo-24.2 (latest) would also be happy to use it :<br> <p> if ((fd=open(DEV_DIR &quot;/mem&quot;, O_RDONLY)) &lt; 0) return buf_valid=1;<br> <p> (from the fetch function in probe.c)<br> <p> this in order to determine misc. hardware (floppies / disks / video ) related information.<br> <p> OK no harm if it cannot, lilo will just print a warning.<br> </div> Sat, 08 Jan 2022 16:11:33 +0000 Killing off /dev/kmem https://lwn.net/Articles/855218/ https://lwn.net/Articles/855218/ jjulian <div class="FormattedComment"> * Disclaimer: I have nothing to do with any of this! :)<br> </div> Mon, 03 May 2021 17:31:26 +0000 Killing off /dev/kmem https://lwn.net/Articles/852185/ https://lwn.net/Articles/852185/ quanstro <div class="FormattedComment"> i suppose if you consider mainstream *nix variants, linux may have been first. however, 8th edition unix introduced the concept of /proc. and iirc, linux was inspired by plan 9, not 8th edition. in plan 9 there is extra expressiveness. for example, /proc allows inspection of another machine&#x27;s processes without endian/word size concerns via mount and bind. this is how stats(1) works. ioctl is similarly not included.<br> <p> </div> Sat, 10 Apr 2021 17:45:32 +0000 Linux proc influence https://lwn.net/Articles/852134/ https://lwn.net/Articles/852134/ quanstro <div class="FormattedComment"> i recall that as well. at the time, nobody had access to plan 9. 
so linux /proc was inspired by _papers_ about plan9&#x27;s /proc. one of the things linux missed---perhaps because the plan 9 implementation was not visible---was a dirt cheap way to have lots of little single-job file systems. so linux /proc acquired a few warts. there is more to implement in a linux file system than 9p. but perhaps the 9p subset is good enough for most cases.<br> </div> Fri, 09 Apr 2021 17:49:08 +0000 Well, we remember some different things... https://lwn.net/Articles/852025/ https://lwn.net/Articles/852025/ Kamilion <div class="FormattedComment"> Wow, big thanks to Albert Cahalan and Michael K. Johnson for showing up and taking the time to explain to us johnny-come-latelys.<br> <p> Took me a moment to look at the posters&#x27; names and realize they were the very people involved.<br> </div> Thu, 08 Apr 2021 21:17:31 +0000 how I remember the history https://lwn.net/Articles/851906/ https://lwn.net/Articles/851906/ k8to <div class="FormattedComment"> The warning on ps -aux led to<br> <p> export I_WANT_A_BROKEN_PS=shutup<br> <p> being part of my .profile on Linux for many years. 
At some point around 2012 I realized I didn&#x27;t need it anymore.<br> </div> Thu, 08 Apr 2021 03:58:37 +0000 Yes, procps from /proc ps https://lwn.net/Articles/851905/ https://lwn.net/Articles/851905/ k8to <div class="FormattedComment"> Given variations in proc, ps is still the portable solution, sadly.<br> <p> </div> Thu, 08 Apr 2021 03:52:23 +0000 Linux proc influence https://lwn.net/Articles/851895/ https://lwn.net/Articles/851895/ michaelkjohnson <div class="FormattedComment"> My confident recollection is that the initial implementation of the Linux proc filesystem was explicitly inspired by plan 9&#x27;s proc filesystem.<br> </div> Thu, 08 Apr 2021 00:58:12 +0000 Yes, procps from /proc ps https://lwn.net/Articles/851880/ https://lwn.net/Articles/851880/ quanstro <div class="FormattedComment"> procfs goes back to at least 8ed in 1984, and was included (and expanded) in plan 9 from the beginning.<br> <p> </div> Wed, 07 Apr 2021 17:05:44 +0000 Killing off /dev/kmem https://lwn.net/Articles/851859/ https://lwn.net/Articles/851859/ Paf <div class="FormattedComment"> Oh god, that is .... wow. 😂 <br> </div> Wed, 07 Apr 2021 14:52:06 +0000 Killing off /dev/kmem https://lwn.net/Articles/851830/ https://lwn.net/Articles/851830/ songmaster <div class="FormattedComment"> This isn’t an example of reading the loadavg values, but for some code that goes delving into the internals of the OS to get data out of it I recommend looking at the legacy branch of the lsof program at <a href="https://github.com/lsof-org/lsof/tree/legacy">https://github.com/lsof-org/lsof/tree/legacy</a>. It supported many versions of Unix, and had to find and extract many different pieces of data to generate its output. 
The 00PORTING file at <a href="https://github.com/lsof-org/lsof/blob/legacy/00PORTING">https://github.com/lsof-org/lsof/blob/legacy/00PORTING</a> mentions briefly how it actually did that, and even takes a dig at “some down-sides to the Linux /proc-based lsof.”<br> </div> Wed, 07 Apr 2021 02:50:47 +0000 Well, we remember some different things... https://lwn.net/Articles/851827/ https://lwn.net/Articles/851827/ michaelkjohnson <div class="FormattedComment"> I found some old tarballs to refresh my memory. ☺<br> <p> Branko Lankester built kmem ps that came earlier.<br> <p> In the earliest procps version I found (0.7), I already tried to honor at least SysV arguments e and f for people whose fingers had been trained on SysV, but provided BSD-style output regardless. Your rewrite implemented multiple personalities, which was naturally much better.<br> <p> It looks like I introduced sorting output in version 0.93 in April 1994, later than I recalled, but before I was aware of you doing work on procps. That version definitely sorts by default, and the &quot;o&quot; option toggles sorting. I also clearly failed to update the man page along with that new feature.<br> <p> Your memory of the transition is different from mine. I was doing a poor job of being maintainer (slow to apply patches and do new releases) but I certainly didn&#x27;t &quot;revert&quot; color support, though I suspect it was there in a patch or fork that I hadn&#x27;t adopted. A fork was the obvious response to an unresponsive maintainer, so no complaints there! 
I did finally step back formally.<br> </div> Wed, 07 Apr 2021 02:10:05 +0000 Killing off /dev/kmem https://lwn.net/Articles/851783/ https://lwn.net/Articles/851783/ Wol <div class="FormattedComment"> Not Unix, but I worked on minis, and I set up a bunch of work queues with very strict limits, so if you wanted to fire off a load of jobs you could hammer the system without impacting everyone. My favourite was the &quot;quick&quot; queue, which ran at highest priority, but had a wall-clock limit of 30 seconds. If it over-ran that it just got killed.<br> <p> Cheers,<br> Wol<br> </div> Tue, 06 Apr 2021 16:53:10 +0000 Killing off /dev/kmem https://lwn.net/Articles/851779/ https://lwn.net/Articles/851779/ corbet Look at sendmail, for example; it will stop processing mail if the system gets too busy. Tue, 06 Apr 2021 16:24:40 +0000 Killing off /dev/kmem https://lwn.net/Articles/851778/ https://lwn.net/Articles/851778/ shakkhar <div class="FormattedComment"> <font class="QuotedText">&gt; In the distant past, when computers were scarce and it was common to run many tasks on the same machine, jobs that were not time-critical would often consult the load average and defer their work if it was too high. </font><br> <p> Can anyone share an algorithm, code, or doc that exemplifies this practice?<br> </div> Tue, 06 Apr 2021 16:17:37 +0000 Killing off /dev/kmem https://lwn.net/Articles/851735/ https://lwn.net/Articles/851735/ nix <div class="FormattedComment"> Hah, that&#x27;s nothing! The original home of nightmares is xterm, and while there is no easily web-accessible canonical xterm source (only tarballs), there are random GitHub mirrors I can point at. Look at main() here: <a href="https://github.com/joejulian/xterm/blob/master/main.c">https://github.com/joejulian/xterm/blob/master/main.c</a>. Look at the wonderful tangle in Tinput() here: <a href="https://github.com/joejulian/xterm/blob/master/Tekproc.c">https://github.com/joejulian/xterm/blob/master/Tekproc.c</a>. 
Then fear, for this is code people are still using. (Though it could be worse still. It could be procmail.)<br> <p> </div> Tue, 06 Apr 2021 14:03:57 +0000 Killing off /dev/kmem https://lwn.net/Articles/851734/ https://lwn.net/Articles/851734/ foxcrisp <div class="FormattedComment"> Earlier unixes required apps to read from /dev/kmem, and know the format of the kernel data structures and location. Linux changed all of that, by exposing most things via /proc - mostly simple text strings. In a security-based world, /dev/mem and /dev/kmem are just holes to allow access to any part of memory. Whilst the early implementations used unix group permissions, that just meant delegating the security mechanisms to the group mechanisms. That simply opens up the surface area (either get yourself root, or get access to the relevant group for reading /dev/kmem).<br> <p> In a sandboxed container, one doesn&#x27;t need /dev/kmem, so one could argue that if it&#x27;s not needed by some apps, it is not needed by any apps. <br> <p> It is a shame if we lose it, but few apps truly needed it (I used it for dtrace, a while back - but would have to research the proposed alternatives).<br> </div> Tue, 06 Apr 2021 13:53:41 +0000 how I remember the history https://lwn.net/Articles/851716/ https://lwn.net/Articles/851716/ acahalan <div class="FormattedComment"> Somebody wrote the original ps.<br> <p> Based on that, Michael K Johnson wrote the procps.<br> <p> Somebody else ended up maintaining procps for a while, adding color to the output, but then not much happened.<br> <p> Michael K Johnson, then at Red Hat, decided (was told?) to maintain procps. He reverted to the pre-color version of the code. He put out a call for help, and Albert Cahalan responded with the suggestion that ps support both BSD and SysV syntax like OSF/1 (later renamed Digital UNIX, then Tru64) and AIX did. This would have been 1996 probably, or perhaps 1997. Sorted output was possible, but I don&#x27;t believe it was the default. 
It should not be the default, mainly because ps is often used when a system is low on memory but also because partial output is desirable when running on a failing kernel. Sorting with the &quot;O&quot; option appears to have a BSD origin.<br> <p> Albert Cahalan rewrote procps, initially just to prove that it would be possible to go beyond what OSF/1 and AIX could do, parsing mixed BSD and SysV options. (OSF/1 could only do one or the other, not mixed.) There was then some human conflict relating to &quot;ps -aux&quot; printing a warning. Craig Small over at Debian started using Albert Cahalan&#x27;s new code. This code definitely did not sort by default.<br> <p> Michael K Johnson turned over a CVS repository to Rik van Riel and Ingo Molnar, excluding Albert Cahalan without explanation. This was almost certainly in 1997. Albert Cahalan then put a version 3.x.x on sourceforge, where he maintained procps for about a decade. At some point the 2.x.x version was made unreliable, grouping processes as threads if they happened to share various attributes as procps non-atomically looked at them. Albert Cahalan instead enhanced the /proc filesystem by adding the /proc/*/task/ directories and the thread counts. All distributions, including Red Hat, switched over to Albert Cahalan&#x27;s procps 3.x.x code.<br> <p> After about a decade maintaining procps, Albert Cahalan became too busy due to a large family. Also, he was demotivated because he found that it was impossible to stop Red Hat from hacking things up in ways that would add bugs and ill-considered compatibility troubles. This led to Craig Small, the Debian package maintainer, joining up with some other people to start the 4.x.x version series elsewhere.<br> <p> </div> Tue, 06 Apr 2021 13:09:35 +0000 Killing off /dev/kmem https://lwn.net/Articles/851731/ https://lwn.net/Articles/851731/ tux3 <div class="FormattedComment"> eBPF can feel pretty limiting sometimes. 
Thankfully there are Systemtap Guru scripts for all my &quot;poke at kernel structures without manually writing a module&quot; needs :)<br> </div> Tue, 06 Apr 2021 12:39:43 +0000 group kmem ~= root https://lwn.net/Articles/851728/ https://lwn.net/Articles/851728/ michaelkjohnson Given the information available in /dev/kmem, setgid kmem is insignificantly different from setuid root. It <i>feels</i> better, but in the end it needs to be treated the same from a security perspective. Tue, 06 Apr 2021 12:08:23 +0000 Yes, procps from /proc ps https://lwn.net/Articles/851725/ https://lwn.net/Articles/851725/ lyda <div class="FormattedComment"> Can&#x27;t remember exactly why, but I had to look for a process in C on SCO once and discovered that the two ways to do it were to pull it out of /dev/kmem or to parse ps output. Massive gaping hole in libc in my mind. The /proc fs is a good unix-y solution.<br> <p> I can imagine switching to it saved a massive amount of hassle.<br> </div> Tue, 06 Apr 2021 11:40:36 +0000 Killing off /dev/kmem https://lwn.net/Articles/851715/ https://lwn.net/Articles/851715/ markh <div class="FormattedComment"> /dev/kmem was readable by group kmem, so programs requiring access to it could be made setgid kmem. (That is still the case for /dev/mem and /dev/port.) It&#x27;s still a security concern, but better than requiring setuid root.<br> </div> Tue, 06 Apr 2021 03:56:47 +0000 Killing off /dev/kmem https://lwn.net/Articles/851691/ https://lwn.net/Articles/851691/ Paf <div class="FormattedComment"> “ but the truly masochistic can wade through what must be one of the deeper circles of #ifdef hell”<br> My God. 
You were *not* kidding about the ifdefs.<br> </div> Mon, 05 Apr 2021 21:36:48 +0000 Killing off /dev/kmem https://lwn.net/Articles/851689/ https://lwn.net/Articles/851689/ luto <div class="FormattedComment"> Fortunately, eBPF can replace these legacy /dev/kmem uses.<br> <p> /me runs<br> </div> Mon, 05 Apr 2021 21:05:33 +0000 Yes, procps from /proc ps https://lwn.net/Articles/851684/ https://lwn.net/Articles/851684/ michaelkjohnson <div class="FormattedComment"> Yes, that was the source of the name.<br> <p> The original ps for Linux, often requiring rebuild after a new kernel build, if any of the structures had changed, used /dev/kmem. And after procps was released, the original ps was sometimes referred to as &quot;kmem ps&quot; to differentiate.<br> <p> The original proc filesystem did not have enough functionality for a full replacement version of ps. I modified it to have all the necessary data for ps and uptime, and then wrote procps as a suite of programs that used the new functionality.<br> <p> The output was formatted compactly (keep in mind this was when a 386sx16 was a decent machine) and I separated the stat and statm files because of the expense of producing the statm data, then in ps I kept track of whether statm needed to be read in order to produce the output.<br> <p> As far as I know, my original procps was the original implementation of ps that defaulted to sorted output. 
All versions of ps that directly read /dev/kmem, as far as I know, listed the data in the order it happened to find it in the kernel memory it was digging through, and I was tired of juggling sort arguments when invoking ps.<br> <p> I believe my original procps was also among the first, if not the first, to just recognize either BSD or SysV command line arguments and do what you meant, rather than requiring you to remember which syntax you needed to use on this particular system.<br> <p> In any case, I don&#x27;t know whether I was actually the first to introduce either or both of sorting internally and honoring both BSD and SysV arguments, or if one or both were previously invented and I unknowingly reimplemented ideas that already existed.<br> </div> Mon, 05 Apr 2021 20:30:05 +0000 Killing off /dev/kmem https://lwn.net/Articles/851678/ https://lwn.net/Articles/851678/ josh <div class="FormattedComment"> You can do that with /proc/kcore now.<br> </div> Mon, 05 Apr 2021 18:43:18 +0000 Killing off /dev/kmem https://lwn.net/Articles/851676/ https://lwn.net/Articles/851676/ ribalda <div class="FormattedComment"> Am I the only one that was using:<br> <p> cat /dev/kmem &gt; core<br> gdb vmlinux core<br> <p> To debug the current state of the kernel/drivers?<br> <p> Of course, never enabled in production. But for bringup it is extremely helpful.<br> </div> Mon, 05 Apr 2021 18:34:27 +0000 Killing off /dev/kmem https://lwn.net/Articles/851673/ https://lwn.net/Articles/851673/ sjfriedl <div class="FormattedComment"> <font class="QuotedText">&gt; How did unprivileged code determine the load average?</font><br> <p> I think the `ps` command was setuid root; what could go wrong? :-)<br> </div> Mon, 05 Apr 2021 18:15:43 +0000 Killing off /dev/kmem https://lwn.net/Articles/851665/ https://lwn.net/Articles/851665/ nickodell <div class="FormattedComment"> <font class="QuotedText">&gt; But how does one determine the current load average? 
Unix kernels have maintained those statistics for decades, but they originally kept that information to themselves. User-space code that wanted to know this number would have to do the following:</font><br> <p> <font class="QuotedText">&gt; * Read the symbol table from the executable image of the current kernel to determine the location of the avenrun array.</font><br> <font class="QuotedText">&gt; * Open /dev/kmem and seek to that location.</font><br> <font class="QuotedText">&gt; * Read the avenrun array into a user-space buffer.</font><br> <p> How did unprivileged code determine the load average? Were unprivileged users allowed to read /dev/kmem in the past?<br> </div> Mon, 05 Apr 2021 17:56:39 +0000 Killing off /dev/kmem https://lwn.net/Articles/851660/ https://lwn.net/Articles/851660/ josh <div class="FormattedComment"> Did you call it &quot;procps&quot; because it&#x27;s a ps that uses /proc rather than other methods?<br> </div> Mon, 05 Apr 2021 17:31:40 +0000 Killing off /dev/kmem https://lwn.net/Articles/851650/ https://lwn.net/Articles/851650/ michaelkjohnson <div class="FormattedComment"> Well, that took a while.<br> <p> Not needing /dev/kmem was one of the points when I first created procps lo these many years ago.<br> </div> Mon, 05 Apr 2021 16:04:19 +0000
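
Several comments above touch on the old practice of reading the load average (originally by digging the avenrun array out of /dev/kmem) and deferring non-urgent work when the machine is busy. As a rough illustration only, here is how the same idea might look today on Linux, where the numbers are exposed as text in /proc/loadavg and no privileges are needed. This is a minimal sketch, not code from sendmail or any other program mentioned in the thread; the function names and the threshold value are hypothetical and chosen for the example.

```python
import os


def parse_loadavg(text):
    """Parse the contents of /proc/loadavg.

    The first three whitespace-separated fields are the 1-, 5- and
    15-minute load averages; the remaining fields are ignored here.
    """
    fields = text.split()
    return tuple(float(value) for value in fields[:3])


def read_loadavg(path="/proc/loadavg"):
    """Return (1min, 5min, 15min) load averages without special privileges."""
    try:
        with open(path) as f:
            return parse_loadavg(f.read())
    except OSError:
        # No /proc (or not Linux): fall back to the C library's getloadavg().
        return os.getloadavg()


def should_defer(load_1min, threshold):
    """Hypothetical policy: postpone the job while the 1-minute load is high."""
    return load_1min > threshold


if __name__ == "__main__":
    one, five, fifteen = read_loadavg()
    threshold = 4.0  # assumed value; a real job would make this configurable
    if should_defer(one, threshold):
        print(f"load {one:.2f} is above {threshold}; deferring work")
    else:
        print(f"load {one:.2f} is fine; running now")
```

A batch job could call `should_defer()` in a loop with a sleep between checks, which is roughly the behaviour the quoted article text describes. Keeping the parsing separate from the file access makes the policy easy to test without touching /proc.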