Removing support for Emacs unexec from Glibc
The Emacs editor requires a lot of Lisp code and program state before it can start doing its job. That led Emacs developers to add the "unexec" feature to quickly load all of that at startup, but unexec has always been something of a hack. It employs a fairly ugly (and intrusive) mechanism to do its job. Some non-standard extensions to the GNU C library (Glibc) are required, so a plan to eventually eliminate those extensions was met with some dismay in the Emacs community.
As part of the Emacs build process, a simpler version of the editor, called "temacs", is built. That program consists of all of the C files in Emacs, which comprise the Emacs Lisp interpreter and not much else. It is then run to load the standard Lisp startup files and to dump a copy of the running program. That dump is then used as the binary that users invoke when they want to run Emacs.
The mechanism lives in the Emacs unexec() function, which converts the running program into a new executable. To do that, it needs to handle memory that was allocated by malloc(), which in turn requires preserving Glibc's internal tracking and housekeeping data structures for memory allocation. The dumping (and restoring) mechanism uses malloc_get_state() and malloc_set_state() to do that; those are the extensions that Glibc developers would like to eliminate.
In mid-January, Florian Weimer posted a heads-up message about the change to the emacs-devel mailing list. He said that it was likely coming this year and was being done to allow changes to the "heap layout between glibc releases, for standards conformance, performance and security improvements". The intent is that existing Emacs binaries will still continue to work, but that at some point the Emacs build would have to change, he continued. He also noted that supporting existing binaries "causes a significant ongoing maintenance overhead for glibc upstream, basically maintaining a separate malloc implementation indefinitely".
Emacs maintainer John Wiegley voiced his concerns that the alterations needed to Emacs would be "a rather significant change to low-level code that has been functioning for a very long time". He suggested extending the timeline for the change and discussing ways to provide the same or similar functionality. Weimer, though, believes that it is "time to tackle it at the root" by fixing Emacs, which is seen as the only user of the interfaces.
A change "forced" on Emacs by another GNU project was always likely to get the attention of Richard Stallman. He emailed the private—and largely unused—mailing list for the Glibc steering committee (glibc-sc), which Mark Brown reposted to the normal Glibc mailing list (libc-alpha). The steering committee does not really exist in the form of that list anymore, which led Carlos O'Donell to suggest that the mailing list be closed. In the reposted message, Stallman largely echoed Wiegley:
Please have a discussion with the Emacs developers and decide together which course of action is best for the GNU system, and what time scale for that action fits the release schedules best.
Stallman also mentioned that he had contacted the Glibc maintainers in the emacs-devel thread. That led Weimer to wonder why it was appropriate to move the discussion to a private list: "This doesn't really match how glibc development proceeds today." Stallman, though, thought that it would be better discussed in private: "This a sensitive issue; it is best to discuss it without an audience."
That particular cat was out of the bag at that point, however. The conversation in emacs-devel proceeded by looking at the dump/load functionality, whether it is still needed, and ways to implement it that don't require intimate knowledge of the internals of Glibc's memory-allocation techniques.
Ali Bahrami, who works on the Solaris linker, wondered if it even made sense to continue to support the unexec functionality. Computers have gotten a lot faster since that optimization was added, so it might make sense to simply leave it behind.
So Bahrami and others set out to measure the difference between starting Emacs and starting temacs (which must load all of the different startup files). It became clear that there is still a substantial difference; a half second versus more than five seconds, though that depends on various factors. Different tricks were tried, with some success, but the main problem remained.
David Caldwell suggested one possible approach: taking the compiled Lisp files (.elc files) that temacs loads and adding them into the binary.
Stallman seemed interested in that idea, "but the real test is in implementing it".
According to Paul Eggert, making unexec more portable has been on the to-do list for a while, "and this will light more of a fire under it". Concerns that Emacs might not build using a new Glibc API (which has not even been written yet) that came up earlier in the thread are not a problem, he said. "Emacs should still build and run even if the glibc API is changed, as Emacs ./configure probes for the glibc malloc-related API and falls back on its own malloc implementation otherwise."
In another message, Eggert outlined how Emacs uses unexec and how it might be able to get along without it: "Emacs could live without the current unexec in a semi-portable way by doing what XEmacs does, which is to write out data and mmap it in later". He does not know the details of the XEmacs approach, though, and suggested other possibilities as well.
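The general "write out data and mmap it in later" idea can be illustrated with a small sketch. This is not the XEmacs or Emacs implementation (both are written in C, and the article gives no details of either); it is a Python illustration under the assumption that the precomputed state can be flattened into a pointer-free layout. The file format and the names dump_table(), load_table(), and expensive_startup() are hypothetical.

```python
import mmap
import os
import struct
import tempfile

# Hypothetical on-disk layout: an item count, then fixed-width signed 64-bit integers.
HEADER = struct.Struct("<Q")
ITEM = struct.Struct("<q")


def expensive_startup():
    """Stand-in for the slow work of loading the Lisp startup files."""
    return [n * n for n in range(1000)]


def dump_table(path, values):
    """Write the precomputed values to `path` in a flat layout with no pointers."""
    with open(path, "wb") as f:
        f.write(HEADER.pack(len(values)))
        for v in values:
            f.write(ITEM.pack(v))


def load_table(path):
    """Map the dump read-only and decode it without redoing the expensive work."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            (count,) = HEADER.unpack_from(m, 0)
            return [
                ITEM.unpack_from(m, HEADER.size + i * ITEM.size)[0]
                for i in range(count)
            ]


def roundtrip():
    """Dump once (build time), then load from the mapping (every startup)."""
    fd, path = tempfile.mkstemp(suffix=".dump")
    os.close(fd)
    try:
        dump_table(path, expensive_startup())
        return load_table(path)
    finally:
        os.remove(path)
```

Because the dump holds only plain values and no addresses, nothing in it depends on how the allocator lays out the heap. That independence is the property the Glibc developers are asking for.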
The discussion continued, looking at various ways for Emacs to accomplish its goals without requiring Glibc to maintain the same heap layout forever. If there was a need to consider the matter privately, it certainly wasn't apparent in the thread. As with many changes of this sort, developers on both sides of the change simply worked things out. Perhaps there will be glitches down the road, but there was plenty of notice and it seems like a direction that the Emacs developers wanted to go in anyway, so it is hard to see any major potholes in the road ahead.
Posted Jan 28, 2016 10:48 UTC (Thu)
by jezuch (subscriber, #52988)
[Link]
AFAIK this is also what the Sun/Oracle's Java VM (HotSpot) is doing: it pre-computes the entire standard library into an internal form and just mmaps it. I'm not sure how long it's been doing that, but it's quite a while.
Posted Jan 28, 2016 11:59 UTC (Thu)
by sorokin (guest, #88478)
[Link] (5 responses)
Couldn't a program just dump all its memory (including memory allocator data structures and code of all shared objects) on disk?
Posted Jan 28, 2016 20:23 UTC (Thu)
by kamil (guest, #3802)
[Link] (4 responses)
I think glibc also performs some runtime checks at init time (kernel version, CPU hardware features) and uses those to select between different versions of some functions, so such a dump would not be as portable as a proper dynamically linked binary.
Posted Jan 29, 2016 9:56 UTC (Fri)
by sorokin (guest, #88478)
[Link] (3 responses)
Yes, exactly. That was how I would implement it. I would stress that all shared objects and stacks should be loaded to exactly the same place they were before unexec(), because objects in heap could point to functions in shared objects and to other objects in a stack.
> That would, however, be akin to static linking, which is a practice frowned upon these days. Glibc doesn't even do proper static linking anymore as far as I know, still requiring some shared objects (NSS) to be loaded dynamically.
I don't quite understand how one can implement unexec() and still be able to relink with a new version after exec(). A new version of a library may have a different struct layout. As I see it, this is the main complaint of the glibc developers, but the same applies to any other library. Developers are not allowed to change the layout of structs that are supposed to survive unexec().
> I think glibc also performs some runtime checks at init time (kernel version, CPU hardware features) and uses those to select between different versions of some functions, so such a dump would not be as portable as a proper dynamically linked binary.
Good point. In the case of Emacs, one could check whether the CPU is the same as the one that was present before unexec() and fall back to normal startup if it is not. One could also verify a checksum of the shared objects if one doesn't want to keep their code in an unexec'ed image.
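The check-then-fall-back idea can be sketched roughly as follows. This is a Python illustration only, not anything Emacs does; the function names are hypothetical, and platform.machine() is a coarse stand-in for a real CPU-feature check, since it reports only the architecture.

```python
import hashlib
import os
import platform
import tempfile


def environment_fingerprint(library_paths):
    """Hash the machine type plus the contents of each library the dump depends on."""
    h = hashlib.sha256()
    h.update(platform.machine().encode())  # assumption: architecture only, not CPU features
    for path in sorted(library_paths):
        h.update(path.encode())
        with open(path, "rb") as f:
            h.update(f.read())
    return h.hexdigest()


def start(saved_fingerprint, library_paths, resume, cold_start):
    """Resume from the dump only if the environment still matches; otherwise start normally."""
    if environment_fingerprint(library_paths) == saved_fingerprint:
        return resume()
    return cold_start()


def demo():
    """Record a fingerprint, start once, 'upgrade' the library, then start again."""
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(b"library v1")
        lib = f.name
    try:
        saved = environment_fingerprint([lib])
        first = start(saved, [lib], lambda: "resumed", lambda: "cold start")
        with open(lib, "ab") as f:
            f.write(b" upgraded")
        second = start(saved, [lib], lambda: "resumed", lambda: "cold start")
        return first, second
    finally:
        os.remove(lib)
```

Here the first start resumes from the dump, while the second notices the changed library and takes the slow path. A stale dump therefore costs only startup time and never affects correctness.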
In my opinion unexec() is quite a general method and can probably be used by other software too. Even the ability to suspend to ELF and resume from ELF for arbitrary processes would probably be useful. How to deal with open file descriptors is an open question, though.
Posted Jan 29, 2016 15:51 UTC (Fri)
by kamil (guest, #3802)
[Link] (1 responses)
I'm guessing Emacs is careful to only preserve the malloc() structures, nothing else. But yeah, it must be a hassle, no doubt about it.
> In my opinion unexec() is quite a general method and can probably be used by other software too. Even the ability to suspend to ELF and resume from ELF for arbitrary processes would probably be useful. How to deal with open file descriptors is an open question, though.
You are describing the functionality of a checkpoint/restart system. I've worked on one in the past as a research project, implemented in user space, in the dynamic loader, that worked exactly that way (generating a new ELF binary on checkpoint). I wanted to put a link here but that was 15 years ago and the pages seem to be gone :-(. Tracking open files was indeed a hassle, and dealing with anything more complicated (pipes, sockets, shared memory, GUIs) was basically out of the question.
Posted Dec 5, 2016 11:18 UTC (Mon)
by bjartur (guest, #67801)
[Link]
Posted Sep 10, 2021 14:42 UTC (Fri)
by mirabilos (subscriber, #84359)
[Link]
Don’t.
Nuke this thing.
Posted Jan 29, 2016 1:13 UTC (Fri)
by JanC_ (guest, #34940)
[Link] (3 responses)
;-)
Posted Jan 29, 2016 4:31 UTC (Fri)
by neilbrown (subscriber, #359)
[Link] (2 responses)
There were only two applications that were useful enough to get away with being bloated: emacs and TeX. Both used the same "unexec" trick, though probably different implementations.
Posted Jan 29, 2016 8:08 UTC (Fri)
by iabervon (subscriber, #722)
[Link] (1 responses)
Posted Jan 29, 2016 8:19 UTC (Fri)
by andresfreund (subscriber, #69562)
[Link]
Posted Jan 29, 2016 2:53 UTC (Fri)
by NightMonkey (subscriber, #23051)
[Link] (3 responses)
Don't get me wrong - I know all significant software likely has a sordid story in its history. :)
Posted Jan 29, 2016 3:38 UTC (Fri)
by pr1268 (guest, #24648)
[Link]
And ever since the vi devs got the kernel folks to accommodate their special request, the emacs team has been jealously crying foul. How unfair that vi gets special support from the kernel and emacs has to settle for just glibc! Just kidding. And being facetious. ;-) Seriously, though, while I know glibc has its origins in the work of Roland McGrath and Ulrich Drepper, I'm convinced that RMS made lots of demands/requests etc. with regards to its development, seeing how emacs was/is his longtime pet project.
Posted Jan 31, 2016 4:43 UTC (Sun)
by giraffedata (guest, #1954)
[Link]
Running Emacs, not building it. Every time someone invokes Emacs, it is fast because of its call to glibc's malloc_set_state().
And I'm sure it wasn't put there just for Emacs; the expectation was that other programs could exploit it the same way. It's generic enough, and the problem (programs taking a long time to start up because they have to load a bunch of stuff) common enough, that that would have been reasonable.
So I think the dog wagged the tail.
Posted Feb 1, 2016 23:17 UTC (Mon)
by flussence (guest, #85566)
[Link]
Wouldn't that be the fabled “TTY layer” I've heard horror stories about? I've heard it eats developers...
Posted Jan 29, 2016 17:32 UTC (Fri)
by jtaylor (subscriber, #91739)
[Link]
Posted Jan 29, 2016 21:27 UTC (Fri)
by excors (subscriber, #95769)
[Link] (5 responses)
I remember excitedly discovering the dump() function in Perl, which was meant to dump the program state (e.g. after all the expensive module loading and initialisation) so you could run an undump program to convert it to an executable and repeatedly resume it later. I thought it'd be great for speeding up all my CGI scripts. (This was some years ago.) Sadly I never found an undump program for Linux. But at least there's an undump for HP-UX from 1994, so at least some people had it working once.
Android's Java VM seems to take a less scary approach: it creates a zygote process which loads the VM and a bunch of standard libraries and some data resources, and then every time you launch a new app it will fork the zygote and load the app's code into the new process. That means the VM initialisation only needs to happen once per boot, and any memory that isn't modified after initialisation will be shared between all the apps, and it doesn't rely on any special OS/library support beyond what's already needed for forking. And it can also preload complicated things like OpenGL drivers, since file descriptors that tie the userspace driver to the kernel driver will get cloned automatically.
Maybe that's only a sensible tradeoff since Android can be certain it's going to be launching lots of Java apps - most people probably don't launch Emacs frequently enough to justify preloading it at boot time and keeping it in memory. But perhaps it could be an option for those who really want it.
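The zygote pattern described here can be sketched in a few lines. This is a Python illustration for Unix-like systems, not Android's actual implementation; preload() and run_in_fork() are hypothetical names, and results are passed back over a pipe purely for demonstration.

```python
import os
import pickle


def preload():
    """Stand-in for expensive one-time initialisation (VM, libraries, resources)."""
    return {"squares": [n * n for n in range(100_000)]}


def run_in_fork(state, task):
    """Fork the already-initialised process and run `task(state)` in the child."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: `state` is inherited copy-on-write, so nothing is reloaded.
        status = 1
        try:
            os.close(read_fd)
            with os.fdopen(write_fd, "wb") as out:
                pickle.dump(task(state), out)
            status = 0
        finally:
            # Exit immediately without running the parent's cleanup handlers.
            os._exit(status)
    # Parent: read the child's result, then reap the child.
    os.close(write_fd)
    with os.fdopen(read_fd, "rb") as src:
        result = pickle.load(src)
    os.waitpid(pid, 0)
    return result
```

Calling `state = preload()` once and then `run_in_fork(state, some_task)` repeatedly pays the initialisation cost a single time. Each child starts with the preloaded data already in memory, and pages it never modifies stay shared with the parent.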
Posted Jan 29, 2016 22:00 UTC (Fri)
by andresfreund (subscriber, #69562)
[Link] (3 responses)
On the other hand, it essentially makes things like ASLR useless...
Posted Jan 30, 2016 17:00 UTC (Sat)
by scottt (guest, #5028)
[Link] (1 responses)
Posted Feb 1, 2016 20:31 UTC (Mon)
by smcv (subscriber, #53363)
[Link]
Posted Jan 30, 2016 18:16 UTC (Sat)
by aggelos (subscriber, #41752)
[Link]
Posted Feb 4, 2016 10:47 UTC (Thu)
by njs (subscriber, #40338)
[Link]
this is actually a very-well supported configuration. I recommend it highly. (Well, not preloading at boot time, but loading at first use and then keeping that process around to make future loads fast.) All you have to do is
alias emacs="emacsclient --alternate-editor='' -c"
and you're good to go.
Posted Jan 29, 2016 23:13 UTC (Fri)
by meyert (subscriber, #32097)
[Link] (3 responses)
Posted Feb 4, 2016 23:28 UTC (Thu)
by nix (subscriber, #2304)
[Link] (2 responses)
(It is possible to use a separate allocator just for buffers, either direct mmap() or a 'relocating allocator' -- that one's notably useful under DOS. Yes, Emacs still supports DOS, specifically in conjunction with DJGPP.)
So yes, Emacs contains not just its own memory allocator but two or depending on how you define it even three memory allocators, and of course at one point had two garbage collector implementations as well (now, it only has one: GCPRO is finally dead.)
Posted Feb 5, 2016 10:12 UTC (Fri)
by meyert (subscriber, #32097)
[Link]
Posted Feb 5, 2016 23:28 UTC (Fri)
by nix (subscriber, #2304)
[Link]
Posted Jan 30, 2016 12:46 UTC (Sat)
by ncm (guest, #165)
[Link] (1 responses)
Posted Jan 31, 2016 3:09 UTC (Sun)
by tabefactus (guest, #105520)
[Link]
"Emacs should still build and run even if the glibc API is changed, as Emacs ./configure probes for the glibc malloc-related API and falls back on its own malloc implementation otherwise."
So emacs should be fine without any changes, as long as their API detection and malloc fallback work correctly.
Posted Jan 31, 2016 5:04 UTC (Sun)
by kevinm (guest, #69913)
[Link] (1 responses)
Posted Feb 5, 2016 23:23 UTC (Fri)
by nix (subscriber, #2304)
[Link]
Posted Jan 31, 2016 11:00 UTC (Sun)
by ksandstr (guest, #60862)
[Link]
It appears that Emacs' function should, at its simplest, require just two pieces of information: a cookie of some kind (root pointer?) that the allocator can use to recover its state, and a list of mmap() calls that would leave the allocator able to remove mappings and/or shrink the heap as though it hadn't been restarted at all.
Is there a significant reason why this couldn't be given a stable interface? I could see this being applicable to other language runtimes as well.
[0] besides those that don't mind being subject to a lock-step update cycle whenever a gnu farts within 8000 miles
Posted Feb 3, 2016 6:37 UTC (Wed)
by massimiliano (subscriber, #3048)
[Link] (6 responses)
The VM can load such a snapshot at startup, and start from there. Essentially it is like if an initial preamble of Javascript code had just been executed.
What Chrome does is to prepare a snapshot at build time, containing the pre-jitted Javascript standard library.
Maybe the elisp VM could implement a similar feature?
Posted Feb 4, 2016 0:09 UTC (Thu)
by mathstuf (subscriber, #69389)
[Link] (5 responses)
Posted Feb 4, 2016 23:31 UTC (Thu)
by nix (subscriber, #2304)
[Link] (4 responses)
Posted Feb 17, 2016 17:34 UTC (Wed)
by nye (subscriber, #51576)
[Link] (3 responses)
What the actual fuck?
Posted Mar 1, 2016 19:45 UTC (Tue)
by nix (subscriber, #2304)
[Link] (2 responses)
Both are also running several hundred packages off ELPA via John Wiegley's use-package, including the awesome but... distinctly large Icicles.
So, yeah... I got a server with 24GiB RAM back in 2009 to run various crucial virtual machines and to run Emacs. It turns out that quite a lot of that memory goes to Emacs. :)
(And it's still fast! GC time is imperceptible.)
Posted Sep 11, 2021 17:05 UTC (Sat)
by sdalley (subscriber, #18550)
[Link] (1 responses)
Posted Sep 15, 2021 12:33 UTC (Wed)
by nix (subscriber, #2304)
[Link]
(actually, because a lot of my work happens in containers and VMs and the like, some of the more useful features like language server support are things I just don't use because the environment in which stuff is compiled is not the same as the environment in which my Emacs is running. So I spend a lot of time in terminal emulators too. I should fix that!)
Posted Feb 16, 2016 20:34 UTC (Tue)
by fw (subscriber, #26023)
[Link] (1 responses)
I mentioned this in the emacs-devel thread, but it was pretty out of control at that point.
Posted Feb 17, 2016 17:32 UTC (Wed)
by nye (subscriber, #51576)
[Link]
That's... disgusting. I remember doing something similarly nasty in, like, my second C program ever, and being pretty ashamed about it.
Posted Sep 10, 2021 14:41 UTC (Fri)
by mirabilos (subscriber, #84359)
[Link]
The only checkpointing implementation I knew of is CRIU. Today I found a tiny draft of a reimplementation, CryoPID. When I first heard about CRIU I was surprised that the concept was feasible enough to be seriously pursued. Did you write an article about your 15yo project? It would make an interesting read, if you can find it on your computer or on the Internet Archive.
It's been sour grapes ever since...
"The kernel has special code to allow vi to clear the screen without leaving framebuffer glyph artifacts."
it seems shocking that Glibc has code that exists primarily to support building Emacs.
https://sourceware.org/bugzilla/show_bug.cgi?id=6527
In the face of pervasive information leak vulnerabilities, ASLR doesn't seem to help much anyway, so...
2.) "on its own malloc implementation otherwise." - WAT? WHY?
So yes, Emacs contains not just its own memory allocator but two or depending on how you define it even three memory allocators
Sorry, four: I forgot src/sheap.c. (Though that's just a brk().)
Extremely weird
Extremely weird
What about making it generic?
I bet also nodejs does the same for its own set of libraries, and recently the interface for doing so has been cleaned up and made easily usable for every embedder.
Probably the trick is that in V8 the GC can relocate objects, so pointer values can be rewritten at will. Maybe this is something that elisp cannot do?