Even that wouldn't help much, at least on 32-bit systems. You've got a
choice there between reading everything into memory as (X)Emacs does, or
mmap()ing with MAP_PRIVATE and having a well-under-4GB maximum file size
limit *and* a requirement to configure huge amounts of swap space, or a
loooong wait as the file gets copied to new storage at startup (vim of big
files on slow NFS storage is *painful*, much worse than Emacs), or even an
editor that essentially implements its own VM management and a sliding
file access window, copying changed bits to a new file, which would
necessarily crap out if the file was changed out from underneath it. (I've
never seen anyone implement the latter, and no surprise: it would be
really difficult and have no real advantages).
I'd like to find an ideal middle road, but I'm not sure there is one.
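
For illustration only, here is a minimal Python sketch of the private-mapping
option. It assumes Python's mmap module, whose ACCESS_COPY mode gives a
copy-on-write mapping (MAP_PRIVATE on Unix). The function name patch_private
and its context parameter are hypothetical, not taken from any editor.

```python
import mmap


def patch_private(path, offset, new_bytes, context=8):
    """Patch a copy-on-write mapping of `path` and return the bytes around the patch.

    Hypothetical helper: the file on disk is never modified, because
    ACCESS_COPY keeps every change private to this process.
    """
    with open(path, "rb") as f:
        # Length 0 maps the whole file, so it must fit in the address space.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_COPY) as m:
            end = offset + len(new_bytes)
            m[offset:end] = new_bytes  # touches only the private pages
            return m[max(offset - context, 0):end + context]
```

Only the pages that are actually modified get private copies, but the whole
file still has to be mapped, which is where the 32-bit size limit comes from.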
Posted Feb 25, 2008 9:02 UTC (Mon) by eru (subscriber, #2753)
> or even an
> editor that essentially implements its own VM management and a sliding
> file access window, copying changed bits to a new file, which would
> necessarily crap out if the file was changed out from underneath it. (I've
> never seen anyone implement the latter, and no surprise: it would be
> really difficult and have no real advantages).
Remember the days of 8-bit and earlier 16-bit computers? Text editors at
the time normally used techniques like this to be able to edit larger
files. Otherwise editing would be limited to something like 30 K (remember
that the OS, the ROM, the editor and the data would all have to fit into
a 64k address space - if you were lucky to even have that much memory!).
I guess writing an editor that does not keep all its data in memory now
qualifies as forgotten tech.
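
As a rough sketch of that kind of technique (not a reconstruction of any
particular old editor), the Python below keeps a single fixed-size window of
the file in memory and writes the result to a new file by streaming the
unchanged parts. The class WindowedEditor, its methods, and the CHUNK size are
all hypothetical names chosen for the example.

```python
import shutil

CHUNK = 64 * 1024  # bytes copied at a time while streaming unchanged data


class WindowedEditor:
    """Hypothetical single-window editor: at most `size` file bytes are loaded."""

    def __init__(self, path, size=4096):
        self.path = path
        self.size = size
        self.start = 0          # file offset where the window begins
        self.original_len = 0   # how many file bytes the window replaced
        self.buf = bytearray()

    def slide_to(self, offset):
        """Load the window starting at `offset`, discarding unsaved edits."""
        with open(self.path, "rb") as f:
            f.seek(offset)
            self.buf = bytearray(f.read(self.size))
        self.start = offset
        self.original_len = len(self.buf)

    def replace(self, lo, hi, data):
        """Replace window-relative bytes [lo, hi) with `data`; length may change."""
        self.buf[lo:hi] = data

    def save_as(self, dest):
        """Write head + edited window + tail to `dest`, streaming in chunks."""
        with open(self.path, "rb") as src, open(dest, "wb") as out:
            remaining = self.start
            while remaining > 0:
                chunk = src.read(min(CHUNK, remaining))
                if not chunk:
                    break
                out.write(chunk)
                remaining -= len(chunk)
            out.write(self.buf)
            src.seek(self.start + self.original_len)
            shutil.copyfileobj(src, out, CHUNK)
```

Note that save_as trusts the stored offsets: if another process rewrites the
source file between slide_to and save_as, the output is silently wrong.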
Why jumping around in text is hard
Posted Feb 25, 2008 21:39 UTC (Mon) by nix (subscriber, #2304)
Waste-of-time tech, more like. It adds significant complexity, several new
failure modes (other processes changing the file, not something many older
systems really had to consider), and doesn't even provide anything you
can't get by privately mmap()ing it and relying on the OS to do
everything.
`If you want to edit truly vast files in a text editor, get a 64-bit box'
seems a fairly reasonable requirement: I mean, how often do you want to
edit multigigabyte text files anyway?
The older systems had to implement something like their fake-VM
sliding-window things because editing a book is an entirely reasonable
thing to want to do. But how many people want to edit the entire British
Library, as one unstructured lump of text? It's hardly a common need.
Why jumping around in text is hard
Posted Feb 26, 2008 6:04 UTC (Tue) by eru (subscriber, #2753)
I don't really disagree with you in general. But I too have occasionally encountered Emacs (or other text editor) limitations when wanting to browse a really big log file or collected debugging output from some misbehaving program. Actually a pretty common need in software development. Probably this can be seen as misusing the tool, as in those cases I usually have no real desire to change the file. On the other hand it is natural, since Emacs is the way I normally interact with text files. Perhaps for those cases there should be a quick read-only viewing mode that makes no attempt to read anything but the region being viewed, and this mode would be used automatically for files larger than some threshold.
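
A minimal sketch of what such a mode could look like, in Python. Everything
here is an assumption made for illustration: the names choose_mode and
read_region, the 256 MB threshold, and the decision to show only whole lines.
It is not how Emacs or any other editor actually behaves.

```python
import os

VIEW_ONLY_THRESHOLD = 256 * 1024 * 1024  # hypothetical cutoff, in bytes


def choose_mode(path, threshold=VIEW_ONLY_THRESHOLD):
    """Pick the read-only viewer for files larger than `threshold`."""
    return "view" if os.path.getsize(path) > threshold else "edit"


def read_region(path, offset, max_lines=40, max_bytes=64 * 1024):
    """Return up to `max_lines` whole lines starting at or after byte `offset`.

    Only about `max_bytes` are read, no matter how large the file is.
    """
    with open(path, "rb") as f:
        # Start one byte early so we can tell whether `offset` is a line start.
        f.seek(max(offset - 1, 0))
        raw = f.read(max_bytes + 1)
    hit_limit = len(raw) == max_bytes + 1
    if offset > 0:
        # Skip to the first line boundary at or after `offset`.
        nl = raw.find(b"\n")
        raw = raw[nl + 1:] if nl != -1 else b""
    pieces = raw.split(b"\n")
    # Drop the trailing piece if it is empty or may have been cut off mid-line.
    if hit_limit or pieces[-1] == b"":
        pieces.pop()
    return [p.decode("utf-8", errors="replace") for p in pieces[:max_lines]]
```

Jumping to an arbitrary byte offset is cheap this way; jumping to a given line
number is not, since that still means scanning for newlines from the start.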
Why jumping around in text is hard
Posted Feb 26, 2008 15:25 UTC (Tue) by anand21 (guest, #38076)
If you do need to browse a large log file, less is your friend.
Editing is a hard problem. Modern editors do a lot of formatting, and storing that information
takes a lot of space. Concurrent access could be solved by locking, but that wouldn't help,
because the editor must still load the whole file, then parse it and store temporary
information, in order to provide a consistent view.
Why jumping around in text is hard
Posted Feb 28, 2008 20:03 UTC (Thu) by nix (subscriber, #2304)
Locking text files is *really unfriendly*, too. Shades of Windows.
(Note that less also sucks the whole file into RAM, so if it's huge less
won't do the trick either.)
(Emacs doesn't do a lot of *formatting*, as such, but both the Emacsen
*can* attach properties/extents/whatever to regions of buffers, and both
also have to handle multibyte support to the extent of having single
buffers with many encodings in. All of these things push up the memory
consumption somewhat, especially the latter.)