Why jumping around in text is hard

Posted Feb 25, 2008 7:54 UTC (Mon) by nix (subscriber, #2304)
In reply to: Why jumping around in text is hard by zlynx
Parent article: Emacs news: new maintainer, version 22 pretest

Even that wouldn't help much, at least on 32-bit systems. You've got a 
choice there between reading everything into memory as (X)Emacs does, or 
mmap()ing with MAP_PRIVATE and having a well-under-4GB maximum file size 
limit *and* a requirement to configure huge amounts of swap space, or a 
loooong wait as the file gets copied to new storage at startup (vim of big 
files on slow NFS storage is *painful*, much worse than Emacs), or even an 
editor that essentially implements its own VM management and a sliding 
file access window, copying changed bits to a new file, which would 
necessarily crap out if the file was changed out from underneath it. (I've 
never seen anyone implement the latter, and no surprise: it would be 
really difficult and have no real advantages).
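The "changed out from underneath it" failure can at least be detected, if not repaired. A minimal sketch, assuming a stat()-based check is good enough; the helper names `snapshot` and `changed_since` are made up for illustration and are not taken from any editor:

```python
import os


def snapshot(path):
    """Record the identity, size and mtime of the file as it is right now."""
    st = os.stat(path)
    return (st.st_ino, st.st_size, st.st_mtime_ns)


def changed_since(path, snap):
    """True if the file no longer matches an earlier snapshot()."""
    try:
        return snapshot(path) != snap
    except FileNotFoundError:
        return True  # deleted or renamed away counts as changed
```

An editor working through a file window would take a snapshot when it opens the file and check it again before each window refill or save. An in-place rewrite that keeps the size and lands within the filesystem's timestamp granularity would slip past this check, so it is a heuristic, not a guarantee.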

I'd like to find an ideal middle road, but I'm not sure there is one.


Why jumping around in text is hard

Posted Feb 25, 2008 9:02 UTC (Mon) by eru (subscriber, #2753)

or even an editor that essentially implements its own VM management and a sliding file access window, copying changed bits to a new file, which would necessarily crap out if the file was changed out from underneath it. (I've never seen anyone implement the latter, and no surprise: it would be really difficult and have no real advantages).

Remember the days of 8-bit and early 16-bit computers? Text editors at the time normally used techniques like this to be able to edit larger files. Otherwise editing would be limited to something like 30K (remember that the OS, the ROM, the editor and the data all had to fit into a 64K address space, if you were lucky enough to have that much memory!).

I guess writing an editor that does not keep all its data in memory now qualifies as forgotten tech.
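For readers who never saw the technique, here is a rough sketch of the "copy changed bits to a new file" idea: the original is streamed to a new file in small chunks while a list of pending edits is spliced in. It is not a reconstruction of any particular old editor. The function names and the edit-list format (offset, bytes to delete, replacement bytes, sorted and non-overlapping) are assumptions made for this example.

```python
def apply_edits(src_path, dst_path, edits, chunk_size=64 * 1024):
    """Copy src to dst, applying edits, holding at most one chunk in memory.

    `edits` is a list of (offset, delete_len, replacement_bytes) tuples,
    sorted by offset and non-overlapping (an assumption of this sketch).
    """
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        pos = 0  # current read position in the original file
        for offset, delete_len, replacement in edits:
            _copy_range(src, dst, offset - pos, chunk_size)  # untouched text
            dst.write(replacement)                           # inserted text
            src.seek(offset + delete_len)                    # skip deleted text
            pos = offset + delete_len
        _copy_range(src, dst, None, chunk_size)              # copy the tail


def _copy_range(src, dst, count, chunk_size):
    """Copy `count` bytes (or everything left if None) in bounded chunks."""
    while count is None or count > 0:
        want = chunk_size if count is None else min(chunk_size, count)
        chunk = src.read(want)
        if not chunk:
            break
        dst.write(chunk)
        if count is not None:
            count -= len(chunk)
```

Memory use is bounded by `chunk_size` plus the edit list, whatever the file size. The cost is the one nix points out: the original must stay unchanged while the copy is in progress.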

Why jumping around in text is hard

Posted Feb 25, 2008 21:39 UTC (Mon) by nix (subscriber, #2304)

Waste-of-time tech, more like. It adds significant complexity, several new 
failure modes (other processes changing the file, not something many older 
systems really had to consider), and doesn't even provide anything you 
can't get by privately mmap()ing it and relying on the OS to do 
everything.
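As an illustration of the private-mapping approach, a minimal sketch in Python, whose `mmap.ACCESS_COPY` mode gives a copy-on-write mapping (MAP_PRIVATE on Unix). The function name `edit_privately` is hypothetical, and returning the whole view as bytes is only sensible for a small demonstration file:

```python
import mmap


def edit_privately(path, offset, new_bytes):
    """Overwrite bytes in a private mapping; return the edited view as bytes."""
    with open(path, "rb") as f:
        # ACCESS_COPY asks for a copy-on-write mapping: writes land in
        # memory private to this process and never reach the file.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_COPY) as view:
            # Slice assignment must keep the length the same; a mapping
            # cannot grow or shrink this way.
            view[offset:offset + len(new_bytes)] = new_bytes
            return bytes(view)
```

The kernel pages the file in on demand and keeps private copies only of the pages that were written to, which is the "relying on the OS to do everything" part. Insertions and deletions that change the length still need a separate data structure on top, and the whole mapping has to fit in the address space.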

`If you want to edit truly vast files in a text editor, get a 64-bit box' 
seems a fairly reasonable requirement: I mean, how often do you want to 
edit multigigabyte text files anyway?

The older systems had to implement something like their fake-VM 
sliding-window things because editing a book is an entirely reasonable 
thing to want to do. But how many people want to edit the entire British 
Library, as one unstructured lump of text? It's hardly a common need.

Why jumping around in text is hard

Posted Feb 26, 2008 6:04 UTC (Tue) by eru (subscriber, #2753)

I don't really disagree with you in general. But I too have occasionally encountered Emacs (or other text editor) limitations when wanting to browse a really big log file or collected debugging output from some misbehaving program. Actually a pretty common need in software development. Probably this can be seen as misusing the tool, as in those cases I usually have no real desire to change the file. On the other hand it is natural, since Emacs is the way I normally interact with text files. Perhaps for those cases there should be a quick read-only viewing mode that makes no attempt to read anything but the region being viewed, and this mode would be used automatically for files larger than some threshold.
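A read-only mode of that kind could be sketched roughly as follows. This is an assumption-laden illustration, not how Emacs or any other editor works: the names `read_window`, `window_lines` and `choose_mode` and the 256 MB cutoff are all invented for the example.

```python
import os

VIEW_ONLY_THRESHOLD = 256 * 1024 * 1024  # hypothetical cutoff, in bytes


def choose_mode(path, threshold=VIEW_ONLY_THRESHOLD):
    """Pick the read-only viewer for files larger than the threshold."""
    return "view" if os.path.getsize(path) > threshold else "edit"


def read_window(path, offset, size):
    """Return up to `size` bytes starting at `offset`, reading nothing else."""
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(size)


def window_lines(path, offset, size):
    """Decode a window for display, dropping the pieces at its edges."""
    data = read_window(path, offset, size)
    lines = data.split(b"\n")
    if offset > 0:
        lines = lines[1:]    # first piece may start mid-line, so drop it
    if offset + len(data) < os.path.getsize(path):
        lines = lines[:-1]   # last piece may end mid-line, so drop it
    elif lines and lines[-1] == b"":
        lines = lines[:-1]   # window reached a final newline at end of file
    return [line.decode("utf-8", errors="replace") for line in lines]
```

Scrolling becomes a matter of calling `window_lines` with a new offset, so memory use depends on the window size and not on the file size. What this gives up is anything that needs the whole file, such as absolute line numbers or a search that does not scan.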

Why jumping around in text is hard

Posted Feb 26, 2008 15:25 UTC (Tue) by anand21 (guest, #38076)

If you do need to browse a large log file, less is your friend.

Editing is a hard problem. Modern editors do a lot of formatting, and storing that information
takes a lot of space. Concurrent access could be solved by locking, but that wouldn't help,
because the editor must still load the whole file, then parse it and store temporary
information to provide a consistent view.

Why jumping around in text is hard

Posted Feb 28, 2008 20:03 UTC (Thu) by nix (subscriber, #2304)

Locking text files is *really unfriendly*, too. Shades of Windows.

(Note that less also sucks the whole file into RAM, so if it's huge less 
won't do the trick either.)

(Emacs doesn't do a lot of *formatting*, as such, but both the Emacsen 
*can* attach properties/extents/whatever to regions of buffers, and both 
also have to handle multibyte support to the extent of allowing a single 
buffer to contain many encodings. All of these things push up the memory 
consumption somewhat, especially the latter.)

Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds