Definition of software bloat - see "Emacs" (veering off-topic)

Posted Feb 25, 2008 4:43 UTC (Mon) by pr1268 (subscriber, #24648)
In reply to: Definition of software bloat - see "Emacs" (veering off-topic) by atai
Parent article: Emacs news: new maintainer, version 22 pretest

Vi (actually, Vim) opened the file, albeit slowly. The weird part was searching down about 96% of the entire file to find the part he was trying to edit. Needless to say, the hard disk was quite active for some time.

I'm unsure what kind of success other editors would have. Kwrite in KDE is my preferred text editor (assuming I've got X and KDE running), but even it hesitates noticeably when opening a text file larger than about 5 megabytes.


Posted Feb 25, 2008 6:05 UTC (Mon) by joey (subscriber, #328)

I sometimes fear I'll never find
an editor that uses rewind(3).

Is that the system call I need?
Perhaps it should lseek(2) and read(2).

Why jumping around in text is hard

Posted Feb 25, 2008 6:30 UTC (Mon) by zlynx (subscriber, #2285)

It would be great if more editors used lseek or mmap to read the files.

But, it is difficult to do this.

If you wish to jump to line 9997, where is it?  How many characters are in a line?  The editor
cannot know unless it scans for each newline from position 0.

Text encodings like UTF-8 complicate this even more because now characters can be variable
sizes as well.
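
To make the scanning cost concrete, here is a minimal Python sketch, not taken from any editor mentioned in this thread. It maps a file with mmap, scans once for newline bytes, and records where each line starts; `build_line_index` and `read_line` are hypothetical names. It assumes UTF-8 or ASCII text, where the byte 0x0A never occurs inside a multi-byte character, so finding lines by byte is safe even though finding the Nth character within a line still requires decoding.

```python
import mmap

def build_line_index(path):
    """Return the byte offset at which each line of the file starts."""
    offsets = [0]
    with open(path, "rb") as f:
        # Length 0 maps the whole file. mmap raises ValueError on an
        # empty file, which this sketch does not handle.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            pos = m.find(b"\n")
            while pos != -1:
                offsets.append(pos + 1)
                pos = m.find(b"\n", pos + 1)
    # If the file ends with a newline, the last entry equals the file
    # size and stands for an empty line after the final newline.
    return offsets

def read_line(path, offsets, lineno):
    """Fetch line `lineno` (1-based) by seeking straight to its start."""
    with open(path, "rb") as f:
        f.seek(offsets[lineno - 1])
        return f.readline()
```

The first jump still pays for a scan from position 0, which is the point made above; only later jumps become a single seek, and any edit that adds or removes a newline makes the stored offsets stale.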

Why jumping around in text is hard

Posted Feb 25, 2008 7:54 UTC (Mon) by nix (subscriber, #2304)

Even that wouldn't help much, at least on 32-bit systems. You've got a 
choice there between reading everything into memory as (X)Emacs does, or 
mmap()ing with MAP_PRIVATE and having a well-under-4GB maximum file size 
limit *and* a requirement to configure huge amounts of swap space, or a 
loooong wait as the file gets copied to new storage at startup (vim of big 
files on slow NFS storage is *painful*, much worse than Emacs), or even an 
editor that essentially implements its own VM management and a sliding 
file access window, copying changed bits to a new file, which would 
necessarily crap out if the file was changed out from underneath it. (I've 
never seen anyone implement the latter, and no surprise: it would be 
really difficult and have no real advantages).

I'd like to find an ideal middle road, but I'm not sure there is one.
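
For readers who have not used a private mapping, the following is a rough Python sketch of the idea, not code from Emacs or vim. Python's `mmap.ACCESS_COPY` requests a copy-on-write mapping (MAP_PRIVATE on Unix), so writes land in private pages and never reach the file. The function name `edit_privately` is hypothetical, and the edit is restricted to same-length replacements because a mapping cannot grow in place.

```python
import mmap

def edit_privately(path, old, new):
    """Replace the first `old` with same-length `new` in a private mapping.

    Returns the edited contents; the file on disk is left unmodified.
    """
    if len(old) != len(new):
        raise ValueError("a mapping cannot grow or shrink in place")
    with open(path, "rb") as f:
        # ACCESS_COPY: copy-on-write, changes stay in this process.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_COPY) as m:
            pos = m.find(old)
            if pos != -1:
                m[pos:pos + len(new)] = new  # dirties only private pages
            # Copying the whole mapping is fine for a demo, not for a
            # multi-gigabyte file.
            return m[:]
```

Every page that gets modified needs backing store of its own, and the mapping as a whole has to fit in the process's address space, which is where the swap and sub-4GB limits on 32-bit systems come from.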

Why jumping around in text is hard

Posted Feb 25, 2008 9:02 UTC (Mon) by eru (subscriber, #2753)

"or even an editor that essentially implements its own VM management and a sliding file access window, copying changed bits to a new file, which would necessarily crap out if the file was changed out from underneath it. (I've never seen anyone implement the latter, and no surprise: it would be really difficult and have no real advantages)."

Remember the days of 8-bit and earlier 16-bit computers? Text editors at the time normally used techniques like this to be able to edit larger files. Otherwise editing would be limited to something like 30 K (remember that the OS, the ROM, the editor and the data would all have to fit into a 64k address space - if you were lucky to even have that much memory!).

I guess writing an editor that does not keep all its data in memory now qualifies as forgotten tech.

Why jumping around in text is hard

Posted Feb 25, 2008 21:39 UTC (Mon) by nix (subscriber, #2304)

Waste-of-time tech, more like. It adds significant complexity, several new 
failure modes (other processes changing the file, not something many older 
systems really had to consider), and doesn't even provide anything you 
can't get by privately mmap()ing it and relying on the OS to do 
everything.

`If you want to edit truly vast files in a text editor, get a 64-bit box' 
seems a fairly reasonable requirement: I mean, how often do you want to 
edit multigigabyte text files anyway?

The older systems had to implement something like their fake-VM 
sliding-window things because editing a book is an entirely reasonable 
thing to want to do. But how many people want to edit the entire British 
Library, as one unstructured lump of text? It's hardly a common need.

Why jumping around in text is hard

Posted Feb 26, 2008 6:04 UTC (Tue) by eru (subscriber, #2753)

I don't really disagree with you in general. But I too have occasionally encountered Emacs (or other text editor) limitations when wanting to browse a really big log file or collected debugging output from some misbehaving program. Actually a pretty common need in software development. Probably this can be seen as misusing the tool, as in those cases I usually have no real desire to change the file. On the other hand it is natural, since Emacs is the way I normally interact with text files. Perhaps for those cases there should be a quick read-only viewing mode that makes no attempt to read anything but the region being viewed, and this mode would be used automatically for files larger than some threshold.
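
A read-only viewing mode of that kind could be sketched roughly as follows in Python, using lseek and read on a raw file descriptor. This is only an illustration under simple assumptions (newline-terminated lines, byte-oriented windows); `read_window` and `window_lines` are hypothetical names, not functions of Emacs or any other editor.

```python
import os

def read_window(path, offset, size=4096):
    """Return up to `size` bytes starting at byte `offset`, reading nothing else."""
    fd = os.open(path, os.O_RDONLY)
    try:
        os.lseek(fd, offset, os.SEEK_SET)
        return os.read(fd, size)
    finally:
        os.close(fd)

def window_lines(path, offset, size=4096):
    """Return the complete lines that lie wholly inside the window, as bytes."""
    # Start one byte early so we can tell whether `offset` is a line start:
    # if the byte before it is a newline, nothing of the window is dropped.
    start = max(offset - 1, 0)
    chunk = read_window(path, start, size + (offset - start))
    if offset > 0:
        first_nl = chunk.find(b"\n")
        chunk = chunk[first_nl + 1:] if first_nl != -1 else b""
    # Drop a trailing partial line as well.
    last_nl = chunk.rfind(b"\n")
    return chunk[:last_nl + 1].splitlines()
```

The cost of showing a screenful is then independent of the file size, but the viewer only knows byte offsets: it cannot say which line number it is showing without scanning everything before the window.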

Why jumping around in text is hard

Posted Feb 26, 2008 15:25 UTC (Tue) by anand21 (guest, #38076)

If you do need to browse a large log file, less is your friend.

Editing is a hard problem. Modern editors do a lot of formatting, and storing that information
takes a lot of space. Concurrent access could be solved by locking, but that wouldn't help
because the editor must still load the whole file, and then parse it and store temporary
information, to provide a consistent view.

Why jumping around in text is hard

Posted Feb 28, 2008 20:03 UTC (Thu) by nix (subscriber, #2304)

Locking text files is *really unfriendly*, too. Shades of Windows.

(Note that less also sucks the whole file into RAM, so if it's huge less 
won't do the trick either.)

(Emacs doesn't do a lot of *formatting*, as such, but both the Emacsen 
*can* attach properties/extents/whatever to regions of buffers, and both 
also have to handle multibyte support to the extent of having single 
buffers with many encodings in. All of these things push up the memory 
consumption somewhat, especially the latter.)

Why jumping around in text is hard

Posted Feb 25, 2008 8:04 UTC (Mon) by nix (subscriber, #2304)

XEmacs at least has a cache mapping (line number -> newline location), 
invalidated on significant buffer changes. Whether it's worthwhile in this 
day and age I'm not sure: it may be faster to scan for nearby newlines 
(probably both in the L2 cache if referenced recently) than to maintain 
the line-start cache; and distant lines won't be in that cache anyway.
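
As an illustration of the general idea (and explicitly not XEmacs's actual implementation, which I have not reproduced here), a line-start cache with invalidation might look like the Python sketch below. The class `LineCache` and its methods are hypothetical; the cache is filled lazily and, on an insert, only the entries after the edit point are discarded.

```python
import bisect

class LineCache:
    """Lazy cache of line-start offsets for an in-memory buffer."""

    def __init__(self, buf):
        self.buf = bytearray(buf)
        self.starts = [0]      # starts[i] is the offset where line i+1 begins
        self.complete = False  # True once the whole buffer has been scanned

    def _extend_to(self, lineno):
        """Scan forward from the last known line start until `lineno` is cached."""
        while len(self.starts) < lineno and not self.complete:
            nl = self.buf.find(b"\n", self.starts[-1])
            if nl == -1:
                self.complete = True
            else:
                self.starts.append(nl + 1)

    def offset_of(self, lineno):
        """Byte offset of 1-based line `lineno`, or None if past the end."""
        self._extend_to(lineno)
        return self.starts[lineno - 1] if lineno <= len(self.starts) else None

    def insert(self, pos, data):
        """Insert bytes at `pos` and drop cache entries after the edit point."""
        self.buf[pos:pos] = data
        # Line starts at or before pos are unaffected; later ones are stale.
        keep = bisect.bisect_right(self.starts, pos)
        del self.starts[keep:]
        self.complete = False
```

Whether this bookkeeping beats simply rescanning for newlines is exactly the open question raised above; the sketch only shows what has to be maintained.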

Why jumping around in text is hard

Posted Feb 25, 2008 10:20 UTC (Mon) by rsidd (subscriber, #2582)

qemacs (unfortunately unmaintained) uses mmap(). I tried reading a 450MB file in memory, and it alternately acts very responsively and locks up solid for up to a minute at a time. It also fails to find search strings at the end of the document if I am at the beginning.

emacs takes a few seconds to open the file, but when I was at the top, it found a search string near the bottom quite readily. It does not seem to lock up for more than a second or two. This is an opteron workstation with 2GB RAM, so I guess it's not exercising the swap.

Why jumping around in text is hard

Posted Feb 25, 2008 10:37 UTC (Mon) by tzafrir (subscriber, #11501)

qemacs. Indeed tested under harsher conditions. And unlike GNU Emacs and XEmacs, it even has
support for bidirectional text rendering.

'nvi -f' also does the trick.

Or, if you only want to view it: less

P.S. Although I'm not a regular Emacs user, this is still a good place to thank the previous
maintainer for the hard work he put into all previous releases and to welcome the new maintainers.

Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds