Posted Mar 6, 2007 22:52 UTC (Tue) by nix
In reply to: bad taste
Parent article: Quote of the week
Plus, the length byte was never long enough, and the overhead of keeping
the length up to date dominated surprisingly often, even when you
dead-reckoned the lengths. At least with null-terminated strings you don't
have to work out the length unless you actually need it.
String ADTs make more sense :) Internally they can do whatever they like,
possibly varying the representation depending on the length.
I did this and more with an adaptive string ADT I wrote a few years ago
for an application my then employer wanted. If you kept asking for the
length of a string it started tracking the length itself; if you kept
inserting and deleting from it, and it was long enough, it switched the
representation to a buffer-gap; if you kept on asking for subsets of the
string and it was long enough to blow the dcache (I randomly picked 64KB),
it turned itself into a position-keyed binary tree, and if the string was
long enough and rarely-read enough it started zipping the longer and
more-rarely-referenced hunks up with zlib.
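
To make the representation-switching idea concrete, here is a minimal sketch in Python of just one of those transitions: a string that starts out flat and turns itself into a buffer-gap once it has been edited often enough. This is not the original code, which was never published. The class names `GapBuffer` and `AdaptiveString` and the thresholds `EDIT_THRESHOLD` and `MIN_GAP_LENGTH` are all made up for illustration, and the thresholds are tiny so the switch is easy to see.

```python
class GapBuffer:
    """Text held in a list with a movable gap, so edits at the gap are cheap."""

    def __init__(self, text="", gap=16):
        self.buf = list(text) + [None] * gap
        self.gap_start = len(text)
        self.gap_end = len(self.buf)

    def __len__(self):
        return len(self.buf) - (self.gap_end - self.gap_start)

    def _move_gap(self, pos):
        # Slide characters across the gap until the gap begins at pos.
        while self.gap_start > pos:
            self.gap_start -= 1
            self.gap_end -= 1
            self.buf[self.gap_end] = self.buf[self.gap_start]
        while self.gap_start < pos:
            self.buf[self.gap_start] = self.buf[self.gap_end]
            self.gap_start += 1
            self.gap_end += 1

    def _grow(self, need):
        # Widen the gap in place; at least doubles the buffer.
        extra = max(need, len(self.buf))
        self.buf[self.gap_end:self.gap_end] = [None] * extra
        self.gap_end += extra

    def insert(self, pos, text):
        if not 0 <= pos <= len(self):
            raise IndexError("insert position out of range")
        self._move_gap(pos)
        if self.gap_end - self.gap_start < len(text):
            self._grow(len(text))
        for ch in text:
            self.buf[self.gap_start] = ch
            self.gap_start += 1

    def delete(self, pos, count):
        if not 0 <= pos <= len(self):
            raise IndexError("delete position out of range")
        self._move_gap(pos)
        # Deleting just swallows characters into the gap.
        self.gap_end = min(self.gap_end + count, len(self.buf))

    def __str__(self):
        return "".join(self.buf[:self.gap_start] + self.buf[self.gap_end:])


class AdaptiveString:
    """Starts as a plain str; switches to a GapBuffer after repeated edits."""

    EDIT_THRESHOLD = 3   # hypothetical: edits seen before switching
    MIN_GAP_LENGTH = 8   # hypothetical: shorter strings stay flat

    def __init__(self, text=""):
        self._rep = text
        self._edits = 0

    @property
    def representation(self):
        return type(self._rep).__name__

    def _note_edit(self):
        self._edits += 1
        if (isinstance(self._rep, str)
                and self._edits >= self.EDIT_THRESHOLD
                and len(self._rep) >= self.MIN_GAP_LENGTH):
            self._rep = GapBuffer(self._rep)

    def insert(self, pos, text):
        self._note_edit()
        if isinstance(self._rep, str):
            self._rep = self._rep[:pos] + text + self._rep[pos:]
        else:
            self._rep.insert(pos, text)

    def delete(self, pos, count):
        self._note_edit()
        if isinstance(self._rep, str):
            self._rep = self._rep[:pos] + self._rep[pos + count:]
        else:
            self._rep.delete(pos, count)

    def __len__(self):
        return len(self._rep)

    def __str__(self):
        return str(self._rep)
```

Callers only ever see `insert`, `delete`, `len()` and `str()`, so the switch is invisible to them, which is the whole point of hiding it behind an ADT. The other transitions described above (length tracking, the position-keyed tree, zlib-compressed hunks) would hang off the same kind of usage counters.
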
There would doubtless be more tricks I could have used, but that was all I
needed to get performance up for that application. It wasn't dealing with
strings longer than 200MB, after all. ;)
One of these days I should rewrite it (I'd call it `clean it up' but it's
too dirty for that, it needs a rewrite, not least because I want to hold
the copyright this time) and release it, only there's probably no point as
someone else has doubtless written something much better.
(Judy trees, which I didn't discover till much later, of course knock the
socks off this, but their API uses such awful names for basically all its
functions that I've not yet been able to bring myself to use them. I
wonder if they'd accept a patch adding names that it's actually possible
to remember? The Great Lowercase Letter and Vowel Shortage ended *years*
ago.)