Far too kind

Posted Jan 8, 2010 20:51 UTC (Fri) by baldridgeec (guest, #55283)
I don't understand this at ALL.

The email client should not locally index anything from an IMAP server. IMAP has a perfectly good search feature built into the protocol. Evolution does the same thing (I'm not even going to discuss Outlook's problems). Is there any email client that gets this right?

Posted Jan 8, 2010 21:39 UTC (Fri) by foom (subscriber, #14868) [Link]

"IMAP has a perfectly good search feature"...yeah right.

The protocol has a reasonable search command, but have you actually tried using full-text search on any IMAP servers? I don't know of any for which it takes a reasonable amount of time to execute. Opening every single message in turn and looking for the text is not a useful implementation of full-text search.

Furthermore, IMAP is utterly useless at multiple-mailbox handling. Most every command, including search, operates on a single mailbox at a time. Of course, the user often wants to do a search across all mailboxes (and be notified of new mail in all mailboxes, but that's another discussion...)

Once the servers people use ACTUALLY have a perfectly good search feature, then, maybe, clients will start using it instead of their own local index.

Posted Jan 8, 2010 21:53 UTC (Fri) by baldridgeec (guest, #55283) [Link]

Speed is an issue, true - Courier-IMAP should do indexing and store the results in a database. Using this sort of method I can't see why server-end search wouldn't be almost as fast as local search.
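Roughly what I have in mind, as a Python sketch with sqlite3 standing in for whatever database the server would really use. The table layout and the names `build_index` and `search_index` are made up for illustration; this is not how Courier-IMAP or any other real server stores anything.

```python
import re
import sqlite3


def build_index(messages):
    """Build a word -> message-id index in an in-memory SQLite database.

    `messages` maps a message id (int) to its body text. The schema is an
    assumption made for this sketch, not any real server's format.
    """
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE postings (word TEXT, msg_id INTEGER)")
    db.execute("CREATE INDEX postings_word ON postings (word)")
    for msg_id, body in messages.items():
        # Store each distinct lowercase word once per message.
        words = set(re.findall(r"[a-z0-9]+", body.lower()))
        db.executemany(
            "INSERT INTO postings (word, msg_id) VALUES (?, ?)",
            [(word, msg_id) for word in words],
        )
    db.commit()
    return db


def search_index(db, word):
    """Return the sorted ids of messages containing `word`.

    This is an index lookup, so no message body is opened at search time.
    """
    rows = db.execute(
        "SELECT msg_id FROM postings WHERE word = ? ORDER BY msg_id",
        (word.lower(),),
    )
    return [row[0] for row in rows]
```

The work of reading every message happens once, at indexing time; each search afterwards only touches the postings table.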

But IMAP not handling multiple mailboxes seems like sort of a misplaced complaint. IMAP is the protocol - the client should be handling connections to multiple servers in a seamless fashion if that's what is desired.

Posted Jan 9, 2010 0:45 UTC (Sat) by foom (subscriber, #14868) [Link]

> IMAP not handling multiple mailboxes seems like sort of a misplaced complaint.

The client can certainly work around it, yes. But it's a pain in the ass design. For example, if you want to wait for new mail to appear in multiple mailboxes, you have to make a bunch of connections to the same server, one per mailbox, just so that each one can sit idle, watching for new mail in the one single mailbox. And some clients try to do this. Others give up push notification, and poll for new mail periodically.
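For the polling fallback, something along these lines would do, sketched in Python. It assumes `conn` behaves like an already authenticated `imaplib.IMAP4` object and that the mailbox names need no quoting; `poll_unseen` is a name invented for this example.

```python
import re


def poll_unseen(conn, mailboxes):
    """Ask the server for the UNSEEN count of each mailbox in turn.

    Sends one STATUS command per mailbox over a single connection.
    Mailboxes the server refuses to report on are left out of the result.
    """
    counts = {}
    for mailbox in mailboxes:
        typ, data = conn.status(mailbox, "(UNSEEN)")
        if typ != "OK" or not data or data[0] is None:
            continue
        # A typical response line looks like: b'INBOX (UNSEEN 2)'
        match = re.search(rb"UNSEEN (\d+)", data[0])
        if match:
            counts[mailbox] = int(match.group(1))
    return counts
```

A client would call this on a timer and compare the counts with the previous round, which is exactly the latency-for-simplicity trade that polling implies.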

Back to the problem with SEARCH: To implement multi-mailbox search, you need to invoke the SELECT command and the SEARCH command once per mailbox. (remember: this is all on a single server!) With my 44 mailboxes, that'd require 44 SELECT/SEARCH commands. You could parallelize it by using multiple connections, but that's still a bunch of extra work. I find it unlikely that you'll be able to make that competitive in speed with a single search on your mail client's full-text index.
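To make the per-mailbox loop concrete, here is a minimal Python sketch. It assumes `conn` behaves like an authenticated `imaplib.IMAP4` object, and that neither the mailbox names nor the search text need any escaping; `search_all_mailboxes` is just a name picked for the example.

```python
def search_all_mailboxes(conn, mailboxes, text):
    """Run SELECT followed by SEARCH TEXT once per mailbox.

    Everything goes over one connection, so the mailboxes are visited
    strictly one after another. Returns {mailbox: [message numbers]}.
    """
    results = {}
    for mailbox in mailboxes:
        # Open read-only so that searching does not change any flags.
        typ, _ = conn.select(mailbox, readonly=True)
        if typ != "OK":
            continue  # skip mailboxes that cannot be opened
        typ, data = conn.search(None, "TEXT", '"%s"' % text)
        if typ == "OK" and data and data[0]:
            results[mailbox] = [int(num) for num in data[0].split()]
        else:
            results[mailbox] = []
    return results
```

Every mailbox costs two round trips here, and the message numbers that come back only mean something together with the mailbox name they were found in.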

Posted Jan 9, 2010 1:18 UTC (Sat) by baldridgeec (guest, #55283) [Link]

My fault - you were actually using the term "mailbox" correctly! I work with Windows guys who never refer to them as anything but "folders" and it rubs off. I was assuming multiple servers/accounts in my previous reply.

Ok, fair enough. Maybe there should be an MSELECT extension that specifies on which mailboxes the following commands should be run...

Ugh, then you have a modal response set, as responses run after an MSELECT will need to include a mailbox name as well as whatever ordinary response they give.

Maybe better to define a MULTISEARCH extension that returns "mailbox/message#"... results. Should we talk to the LEMONADE guys? :)

Posted Jan 9, 2010 1:46 UTC (Sat) by baldridgeec (guest, #55283) [Link]

Actually, a little googling suggests they're considering the implementation of multi-mailbox search already:

Posted Jan 16, 2010 11:24 UTC (Sat) by dlang (subscriber, #313) [Link]

they are also working on implementing fuzzy search to allow ranked results for search-engine-like use.

Posted Jan 8, 2010 21:54 UTC (Fri) by dskoll (subscriber, #1630) [Link]

TB3's indexer was a vicious, unpleasant surprise. IMO, it should be off by default, not on by default. Or at the very least, it should be off if you're upgrading from TB2. Leaving it on for new TB3 installations is vicious and unpleasant, but at least it's not a surprising change in behaviour.

Posted Jan 9, 2010 0:21 UTC (Sat) by quotemstr (subscriber, #45331) [Link]

cyrus-imapd has a good indexing implementation called 'squat'. It's far better than a linear search.

Posted Jan 16, 2010 11:22 UTC (Sat) by dlang (subscriber, #313) [Link]

with a cyrus server searching is extremely efficient (and with a decent server it can be significantly faster than on a slow laptop drive)

I was just recently testing this comparing two mail clients, one that used the server-side search and one that pulled everything to the client and searched there.

This was on a large complex account (80+ folders containing >60K messages)

searching on the client across all folders, 15 min
searching on the server, 4 seconds

there are a lot of BAD IMAP implementations out there, courier-imap is one of them. it talks the IMAP protocol to the client, but is very inefficient on the server.

Posted Jan 16, 2010 21:42 UTC (Sat) by foom (subscriber, #14868) [Link]

>cyrus server searching is extremely efficient

That's good to hear. I also noticed Squat for Dovecot. I've actually been thinking of switching to Dovecot (from Courier) for a while now. I guess if I ever do I'll try that plugin.

So, I guess it is possible to run a mail server with indexed full-text search in it using readily available software. I do wonder what percentage of actual users have their mail on a server so configured, though. Squat's page makes it sound rather resource intensive, and it is only in an optional plugin.

>searching on the client across all folders, 15 min
>searching on the server, 4 seconds

That just shows that whatever client you're using is rather crap, and also failed to make an index.

