Recoll: A search engine for the Linux desktop (Linux.com)
Linux.com looks at
Recoll. "Desktop search engines are all the rage these days. While
Beagle may be the most popular desktop search engine for Linux, there are
alternatives. If you are looking for a lightweight and easy-to-use yet
powerful desktop search engine, you might want to try Recoll. Unlike
Beagle, Recoll doesn't require Mono, it's fast, and it's highly
configurable. Recoll is based on Xapian, a mature open source search engine
library that supports advanced features such as phrase and proximity
search, relevance feedback, document categorization, boolean queries, and
wildcard search."
Recoll: A search engine for the Linux desktop (Linux.com) Posted Apr 24, 2007 2:27 UTC (Tue) by drag (subscriber, #31333) [Link] Wow. I had no idea.
Looks like we will have nearly as many desktop search tools as we do window managers.
I found this: the freedesktop project that tries to unify them. XesamAbout, previously known as WasabiAbout.
It is working with:
Is there a technical reason why we have so many different search engines? Is it just people trying different things, or did they just decide to do their own in comparative isolation?
Just curious. I used beagle for a while, but now I enjoy the lightweight and fuse-friendliness of Tracker.
Recoll: A search engine for the Linux desktop (Linux.com) Posted Apr 24, 2007 6:22 UTC (Tue) by Zero_Dogg (subscriber, #31310) [Link] I agree, we need a common search backend (possibly with some simple plugin architecture so that additional filetypes and such can be added), and then we can just have different interfaces. One official per desktop environment (GNOME Desktop Search, KDE Desktop Search, XFce Desktop Search...) that just uses the same backend. That would probably improve the quality of the search tools available and reduce the amount of double work.
Recoll: A search engine for the Linux desktop (Linux.com) Posted Apr 24, 2007 8:46 UTC (Tue) by khim (subscriber, #9252) [Link] One backend makes sense when there is a well-defined problem. Search is definitely not a well-defined problem (why do we have Google, Yahoo, etc.?). Sure, if you have a limited number of files to be searched, you don't need any search engine at all: grep is enough. If you have a lot of files, you have a tough (and not well-defined) problem.
Recoll: A search engine for the Linux desktop (Linux.com) Posted Apr 24, 2007 11:29 UTC (Tue) by jond (subscriber, #37669) [Link] For me, having a choice of backends is important. I've tried and tried to use and like beagle, but it just keeps busting my machine. The last time I got OOM because of a rogue beagle indexing process was the final straw.
Recoll: A search engine for the Linux desktop (Linux.com) Posted Apr 24, 2007 18:34 UTC (Tue) by superstoned (subscriber, #33164) [Link] The strigi author started talks about a common interface between search systems, so apps can just use dbus to talk to whatever is running.
But I don't get what you want. You want 1 search engine, and all apps
Recoll: A search engine for the Linux desktop (Linux.com) Posted Apr 24, 2007 9:24 UTC (Tue) by oever (guest, #987) [Link] Nepomuk-KDE is not a search tool, but a metadata framework for the next KDE version. It uses Strigi for extracting metadata, indexing and searching. Strigi can use different indexes. One can simply write an index backend and use that. Nepomuk-KDE has implemented an indexing backend for Strigi that uses an RDF store. Strigi itself has virtually no dependencies (libz, libbz2, libxml2) in contrast to the other search engines. The speed of data extraction for Strigi is unrivalled. It can extract data from deeply nested files, e.g. from a text file in a zip file attached to an email. This is because it is the only indexer that uses stream-based file analysis. This speed is available to the other search engines too. All they need to do is use the 'xmlindexer' executable that comes with Strigi or link to the library 'libstreamanalyzer'. KDE4 uses libstreamanalyzer to provide the desktop application with metadata. This ensures consistency between the data in the search index and the data shown in applications.
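To make the "text file in a zip file attached to an email" case concrete, here is a minimal Python sketch of recursing through nested containers without unpacking anything to disk. It is not Strigi's code and does not use libstreamanalyzer; the function name `walk_nested`, the ".eml" naming convention, and the path scheme are assumptions made for illustration. Unlike a true streaming analyzer, it buffers each layer in memory to stay short.

```python
import email
import io
import zipfile
from email import policy


def walk_nested(name, data, out):
    """Append (virtual_path, text) to `out` for every text leaf inside `data`.

    Hypothetical helper: containers are opened from in-memory bytes and
    their members are analyzed recursively, so no temporary files are written.
    """
    # Layer 1: zip archives. Recurse into each member.
    if zipfile.is_zipfile(io.BytesIO(data)):
        with zipfile.ZipFile(io.BytesIO(data)) as archive:
            for info in archive.infolist():
                if info.is_dir():
                    continue
                with archive.open(info) as member:
                    walk_nested(f"{name}/{info.filename}", member.read(), out)
        return

    # Layer 2: email messages. Assumption: we recognise them by file name only.
    if name.endswith(".eml"):
        message = email.message_from_bytes(data, policy=policy.default)
        body = message.get_body(preferencelist=("plain",))
        if body is not None:
            out.append((f"{name}/body", body.get_content()))
        for part in message.iter_attachments():
            payload = part.get_payload(decode=True) or b""
            filename = part.get_filename() or "attachment"
            walk_nested(f"{name}/{filename}", payload, out)
        return

    # Leaf: treat everything else as text.
    out.append((name, data.decode("utf-8", errors="replace")))
```

Calling `walk_nested("mail.eml", raw_bytes, results)` on a message with a zip attachment fills `results` with one entry per nested text file, each labelled with a path such as `mail.eml/docs.zip/notes.txt`.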
Recoll: A search engine for the Linux desktop (Linux.com) Posted Apr 24, 2007 11:18 UTC (Tue) by drag (subscriber, #31333) [Link] I wouldn't go around saying you're the fastest this or the fastest that unless you are actually able to back it up with something.
Tracker for me is very fast. It has an unnoticeable impact, even on laptops, when it's started for the first time. After running for several days the thing is still using only 7MB rss.
Also it's FUSE friendly, which I like since I serve all my media files over sshfs. How does Strigi work on FUSE?
I dunno. I'm willing to try anything, and it seems like the tracker development folks really aren't that active, unfortunately.
I know that it's going to be a while before you get down to one or two engines that people will like overall.
Does anybody have any experiences with anything else?
Recoll: A search engine for the Linux desktop (Linux.com) Posted Apr 24, 2007 11:42 UTC (Tue) by oever (guest, #987) [Link] Here's a comparison. Note that it is somewhat outdated.
Recoll: A search engine for the Linux desktop (Linux.com) Posted Apr 24, 2007 17:20 UTC (Tue) by eklitzke (guest, #36426) [Link] That benchmark was released about a week before the latest tracker release, which was supposed to massively improve the speed of tracker. According to the tracker website, the indexing is _much_ faster now, and they claim to be able to index 100 files per second on ext3 (according to them, basically the maximum possible speed taking I/O time into account). This is more than twice as fast as Strigi in the benchmark you posted, although it goes without saying that the tests would need to be run on the same machine to really be comparable.
Recoll: A search engine for the Linux desktop (Linux.com) Posted Apr 24, 2007 19:03 UTC (Tue) by superstoned (subscriber, #33164) [Link] Still, does tracker have the deep indexing feature?
Anyway, numbers would be good. Maybe I can try to provide some, or LWN.net
Recoll: A search engine for the Linux desktop (Linux.com) Posted Apr 24, 2007 11:24 UTC (Tue) by akumria (subscriber, #7773) [Link]
I can feel a grumpy editor's guide to desktop searching coming on.
Recoll: A search engine for the Linux desktop (Linux.com) Posted Apr 24, 2007 15:07 UTC (Tue) by job (subscriber, #670) [Link] That would be interesting. And please include the proper unix tools Glimpse and Swish. I've used Glimpse now and then during the last ten years to index my home, and these new insanely bloated graphical desktop utilities add nothing, as far as I can tell. I'm not interested in metadata, it's all plaintext anyway.
Recoll: the search engine for the Linux desktop! :) Posted Apr 25, 2007 14:32 UTC (Wed) by gvy (guest, #11981) [Link] Glimpse has nice features like fuzzy search (see also agrep) but it's not free software; Swish-E has had its own problems (and rather stacks up against Xapian Omega or mnoGoSearch indexers which are primarily for web).
Didn't hear about Tracker before; I do maintain Xapian and Recoll packages for ALT Linux.
BTW search.gmane.org is powered by xapian.
Recoll: A search engine for the Linux desktop (Linux.com) Posted Apr 24, 2007 20:39 UTC (Tue) by zorgan (subscriber, #4016) [Link] "I can feel a grumpy editor's guide to desktop searching coming on."
Yeah, I could really see our editor getting grumpy when he has beagle running on his laptop every time he starts a session...
Recoll: A search engine for the Linux desktop (Linux.com) Posted Apr 24, 2007 12:38 UTC (Tue) by mcz (guest, #44861) [Link] I tried beagle, Kat, strigi, and others. The only one that I find really good is Recoll. Very light (in half an hour I got the index for 12 gbit of files), very simple to use and very powerful.
For me it's surely the best one I found.
mcz on Sidux 64bit
Recoll: A search engine for the Linux desktop (Linux.com) Posted Apr 25, 2007 14:34 UTC (Wed) by gvy (guest, #11981) [Link] Wow, some 6 to 12 months ago Recoll would have trouble indexing my ~3 gigs of mail archive... (we've discussed that briefly with the author; I should also try on a more RAM-rich system)
Recoll: A search engine for the Linux desktop (Linux.com) Posted Apr 24, 2007 14:01 UTC (Tue) by tzafrir (subscriber, #11501) [Link] The name "desktop search" seems to suggest a single-user system.
The old "locate" ignored file permissions. The "s" in "slocate" is because its indexing also indexes file permissions, and the slocate tool will only tell you about files you're allowed to see.
So what about search tools? Can I run just one indexer on a multi-user system?
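The slocate idea described above (store permissions in the index, filter at query time) can be sketched in a few lines of Python. This is an illustration of the idea only, not slocate's actual implementation: the `Entry` type and the `build_index`, `readable_by` and `search` names are hypothetical, and the check looks only at the file's own read bits, ignoring the permissions of parent directories that a real tool would also have to honour.

```python
import os
import stat
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """One index record: a path plus the ownership and mode seen at index time."""
    path: str
    mode: int
    uid: int
    gid: int


def build_index(root):
    """Walk `root` once (e.g. as a privileged indexer) and record permissions."""
    index = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            info = os.lstat(path)
            index.append(Entry(path, info.st_mode, info.st_uid, info.st_gid))
    return index


def readable_by(entry, uid, gids):
    """Simplified Unix read check: owner bits, then group bits, then other bits."""
    if uid == 0:
        return True
    if entry.uid == uid:
        return bool(entry.mode & stat.S_IRUSR)
    if entry.gid in gids:
        return bool(entry.mode & stat.S_IRGRP)
    return bool(entry.mode & stat.S_IROTH)


def search(index, needle, uid, gids):
    """Return matching paths, hiding entries the querying user may not read."""
    return [
        entry.path
        for entry in index
        if needle in os.path.basename(entry.path) and readable_by(entry, uid, gids)
    ]
```

With this shape, a single shared index can serve several users: every query passes the caller's uid and group ids, and results for files the caller cannot read are dropped before they are returned.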
Recoll: A search engine for the Linux desktop (Linux.com) Posted Apr 24, 2007 17:24 UTC (Tue) by eklitzke (guest, #36426) [Link] The default behavior of tracker is to run as a login process and just index your home directory. You can add more directories for it to index (and also blacklist directories), but it isn't designed to index the entire filesystem.
Recoll: A search engine for the Linux desktop (Linux.com) Posted Apr 25, 2007 16:01 UTC (Wed) by jengelh (subscriber, #33263) [Link] "If you need to search, you are too lazy to tidy up."
what else is the computer for? Posted Apr 26, 2007 14:27 UTC (Thu) by xoddam (subscriber, #2322) [Link] Of course I'm too lazy to tidy up, what do I employ a computer for if not to take care of such housekeeping?
What I need now is a metadata indexing service for my SO's wardrobe.
Copyright © 2007, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.