Strigi 0.5.8
Name: Strigi
Version: 0.5.8
Type: KDE Improvement
Depend:
License: LGPL
Homepage: http://www.vandenoever.info/software/strigi/
More Info: http://www.kde-apps.org/content/show.php?content=40889

Description:
Strigi Desktop Search

Here are the main features of Strigi:
- very fast crawling
- very small memory footprint
- no hammering of the system
- pluggable backend, currently clucene and hyperestraier; sqlite3 and xapian are in the works
- communication between daemon and search program over an abstract interface with two implementations: DBus and a simple unix socket. Especially the DBus interface makes it very easy to write client applications. There are a few sample scripts in the code using Perl, Python, GTK and Qt. Writing clients is so easy that any Gnome or KDE app could implement this.

Additionally, there is a simple interface for implementing plugins for extracting information. We'll try to reuse the kat plugins, although native plugins will have a large speed advantage.

Strigi also calculates the sha1 of every file crawled, which allows for fast finding of duplicate files.

Changelog:

0.5.8
- Improve quitting latency of the most important analyzers. Now Strigi reacts more quickly when you tell it to stop indexing.
- Add a tool to analyze the analyzer latency profile and find analyzers that have a high latency.
- Bring field names in line with the Xesam ontology.
- New analyzers for avi, wav, dds, rgb, sid and ico file types.
- Fix deepgrep (finally working again since 0.5.2) and extend the number of fields deepgrep searches in. Now it also searches in fields that are passed as "unsigned char*" to the IndexWriter, but only if they are not registered as being binary fields.
- Install two headers that provide metadata information about field types. Basically, these classes publish the ontology that Strigi uses.
- Fix a problem with CLucene throwing CLuceneError.
  Because of -fvisibility=hidden, the code did not recognize CLuceneError and let it fall through, thus crashing programs using libstreamanalyzer. A unit test has been added to keep the problem from reappearing.
- Fix for systems where setenv() is not available (for instance Windows). Hopefully those systems have putenv() :)
- Remove support for starting strigidaemon with an arbitrary index type and index dir, but add an option to use a different configuration file. This effectively gives the user the same possibilities.
- Fixes to the build system that allow Strigi to be built and tested as part of a larger project (e.g. kdesupport).
- 'strigicmd listFiles' can now be used to retrieve all files/dirs indexed under a certain path.
- Added support for Gentoo-way compilation flags. Implemented more consistent and pretty optional dependency handling.

0.5.7
- use plugins instead of shared libraries for the indexer backends
- lots of bugfixes and cleanups
- allow backends to be used in RAM by using ':memory:' as the index name

0.5.6
- Added Xesam User Language parser. Now it will be possible to handle Xesam UserLanguage queries (http://wiki.freedesktop.org/wiki/XesamUserSearchLanguage).
- Replaced .ini-based ontology parser with an RDF/XML one.
- Updated strigicmd: now it's possible to perform searches formulated following the Xesam UserLanguage specification.
- Improved ontology introspection API: properties and classes now have child lists and applicable classes/properties lists.
- change IndexReader::getFiles to IndexReader::getChildren.
- removed IndexReader::documentId and IndexReader::mTime.
- loads of build issues fixed
- added a script that helps you to find the patch that broke a unit test
- add fieldname for document content per the Xesam standard.
- lots more

0.5.5
- GUI now uses a .ui file, making future improvements much easier
- install detection script for ease of use in other cmake projects
- modify the signature of endAnalysis to endAnalysis(bool complete) for StreamLineAnalyzer, StreamEventAnalyzer, and StreamSaxAnalyzer
- add a function to AnalyzerConfiguration that tells how many bytes can be read at most from a stream
- add a SAX analyzer plugin that extracts the namespaces used in XML documents. With this it is possible to get all XML documents that contain e.g. Chemical Markup Language or Dublin Core.
- add a stream for changing the encoding of an incoming stream on the fly
- use the new encoding stream to do better email parsing
- add m3u stream analyzer.
- add simple test program for the Strigi Xesam query builder. It loads a file containing the Xesam query, converts the query into a Strigi::Query object, and serializes the Strigi::Query object to xml for e.g. quality control.
- add xesamquery option to strigicmd: now it's possible to make queries using the Xesam language.
- add support for XesamQueryLanguage queries. Now it is possible to translate Xesam queries formulated in XesamQueryLanguage into Strigi::Query objects.
- add a cgi executable that takes multipart/form-data and outputs an analysis of the data as xml
- give xmlindexer the ability to read from stdin
- big improvement in parsing MS Word files
- better input sanity checking. Thanks to zzuf for reporting the errors
- cleanup of private variables in classes by introducing a d-pointer

0.5.4
- simplify PollingListener by letting it reuse code from DirAnalyzer
- improve parsing speed by reading incrementally large blocks and only if no throughanalyzer is ready yet
- extract more data from ogg and ID3 files
- new registerField(fieldname) function that gets additional data from the ontology
- add support for the indexwriter calls addValue(index, field, data, size) and addValue(index, field, double_value) to the CLucene backend.
- enable passing of the "Tokenized" flag parameter to the CLucene backend
- support for Keyword Terms, which are not tokenized during queries
- handling of optional indexing flags, which are loaded from the ontology
- handling of the cardinality constraint when indexing
- add keyword query type, which allows for using keywords that are not split up, e.g. chemistry.molecular_formula#"C 4 H 10". Basically, the "#" sign means: do not tokenize.
- parse the userlanguage wrapped in Xesam query language xml
- add serialization to xml for Strigi::Query and Strigi::Term, useful for debugging purposes
- add types from the Xesam dbus interface to strigitypes.h
- add support for gif files
- add support for analyzing jpeg files.
- add prioritized, multithreaded queue for incoming requests
- add option --lastfiletoskip to diranalyzer and xmlindexer
- add support for the Cc:, Bcc:, Message-ID:, In-Reply-To:, References:, From: and To: headers
- add exclude and include filters to the strigicmd create and update commands
- add deindex option, which can be used for removing dirs or files from an index created by Strigi

0.3.11
- SunOS, BSD, 64 bit and Coverity compatibility fixes
- Search in a set of default fields, and not just in the text content of a file, if no specific field is specified.
- Add histogram widget to simple search client
- Add support for Ogg Vorbis
- Better decoding of email headers
- Expand Query object to handle nested queries
- Fix highlighting and display of title in search results.
- Fix path for the child indexables
- Fix memory problems in archivereader
- Check for too short file names and omit the RPM trailer from the results.
- Add an additional unit test for the RPM stream provider.
- Revert raise() to kill(getpid()) because raise hangs the thread.
- Install qtdbus library for Strigi.

0.3.10
- Convenience classes for using Strigi over Qt 4.2 DBus
- Change buildsystem to allow building of deepfind, deepgrep and xmlindexer separately
- Speedup of deepfind by selectively using only the analyzers deepfind needs
- Many portability fixes (GCC 3, Forte, MSVC)
- New, more efficient plugin loading
- Add IFilter plugin for the Windows version
- Remove the big Strigi lock (faster indexing)
- Switch strigiclient to communicate over DBus instead of over a unix socket
- Reorganization of the indexer with a new IndexerConfiguration
- Improvements of file name filters
- New Qt widget for configuring file name filters
- Add file name setting to the DBus interface
- Move verbose unit tests
- Bugfixes in some streams

0.3.9
- Added deepfind and deepgrep, programs that are enhanced versions of find and grep.
- Added a new way of storing the configuration in an xml file.
- Added a way to search in multiple indexes.
- Added xmlindexer, a program that outputs the file parsing results as xml. This is convenient for debugging and can also be used by other programs that do not want to write their own indexer. It makes the superior Strigi indexer available to other software in a convenient way.
- More versatile filters that determine which files to index. (Flavio Castelli)
- Add possibility to index files from the client by feeding the file into the daemon. This opens the way to indexing email from remote servers and web pages.

_______________________________________________
Kde-announce-apps mailing list
Kde-announce-apps@kde.org
https://mail.kde.org/mailman/listinfo/kde-announce-apps
Copyright © 2008, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds