The implementation is based around detecting suspicious patterns. Looks like a useful feature to have.
built-in Grammar checker
Posted Feb 15, 2012 9:11 UTC (Wed) by tialaramex (subscriber, #21167)
On the downside they're trying anyway, confident that if their rules are just vague enough they're bound to help. I notice that they haven't tried the benchmark Professor Pullum implicitly offers, typing several pages of a major literary work (say, Moby Dick, or Pride and Prejudice) into this software with everything enabled and verifying that it flags none of the excellent prose as incorrect. I think that building a collection of such inputs would have been simultaneously a good practical test of the software and a disheartening lesson on the difficulty of the general problem.
Several of the rules cited in that link seem harmless, but aren't grammar rules at all. Choosing to highlight violations of style such as double spacing may or may not help people, but it has nothing to do with grammar.
Grammar checker unit tests appreciated.
Posted Feb 15, 2012 9:28 UTC (Wed) by mmeeks (subscriber, #56090)
So lightproof is designed to be minimal and give ~zero false positives. But your idea is a good one :-) What would be awesome is if you could get several of these smallish but representative classic texts and create some unit tests in the lightproof module, so that we can ensure not only that there are no false positives now, but that there will be none in future :-) Patches most welcome (this is a volunteer project). The lightproof git repo is here:
git clone git://anongit.freedesktop.org/libreoffice/lightproof
Thanks for checking this out though! :-)
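For illustration only, here is a rough sketch of what such a false-positive regression test could look like. It does not use lightproof's real rule format or API: the `RULES` table, the `check()` function and the `CLASSIC_EXCERPTS` dictionary are all hypothetical stand-ins, and the excerpts are just the opening sentences of the two novels mentioned above.

```python
import re
import unittest

# Hypothetical phrase rules (pattern, suggestion). This is NOT lightproof's
# rule syntax, only a stand-in so the test below has something to exercise.
RULES = [
    (re.compile(r"\bying and yang\b", re.IGNORECASE), "yin and yang"),
    (re.compile(r"\bad homonym\b", re.IGNORECASE), "ad hominem"),
]


def check(text):
    """Return a list of (matched_text, suggestion) pairs, one per rule hit."""
    hits = []
    for pattern, suggestion in RULES:
        for match in pattern.finditer(text):
            hits.append((match.group(0), suggestion))
    return hits


# Short public-domain excerpts; a real corpus would hold several pages each.
CLASSIC_EXCERPTS = {
    "Moby Dick": "Call me Ishmael.",
    "Pride and Prejudice": (
        "It is a truth universally acknowledged, that a single man in "
        "possession of a good fortune, must be in want of a wife."
    ),
}


class NoFalsePositives(unittest.TestCase):
    def test_classics_are_clean(self):
        # Any hit on well-regarded prose counts as a false positive.
        for title, excerpt in CLASSIC_EXCERPTS.items():
            with self.subTest(title=title):
                self.assertEqual(check(excerpt), [])

    def test_known_error_is_still_flagged(self):
        # Guard against "fixing" false positives by disabling the rule.
        self.assertEqual(
            check("the ying and yang of it"),
            [("ying and yang", "yin and yang")],
        )
```

Saved as a test module, this could be run with `python -m unittest`. Each new rule would then have to pass the whole corpus before being merged.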
Posted Feb 15, 2012 12:10 UTC (Wed) by tialaramex (subscriber, #21167)
I will say that the "ying and yang" -> "yin and yang" suggestion, although not itself very useful because "ying" is already flagged as a spelling mistake [at least on this system], does show where "zero false positives" is somewhat practical for units larger than a single English word. Of course, you would need a lot of work to establish which things are "always" errors.
For example "ad homonym" and "all intensive purposes" are almost always going to be errors, but for every few times you find "tow the line" used when "toe the line" was meant, you'll stumble over a case where an actual rope was being towed.
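One way to picture that distinction is a rule table that marks each phrase as either safe to flag unconditionally or ambiguous. This is only a sketch under assumptions: `PHRASE_RULES`, `flag_phrases()` and the `include_ambiguous` switch are invented for illustration and are not part of lightproof.

```python
import re

# Hypothetical rule table: (pattern, suggestion, always_an_error).
PHRASE_RULES = [
    (r"\bad homonym\b", "ad hominem", True),
    (r"\ball intensive purposes\b", "all intents and purposes", True),
    # Legitimate when an actual rope is being towed, so not flagged by default.
    (r"\btow the line\b", "toe the line", False),
]


def flag_phrases(text, include_ambiguous=False):
    """Return sorted (offset, matched_text, suggestion) tuples.

    Ambiguous rules are skipped unless the caller opts in, which keeps the
    default mode close to zero false positives.
    """
    findings = []
    for pattern, suggestion, always in PHRASE_RULES:
        if not always and not include_ambiguous:
            continue
        for m in re.finditer(pattern, text, flags=re.IGNORECASE):
            findings.append((m.start(), m.group(0), suggestion))
    return sorted(findings)
```

With the default settings, a sentence such as "The tug began to tow the line ashore." produces no findings. The "tow the line" suggestion only appears when a caller passes `include_ambiguous=True`.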
Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds