
Grammar checker unit tests appreciated.

Posted Feb 15, 2012 9:28 UTC (Wed) by mmeeks (subscriber, #56090)
In reply to: built-in Grammar checker by tialaramex
Parent article: LibreOffice 3.5 released

> On the downside they're trying anyway, confident that if their rules are
> just vague enough they're bound to help. I notice that they haven't tried
> the benchmark Professor Pullum implicitly offers, typing several pages of
> a major literary work (say, Moby Dick, or Pride and Prejudice) into this
> software with everything enabled and verifying that it flags none of the
> excellent prose as incorrect.

So lightproof is designed to be minimal and give ~zero false positives. But your idea is a good one :-) What would be awesome is if you could gather several of these smallish but representative classic texts and create some unit tests in the lightproof module, so that we can ensure not only that there are no false positives now, but that there will be none in future :-) Patches most welcome (this is a volunteer project). The lightproof git repo is here:

git clone git://anongit.freedesktop.org/libreoffice/lightproof

Thanks for checking this out, though! :-)
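
A minimal sketch of what such a regression test might look like, in Python. Everything named here is an assumption for illustration: `check()` and `RULES` are hypothetical stand-ins, not lightproof's real API or rule format, and the two one-sentence excerpts stand in for the several pages a real test would load from files.

```python
import re
import unittest

# Hypothetical stand-in for the checker: each rule is (regex, suggestion).
# Lightproof's actual rule format and entry point are NOT shown here.
RULES = [
    (re.compile(r"\bying and yang\b", re.IGNORECASE), "yin and yang"),
]


def check(text):
    """Return a (start, end, suggestion) tuple for every rule match in text."""
    hits = []
    for pattern, suggestion in RULES:
        for match in pattern.finditer(text):
            hits.append((match.start(), match.end(), suggestion))
    return hits


# Short public-domain excerpts; a real test would read whole chapters from disk.
CLASSIC_EXCERPTS = {
    "Moby Dick": "Call me Ishmael.",
    "Pride and Prejudice": (
        "It is a truth universally acknowledged, that a single man in "
        "possession of a good fortune, must be in want of a wife."
    ),
}


class NoFalsePositives(unittest.TestCase):
    def test_classics_are_clean(self):
        # Any hit on this prose is treated as a false positive.
        for title, text in CLASSIC_EXCERPTS.items():
            with self.subTest(title=title):
                self.assertEqual(check(text), [])

    def test_known_error_is_still_flagged(self):
        # Guards against "fixing" false positives by disabling every rule.
        hits = check("the ying and yang of it")
        self.assertEqual([s for _, _, s in hits], ["yin and yang"])
```

Saved as a test module, this would run under `python -m unittest`. Adding a rule that trips on any excerpt then fails the build, which is the "none in future" part.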


Grammar checker unit tests appreciated.

Posted Feb 15, 2012 12:10 UTC (Wed) by tialaramex (subscriber, #21167)

I don't see any test scaffolding in that repository. My Python is mediocre, so I'm definitely not going to attempt to build one from scratch.

I will say that the "ying and yang" -> "yin and yang" suggestion does show where "zero false positives" is somewhat practical for units larger than a single English word, even though the suggestion itself isn't very useful, because "ying" is already flagged as a spelling mistake [at least on this system]. Of course, it would take a lot of work to establish which things are "always" errors.

For example, "ad homonym" and "all intensive purposes" are almost always going to be errors, but for every few times you find "tow the line" used when "toe the line" was meant, you'll stumble over a case where an actual rope was being towed.
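
That split can be sketched as two tables: phrases flagged unconditionally, and phrases deliberately left alone because a literal reading exists. This is only an illustration in Python. The names `ALWAYS_WRONG`, `AMBIGUOUS` and `flag()` are made up here, and the suggested corrections for the first two phrases are my assumption about what a rule would offer.

```python
import re

# Phrases that are almost always errors: safe to flag unconditionally.
ALWAYS_WRONG = {
    "ad homonym": "ad hominem",
    "all intensive purposes": "all intents and purposes",
    "ying and yang": "yin and yang",
}

# Phrases with a legitimate literal reading. Recorded so the decision is
# explicit, but never flagged: a rope really can be towed.
AMBIGUOUS = {
    "tow the line": "toe the line",
}


def flag(text):
    """Return (offset, matched_text, suggestion) tuples, ordered by offset.

    Only ALWAYS_WRONG is consulted; AMBIGUOUS phrases pass through silently.
    """
    hits = []
    for phrase, suggestion in ALWAYS_WRONG.items():
        pattern = r"\b" + re.escape(phrase) + r"\b"
        for match in re.finditer(pattern, text, re.IGNORECASE):
            hits.append((match.start(), match.group(0), suggestion))
    return sorted(hits)
```

With this layout, `flag("The tug had to tow the line across the harbour.")` returns an empty list, while a sentence containing "all intensive purposes" gets a suggestion. Deciding which table a phrase belongs in is the "lot of work" part, and no code does that for you.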

