There is no need to actually OCR the image
Posted Aug 28, 2006 17:43 UTC (Mon) by tack (subscriber, #12542)
In reply to: There is no need to actually OCR the image by spitzak
Parent article: Fighting image spam
I occasionally receive scanned newspaper articles that may be of interest to me.
I prefer OCRing the image and running the extracted text through a Bayesian classifier. Some of the techniques described here could serve as an optional first pass that checks whether the image is likely to contain a lot of text, so that only those images get OCRed, which would cut the CPU overhead.
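To make the idea concrete, here is a rough Python sketch of that two-stage pipeline. Everything in it is an assumption made for illustration: `looks_like_text` is a stand-in heuristic (counting dark/light flips per pixel row), not one of the techniques from the parent comment; `NaiveBayes` is a toy word-count classifier, not any particular spam filter; and `ocr` is whatever callable wraps a real OCR engine. The cutoffs are untuned placeholders.

```python
import math
from collections import Counter


def looks_like_text(pixels, threshold=128, min_flips_per_row=4.0):
    """Cheap pre-check (hypothetical heuristic): rows of text flip between
    dark and light many times. `pixels` is a list of rows of 0-255 values."""
    if not pixels:
        return False
    flips = 0
    for row in pixels:
        bits = [v < threshold for v in row]
        flips += sum(a != b for a, b in zip(bits, bits[1:]))
    return flips / len(pixels) >= min_flips_per_row


class NaiveBayes:
    """Toy word-count naive Bayes classifier with add-one smoothing."""

    def __init__(self):
        self.words = {"spam": Counter(), "ham": Counter()}
        self.docs = Counter()

    def train(self, label, text):
        self.words[label].update(text.lower().split())
        self.docs[label] += 1

    def spam_probability(self, text):
        vocab = len(set(self.words["spam"]) | set(self.words["ham"])) or 1
        total_docs = sum(self.docs.values())
        log_score = {}
        for label, counts in self.words.items():
            total = sum(counts.values())
            score = math.log((self.docs[label] + 1) / (total_docs + 2))
            for word in text.lower().split():
                score += math.log((counts[word] + 1) / (total + vocab))
            log_score[label] = score
        # Turn the two log scores into P(spam); clamp to avoid overflow.
        diff = max(-700.0, min(700.0, log_score["ham"] - log_score["spam"]))
        return 1.0 / (1.0 + math.exp(diff))


def classify_image(pixels, ocr, classifier, cutoff=0.9):
    """Run the expensive OCR step only when the cheap pre-check passes."""
    if not looks_like_text(pixels):
        return "skipped"
    text = ocr(pixels)
    return "spam" if classifier.spam_probability(text) >= cutoff else "ham"
```

The point of the structure is that a scanned newspaper article still gets OCRed and judged on its words like any other mail, while images with little or no text never reach the OCR step at all.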