

























Version 2.0 of the FairScan document-scanning app for Android has been released. The headline feature for this release is the addition of optical-character-recognition (OCR) support using Tesseract to produce PDFs with searchable text from scans. FairScan developer Pierre-Yves Nicolas has written a detailed blog about adding the feature and explaining why it had not been added previously.
That looks nice, so why didn't FairScan have it before? That's because FairScan wasn't ready for it: I wouldn't be comfortable if FairScan was giving you wrong text half of the time. To get good results from an OCR engine, you need to provide it a readable image. If it's hard to read for a human, it's certainly also hard to read for an OCR engine.
Over the past year, I worked on different parts of FairScan's automatic processing to transform photos of documents into PDFs that are easy for humans to read:
- document detection
- perspective correction
- shadow reduction
- brightness and contrast enhancement
All this work on image processing helped FairScan produce clean PDFs and can now also contribute to making text recognition effective.
FairScan is available via Google Play or F-Droid.
此内容由惯性聚合(RSS阅读器)自动聚合整理,仅供阅读参考。 原文来自 — 版权归原作者所有。