Some experiments in using OCR to filter out images that contain text.
The use case is filtering out images that do not contain people from a set of images collected from various sources (Twitter, Instagram, etc.). This can be used together with computer vision (CV) solutions.
Ubuntu:
sudo apt-get install tesseract-ocr-dev tesseract-ocr wamerican libleptonica-dev
Mac:
brew install tesseract
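
With Tesseract installed, the filtering idea can be sketched roughly as follows. This is a minimal illustration, not the code used in these experiments. It assumes the tesseract command is on the PATH and that the wamerican word list lives at /usr/share/dict/american-english (on a Mac, /usr/share/dict/words may be a usable substitute). The function names and the min_length and min_hits thresholds are hypothetical and would need tuning on real images.

```python
import re
import subprocess


def run_tesseract(image_path):
    """Run the tesseract CLI on one image and return the recognized text."""
    # Passing "stdout" as the output base makes tesseract print the text
    # instead of writing it to a .txt file.
    result = subprocess.run(
        ["tesseract", image_path, "stdout"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def load_words(path="/usr/share/dict/american-english"):
    """Load a word list (one word per line) into a lowercase set.

    The default path is an assumption based on the wamerican package.
    """
    with open(path, encoding="utf-8") as f:
        return {line.strip().lower() for line in f if line.strip()}


def count_dictionary_words(ocr_text, words, min_length=3):
    """Count OCR tokens that are real words; short tokens are often noise."""
    tokens = re.findall(r"[A-Za-z]+", ocr_text)
    return sum(
        1 for t in tokens if len(t) >= min_length and t.lower() in words
    )


def contains_text(ocr_text, words, min_hits=3):
    """Hypothetical decision rule: enough dictionary words means 'text image'."""
    return count_dictionary_words(ocr_text, words) >= min_hits
```

A typical use would be to call load_words() once, then run contains_text(run_tesseract(path), words) for each image and drop the images for which it returns True. Checking tokens against a dictionary, rather than just testing for non-empty OCR output, is one way to reduce false positives from the stray characters OCR tends to produce on photos.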