
Android OCR tutorial - image to text

This tutorial shows how to use the Tesseract OCR library in an Android application. Tesseract is an open-source OCR library originally developed by HP.


1. Download the Tesseract library for Android (tess-two) from https://github.com/rmtheis/tess-two/downloads.
    Download the .zip for Windows or the .tar.gz for Linux.

2. Software requirements
    - Eclipse
    - Java JDK
    - Android SDK
    - Android NDK
    - Cygwin (for Windows users)
    - Apache Ant
3. Windows users: make sure Cygwin is already installed (you can download it from http://www.cygwin.com/). During the Cygwin installation, also select these packages: gcc-core, gcc-g++, make, swig.

4. Download Apache Ant from http://ant.apache.org/bindownload.cgi. Choose the .zip for Windows or the .tar.bz2 for Linux.

5. Unzip Apache Ant and add its bin folder to the PATH environment variable (mine is C:\apache-ant-1.8.3\bin).


6. Run Cygwin (Windows users) or a terminal (Linux users), then execute:
     a. cd <project-directory>/tess-two
     b. export TESSERACT_PATH=${PWD}/external/tesseract-3.01
     c. export LEPTONICA_PATH=${PWD}/external/leptonica-1.68
     d. export LIBJPEG_PATH=${PWD}/external/libjpeg
     e. ndk-build (for Windows users: /cygdrive/<ndk-directory>/ndk-build)
     f. android update project --path . (on Windows, Cygwin sometimes cannot execute this command; if so,
        run it from the command prompt instead).
        Note: The "." after --path must be included in the command.
     g. ant release (if you get an error such as Java tools.jar not found, set the JAVA_HOME environment
        variable to the JDK folder; mine is C:\Program Files\Java\jdk1.7.0)
7. Run Eclipse. Right-click in the Package Explorer, then Import >> General >> Existing Projects into Workspace >>
    Next >> Select root directory >> browse to the tess-two folder >> Finish.
    You will see the tess-two project in your Package Explorer.
    
8. Right-click the tess-two project >> Android Tools >> Fix Project Properties. Right-click it again >> Properties >>
    Android >> check Is Library.
    Download the Simple OCR Android app from https://github.com/GautamGupta/Simple-Android-OCR.
    Right-click in the Package Explorer and import the Simple OCR Android app folder the same way as in step 7.

9. Right-click the Simple OCR Android app project >> Properties >> Android >> Add >> select tess-two >> OK.
   
10. Run the app. Good luck!
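
The command-line part of step 6 can be collected into one small script. This is only a rough sketch, not part of tess-two itself: TESS_TWO_DIR and RUN_BUILD are names made up for this example, and it assumes ndk-build, android and ant are reachable through PATH (under Cygwin you may still need the full /cygdrive/<ndk-directory>/ndk-build path from step 6e). By default it only exports the three paths and reports which tools are missing; it starts the build only when RUN_BUILD=1.

```shell
#!/bin/sh
# Sketch of step 6. TESS_TWO_DIR and RUN_BUILD are hypothetical names
# used only in this example; they are not read by tess-two itself.
TESS_TWO_DIR="${TESS_TWO_DIR:-$PWD}"

# Steps 6b-6d: source locations exported before the native build
export TESSERACT_PATH="$TESS_TWO_DIR/external/tesseract-3.01"
export LEPTONICA_PATH="$TESS_TWO_DIR/external/leptonica-1.68"
export LIBJPEG_PATH="$TESS_TWO_DIR/external/libjpeg"

# Return 0 if the named command is on PATH, 1 otherwise
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found:   $1"
    else
        echo "missing: $1"
        return 1
    fi
}

# Count the build tools that cannot be found
missing=0
for tool in ndk-build android ant; do
    check_tool "$tool" || missing=$((missing + 1))
done

# Steps 6e-6g run only when requested and when every tool is present
if [ "${RUN_BUILD:-0}" = "1" ] && [ "$missing" -eq 0 ]; then
    cd "$TESS_TWO_DIR" && ndk-build && android update project --path . && ant release
fi
```

Saved under a file name of your choice (build-tess-two.sh here is just an example), it could be run as: TESS_TWO_DIR=<project-directory>/tess-two RUN_BUILD=1 sh build-tess-two.sh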



References
[1] http://gaut.am/making-an-ocr-android-app-using-tesseract/
[2] http://ant.apache.org/bindownload.cgi
[3] http://wolfpaulus.com/journal/android-and-ocr
[4] http://rmtheis.wordpress.com/2011/08/06/using-tesseract-tools-for-android-to-create-a-basic-ocr-app/
[5] https://github.com/rmtheis/tess-two

From: http://kurup87.blogspot.com/2012/03/android-ocr-tutorial-image-to-text.html
