Download Leptonica
The official Tesseract documentation says:
You also need to install Leptonica. Ensure that the development headers for Leptonica are installed before compiling Tesseract.
Download page: http://www.leptonica.com/download.html. I downloaded leptonica-1.75.3.
Build and install:
tar zxvf leptonica-1.75.3.tar.gz
cd leptonica-1.75.3
./configure
make && make install
Build and install Tesseract
wget https://github.com/tesseract-ocr/tesseract/archive/tesseract-3.05.01.tar.gz
tar zxvf tesseract-3.05.01.tar.gz
cd tesseract-3.05.01
./autogen.sh
Problem 1
Error:
[root@iZwz9bpg2u1r39ml9st8qzZ tesseract-master]# ./autogen.sh
Unable to find a valid copy of libtoolize or glibtoolize in your PATH!
./autogen.sh: line 59: bail_out: command not found
Running aclocal
./autogen.sh: line 82: aclocal: command not found
Something went wrong, bailing out!
Fix: install automake, which provides aclocal:
yum install automake -y
Problem 2
Error:
Unable to find a valid copy of libtoolize or glibtoolize in your PATH!
./autogen.sh: line 59: bail_out: command not found
Running aclocal
Running
./autogen.sh: line 87: -f: command not found
Something went wrong, bailing out!
Fix: install libtool, which provides libtoolize:
yum install libtool -y
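Problems 1 and 2 both come down to autotools commands missing from PATH. The following is a minimal sketch of a pre-flight check that could be run before ./autogen.sh. The function name missing_tools is hypothetical, and the tool list is only inferred from the commands named in the logs above, so treat it as an assumption rather than the complete set of build requirements.

```shell
#!/bin/sh
# Print the name of every given command that cannot be found on PATH,
# one per line. Prints nothing when all of them are available.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Assumed tool list, taken from the commands that autogen.sh reported running.
missing=$(missing_tools aclocal automake autoconf autoheader libtoolize)

if [ -n "$missing" ]; then
    # $missing is left unquoted on purpose so the names print on one line.
    echo "Missing build tools:" $missing
    echo "Try: yum install automake libtool -y"
else
    echo "All checked autotools commands were found."
fi
```

Running the check first reports every missing tool at once, instead of finding them one failed autogen.sh run at a time.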
Problem 3
Error:
Leptonica 1.74 or higher is required. Try to install libleptonica-dev package
Fix: set the environment variables so that the build can find the Leptonica installed under /usr/local:
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/lib
export LIBLEPT_HEADERSDIR=/usr/local/include
export PKG_CONFIG_PATH=/usr/local/lib/pkgconfig
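To confirm that the environment variables took effect, it can help to ask pkg-config which Leptonica version it sees. The sketch below assumes that Leptonica installed its lept.pc file under /usr/local/lib/pkgconfig and that sort supports the GNU -V (version sort) option. The helper name version_ge is hypothetical.

```shell
#!/bin/sh
# Succeed if version $1 is greater than or equal to version $2.
# Relies on GNU sort -V; the smaller of the two versions sorts first.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Put /usr/local first while keeping any value that was already set.
export PKG_CONFIG_PATH="/usr/local/lib/pkgconfig${PKG_CONFIG_PATH:+:$PKG_CONFIG_PATH}"

if command -v pkg-config >/dev/null 2>&1 &&
   found=$(pkg-config --modversion lept 2>/dev/null); then
    if version_ge "$found" 1.74; then
        echo "Leptonica $found is new enough"
    else
        echo "Leptonica $found is older than 1.74"
    fi
else
    echo "pkg-config cannot find lept.pc; check PKG_CONFIG_PATH"
fi
```

If the last message appears, the build will most likely fail with the same Leptonica error, so it is worth fixing PKG_CONFIG_PATH before trying again.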
When autogen.sh prints output like the following, the checks have passed:
Running aclocal
Running /usr/bin/libtoolize
libtoolize: putting auxiliary files in AC_CONFIG_AUX_DIR, `config'.
libtoolize: copying file `config/ltmain.sh'
libtoolize: putting macros in AC_CONFIG_MACRO_DIR, `m4'.
libtoolize: copying file `m4/libtool.m4'
libtoolize: copying file `m4/ltoptions.m4'
libtoolize: copying file `m4/ltsugar.m4'
libtoolize: copying file `m4/ltversion.m4'
libtoolize: copying file `m4/lt~obsolete.m4'
Running autoheader
Running automake --add-missing --copy
unittest/Makefile.am:63: variable `EXTRA_apiexample_test_DEPENDENCIES' is defined but no program or
unittest/Makefile.am:63: library has `EXTRA_apiexample_test' as canonical name (possible typo)
Running autoconf

All done.
To build the software now, do something like:

$ ./configure [--enable-debug] [...other options]
Install:
./configure
make && make install
After installation, download the language data from https://github.com/tesseract-ocr/tessdata
I only downloaded the Chinese and English files, as shown here:
[root@iZwz9bpg2u1r39ml9st8qzZ tessdata]# pwd
/usr/local/share/tessdata
[root@iZwz9bpg2u1r39ml9st8qzZ tessdata]# ls
chi_sim.traineddata  chi_tra.traineddata  configs  eng.traineddata  pdf.ttf  tessconfigs
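The language files can also be fetched from the command line. This is only a sketch. It assumes that the 3.04.00 tag of the tessdata repository is the one that matches Tesseract 3.05 (check the repository README before relying on it), and the names TESSDATA_REF, tessdata_url and fetch_lang are hypothetical.

```shell
#!/bin/sh
# Assumption: tag 3.04.00 of the tessdata repository matches Tesseract 3.05.
TESSDATA_REF="${TESSDATA_REF:-3.04.00}"
# Directory shown in the listing above.
TESSDATA_DIR="${TESSDATA_DIR:-/usr/local/share/tessdata}"

# Print the raw download URL for one language code, e.g. eng or chi_sim.
tessdata_url() {
    printf 'https://github.com/tesseract-ocr/tessdata/raw/%s/%s.traineddata\n' \
        "$TESSDATA_REF" "$1"
}

# Download one language file into TESSDATA_DIR.
fetch_lang() {
    wget -O "$TESSDATA_DIR/$1.traineddata" "$(tessdata_url "$1")"
}

# Example (needs network access and write permission to TESSDATA_DIR):
# for lang in eng chi_sim chi_tra; do fetch_lang "$lang"; done
```

The download loop is left commented out so the script does nothing until the tag and target directory have been checked.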
Now test it:
[root@iZwz9bpg2u1r39ml9st8qzZ ~]# tesseract
Usage:
  tesseract --help | --help-psm | --help-oem | --version
  tesseract --list-langs [--tessdata-dir PATH]
  tesseract --print-parameters [options...] [configfile...]
  tesseract imagename|stdin outputbase|stdout [options...] [configfile...]

OCR options:
  --tessdata-dir PATH   Specify the location of tessdata path.
  --user-words PATH     Specify the location of user words file.
  --user-patterns PATH  Specify the location of user patterns file.
  -l LANG[+LANG]        Specify language(s) used for OCR.
  -c VAR=VALUE          Set value for config variables.
                        Multiple -c arguments are allowed.
  --psm NUM             Specify page segmentation mode.
  --oem NUM             Specify OCR Engine mode.
NOTE: These options must occur before any configfile.

Page segmentation modes:
  0    Orientation and script detection (OSD) only.
  1    Automatic page segmentation with OSD.
  2    Automatic page segmentation, but no OSD, or OCR.
  3    Fully automatic page segmentation, but no OSD. (Default)
  4    Assume a single column of text of variable sizes.
  5    Assume a single uniform block of vertically aligned text.
  6    Assume a single uniform block of text.
  7    Treat the image as a single text line.
  8    Treat the image as a single word.
  9    Treat the image as a single word in a circle.
 10    Treat the image as a single character.
 11    Sparse text. Find as much text as possible in no particular order.
 12    Sparse text with OSD.
 13    Raw line. Treat the image as a single text line,
       bypassing hacks that are Tesseract-specific.

OCR Engine modes:
  0    Original Tesseract only.
  1    Cube only.
  2    Tesseract + cube.
  3    Default, based on what is available.

Single options:
  -h, --help            Show this help message.
  --help-psm            Show page segmentation modes.
  --help-oem            Show OCR Engine modes.
  -v, --version         Show version information.
  --list-langs          List available languages for tesseract engine.
  --print-parameters    Print tesseract parameters to stdout.
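Printing the usage text only shows that the binary runs. A real recognition run, using the options listed above, might look like the sketch below. The wrapper name ocr_image and the file names sample.png and result are hypothetical. The default language string assumes the chi_sim and eng files downloaded earlier.

```shell
#!/bin/sh
# Run OCR on one image.
#   $1  input image
#   $2  output base name (Tesseract appends .txt)
#   $3  languages, joined with + (default: chi_sim+eng)
#   $4  page segmentation mode (default: 3, fully automatic)
ocr_image() {
    image=$1
    output_base=$2
    langs=${3:-chi_sim+eng}
    psm=${4:-3}

    if [ ! -f "$image" ]; then
        echo "input image not found: $image" >&2
        return 1
    fi
    # Options follow the image and output base, as in the usage text.
    tesseract "$image" "$output_base" -l "$langs" --psm "$psm"
}

# Example (hypothetical file names):
# tesseract --list-langs                 # confirm chi_sim and eng are listed
# ocr_image sample.png result            # writes result.txt
# ocr_image line.png result chi_sim 7    # treat the image as one text line
```

When the text comes out badly segmented, trying a different --psm value from the list above is usually a good first step.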