Installing tesseract-ocr

2022-12-26 14:49:12

Windows installation

Dependencies

Not required on Windows.

Download the installer from the download page and click Next through every step.

yum-based distributions

Dependencies

Run in a terminal:

yum install libpng-devel libtiff-devel libwebp-devel openjpeg2-devel giflib-devel automake gcc-c++ git libtool leptonica-devel make pkgconfig

apt-based distributions

Dependencies

Run in a terminal:

apt-get install libpng-dev libtiff5-dev libwebp-dev libopenjp2-7-dev libgif-dev automake g++ git libtool libleptonica-dev make pkg-config
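
Before compiling anything, it can help to confirm that the build tools from the dependency step are actually on PATH. The following is a minimal sketch, not part of the original notes: the function name missing_tools is hypothetical, and libtoolize is checked as a stand-in for the libtool package, which is an assumption about how your distribution packages it.

```shell
# Sketch: print the names of any required build tools that are not on PATH.
# missing_tools is a hypothetical helper, not part of tesseract or leptonica.
missing_tools() {
    local tool missing=""
    for tool in "$@"; do
        # command -v succeeds only if the tool can be found
        command -v "$tool" >/dev/null 2>&1 || missing="$missing $tool"
    done
    # Strip the leading space added by the loop
    echo "${missing# }"
}

missing=$(missing_tools automake g++ git libtoolize make pkg-config)
if [ -n "$missing" ]; then
    echo "missing build tools: $missing" >&2
else
    echo "all build tools found"
fi
```

If anything is reported as missing, rerun the package install command for your distribution before continuing.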

Installing leptonica

git clone https://github.com/DanBloomberg/leptonica

cd leptonica
./autogen.sh
./configure
make
sudo make install

Output on successful installation

tops xtractprotos '/usr/local/bin'
libtool: install: /usr/bin/install -c .libs/convertfilestopdf /usr/local/bin/convertfilestopdf
libtool: install: /usr/bin/install -c .libs/convertfilestops /usr/local/bin/convertfilestops
libtool: install: /usr/bin/install -c .libs/convertformat /usr/local/bin/convertformat
libtool: install: /usr/bin/install -c .libs/convertsegfilestopdf /usr/local/bin/convertsegfilestopdf
libtool: install: /usr/bin/install -c .libs/convertsegfilestops /usr/local/bin/convertsegfilestops
libtool: install: /usr/bin/install -c .libs/converttopdf /usr/local/bin/converttopdf
libtool: install: /usr/bin/install -c .libs/converttops /usr/local/bin/converttops
libtool: install: /usr/bin/install -c .libs/fileinfo /usr/local/bin/fileinfo
libtool: install: /usr/bin/install -c .libs/imagetops /usr/local/bin/imagetops
libtool: install: /usr/bin/install -c .libs/xtractprotos /usr/local/bin/xtractprotos
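
A common stumbling block at this point is that the tesseract configure step cannot find the leptonica that was just installed under /usr/local. The sketch below checks whether pkg-config can see it. It assumes the default install prefix, so /usr/local/lib/pkgconfig is an assumption you may need to adjust; lept is the pkg-config module name that leptonica installs.

```shell
# Sketch: make the leptonica installed under /usr/local visible to pkg-config.
# /usr/local/lib/pkgconfig assumes an unprefixed ./configure; adjust if needed.
export PKG_CONFIG_PATH="/usr/local/lib/pkgconfig${PKG_CONFIG_PATH:+:$PKG_CONFIG_PATH}"

if command -v pkg-config >/dev/null 2>&1 && pkg-config --exists lept; then
    echo "leptonica $(pkg-config --modversion lept) found"
else
    echo "leptonica is not visible to pkg-config; check PKG_CONFIG_PATH" >&2
fi
```

Run this in the same shell session you will use to configure tesseract, since the exported variable only applies to that session.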

Installing tesseract-ocr

cd ..
git clone https://github.com/tesseract-ocr/tesseract

cd tesseract
./autogen.sh
./configure
make
sudo make install
sudo ldconfig

Installation complete

Run tesseract --version; if the output looks like the following, the installation succeeded.

➜  tesseract git:(master) ✗ tesseract --version
tesseract 5.0.0-alpha-859-gd13e
 leptonica-1.81.0
  libgif 5.1.4 : libjpeg 6b (libjpeg-turbo 1.5.2) : libpng 1.6.36 : libtiff 4.1.0 : zlib 1.2.11 : libwebp 0.6.1 : libopenjp2 2.3.0
 Found AVX2
 Found AVX
 Found SSE
 Found OpenMP 201511

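Installing language data

Language pack repository: https://github.com/tesseract-ocr/tessdata

Because the language packs are large, we download only the English, Traditional Chinese, and Simplified Chinese packs here.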

wget --no-check-certificate https://github.com/tesseract-ocr/tessdata/raw/master/eng.traineddata
wget --no-check-certificate https://github.com/tesseract-ocr/tessdata/raw/master/chi_sim.traineddata 
wget --no-check-certificate https://github.com/tesseract-ocr/tessdata/raw/master/chi_tra.traineddata

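Move the language packs into tesseract's tessdata directory (for a source build with the default prefix this is typically /usr/local/share/tessdata).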
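
The move step can be sketched as a small shell function. This is only an illustration under assumptions: install_traineddata is a hypothetical helper name, and the destination directory depends on your install prefix, so check where your build actually looks for its data.

```shell
# Sketch: copy every *.traineddata file from one directory into a tessdata directory.
# install_traineddata is a hypothetical helper; it prints the number of files copied.
install_traineddata() {
    local src_dir="$1" dest_dir="$2" f count=0
    mkdir -p "$dest_dir" || return 1
    for f in "$src_dir"/*.traineddata; do
        [ -e "$f" ] || continue          # the glob matched nothing
        cp -- "$f" "$dest_dir"/ || return 1
        count=$((count + 1))
    done
    echo "$count"
}
```

For example, install_traineddata . "$HOME/tessdata" copies the downloaded packs into a directory you own, and tesseract --tessdata-dir "$HOME/tessdata" --list-langs then lists them without touching the system directory. Copying into /usr/local/share/tessdata instead usually requires root.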

To check that the language packs are installed, run tesseract --list-langs; the following output indicates success.

➜  tesseract git:(master) ✗ tesseract --list-langs
List of available languages (3):
chi_sim
chi_tra
eng
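
With the three languages listed, a quick end-to-end check is to run OCR on an image. The sketch below wraps the standard tesseract invocation (image, output base name, -l with languages joined by +). The function name ocr_image and the file names in the usage note are hypothetical.

```shell
# Sketch: run OCR on one image with the installed languages.
# ocr_image is a hypothetical wrapper; tesseract writes its text to <out_base>.txt.
ocr_image() {
    local image="$1" out_base="$2" langs="${3:-chi_sim+eng}"
    if ! command -v tesseract >/dev/null 2>&1; then
        echo "tesseract not found in PATH" >&2
        return 127
    fi
    if [ ! -f "$image" ]; then
        echo "no such image: $image" >&2
        return 1
    fi
    tesseract "$image" "$out_base" -l "$langs"
}
```

For example, ocr_image sample.png result would write the recognized text to result.txt using Simplified Chinese and English, assuming sample.png exists in the current directory.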

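Summary

Three steps:

Step 1: install the required dependencies.

Step 2: download the source code, then build and install it.

Step 3: install the language packs.

The dependencies are the most important part. Various factors can leave them only partly installed, so be sure to follow the steps in order. These notes were written after running into those pitfalls myself. I have followed the steps summarized here no fewer than 10 times, across local, development, testing, staging, canary, and production environments, with no problems.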
