Tesseract Installation and Troubleshooting

2019-03-25 10:32:48

Problem 1

pytesseract.pytesseract.TesseractNotFoundError: tesseract is not installed or it's not in your path

In the pytesseract source code, find the line

tesseract_cmd = 'tesseract'

and change it to

tesseract_cmd = r'./Tesseract-OCR\tesseract.exe' (use your own installation path)
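The same setting can usually be overridden at runtime instead of editing the installed package, which survives a reinstall or upgrade of pytesseract. The following is a minimal sketch, not a definitive implementation: the helper name find_tesseract and the candidate path are hypothetical, and the pytesseract assignment is skipped if the package is not installed.

```python
import os
import shutil


def find_tesseract(candidates):
    """Return the first usable tesseract executable path, or None."""
    # Prefer an executable that is already reachable through PATH.
    on_path = shutil.which("tesseract")
    if on_path:
        return on_path
    # Otherwise fall back to the explicit candidate locations.
    for path in candidates:
        if os.path.isfile(path):
            return path
    return None


# Hypothetical install location; replace it with your own.
CANDIDATES = [r"D:\Program Files (x86)\Tesseract-OCR\tesseract.exe"]

exe = find_tesseract(CANDIDATES)
try:
    import pytesseract

    if exe:
        # Override the default command for this process only.
        pytesseract.pytesseract.tesseract_cmd = exe
except ImportError:
    pass  # pytesseract is not installed in this environment
```

If find_tesseract returns None, Tesseract is most likely not installed, and no value of tesseract_cmd will help until it is.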

Problem 2

E:\BuildFolder\tesseract-ocr\testing>tesseract-dlld.exe eurotext.tif eurotext
Error opening data file ./tessdata/eng.traineddata
Please make sure the TESSDATA_PREFIX environment variable is set to the parent directory of your "tessdata" directory.
Failed loading language 'eng'
Tesseract couldn't load any languages!
Could not initialize tesseract.

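Solutions

  • Put the tessdata directory in the same directory as tesseract.exe
  • Add TESSDATA_PREFIX=D:\Program Files (x86)\Tesseract-OCR as an environment variable

To test, set the environment variable temporarily in cmd: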

set TESSDATA_PREFIX=D:\Program Files (x86)\Tesseract-OCR
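The variable can also be set from inside a Python script before Tesseract is invoked, with a check that the language file is actually where the error message expects it. This is a sketch under stated assumptions: the helper names tessdata_ok and set_tessdata_prefix are hypothetical, and the layout checked (prefix/tessdata/eng.traineddata) follows the error message quoted in this post. Newer Tesseract releases may instead expect TESSDATA_PREFIX to point at the tessdata directory itself, so adjust if your version reports a different path.

```python
import os


def tessdata_ok(prefix, lang="eng"):
    """Check that <prefix>/tessdata/<lang>.traineddata exists."""
    return os.path.isfile(os.path.join(prefix, "tessdata", lang + ".traineddata"))


def set_tessdata_prefix(prefix, lang="eng"):
    """Set TESSDATA_PREFIX for this process if the language file is present."""
    if not tessdata_ok(prefix, lang):
        return False
    # Child processes started from this script inherit the variable.
    os.environ["TESSDATA_PREFIX"] = prefix
    return True


# Install directory used in this post; replace it with your own.
PREFIX = r"D:\Program Files (x86)\Tesseract-OCR"
if not set_tessdata_prefix(PREFIX):
    print("eng.traineddata not found under", os.path.join(PREFIX, "tessdata"))
```

Like the cmd command, this only affects the current process and its children; it does not change the system-wide environment.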

RuntimeError: Failed to init API, possibly an invalid tessdata

First, locate the tessdata directory of your Tesseract-OCR installation.

Then copy tessdata to the location named in the error.

After that, it runs normally.
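The copy step can be scripted so it is repeatable across machines. The following is a minimal sketch; the helper name copy_tessdata is hypothetical, and the target directory must be whatever location your own error message or environment expects.

```python
import os
import shutil


def copy_tessdata(src_tessdata, target_parent):
    """Copy a tessdata directory to target_parent/tessdata and return the new path."""
    if not os.path.isdir(src_tessdata):
        raise FileNotFoundError(src_tessdata)
    dest = os.path.join(target_parent, "tessdata")
    # dirs_exist_ok (Python 3.8+) merges into an existing directory instead of failing.
    shutil.copytree(src_tessdata, dest, dirs_exist_ok=True)
    return dest
```

For example, copy_tessdata(r"D:\Program Files (x86)\Tesseract-OCR\tessdata", target) copies the language files into target\tessdata, where target stands for the directory tesserocr is looking in. As far as I know, tesserocr functions such as file_to_text also accept a path argument pointing at the tessdata location, which may avoid the copy altogether.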

tesserocr._tesserocr.file_to_text RuntimeError: Failed to read picture

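The path argument of tesserocr.file_to_text must not contain Chinese characters; otherwise this error is raised. In testing, the call ran normally once the image was moved to a path containing only English (ASCII) characters.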
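When the image cannot simply be renamed, one workaround is to copy it to an ASCII-only temporary name and pass that copy to the OCR call. This is a sketch, assuming Python 3.7+ (for str.isascii) and that the system temporary directory itself has an ASCII-only path, which may not hold if, for instance, the Windows user name contains Chinese characters. The helper name ascii_safe_path is hypothetical.

```python
import os
import shutil
import tempfile


def ascii_safe_path(image_path):
    """Return image_path if it is pure ASCII, else the path of an ASCII-named copy."""
    if image_path.isascii():
        return image_path
    # Keep the extension when it is ASCII so the image type stays recognizable.
    ext = os.path.splitext(image_path)[1]
    if not ext.isascii():
        ext = ""
    fd, tmp_path = tempfile.mkstemp(prefix="ocr_", suffix=ext)
    os.close(fd)
    shutil.copyfile(image_path, tmp_path)
    return tmp_path
```

The returned path can then be handed to tesserocr.file_to_text. When a copy was made, the caller is responsible for deleting it afterwards.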

References: https://blog.csdn.net/moxiao1995071310/article/details/82630996 https://blog.csdn.net/BobYuan888/article/details/80987178
