dongyi0114 2012-04-17 20:40
浏览 216
已采纳

php exec()和tesseract去''无法打开输入文件'

I use Ghostscript to strip images from PDF files into jpg and run Tesseract to save txt content like this:

  • Ghostscript located in c:\engine\gs\
  • Tesseract located in c:\engine\tesseract\
  • web located pdf/jpg/txt dir = file/tmp/

Code:

$pathgs = "c:\\engine\\gs\\";
$pathtess = "c:\\engine\\tesseract\\";
$pathfile = "file/tmp/"

// Strip images
putenv("PATH=".$pathgs);
$exec = "gs -dNOPAUSE -sDEVICE=jpeg -r300 -sOutputFile=".$pathfile."strip%d.jpg ".$pathfile."upload.pdf -q -c quit";
shell_exec($exec);

// OCR
putenv("PATH=".$pathtess);
$exec = "tesseract.exe '".$pathfile."strip1.jpg' '".$pathfile."ocr' -l eng";
exec($exec, $msg);
print_r($msg);
echo file_get_contents($pathfile."ocr.txt");

Stripping the image (its just 1 page) works fine, but Tesseract echoes:

Array
  (
    [0] => Tesseract Open Source OCR Engine v3.01 with Leptonica
    [1] => Cannot open input file: 'file/tmp/strip1.jpg'
  )

and no ocr.txt file is generated, thus leading into a 'failed to open stream' error in PHP.

  • Copying strip1.jpg into c:/engine/tesseract/ folder and running Tesseract from command (tesseract strip1.jpg ocr.txt -l eng) runs without any issue.
  • Replacing the putenv() quote by exec(c:/engine/tesseract/tesseract ... ) returns the a.m. error
  • I kept strip1.jpg in the Tesseract folder and ran exec(tesseract 'c:/engine/tesseract/strip1.jpg' ... ) returns the a.m. error
  • Leaving away the apostrophs around path/strip1.jpg returns an empty array as message and does not create the ocr.txt file.
  • writing the command directly into the exec() quote instead of using $exec doesn't make the change.

What am I doing wrong?

  • 写回答

2条回答 默认 最新

  • doushang4293 2012-04-19 17:50
    关注

    Perhaps the missing environment variables in PHP is the problem here. Have a look at my question here to see if setting HOME or PATH sorts this out?

    本回答被题主选为最佳回答 , 对您是否有帮助呢?
    评论
查看更多回答(1条)

报告相同问题?

悬赏问题

  • ¥15 Python爬取指定微博话题下的内容,保存为txt
  • ¥15 vue2登录调用后端接口如何实现
  • ¥65 永磁型步进电机PID算法
  • ¥15 sqlite 附加(attach database)加密数据库时,返回26是什么原因呢?
  • ¥88 找成都本地经验丰富懂小程序开发的技术大咖
  • ¥15 如何处理复杂数据表格的除法运算
  • ¥15 如何用stc8h1k08的片子做485数据透传的功能?(关键词-串口)
  • ¥15 有兄弟姐妹会用word插图功能制作类似citespace的图片吗?
  • ¥15 latex怎么处理论文引理引用参考文献
  • ¥15 请教:如何用postman调用本地虚拟机区块链接上的合约?