I know there are many PDF extraction methods and techniques, but I'm after a reliable text extractor for PDFs in PHP. All I want is to extract words, with no numbers and no special characters.
Any ideas of solid techniques to achieve this?
The Zend Framework provides Zend_Pdf, a PHP class that loads and parses PDF documents.
Here is a script that shows how to extract the text from a loaded Zend_Pdf object.
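Once the raw text has been pulled out of the PDF, by Zend_Pdf or any other extractor, the "words only" part of the question could be handled with a regular expression. The following is a minimal sketch of that filtering step only, not the Zend_Pdf script itself; the function name `extractWords` and the sample string are hypothetical and stand in for whatever text your extractor returns.

```php
<?php
// Hypothetical helper: keeps runs of letters, drops digits and punctuation.
function extractWords(string $text): array
{
    // \p{L}+ matches one or more Unicode letters; the u flag enables UTF-8 mode.
    if (preg_match_all('/\p{L}+/u', $text, $matches) === false) {
        return []; // e.g. the input was not valid UTF-8
    }
    return $matches[0];
}

// Sample input standing in for text extracted from a PDF.
$words = extractWords("Invoice #42: total due 19.99 EUR, thanks!");
print_r($words); // Invoice, total, due, EUR, thanks
```

Note that this treats any non-letter as a separator, so a contraction such as "don't" is split into "don" and "t", and a mixed token such as "abc123" yields "abc". If that is not what you want, the pattern would need adjusting.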