I need to extract the text of a PDF file into a PHP variable. I am using pdf2text for this, but I run into problems when I try to convert the resulting string to UTF-8.
Also, if someone knows a better way to remove the spaces and line breaks from the string, I would be grateful.
This is the code I have used:
header('Content-type: text/html; charset=utf-8');
mb_internal_encoding('UTF-8');
mb_http_output('UTF-8');
include('pdftophp.php');
$doc = new PDF2Text();
$doc->setFilename('pdf/prueba.pdf');
$doc->decodePDF();
$texto = $doc->output();
$resultado = "";
for ($i = 0; $i < strlen($texto); $i++) {
    // Copy every character except spaces and line breaks.
    $caracter = substr($texto, $i, 1);
    if ($caracter != " " && $caracter != "\n") {
        $resultado .= $caracter;
    }
}
echo $resultado;
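
For comparison, here is a minimal sketch of the two steps done with built-in functions instead of a character loop. It is not a confirmed fix: the helper names `toUtf8` and `stripWhitespace` are hypothetical, and the source encoding `ISO-8859-1` is only an assumption, since the encoding that PDF2Text actually returns is not known here. Note also that a string in another encoding can occasionally pass the UTF-8 validity check by coincidence.

```php
<?php
// Hypothetical helper: normalise a string to UTF-8.
// $assumedSource is a guess; the real output encoding of PDF2Text is unknown.
function toUtf8(string $text, string $assumedSource = 'ISO-8859-1'): string
{
    // Leave the string untouched if it is already valid UTF-8.
    if (mb_check_encoding($text, 'UTF-8')) {
        return $text;
    }
    return mb_convert_encoding($text, 'UTF-8', $assumedSource);
}

// Hypothetical helper: remove all whitespace (spaces, tabs, line breaks).
function stripWhitespace(string $utf8Text): string
{
    // The "u" modifier makes the pattern treat the subject as UTF-8.
    $result = preg_replace('/\s+/u', '', $utf8Text);
    // preg_replace() returns null on failure (e.g. invalid UTF-8),
    // so fall back to the unmodified input in that case.
    return $result ?? $utf8Text;
}

$sample = "Hola \n mundo\r\n";
echo stripWhitespace(toUtf8($sample)); // prints "Holamundo"
```

In the script above, this would replace the loop with something like `echo stripWhitespace(toUtf8($texto));`. Converting first and stripping second matters, because the `u` modifier rejects input that is not valid UTF-8.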