I have several PDFs that need a Primary Language entry (for us this is always English, so (en-us) as the document's catalog dictionary entry) and a Title field so that they pass ADA checks.
I've had some luck with PDF version 1.4 by doing string replacements on the whole document (read via file_get_contents) and rewriting the file so I wouldn't lose what's in it, but files using versions 1.5 and 1.6 of the PDF standard seem to be sensitive even to spaces and tabs in their contents.
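One possible explanation, offered as an assumption rather than a diagnosis of these particular files: from PDF 1.5 on, a writer may store objects (potentially including the catalog) inside compressed object streams and use a cross-reference stream instead of a plain-text table, so a text search for /Type /Catalog can come up empty. Inserting bytes can also invalidate the byte offsets recorded in a classic cross-reference table. A minimal sketch that reports the header version and looks for those markers in the raw bytes; inspectPdf is a hypothetical helper name, not part of any library:

```php
<?php
// Hypothetical helper: inspect raw PDF bytes without modifying them.
// Returns the header version plus flags for features that may defeat
// a plain string replacement.
function inspectPdf(string $bytes): array
{
    // The "%PDF-x.y" header is expected near the start of the file,
    // so only the first 1024 bytes are searched.
    $version = null;
    if (preg_match('/%PDF-(\d\.\d)/', substr($bytes, 0, 1024), $m)) {
        $version = $m[1];
    }

    return [
        'version'       => $version,
        // Stream dictionaries stay uncompressed, so these markers are
        // visible even when the stream payload itself is compressed.
        'objectStreams' => preg_match('/\/Type\s*\/ObjStm/', $bytes) === 1,
        'xrefStream'    => preg_match('/\/Type\s*\/XRef/', $bytes) === 1,
    ];
}

// Usage (path is illustrative):
// print_r(inspectPdf(file_get_contents('example.pdf')));
```

Note that the header version is only a hint: the catalog may carry its own /Version entry that overrides it, and this check does not parse the file structure.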
I've attempted to use exiftool via shell_exec(), but this only seems to work on PDF version 1.4. On other versions the values are set inside the PDF, but the file still fails our scans because of flags like /Type/Catalog/ViewerPreferences<</DisplayDocTitle true>>
which seem to be placed at arbitrary positions inside the document in 1.6.
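For reference, the exiftool attempt might look roughly like the sketch below. The function name buildExiftoolTitleCommand, the file name, and the title are hypothetical. It only sets the Title tag and does nothing about the language entry or the /DisplayDocTitle flag described above:

```php
<?php
// Hypothetical helper: build the shell command that sets a PDF's Title
// with exiftool. Both arguments are escaped so that spaces or quotes in
// the title or path cannot break out of the command.
function buildExiftoolTitleCommand(string $pdfPath, string $title): string
{
    return 'exiftool -overwrite_original '
        . escapeshellarg('-Title=' . $title) . ' '
        . escapeshellarg($pdfPath);
}

// Usage (requires exiftool on the PATH; values are illustrative):
// echo shell_exec(buildExiftoolTitleCommand('report.pdf', 'Annual Report'));
```

The -overwrite_original option stops exiftool from leaving a report.pdf_original backup next to each file; drop it to keep the backups while testing.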
Has anyone tried to tackle this on the web side before? I was hoping to build something that would cut down on having to open every single one of these in Adobe and resave it.
I've searched for an Adobe API or library I could plug in to make these minor edits. All the frameworks I've seen create new PDFs, which means all the tagging and alt text we put in would be lost, so I don't want to go the route of Zend or anything else that won't edit JUST the metadata.
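<?php
// Scan the current directory and add /Lang to any PDF that lacks it.
$dir = getcwd();
$files = scandir($dir);
foreach ($files as $file) {
    // Only process files whose extension is .pdf (case-insensitive).
    if (strtolower(pathinfo($file, PATHINFO_EXTENSION)) !== 'pdf') {
        continue;
    }
    $path = $dir . '/' . $file;
    $pdf = file_get_contents($path);
    if ($pdf === false) {
        echo "Could not read " . $file . PHP_EOL;
        continue;
    }
    // This seems to work for 1.4, but not anything else
    if (strpos($pdf, '/Lang') === false) {
        echo "Changing Lang on " . $file . PHP_EOL;
        // Insert the /Lang entry right after the catalog's type declaration.
        $pdf_str = preg_replace("/\/Type \/Catalog/", "/Type /Catalog\n/Lang (en-us)", $pdf);
        file_put_contents($path, $pdf_str);
    } else {
        echo "Lang passed on " . $file . PHP_EOL;
    }
}
?>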