dongxun3424 2011-09-16 20:16
浏览 72
已采纳

如何在不破坏文件的情况下“手动”编辑pdf中的注释?

I need to insert an hyperlink into a few thousand existing pdfs. I'm working with zend_pdf which apparently is not able to set an invisible border. The only way I found to make the link borders invisible (found it somewhere else on this site, here, to be precise) is to search for each link "element" of the pdf and add a /Border annotation, like this:

echo str_replace('/Annot /Subtype /Link', '/Annot /Subtype /Link /Border[0 0 0]', $pdf->render());

Since I need to work on files that reside on my filesystem, I'm using the sed command for the search & replace operation.
Now, at first sight this works, as the documents are displayed correctly by Acrobat 8, osx 10.6's Viewer and Ubuntu's document viewer. However, tools such as pdftk (1.41) and pdfinfo (0.12.1) report the structure is corrupted. This is annoying since it means that no further manipulation of the pdf using pdftk will be possible, since the tool refuses to work on the file as there are errors in it. I looked into the files using a binary editor and I found out that if I add more than two bytes after "/Link", the file gets corrupted. This confuses me a lot, since based on the pdf specifications (I'm using 1.4) there is no checksum except for streams, which should mean that one can add as much bytes as he wants, as long as he's not doing that inside a stream and the inserted bytes are valid pdf syntax. What am I missing here?

Here is an example:
the original pdf
the processed pdf

  • 写回答

1条回答 默认 最新

  • douzhi1813 2011-09-27 08:35
    关注

    Adding the additional "/Border" element in the file actually corrupts the pdf's xref table. The xref table references all the objects by their position, measured in bytes from the beginning of the file. Inserting the additional element of course shifts the position (offset) of the subsequent contents by a few bytes.
    To fix the xref table after the edit, I can use pdftk from pdf labs (http://www.pdftk.com) to fix the xref table:

    $ pdftk corrupted_file.pdf output fixed_file.pdf

    As a matter of fact, I was not able to find a comprehensive Pdf solution for php, and I had to use several different kinds of tools (zend_pdf, pdftk, sed) to implement my workflow.

    本回答被题主选为最佳回答 , 对您是否有帮助呢?
    评论

报告相同问题?

悬赏问题

  • ¥100 c语言,请帮蒟蒻写一个题的范例作参考
  • ¥15 名为“Product”的列已属于此 DataTable
  • ¥15 安卓adb backup备份应用数据失败
  • ¥15 eclipse运行项目时遇到的问题
  • ¥15 关于#c##的问题:最近需要用CAT工具Trados进行一些开发
  • ¥15 南大pa1 小游戏没有界面,并且报了如下错误,尝试过换显卡驱动,但是好像不行
  • ¥15 没有证书,nginx怎么反向代理到只能接受https的公网网站
  • ¥50 成都蓉城足球俱乐部小程序抢票
  • ¥15 yolov7训练自己的数据集
  • ¥15 esp8266与51单片机连接问题(标签-单片机|关键词-串口)(相关搜索:51单片机|单片机|测试代码)