doushan1850 2011-02-22 21:09
浏览 815

如何在PHP中合并docx文档?

Does anyone know how to merge (concatenate) docx documents with PHP (or Python if it's not possible in PHP)?

To clarify, my server is Linux based. I have 2 existing docx document, I need to put them in a new docx document using PHP or possibly Python.

  • 写回答

2条回答 默认 最新

  • dqf60304 2011-10-31 21:49
    关注

    Merging two different Docx files may be very complicated because Headers, Styles, Charts, Comments, User Modification Traces and other special contents are saved in separate inner XML sub-files in each Docx. Thus, two Docx may have different objects having the same ids. So it would be a very huge job to list all possible objects in the two documents, give them new inner ids, and re-affect them in a single one. Probably only Ms Office can do this currently.

    Nevertheless, if you know that your two documents to be merged have the same styles, and if you know you have no charts, headers and other special objects, then the merging becomes something quite easy to perform.

    In this case, you only have to use a Zip reader, such as TbsZip, to open the first Docx file (which is technically a zip archive containing XML sub-files) ; then read the sub-file "word/document.xml" and extract the part which is between the tags < w:body > and < /w:body >. In the second Docx file, open the "word/content.xml" and insert the previous content just before the tag < /w:body >. Save the result in a new Docx file.

    This can be done using TbsZip, like this :

    <?php
    
    include_once('tbszip.php');
    
    $zip = new clsTbsZip();
    
    // Open the first document
    $zip->Open('doc1.docx');
    $content1 = $zip->FileRead('word/document.xml');
    $zip->Close();
    
    // Extract the content of the first document
    $p = strpos($content1, '<w:body');
    if ($p===false) exit("Tag <w:body> not found in document 1.");
    $p = strpos($content1, '>', $p);
    $content1 = substr($content1, $p+1);
    $p = strpos($content1, '</w:body>');
    if ($p===false) exit("Tag </w:body> not found in document 1.");
    $content1 = substr($content1, 0, $p);
    
    // Insert into the second document
    $zip->Open('doc2.docx');
    $content2 = $zip->FileRead('word/document.xml');
    $p = strpos($content2, '</w:body>');
    if ($p===false) exit("Tag </w:body> not found in document 2.");
    $content2 = substr_replace($content2, $content1, $p, 0);
    $zip->FileReplace('word/document.xml', $content2, TBSZIP_STRING);
    
    // Save the merge into a third file
    $zip->Flush(TBSZIP_FILE, 'merge.docx');
    
    评论

报告相同问题?

悬赏问题

  • ¥50 potsgresql15备份问题
  • ¥15 Mac系统vs code使用phpstudy如何配置debug来调试php
  • ¥15 目前主流的音乐软件,像网易云音乐,QQ音乐他们的前端和后台部分是用的什么技术实现的?求解!
  • ¥60 pb数据库修改与连接
  • ¥15 spss统计中二分类变量和有序变量的相关性分析可以用kendall相关分析吗?
  • ¥15 拟通过pc下指令到安卓系统,如果追求响应速度,尽可能无延迟,是不是用安卓模拟器会优于实体的安卓手机?如果是,可以快多少毫秒?
  • ¥20 神经网络Sequential name=sequential, built=False
  • ¥16 Qphython 用xlrd读取excel报错
  • ¥15 单片机学习顺序问题!!
  • ¥15 ikuai客户端多拨vpn,重启总是有个别重拨不上