dongti7838 2015-09-02 20:07

Opposite of xml_parse_into_struct?

I'm trying to write a series of functions that will extract the document.xml part of a MS Word DOCX file and effectively mail merge a series of key/value pairs to replace the defined template fields in the document. I have a function that uses xml_parse_into_struct to convert the XML text into the necessary arrays, but once I'm done with the replacing of text I'll (presumably) need to use the ZipArchive method addFromString to create the new document.xml file and add it to the DOCX zip container. But I'm not sure how to do that when I'm working with an array of data rather than an XML string. Is there a way to convert an array back into the XML string format?

Here's what I have so far:

// $filename = name of DOCX file to open
function get_docx_xml($filename) {
  // Extract XML from DOCX file
    $zip = new ZipArchive();
    if ($zip->open($filename, ZIPARCHIVE::CHECKCONS) !== TRUE) { echo 'failed to open template'; exit; }
    $xml = 'word/document.xml';
    $data = $zip->getFromName($xml);
    $zip->close();
    // Create the XML parser and create an array of the results
    $parser = xml_parser_create_ns();
    xml_parse_into_struct($parser, $data, $vals, $index);
    xml_parser_free($parser);
    // Return the relevant XML information
    return array('vals' => $vals, 'index' => $index);
}

That part works fine, I can print_r both arrays and make sense of the results. However, the following function does not work -- at least not in all cases. If I use certain delimiters for the fields to be replaced it works, but not all the time which I assume is an issue with Word's character encoding or other formatting.

// $templateFile = original, unedited template; $newFile = new file name to be created; $row = array of data to merge in
function mailmerge($templateFile, $newFile, $row) {
  if (!copy($templateFile, $newFile))  // make a duplicate so we don't overwrite the template
    return false; // could not duplicate template
  $xmldata = get_docx_xml($newFile);
  $zip = new ZipArchive();
  if ($zip->open($newFile, ZIPARCHIVE::CHECKCONS) !== TRUE)
    return false; // probably not a docx file
  $file = 'word/document.xml';
  $data = $zip->getFromName($file);
  foreach ($row as $key => $value) {
    $data = str_replace($key, xml_escape($value), $data);
  }
  $zip->deleteName($file);
  $zip->addFromString($file, $data);
  $zip->close();
  return true;
}
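
The xml_escape() helper called above is not shown in the post. A minimal sketch of what it might look like, assuming it only needs to escape the characters that are special in XML text (the body is a guess, not the original implementation):

```php
<?php
// Hypothetical stand-in for the xml_escape() helper used in mailmerge().
// Assumption: values are UTF-8 and only need the five predefined XML
// entities (&amp; &lt; &gt; &quot; &apos;) applied.
function xml_escape($value) {
    return htmlspecialchars((string) $value, ENT_QUOTES | ENT_XML1, 'UTF-8');
}
```

Without escaping, a merged value containing & or < would leave document.xml malformed, and Word would likely refuse to open the file.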

So instead of using str_replace (which fails a lot of the time) I was planning on cycling the $vals array that I get from the first function, doing the replace there, and then saving the resulting array back to a string and, in turn, back into the DOCX zip container.

1 answer

  • doulong1987 2015-09-03 20:54

    While I didn't find an answer to my question, I've solved the problem with a workaround: a series of substr_replace calls that make the necessary updates. Here's my new and improved mail merge function, in case anyone else needs something like this:

    // Merge data into a Word file (mailmerge or custom)
    // $templateFile = original, unedited template; $newFile = new file name to be created; $row = array of data to merge in; $delim_start = starting delimiter; $delim_end = ending delimiter
    function mailmerge($templateFile, $newFile, $row, $delim_start, $delim_end) {
      if (!copy($templateFile, $newFile))  // make a duplicate so we don't overwrite the template
        return false; // could not duplicate template
      $zip = new ZipArchive();
      if ($zip->open($newFile, ZIPARCHIVE::CHECKCONS) !== TRUE)
        return false; // probably not a docx file
      $file = 'word/document.xml';
      $data = $zip->getFromName($file);
      $currentpos = 0;
      foreach ($row as $key => $value) {
        // Look for a naturally occurring instance of the replacement string (key) and replace as needed
        // (strpos is case-sensitive, matching the strpos/str_replace calls below)
        if (strpos($data, $key) !== false) {
          $currentpos = strpos($data, $key) + strlen($key);
          $data = str_replace($key, xml_escape($value), $data);
        }
        else { // Look for the key's delimiter
          // stristr's third argument is a before_needle flag, not an offset, so use strpos here
          if (strpos($data, $delim_start, $currentpos) !== false) {
            $pos_start = strpos($data, $delim_start, $currentpos);
            // Clear the initial delimiter
            $data = substr_replace($data, '', $pos_start, strlen($delim_start));
            // Now find the actual data (by XML key)
            $datapos_start = (strpos($data, '<w:t>', $pos_start)) + 5;
            $datapos_end = strpos($data, '</w:t>', $datapos_start);
            // Replace the data
            $data = substr_replace($data, xml_escape($value), $datapos_start, ($datapos_end - $datapos_start));
            // Clear the closing delimiter (have to recalculate datapos_end due to the replacement)
            $datapos_end = strpos($data, $delim_end, $datapos_start);
            $data = substr_replace($data, '', $datapos_end, strlen($delim_end));
            // Reset the current position variable for the next iteration
            $currentpos = $datapos_end + 6;
          }
        }
      }
      $zip->deleteName($file);
      $zip->addFromString($file, $data);
      $zip->close();
      return true;
    }
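
On the original question: as far as I know, PHP has no built-in inverse of xml_parse_into_struct, but the $vals array can be walked in order and written back out by hand. Below is a minimal sketch under some loud assumptions. The function name struct_to_xml is made up for illustration. It assumes the parser was created with xml_parser_create() and XML_OPTION_CASE_FOLDING set to 0; with xml_parser_create_ns() and the default case folding, tag names come back upper-cased with the namespace URI in place of the prefix, so the output would not match the original document. The XML declaration, comments, and processing instructions are not part of $vals and are not restored.

```php
<?php
// Hypothetical helper: rebuild an XML string from the $vals array that
// xml_parse_into_struct() produces. Assumes case folding was disabled
// and a non-namespace parser was used (see the note above).
function struct_to_xml(array $vals) {
    $xml = '';
    foreach ($vals as $node) {
        // Serialize attributes, re-escaping values the parser decoded.
        $attrs = '';
        $attributes = isset($node['attributes']) ? $node['attributes'] : array();
        foreach ($attributes as $name => $value) {
            $attrs .= ' ' . $name . '="'
                . htmlspecialchars($value, ENT_QUOTES | ENT_XML1, 'UTF-8') . '"';
        }
        // Text content, if any, also needs re-escaping.
        $hasValue = isset($node['value']);
        $text = $hasValue
            ? htmlspecialchars($node['value'], ENT_NOQUOTES | ENT_XML1, 'UTF-8')
            : '';
        $tag = $node['tag'];
        switch ($node['type']) {
            case 'open':      // start tag, plus any text before the first child
                $xml .= '<' . $tag . $attrs . '>' . $text;
                break;
            case 'complete':  // element with no child elements
                $xml .= $hasValue
                    ? '<' . $tag . $attrs . '>' . $text . '</' . $tag . '>'
                    : '<' . $tag . $attrs . '/>';
                break;
            case 'cdata':     // text that appears between child elements
                $xml .= $text;
                break;
            case 'close':     // end tag
                $xml .= '</' . $tag . '>';
                break;
        }
    }
    return $xml;
}
```

With that in place, the replacements could be made on the 'value' entries of $vals, and the result of struct_to_xml($vals) passed to addFromString. I have not tested this against a real DOCX file, so treat it as a starting point rather than a drop-in fix.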
    