dongti7838 2015-09-02 20:07

Opposite of xml_parse_into_struct?

I'm trying to write a series of functions that extract the document.xml part of an MS Word DOCX file and effectively mail-merge a series of key/value pairs into the template fields defined in the document. I have a function that uses xml_parse_into_struct to convert the XML text into the necessary arrays, but once I've finished replacing the text I'll (presumably) need to use the ZipArchive method addFromString to create the new document.xml file and add it to the DOCX zip container. I'm not sure how to do that when I'm working with an array of data rather than an XML string. Is there a way to convert the array back into an XML string?

Here's what I have so far:

// $filename = name of DOCX file to open
function get_docx_xml($filename) {
    // Extract XML from DOCX file
    $zip = new ZipArchive();
    if ($zip->open($filename, ZipArchive::CHECKCONS) !== TRUE) { echo 'failed to open template'; exit; }
    $xml = 'word/document.xml';
    $data = $zip->getFromName($xml);
    $zip->close();
    // Create the XML parser and create an array of the results
    $parser = xml_parser_create_ns();
    xml_parse_into_struct($parser, $data, $vals, $index);
    xml_parser_free($parser);
    // Return the relevant XML information
    return array('vals' => $vals, 'index' => $index);
}

That part works fine: I can print_r both arrays and make sense of the results. However, the following function does not work, at least not in all cases. It works with certain delimiters around the fields to be replaced but not with others, which I assume is caused by Word's character encoding or other formatting.

// $templateFile = original, unedited template; $newFile = new file name to be created; $row = array of data to merge in
function mailmerge($templateFile, $newFile, $row) {
  if (!copy($templateFile, $newFile))  // make a duplicate so we don't overwrite the template
    return false; // could not duplicate template
  $xmldata = get_docx_xml($newFile);
  $zip = new ZipArchive();
  if ($zip->open($newFile, ZipArchive::CHECKCONS) !== TRUE)
    return false; // probably not a docx file
  $file = 'word/document.xml';
  $data = $zip->getFromName($file);
  foreach ($row as $key => $value) {
    $data = str_replace($key, xml_escape($value), $data);
  }
  $zip->deleteName($file);
  $zip->addFromString($file, $data);
  $zip->close();
  return true;
}
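
One plausible reason a plain str_replace misses a field, although nothing in this post confirms it for this particular template: Word often splits what looks like a single word into several runs (`<w:r>` elements), so the placeholder is interrupted by markup inside document.xml. The sketch below uses a made-up fragment and a hypothetical `{name}` placeholder to show the effect.

```php
<?php
// Made-up fragment: the visible text "{name}" is split across two runs.
$xml = '<w:p><w:r><w:t>{na</w:t></w:r>'
     . '<w:r><w:rPr><w:b/></w:rPr><w:t>me}</w:t></w:r></w:p>';

// The raw XML never contains the placeholder as one contiguous string...
$inRawXml = strpos($xml, '{name}') !== false;

// ...but the visible text does, once the markup is stripped.
$visibleText = strip_tags($xml);
$inVisibleText = strpos($visibleText, '{name}') !== false;

var_dump($inRawXml);      // bool(false)
var_dump($inVisibleText); // bool(true)
```

If that is what is happening, any replacement that works on the raw string has to cope with markup sitting between the pieces of the key.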

So instead of using str_replace (which fails a lot of the time), I was planning to loop over the $vals array that I get from the first function, do the replacement there, and then convert the resulting array back to a string and, in turn, write it back into the DOCX zip container.
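
As far as I know, PHP has no built-in inverse of xml_parse_into_struct, so the $vals array has to be serialized by hand. Below is a minimal sketch of that idea, not a complete serializer. Assumptions: the helper names xml_to_struct, struct_to_xml and attrs_to_xml are made up for illustration; the parser is created with xml_parser_create() instead of xml_parser_create_ns() and with case folding switched off, because otherwise the tag names in $vals are uppercased or namespace-expanded and no longer match what Word wrote; and the XML declaration, comments and processing instructions do not appear in $vals, so they would have to be added back separately.

```php
<?php
// Parse into the $vals structure while keeping tag names exactly as written.
function xml_to_struct($xml) {
    $parser = xml_parser_create();
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0); // no uppercasing
    xml_parser_set_option($parser, XML_OPTION_SKIP_WHITE, 0);   // keep whitespace
    xml_parse_into_struct($parser, $xml, $vals);
    return $vals;
}

// Serialize the attributes of one $vals entry, re-escaping their values.
function attrs_to_xml(array $node) {
    $out = '';
    if (!empty($node['attributes'])) {
        foreach ($node['attributes'] as $name => $value) {
            $out .= ' ' . $name . '="'
                  . htmlspecialchars($value, ENT_XML1 | ENT_QUOTES, 'UTF-8') . '"';
        }
    }
    return $out;
}

// Walk the flat $vals list in order and emit the matching markup.
function struct_to_xml(array $vals) {
    $xml = '';
    foreach ($vals as $node) {
        // The parser decoded entities, so text must be escaped again.
        $text = isset($node['value'])
            ? htmlspecialchars($node['value'], ENT_XML1 | ENT_NOQUOTES, 'UTF-8')
            : '';
        switch ($node['type']) {
            case 'open':     // start tag of an element that has child elements
                $xml .= '<' . $node['tag'] . attrs_to_xml($node) . '>' . $text;
                break;
            case 'complete': // element without child elements
                $xml .= '<' . $node['tag'] . attrs_to_xml($node);
                $xml .= isset($node['value'])
                    ? '>' . $text . '</' . $node['tag'] . '>'
                    : '/>';
                break;
            case 'cdata':    // text located between child elements
                $xml .= $text;
                break;
            case 'close':    // end tag matching an earlier 'open'
                $xml .= '</' . $node['tag'] . '>';
                break;
        }
    }
    return $xml;
}

// Usage: replace a hypothetical {name} field in the text values only.
$sample = '<w:p xmlns:w="urn:example"><w:r><w:t>Dear {name}</w:t></w:r></w:p>';
$nodes = xml_to_struct($sample);
foreach ($nodes as $i => $node) {
    if (isset($node['value'])) {
        $nodes[$i]['value'] = str_replace('{name}', 'Ann & Bob', $node['value']);
    }
}
echo struct_to_xml($nodes), "\n";
```

With this approach the replacement only touches the 'value' entries, so markup is never modified. A placeholder that Word has split across several `<w:t>` elements would still be split across several array entries, though, so this alone does not solve that case.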

1 answer

  • doulong1987 2015-09-03 20:54

    While I didn't find the answer to my question, I've solved the problem via a workaround. Effectively, I used a series of substr_replace calls to make the necessary updates. Here's my new and improved mail merge function in case anyone else needs something like this:

    // Merge data into a Word file (mailmerge or custom)
    // $templateFile = original, unedited template; $newFile = new file name to be created; $row = array of data to merge in; $delim_start = starting delimiter; $delim_end = ending delimiter
    function mailmerge($templateFile, $newFile, $row, $delim_start, $delim_end) {
      if (!copy($templateFile, $newFile))  // make a duplicate so we don't overwrite the template
        return false; // could not duplicate template
      $zip = new ZipArchive();
      if ($zip->open($newFile, ZipArchive::CHECKCONS) !== TRUE)
        return false; // probably not a docx file
      $file = 'word/document.xml';
      $data = $zip->getFromName($file);
      $currentpos = 0;
      foreach ($row as $key => $value) {
        // Look for a naturally occurring instance of the replacement string (key) and replace as needed
        // (strpos is case-sensitive, matching the strpos/str_replace calls below)
        if (strpos($data, $key) !== false) {
          $currentpos = strpos($data, $key) + strlen($key);
          $data = str_replace($key, xml_escape($value), $data);
        }
        else { // Look for the key's delimiter
          // stristr() takes no offset argument, so search from $currentpos with strpos()
          $pos_start = strpos($data, $delim_start, $currentpos);
          if ($pos_start !== false) {
            // Clear the initial delimiter
            $data = substr_replace($data, '', $pos_start, strlen($delim_start));
            // Now find the actual data (by XML key)
            $datapos_start = (strpos($data, '<w:t>', $pos_start)) + 5;
            $datapos_end = strpos($data, '</w:t>', $datapos_start);
            // Replace the data
            $data = substr_replace($data, xml_escape($value), $datapos_start, ($datapos_end - $datapos_start));
            // Clear the closing delimiter (have to recalculate datapos_end due to the replacement)
            $datapos_end = strpos($data, $delim_end, $datapos_start);
            $data = substr_replace($data, '', $datapos_end, strlen($delim_end));
            // Reset the current position variable for the next iteration
            $currentpos = $datapos_end + 6;
          }
        }
      }
      $zip->deleteName($file);
      $zip->addFromString($file, $data);
      $zip->close();
      return true;
    }
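
    The function above calls xml_escape(), which is not a PHP built-in and is not shown anywhere in this thread. A minimal stand-in, assuming all it needs to do is escape the XML special characters in a UTF-8 value, might look like the sketch below. The implementation is a guess at what the original helper does, not the author's actual code.

    ```php
    <?php
    // Hypothetical stand-in for the xml_escape() helper used above.
    if (!function_exists('xml_escape')) {
        function xml_escape($value) {
            // Escape &, <, > and both quote characters for use in XML text.
            return htmlspecialchars((string) $value, ENT_XML1 | ENT_QUOTES, 'UTF-8');
        }
    }

    echo xml_escape('Smith & Sons <Ltd>'), "\n"; // Smith &amp; Sons &lt;Ltd&gt;
    ```

    Without escaping, a merged value containing & or < would leave document.xml malformed, and Word would most likely refuse to open the file.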
    