doulun5683
2018-05-23 15:26

Problem converting docx to HTML with PHPWord

Accepted

I'm encountering an issue when converting a docx document into HTML with the PHPWord library (https://github.com/PHPOffice/PHPWord).

Here is the code snippet I use:

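require_once 'vendor/autoload.php'; // assuming PHPWord was installed with Composer

// Load the docx file, then write it back out as HTML
$phpWord = \PhpOffice\PhpWord\IOFactory::load('test.docx');
$htmlWriter = new \PhpOffice\PhpWord\Writer\HTML($phpWord);
$htmlWriter->save('test.html');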

The issue is that each block of text is wrapped in <p> tags, regardless of whether I defined headings in the docx document. I would expect <h1>, <h2>, ... tags to be generated. Bullet lists are lost too.

Does it work as designed or did I miss something?

Thank you for your feedback.

Regards


1 answer

  • dqtm8504 3 years ago

    IOFactory::load in PHPWord can behave the way you describe, depending on which application saved the file or which version of Microsoft Word created it. If PHPWord cannot find the encoding and tags it expects in the docx file, it produces unexpected results.

    Your code is fine; the problem lies in the dependency itself.
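
One way to check whether the reader recognized the headings and lists is to tally the element classes PHPWord produced after loading the file. The following is only a diagnostic sketch: countElementTypes() is a hypothetical helper (not part of PHPWord), and the usage part assumes PHPWord was installed with Composer and that test.docx sits in the working directory.

```php
<?php
// Hypothetical helper (not part of PHPWord): count how many objects of
// each class appear in a list of elements, most frequent first.
function countElementTypes(iterable $elements): array
{
    $counts = [];
    foreach ($elements as $element) {
        $class = get_class($element);
        $counts[$class] = ($counts[$class] ?? 0) + 1;
    }
    arsort($counts); // sort by count, descending, keeping class names as keys
    return $counts;
}

// Usage with PHPWord. Assumption: Composer autoloader and test.docx exist;
// the block is skipped otherwise so the script still runs.
$autoload = __DIR__ . '/vendor/autoload.php';
if (is_file($autoload) && is_file('test.docx')) {
    require_once $autoload;
    $phpWord = \PhpOffice\PhpWord\IOFactory::load('test.docx');
    foreach ($phpWord->getSections() as $index => $section) {
        echo "Section $index\n";
        foreach (countElementTypes($section->getElements()) as $class => $n) {
            echo "  $class: $n\n";
        }
    }
}
```

If the tally shows only PhpOffice\PhpWord\Element\Text or TextRun entries and no Title or ListItem entries, the headings and bullets were most likely lost while reading the docx, before the HTML writer ran.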
