duanpie2414
2016-10-22 19:52
1.6k views
Accepted

Parse .doc and .docx to get all the text using golang?

How can I parse word documents ".doc", ".docx" to get all the text using golang?


2 answers

  • doumian3780 2016-10-22 20:27
    Accepted

    You can get some inspiration from these projects:

    https://github.com/nguyenthenguyen/docx
    https://github.com/opencontrol/doc-template

    Basically, a DOCX file is a Zip archive with XML files inside it. All the text is in word/document.xml.

    What both projects do is strip all the XML tags, leaving only the text intact. You should see whether that approach suits you too.

  • 笨笨小弟 2020-06-07 20:18

    Use the unioffice library. I don't really understand how to use it, though...
