tang5324110
2010-06-07 09:28 · 307 views
Accepted

Reading a potx file with POI in Java

I am reading an Office 2007 potx file with the same approach I use for Office 2003 files.
The code is as follows:
SlideShow ss = new SlideShow(new HSLFSlideShow(p));
// Get the slides from the source file
Slide[] slides = ss.getSlides();
for (int a = 0; a < slides.length; a++) {
    // Get the TextRuns that hold this slide's text content
    TextRun[] tr = slides[a].getTextRuns();
    for (int i = 0; i < tr.length; i++) {
        // Append each run's text to the txt file
        new AddTxt().addtxt(path, pot, tr[i].getText(), true);
    }
}

The code throws this exception:
[color=red]org.apache.poi.poifs.filesystem.OfficeXmlFileException: The supplied data appears to be in the Office 2007+ XML. You are calling the part of POI that deals with OLE2 Office Documents. You need to call a different part of POI to process this data (eg XSSF instead of HSSF)
    at org.apache.poi.poifs.storage.HeaderBlockReader.<init>(HeaderBlockReader.java:111)
    at org.apache.poi.poifs.filesystem.POIFSFileSystem.<init>(POIFSFileSystem.java:151)
    at org.apache.poi.hslf.HSLFSlideShow.<init>(HSLFSlideShow.java:103)
    at org.apache.poi.hslf.HSLFSlideShow.<init>(HSLFSlideShow.java:91)[/color]

Concrete code would be appreciated.

1 answer

  • Accepted
    XanPeng xanpeng 2010-06-07 19:31

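    Judging from the exception, you are using [url=http://poi.apache.org/apidocs/index.html]org.apache.poi.hslf[/url]. That package only parses .ppt documents from before PowerPoint 2007, because the underlying file format of Office 2007 (OOXML) is [url=http://poi.apache.org/]fundamentally different[/url] from that of earlier versions (OLE2).
    The package that supports 2007 files is [url=http://poi.apache.org/apidocs/index.html]org.apache.poi.xslf[/url].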
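The OLE2 versus OOXML split can be checked before any POI class is chosen, because the two containers start with different bytes: an OLE2 compound document begins with the signature D0 CF 11 E0 A1 B1 1A E1, while an OOXML package is a ZIP archive and begins with PK 03 04. The following is only a sketch using the JDK alone; the class name `OfficeFormatSniffer`, the `Format` enum, and both methods are hypothetical helpers and not part of POI. Note that a ZIP signature only shows the file is a ZIP archive, which is necessary but not sufficient for it to be OOXML.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;

// Hypothetical helper, not a POI class: guesses the container format from the leading bytes.
public class OfficeFormatSniffer {
    public enum Format { OLE2, ZIP_OOXML, UNKNOWN }

    // Signature at the start of every OLE2 compound document (.ppt, .pot, ...).
    private static final int[] OLE2_MAGIC = {0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1};
    // ZIP local file header signature; OOXML files (.pptx, .potx, ...) are ZIP archives.
    private static final int[] ZIP_MAGIC = {0x50, 0x4B, 0x03, 0x04};

    // Classifies a file from its first bytes (eight bytes are enough).
    public static Format detect(byte[] head) {
        if (startsWith(head, OLE2_MAGIC)) return Format.OLE2;
        if (startsWith(head, ZIP_MAGIC)) return Format.ZIP_OOXML;
        return Format.UNKNOWN;
    }

    private static boolean startsWith(byte[] data, int[] magic) {
        if (data.length < magic.length) return false;
        for (int i = 0; i < magic.length; i++) {
            if ((data[i] & 0xFF) != magic[i]) return false;
        }
        return true;
    }

    // Reads up to eight bytes from the start of the file.
    static byte[] readHead(Path file) throws IOException {
        byte[] head = new byte[8];
        int filled = 0;
        try (InputStream in = Files.newInputStream(file)) {
            int n;
            while (filled < head.length
                    && (n = in.read(head, filled, head.length - filled)) > 0) {
                filled += n;
            }
        }
        return Arrays.copyOf(head, filled);
    }

    public static void main(String[] args) throws IOException {
        if (args.length > 0) {
            // Usage: java OfficeFormatSniffer mothersday.potx
            System.out.println(detect(readHead(Paths.get(args[0]))));
        } else {
            System.out.println(detect(new byte[] {0x50, 0x4B, 0x03, 0x04}));
        }
    }
}
```

A result of `OLE2` points to the hslf package, and `ZIP_OOXML` points to the xslf package.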
    Sample code using org.apache.poi.xslf:
    [code="java"]
    package xan.poi;

    import java.io.IOException;

    import org.apache.poi.openxml4j.exceptions.OpenXML4JException;
    import org.apache.poi.xslf.XSLFSlideShow;
    import org.apache.poi.xslf.usermodel.XMLSlideShow;
    import org.apache.poi.xslf.usermodel.XSLFSlide;
    import org.apache.xmlbeans.XmlException;
    import org.openxmlformats.schemas.drawingml.x2006.main.CTRegularTextRun;
    import org.openxmlformats.schemas.drawingml.x2006.main.CTTextBody;
    import org.openxmlformats.schemas.drawingml.x2006.main.CTTextParagraph;
    import org.openxmlformats.schemas.presentationml.x2006.main.CTGroupShape;
    import org.openxmlformats.schemas.presentationml.x2006.main.CTShape;
    import org.openxmlformats.schemas.presentationml.x2006.main.CTSlide;

    public class PPTXReader {
        public static void main(String[] args) {
            try {
                // Open the OOXML package and wrap it in the user model
                XSLFSlideShow slideShow = new XSLFSlideShow("mothersday.potx");
                XMLSlideShow xmlSlideShow = new XMLSlideShow(slideShow);
                XSLFSlide[] slides = xmlSlideShow.getSlides();
                StringBuilder sb = new StringBuilder();
                for (XSLFSlide slide : slides) {
                    // Walk the raw XML beans: slide -> shape tree -> shapes
                    CTSlide rawSlide = slide._getCTSlide();
                    CTGroupShape gs = rawSlide.getCSld().getSpTree();
                    CTShape[] shapes = gs.getSpArray();
                    for (CTShape shape : shapes) {
                        CTTextBody tb = shape.getTxBody();
                        // Skip shapes that have no text body
                        if (null == tb) continue;
                        CTTextParagraph[] paras = tb.getPArray();
                        for (CTTextParagraph textParagraph : paras) {
                            CTRegularTextRun[] textRuns = textParagraph.getRArray();
                            for (CTRegularTextRun textRun : textRuns) {
                                // Collect the text of every run
                                sb.append(textRun.getT());
                            }
                        }
                    }
                }
                System.out.println(sb.toString());
            } catch (OpenXML4JException e) {
                e.printStackTrace();
            } catch (IOException e) {
                e.printStackTrace();
            } catch (XmlException e) {
                e.printStackTrace();
            }
        }
    }
    [/code]
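To make it clearer what the sample above walks through, here is a rough sketch that reads the same kind of text runs with only the JDK and no POI at all. It relies on two assumptions: that slide parts are stored under the conventional names `ppt/slides/slideN.xml` (a strict reader would resolve them through the package relationships instead), and that run text sits in DrawingML `<a:t>` elements, which is what `CTRegularTextRun.getT()` returns. The class `SlideTextSketch` and its methods are hypothetical, and `samplePackage()` only builds a tiny in-memory stand-in for a real potx file.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

// Hypothetical helper: pulls <a:t> text out of slide parts of an OOXML package.
public class SlideTextSketch {
    private static final String DRAWINGML_NS =
            "http://schemas.openxmlformats.org/drawingml/2006/main";

    // Returns the text of every run, in the order the slide entries appear in the ZIP.
    public static List<String> extractRuns(byte[] packageBytes) {
        List<String> runs = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(packageBytes))) {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            // Refuse DOCTYPE declarations, since the file may be untrusted
            factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                String name = entry.getName();
                // Assumed naming convention for slide parts
                if (!name.startsWith("ppt/slides/slide") || !name.endsWith(".xml")) continue;
                // Buffer the entry so the XML parser cannot close the ZIP stream
                ByteArrayOutputStream xml = new ByteArrayOutputStream();
                byte[] buf = new byte[4096];
                int n;
                while ((n = zip.read(buf)) > 0) xml.write(buf, 0, n);
                Document doc = factory.newDocumentBuilder()
                        .parse(new ByteArrayInputStream(xml.toByteArray()));
                NodeList texts = doc.getElementsByTagNameNS(DRAWINGML_NS, "t");
                for (int i = 0; i < texts.getLength(); i++) {
                    runs.add(texts.item(i).getTextContent());
                }
            }
        } catch (IOException | ParserConfigurationException | SAXException e) {
            throw new IllegalArgumentException("Not a readable OOXML package", e);
        }
        return runs;
    }

    // Builds a minimal fake package with one slide part, for demonstration only.
    public static byte[] samplePackage() {
        String slide = "<p:sld xmlns:p=\"http://schemas.openxmlformats.org/presentationml/2006/main\""
                + " xmlns:a=\"" + DRAWINGML_NS + "\"><a:p><a:r><a:t>Hello</a:t></a:r>"
                + "<a:r><a:t>potx</a:t></a:r></a:p></p:sld>";
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(out)) {
            zip.putNextEntry(new ZipEntry("ppt/slides/slide1.xml"));
            zip.write(slide.getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        System.out.println(extractRuns(samplePackage())); // prints [Hello, potx]
    }
}
```

For a real file, the bytes could come from `java.nio.file.Files.readAllBytes`. This sketch ignores text in layouts, masters, and notes, and it does not sort slides by number, so POI remains the more practical choice for anything beyond a quick look.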
