dousuitang5239 2011-05-16 20:13
Views: 133

tidy - How to remove duplicate IDs from HTML

I have some HTML that needs to be parsed with DOMDocument::loadHTML($html), but it gives me an error:

DOMDocument::loadHTML(): ID 'my id' already defined in Entity

I have no control over $html, but I can run the tidy library (or something else; ideas welcome) on it to produce parseable HTML. However, I can't find an option in tidy's configuration to remove duplicate IDs. My code looks like this:

$tidy = new tidy();
$tidy->parseString($this->getPageContents());
$html = new DOMDocument();
$html->loadHTML($tidy); // error here

Thx


1 answer

  • dth981485742 2011-05-16 20:49

    Try

    $html->loadXML($tidy);

    and then rewrite the IDs using the XML DOM before parsing the result as an HTML DOM.
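The "rewrite the IDs through the DOM" step could look roughly like the sketch below. It is a variant of the suggestion above, not the answerer's exact method: it uses loadHTML() rather than loadXML(), on the assumption that the duplicate-ID message is only a libxml warning and the document is still parsed, so the warning can be collected with libxml_use_internal_errors() and the IDs fixed afterwards. The function name dedupeIds and the "-2", "-3" suffix scheme are made up for illustration.

```php
<?php
// Hypothetical helper (not part of tidy or DOM): parse the markup, keep the
// first occurrence of each id, and rename later duplicates with a numeric suffix.
function dedupeIds(string $markup): DOMDocument
{
    // Collect libxml warnings (such as "ID ... already defined") internally
    // instead of emitting them as PHP warnings.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($markup);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $seen = [];
    $xpath = new DOMXPath($doc);
    foreach ($xpath->query('//*[@id]') as $element) {
        $id = $element->getAttribute('id');
        if (!isset($seen[$id])) {
            $seen[$id] = 1; // first occurrence keeps its id
            continue;
        }
        // Duplicate: try "id-2", "id-3", ... until the candidate is unused.
        do {
            $candidate = $id . '-' . (++$seen[$id]);
        } while (isset($seen[$candidate]));
        $seen[$candidate] = 1;
        $element->setAttribute('id', $candidate);
    }

    return $doc;
}
```

With the code from the question, this would be called as `$html = dedupeIds((string) $tidy);` or directly on the raw page contents. Renaming rather than removing the duplicates keeps every element addressable, but any CSS or script in the page that relies on the repeated id will only match the first element afterwards.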
