dongying3744 2009-04-04 14:22

Extract everything between <object> and </object>

I am using cURL to download a page. Now I want to extract this block from the page:

<object classid="clsid:67DABFBF-D0AB-41fa-9C46-CC0F21721616" width="640"
        height="303.33333333333"
        codebase="http://go.divx.com/plugin/DivXBrowserPlugin.cab"
        id="object701207571">
    <param name="autoPlay" value="false" />
    <param name="custommode" value="Stage6" />
    <param name="src" value="" />
    <param name="movieTitle" value="Titanic" />
    <param name="bannerEnabled" value="false" />
    <param name="previewImage" 
           value="http://stagevu.com/img/thumbnail/oripmqeqzrccbig.jpg" />
    <embed type="video/divx" src="" width="640" height="303.33333333333"
           autoPlay="false" custommode="Stage6" movieTitle="Titanic"
           bannerEnabled="false"
           previewImage="http://stagevu.com/img/thumbnail/oripmqeqzrccbig.jpg"
           pluginspage="http://go.divx.com/plugin/download/"
           id="embed701207571">
    </embed>
</object>

Please help!

5 answers

  • dtot74529 2009-04-04 14:33

    See "Can you provide some examples of why it is hard to parse XML and HTML with a regex?" for why this is probably the wrong thing to do.

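    That said, you might be able to get away with something like /(<object\b[^>]*>.*?<\/object>)/s. This matches an opening "<object" tag together with any attributes it carries (a literal "<object>" would miss the tag in the question, which has several), followed by as few characters as possible up to the first "</object>". The s on the end tells . to match newlines (it normally doesn't).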
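
    As a rough illustration of both routes, here is a minimal Python sketch. It is not from the original thread: the names extract_objects_regex, ObjectExtractor, extract_objects_parser and SAMPLE are hypothetical, and SAMPLE is a trimmed stand-in for the page in the question. The first function applies the regex idea above; the second uses the standard library's html.parser to track nesting, which a non-greedy regex cannot do.

```python
import re
from html.parser import HTMLParser

# The opening tag carries attributes, so accept anything up to its closing ">".
# re.DOTALL lets "." match newlines; re.IGNORECASE also accepts <OBJECT>.
OBJECT_RE = re.compile(r"<object\b[^>]*>.*?</object\s*>", re.DOTALL | re.IGNORECASE)


def extract_objects_regex(html):
    """Return each <object>...</object> block; a match ends at the first </object>."""
    return OBJECT_RE.findall(html)


class ObjectExtractor(HTMLParser):
    """Collect the source text of every outermost <object> element."""

    def __init__(self, html):
        super().__init__()
        self._html = html
        # Absolute offset at which each line starts, to convert getpos() results.
        self._line_starts = [0] + [m.end() for m in re.finditer("\n", html)]
        self._depth = 0
        self._start = 0
        self.blocks = []

    def _offset(self):
        line, column = self.getpos()  # position of the tag being handled
        return self._line_starts[line - 1] + column

    def handle_starttag(self, tag, attrs):
        if tag == "object":
            if self._depth == 0:
                self._start = self._offset()
            self._depth += 1

    def handle_endtag(self, tag):
        if tag == "object" and self._depth > 0:
            self._depth -= 1
            if self._depth == 0:
                # Slice from the outermost opening tag to the end of this closing tag.
                end = self._html.index(">", self._offset()) + 1
                self.blocks.append(self._html[self._start:end])


def extract_objects_parser(html):
    """Parser-based variant: handles nested <object> elements."""
    parser = ObjectExtractor(html)
    parser.feed(html)
    parser.close()
    return parser.blocks


# Trimmed stand-in for the downloaded page (assumption, not the real page).
SAMPLE = (
    "<p>before</p>\n"
    '<object classid="clsid:67DABFBF-D0AB-41fa-9C46-CC0F21721616" width="640">\n'
    '    <param name="movieTitle" value="Titanic" />\n'
    '    <embed type="video/divx" src="" width="640"></embed>\n'
    "</object>\n"
    "<p>after</p>"
)

if __name__ == "__main__":
    for block in extract_objects_regex(SAMPLE):
        print(block)
```

    On simple pages the two functions should agree. They differ once <object> elements are nested: the regex stops at the first closing tag, while the parser-based version returns the whole outer element.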
