dpp34603 2016-08-14 19:29

Get the article content from a given URL

Given a page's contents (its HTML), how could I get the contents of the article?

For example, this website returns the contents of articles given a URL:

http://embed.ly/docs/explore/extract?url=http%3A%2F%2Fwww.foxnews.com%2Fsports%2F2016%2F08%2F14%2Fryan-lochte-3-other-u-s-swimmers-robbed-in-brazil.html

However, I don't want to use their API. I've used file_get_contents($url), but I have no idea how I would go about getting the contents of just the article.

Any ideas?

1 answer

  • duanjing4667 2016-08-14 19:39
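    $url = 'http://www.foxnews.com/sports/2016/08/14/ryan-lochte-3-other-u-s-swimmers-robbed-in-brazil.html';
    $content = file_get_contents($url);
    if ($content === false) {
        exit("Could not fetch $url\n");
    }
    // Keep only what follows the opening tag of the article container (a site-specific class name).
    $first_step = explode('<div class="article-text">', $content, 2);
    if (count($first_step) < 2) {
        exit("Article container not found\n");
    }
    $paras = explode('<p>', $first_step[1]);
    array_shift($paras); // the first chunk is whatever precedes the first <p>
    foreach ($paras as $para) {
        // Cut each chunk at its closing tag so trailing page markup is not echoed.
        echo '<p>', explode('</p>', $para, 2)[0], "</p>\n";
    }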
    

    If you also want the images, split on the <article> tag instead of the div, since that is the element their DOM structure uses to wrap the article.
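
The string splitting above depends on the exact text of the opening tags, so it stops matching as soon as the markup gains an attribute or the class list changes. A minimal sketch of the same idea using PHP's bundled DOM extension is shown here. The function name `extractArticleParagraphs` is hypothetical, the `article-text` class is taken from the answer above and is specific to that site, and the inline sample HTML is made up so the sketch runs without network access.

```php
<?php
// Hypothetical helper: returns the text of every <p> found inside a <div>
// whose class list contains $class. 'article-text' is an assumption taken
// from the answer above; other sites use different markup.
function extractArticleParagraphs(string $html, string $class = 'article-text'): array
{
    $doc = new DOMDocument();
    // Real-world HTML is rarely valid; collect parser warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    // Match the class as a whole word, so class="article-text extra" also matches.
    $query = sprintf(
        '//div[contains(concat(" ", normalize-space(@class), " "), " %s ")]//p',
        $class
    );

    $paragraphs = [];
    foreach ($xpath->query($query) as $p) {
        $text = trim($p->textContent);
        if ($text !== '') {
            $paragraphs[] = $text;
        }
    }
    return $paragraphs;
}

// Made-up sample page: only the two paragraphs inside the article div should be returned.
$sample = '<html><body><div class="nav"><p>Menu</p></div>'
        . '<div class="article-text"><p>First paragraph.</p><p>Second <b>one</b>.</p></div>'
        . '<p>Footer</p></body></html>';

foreach (extractArticleParagraphs($sample) as $paragraph) {
    echo $paragraph, "\n";
}
```

For a live page, the string returned by file_get_contents($url) would be passed in place of `$sample`. A selector like this still has to be written per site; it does not detect the article body automatically the way the service in the question does.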
