douna1892 2016-10-27 04:59
Answer accepted

PHP: Test for a text string inside HTML tags in a string from file_get_contents

I need to perform a series of tests on a URL. The first test is a word count. I have that working, and the code is below:

if (isset($_GET['article_url'])) {
    $title = 'This is an example title';
    // file_get_contents() returns false on failure; @ only hides the warning
    $str = @file_get_contents($_GET['article_url']);
    // Count the words in the visible text; a failed fetch counts as 0 words
    $test1 = ($str === false) ? 0 : str_word_count(strip_tags(strtolower($str)));
    // Note: the threshold checked here (550) differs from the 500 stated in the message below
    if ($test1 > 550) {
        echo '<div><i class="fa fa-check-square-o" style="color:green"></i> This article has '.$test1.' words.</div>';
    } else {
        echo '<div><i class="fa fa-times-circle-o" style="color:red"></i> This article has '.$test1.' words. You are required to have a minimum of 500 words.</div>';
    }
}

Next I need to get all h1 and h2 tags from $str and check whether any of them contains the text in $title, echoing "yes" if so and "no" if not. I am not sure how to go about this.

I am looking for a pure PHP way of doing this, without installing PHP libraries or third-party functions.


1 answer

  • dongyizhuang0134 2016-10-27 05:06

    Please try the code below.

    if (isset($_GET['article_url'])) {
        $title = 'This is an example title';
        $str = @file_get_contents($_GET['article_url']);

        $found = false;
        // Skip parsing when the fetch failed or returned nothing
        if ($str !== false && $str !== '') {
            // Real-world HTML is often invalid; collect parser warnings instead of printing them
            libxml_use_internal_errors(true);
            $document = new DOMDocument();
            $document->loadHTML($str);
            libxml_clear_errors();

            foreach (array('h1', 'h2') as $tag) {
                // Fetch every element in the DOM with this tag name
                foreach ($document->getElementsByTagName($tag) as $element) {
                    // Case-insensitive "contains" test on the heading text
                    if (stripos($element->textContent, $title) !== false) {
                        $found = true;
                        break 2;
                    }
                }
            }
        }
        echo $found ? 'yes' : 'no';
    }
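
The same check can be wrapped in a function so it can be tried on a literal HTML string without fetching a URL. This is only a minimal sketch: `headingContains` and `$sample` are hypothetical names, not part of the answer above. It assumes PHP 7 or later with the bundled DOM extension enabled, and that matching on plain ASCII text is sufficient. Unlike the code above, it also collapses whitespace, so a heading that wraps across lines or contains nested tags can still match.

```php
<?php
// Hypothetical helper (an illustration, not the original answer's code):
// reports whether any of the given heading tags contains $needle,
// ignoring case and runs of whitespace.
function headingContains(string $html, string $needle, array $tags = ['h1', 'h2']): bool
{
    if (trim($html) === '') {
        return false; // nothing to parse, e.g. the fetch failed
    }

    // Collect parser warnings for malformed markup instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    $document = new DOMDocument();
    $document->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Collapse runs of whitespace so line breaks inside a heading do not matter.
    $normalize = function (string $text): string {
        return trim(preg_replace('/\s+/', ' ', $text));
    };

    $needle = $normalize($needle);
    if ($needle === '') {
        return false; // an empty search string never counts as a match
    }

    foreach ($tags as $tag) {
        foreach ($document->getElementsByTagName($tag) as $element) {
            // textContent already drops nested tags such as <em>
            if (stripos($normalize($element->textContent), $needle) !== false) {
                return true;
            }
        }
    }
    return false;
}

// Sample input: the title appears in an h2, split across lines and tags.
$sample = '<html><body><h1>Intro</h1><h2>This is an
    <em>Example</em> Title, part 2</h2><p>Body text</p></body></html>';

echo headingContains($sample, 'This is an example title') ? 'yes' : 'no'; // prints "yes"
```

With a real page, the string returned by file_get_contents would be passed in place of `$sample`, after checking that it is not false.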
    
    This answer was selected by the asker as the best answer.
