dongmie3987067
2018-03-13 13:52
Views: 37
Accepted

Cannot use preg_replace on a large string

I'm trying to remove a word that sits inside a <note> tag. I have a fairly long string:

string(4687) "~~PB~~ {{:en:iot-open:remotelab:logotyp_1_.png?200|}} <note>testtest</note> ====== RoofTop Thermo Laboratory - intelligent house and heating management ====== The laboratory is located at nowhere, xxx, xxxxx on the roof of bu...... => dumped result

The problem is that it won't remove the testtest string from between the note tags.

I'm trying to use this function that I found in the manual of strip_tags.

function strip_tags_content($text, $tags = '', $invert = FALSE) {

  preg_match_all('/<(.+?)[\s]*\/?[\s]*>/si', trim($tags), $tags);
  $tags = array_unique($tags[1]);

  if(is_array($tags) AND count($tags) > 0) {
    if($invert == FALSE) {
      return preg_replace('@<(?!(?:'. implode('|', $tags) .')\b)(\w+)\b.*?>.*?</\1>@si', '', $text);
    }
    else {
      return preg_replace('@<('. implode('|', $tags) .')\b.*?>.*?</\1>@si', '', $text);
    }
  }
  elseif($invert == FALSE) {
    return preg_replace('@<(\w+)\b.*?>.*?</\1>@si', '', $text);
  }
  return $text;
}

Here is my full code

foreach ($data as $line)
        {
            // Find list tag
            $posi = strpos($line, "* ");

            // No list ?
            if ($posi === false) {
                continue;
            }

            // Check indent
            if (($posi % 2) != 0){
                //echo "<li>Invalid indentation in TOC</li>\n";
            }

            // Calculate indent
            $indent = ($posi - 2) / 2;
            // Search for header
            $posh = strpos($line, "]]");

            // No header ?
            if ($posh === false) {
                continue;
            }
            // Extract file path
            $page_path = substr($line, $posi + 4, $posh - $posi - 4);
            $file_path = str_replace(":", "/", $page_path);
            $file_path = $this->getConf("homelab_datapages_folder").$file_path.".txt";
            $indent2 = 0;


            // Page file exists ?
            if (file_exists($file_path))
            {
                // Open file
                $page_content = htmlspecialchars(file_get_contents($file_path));
                $page_content = $this->strip_tags_content($page_content,'note',TRUE);
                $page_cont = strip_tags(html_entity_decode($page_content));
                // Shorten header
                $book_content .= $this->shorten_header($page_content, $indent, $indent2)."\n";

                var_dump($book_content);
                //$book_content .=
            }
            else
            {
                $book_content .= "---\n MISSING PAGE ---\n";
            }

            // Display page
            //echo "    <li>".$page_path." (".$indent.")</li>\n";
        }

What could be the problem?

Is my string too long to use preg_replace, or am I making a mistake here?


2 answers

  • dousui8263 2018-03-14 16:42
    Accepted

    I got it working.

    The problem was with the htmlspecialchars() function.

    $page_content = htmlspecialchars(file_get_contents($file_path));

    to

    $page_content = file_get_contents($file_path);
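
A minimal sketch of why this change matters, assuming the pattern is the one the question's function builds for the `note` tag. `htmlspecialchars()` converts `<` and `>` into `&lt;` and `&gt;`, so the escaped text no longer contains a literal `<note>` for the regular expression to match. The sample string below is a shortened, hypothetical stand-in for the real page content.

```php
<?php
// Hypothetical sample input, modelled on the dumped string in the question.
$raw = '~~PB~~ <note>testtest</note> ====== RoofTop Thermo Laboratory ======';

// The pattern the question's function builds when the tag list is ['note'] and $invert is TRUE.
$pattern = '@<(note)\b.*?>.*?</\1>@si';

// After escaping there is no literal "<" left, so the pattern cannot match.
$escaped = htmlspecialchars($raw);
$fromEscaped = preg_replace($pattern, '', $escaped);

// On the raw text the whole <note>...</note> element is removed.
$fromRaw = preg_replace($pattern, '', $raw);

var_dump($fromEscaped === $escaped);               // true: nothing was removed
var_dump(strpos($fromRaw, 'testtest') === false);  // true: the note content is gone
```

If the output still has to be escaped for display, calling `htmlspecialchars()` after the tags have been stripped would avoid the problem.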
    
  • douyong5476 2018-03-13 17:13

    When you're calling

    $this->strip_tags_content($page_content,'note',TRUE);
    

    the preg_match_all call leaves $tags as an empty array, so all the tests that follow are false and the return value is always $text, without any modification.

    Call the function with the tag name in angle brackets:

    $this->strip_tags_content($page_content,'<note>',TRUE);
    //                                       ^____^
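
The difference between the two calls can be checked in isolation. This is a small sketch that reuses only the tag-extraction pattern from the first line of the question's function; the variable names `$bare` and `$bracketed` are illustrative.

```php
<?php
// Same tag-extraction pattern as the first line of strip_tags_content() in the question.
$pattern = '/<(.+?)[\s]*\/?[\s]*>/si';

// Bare name: there is no "<...>" to match, so capture group 1 stays empty.
preg_match_all($pattern, trim('note'), $bare);

// Bracketed name: capture group 1 holds the tag name.
preg_match_all($pattern, trim('<note>'), $bracketed);

var_dump(array_unique($bare[1]));      // empty array
var_dump(array_unique($bracketed[1])); // array containing "note"
```

With an empty tag list and `$invert` set to `TRUE`, neither branch of the function runs, so the text comes back unchanged.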
    
