dongmie3987067 2018-03-13 13:52

Unable to use preg_replace on a large string

I'm trying to remove some text that sits inside a <note> tag. I have quite a long string:

string(4687) "~~PB~~ {{:en:iot-open:remotelab:logotyp_1_.png?200|}} <note>testtest</note> ====== RoofTop Thermo Laboratory - intelligent house and heating management ====== The laboratory is located at nowhere, xxx, xxxxx on the roof of bu...... => dumped result

The problem is that it won't remove the testtest string from between the note tags.

I'm trying to use this function, which I found in the manual for strip_tags:

function strip_tags_content($text, $tags = '', $invert = FALSE) {

  preg_match_all('/<(.+?)[\s]*\/?[\s]*>/si', trim($tags), $tags);
  $tags = array_unique($tags[1]);

  if(is_array($tags) AND count($tags) > 0) {
    if($invert == FALSE) {
      return preg_replace('@<(?!(?:'. implode('|', $tags) .')\b)(\w+)\b.*?>.*?</\1>@si', '', $text);
    }
    else {
      return preg_replace('@<('. implode('|', $tags) .')\b.*?>.*?</\1>@si', '', $text);
    }
  }
  elseif($invert == FALSE) {
    return preg_replace('@<(\w+)\b.*?>.*?</\1>@si', '', $text);
  }
  return $text;
}

Here is my full code

foreach ($data as $line)
{
    // Find list tag
    $posi = strpos($line, "* ");

    // No list ?
    if ($posi === false) {
        continue;
    }

    // Check indent
    if (($posi % 2) != 0) {
        //echo "<li>Invalid indentation in TOC</li>\n";
    }

    // Calculate indent
    $indent = ($posi - 2) / 2;
    // Search for header
    $posh = strpos($line, "]]");

    // No header ?
    if ($posh === false) {
        continue;
    }
    // Extract file path
    $page_path = substr($line, $posi + 4, $posh - $posi - 4);
    $file_path = str_replace(":", "/", $page_path);
    $file_path = $this->getConf("homelab_datapages_folder").$file_path.".txt";
    $indent2 = 0;

    // Page file exists ?
    if (file_exists($file_path))
    {
        // Open file
        $page_content = htmlspecialchars(file_get_contents($file_path));
        $page_content = $this->strip_tags_content($page_content,'note',TRUE);
        $page_cont = strip_tags(html_entity_decode($page_content));
        // Shorten header
        $book_content .= $this->shorten_header($page_content, $indent, $indent2)."\n";

        var_dump($book_content);
        //$book_content .=
    }
    else
    {
        $book_content .= "---\n MISSING PAGE ---\n";
    }

    // Display page
    //echo "    <li>".$page_path." (".$indent.")</li>\n";
}

What could be the problem?

Is my string too long for preg_replace, or am I making a mistake here?
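
On the length question, one way to rule out an engine limit is to inspect the return value: preg_replace() returns null on failure, and preg_last_error() reports the reason. The sketch below is only an illustration; the sample input and the simplified single-tag pattern are assumptions, not the data or the exact call from the code above.

```php
<?php
// Hypothetical input: a <note> element followed by padding,
// much longer than the 4687-byte string dumped above.
$text = '<note>testtest</note>' . str_repeat('lorem ipsum ', 10000);

// Simplified pattern of the same shape as the one strip_tags_content() builds for one tag.
$result = preg_replace('@<(note)\b.*?>.*?</\1>@si', '', $text);

if ($result === null) {
    // A PCRE limit (for example pcre.backtrack_limit) or another error was hit.
    echo 'preg_replace failed, error code: ', preg_last_error(), PHP_EOL;
} else {
    echo 'bytes before: ', strlen($text), ', after: ', strlen($result), PHP_EOL;
}
```

If the call returns a string and the text is simply unchanged, the pattern did not match, which points at the input or the pattern rather than at the string length.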


  • dousui8263 2018-03-14 16:42

    I got it working.

    The problem was the htmlspecialchars() call, which escapes < and > as &lt; and &gt;, so the regex never sees a literal <note> tag. I changed

    $page_content = htmlspecialchars(file_get_contents($file_path));

    to

    $page_content = file_get_contents($file_path);
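
A minimal sketch of why this change matters, assuming a shortened sample string and a simplified single-tag pattern of the same shape as the one in strip_tags_content() (both are illustrative, not the actual page data): once htmlspecialchars() has run, the text contains &lt;note&gt; instead of a literal tag, so the pattern has nothing to match.

```php
<?php
// Hypothetical sample, shortened from the dumped string in the question.
$raw = '~~PB~~ <note>testtest</note> ====== RoofTop Thermo Laboratory ======';

// Same shape as the pattern strip_tags_content() builds for a single tag.
$pattern = '@<(note)\b.*?>.*?</\1>@si';

// Escaped input: "<note>" has become "&lt;note&gt;", so nothing is removed.
$escaped = htmlspecialchars($raw);
$fromEscaped = preg_replace($pattern, '', $escaped);

// Raw input: the tag and its content are removed.
$fromRaw = preg_replace($pattern, '', $raw);

echo $fromEscaped, PHP_EOL;
echo $fromRaw, PHP_EOL;
```

One more thing that appears worth checking: strip_tags_content() extracts tag names from its second argument with the pattern /<(.+?)[\s]*\/?[\s]*>/, so the tag seems to need its angle brackets, as in '<note>'. With a bare 'note' the tag list comes out empty, and with $invert set to TRUE the function then returns the text unchanged.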
    
    The asker accepted this answer as the best answer.
