dongqiu3709 2017-04-14 05:38
Views: 45
Accepted

Replace links in text according to a rule

I have some text, and I want to modify every "domain.com" URL in it that does not contain a "?" symbol.

www.domain.com dsa dsad sad sad sa domain.com asdasds adas dsa www.domain.com/someurl/?d sad sadsad www.domain.com/someurl/ asd asd sa www.domain.com?id=123 sd asdsa d

I search the text with preg_match_all() and find all links without a "?". Then I loop over the matches, but when I call str_replace() it replaces every occurrence of "domain.com" at once, even the ones that already have a "?". On the next iteration it appends another "add_text" to the already replaced domain.com, so I end up with "domain.com?add_text?add_text" and so on. I have the start position of each match from PREG_OFFSET_CAPTURE, but I don't know whether it helps me. Thanks.

$post_content = 'www.domain.com dsa dsad sad sad sa
domain.com asdasds adas dsa
www.domain.com/someurl/?d sad sadsad
www.domain.com/someurl/ asd asd sa
www.domain.com?id=123 sd asdsa d'.'<hr>';

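$pattern = '#(www\.|https?:\/\/)?(domain.com)\S*#i';
if ($num_found = preg_match_all($pattern, $post_content, $out, PREG_OFFSET_CAPTURE)) {
    if ($num_found > 0) {
        foreach ($out[0] as $k => $v) {
            // With PREG_OFFSET_CAPTURE, each $v is array(matched text, offset)
            if (strpos($v[0], '?') !== false) {
                // skip
            } else {
                // replace: str_replace() changes every occurrence of $v[0],
                // which is the problem described above
                $post_content = str_replace($v[0], $v[0].'?add_text', $post_content);
            }
        }
    }
}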

Input:

www.domain.com dsa dsad sad sad sa domain.com asdasds adas dsa www.domain.com/someurl/?d sad sadsad www.domain.com/someurl/ asd asd sa www.domain.com?id=123 sd asdsa d

Expected Output:

www.domain.com?add_text dsa dsad sad sad sa domain.com?add_text asdasds adas dsa www.domain.com/someurl/?d sad sadsad www.domain.com/someurl/?add_text asd asd sa www.domain.com?id=123 sd asdsa d

So every URL should end up with a GET parameter: a URL with no "?" must get ?add_text appended, and a URL that already has ?something is skipped.

2 answers

  • doumen9709 2017-04-14 06:27

    Your approach is fundamentally flawed because str_replace() replaces every occurrence of the search string, including occurrences that are substrings of other URLs. You would likely end up with text being replaced multiple times and getting corrupted. Try using preg_replace() instead:

    <?php
    $post_content = 'www.domain.com dsa dsad sad sad sa
    domain.com asdasds adas dsa
    www.domain.com/someurl/?d sad sadsad
    www.domain.com/someurl/ asd asd sa
    www.domain.com?id=123 sd asdsa d'.'<hr>';
    $pattern = '/((?:https?:\/\/)?(?:www\.)?domain\.com(?!\S*\?))(\S*)/im';
    $post_content = preg_replace($pattern, "$1$2?add_text", $post_content);
    echo $post_content;
    

    The regular expression is a bit tricky: the negative lookahead assertion (?!\S*\?) rejects any match where a question mark appears later in the same whitespace-delimited URL.
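
If the lookahead feels too dense, the same rule can be sketched in two other ways. The helper names `addParamWithCallback` and `addParamWithOffsets` are hypothetical (they are not part of the original answer), and the pattern is a simplified assumption that treats any run of non-whitespace after domain.com as part of the URL. The first variant matches every URL and decides per match in a callback; the second reuses the offsets from PREG_OFFSET_CAPTURE that the question mentions.

```php
<?php
// Hypothetical helpers for illustration only; names and pattern are assumptions.

// Variant 1: match every domain.com URL, then decide inside a callback.
function addParamWithCallback(string $text, string $param = 'add_text'): string
{
    $pattern = '~(?:https?://)?(?:www\.)?domain\.com\S*~i';
    return preg_replace_callback(
        $pattern,
        function (array $m) use ($param): string {
            $url = $m[0];
            // Leave URLs that already contain a query string untouched.
            return strpos($url, '?') === false ? $url . '?' . $param : $url;
        },
        $text
    );
}

// Variant 2: use the byte offsets reported by PREG_OFFSET_CAPTURE.
function addParamWithOffsets(string $text, string $param = 'add_text'): string
{
    $pattern = '~(?:https?://)?(?:www\.)?domain\.com\S*~i';
    if (!preg_match_all($pattern, $text, $matches, PREG_OFFSET_CAPTURE)) {
        return $text;
    }
    // Walk from the last match to the first so earlier offsets stay valid
    // after each insertion.
    foreach (array_reverse($matches[0]) as [$url, $offset]) {
        if (strpos($url, '?') === false) {
            // Length 0 means "insert" rather than "overwrite".
            $text = substr_replace($text, '?' . $param, $offset + strlen($url), 0);
        }
    }
    return $text;
}

$input = "www.domain.com dsa\ndomain.com asd\nwww.domain.com/someurl/?d sad\n"
       . "www.domain.com/someurl/ asd\nwww.domain.com?id=123 sd";
echo addParamWithCallback($input), "\n";
```

Both variants touch each match exactly once, so running them never produces "?add_text?add_text" within a single pass. The callback version is probably the easier one to extend if the rule for skipping a URL becomes more involved.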

    This answer was selected as the best answer by the asker.