dongqiu3709 2017-04-14 05:38
Views: 45
Accepted

Replace links in text according to a rule

I have some text, and I want to replace every "www.domain.com" link that contains no "?" symbol.

www.domain.com dsa dsad sad sad sa domain.com asdasds adas dsa www.domain.com/someurl/?d sad sadsad www.domain.com/someurl/ asd asd sa www.domain.com?id=123 sd asdsa d

So I search the text with preg_match_all() and find all links without a "?". I then loop over them, but when I run str_replace() it replaces every occurrence of "domain.com" at once, even the ones with a "?", and on the next iteration it appends another "add_text" to the already replaced domain.com, so I end up with "domain.com?add_text?add_text" and so on. I have the start position of each match from PREG_OFFSET_CAPTURE, but I don't know whether it helps me. Thanks

$post_content = 'www.domain.com dsa dsad sad sad sa
domain.com asdasds adas dsa
www.domain.com/someurl/?d sad sadsad
www.domain.com/someurl/ asd asd sa
www.domain.com?id=123 sd asdsa d'.'<hr>';

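$pattern = '#(www\.|https?:\/\/)?(domain\.com)\S*#i';
if (preg_match_all($pattern, $post_content, $out, PREG_OFFSET_CAPTURE)) {
    foreach ($out[0] as $k => $v) {
        // With PREG_OFFSET_CAPTURE each match is [matched text, offset].
        if (strpos($v[0], '?') !== false) {
            // skip: this URL already has a query string
        } else {
            // replace: str_replace() rewrites every occurrence of $v[0] in the
            // text, including substrings of other URLs, which is the problem
            $post_content = str_replace($v[0], $v[0].'?add_text', $post_content);
        }
    }
}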

Input:

www.domain.com dsa dsad sad sad sa domain.com asdasds adas dsa www.domain.com/someurl/?d sad sadsad www.domain.com/someurl/ asd asd sa www.domain.com?id=123 sd asdsa d

Expected Output:

www.domain.com?add_text dsa dsad sad sad sa domain.com?add_text asdasds adas dsa www.domain.com/someurl/?d sad sadsad www.domain.com/someurl/?add_text asd asd sa www.domain.com?id=123 sd asdsa d

So every URL should end up with a GET parameter: a URL with no "?" must get ?add_text appended, and a URL that already has ?something is skipped.
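On whether the start positions from PREG_OFFSET_CAPTURE help: they can. The following is a minimal sketch, not the asker's code, assuming PHP 7.1+ (for array destructuring in foreach). The helper name append_param_by_offset and its $suffix parameter are hypothetical, and the pattern is a slightly tightened variant of the one in the question. It inserts the suffix at each match's own offset instead of doing a global str_replace(), and walks the matches from last to first so earlier offsets stay valid.

```php
<?php
// Hypothetical helper (not from the original post): appends $suffix to every
// matched URL that does not already contain a "?".
function append_param_by_offset(string $text, string $suffix = '?add_text'): string
{
    // Assumed pattern: optional scheme, optional "www.", then the rest of the URL.
    $pattern = '#(?:https?://)?(?:www\.)?domain\.com\S*#i';
    if (!preg_match_all($pattern, $text, $matches, PREG_OFFSET_CAPTURE)) {
        return $text; // nothing matched (0) or the regex failed (false)
    }
    // Process matches from last to first so that inserting text does not
    // shift the byte offsets of the matches still to be processed.
    foreach (array_reverse($matches[0]) as [$url, $offset]) {
        if (strpos($url, '?') === false) {
            // Length 0 makes substr_replace() a pure insertion right after
            // this single occurrence of the URL.
            $text = substr_replace($text, $suffix, $offset + strlen($url), 0);
        }
    }
    return $text;
}

$post_content = "www.domain.com dsa dsad sad sad sa\n"
    . "domain.com asdasds adas dsa\n"
    . "www.domain.com/someurl/?d sad sadsad\n"
    . "www.domain.com/someurl/ asd asd sa\n"
    . "www.domain.com?id=123 sd asdsa d";

echo append_param_by_offset($post_content), PHP_EOL;
```

Because each insertion targets one offset, a short URL such as www.domain.com is never rewritten inside a longer one such as www.domain.com/someurl/?d.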

2 answers

  • doumen9709 2017-04-14 06:27

    Your approach is fundamentally flawed: str_replace() replaces every occurrence of the search string, including occurrences that are substrings of longer URLs, so the same data gets replaced multiple times and ends up corrupted. Try using preg_replace() instead:

    <?php
    $post_content = 'www.domain.com dsa dsad sad sad sa
    domain.com asdasds adas dsa
    www.domain.com/someurl/?d sad sadsad
    www.domain.com/someurl/ asd asd sa
    www.domain.com?id=123 sd asdsa d'.'<hr>';
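    // (?!\S*\?) is a negative lookahead: skip any URL whose remaining
    // non-whitespace characters contain a "?".
    $pattern = '/((?:https?:\/\/)?(?:www\.)?domain\.com(?!\S*\?))(\S*)/im';
    // Single quotes keep $1 and $2 as literal backreferences for preg_replace().
    $post_content = preg_replace($pattern, '$1$2?add_text', $post_content);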
    echo $post_content;
    

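    The regular expression gets a bit tricky: the negative lookahead assertion (?!\S*\?) rejects any URL that has a question mark anywhere before the next whitespace. In short, group 1 captures the optional scheme, the optional www. and domain.com; group 2 captures the rest of the URL; ?add_text is then appended after both.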
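To see the lookahead at work, here is a small sketch that runs the answer's pattern against each sample URL from the question on its own. The $samples and $results arrays are introduced only for this illustration.

```php
<?php
// Pattern copied unchanged from the answer.
$pattern = '/((?:https?:\/\/)?(?:www\.)?domain\.com(?!\S*\?))(\S*)/im';

// The five URLs that appear in the question's sample text.
$samples = [
    'www.domain.com',
    'domain.com',
    'www.domain.com/someurl/?d',
    'www.domain.com/someurl/',
    'www.domain.com?id=123',
];

$results = [];
foreach ($samples as $url) {
    // preg_match() returns 1 when the pattern matches and 0 when it does not.
    $results[$url] = preg_match($pattern, $url) === 1;
    printf("%-28s %s\n", $url, $results[$url] ? 'rewritten' : 'left as-is');
}
```

Only the URLs without a "?" match, so only those are rewritten by preg_replace().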

    This answer was selected by the asker as the best answer.