dongma6326 2019-05-23 13:19

PHP regex: replace multiple patterns with a callback

I'm trying to run a simple replacement on some input data that could be described as follows:

  • take a regular expression
  • take an input data stream
  • on every match, replace the match through a callback

Unfortunately, preg_replace_callback() doesn't work as I'd expect. It gives me all the matches on the entire line, not individual matches. So I need to put the line together again after replacement, but I don't have the information to do that. Case in point:

<?php
echo replace("/^\d+,(.*),(.*),.*$/", "12,LOWERME,ANDME,ButNotMe") . "\n";
echo replace("/^\d+-\d+-(.*) .* (.*)$/", "13-007-THISLOWER ThisNot THISAGAIN") . "\n";


function replace($pattern, $data) {
    return preg_replace_callback(
        $pattern, 
        function($match) {
            return strtolower($match[0]);
        }, $data
    );
}

https://www.tehplayground.com/hE1ZBuJNtFiHbdHO

gives me 12,lowerme,andme,butnotme, but I want 12,lowerme,andme,ButNotMe.

I know using $match[0] is wrong. It's just to illustrate here. Inside the closure I need to run something like

foreach ($match as $m) { /* do something */ }

But as I said, I have no information about the position of the matches in the input string which makes it impossible to put the string together again.
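
For illustration, a minimal sketch that records what the callback receives for the first example. The `$seen` variable is a name made up for this sketch; the pattern and input are the ones from above.

```php
<?php
// Illustration only: capture the array that the callback is handed.
$seen = [];
preg_replace_callback(
    '/^\d+,(.*),(.*),.*$/',
    function (array $match) use (&$seen) {
        $seen = $match;    // plain strings: full match first, then each group
        return $match[0];  // return the match unchanged
    },
    '12,LOWERME,ANDME,ButNotMe'
);
print_r($seen);
```

The array holds the full match followed by `LOWERME` and `ANDME` as plain strings, with nothing that says where in the input each group came from.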

I've dug through the PHP documentation and run several searches, but couldn't find a solution.


Clarifications:

I know that $match[1], $match[2], etc. contain the captured groups, but only as strings, without their positions. Imagine that in my example the final field is also ANDME instead of ButNotMe: according to the regex it is not captured, so the callback should not be applied to it. That's why I'm using regexes in the first place instead of string replacements.

Also, the reason I'm using capture groups this way is that I need the replacement process to be configurable. So I cannot hardcode something like "replace #1 and #2 but not #3". On a different input file, the positions might be different, or there might be more replacements needed, and only the regex used should change.

So if my input is "15,LOWER,ME,NotThis,AND,ME,AGAIN", I want to be able to just change the regex, not the code and get the desired result. Basically, both $pattern and $data are variable.


  • duangua5308 2019-05-23 13:51

    This uses preg_match() with PREG_OFFSET_CAPTURE, which returns each capture group together with the offset at which it was found in the original string. substr_replace() is then applied to each capture group at its offset, so only that part of the string is changed and identical text elsewhere in the string is never touched.

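    function lowerParts (string $input, string $regex ) {
        // Each entry of $matches becomes [matched text, byte offset in $input].
        preg_match($regex, $input, $matches, PREG_OFFSET_CAPTURE);
        // Drop entry 0 (the full match); only the capture groups are rewritten.
        array_shift($matches);
        foreach ( $matches as $match )  {
            // strtolower() does not change the byte length, so later offsets stay valid.
            $input = substr_replace($input, strtolower($match[0]),
                $match[1], strlen($match[0]));
        }
        return $input;
    }
    echo lowerParts ("12,LOWERME,ANDME,ButNotMe", "/^\d+,(.*),(.*),.*$/");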
    

    gives...

    12,lowerme,andme,ButNotMe
    

    But also with

    echo lowerParts ("12,LOWERME,ANDME,LOWERME", "/^\d+,(.*),(.*),.*$/");
    

    it gives

    12,lowerme,andme,LOWERME
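
    To see why the trailing LOWERME is left alone, it may help to look at what preg_match() returns with PREG_OFFSET_CAPTURE. A small sketch using the same pattern and input as the call above:

```php
<?php
// With PREG_OFFSET_CAPTURE every entry is a [text, byte offset] pair.
preg_match(
    '/^\d+,(.*),(.*),.*$/',
    '12,LOWERME,ANDME,LOWERME',
    $matches,
    PREG_OFFSET_CAPTURE
);

foreach ($matches as $i => [$text, $offset]) {
    echo "group $i: '$text' at byte offset $offset\n";
}
```

    Group 1 is reported at offset 3 and group 2 at offset 11. The last field starts at offset 17, which belongs to no capture group, so no replacement is ever made there.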
    

    Edit:

    If the replacement can differ in length from the matched text, the string has to be cut at each group and reassembled around the replacement. The complication is that every change in length shifts the offsets of the groups that follow, so the function keeps a running correction in $offset. This version also takes the process to apply to each group as a parameter (this example just passes "strtolower"):

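    function processParts (string $input, string $regex, callable $process ) {
        // Each entry of $matches becomes [matched text, byte offset in $input].
        preg_match($regex, $input, $matches, PREG_OFFSET_CAPTURE);
        // Drop entry 0 (the full match); only the capture groups are rewritten.
        array_shift($matches);
        // Running difference between offsets in the original and the edited string.
        $offset = 0;
        foreach ( $matches as $match )  {
            $replacement = $process($match[0]);
            // Text before the group + replacement + text after the group.
            $input = substr($input, 0, $match[1]+$offset)
                     .$replacement.
                     substr($input, $match[1]+$offset+strlen($match[0]));
            // Later groups move by however much this replacement grew or shrank.
            $offset += strlen($replacement) - strlen($match[0]);
        }
        return $input;
    }
    echo processParts ("12,LOWERME,ANDME,LOWERME", "/^\d+,.*,(.*),(.*)$/", "strtolower");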
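
    As an alternative to tracking the offset, the groups can be replaced from last to first, so that earlier offsets are never disturbed. The sketch below is a hypothetical variant, not the code above: the name processPartsReverse, the explicit no-match check, and the second regex are assumptions made for illustration. It also assumes the capture groups do not nest or overlap.

```php
<?php
// Hypothetical variant: replace capture groups back to front.
function processPartsReverse(string $input, string $regex, callable $process): string
{
    if (preg_match($regex, $input, $matches, PREG_OFFSET_CAPTURE) !== 1) {
        return $input; // no match (or a regex error): leave the input alone
    }
    array_shift($matches); // drop the full match, keep the capture groups

    // Going from the last group to the first keeps earlier offsets valid,
    // even when a replacement changes the length of the string.
    foreach (array_reverse($matches) as [$text, $offset]) {
        if ($offset < 0) {
            continue; // optional group that did not take part in the match
        }
        $input = substr_replace($input, $process($text), $offset, strlen($text));
    }
    return $input;
}

// A callback that changes the length of each group.
$wrap = function (string $s): string {
    return "<$s>";
};
$wrapped = processPartsReverse('12,LOWERME,ANDME,LOWERME', '/^\d+,.*,(.*),(.*)$/', $wrap);
echo $wrapped, "\n";

// The longer input from the question; only the regex changes.
$lowered = processPartsReverse(
    '15,LOWER,ME,NotThis,AND,ME,AGAIN',
    '/^\d+,(.*?),(.*?),.*?,(.*?),(.*?),(.*)$/',
    'strtolower'
);
echo $lowered, "\n";
```

    The first call prints 12,LOWERME,<ANDME>,<LOWERME> and the second prints 15,lower,me,NotThis,and,me,again.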
    
    The asker selected this answer as the best answer.