Is it possible to know the position of a match in the subject string?

I have a file name in which some information has to be replaced. Here is a sample subject:

FileA-2014-11-01_K_1_A2_383.xxx

As many files have to be processed, the file name is first matched by a regex, say:

/[a-zA-Z]*-\d{4}-\d{2}-\d{2}_(\w)_(\d)_A2_(\d*)\.xxx$/

Using preg_match, this regex gives me the values to be replaced, here:

K=>A
1=>2
383=>666

My first try was to naively use str_replace, but it fails when the values to replace also occur elsewhere in the string. Here I get:

FileA-2024-22-02_A_2_A2_666.xxx

So the date is also modified by str_replace (as it was told to do).
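
To make the failure concrete, here is a minimal sketch that reproduces it with the sample file name and the three replacements listed above. Passing the values as arrays to str_replace() is an assumption about how the call was written; the question does not show it.

```php
<?php
// str_replace() has no notion of position: each search string is
// replaced everywhere it occurs, one pair after the other.
$filename = "FileA-2014-11-01_K_1_A2_383.xxx";

$result = str_replace(["K", "1", "383"], ["A", "2", "666"], $filename);

echo $result, "\n"; // FileA-2024-22-02_A_2_A2_666.xxx
```

Every "1" in the date is rewritten along with the one captured by the regex.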

So I wonder whether there is a way to know where a given match sits in the string, in order to do a clean replacement. I am now trying to invert the regex so that it captures the non-replaced blocks, and then insert the replacement data between them. That regex would be:

/([a-zA-Z]*-\d{4}-\d{2}-\d{2}_)\w(_)\d(_A2_)\d*(\.xxx)$/

With that one, I am able to keep the non-replaced parts. I now have to find some kind of index to know the replacement position in the string. I guess I can get there this way, but it seems somewhat complicated and error-prone. Given that I only have the initial regex and the from=>to replacement map, is there a better way to do this?

[EDIT : solution]

<?php

$filename = "FileA-2014-11-01_K_1_A2_383.xxx";
$expected = "FileA-2014-11-01_A_2_A2_666.xxx";

$regex = "/[a-zA-Z]*-\d{4}-\d{2}-\d{2}_(\w)_(\d)_A2_(\d*)\.xxx$/";


// from => to
$replacements = array(
    "K"   => "A",
    "1"   => "2",
    "383" => "666",
);


$result = preg_replace_callback($regex, function($matches) use ($replacements){
    print_r($matches);
    // Stuck here: $matches holds the captured strings but not their offsets,
    // and returning nothing replaces the whole match with an empty string.
}, $filename);


if(strcmp($result,$expected)==0)
    echo "preg_replace_callback() : Yep\n";
else
    echo "preg_replace_callback() : Nop\n";


preg_match($regex, $filename, $matches, PREG_OFFSET_CAPTURE);

// drop the whole-pattern match, keep only the captured groups
array_shift($matches);

$result = $filename;
// go from right to left so that a replacement of a different length
// cannot shift the offsets of the groups still to be replaced
foreach(array_reverse($matches) as $matchInfo){

    $match    = $matchInfo[0];
    $position = $matchInfo[1];

    $result = substr_replace($result, $replacements[$match], $position, strlen($match));

}


if(strcmp($result,$expected)==0)
    echo "preg_match() and substr game : Yep\n";
else
    echo "preg_match() and substr game : Nop\n";

drurhg37071 2015-02-19 10:49
A regex that matches that filename:

$re = '/[a-zA-Z]*-\d{4}-\d{2}-\d{2}_(\w)_(\d)_A2_(\d*)\.xxx$/';
$str = 'FileA-2014-11-01_K_1_A2_383.xxx';

If you add PREG_OFFSET_CAPTURE as the fourth parameter ($flags) to the call to preg_match(), it will also return the offset of each captured string in the third parameter:

preg_match($re, $str, $matches, PREG_OFFSET_CAPTURE);

A print_r($matches) will reveal:

Array
(
    [0] => Array
        (
            [0] => FileA-2014-11-01_K_1_A2_383.xxx
            [1] => 0
        )

    [1] => Array
        (
            [0] => K
            [1] => 17
        )

    [2] => Array
        (
            [0] => 1
            [1] => 19
        )

    [3] => Array
        (
            [0] => 383
            [1] => 24
        )

)

$matches[0] is the part that matched the entire regex. $matches[1] is the first capturing sub-expression, $matches[2] is the second and so on.

$matches[1][0] is the fragment of the input string that matched the first regex sub-expression (\w), and $matches[1][1] is the offset in the input string where it was found. The same holds for $matches[N][0] and $matches[N][1] for the N-th sub-expression.
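
As an illustration of how these offsets can drive the replacement, here is a minimal sketch, assuming PHP 7.1 or later and capture groups that do not nest or overlap. The helper name replaceCaptures() and its $map parameter are hypothetical, not part of PHP or of the question; the replacement "22" is deliberately longer than the "1" it replaces.

```php
<?php
// Hypothetical helper: replace only the captured groups of $regex in
// $subject, using the offsets reported by PREG_OFFSET_CAPTURE.
// $map is keyed by the captured text (from => to).
function replaceCaptures(string $regex, string $subject, array $map): string
{
    if (preg_match($regex, $subject, $matches, PREG_OFFSET_CAPTURE) !== 1) {
        return $subject; // no match: leave the subject untouched
    }

    array_shift($matches); // drop the whole-pattern match

    // Work from right to left so that a replacement of a different length
    // does not shift the offsets of the groups still to be processed.
    foreach (array_reverse($matches) as [$text, $offset]) {
        if ($offset < 0 || !array_key_exists($text, $map)) {
            continue; // group did not participate, or nothing to replace
        }
        $subject = substr_replace($subject, (string) $map[$text], $offset, strlen($text));
    }

    return $subject;
}

$re  = '/[a-zA-Z]*-\d{4}-\d{2}-\d{2}_(\w)_(\d)_A2_(\d*)\.xxx$/';
$new = replaceCaptures($re, 'FileA-2014-11-01_K_1_A2_383.xxx',
    ['K' => 'A', '1' => '22', '383' => '666']);

echo $new, "\n"; // FileA-2014-11-01_A_22_A2_666.xxx
```

Only the text at the captured offsets changes, so the "1" characters in the date are left alone.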

If you only need a simple replacement, you do not have to deal with offsets: use preg_replace() or, if the replacement is complex or dynamic, preg_replace_callback().

Using preg_replace() you need to capture the parts you want to keep:

$re = '/([a-zA-Z]*-\d{4}-\d{2}-\d{2}_)\w_\d_A2_\d*(\.xxx)$/';
$str = 'FileA-2014-11-01_K_1_A2_383.xxx';
$new = preg_replace($re, '$1A_2_A2_666$2', $str);
echo($new."\n");

In the replacement string, $1 and $2 denote the sub-expressions from the regex. We marked them for capturing in order to re-use them in the replacement string.
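
For the dynamic case, preg_replace_callback() can be sketched along the same lines. The regex below is a variant of the ones above that captures both the kept and the replaced parts; this variant, the $map array and the $swap closure are assumptions made for illustration and do not come from the answer.

```php
<?php
$re  = '/([a-zA-Z]*-\d{4}-\d{2}-\d{2}_)(\w)_(\d)(_A2_)(\d*)(\.xxx)$/';
$str = 'FileA-2014-11-01_K_1_A2_383.xxx';
$map = ['K' => 'A', '1' => '2', '383' => '666']; // from => to

$new = preg_replace_callback($re, function (array $m) use ($map) {
    // Look a captured value up in the map; keep it unchanged if absent.
    $swap = function ($value) use ($map) {
        return isset($map[$value]) ? $map[$value] : $value;
    };
    // Groups 1, 4 and 6 are kept; groups 2, 3 and 5 go through the map.
    return $m[1] . $swap($m[2]) . '_' . $swap($m[3]) . $m[4] . $swap($m[5]) . $m[6];
}, $str);

echo $new, "\n"; // FileA-2014-11-01_A_2_A2_666.xxx
```

The map is consulted per group, so the same regex works for any set of from => to pairs, at the cost of rewriting the pattern to capture every segment.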

This answer was selected by the asker as the best answer.
