doumiang0597 2018-10-14 13:31

Extract a number from a string between a letter and a tag

I have a MySQL text field in an online diary, which sometimes contains text like D<num> <tag>, for example D109 MU.

These references can appear anywhere in the field, for example:

D109 MU, worked from home today
Walked the dog, later took the kids to swimming. D110 MU. Went to the gym in the evening for the 9th time this month.

I have worked out an SQL query that pulls out the entries containing D<num> <tag>. For example, when I go to this URL:

example.com/tidy.php?v1=7346&v2=90000&tag=MU

The querystring data is used to get the data out of the field:

$config = HTMLPurifier_Config::createDefault();
$purifier = new HTMLPurifier($config);

if (!empty($_GET['v1'])) {
    $v1 = $purifier->purify($_GET['v1']);
}

if (!empty($_GET['v2'])) {
    $v2 = $purifier->purify($_GET['v2']);
}

if (!empty($_GET['tag'])) {
    $tag = $purifier->purify($_GET['tag']);
}

$sql = "select id, post_date, post_content from tbl_log_days where id between :v1 and :v2 and post_content REGEXP :exp ";
$stmt = $pdo->prepare($sql);
$stmt->bindParam(':v1', $v1);
$stmt->bindParam(':v2', $v2);
$stmt->bindValue(":exp" , "D[0-9]+ $tag", PDO::PARAM_STR); 
$stmt->execute();

That works okay - so I get the relevant post_content entries.

However, I am struggling to work out the syntax to pull out only the number from the D part of the content.

I have got this far:

while ($row = $stmt->fetch()){

    $id = $row['id'];
    $dt = $row['post_date'];
    $pc = $row['post_content'];

    preg_match_all('/\d+/', $pc, $matches);
    $number = implode(' ', $matches[0]);

    echo "$number <hr>";

}

The trouble is that the content often includes several numbers, and I only want the number that appears between the D and the tag value. So for D109 MU I'd want to extract 109, and for the second example I'd want to extract 110 from D110 MU but ignore the 9 that appears later in the same field.

How could I achieve that?

2 answers

  • dongyi0114 2018-10-14 13:50

    You don't say whether MU is a reliable string to match, so I'm leaving it out. Match the D, reset the start of the reported match with \K, then match one or more digits.

    Code:

    $string = 'D109 MU, worked from home today
    Walked the dog, later took the kids to swimming. D110 MU. Went to the gym in the evening for the 9th time this month.';
    
    var_export(preg_match_all('~D\K\d+~', $string, $out) ? $out[0] : 'fail');
    

    Output:

    array (
      0 => '109',
      1 => '110',
    )
    
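For comparison, here is a minimal sketch of the same extraction done with a capture group instead of \K. The helper name extractWithCaptureGroup and the shortened sample text are made up for illustration; they are not part of the original question or answer.

```php
<?php
// Hypothetical helper: same digits as '~D\K\d+~', but read from a capture group.
function extractWithCaptureGroup(string $text): array
{
    // $m[0] would hold the full matches (e.g. "D109"); $m[1] holds only the digits.
    preg_match_all('~D(\d+)~', $text, $m);
    return $m[1];
}

// Shortened version of the diary text, used here only as sample input.
$sample = "D109 MU, worked from home today\nWalked the dog. D110 MU. Gym for the 9th time.";
var_export(extractWithCaptureGroup($sample));
```

With \K the digits end up in the full match ($out[0]); with a capture group they end up in $m[1]. Which one to use is largely a matter of taste.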

    Extension: If you need to increase the pattern accuracy by adding the known tag value, you can add the $tag variable to the pattern as a lookahead.

    Code:

    $tag = "MU";
    $string = 'D109 MU, worked from home today
    Walked the dog, later took the kids to swimming. D110 MU. Went to the gym in the evening for the 9th time this month.';
    
    var_export(preg_match_all("~D\K\d+(?= $tag)~", $string, $out) ? $out[0] : 'fail');
    
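Since the tag in the question arrives via the query string, it may be worth escaping it before it goes into the pattern. The sketch below assumes that is a concern; the helper name extractTaggedNumbers and the sample strings are hypothetical.

```php
<?php
// Hypothetical helper: build the lookahead pattern with the tag escaped for regex use.
function extractTaggedNumbers(string $text, string $tag): array
{
    // preg_quote() escapes regex metacharacters in $tag; the second argument
    // additionally escapes the "~" delimiter used by this pattern.
    $pattern = '~D\K\d+(?= ' . preg_quote($tag, '~') . ')~';
    preg_match_all($pattern, $text, $out);
    return $out[0];
}

// "D7 XX" is skipped because the lookahead requires " MU" after the digits.
var_export(extractTaggedNumbers('D109 MU, D7 XX, D110 MU. 9th time', 'MU'));
```

Without the escaping, a tag such as C++ would be read as regex syntax rather than as literal text.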

    Furthermore, if your strings only contain one qualifying <num>, then preg_match() will suffice.

    Code:

    $tag = "MU";
    $strings = [
        'D109 MU, worked from home today',
        'Walked the dog, later took the kids to swimming. D110 MU. Went to the gym in the evening for the 9th time this month.'
    ];
    
    foreach ($strings as $string) {
        echo "\n---\n", preg_match("~D\K\d+(?= $tag)~", $string, $out) ? $out[0] : 'fail';
    }
    

    Output:

    ---
    109
    ---
    110
    
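To tie this back to the loop in the question, here is a rough sketch of how the single-match version might be applied per row. The $rows array stands in for the rows returned by $stmt->fetch(), with made-up ids and dates, and firstTaggedNumber is a hypothetical helper name.

```php
<?php
// Hypothetical helper: return the first D<num> that is followed by the tag, or null.
function firstTaggedNumber(string $content, string $tag): ?string
{
    $pattern = '~D\K\d+(?= ' . preg_quote($tag, '~') . ')~';
    return preg_match($pattern, $content, $out) ? $out[0] : null;
}

// Stand-in rows shaped like the query result; ids and dates are invented.
$rows = [
    ['id' => 1, 'post_date' => '2018-10-01', 'post_content' => 'D109 MU, worked from home today'],
    ['id' => 2, 'post_date' => '2018-10-02', 'post_content' => 'Walked the dog. D110 MU. Gym for the 9th time this month.'],
];

foreach ($rows as $row) {
    $number = firstTaggedNumber($row['post_content'], 'MU');
    // Fall back to a placeholder when the row has no matching reference.
    echo $row['id'], ': ', $number ?? 'no reference', "\n";
}
```

Returning null instead of the string 'fail' lets the caller decide how to handle rows without a reference.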
    This answer was selected by the asker as the best answer.