Unset an array if preg_match does not match the pattern?

I have a multidimensional array that looks like this:

Array
(
    [0] => Array
        (
            [0] => Title 1
            [1] => Some text ... US5801351017 ...
        )

    [1] => Array
        (
            [0] => Title 2
            [1] => Some text ... US0378331005 ...
        )

    [2] => Array
        (
            [0] => Title 3
            [1] => Some text ... //Note here that it does not contain an ISIN Code
        )
...

I am trying to keep only the sub-arrays whose description contains an ISIN code matching my regex, and to drop the rest. The array above was produced by the following code:

$title = $html->find("h3.r a");
$titlearray = array_map(function($value){
    return trim($value->plaintext);
}, $title);

$description = $html->find("span.st");
$descriptionarray = array_map(function($value){
    $string = strip_tags($value);
    return $string;
}, $description);

$result1 = array();
foreach($titlearray as $key => $value) {
    $tmp = array($value);
    if (isset($descriptionarray[$key])) {
        $tmp[] = $descriptionarray[$key];
    }
    $result1[] = $tmp;
}

print_r($result1);

I have written some code that comes very close but does not really unset the arrays that do not contain an ISIN Code. The code I have is this:

$title = $html->find("h3.r a");
$titlearray = array_map(function($value){
    return trim($value->plaintext);
}, $title);

$description = $html->find("span.st");
$descriptionarray = array_map(function($value){
    $match = array();
    $string = strip_tags($value);
    $pattern = "/[BE|BM|FR|BG|VE|DK|HR|DE|JP|HU|HK|JO|US|BR|XS|FI|GR|IS|RU|LB|"
            . "PT|NO|TW|UA|TR|LK|LV|LU|TH|NL|PK|PH|RO|EG|PL|AA|CH|CN|CL|EE|CA|"
            . "IR|IT|ZA|CZ|CY|AR|AU|AT|IN|CS|CR|IE|ID|ES|PE|TN|PA|SG|IL|US|MX|"
            . "SK|KRSI|KW|MY|MO|SE|GB|GG|KY|JE|VG|NG|SA|MU]{2}[A-Z0-9]{10}/";
    preg_match($pattern, $string, $match);
    return $match;
}, $description);

$merged = array();
$i=0;
foreach($descriptionarray as $value){
  $merged[$i] = $value;
  $merged[$i][] = $titlearray[$i];
  $i++;
}

print_r($merged);

which gives me these arrays:

Array
(
    [0] => Array
        (
            [0] => US5801351017
            [1] => Title 1
        )

    [1] => Array
        (
            [0] => US0378331005
            [1] => Title 2
        )

    [2] => Array
        (
            [0] => Title 3
        )
...

How can I get rid of the arrays that do not match my Regex? What I am looking for is this output:

Array
(
    [0] => Array
        (
            [0] => Title 1
            [1] => US5801351017
        )

    [1] => Array
        (
            [0] => Title 2
            [1] => US0378331005
        )
...

EDIT

Following @CasimiretHippolyte's answer, I now have this code:

$titles = $html->find("h3.r a");

$descriptions = $html->find("span.st");

$ISIN_PATTERN = "/[BE|BM|FR|BG|VE|DK|HR|DE|JP|HU|HK|JO|US|BR|XS|FI|GR|IS|RU|LB|"
            . "PT|NO|TW|UA|TR|LK|LV|LU|TH|NL|PK|PH|RO|EG|PL|AA|CH|CN|CL|EE|CA|"
            . "IR|IT|ZA|CZ|CY|AR|AU|AT|IN|CS|CR|IE|ID|ES|PE|TN|PA|SG|IL|US|MX|"
            . "SK|KRSI|KW|MY|MO|SE|GB|GG|KY|JE|VG|NG|SA|MU]{2}[A-Z0-9]{10}/";

$results = [];

foreach ($descriptions as $k => $v) {
    if (preg_match($ISIN_PATTERN, strip_tags($v), $m)) {
        $results[] = ['Title' => trim($titles[$k]->plaintext), 'ISIN' => $m[1]];
    }
}

print_r($results);

This narrows the array down to only the elements that match the regex, but it does not show the matches under 'ISIN' => $m[1]. It outputs this:

Array
(
    [0] => Array
        (
            [Title] => Title 1
            [ISIN] => 
        )

    [1] => Array
        (
            [Title] => Title 2
            [ISIN] => 
        )
...
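
A likely explanation for the empty ISIN values, offered as a sketch rather than a diagnosis of the exact setup: the pattern above contains no parentheses, so preg_match has no capture group 1 to fill, and the whole match sits in $m[0]. The short pattern below is a simplified stand-in (an assumption for illustration), not the full country list.

```php
<?php
// Simplified stand-in for the pattern above: no parentheses, so no capture groups.
$pattern = '/[A-Z]{2}[A-Z0-9]{10}/';
$text = 'Some text ... US5801351017 ...';

$m = [];
if (preg_match($pattern, $text, $m)) {
    // $m[0] always holds the whole match.
    var_dump($m[0]);
    // $m[1] only exists when the pattern has a first capture group.
    var_dump(isset($m[1]));
}

// Same pattern wrapped in a capture group: now $m2[1] is filled as well.
$withGroup = '/([A-Z]{2}[A-Z0-9]{10})/';
$m2 = [];
preg_match($withGroup, $text, $m2);
var_dump($m2[1]);
```

So with a pattern that has no group, reading $m[0] instead of $m[1] should give the code.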

FURTHER EDIT

This code solves the issue:

$titles = $html->find("h3.r a");

$descriptions = $html->find("span.st");

$ISIN_PATTERN = "/[BE|BM|FR|BG|VE|DK|HR|DE|JP|HU|HK|JO|US|BR|XS|FI|GR|IS|RU|LB|"
            . "PT|NO|TW|UA|TR|LK|LV|LU|TH|NL|PK|PH|RO|EG|PL|AA|CH|CN|CL|EE|CA|"
            . "IR|IT|ZA|CZ|CY|AR|AU|AT|IN|CS|CR|IE|ID|ES|PE|TN|PA|SG|IL|US|MX|"
            . "SK|KRSI|KW|MY|MO|SE|GB|GG|KY|JE|VG|NG|SA|MU]{2}[A-Z0-9]{10}/";

$results1 = [];

foreach ($descriptions as $k => $v) {
    if (preg_match($ISIN_PATTERN, strip_tags($v), $m)) {
        $results1[] = ['Title' => trim($titles[$k]->plaintext), 'ISIN' => $m[1]];
    }
}

$titlesarray = array_column($results1, 'Title');

$results2 = array_map(function($value){
    $match = array();
    $string = strip_tags($value);
    $pattern = "/[BE|BM|FR|BG|VE|DK|HR|DE|JP|HU|HK|JO|US|BR|XS|FI|GR|IS|RU|LB|"
            . "PT|NO|TW|UA|TR|LK|LV|LU|TH|NL|PK|PH|RO|EG|PL|AA|CH|CN|CL|EE|CA|"
            . "IR|IT|ZA|CZ|CY|AR|AU|AT|IN|CS|CR|IE|ID|ES|PE|TN|PA|SG|IL|US|MX|"
            . "SK|KRSI|KW|MY|MO|SE|GB|GG|KY|JE|VG|NG|SA|MU]{2}[A-Z0-9]{10}/";
    preg_match($pattern, $string, $match);
    return $match;
}, $descriptions);

$descriptionarray = array_column($results2, 0);

$result3 = array();
foreach($titlesarray as $key => $value) {
    $tmp = array($value);
    if (isset($descriptionarray[$key])) {
        $tmp[] = $descriptionarray[$key];
    }
    $result3[] = $tmp;
}

print_r($result3);

I threw this together quickly because I needed a fast solution. It is highly inefficient: it uses an extra array_map(), flattens the arrays into simple arrays, and then joins them back together. On top of that, the regex is repeated.

LAST EDIT

@CasimiretHippolyte's answer is the most efficient solution, and it works with either his pattern and $m[1] or my pattern and $m[0].
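
One caveat about the pattern used in the question, as a hedged aside: square brackets define a character class, so [BE|BM|...]{2} matches any two characters taken from those letters (and the | itself), not one of the listed two-letter codes. A grouped alternation expresses the intended check. The sketch below uses a deliberately shortened prefix list (an assumption for illustration) to show the difference.

```php
<?php
// Shortened prefix list for illustration only; the full list is in the question.
$classPattern = '/[US|DE|FR|GB]{2}[A-Z0-9]{10}/';     // character class: any 2 of U,S,D,E,F,R,G,B,|
$groupPattern = '/\b(?:US|DE|FR|GB)[A-Z0-9]{10}\b/';  // alternation: exactly one listed prefix

$samples = ['US5801351017', 'SU5801351017', '||5801351017'];

$report = [];
foreach ($samples as $s) {
    $report[$s] = [
        'class' => preg_match($classPattern, $s),
        'group' => preg_match($groupPattern, $s),
    ];
    printf("%-14s class=%d group=%d\n", $s, $report[$s]['class'], $report[$s]['group']);
}
```

The character-class version accepts all three samples, while the alternation accepts only the one with a listed prefix. Since the alternation uses a non-capturing group, the code is still read from $m[0].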

1 Answer

doumiebiao6827 2015-05-19 01:18
You can design your code another way, with a simple foreach that builds the result items one by one, only when an ISIN code is found:

$titles = $html->find("h3.r a");

$descriptions = $html->find("span.st");

define ('ISIN_PATTERN', '~
    \b  # there is probably a word boundary at the begin of the ISIN code
    (?=([A-Z]{2}[A-Z0-9]{10})\b)  # check the format before testing the whole alternation
                                  # at the same time, the ISIN is captured in group 1
    (?:  # so, this alternation is only here to make the pattern fail or succeed
        C[AHLNRSYZ]|I[DELNRST]|P[AEHKLT]|S[AEIGK]|A[ARTU]|B[EGMR]|L[BKUV]|M[OUXY]|T[HNRW]
        |E[EGS]|G[BGR]|H[KRU]|J[EOP]|K[RWY]|N[GLO]|D[EK]|F[IR]|R[OU]|U[AS]|V[EG]|XS|ZA
    )~x');

$results = [];

foreach ($descriptions as $k => $v) {
    if (preg_match(ISIN_PATTERN, strip_tags($v), $m))
        $results[] = [ 'ISIN'  => $m[1],
                       'Title' => trim($titles[$k]->plaintext) ];
}

print_r($results);
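
To make the role of group 1 concrete, here is a minimal sketch that runs the same kind of pattern against plain arrays instead of scraped nodes. The $rows data and the shortened prefix list are assumptions for illustration. Because the full code is captured inside a lookahead, the overall match $m[0] is only the two-letter prefix, and the ISIN has to be read from $m[1].

```php
<?php
// Shortened variant of the answer's pattern (a few prefixes only), for illustration.
$pattern = '~
    \b
    (?=([A-Z]{2}[A-Z0-9]{10})\b)   # lookahead: captures the 12 characters in group 1
    (?:U[AS]|D[EK]|F[IR]|G[BGR])   # consumes only the two-letter prefix
~x';

// Hypothetical stand-in for the scraped titles and descriptions.
$rows = [
    ['Title 1', 'Some text ... US5801351017 ...'],
    ['Title 2', 'Some text ... US0378331005 ...'],
    ['Title 3', 'Some text ...'],
];

$results = [];
$prefixes = [];
foreach ($rows as [$title, $description]) {
    if (preg_match($pattern, $description, $m)) {
        $prefixes[] = $m[0];  // just the prefix, e.g. "US"
        $results[] = ['ISIN' => $m[1], 'Title' => $title];  // $m[1] is the full code
    }
}

print_r($results);
```

Rows without a code never enter $results, so nothing needs to be unset afterwards.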

Note: this code is not tested and can probably be improved. Several ideas:

stop using simplehtml and use DOMDocument and DOMXPath instead

the hand-written pattern is designed on the assumption that all countries are equally probable. If that isn't the case, rewrite it to check the most frequent countries first
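
The first idea could look roughly like the sketch below. The inline markup, the XPath translations of the "h3.r a" and "span.st" selectors, and the format-only pattern are all assumptions for illustration; the real page structure may differ.

```php
<?php
// Hypothetical markup standing in for the scraped page.
$html = <<<HTML
<html><body>
<div><h3 class="r"><a href="#">Title 1</a></h3><span class="st">Some text ... US5801351017 ...</span></div>
<div><h3 class="r"><a href="#">Title 2</a></h3><span class="st">Some text ... US0378331005 ...</span></div>
<div><h3 class="r"><a href="#">Title 3</a></h3><span class="st">Some text without a code</span></div>
</body></html>
HTML;

$dom = new DOMDocument();
libxml_use_internal_errors(true);  // real-world HTML is rarely valid; collect warnings silently
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);
// XPath equivalents of the CSS selectors "h3.r a" and "span.st".
$titles = $xpath->query('//h3[contains(concat(" ", normalize-space(@class), " "), " r ")]//a');
$descriptions = $xpath->query('//span[contains(concat(" ", normalize-space(@class), " "), " st ")]');

// Format check only (two letters plus ten alphanumerics), not the country list.
$pattern = '/\b[A-Z]{2}[A-Z0-9]{10}\b/';

$results = [];
for ($i = 0; $i < $descriptions->length; $i++) {
    $title = $titles->item($i);
    // Build a result item only when a title exists and the description holds a code.
    if ($title !== null && preg_match($pattern, $descriptions->item($i)->textContent, $m)) {
        $results[] = ['ISIN' => $m[0], 'Title' => trim($title->textContent)];
    }
}

print_r($results);
```

textContent already returns text without tags, so strip_tags is not needed here. Pairing titles and descriptions by index assumes both queries return nodes in the same order and number, which is the same assumption the original code makes.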

This answer was selected by the asker as the best answer.