douchuo0730
2010-07-10 16:55

PHP regex to match double- and/or single-quoted strings within a string

I'm working on a template class and have run into a problem parsing the quoted strings out of an argument-list string. Take, for example, the string:

$string = 'VAR_SELECTED, \'Hello m\'lady\', "null"';

I'm having trouble coming up with a regex that extracts the strings "Hello m'lady" and "null". The closest I have got is:

$string = 'VAR_SELECTED, \'Hello m\'lady\', "null", \'TE\'ST\'';
preg_match_all('/(?:[^\']|\\\\.)+|(?:[^"]|\\\\.)+/', $string, $matches);
print_r($matches);

Which outputs:

Array
(
    [0] => Array
        (
            [0] => VAR_SELECTED, 
            [1] => 'Hello m'lady', 
            [2] => "null", 
            [3] => 'TE'ST'
        )

)

However a more complex case of:

$string = 'VAR_SELECTED, \'Hello "Father"\', "Hello \'Luke\'"';
preg_match_all('/(?:[^\']|\\\\.)+|(?:[^"]|\\\\.)+/', $string, $matches);
print_r($matches);  

outputs:

Array
(
    [0] => Array
        (
            [0] => VAR_SELECTED, 
            [1] => 'Hello 
            [2] => "Father"
            [3] => ', 
            [4] => "Hello 
            [5] => 'Luke'
            [6] => "
        )

)

Can anyone help me solve this problem? Are multiple regexes the way forward?

Edit: Maybe it would be easier to replace the commas within the strings with a placeholder and then split the string with explode()?

Edit 2: I just thought of a simple but insecure option (which I am not going to use); it also generates an E_NOTICE error.

$string = 'return array(VAR_SELECTED, \'Hello , "Father"\', "Hello \'Luke\'4");';
$string = eval($string);
print_r($string);


3 answers

  • dps43378 2010-07-10 18:49
    Accepted

    Try this:

    /(?<=^|[\s,])(?:(['"]).*?\1|[^\s,'"]+)(?=[\s,]|$)/
    

    Or, as a PHP single-quoted string literal:

    '/(?<=^|[\s,])(?:([\'"]).*?\1|[^\s,\'"]+)(?=[\s,]|$)/'
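To see what the pattern above returns, here is a minimal sketch that runs it against both sample strings from the question and then strips one pair of surrounding quotes from each token. The helper name extractArguments() is hypothetical and exists only for this illustration.

```php
<?php
// Pattern copied from the answer, written as a PHP single-quoted literal.
$pattern = '/(?<=^|[\s,])(?:([\'"]).*?\1|[^\s,\'"]+)(?=[\s,]|$)/';

// extractArguments() is a hypothetical helper, not part of any library.
function extractArguments(string $pattern, string $input): array
{
    preg_match_all($pattern, $input, $matches);
    $result = [];
    foreach ($matches[0] as $token) {
        $first = $token[0];
        // Strip exactly one pair of surrounding quotes if the token is quoted.
        if (($first === "'" || $first === '"') && substr($token, -1) === $first) {
            $token = (string) substr($token, 1, -1);
        }
        $result[] = $token;
    }
    return $result;
}

$simple  = 'VAR_SELECTED, \'Hello m\'lady\', "null"';
$complex = 'VAR_SELECTED, \'Hello "Father"\', "Hello \'Luke\'"';

print_r(extractArguments($pattern, $simple));
print_r(extractArguments($pattern, $complex));
```

The lookarounds are what make this work: a closing quote only counts if it is followed by whitespace, a comma, or the end of the string, so the apostrophe in m'lady is skipped. A value that itself contains a quote followed by a comma would still break it.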
    

    That regex yields the desired result, but I think you're going about this wrong. Usually, if a quoted string needs to contain a literal quote character, the quote is escaped, either with a backslash or with another quote. You aren't doing that, so I had to use a fragile hack based on lookarounds. Are you sure the data isn't supposed to look like this?

    $string = 'VAR_SELECTED, \'Hello m\\\'lady\', "null"';
    
    $string = 'VAR_SELECTED, \'Hello "Father"\', "Hello \\\'Luke\\\'"';
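If the data really did escape inner quotes with a backslash, as in the two strings above, no lookarounds would be needed. The following is a sketch under that assumption only; the sample input and the unquoting step with stripslashes() are illustrative choices, not something the original poster confirmed.

```php
<?php
// Assumed input: inner quotes are escaped with a backslash in the data itself.
// In this PHP literal, \' is a plain quote and \\\' is a backslash plus a quote.
$string = 'VAR_SELECTED, \'Hello m\\\'lady\', "Hello \\\'Luke\\\'"';

// A quoted string is: opening quote, then any run of "non-quote, non-backslash"
// characters or "backslash + any character" pairs, then the same closing quote.
$pattern = '/\'(?:[^\'\\\\]|\\\\.)*\'|"(?:[^"\\\\]|\\\\.)*"/s';

preg_match_all($pattern, $string, $matches);

// Drop the outer quotes and resolve the backslash escapes.
$values = array_map(
    function (string $quoted): string {
        return stripslashes((string) substr($quoted, 1, -1));
    },
    $matches[0]
);

print_r($values);
```

Only the quoted items are returned here; the bare word VAR_SELECTED is skipped because it matches neither alternative.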
    

    Come to think of it, doesn't PHP have built-in support for CSV data?

  • dtlhy0771 2010-07-10 17:15

    You want to use a back reference in the match string.

    preg_match_all('@([\'"]).*[^\\\\]\1@', $string, $matches);
    

    This will start matching with the first instance of " or ' and then match the longest string that ends with a matching " or ' that isn't escaped.

    Array
    (
        [0] => Array
            (
                [0] => 'Hello m'lady', "null", 'TE'ST'
            )

        [1] => Array
            (
                [0] => '
            )

    )
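The greedy behaviour described above can be reproduced with a short sketch. It uses the longer test string from the question; the variable name $greedy is an arbitrary choice for this illustration.

```php
<?php
$string = 'VAR_SELECTED, \'Hello m\'lady\', "null", \'TE\'ST\'';

// Greedy pattern from this answer: .* first runs to the end of the string and
// then backtracks to the last quote that matches the opening one and is not
// preceded by a backslash.
preg_match_all('@([\'"]).*[^\\\\]\1@', $string, $greedy);

print_r($greedy[0]);
```

The result is a single match that spans all three quoted strings, so the items would still have to be separated in a further step.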
    
  • dswfyq6201 2010-07-10 17:30

    Here's how I would do it:

    Break the task down into the component steps you want to take:

    1.) Explode the string on commas.

    For 'VAR_SELECTED, \'Hello m\'lady\', "null"' this gives me
    [0]=>"VAR_SELECTED"
    [1]=>" \'Hello m\'lady\'"
    [2]=>" "null""
    
    For 'VAR_SELECTED, \'Hello "Father"\', "Hello \'Luke\'"' this gives me
    [0]=>"VAR_SELECTED"
    [1]=>" \'Hello "Father"\'"
    [2]=>" "Hello \'Luke\'""
    

    2.) Run trim() on all three items to get rid of the surrounding whitespace.

    For 'VAR_SELECTED, \'Hello m\'lady\', "null"' this gives me
    [0]=>"VAR_SELECTED"
    [1]=>"\'Hello m\'lady\'"
    [2]=>""null""
    
    For 'VAR_SELECTED, \'Hello "Father"\', "Hello \'Luke\'"' this gives me
    [0]=>"VAR_SELECTED"
    [1]=>"\'Hello "Father"\'"
    [2]=>""Hello \'Luke\'""
    

    3.) Run str_replace('\\', '', $text) to get rid of the backslashes (that is, replace each backslash with an empty string).

    For 'VAR_SELECTED, \'Hello m\'lady\', "null"' this gives me
    [0]=>"VAR_SELECTED"
    [1]=>"'Hello m'lady'"
    [2]=>""null""
    
    For 'VAR_SELECTED, \'Hello "Father"\', "Hello \'Luke\'"' this gives me
    [0]=>"VAR_SELECTED"
    [1]=>"'Hello "Father"'"
    [2]=>""Hello 'Luke'""
    

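    4.) Strip the outer pair of quotes. Note that trim($text, '\'"') is not quite enough here: it removes every leading and trailing quote character, so it would also eat an inner quote that touches the outer one (as in "Father"'). Removing exactly one character from each end of a quoted item, e.g. with substr($text, 1, -1), gives the results below.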

    For 'VAR_SELECTED, \'Hello m\'lady\', "null"' this gives me
    [0]=>"VAR_SELECTED"
    [1]=>"Hello m'lady"
    [2]=>"null"
    
    For 'VAR_SELECTED, \'Hello "Father"\', "Hello \'Luke\'"' this gives me
    [0]=>"VAR_SELECTED"
    [1]=>"Hello "Father""
    [2]=>"Hello 'Luke'"
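The four steps above can be sketched as follows. The function name splitArguments() is hypothetical, and step 4 removes exactly one matching pair of outer quotes rather than calling trim() with a character list, because trim() would also strip an inner quote that sits directly next to the outer one.

```php
<?php
// splitArguments() is a hypothetical helper used only for this sketch.
function splitArguments(string $input): array
{
    $result = [];
    foreach (explode(',', $input) as $part) {      // step 1: split on commas
        $part = trim($part);                       // step 2: drop surrounding whitespace
        // Step 3: drop backslashes. This is a no-op for the sample strings,
        // because \' inside a PHP single-quoted literal is just a quote.
        $part = str_replace('\\', '', $part);
        // Step 4: remove one matching pair of outer quotes, if present.
        if (strlen($part) >= 2
            && ($part[0] === "'" || $part[0] === '"')
            && substr($part, -1) === $part[0]
        ) {
            $part = (string) substr($part, 1, -1);
        }
        $result[] = $part;
    }
    return $result;
}

$simple  = 'VAR_SELECTED, \'Hello m\'lady\', "null"';
$complex = 'VAR_SELECTED, \'Hello "Father"\', "Hello \'Luke\'"';

print_r(splitArguments($simple));
print_r(splitArguments($complex));
```

This only holds as long as no quoted value contains a comma; a string such as 'Hello , "Father"' from the question's second edit would be split in two at step 1.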
    

    I haven't tested this, but the logic is sound. A quick and dirty way to test 98% of all regexes (in my experience) is to use http://rubular.com/. It's a great site. Usually, if it starts to choke on a regex, that's my first sign that I should break the problem down more. (That's just opinion ~dons flameproof suit~)

