dtcwehta624485 2013-07-11 20:20

Regex to split a string if it contains wordA or wordB [closed]

Edited again to try to make it clearer.

Which PHP regex pattern will give me a match array that always contains 2 values: the two parts of a string split by "wordA" or "wordB"? If the string does not contain either word, simply return the whole string as the first value and null as the second.

Example:

preg_match("pattern", "foo wordA bar", $match): $match will contain ['foo', 'bar']
preg_match("pattern", "foo wordB bar", $match): $match will contain ['foo', 'bar']
preg_match("pattern", "foo bar test", $match): $match will contain ['foo bar test', null]

I know that the first value of $match is always the full matched string, so I left it out above.

OLD question:

I need to split a one-line address into parts. I can't find a way to capture the street part without including the word APP or APT when it is present and, when it is present, to also capture the words after it.

For example:

"5847A, rue Principal APP A" should match: (5847, A, rue Principal, A)

"5847A, rue Prince Arthur APT 22" should match: (5847, A, rue Prince Arthur, 22)

"1111, Sherwood street" should match: (1111, , Sherwood street, )

I'm using PHP.

What I have so far is: /^(\d+)(.*), (.*)(?:APP|APT)(?:\s*(.*))?$/i which works with examples 1 and 2. If I try to make the alternation (APP|APT) optional by adding a ? after it, then the third group includes the word APP or APT...

Any idea how to exclude the optional APP or APT word from the match?

Thank you

EDIT:

I can simplify the problem: how can I write a regex so that the match returns the same string minus the word APP or APT, if it is present in the middle of it?


  • douyue1481 2013-07-11 20:35

    As @MadaraUchiha pointed out, it's a bad idea to run a regex on addresses, since they can come in any format.

    If you know you have consistent addresses, then I guess you can use the regex:

    ^([0-9]+)([A-Z])?,\s(?:(.*?)\s(?:APP|APT)\s(.*)|(.*))$
    

    And the replacement:

    $1,$2,$3$5,$4
    

    Here's how it's performing.

    It's pretty similar to yours (I changed a few things), and I added an alternation (|) to handle the second type of address, the one without APP or APT.

    If you want a consistent number of capture groups, maybe this?
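
    As a rough illustration, the first pattern and its replacement string could be wired into PHP like this. It is only a sketch: normalizeAddress() is a hypothetical helper name, and the pattern is used without the i flag from the question (add it if lowercase "app"/"apt" should also match).

```php
<?php
// Sketch only: normalizeAddress() is a hypothetical helper, not part of the original answer.
function normalizeAddress(string $address): string
{
    // Groups: 1 = civic number, 2 = optional letter suffix,
    // 3 = street (when APP/APT is present), 4 = unit, 5 = street (no APP/APT).
    $pattern = '/^([0-9]+)([A-Z])?,\s(?:(.*?)\s(?:APP|APT)\s(.*)|(.*))$/';

    // Only one of groups 3 and 5 can take part in a match, so "$3$5" is always the street.
    // Fall back to the input if preg_replace() reports an error (null).
    return preg_replace($pattern, '$1,$2,$3$5,$4', $address) ?? $address;
}

echo normalizeAddress('5847A, rue Principal APP A'), PHP_EOL;      // 5847,A,rue Principal,A
echo normalizeAddress('5847A, rue Prince Arthur APT 22'), PHP_EOL; // 5847,A,rue Prince Arthur,22
echo normalizeAddress('1111, Sherwood street'), PHP_EOL;           // 1111,,Sherwood street,
```

    Groups that do not take part in the match are replaced by empty strings, which is why the third address ends up with empty letter and unit fields.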

    ^([0-9]*)([A-Z]?),((?:(?!\sAPP|\sAPT).)*)(?:\sAPP|\sAPT)?(.*)$
    

    Regex101 example.
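
    A minimal sketch of using the second pattern with preg_match(), assuming a hypothetical helper named parseAddress(). Because there is no \s after the comma in this pattern, the street and unit groups keep their leading space, so the sketch trims every group.

```php
<?php
// Sketch only: parseAddress() is a hypothetical helper name.
function parseAddress(string $address): ?array
{
    // The tempered group ((?:(?!\sAPP|\sAPT).)*) consumes characters
    // until " APP" or " APT" is about to start, so the keyword is never captured.
    $pattern = '/^([0-9]*)([A-Z]?),((?:(?!\sAPP|\sAPT).)*)(?:\sAPP|\sAPT)?(.*)$/';

    if (preg_match($pattern, $address, $m) !== 1) {
        return null; // e.g. no comma in the input, so the pattern cannot match
    }

    // Drop $m[0] (the whole match) and trim the leading spaces kept by groups 3 and 4.
    return array_map('trim', array_slice($m, 1));
}

var_export(parseAddress('5847A, rue Prince Arthur APT 22'));
echo PHP_EOL;
var_export(parseAddress('1111, Sherwood street'));
echo PHP_EOL;
```

    Every group here can match an empty string, so a successful match always yields four values: number, letter, street, unit.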
