douchuo0730 2015-12-14 15:50

PHP regex: insert a character after the first all-uppercase word in a string

I'm trying to use preg_replace or a similar PHP function to:

  • identify the first all-capital-letter word in a string (the word should be 3 characters long or more),
  • and insert a character directly after it (a dash or semi-colon will do).

So far I have the regular expression:

/(?<!\ )([^A-Z{3,}])/

But this doesn't restrict the match to words of 3 or more characters. I'm also not sure it strictly looks at only the very first word.

I believe that once I have the regex sorted out, this:

$string = "LONDON On November 12th twelve people...";
$replaced_string = preg_replace('/myregex/',': ', $string);

will output the following:

LONDON: On November 12th twelve people...

  • dongxi7722 2015-12-14 15:51

    It's a fairly simple regex, really:

    $replacedString = preg_replace('/\b([A-Z]{3,})\b/', '$1: ', $string);
    

    It works like this:

    • \b: a word boundary. It matches at the start or end of a "word" without consuming any characters.
    • ([A-Z]{3,}): matches 3 or more upper-case characters. The parentheses capture this part of the match, so we can use it in the replacement string.
    • \b: another word boundary.

    Replace this match with:

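    • '$1: ': the $1 refers back to the first captured group (the 3 or more upper-case characters). To this, we're adding a colon and a space. That will be our replacement string. Only the matched word is replaced, so the space that already follows it in the input is kept; if you want a single space after the colon, use '$1:' instead.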

    This will add the colon and space after all upper-case words of 3 or more characters. To replace only 1 word, just pass a limit to preg_replace:

    $replaced = preg_replace('/\b([A-Z]{3,})\b/', '$1: ', $string, 1);
    

    The last argument is the maximum number of replacements to perform: -1 (the default) for all, 1 for one, 2 for two, and so on.
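As a rough illustration of the limit argument, here is a minimal sketch. The first input is the string from the question; the second input and the variable names are made up for the example, and the replacement is '$1:' (without the trailing space) so that the output has a single space after the colon.

```php
<?php
// Pattern from the answer: a whole word of 3 or more upper-case letters.
$pattern = '/\b([A-Z]{3,})\b/';

// Sample input taken from the question.
$string = "LONDON On November 12th twelve people...";

// '$1:' re-inserts the captured word followed by a colon; the space that
// already follows the word in the input is left untouched.
$first = preg_replace($pattern, '$1:', $string, 1);

// Hypothetical second input with two all-caps words, to show the limit.
$twoWords = "LONDON and PARIS are cities";
$firstOnly = preg_replace($pattern, '$1:', $twoWords, 1); // limit 1: first match only
$all = preg_replace($pattern, '$1:', $twoWords);          // default limit -1: all matches

echo $first, PHP_EOL;     // LONDON: On November 12th twelve people...
echo $firstOnly, PHP_EOL; // LONDON: and PARIS are cities
echo $all, PHP_EOL;       // LONDON: and PARIS: are cities
```

"On" is left alone in the first string because it contains only one upper-case letter.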


    Judging by your sample string, the upper-case words are city names. It's possible for city names to contain a dash, or even a space. To address this, you might want to match all strings containing upper-case chars, dashes and spaces:

    $replaceAll = preg_replace('/\b([A-Z -]{2,}[A-Z])\b/', '$1: ', $string);
    


    What changed:

    • ([A-Z -]{2,}: the capturing group starts with 2 or more characters (not 3) that may be upper-case letters, spaces or dashes. Together with the final letter below, a match is still at least 3 characters long.
    • [A-Z]): the last character of the captured group must be an upper-case character, which avoids capturing trailing spaces or dashes. The result is that we capture stuff like "NEW YORK" or "FOO-TOWN", but not "ON - Something".
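The behaviour described in the two points above can be checked with a short sketch. The sample strings are hypothetical, modelled on the question's input, and '$1:' is used as the replacement so that only one space follows the colon.

```php
<?php
// Pattern from the answer: upper-case letters, spaces and dashes,
// ending in an upper-case letter (so at least 3 characters in total).
$pattern = '/\b([A-Z -]{2,}[A-Z])\b/';

// Hypothetical sample inputs, modelled on the question's string.
$samples = [
    "NEW YORK On November 12th twelve people...",
    "FOO-TOWN On November 12th twelve people...",
    "ON - Something happened",
];

$results = [];
foreach ($samples as $sample) {
    // Limit 1: only the first match in each string is replaced.
    $results[] = preg_replace($pattern, '$1:', $sample, 1);
}

print_r($results);
```

The first two strings come back as "NEW YORK: On ..." and "FOO-TOWN: On ...", while the third is returned unchanged because no candidate match ends in an upper-case letter followed by a word boundary.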

    The rest is the same as before. If you want to allow for other characters that might occur (like a dot) just add them to the first part of the capturing group. The most complete pattern will probably be something like this:

    $replaced = preg_replace('/\b([A-Z][A-Z .-]+[A-Z])\b/', '$1: ', $string);
    

    This ensures the captured group starts and ends with an upper-case character, and contains one or more upper-case characters, spaces, dots and dashes in between. So this will match something like "ST. LEWIS", too.
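A small sketch of how this last pattern might be wrapped for reuse. The function name markFirstUpperCaseName is hypothetical, the limit of 1 and the '$1:' replacement are choices made for this example, and the third input is invented to show a caveat.

```php
<?php
// Hypothetical helper; the pattern itself is the one from the answer.
function markFirstUpperCaseName(string $text): string
{
    // Limit 1 so that only the first match gets the colon.
    return preg_replace('/\b([A-Z][A-Z .-]+[A-Z])\b/', '$1:', $text, 1);
}

// The dot and the space are both allowed inside the captured group.
echo markFirstUpperCaseName("ST. LEWIS On November 12th twelve people..."), PHP_EOL;

// The question's original input still works.
echo markFirstUpperCaseName("LONDON On November 12th twelve people..."), PHP_EOL;

// Caveat: a one-letter upper-case word right after the name is swallowed
// too, because spaces are allowed inside the group.
echo markFirstUpperCaseName("LONDON A man was seen..."), PHP_EOL;
```

The last call yields "LONDON A: man was seen...", so text where the name can be followed by a stand-alone capital such as "A" or "I" would probably need a stricter pattern.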
