dongzhuo2010 2019-01-06 18:58

Using a regular expression in PHP to find all instances of chemical formulas

I have the following string: "AZS40G is Alumina Zircon Silicate material with ZrO2 content of 39% minimum, which serves as a great substitute in applications for production of sintered AZS refractories and where the Fused Zircon mullite is required. C1R5".

I would like to use a regex to find all digits in the chemical formulas in the text (digits preceded by a letter), excluding the designated abbreviation ("AZS40G" in this instance), and wrap them in a <sub></sub> tag.

I am doing all of this in PHP, and since I do not know where to start with regex, I have provided the following pseudocode/PHP example:

$text = "AZS40G is Alumina Zircon Silicate material with ZrO2 content of 39% minimum, which serves as a great substitute in applications for production of sintered AZS refractories and where the Fused Zircon mullite is required. Zr5O2, M20R2, C1R5";
preg_replace('/(AZS40G!)(?<=[A-Z])\d+/', '<sub>${1}</sub>', $text);

The expected result would wrap every such instance, as follows:

"AZS40G is Alumina Zircon Silicate material with ZrO<sub>2</sub> content of 39% minimum, which serves as a great substitute in applications for production of sintered AZS refractories and where the Fused Zircon mullite is required. C<sub>1</sub>R<sub>5</sub>".


  • douyan2821 2019-01-06 20:04

    Use the (*SKIP)(*FAIL) verbs to move past the abbreviations, then match the remaining digits that directly follow an uppercase letter.

    \b(?:AZS40G|BZS40G|CZS40G)\b(*SKIP)(*FAIL)|(?<=[A-Z])(\d+)

    https://regex101.com/r/VglQ3K/1

    Expanded

       \b
       (?: AZS40G | BZS40G | CZS40G )      # exclude the designated abbreviations
       \b
       (*SKIP) (*FAIL)                     # move the current position past this match,
                                           # then fail it
    |                                      # or
       (?<= [A-Z] )                        # preceded by an uppercase letter
       ( \d+ )                             # (1) the digits to wrap
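
For completeness, here is a minimal sketch of how the pattern above might be plugged into preg_replace. The $abbreviations list and the variable names are assumptions made for illustration; the sample text is the one from the question.

```php
<?php
// Sample text taken from the question.
$text = 'AZS40G is Alumina Zircon Silicate material with ZrO2 content of 39% minimum, '
      . 'which serves as a great substitute in applications for production of sintered '
      . 'AZS refractories and where the Fused Zircon mullite is required. Zr5O2, M20R2, C1R5';

// Assumed list of designations to leave untouched; extend as needed.
$abbreviations = ['AZS40G', 'BZS40G', 'CZS40G'];

// Escape each entry so it is treated literally inside the pattern.
$alternation = implode('|', array_map(function ($a) {
    return preg_quote($a, '/');
}, $abbreviations));

// First branch: consume a whole abbreviation, then (*SKIP)(*FAIL) discards it
// and resumes scanning after it. Second branch: digits right after A-Z.
$pattern = '/\b(?:' . $alternation . ')\b(*SKIP)(*FAIL)|(?<=[A-Z])(\d+)/';

// Wrap the captured digits (group 1) in a <sub> tag.
$result = preg_replace($pattern, '<sub>$1</sub>', $text);

echo $result, PHP_EOL;
```

With this pattern "AZS40G" and "39%" are left alone, while "ZrO2" becomes "ZrO<sub>2</sub>". Because the lookbehind only accepts an uppercase letter, the 5 in "Zr5O2" is not wrapped; if that digit should be included too, widening the lookbehind to (?<=[A-Za-z]) is one possible adjustment, though it may also catch digits in ordinary words.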
    
    (Selected by the asker as the best answer.)