douxiaqin2062 2011-11-04 21:28

Converting a regular expression from PHP to Java

I have a regular expression in PHP and I need to convert it to Java. Is that possible? If so, how can I do it?

Thanks in advance

$region_pattern = "/<a href=\"#\"><img src=\"images\/ponto_[^\.]+\.gif\"[^>]*>[&nbsp;]*<strong>(?P<neighborhood>[^\(<]+)\((?P<region>[^\)]+)\)<\/strong><\/a>/i" ;

  • dongzong2017 2011-11-04 22:50

    There are some problems with the original regex that have to be cleared away first. First, there's [&nbsp;], which matches one of the characters &, n, b, s, p or ;. To match an actual non-breaking space character, you should use \xA0.
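    To make the character-class point concrete, here is a minimal Java sketch. The class name `NbspClassDemo` and the sample strings are made up for illustration; they are not from the question.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; name and sample inputs are illustrative only.
class NbspClassDemo {
    // [&nbsp;] is a character class: each repetition matches ONE of & n b s p ;
    static final Pattern CHAR_CLASS = Pattern.compile("[&nbsp;]*");
    // \xA0 matches the non-breaking space character itself (code point A0 hex).
    static final Pattern NBSP = Pattern.compile("\\xA0*");

    static boolean classMatches(String s) {
        return CHAR_CLASS.matcher(s).matches();
    }

    static boolean nbspMatches(String s) {
        return NBSP.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(classMatches("&nbsp;"));       // true
        System.out.println(classMatches("spbn;;&"));      // true: any mix, any order
        System.out.println(classMatches("\u00A0"));       // false: not in the class
        System.out.println(nbspMatches("\u00A0\u00A0"));  // true
    }
}
```

    If the input still contains the literal entity text `&nbsp;` rather than the decoded character, a group such as `(?:&nbsp;)*` would match that sequence instead. Which form applies depends on the input, which the question does not show.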

    You also have a lot of unneeded backslashes in there. You can get rid of some by changing the regex delimiter to something other than /; others aren't needed because they're inside character classes, where most metacharacters lose their special meanings. That leaves you with this PHP regex:

    "~<a href=\"#\"><img src=\"images/ponto_[^.]+\.gif\"[^>]*>\xA0*<strong>(?P<neighborhood>[^(<]+)\((?P<region>[^)]+)\)</strong></a>~i"
    

    There are three things that make this regex incompatible with Java. One is the delimiters (/ originally, ~ in the version above) along with the trailing i modifier. Java doesn't use regex delimiters at all, so just drop those. The modifier can be moved into the regex itself by using the inline form, (?i), at the beginning of the regex. (That will work in PHP too, by the way.)
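    As a small sketch of the modifier point, the inline `(?i)` form and the `Pattern.CASE_INSENSITIVE` flag should behave the same for ASCII input. The class name `CaseFlagDemo` and the test string are assumptions for illustration.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; only a fragment of the full regex is used here.
class CaseFlagDemo {
    // Inline form: the modifier lives inside the regex text.
    static final Pattern INLINE = Pattern.compile("(?i)</strong></a>");
    // Flag form: the modifier is passed as a compile-time flag instead.
    static final Pattern FLAGGED =
        Pattern.compile("</strong></a>", Pattern.CASE_INSENSITIVE);
    // No modifier at all, for comparison.
    static final Pattern PLAIN = Pattern.compile("</strong></a>");

    static boolean inlineFinds(String s)  { return INLINE.matcher(s).find(); }
    static boolean flaggedFinds(String s) { return FLAGGED.matcher(s).find(); }
    static boolean plainFinds(String s)   { return PLAIN.matcher(s).find(); }

    public static void main(String[] args) {
        String html = "text</STRONG></A>";
        System.out.println(inlineFinds(html));   // true
        System.out.println(flaggedFinds(html));  // true
        System.out.println(plainFinds(html));    // false
    }
}
```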

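    Next are the backslashes. The ones that escape quotation marks stay as they are, but all the others must be doubled: a Java string literal treats the backslash as its own escape character and rejects sequences it doesn't recognize, such as \. or \(, so each backslash meant for the regex engine has to be written as \\.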
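    The doubling can be checked directly. In this sketch (the class name `BackslashDemo` and the file names are invented for the example), the Java literal `"\\.gif"` holds five characters, the first being a single backslash that the regex engine then reads as "literal dot". Writing `"\.gif"` instead would not compile.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; file names are illustrative only.
class BackslashDemo {
    // PHP regex fragment:  \.gif     Java string literal:  "\\.gif"
    static final String JAVA_LITERAL = "\\.gif";

    // Uses the same fragment as the full regex: ponto_[^.]+\.gif
    static boolean matchesGif(String s) {
        return Pattern.matches("ponto_[^.]+\\.gif", s);
    }

    public static void main(String[] args) {
        System.out.println(JAVA_LITERAL.length());         // 5
        System.out.println(matchesGif("ponto_azul.gif"));  // true
        System.out.println(matchesGif("ponto_azulXgif"));  // false: no literal dot
    }
}
```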

    Finally, there are the named groups. Java 6 and earlier don't support named groups at all; Java 7 supports them, but with the shorter (?<name>...) syntax favored by .NET, not the Pythonesque (?P<name>...) syntax. (By the way, the shorter (?<name>...) version should work in PHP too, as should (?'name'...), also introduced by .NET.)

    So the Java 7 version of your regex would be:

    "(?i)<a href=\"#\"><img src=\"images/ponto_[^.]+\\.gif\"[^>]*>\\xA0*<strong>(?<neighborhood>[^(<]+)\\((?<region>[^)]+)\\)</strong></a>"
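    A minimal usage sketch for the Java 7 version follows. The regex is the one above, split across two string literals only for line length. The class name `NamedGroupDemo`, the helper `extract`, and the sample HTML are assumptions, since the question does not show the actual input.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the sample HTML is invented for illustration.
class NamedGroupDemo {
    static final Pattern REGION = Pattern.compile(
        "(?i)<a href=\"#\"><img src=\"images/ponto_[^.]+\\.gif\"[^>]*>\\xA0*<strong>"
        + "(?<neighborhood>[^(<]+)\\((?<region>[^)]+)\\)</strong></a>");

    // Returns {neighborhood, region}, or null when the pattern is not found.
    static String[] extract(String html) {
        Matcher m = REGION.matcher(html);
        if (!m.find()) {
            return null;
        }
        // trim() drops the space that usually precedes the opening parenthesis.
        return new String[] {
            m.group("neighborhood").trim(),
            m.group("region").trim()
        };
    }

    public static void main(String[] args) {
        String html = "<a href=\"#\"><img src=\"images/ponto_azul.gif\" border=\"0\">"
            + "\u00A0<strong>Copacabana (Zona Sul)</strong></a>";
        String[] r = extract(html);
        System.out.println(r[0] + " / " + r[1]);  // Copacabana / Zona Sul
    }
}
```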
    

    For Java 6 or earlier you would use:

    "(?i)<a href=\"#\"><img src=\"images/ponto_[^.]+\\.gif\"[^>]*>\\xA0*<strong>([^(<]+)\\(([^)]+)\\)</strong></a>"
    

    ...and you'd have to use numbers instead of names to refer to the group captures.
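    With the Java 6 version, the same lookup by position might look like the sketch below. The class name `NumberedGroupDemo`, the helper `extract`, and the sample HTML are again made up for illustration. Group 1 is the neighborhood and group 2 the region, counted by opening parenthesis from the left.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the sample HTML is invented for illustration.
class NumberedGroupDemo {
    static final Pattern REGION = Pattern.compile(
        "(?i)<a href=\"#\"><img src=\"images/ponto_[^.]+\\.gif\"[^>]*>\\xA0*<strong>"
        + "([^(<]+)\\(([^)]+)\\)</strong></a>");

    // Returns {neighborhood, region}, or null when the pattern is not found.
    static String[] extract(String html) {
        Matcher m = REGION.matcher(html);
        if (!m.find()) {
            return null;
        }
        // group(1) = neighborhood, group(2) = region
        return new String[] { m.group(1).trim(), m.group(2).trim() };
    }

    public static void main(String[] args) {
        String html = "<a href=\"#\"><img src=\"images/ponto_azul.gif\">"
            + "<strong>Copacabana (Zona Sul)</strong></a>";
        String[] r = extract(html);
        System.out.println(r[0] + " / " + r[1]);  // Copacabana / Zona Sul
    }
}
```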
