dsgw8802 2015-06-29 03:02

Using RegEx to get a specific part of a string

I'm trying to build a JSON file with all of my country's cities and states (called departamentos here). I never found a complete list, so I'm now working from the list compiled by Wikipedia users at this link:

https://es.wikipedia.org/wiki/Anexo:Municipios_de_Colombia

I have copied and pasted all the text into a document, with one line per city, like this:

Yacopí es una población y municipio del departamento de Cundinamarca

Currently I am able to select the city using RegEx with this expression:

/.+?(?= es)/

It takes everything from the beginning of the line to where it meets " es" for the first time, which is a regular convention for each of the lines in the Wikipedia page.
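As a minimal sketch of how that lookahead behaves, here is the same expression applied to the sample line in Python. The choice of Python and the variable names are assumptions for illustration; the pattern itself is the one from the question.

```python
import re

# Sample line taken from the question.
line = "Yacopí es una población y municipio del departamento de Cundinamarca"

# Lazy match from the start of the line; the lookahead (?= es) stops the
# match just before the first " es" without consuming it.
match = re.match(r".+?(?= es)", line)
city = match.group(0) if match else None
print(city)
```

Because the lookahead does not consume " es", the match contains only the city name, with no trailing space.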

What I want now is for the same regex to also capture the state, which can be the last word or the last two words of the line. I think this could be done by selecting everything after " de ", but I'm stuck.

Any help would be appreciated and maybe other people around the world can start making json files out of Wikipedia.

1 answer

  • douwei1930 2015-06-29 03:17

    This seems to work for at least the cities starting with an A. I didn't test all of them though.

    /^(.*?) es.*de (.*)$/gm
    

    You can try it here: https://regex101.com/r/yJ3gK7/1 (the extra whitespace comes from pasting from the wiki, and shouldn't really matter here.)
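The pattern above can be sketched in Python roughly as follows, grouping cities by state and printing JSON. Note that `.*de ` is greedy, so the second group starts after the last "de " on the line; a name such as "Norte de Santander" would come out as "Santander". The `STRICT` variant, the helper `group_by_state`, and the second sample line are assumptions added for illustration and are not part of the answer.

```python
import json
import re

# Pattern from the answer: group 1 is the city, group 2 is whatever
# follows the last "de " on the line.
PATTERN = re.compile(r"^(.*?) es.*de (.*)$", re.MULTILINE)

# Hypothetical stricter variant (an assumption, not from the answer):
# anchor on "departamento de " so multi-word names stay whole.
STRICT = re.compile(r"^(.*?) es.*departamento de (.*)$", re.MULTILINE)

# The first line is from the question; the second is a made-up line
# written in the same shape to show the multi-word case.
text = (
    "Yacopí es una población y municipio del departamento de Cundinamarca\n"
    "Ábrego es un municipio del departamento de Norte de Santander\n"
)


def group_by_state(pattern, text):
    """Return {state: [cities]} for every line the pattern matches."""
    result = {}
    for city, state in pattern.findall(text):
        result.setdefault(state.strip(), []).append(city.strip())
    return result


loose = group_by_state(PATTERN, text)
strict = group_by_state(STRICT, text)
print(json.dumps(strict, ensure_ascii=False, indent=2))
```

The stricter variant only works if every line contains the literal phrase "departamento de ", which may not hold for the whole Wikipedia list (for example, lines that use "del"), so it is worth checking for unmatched lines before relying on the output.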

    This answer was selected by the asker as the best answer.
