douzhang1364 2015-08-05 20:56 Acceptance rate: 0%

Writing a regex without negation

In a previous post I asked for help rewriting a regex without negation.

Starting regex:

https?:\/\/(?:.(?!https?:\/\/))+$

Ended up with:

https?:[^:]*$

This works fine, but I've noticed that if the URL contains a : besides the one in http(s):, nothing is selected.

Here is a string for which it does not work:

sometextsometexhttp://websites.com/path/subpath/#query1sometexthttp://websites.com/path/subpath/:query2

Note the :query2.

How can I modify the second regex listed here so that it selects URLs which contain a :?

Expected output:

http://websites.com/path/subpath/cc:query2

I would also like to select everything up to the first occurrence of ?=param.

Input: sometextsometexhttp://websites.com/path/subpath/#query1sometexthttp://websites.com/path/subpath/cc:query2/text/?=param

Output:

http://websites.com/path/subpath/cc:query2/text/
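
For reference, the failure described above can be reproduced with a minimal Go sketch. The shortened input strings and the helper name matchColonFree are illustrative assumptions, not part of the original post; the pattern is the second regex from the question.

```go
package main

import (
	"fmt"
	"regexp"
)

// colonFree is the question's second pattern: after "http:" or "https:"
// it accepts only non-colon characters up to the end of the input.
var colonFree = regexp.MustCompile(`https?:[^:]*$`)

// matchColonFree (hypothetical helper) returns the matched text,
// or "" when the pattern does not match at all.
func matchColonFree(s string) string {
	return colonFree.FindString(s)
}

func main() {
	// No colon after the last scheme: the last URL is selected.
	fmt.Printf("%q\n", matchColonFree("texthttp://websites.com/a#q1texthttp://websites.com/b"))
	// prints "http://websites.com/b"

	// A colon later in the path: [^:]* cannot cross it, so nothing matches.
	fmt.Printf("%q\n", matchColonFree("texthttp://websites.com/a#q1texthttp://websites.com/b/:query2"))
	// prints ""
}
```

Because [^:]* must run all the way to $, any colon after the last scheme prevents a match entirely rather than merely shortening it.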

1 answer

  • douzhenzu0247 2015-08-05 21:11

    Unfortunately, Go's regexp package does not support lookarounds. However, you can still obtain the last link with a small trick: let a greedy prefix consume every earlier link and any other characters, then capture the last link with a capturing group:

    ^(?:https?://|.)*(https?://\S+?)(?:\?=|$)
    

    Because \S+? matches non-whitespace characters lazily, the capturing group also stops at the first ?= when one is present.

    Go demo:

    // Needs the "fmt" and "regexp" imports; the Printf calls belong inside func main.
    var r = regexp.MustCompile(`^(?:https?://|.)*(https?://\S+?)(?:\?=|$)`)
    // Index [0][1] is capturing group 1 of the first match, i.e. the last link.
    fmt.Printf("%q\n", r.FindAllStringSubmatch("sometextsometexhttp://websites.com/path/subpath/#query1sometexthttp://websites.com/path/subpath/:query2", -1)[0][1])
    fmt.Printf("%q\n", r.FindAllStringSubmatch("sometextsometexhttp://websites.com/path/subpath/#query1sometexthttp://websites.com/path/subpath/cc:query2/text/?=param", -1)[0][1])
    

    Results:

    "http://websites.com/path/subpath/:query2"
    "http://websites.com/path/subpath/cc:query2/text/"
    

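The snippet above indexes [0][1] directly, which panics when the input contains no link. A minimal sketch of a safer wrapper follows; the names lastLink and lastURL are assumptions made for illustration, and the pattern is the one given above.

```go
package main

import (
	"fmt"
	"regexp"
)

// lastLink is the pattern from the answer: a greedy prefix skips everything
// before the last link, and group 1 captures that link up to "?=" or the end.
var lastLink = regexp.MustCompile(`^(?:https?://|.)*(https?://\S+?)(?:\?=|$)`)

// lastURL (hypothetical helper) returns the last link in s and reports
// whether the pattern matched at all.
func lastURL(s string) (string, bool) {
	m := lastLink.FindStringSubmatch(s)
	if m == nil {
		return "", false
	}
	return m[1], true
}

func main() {
	inputs := []string{
		"sometextsometexhttp://websites.com/path/subpath/#query1sometexthttp://websites.com/path/subpath/cc:query2/text/?=param",
		"no links here",
	}
	for _, in := range inputs {
		url, ok := lastURL(in)
		fmt.Printf("%q %v\n", url, ok)
	}
	// prints:
	// "http://websites.com/path/subpath/cc:query2/text/" true
	// "" false
}
```

FindStringSubmatch is enough here because the pattern is anchored with ^, so there can be at most one match per input.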
    If the last link can contain spaces, use .+? instead of \S+?:

    ^(?:https?://|.)*(https?://.+?)(?:\?=|$)
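
To illustrate the difference between the two variants, here is a small sketch that runs both patterns on an input whose last link contains a space. The sample input and the names strictLink, looseLink, and lastURLWith are assumptions for the example only.

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	// strictLink: the captured link may not contain whitespace (\S+?).
	strictLink = regexp.MustCompile(`^(?:https?://|.)*(https?://\S+?)(?:\?=|$)`)
	// looseLink: the captured link may contain spaces (.+?).
	looseLink = regexp.MustCompile(`^(?:https?://|.)*(https?://.+?)(?:\?=|$)`)
)

// lastURLWith (hypothetical helper) returns group 1 of re's match on s
// and reports whether re matched.
func lastURLWith(re *regexp.Regexp, s string) (string, bool) {
	m := re.FindStringSubmatch(s)
	if m == nil {
		return "", false
	}
	return m[1], true
}

func main() {
	in := "texthttp://a.com/x texthttp://websites.com/my path/cc:query2/?=param"

	url, ok := lastURLWith(strictLink, in)
	fmt.Printf("%q %v\n", url, ok) // prints "" false

	url, ok = lastURLWith(looseLink, in)
	fmt.Printf("%q %v\n", url, ok) // prints "http://websites.com/my path/cc:query2/" true
}
```

With \S+? the capture cannot cross the space, and since neither link is directly followed by ?= or the end of the input, the strict pattern finds nothing here. The .+? variant accepts the space and still stops at the first ?=.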
    
    Selected by the asker as the best answer.
