Extracting information from a string

Given a string of the form https://website-name.some-domain.some-sub-domain.com/resourceId (type 1) or https://website-name.some-sub-domain.com/resourceId?randomContent (type 2), I need to extract exactly two sub-strings: the website-name in one string and the resourceId in another.

I have extracted the website name using the following code:

s := "https://website-name.some-domain.some-sub-domain.com/resourceId?randomContent"
w := regexp.MustCompile("https://(.*?)\\.")
website := w.FindStringSubmatch(s)
fmt.Println(website[1])

I have another regex to get the resourceId:

s := "https://website-name.some-domain.some-sub-domain.com/resourceId?randomContent"
r := regexp.MustCompile("com/(.*?)\\?")
resource := r.FindStringSubmatch(s)
fmt.Println(resource[1])

This works for any string in which the resourceId is followed by ? or ?randomContent. But I also have strings without a trailing ? (type 1), and I cannot handle those cases.

I tried "(com/(.*?)\\?)|(com/(.*?).*)" to get the resourceId, but it did not help.

I am not able to find an elegant way to extract these two sub-strings.

Note: randomContent is an arbitrarily long substring, and so is resourceId. However, resourceId never contains a ?, so the first ? marks the end of the resourceId.

Also, website-name can differ, but the pattern is the same: an arbitrary sub-domain and a .com will be present in the string.
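For illustration, the constraints above (website-name is the first host label, a sub-domain and .com are always present, and resourceId never contains a ?) could be folded into one pattern. This is only a sketch under those stated assumptions; the names urlPattern and extractWithRegex are hypothetical and not part of the original code.

```go
package main

import (
	"fmt"
	"regexp"
)

// urlPattern is a hypothetical pattern covering both URL shapes.
// Group 1: the first host label (website-name).
// Group 2: everything after ".com/" up to the first "?" or the end of the string.
var urlPattern = regexp.MustCompile(`^https://([^./]+)\.[^/]*\.com/([^?]*)`)

// extractWithRegex returns website-name and resourceId, or ok == false
// when the input does not have the assumed shape.
func extractWithRegex(s string) (website, resource string, ok bool) {
	m := urlPattern.FindStringSubmatch(s)
	if m == nil {
		return "", "", false
	}
	return m[1], m[2], true
}

func main() {
	inputs := []string{
		"https://website-name.some-domain.some-sub-domain.com/resourceId",   // type 1
		"https://website-name.some-sub-domain.com/resourceId?randomContent", // type 2
	}
	for _, s := range inputs {
		website, resource, ok := extractWithRegex(s)
		fmt.Printf("%q %q %v\n", website, resource, ok)
	}
}
```

The key change from the two original patterns is `([^?]*)` in place of `(.*?)\\?`: it stops at the first ? if there is one and otherwise runs to the end of the string, so no trailing ? is required. The pattern deliberately fails on hosts without a sub-domain before .com.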

Here is what I have tried: https://play.golang.org/p/MGQIT5XRuuh

3 answers

douhu8851 2019-08-20 01:32
The sample strings you show are ordinary HTTPS URLs, so you can use the net/url package to parse them. With u as the parsed URL, the website-name is the first dot-separated part of u.Hostname(), and the resourceId is u.Path without its leading /.

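// Parse the URL instead of matching it with a regex.
u, err := url.Parse(s)
if err != nil {
    panic(err)
}
// The website-name is the first label of the host name.
host := u.Hostname()
first := strings.SplitN(host, ".", 2)[0]
fmt.Printf("website-name: %s\n", first)
// Drop the leading "/" (assumes the path is non-empty).
fmt.Printf("resourceId: %s\n", u.Path[1:])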

https://play.golang.org/p/fnF2RTBuFxR has a complete example, including the two URL strings from the question. This works even if the hostname part of the URL doesn't end with .com, or the path part includes that string, or there is a port number or hash fragment, or other variations.
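For readers who cannot open the playground link, a self-contained version of this approach might look like the sketch below. It is not the linked program; the helper name extractParts is hypothetical, and strings.TrimPrefix is swapped in for the slice expression so that a URL with an empty path does not panic.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// extractParts is a hypothetical helper: it returns the first host label
// (website-name) and the path without its leading "/" (resourceId).
func extractParts(raw string) (website, resource string, err error) {
	u, err := url.Parse(raw)
	if err != nil {
		return "", "", err
	}
	// Hostname() strips any port; SplitN keeps only the first label.
	website = strings.SplitN(u.Hostname(), ".", 2)[0]
	// The query string and fragment are not part of u.Path,
	// so nothing after "?" or "#" ends up in the result.
	resource = strings.TrimPrefix(u.Path, "/")
	return website, resource, nil
}

func main() {
	inputs := []string{
		"https://website-name.some-domain.some-sub-domain.com/resourceId",
		"https://website-name.some-sub-domain.com/resourceId?randomContent",
	}
	for _, s := range inputs {
		website, resource, err := extractParts(s)
		if err != nil {
			fmt.Println("parse error:", err)
			continue
		}
		fmt.Printf("website-name: %s\n", website)
		fmt.Printf("resourceId: %s\n", resource)
	}
}
```

Both sample URLs from the question yield website-name and resourceId with this helper.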
This answer was selected by the asker as the best answer.
