doudou130216 2016-05-01 11:18

Regex to find images in HTML (Golang)

I'm parsing XML RSS feeds from a couple of different sources, and I want to find the images in the HTML they contain.

I did some research and found a regex that I think might work:

/<img[^>]+src="?([^"\s]+)"?\s*\/>/g

but I have trouble using it in Go. It gives me errors because I don't know how to make it search with that expression.

I tried using it as a string, but it doesn't escape properly with either single or double quotes. I also tried using it bare, and that gives me an error too.

Any ideas?

2 answers

  • douya6606 2016-05-01 12:31

    Using a proper HTML parser is always the better way to parse HTML. However, a cheap, hackish regex can also work fine. Here's an example:

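    import "regexp"

    // imgRE captures the src value of an <img> tag, quoted with either " or '.
    var imgRE = regexp.MustCompile(`<img[^>]+\bsrc=["']([^"']+)["']`)

    // If your img tags are properly formed with double quotes, use this instead; it's more efficient.
    // var imgRE = regexp.MustCompile(`<img[^>]+\bsrc="([^"]+)"`)

    // findImages returns the src attribute of every <img> tag matched in htm.
    func findImages(htm string) []string {
        imgs := imgRE.FindAllStringSubmatch(htm, -1) // -1 means "return all matches"
        out := make([]string, len(imgs))
        for i := range out {
            out[i] = imgs[i][1] // index 0 is the whole match, index 1 the captured URL
        }
        return out
    }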
    


    Accepted by the asker as the best answer.