duanrenchuo9244 2016-10-30 00:15

preg_replace: add my own website in front of every hyperlink

For a project, I need to fetch a website's content and alter the HTML code. Every link on that website has to be replaced with my own as well. I used str_replace until I realized that links sometimes have classes assigned to them.

I've tried the preg_replace function to add my own website before every href link that is also between <a> </a> tags. It shouldn't matter whether or not the fetched website in $content contains href="" or href=''.

$content = preg_replace('~(<a\b[^>]*\shref=")([^"]*)(")~igs', '\1http://website.com/fetch.php?url=\2\3', $content);

This does not work and I can't find the error. It should behave as follows:

<a class="link" href="http://google.com">Google</a>

should turn into

<a class="link" href="http://website.com/fetch.php?url=http://google.com">Google</a>

Can someone help me find the error? Thank you in advance.
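
One probable cause, sketched under assumptions: PHP's PCRE functions do not accept a g modifier (preg_replace() already replaces every match), so the igs flags make the call emit an "Unknown modifier 'g'" warning and return NULL. The pattern also only handles double-quoted href values. The variant below drops g and captures the quote character, so both href="" and href='' are matched. prefixHrefs is a hypothetical helper name, not something from the original post.

```php
<?php
// Hypothetical helper: put $proxy in front of every href value inside an <a> tag.
function prefixHrefs(string $content, string $proxy): string
{
    // i = case-insensitive, s = dot also matches newlines.
    // Group 1: tag up to "href=", group 2: the quote character,
    // group 3: the URL, \2: the same quote character again.
    $pattern = '~(<a\b[^>]*\shref=)(["\'])(.*?)\2~is';

    $result = preg_replace_callback(
        $pattern,
        function (array $m) use ($proxy): string {
            return $m[1] . $m[2] . $proxy . $m[3] . $m[2];
        },
        $content
    );

    // preg_replace_callback() returns null on a PCRE error; fall back to the input.
    return $result ?? $content;
}

$content = '<a class="link" href="http://google.com">Google</a>';
echo prefixHrefs($content, 'http://website.com/fetch.php?url=');
```

A callback is used instead of a replacement string so that any $ or \ in the prefix is inserted literally. Unquoted href values and other attributes are still not handled, which is the usual limit of the regex approach.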

2 answers

  • douxing5598 2016-11-03 03:48

    Don't half-arse a regex that will miss plenty of cases. Just read each document into a DOM tree (give this html5 DOM parser a go), and use XPath to get all links with href attributes, and update them, then save the result.
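
A minimal sketch of the DOM and XPath route described above. It uses PHP's built-in DOMDocument (libxml) rather than the HTML5 parser the answer links to, which is a swapped-in assumption. rewriteLinks is a hypothetical helper name, and the proxy URL is the one from the question.

```php
<?php
// Hypothetical helper: prefix the href of every <a> element with $proxy.
function rewriteLinks(string $html, string $proxy): string
{
    $doc = new DOMDocument();

    // Real-world markup rarely validates; collect libxml warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Select only anchors that actually carry an href attribute.
    $xpath = new DOMXPath($doc);
    foreach ($xpath->query('//a[@href]') as $a) {
        $a->setAttribute('href', $proxy . $a->getAttribute('href'));
    }

    // Serialize the whole (possibly normalized) document back to HTML.
    return $doc->saveHTML();
}

$content = '<html><body><a class="link" href="http://google.com">Google</a></body></html>';
echo rewriteLinks($content, 'http://website.com/fetch.php?url=');
```

Quote style, attribute order and extra classes no longer matter because the parser handles them. If the target URLs can contain their own query strings, wrapping the original href in urlencode() before prefixing is probably needed so that fetch.php receives the full address.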
