Extracting image paths from Wiki XML markup

I'm trying to parse the Wikipedia XML that I get from the Wikipedia XML export.

In one case I need to extract all image paths. The raw markup looks like this:

  [[Bild:nameOfImage.png|image description]]

"Bild" can also be "Image", "File" or "Datei".

To extract the text for an image I use this regex:

'|\[\[.*\|.*\]\]|U'

This works fine unless the image description contains another '[[ .. ]]', like:

[[Bild:nameOfImage.png|image Description with a [[new wiki link]] ]]
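To make the failure concrete, here is a minimal sketch (not part of the original post) that runs the ungreedy pattern above against the nested example. The variable names are purely illustrative.

```php
<?php
// The ungreedy pattern from the question, unchanged.
$pattern = '|\[\[.*\|.*\]\]|U';
$markup  = '[[Bild:nameOfImage.png|image Description with a [[new wiki link]] ]]';

preg_match($pattern, $markup, $m);

// The lazy ".*" stops at the FIRST "]]", i.e. the end of the inner link,
// so the trailing " ]]" of the image tag is cut off.
echo $m[0], "\n";
// prints: [[Bild:nameOfImage.png|image Description with a [[new wiki link]]
```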

My question is: how can I modify the regex to get all the text between the first "[[" and the last "]]" without having to count all the '[' and ']' characters?

Thanks in advance.

donljt2606 2013-03-28 15:50
Since you're using PHP, you can probably use recursive patterns.
Assuming you only need the whole match and not the individual parts:

/\[\[(((?>[^\[\]])|(?R))*)\]\]/U

Note that I haven't tried this regex, since I have no way to run PHP.

Edit:

preg_match('/\[\[(?>[^\[\]]|(?R))*\]\]/U', '[[Bild:nameOfImage.png|image Description with a [[new wiki link]] ]]', $array);
var_dump($array);

This seems to work.
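Building on the recursive idea, the following is a rough sketch of how the file name and description could be pulled out for the four prefixes mentioned in the question. The function name `extractWikiImages`, the named groups and the sample text are assumptions made for illustration, not part of the original answer. It uses a named subpattern `(?&link)` instead of `(?R)` so that only nested `[[ .. ]]` links recurse, and it assumes PHP 7 or later for the `??` operator.

```php
<?php
// Hypothetical helper: returns the name and description of each image tag.
function extractWikiImages(string $text): array
{
    $pattern = '~
        \[\[
        (?:Bild|Image|File|Datei):                # prefixes listed in the question
        (?<name>[^|\[\]]+)                        # file name, up to the first pipe or bracket
        (?:\|(?<desc>(?:[^\[\]]|(?&link))*))?     # optional description, may contain nested links
        \]\]
        (?(DEFINE)(?<link>\[\[(?:[^\[\]]|(?&link))*\]\]))   # a balanced [[ .. ]] link
    ~x';

    preg_match_all($pattern, $text, $matches, PREG_SET_ORDER);

    $images = [];
    foreach ($matches as $m) {
        $images[] = [
            'name'        => trim($m['name']),
            // The description group is missing when the tag has no pipe.
            'description' => trim($m['desc'] ?? ''),
        ];
    }
    return $images;
}

// Sample input made up for this sketch.
$text = 'Intro [[Bild:nameOfImage.png|image Description with a [[new wiki link]] ]] '
      . 'and [[File:other.jpg]] plus [[plain link]].';

print_r(extractWikiImages($text));
```

In this sketch everything after the first pipe ends up in the description, so options such as a thumbnail flag would need to be split off separately. A stray single `[` or `]` inside a description would also stop the pattern from matching that tag.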
This answer was accepted by the asker as the best answer.
