doulin4844 2018-07-26 05:24
Views: 30

Grabbing the infobox from the Wikipedia API using regex

I am trying to grab the infobox from the Wikipedia API using regex in PHP. Along with the required info, I am also getting some unnecessary information. How can I control this? I tried to get only the infobox class, but I guess it's not correct. Has anyone worked on grabbing the infobox contents? Is there an alternative to regex? Can anyone please help me with this?

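// Request section 0 of the page; rvparse asks for parsed HTML rather than wikitext.
$url = "http://en.wikipedia.org/w/api.php?action=query&prop=revisions&rvprop=content&format=json&titles=Haw_Par_Villa&rvsection=0&rvparse";
$data = json_decode(file_get_contents($url), true);
// 'pages' is keyed by page ID, so take the first (only) entry.
$data = current($data['query']['pages']);
// Greedy: matches from the first <table ...> through the last </table> in the section.
$regex = '#<\s*?table\b[^>]*>(.*)</table\b[^>]*>#s';
//$regex = '(?=\{Infobox)(\{([^{}]|(?1))*\})';
$code = preg_match($regex, $data["revisions"][0]['*'], $matches);
// Only print when the pattern actually matched.
if ($code === 1) {
    echo($matches[0]);
}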
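
On the question of an alternative to regex: since `rvparse` returns HTML, one option is to parse that HTML with PHP's bundled DOM extension and select the table by its class. The following is only a minimal sketch, not a tested answer against the live API. The helper name `extractInfobox` is hypothetical, and it assumes the infobox is rendered as a `<table>` whose class attribute includes `infobox`.

```php
<?php
// Hypothetical helper (not from the original post): returns the outer HTML of
// the first <table> whose class list contains "infobox", or null if none exists.
function extractInfobox(string $html): ?string
{
    $doc = new DOMDocument();
    // Collect libxml warnings about imperfect markup instead of printing them.
    $previous = libxml_use_internal_errors(true);
    // The leading XML declaration hints to libxml that the input is UTF-8.
    $doc->loadHTML('<?xml encoding="UTF-8">' . $html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    // Match "infobox" as a whole class token, so "infobox vcard" also matches.
    $query = '//table[contains(concat(" ", normalize-space(@class), " "), " infobox ")]';
    $nodes = $xpath->query($query);
    if ($nodes === false || $nodes->length === 0) {
        return null;
    }
    // Serialising the node keeps nested tables inside the infobox intact.
    return $doc->saveHTML($nodes->item(0));
}
```

With the code above, this would be called as `extractInfobox($data["revisions"][0]['*'])`. Because the parser tracks nesting, the result stops at the infobox's own closing tag, whereas the greedy `(.*)` pattern runs on to the last `</table>` in the section.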