duancenxiao0482 2018-07-25 05:07
Views: 71
Accepted

Get images from the Wikipedia API, excluding the .svg extension

I am trying to extract images from the Wikipedia API in my PHP page, but the results include unwanted images with the .svg extension. Is there a way to exclude .svg files, or to include only .jpg files, in the API request? I saw a variable called mediatype, but it did not work.

I am using the following API request URL:

 https://en.wikipedia.org/w/api.php?&redirects=1&action=query&titles=Basilica%20Cistern&prop=images&format=json&imlimit=15

And the response I am getting is below:

{
  "continue": {
    "imcontinue": "1365761|Peacock-eyed_column_in_the_Basilica_Cistern_in_Istanbul,Turkey,January_20,2014.jpg",
    "continue": "||"
  },
  "query": {
    "pages": {
      "1365761": {
        "pageid": 1365761,
        "ns": 0,
        "title": "Basilica Cistern",
        "images": [{
            "ns": 6,
            "title": "File:20131203 Istanbul 269.jpg"
          },
          {
            "ns": 6,
            "title": "File:Archaeological site icon (red).svg"
          },
          {
            "ns": 6,
            "title": "File:Basilica Cistern.jpg"
          },
          {
            "ns": 6,
            "title": "File:Basilica Cistern Constantinople 2007.jpg"
          },
          {
            "ns": 6,
            "title": "File:Basilica Cistern Constantinople 2007 011.jpg"
          },
          {
            "ns": 6,
            "title": "File:Basilica cistern Art.jpg"
          },
          {
            "ns": 6,
            "title": "File:Carp at the Basilica Cistern, Istanbul 2007.JPG"
          },
          {
            "ns": 6,
            "title": "File:Commons-logo.svg"
          },
          {
            "ns": 6,
            "title": "File:Head of Medusa, Basilica Cistern, Constantinople 01.jpg"
          },
          {
            "ns": 6,
            "title": "File:Head of Medusa, Basilica Cistern, Constantinople 02.jpg"
          },
          {
            "ns": 6,
            "title": "File:Location map Istanbul.png"
          }
        ]
      }
    }
  }
}

PHP CODE:

function getResults($json) {
    $results = array();
    $json_array = json_decode($json, true);

    foreach ($json_array['query']['pages'] as $page) {
        // Pages without images have no 'images' key at all.
        if (empty($page['images'])) {
            continue;
        }
        foreach ($page['images'] as $image) {
            // URL-encode the title so spaces, commas and parentheses are safe in the query string.
            $title = rawurlencode(str_replace(" ", "_", $image["title"]));
            $imageinfourl = "https://en.wikipedia.org/w/api.php?action=query&titles=" . $title . "&prop=imageinfo&iiprop=url&format=json";
            $imageinfo = curl($imageinfourl); // curl() is my own helper (not shown here)
            $image_array = json_decode($imageinfo, true);
            $image_pages = $image_array["query"]["pages"];

            foreach ($image_pages as $a) {
                $results[] = $a["imageinfo"][0]["url"];
            }
        }
    }

    return $results;
}

1 answer

  • dongxie8856 2018-07-25 05:31

    I can't see anything in the API that filters by file extension. I thought the imimages parameter might help, but it only matches entire titles, e.g.

    ...&imimages=File%3A20131203%20Istanbul%20269.jpg
    

    What you can do is filter the results after you receive them:

    // snip
    if(count($page['images']) > 0) {
        $jpgs = array_filter($page['images'], function($img) {
            return strtolower(pathinfo($img['title'], PATHINFO_EXTENSION)) === 'jpg';
        });
    
        foreach($jpgs as $image) {
            // and continue
    

    Alternatively, just check the extension inside your existing foreach loop and skip anything that is not a .jpg:

    foreach($page['images'] as $image) {
        if (strtolower(pathinfo($image['title'], PATHINFO_EXTENSION)) !== 'jpg') {
            continue;
        }
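
For reference, here is a minimal, self-contained sketch of the same client-side filtering idea. The helper name `filterImagesByExtension` and its `$allowed` parameter are hypothetical (they are not part of the original code or of the Wikipedia API), and the sample entries are copied from the response in the question. It also accepts `jpeg` as an extension, which is an assumption about what you want to keep.

```php
<?php
// Hypothetical helper: keep only the entries whose file extension is in
// $allowed. Extensions are compared case-insensitively, so ".JPG" matches.
function filterImagesByExtension(array $images, array $allowed = ['jpg', 'jpeg']): array
{
    $filtered = array_filter($images, function (array $img) use ($allowed): bool {
        $ext = strtolower(pathinfo($img['title'], PATHINFO_EXTENSION));
        return in_array($ext, $allowed, true);
    });

    // array_filter() preserves the original keys; reindex to get a plain list.
    return array_values($filtered);
}

// Sample entries taken from the API response shown in the question.
$images = [
    ['ns' => 6, 'title' => 'File:20131203 Istanbul 269.jpg'],
    ['ns' => 6, 'title' => 'File:Archaeological site icon (red).svg'],
    ['ns' => 6, 'title' => 'File:Carp at the Basilica Cistern, Istanbul 2007.JPG'],
    ['ns' => 6, 'title' => 'File:Commons-logo.svg'],
    ['ns' => 6, 'title' => 'File:Location map Istanbul.png'],
];

$jpgs = filterImagesByExtension($images);

foreach ($jpgs as $image) {
    echo $image['title'], PHP_EOL;
}
```

Filtering before the second loop also means no imageinfo request is made for files you are going to discard anyway.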
    