doujiao3346 2013-08-25 19:11

How to get all links (ids) of a specific Wikipedia page by pageid

I'm trying to build a query against the Wikipedia API that returns all internal links from a specific article as page ids. I have the pageId of an article; for example, the id of the article "Android (operating system)" is 12610483. On the client side I need to work only with ids, and later look up all information by id alone. My goal is to find all internal links (the ids of the linked articles) from a given article id.

Unfortunately, the only way I have found returns the links as article titles: http://en.wikipedia.org/w/api.php?action=parse&format=json&pageid=12610483&prop=links

Is there any other way to obtain ids of links as well and not only titles?

2 answers

  • doulou0882 2013-08-26 00:14

    What you want to do is to use action=query&prop=links to get data from the pagelinks database table, instead of parsing the page text.

    This will still give you only page titles (because a link can lead to a non-existent page, which implies no page id).

    But you can fix that by using prop=links as a generator:

    http://en.wikipedia.org/w/api.php?action=query&format=json&pageids=12610483&generator=links&gpllimit=max

    If the article has many links (like the one you suggested), you will need to use paging (see the gplcontinue element).
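
The generator query and its paging can be sketched in Python roughly as follows. This is only an illustration under a few assumptions: it uses `https` rather than the `http` URL above, the `User-Agent` string and the function names are made up for the example, and it accepts both the `continue` object returned by current versions of the API and the older `query-continue` form. Links to non-existent pages come back without a `pageid`, so they are skipped.

```python
import json
import urllib.parse
import urllib.request

# Assumption: https endpoint; the URLs in this thread use http.
API_URL = "https://en.wikipedia.org/w/api.php"


def build_params(page_id, cont=None):
    """Build the generator=links query, merging in continuation values if any."""
    params = {
        "action": "query",
        "format": "json",
        "pageids": str(page_id),
        "generator": "links",
        "gpllimit": "max",
    }
    if cont:
        params.update(cont)
    return params


def fetch_json(params):
    """Perform one API request and decode the JSON response."""
    url = API_URL + "?" + urllib.parse.urlencode(params)
    # Hypothetical User-Agent; replace with one that identifies your client.
    req = urllib.request.Request(url, headers={"User-Agent": "link-id-example/0.1"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def linked_page_ids(page_id, fetch=fetch_json):
    """Collect the page ids of all existing pages linked from page_id."""
    ids = set()
    cont = None
    while True:
        data = fetch(build_params(page_id, cont))
        pages = data.get("query", {}).get("pages", {})
        for page in pages.values():
            # Missing (red-link) targets have no "pageid" key.
            if "pageid" in page:
                ids.add(page["pageid"])
        # Newer responses: "continue"; older ones: "query-continue" -> "links".
        cont = data.get("continue") or data.get("query-continue", {}).get("links")
        if not cont:
            return sorted(ids)
```

Calling `linked_page_ids(12610483)` would then query the live API for the article from the question. The `fetch` parameter exists only so the paging loop can be exercised without network access.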

    This answer was selected by the asker as the best answer.