dpa55065 2017-04-06 00:29

How can I download content that a web page gets from its database?

I'm new to Java programming, so my question may be silly! I'm building a website with Django in Python. I need to download some content from another site and show it on mine in real time. Sure, I can do this by downloading that page's HTML and scraping it (with bs4 and similar tools) to extract the data. The problem is that my target site uses JavaScript for interactive behavior, and when I try to download its contents (using Python's urllib or requests) it only sends me JavaScript templates. For example, I expect the contents to look like:

<td><a>data to scrape 1</a></td>
<td><a>data to scrape 2</a></td>
<td><a>data to scrape 3</a></td>
...

but what I actually get looks like:

<tr ng-repeat="toy in letter.list | filter:symbol_srch">
<td><a>{{toy.s}}</a></td>
<td>{{toy.n}}</td>
</tr>

It seems that the "toy" variable is supplied by the back end from a database.

Of course I could use a browser or a package (e.g. selenium) to render that site before scraping, but I don't have a browser on my server and I'm not permitted to install one or use a portable version!

My thinking is that since the site's back end sends those variables to my browser, and my browser can read and render them, I should be able to grab the variables and read them without any browser. Does anyone have an idea? Or is there a way to render the site's content with Python without any external software?


1 answer

  • dongmao9217 2017-04-06 00:58

    Open the Network tab in Chrome Developer Tools while you browse to that page and look for the API calls your browser makes. You can then inspect those requests and their responses, which contain the data the page renders, and call the same endpoints yourself.
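
Once the request has been found in the Network tab, it can usually be repeated from Python with no browser involved. The following is only a minimal sketch under stated assumptions: `API_URL` is a hypothetical placeholder for whatever request URL the Network tab shows, the response is assumed to be JSON shaped like `{"list": [{"s": ..., "n": ...}]}` (guessed from the `toy in letter.list` template in the question), and `extract_rows` and `fetch_rows` are illustrative helper names. The real endpoint may need other headers, cookies, or parameters.

```python
import json
import urllib.request

# Hypothetical endpoint: replace with the request URL seen in the Network tab.
API_URL = "https://example.com/api/letters/A"


def extract_rows(payload):
    """Return (s, n) pairs from a payload shaped like {"list": [{"s": ..., "n": ...}]}.

    The shape is an assumption based on the `toy in letter.list` template;
    adjust the keys to match the real response.
    """
    return [(toy.get("s"), toy.get("n")) for toy in payload.get("list", [])]


def fetch_rows(url=API_URL, timeout=10):
    """Download the JSON that the page's JavaScript would request and extract its rows."""
    request = urllib.request.Request(
        url,
        headers={
            # Some back ends only answer requests that look like the browser's.
            "Accept": "application/json",
            "User-Agent": "Mozilla/5.0",
        },
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        payload = json.loads(response.read().decode("utf-8"))
    return extract_rows(payload)


if __name__ == "__main__":
    # Offline demonstration with made-up data in the assumed shape.
    sample = json.loads(
        '{"list": [{"s": "AAA", "n": "First toy"}, {"s": "BBB", "n": "Second toy"}]}'
    )
    print(extract_rows(sample))
```

In a Django view, `fetch_rows()` could be called with the real URL and its result passed to the template. Only the standard library is used here, so nothing extra has to be installed on the server.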

    The asker selected this answer as the best answer.
