dongyan8896 2018-07-11 12:56

How to scrape data from a website that loads it through admin-ajax.php, using Scrapy

I am trying to scrape the reviews of Unibet Casino from this page: https://casinoplacard.com/unibet-casino-reviews-and-bonuses/

As I did for other review sources, I used Scrapy in Python to scrape the reviews with the code below:

    import scrapy
    from urllib.parse import urlparse


    class slotRunner_spyder(scrapy.Spider):
        name = "slotRunner_spyder"
        start_urls = [
            'https://casinoplacard.com/unibet-casino-reviews-and-bonuses/'
        ]
        # Counts how many review blocks the selector matched.
        count = 0

        def parse(self, response):
            # Base URL of the site (scheme + host); not used further below.
            parsed_uri = urlparse(response.url)
            domain = '{uri.scheme}://{uri.netloc}/'.format(uri=parsed_uri)

            # Each user review is expected in its own div.rwp-u-review block.
            for review in response.css('div.rwp-users-reviews > div.rwp-u-review'):
                self.count += 1
                yield {
                    'name': review.css('td a::text').extract_first(),
                    'date': review.css('td small::text').extract_first(),
                    # Returns the raw HTML of the comment div(s).
                    'review': review.css('div.rwp-u-review__content > div.rwp-u-review__comment').extract(),
                    'url': response.url
                }
            print(self.count)

But for this website it does not work. To understand why, I added a counter (self.count) and discovered that the loop runs only once, which is not what I expected.

Then I spent some time studying the site in DevTools and discovered that when the page loads, an XHR POST request is sent automatically to this URL: https://casinoplacard.com/wp-admin/admin-ajax.php

Looking into that request, I found the data for all 182 reviews under:

Preview >> Data >> Reviews

So could you please help me understand how to fetch that data?

Thank you very much!
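
For reference, here is a minimal sketch of one way the data behind that XHR could be read once the same POST is replayed. It is only an illustration under assumptions: the form fields in FORM_DATA are placeholders (the real names and values have to be copied from the request's "Form Data" panel in DevTools), and the JSON layout {"data": {"reviews": [...]}} is guessed from the Preview >> Data >> Reviews path described above. The helper name extract_reviews is hypothetical.

```python
import json

AJAX_URL = "https://casinoplacard.com/wp-admin/admin-ajax.php"

# Hypothetical form fields: these names and values are placeholders.
# Copy the real ones from the XHR request shown in DevTools.
FORM_DATA = {
    "action": "placeholder_action",
    "post_id": "0",
}


def extract_reviews(body_text):
    """Return the list of reviews found in the JSON body, or [] if absent.

    Assumes a payload shaped like {"data": {"reviews": [...]}}; the exact
    key names are a guess and may need adjusting to the real response.
    """
    payload = json.loads(body_text)
    data = payload.get("data") if isinstance(payload, dict) else None
    if not isinstance(data, dict):
        return []
    reviews = data.get("reviews") or []
    # If the reviews arrive as a dict keyed by id, keep only the values.
    if isinstance(reviews, dict):
        reviews = list(reviews.values())
    return reviews


if __name__ == "__main__":
    # Made-up sample body, only to show the call; not real site output.
    sample = '{"success": true, "data": {"reviews": [{"id": 1}, {"id": 2}]}}'
    print(len(extract_reviews(sample)))
```

Inside a spider, the request itself could be issued from parse with scrapy.FormRequest(AJAX_URL, formdata=FORM_DATA, callback=self.parse_reviews), where parse_reviews (a hypothetical callback name) calls extract_reviews(response.text) and yields one item per review. FormRequest encodes the form fields itself, so no manual URL encoding should be needed.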
