weixin_33698043 2019-12-26 10:50 Acceptance rate: 0%
Views: 40

Collecting JSON from an AJAX call

Background

Consider this URL:

base_url = "https://www.olx.bg/ad/sobstvenik-tristaen-kamenitsa-1-CID368-ID81i3H.html"

I want to make the AJAX call that returns the telephone number:

ajax_url = "https://www.olx.bg/ajax/misc/contact/phone/7XarI/?pt=e3375d9a134f05bbef9e4ad4f2f6d2f3ad704a55f7955c8e3193a1acde6ca02197caf76ffb56977ce61976790a940332147d11808f5f8d9271015c318a9ae729"

Wanted results

If I press the button on the site in Chrome, the browser console shows the wanted result:

{"value":"088 *****"}

Debugging

If I open a new tab and paste the ajax_url, I always get a placeholder value:

{"value":"000 000 000"}

If I try something like:

Bash:

wget "$ajax_url"

Python:

import requests

json_response = requests.get(ajax_url)

I just receive the HTML of the site's error-handling page.

Ideas

The browser must be sending something extra with the request. What is it? Maybe a cookie?

How do I get the wanted result with Bash/Python?

Edit

The HTTP status code of the HTML response is 200.

I have also tried with curl and I get the same HTML error page.
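
Because the status code is 200 in both cases, it cannot tell the JSON payload apart from the HTML error page; trying to parse the body can. A minimal sketch, where `extract_phone` and `PLACEHOLDER` are names invented for illustration, and treating `000 000 000` as "no number" is an assumption based on the output shown above:

```python
import json

# Assumption: this is the masked value the endpoint returns without a valid session.
PLACEHOLDER = "000 000 000"


def extract_phone(body):
    """Return the phone number from a response body, or None if it is unusable."""
    try:
        data = json.loads(body)
    except ValueError:
        # Not JSON at all, e.g. the HTML error page.
        return None
    if not isinstance(data, dict):
        return None
    value = data.get("value")
    if not isinstance(value, str) or value == PLACEHOLDER:
        return None
    return value
```

With `requests`, this could be called as `extract_phone(requests.get(ajax_url).text)`.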

Kind of a fix

I have noticed that if I copy the cookie from the browser and make the request with all the browser's headers, INCLUDING the cookie, I get the correct result:

# I think the most important header is the cookie
headers = DICT_WITH_HEADERS_FROM_BROWSER
json_response = requests.get(ajax_url, headers=headers)

Final question

The only question left is how can I generate a cookie through a Python script?
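
One possible approach, not verified against this site, is to let a `requests.Session` collect the cookies by loading the ad page before calling the AJAX URL. In the sketch below, `fetch_phone_json` is a hypothetical helper, the `Referer` and `X-Requested-With` headers are guesses at what the endpoint may expect, and `session` is assumed to be a `requests.Session` (any object with a compatible `get` method works). If the cookie is set by JavaScript rather than by a `Set-Cookie` header, or if the `pt` token in `ajax_url` is tied to the page load, this will not be enough on its own.

```python
def fetch_phone_json(session, page_url, phone_url):
    """Load the ad page so the session stores its cookies, then call the AJAX URL."""
    # First request: any Set-Cookie headers from the server end up in the
    # session's cookie jar.
    session.get(page_url)
    # Second request: the session re-sends the stored cookies automatically.
    # Both headers below are assumptions, not documented requirements.
    response = session.get(
        phone_url,
        headers={"Referer": page_url, "X-Requested-With": "XMLHttpRequest"},
    )
    return response.text
```

It could be used as `with requests.Session() as session: print(fetch_phone_json(session, base_url, ajax_url))`.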

2 answers

  • weixin_33725270 2019-12-26 11:45
    import time

    from bs4 import BeautifulSoup
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.firefox.options import Options

    # Run Firefox without opening a window.
    options = Options()
    options.add_argument('--headless')

    driver = webdriver.Firefox(options=options)
    try:
        driver.get(
            'https://www.olx.bg/ad/sobstvenik-tristaen-kamenitsa-1-CID368-ID81i3H.html')

        # Click the element that reveals the phone number; the browser sends
        # the AJAX request with its own cookies.
        driver.find_element(
            By.XPATH,
            "/html/body/div[3]/section/div[3]/div/div[1]/div[2]/div/ul[1]/li[2]/div/strong").click()
        time.sleep(2)  # crude wait for the AJAX response to update the page

        # Parse the updated page and read the revealed number.
        soup = BeautifulSoup(driver.page_source, 'html.parser')
        phone = soup.find("strong", {'class': 'xx-large'}).text
        print(phone)
    finally:
        driver.quit()  # always close the headless browser
    

    Output:

    088 558 9937
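
If the goal is to keep using `requests` for the AJAX call, the cookies of the Selenium session above can probably be reused: `driver.get_cookies()` returns a list of dictionaries with `name` and `value` keys (among others). A minimal sketch, assuming those cookies are what the endpoint checks; `selenium_cookies_to_dict` is a hypothetical helper and the cookie in the comment is a made-up example:

```python
def selenium_cookies_to_dict(selenium_cookies):
    """Reduce Selenium cookie records to the name -> value mapping requests accepts."""
    # Each record looks roughly like
    # {"name": "session_id", "value": "abc123", "domain": "...", "path": "/"}
    # (made-up values); only the name and value are kept here.
    return {cookie["name"]: cookie["value"] for cookie in selenium_cookies}
```

While the driver is still open, the result can be passed along as `requests.get(ajax_url, cookies=selenium_cookies_to_dict(driver.get_cookies()))`.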
    