weixin_52068710 2021-04-22 14:34 Acceptance rate: 92.3%
Views: 1125
Accepted

Switching proxy IPs dynamically with Selenium in Python (how to change only the IP on each page switch without opening a Chrome window)

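from time import sleep

from bs4 import BeautifulSoup
from selenium import webdriver


def ip_log(ip, port):
    PROXY = f"{ip}:{port}"  # put your proxy here, as ip:port

    chrome_options = webdriver.ChromeOptions()
    chrome_options.add_argument('--proxy-server=%s' % PROXY)
    global browser
    # executable_path is the Selenium 3 form; Selenium 4 takes service=Service('./chromedriver') instead
    browser = webdriver.Chrome(executable_path='./chromedriver', options=chrome_options)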

def data(value):


    # Run a JavaScript snippet to scroll to the bottom of the page
    browser.execute_script('window.scrollTo(0,document.body.scrollHeight)')
    sleep(2)
    global shop_name_list, shop_price_list, shop_people_list, shop_location_list,a
    shop_name_list = []
    shop_price_list = []
    shop_people_list = []
    shop_location_list = []
    ip_list = ['36.248.132.187', '36.248.132.23', '122.4.48.145']  # three proxy IPs
    port_list = [9999, 9999, 9999]  # port for each IP
    a = 0
    b = 44
    c = 0
    for i in range(1,6):
        page = browser.page_source
        soup = BeautifulSoup(page, 'lxml')
        shop_data_list = soup.find('div', class_='grid g-clearfix').find_all_next('div', class_='items')
        for shop_data in shop_data_list:
            # Product name
            shop_image_data = shop_data.find_all('div',class_='pic')
            for shop_data_a in shop_image_data:
                shop_data_a = shop_data_a.find_all('a',class_='pic-link J_ClickStat J_ItemPicA')
                for shop_name in shop_data_a:
                    shop_name = shop_name.find_all('img')[0]['alt']
                    shop_name_list.append(shop_name)
            # Product price
            shop_price_data = shop_data.find_all('div',class_='price g_price g_price-highlight')
            for shop_price in shop_price_data:
                shop_price_list.append(shop_price.text.strip())
            # Number of buyers
            shop_people_number_data = shop_data.find_all('div',class_='deal-cnt')
            for shop_people_number in shop_people_number_data:
                shop_people_list.append(shop_people_number.text)
            # Location
            shop_location_data = shop_data.find_all('div',class_='location')
            for shop_location in shop_location_data:
                shop_location_list.append(shop_location.text)
        # Switch to the next proxy IP. The original chain of plain `if` statements
        # fell through all three branches on every page and launched three browsers.
        ip_log(ip_list[c], port_list[c])
        c = (c + 1) % len(ip_list)
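        rows = list(zip(shop_name_list, shop_price_list, shop_people_list, shop_location_list))
        # The lists accumulate across pages, so print only the rows added on this page
        for row in rows[a:]:
            print(row)
        a = len(rows)
        # Load the page at offset b before advancing it, so s=44 is not skipped
        # (assuming the first page was loaded with s=0)
        browser.get(f"https://s.taobao.com/search?q={value}&s={b}")
        b += 44
        sleep(0.5)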

    print('Successfully scraped %s records' % a)
    return shop_name_list, shop_price_list, shop_people_list, shop_location_list,a
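
The proxy rotation and page offsets in the loop above can be kept apart from the scraping logic. A minimal sketch, assuming the same three proxies and the 44-item offset step from the question; `proxy_cycle`, `search_url`, and `PAGE_SIZE` are hypothetical names that do not appear in the original code, and "phone" is only a placeholder keyword:

```python
from itertools import cycle
from urllib.parse import urlencode

PAGE_SIZE = 44  # offset step used by the `s` parameter in the question's code


def proxy_cycle(ips, ports):
    """Yield 'ip:port' strings forever, wrapping around after the last one."""
    return cycle(f"{ip}:{port}" for ip, port in zip(ips, ports))


def search_url(keyword, page_index):
    """Build the search URL for a zero-based page index, URL-encoding the keyword."""
    query = urlencode({"q": keyword, "s": page_index * PAGE_SIZE})
    return f"https://s.taobao.com/search?{query}"


proxies = proxy_cycle(
    ["36.248.132.187", "36.248.132.23", "122.4.48.145"],
    [9999, 9999, 9999],
)
for page_index in range(1, 6):
    # One proxy per page; the fourth page wraps back to the first proxy.
    print(next(proxies), search_url("phone", page_index))
```

Using `urlencode` also takes care of keywords that contain spaces or non-ASCII characters, which the f-string in the question passes through unescaped.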

2 answers

  • coagenth 2021-04-22 17:15

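    I take it you mean that the browser window should not appear on the desktop each time you switch. Selenium still has to launch a browser, otherwise it cannot fetch any data. To hide the browser window, set it in the options: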

    chrome_options = webdriver.ChromeOptions()

    chrome_options.add_argument('--proxy-server=%s' % PROXY)

    Add the following line right after those two lines:

    chrome_options.add_argument('--headless')  # do not show the browser window
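
As far as I know, Chrome reads `--proxy-server` only when it starts, so changing the proxy still means starting a new driver; with `--headless` that restart simply happens without a visible window. A rough sketch of combining the two options and closing the previous driver before starting the next one. `build_chrome_args` and `restart_browser` are hypothetical helper names, not part of Selenium, and the sketch assumes Selenium can locate chromedriver on its own:

```python
def build_chrome_args(ip, port, headless=True):
    """Return the Chrome command-line arguments for one proxy."""
    args = [f"--proxy-server={ip}:{port}"]
    if headless:
        args.append("--headless")  # do not show a browser window
    return args


def restart_browser(old_browser, ip, port):
    """Quit the old driver (if any) and start a headless one on a new proxy."""
    # Imported here so build_chrome_args can be used without Selenium installed.
    from selenium import webdriver

    if old_browser is not None:
        old_browser.quit()  # close the previous Chrome process instead of leaking it
    options = webdriver.ChromeOptions()
    for arg in build_chrome_args(ip, port):
        options.add_argument(arg)
    # Assumption: chromedriver is discoverable; the question passes './chromedriver' explicitly.
    return webdriver.Chrome(options=options)


print(build_chrome_args("36.248.132.187", 9999))
```

In the question's loop this would be called as `browser = restart_browser(browser, ip_list[c], port_list[c])`, so only one Chrome process is alive at a time.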

    This answer was selected by the asker as the best answer.

Question events

  • Answer accepted on July 22