普通网友 2016-05-19 02:50 Acceptance rate: 66.7%
Views: 1340

Scraping a website with Python: why don't I get the correct response?

# -*- coding: utf-8 -*-
# Python 2 script

import sys
import time
import requests
#from lxml import etree
from PIL import Image
reload(sys)
sys.setdefaultencoding('utf-8')
timestamp = int(time.time())  # renamed so it no longer shadows the time module

session = requests.session()
user_agent = 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.87 Safari/537.36'
headers = {'User-Agent': user_agent, 'Host': '218.22.14.70:8088'}
#cookies={'JSESSIONID':'23323B4638EBB7CF3D0272A51AC5A7C3', 'clientlanguage':'zh_CN'}
#start_url='http://218.22.14.70:8088/SMEDS/repository.jspx'
#html=session.get(start_url,headers=headers)

# Download the captcha image, save it, and show it to the user
captchaUrl = 'http://218.22.14.70:8088/SMEDS/validateCode.jspx?type=1&id=' + str(timestamp)
print captchaUrl
html1 = session.get(captchaUrl, headers=headers)
captcha = html1.content
print type(captcha)
with open('captcha.jpg', "wb") as output:
    output.write(captcha)
Image.open('captcha.jpg').show()
captcha = raw_input("enter captcha:")

# Run the search query and print the response
url1 = 'http://218.22.14.70:8088/SMEDS/repository.jspx?checkNo=40&searchType=CX&entName=安徽&pageNo=&textfield2='
html1 = session.get(url1, headers=headers, cookies=html1.cookies)
info = html1.content
print type(info), info
print html1.headers

The response contains no query results. Any help would be appreciated.
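One thing that stands out in the script: the text typed at the prompt is stored in `captcha` but never used again, while the query URL carries a fixed `checkNo=40`. Whether `checkNo` is really the captcha field is only a guess from its name; the site itself would have to confirm it. A small sketch (Python 3, standard library only) that splits the hardcoded URL into its parameters makes it easy to see what the request actually sends:

```python
from urllib.parse import urlsplit, parse_qs

# The query URL exactly as written in the script above.
url1 = ("http://218.22.14.70:8088/SMEDS/repository.jspx"
        "?checkNo=40&searchType=CX&entName=安徽&pageNo=&textfield2=")

# keep_blank_values=True keeps empty parameters such as pageNo in the result.
params = parse_qs(urlsplit(url1).query, keep_blank_values=True)

for name, values in params.items():
    print(name, "=", repr(values[0]))
```

Comparing this list with the request the browser sends for the same search (visible in the browser's network panel) should show which parameter the typed captcha belongs in.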


1 answer

  • 普通网友 2016-10-04 07:32

    #coding=utf-8

    import sys
    import time
    import requests
    #from lxml import etree
    from PIL import Image
    reload(sys)
    sys.setdefaultencoding('utf-8')
    time=int(time.time())

    session=requests.session()
    user_agent='Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.87 Safari/537.36'
    headers={'User-Agent':user_agent,'Host':'218.22.14.70:8088'}
    #cookies={'JSESSIONID':'23323B4638EBB7CF3D0272A51AC5A7C3', 'clientlanguage':'zh_CN'}
    #start_url='http://218.22.14.70:8088/SMEDS/repository.jspx'
    #html=session.get(start_url,headers=headers)
    captchaUrl='http://218.22.14.70:8088/SMEDS/validateCode.jspx?type=1&id='+str(time)
    print captchaUrl
    html1=session.get(captchaUrl,headers=headers)
    captcha=html1.content
    print type(captcha)
    with open('captcha.jpg', "wb") as output:
        output.write(captcha)
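The code above stops once the captcha image has been saved. A minimal sketch of the remaining steps follows, written for Python 3. It rests on assumptions: that `checkNo` is the parameter meant to carry the captcha text (the question hardcodes `40` there), and that the server ties the captcha to the session cookie. `query_with_captcha` and `read_captcha` are hypothetical names introduced for illustration, and `session` is expected to behave like a `requests.Session`.

```python
import time

BASE = "http://218.22.14.70:8088/SMEDS"  # host and path taken from the question


def query_with_captcha(session, read_captcha, keyword):
    """Fetch the captcha, obtain its text, then search on the same session.

    `read_captcha` is a hypothetical callback: it receives the image bytes
    and returns the text the user (or an OCR step) reads from the image.
    """
    # Fetch the captcha; a requests.Session keeps any cookies the server sets,
    # so they do not need to be passed along by hand afterwards.
    captcha_resp = session.get(
        BASE + "/validateCode.jspx",
        params={"type": "1", "id": str(int(time.time()))},
    )
    code = read_captcha(captcha_resp.content)

    # Assumption: checkNo should hold the captcha text instead of a fixed 40.
    search_resp = session.get(
        BASE + "/repository.jspx",
        params={
            "checkNo": code,
            "searchType": "CX",
            "entName": keyword,
            "pageNo": "",
            "textfield2": "",
        },
    )
    return search_resp.text
```

With the real library this could be called as `query_with_captcha(requests.Session(), read_captcha, "安徽")`, where `read_captcha` saves the bytes to `captcha.jpg`, opens the image, and returns what the user types. The `User-Agent` header from the original script can be set once with `session.headers.update(...)`. Passing the query as `params` also lets `requests` percent-encode the Chinese keyword; whether the server expects UTF-8 or another encoding is not known from the thread.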
