cannot import name etree — error when running the script from the command line with tf.app.run()

It runs fine as a .py file, and also in the Python shell, but when the script is run from the command line (via tf.app.run()) it fails with "cannot import name etree". I haven't found any file or folder of my own that shares the module's name.
python 3.6
lxml 4.3.3
How should I deal with this?

1 answer

Don't give your own files the same name as the framework or library you are importing. Even when the two files are not in the same directory, a name clash like that can make the program fail in puzzling ways.

aguo718: Does that mean searching the whole disk? I still haven't found any file with the same name.
10 months ago
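Rather than searching the whole disk, it is usually enough to check what the interpreter actually imports. A diagnostic sketch (run it the same way the failing script is run, because sys.path differs between an interactive shell and a command-line script):

```python
import sys
print(sys.path)             # the script's own directory comes first and can shadow packages

import lxml
print(lxml.__file__)        # should point into site-packages, not into your project

from lxml import etree      # if this line fails, look for files or folders named
print(etree.__file__)       # lxml, lxml.py, etree.py or xml.py next to the script
```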
Related questions
raise etree.ParserError( lxml.etree.ParserError: Document is empty
Original code:
```
import requests
import lxml.html
import csv

doubanUrl = 'https://movie.douban.com/top250?start={}&filter='

def getSource(url):
    response = requests.get(url)
    response.encoding = 'utf-8'
    return response.content

def getEveryItem(source):
    selector = lxml.html.document_fromstring(source)
    movieitemlist = selector.Xpath('//div[@class="info"]')
    movieList = []
    for eachMovie in movieitemlist:
        movieDict = {}
        title = eachMovie.Xpath('div[@class="hd"/a/span[@class="title"]/text()')
        otherTitle = eachMovie.Xpath('div[@class="hd"/a/span[@class="other"]/text()')
        link = eachMovie.Xpath('div[@class="hd"/a/@href')
        star = eachMovie.Xpath('div[@class="bd"/div[@class="star"]/span[@class="rating_num"]/text()')
        quote = eachMovie.Xpath('div[@class="bd"/p[@class="quote"]/span/text()')
        movieDict['title'] = ''.join(title + otherTitle)
        movieDict['url'] = link
        movieDict['star'] = star
        movieDict['quote'] = quote
        print(movieDict)
        movieList.append(movieDict)
    return movieList

def writeData(movieList):
    with open('MovieDouban.csv', 'w', encoding='UTF-8') as f:
        writer = csv.DictWriter(f, fieldnames=['title', 'star', 'quote', 'url'])
        writer.writeheader()
        for each in movieList:
            write.writerow(each)

if __name__ == '__main__':
    movieList = []
    for i in range(10):
        pageLink = doubanUrl.format(i * 25)
        print(pageLink)
        source = getSource(pageLink)
        movieList += getEveryItem(source)  # movieList = movieList + getEveryItem(source)
        print(movieList[:10])
    writeData(movieList)
```
The error:
```
C:\Users\abc\AppData\Local\Programs\Python\Python38-32\python.exe C:/Users/abc/.PyCharmCE2019.3/config/scratches/scratch_1.py
https://movie.douban.com/top250?start=0&filter=
Traceback (most recent call last):
  File "C:/Users/abc/.PyCharmCE2019.3/config/scratches/scratch_1.py", line 63, in <module>
    movieList += getEveryItem(source)
  File "C:/Users/abc/.PyCharmCE2019.3/config/scratches/scratch_1.py", line 18, in getEveryItem
    selector = lxml.html.document_fromstring(source)
  File "C:\Users\abc\AppData\Local\Programs\Python\Python38-32\lib\site-packages\lxml\html\__init__.py", line 763, in document_fromstring
    raise etree.ParserError(
lxml.etree.ParserError: Document is empty

Process finished with exit code 1
```
How do I fix this error?
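A likely cause of "Document is empty" here is that Douban returns an empty or error body to requests that carry no browser-like User-Agent, so document_fromstring() has nothing to parse. A minimal sketch of that hypothesis (the headers are illustrative, not part of the asker's code). Note the original also spells the method Xpath instead of xpath, has mismatched brackets in its path expressions, and calls write.writerow instead of writer.writerow, all of which will surface once the page actually parses:

```python
import requests
import lxml.html

doubanUrl = 'https://movie.douban.com/top250?start={}&filter='

# Illustrative headers; without a browser-like User-Agent the site tends to
# answer with an empty/anti-bot body, which lxml reports as "Document is empty".
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64)'}

def getSource(url):
    response = requests.get(url, headers=headers)
    response.raise_for_status()      # fail loudly instead of parsing an empty body
    response.encoding = 'utf-8'
    return response.text

source = getSource(doubanUrl.format(0))
tree = lxml.html.document_fromstring(source)
# the lxml method is xpath(), lower-case
print(tree.xpath('//div[@class="info"]//span[@class="title"]/text()')[:5])
```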
Installing lxml on Python 3.5.0: importing lxml.html and lxml.etree fails
The OS is Windows 7, Python is 3.5.0, and lxml was installed from lxml-3.7.1-cp35-cp35m-win_amd64.whl (via pip). The installation succeeded and import lxml raises no error, but import lxml.html and import lxml.etree both fail with:
>>> import lxml.html
Traceback (most recent call last):
  File "<pyshell#11>", line 1, in <module>
    import lxml.html
  File "C:\Program Files\Python 3.5\lib\site-packages\lxml\html\__init__.py", line 54, in <module>
    from .. import etree
  File "type.pxd", line 9, in init lxml.etree (src\lxml\lxml.etree.c:220742)
ValueError: builtins.type has the wrong size, try recompiling. Expected 840, got 864
>>> import lxml.etree
Traceback (most recent call last):
  File "<pyshell#13>", line 1, in <module>
    import lxml.etree
  File "type.pxd", line 9, in init lxml.etree (src\lxml\lxml.etree.c:220742)
ValueError: builtins.type has the wrong size, try recompiling. Expected 840, got 864
What is the difference between etree.parse() and etree.HTML() when working with XPath?
As the title says: what's the difference between etree.parse() and etree.HTML()? I don't quite understand it and would appreciate some guidance.
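In short, etree.HTML() parses a string with a forgiving HTML parser and returns the root element, while etree.parse() reads from a file (or file-like object) and returns an ElementTree, using the strict XML parser unless another parser is passed in. A small sketch (the file name page.html is only for the example):

```python
from lxml import etree

html_text = '<html><body><p>hello</p>'        # deliberately not well-formed

# etree.HTML(): takes a string, repairs missing tags, returns the root Element.
root = etree.HTML(html_text)
print(root.xpath('//p/text()'))               # ['hello']

# etree.parse(): takes a file name / file object, returns an ElementTree.
# With the default XML parser this malformed markup would raise an error,
# so an HTMLParser is passed in explicitly.
with open('page.html', 'w', encoding='utf-8') as f:
    f.write(html_text)
tree = etree.parse('page.html', etree.HTMLParser())
print(tree.getroot().xpath('//p/text()'))     # ['hello']
```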
ImportError: No module named etree.ElementTree
import xml.etree.ElementTree as ET
per = ET.parse('test.xml')
lst_node = root.getiterator("video")
for node in lst_node:
    print_node(node)

Python version 2.7.3, yet it reports: No module named etree.ElementTree. I reinstalled 2.7.9 and the error is still there. Beginner here, hoping a guru can help.
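xml.etree.ElementTree ships with every modern Python installation, so this error usually means the standard-library xml package is being shadowed by a local file or folder named xml. A diagnostic-plus-corrected sketch; note the original snippet also uses root without defining it and calls the deprecated getiterator():

```python
import xml
print(xml.__file__)        # should point into the Python installation, not your project folder

import xml.etree.ElementTree as ET

tree = ET.parse('test.xml')
root = tree.getroot()              # define root from the parsed tree
for node in root.iter('video'):    # iter() replaces the deprecated getiterator()
    print(node.tag, node.attrib)
```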
lxml.etree has deprecated getchildren(); what should be used instead?
The situation is a bit more involved than the title suggests. I fetched a response body and parsed it with etree:
r = etree.HTML(requests.post(link, {"filter_by_status": pattern}, headers).text)
so that I can use XPath. Then I hit a problem: lxml has a getparent() method, but getchildren() is deprecated, so I can't get a node's children. I'm not looking for path expressions like //a[@id='3']/following-sibling::a[1]; I want a function, like getchildren(), that returns the child nodes. The Python docs ( http://omz-software.com/pythonista/docs/library/xml.etree.elementtree.html ) say: getchildren(): Deprecated since version 3.2: Use list(elem) or iteration. I don't know what list(elem) means, and "iteration" even less. Searching further I found ET.SubElement() (after import xml.etree.ElementTree as ET), which looked closest to what I need, but when I used it to query a node's children it raised: TypeError: SubElement() argument 1 must be xml.etree.ElementTree.Element, not lxml.etree._Element. So the node I get from xpath is an lxml.etree._Element, and that method only accepts xml.etree.ElementTree.Element, yet there is very little online about lxml.etree._Element. Why is getting a node's children so hard? Did I take a wrong turn at the very beginning? An example would be appreciated, thanks!
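"Use list(elem) or iteration" means an element already behaves like a sequence of its children; this holds for lxml elements as well as xml.etree ones. SubElement() is for creating a new child, not for querying existing ones. A small sketch:

```python
from lxml import etree

root = etree.HTML('<div id="box"><a id="1">x</a><a id="2">y</a></div>')
box = root.xpath('//div[@id="box"]')[0]

# list(elem): a plain Python list of the direct children.
children = list(box)              # [<Element a>, <Element a>]

# iteration: the same children, one at a time.
for child in box:
    print(child.tag, child.get('id'))

# lxml also has its own helpers alongside getparent():
for child in box.iterchildren():  # lxml-specific iterator over direct children
    print(child.tag)
print(box.getparent().tag)        # 'body' (lxml wraps fragments in html/body)

# Note: ET.SubElement() *creates* a new child element; it is not a query function,
# and it only accepts xml.etree Elements, which is why it rejected an lxml Element.
```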
What to do when a copied XPath doesn't match anything
Following the example in the book "Web Scraping from Scratch", I'm crawling the Douban Books Top 250 ( https://book.douban.com/top250 ). Before crawling I need XPath expressions for the book title, author and other fields, so I inspect the page in the browser, right-click, and use Copy XPath to get each element's XPath. ![screenshot](https://img-ask.csdn.net/upload/202002/15/1581778537_466127.png) The original code from the book is:
```
import csv
from lxml import etree
import requests

headers = {
    'user-agent':'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.87 Safari/537.36'
}
urls = ['https://book.douban.com/top250?start={}'.format(str(i)) for i in range(0,250,25)]
wenben = open('E:\demo.csv','wt',newline='',encoding='utf-8')
writer = csv.writer(wenben)
writer.writerow(('name','url','author','publisher','date','price','rate','comment'))
for url in urls:
    html = requests.get(url,headers=headers)
    selector = etree.HTML(html.text)
    infos = selector.xpath('//tr[@class="item"]')
    for info in infos:
        name = info.xpath('td/div/a/@title')[0]
        url = info.xpath('td/div/a/@href')[0]
        book_infos = info.xpath('td/p/text()')[0]
        author = book_infos.split('/')[0]
        publisher = book_infos.split('/')[-3]
        date = book_infos.split('/')[-2]
        price = book_infos.split('/')[-1]
        rate = info.xpath('td/div/span[2]/text()')[0]
        comments = info.xpath('td/div/span[2]/text()')[0]
        comment = comments[0] if len(comments) != 0 else "空"
        writer.writerow((name,url,author,publisher,date,price,rate,comment))
        print(name)
wenben.close()
print("输出完成!")
```
Taking the book title as an example, the XPath used in the book is
```
'td/div/a/@title'
```
but the XPath I get from the browser's Copy XPath is
```
*[@id="content"]/div/div[1]/div/table[1]/tbody/tr/td[2]/div[1]/a
```
Scraping with the XPath I copied returns nothing; only the book's XPath works. Why does the XPath copied from the browser differ from the one in the book, and how do I work out the correct XPath myself?
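The path from the browser's Copy XPath is an absolute path and typically contains a tbody element that browsers insert when rendering tables but that is often absent from the HTML the server actually sends, so the copied path matches nothing in the downloaded document. Hand-written relative paths, anchored on a node you have already selected, are more robust. A small sketch:

```python
from lxml import etree
import requests

headers = {'user-agent': 'Mozilla/5.0'}
html = requests.get('https://book.douban.com/top250', headers=headers).text
selector = etree.HTML(html)

# Select each result row first, then navigate relative to that row.
# An absolute browser-copied path (with /tbody/ in it) would have to match the
# served markup exactly, which it usually does not.
for row in selector.xpath('//tr[@class="item"]'):
    title = row.xpath('td/div/a/@title')
    print(title[0] if title else None)
```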
Python 2.7.3 with libxml2: import lxml.html fails, please advise
The system is Red Hat, which ships Python 2.6.6, but I need Scrapy so I installed Python 2.7.3 and installed libxml2 via yum install. After that, import lxml works, but import lxml.html fails with:
>>> import lxml.html
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.7/site-packages/lxml/html/__init__.py", line 12, in <module>
    from lxml import etree
  File "lxml.etree.pyx", line 89, in init lxml.etree (src/lxml/lxml.etree.c:140164)
TypeError: encode() argument 1 must be string without null bytes, not unicode
Any guidance from the Python gurus would be appreciated.
The crawler's response is complete, but after parsing with etree.HTML some content disappears and XPath can't locate it — why?
1. The response returned by the crawler is complete, but after parsing it with etree.HTML the content shrinks, so XPath can't locate the nodes. Why is that?
```
import requests
from lxml import etree

url = "https://tieba.baidu.com/f?fr=wwwt&kw=%E4%B8%8D%E8%89%AF%E4%BA%BA"
headers = {
    "User-Agent":"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.77 Safari/537.36"
}
response = requests.get(url,headers=headers).content.decode()
print(response)

html_str = etree.HTML(response)
print(etree.tostring(html_str).decode())
# li = html_str.xpath("//ul[@id='thread_list']/li[@class='j_thread_list clearfix']")
# print(li)
```
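One common reason on Tieba specifically is that the server ships the thread list wrapped in HTML comments (<!-- ... -->), which the page's JavaScript later unwraps in the browser. etree keeps that markup, but it lives inside comment nodes, so element XPath can't reach it. A sketch of recovering the list from the comments, assuming that is indeed the cause here:

```python
import requests
from lxml import etree

url = "https://tieba.baidu.com/f?fr=wwwt&kw=%E4%B8%8D%E8%89%AF%E4%BA%BA"
headers = {"User-Agent": "Mozilla/5.0"}
response = requests.get(url, headers=headers).content.decode()

html_str = etree.HTML(response)

# Pull the content of each comment node and re-parse it as HTML.
for c in html_str.xpath('//comment()'):
    text = (c.text or '').strip()
    if 'thread_list' not in text:
        continue
    fragment = etree.HTML(text)
    lis = fragment.xpath("//ul[@id='thread_list']/li[contains(@class, 'j_thread_list')]")
    print(len(lis), "thread items recovered from a commented-out block")
```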
XPath crawler retrieves incomplete data
While learning XPath crawling, XPath Helper finds 99 items, the last of which is "$PORT", as shown: ![screenshot](https://img-ask.csdn.net/upload/202001/15/1579057925_476322.png) With the code below, the same XPath only returns "$PORT" and the other 98 items are gone:
```
import requests
import csv
from lxml import etree

url = 'https://www.msccruisesusa.com/webapp/wcs/stores/servlet/MSC_SearchCruiseManagerRedirectCmd?storeId=12264&langId=-1004&catalogId=10001&monthsResult=&areaFilter=MED%40NOR%40&embarkFilter=&lengthFilter=&departureFrom=01.11.2020&departureTo=04.11.2020&ships=&category=&onlyAvailableCruises=true&packageTrf=false&packageTpt=false&packageCrol=false&packageCrfl=false&noAdults=2&noChildren=0&noJChildren=0&noInfant=0&dealsInput=false&tripSpecificationPanel=true&shipPreferencesPanel=false&dealsPanel=false'
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.117 Safari/537.36'}
source = requests.get(url,headers=headers).content.decode('UTF-8')
html = etree.HTML(source)
portList = html.xpath('//*[@class="cr-city-name"]')
for port in portList:
    print(port.xpath('string()'))
```
Please help; I can't work out where the problem is and haven't found a similar case online.
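Before distrusting the XPath itself, it is worth checking whether the values are present at all in the HTML that requests receives: XPath Helper runs in the browser after JavaScript has populated the page, while lxml only sees the initial server response. A debugging sketch that reuses the url and headers from the snippet above:

```python
source = requests.get(url, headers=headers).content.decode('UTF-8')

print('occurrences of cr-city-name:', source.count('cr-city-name'))
print('literal $PORT present:', '$PORT' in source)

# If only a handful of matches exist and '$PORT' shows up literally, the element
# is a client-side template placeholder; the real data is filled in later by
# JavaScript/XHR, so a static parser cannot see it. In that case look for the XHR
# endpoint in the browser's Network tab, or render the page (e.g. with Selenium).
```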
Program packaged with PyInstaller won't run
warn-xxx.txt文件的信息如下,求大佬处理 This file lists modules PyInstaller was not able to find. This does not necessarily mean this module is required for running you program. Python and Python 3rd-party packages include a lot of conditional or optional module. For example the module 'ntpath' only exists on Windows, whereas the module 'posixpath' only exists on Posix systems. Types if import: * top-level: imported at the top-level - look at these first * conditional: imported within an if-statement * delayed: imported from within a function * optional: imported within a try-except-statement IMPORTANT: Do NOT post this list to the issue-tracker. Use it as a basis for yourself tracking down the missing module. Thanks! missing module named pyimod03_importers - imported by PyInstaller.loader.pyimod02_archive (delayed, conditional), c:\program files\python37\lib\site-packages\PyInstaller\loader\rthooks\pyi_rth_pkgres.py (top-level) missing module named 'pkg_resources.extern.pyparsing' - imported by pkg_resources._vendor.packaging.requirements (top-level), pkg_resources._vendor.packaging.markers (top-level) missing module named 'com.sun' - imported by pkg_resources._vendor.appdirs (delayed, conditional, optional) missing module named com - imported by pkg_resources._vendor.appdirs (delayed) missing module named win32api - imported by distutils.msvccompiler (optional), pkg_resources._vendor.appdirs (delayed, conditional, optional) missing module named win32com.shell - imported by pkg_resources._vendor.appdirs (delayed, conditional, optional) missing module named _uuid - imported by uuid (optional) missing module named netbios - imported by uuid (delayed) missing module named win32wnet - imported by uuid (delayed) missing module named __builtin__ - imported by numpy.core.numerictypes (conditional), numpy.core.numeric (conditional), numpy.lib.function_base (conditional), numpy.lib._iotools (conditional), numpy.ma.core (conditional), numpy.distutils.misc_util (delayed, conditional), numpy (conditional), pymysql._compat (conditional), pkg_resources._vendor.pyparsing (conditional), setuptools._vendor.pyparsing (conditional) missing module named ordereddict - imported by pkg_resources._vendor.pyparsing (optional), setuptools._vendor.pyparsing (optional) missing module named StringIO - imported by PyInstaller.lib.modulegraph._compat (conditional), PyInstaller.lib.modulegraph.zipio (conditional), setuptools._vendor.six (conditional), numpy.lib.utils (delayed, conditional), numpy.lib.format (delayed, conditional), numpy.testing._private.utils (conditional), six (conditional), urllib3.packages.six (conditional), requests.compat (conditional), selenium.webdriver.remote.webelement (optional), pkg_resources._vendor.six (conditional) missing module named _scproxy - imported by urllib.request (conditional) missing module named 'macholib.MachO' - imported by PyInstaller.depend.dylib (delayed), PyInstaller.depend.bindepend (delayed), PyInstaller.utils.osx (top-level) missing module named macholib - imported by PyInstaller.depend.dylib (delayed, conditional) missing module named _pkgutil - imported by PyInstaller.lib.modulegraph.modulegraph (delayed, optional) missing module named dis3 - imported by PyInstaller.lib.modulegraph._compat (conditional) missing module named urllib.pathname2url - imported by urllib (conditional), PyInstaller.lib.modulegraph._compat (conditional) missing module named pyimod00_crypto_key - imported by PyInstaller.loader.pyimod02_archive (delayed, optional) missing module named thread - imported by 
numpy.core.arrayprint (conditional, optional), PyInstaller.loader.pyimod02_archive (conditional) missing module named 'macholib.dyld' - imported by PyInstaller.depend.bindepend (delayed) missing module named 'macholib.mach_o' - imported by PyInstaller.depend.bindepend (delayed) missing module named Crypto - imported by PyInstaller.building.makespec (delayed, conditional, optional) missing module named win32ctypes.core._time - imported by win32ctypes.core (top-level), win32ctypes.pywin32.win32api (top-level) missing module named win32ctypes.core._system_information - imported by win32ctypes.core (top-level), win32ctypes.pywin32.win32api (top-level) missing module named win32ctypes.core._resource - imported by win32ctypes.core (top-level), win32ctypes.pywin32.win32api (top-level) missing module named win32ctypes.core._dll - imported by win32ctypes.core (top-level), win32ctypes.pywin32.win32api (top-level) missing module named win32ctypes.core._common - imported by win32ctypes.core (top-level), win32ctypes.pywin32.win32api (top-level), win32ctypes.pywin32.win32cred (top-level) missing module named win32ctypes.core._authentication - imported by win32ctypes.core (top-level), win32ctypes.pywin32.win32cred (top-level) missing module named cffi - imported by win32ctypes.core (optional) missing module named UserDict - imported by PyInstaller.compat (conditional), pytz.lazy (optional) missing module named multiprocessing.set_start_method - imported by multiprocessing (top-level), multiprocessing.spawn (top-level) missing module named multiprocessing.get_start_method - imported by multiprocessing (top-level), multiprocessing.spawn (top-level) missing module named multiprocessing.TimeoutError - imported by multiprocessing (top-level), multiprocessing.pool (top-level) missing module named multiprocessing.get_context - imported by multiprocessing (top-level), multiprocessing.pool (top-level), multiprocessing.managers (top-level), multiprocessing.sharedctypes (top-level) missing module named multiprocessing.BufferTooShort - imported by multiprocessing (top-level), multiprocessing.connection (top-level) missing module named multiprocessing.AuthenticationError - imported by multiprocessing (top-level), multiprocessing.connection (top-level) missing module named pkg_resources.extern.packaging - imported by pkg_resources.extern (top-level), pkg_resources (top-level) missing module named pkg_resources.extern.appdirs - imported by pkg_resources.extern (top-level), pkg_resources (top-level) missing module named 'pkg_resources.extern.six.moves' - imported by pkg_resources (top-level), pkg_resources._vendor.packaging.requirements (top-level) missing module named pkg_resources.extern.six - imported by pkg_resources.extern (top-level), pkg_resources (top-level) missing module named 'multiprocessing.forking' - imported by c:\program files\python37\lib\site-packages\PyInstaller\loader\rthooks\pyi_rth_multiprocessing.py (optional) missing module named resource - imported by posix (top-level), E:\yxrj\dingzhi\cj\231.py (top-level) missing module named posix - imported by os (conditional, optional), E:\yxrj\dingzhi\cj\231.py (top-level) missing module named _posixsubprocess - imported by subprocess (conditional), multiprocessing.util (delayed), E:\yxrj\dingzhi\cj\231.py (top-level) missing module named readline - imported by cmd (delayed, conditional, optional), code (delayed, conditional, optional), pdb (delayed, optional), E:\yxrj\dingzhi\cj\231.py (top-level) excluded module named _frozen_importlib - imported by 
importlib (optional), importlib.abc (optional), PyInstaller.loader.pyimod02_archive (delayed, conditional), E:\yxrj\dingzhi\cj\231.py (top-level) missing module named _frozen_importlib_external - imported by importlib._bootstrap (delayed), importlib (optional), importlib.abc (optional), E:\yxrj\dingzhi\cj\231.py (top-level) missing module named _winreg - imported by platform (delayed, optional), numpy.distutils.cpuinfo (delayed, conditional, optional), requests.utils (delayed, conditional, optional), selenium.webdriver.firefox.firefox_binary (delayed, optional), E:\yxrj\dingzhi\cj\231.py (top-level), pkg_resources._vendor.appdirs (delayed) missing module named java - imported by platform (delayed), E:\yxrj\dingzhi\cj\231.py (top-level) missing module named 'java.lang' - imported by platform (delayed, optional), xml.sax._exceptions (conditional), E:\yxrj\dingzhi\cj\231.py (top-level) missing module named vms_lib - imported by platform (delayed, conditional, optional), E:\yxrj\dingzhi\cj\231.py (top-level) missing module named termios - imported by tty (top-level), getpass (optional), E:\yxrj\dingzhi\cj\231.py (top-level) missing module named urllib.getproxies_environment - imported by urllib (conditional), requests.compat (conditional) missing module named urllib.proxy_bypass_environment - imported by urllib (conditional), requests.compat (conditional) missing module named urllib.proxy_bypass - imported by urllib (conditional), requests.compat (conditional) missing module named urllib.getproxies - imported by urllib (conditional), requests.compat (conditional) missing module named urllib.unquote_plus - imported by urllib (conditional), requests.compat (conditional) missing module named urllib.quote_plus - imported by urllib (conditional), requests.compat (conditional) missing module named urllib.unquote - imported by urllib (conditional), requests.compat (conditional) missing module named urllib.urlencode - imported by urllib (optional), urllib3.packages.rfc3986.compat (optional), requests.compat (conditional) missing module named urllib.quote - imported by urllib (optional), urllib3.packages.rfc3986.compat (optional), requests.compat (conditional) missing module named grp - imported by shutil (optional), tarfile (optional), pathlib (delayed), distutils.archive_util (optional), E:\yxrj\dingzhi\cj\231.py (top-level) missing module named 'org.python' - imported by pickle (optional), xml.sax (delayed, conditional), setuptools.sandbox (conditional), E:\yxrj\dingzhi\cj\231.py (top-level) missing module named org - imported by copy (optional), E:\yxrj\dingzhi\cj\231.py (top-level) missing module named pwd - imported by posixpath (delayed, conditional), shutil (optional), tarfile (optional), http.server (delayed, optional), webbrowser (delayed), pathlib (delayed, conditional, optional), distutils.util (delayed, conditional), distutils.archive_util (optional), netrc (delayed, conditional), getpass (delayed), E:\yxrj\dingzhi\cj\231.py (top-level) missing module named urllib2 - imported by numpy.lib._datasource (delayed, conditional), requests.compat (conditional), selenium.webdriver.common.utils (delayed, optional), selenium.webdriver.common.service (delayed, optional) missing module named urlparse - imported by numpy.lib._datasource (delayed, conditional), requests.compat (conditional), selenium.webdriver.remote.remote_connection (optional) runtime module named urllib3.packages.six.moves - imported by http.client (top-level), urllib3.connectionpool (top-level), urllib3.util.response (top-level), 
'urllib3.packages.six.moves.urllib' (top-level), urllib3.response (top-level), urllib3.util.queue (top-level) missing module named 'OpenSSL.crypto' - imported by urllib3.contrib.pyopenssl (delayed) missing module named 'cryptography.x509' - imported by urllib3.contrib.pyopenssl (delayed, optional) missing module named 'cryptography.hazmat' - imported by pymysql._auth (optional), urllib3.contrib.pyopenssl (top-level) missing module named cryptography - imported by pymysql._auth (optional), urllib3.contrib.pyopenssl (top-level), requests (optional) missing module named OpenSSL - imported by urllib3.contrib.pyopenssl (top-level) missing module named 'backports.ssl_match_hostname' - imported by setuptools.ssl_support (optional), urllib3.packages.ssl_match_hostname (optional) missing module named brotli - imported by urllib3.util.request (optional), urllib3.response (optional) missing module named "'urllib3.packages.six.moves.urllib'.parse" - imported by urllib3.request (top-level), urllib3.poolmanager (top-level) missing module named Queue - imported by urllib3.util.queue (conditional) missing module named httplib - imported by selenium.webdriver.safari.webdriver (optional), selenium.webdriver.blackberry.webdriver (optional), selenium.webdriver.webkitgtk.webdriver (optional) missing module named cStringIO - imported by selenium.webdriver.firefox.firefox_profile (optional) missing module named copy_reg - imported by numpy.core (conditional), soupsieve.util (conditional), cStringIO (top-level) missing module named 'backports.functools_lru_cache' - imported by soupsieve.util (conditional) missing module named iconv_codec - imported by bs4.dammit (optional) missing module named cchardet - imported by bs4.dammit (optional) missing module named lxml - imported by bs4.builder._lxml (top-level) missing module named 'html5lib.treebuilders' - imported by bs4.builder._html5lib (optional) missing module named 'html5lib.constants' - imported by bs4.builder._html5lib (top-level) missing module named html5lib - imported by bs4.builder._html5lib (top-level) missing module named Cookie - imported by requests.compat (conditional) missing module named cookielib - imported by requests.compat (conditional) missing module named simplejson - imported by pandas.util._print_versions (delayed, conditional, optional), requests.compat (optional) missing module named socks - imported by urllib3.contrib.socks (optional) missing module named _dummy_threading - imported by dummy_threading (optional) missing module named ConfigParser - imported by numpy.distutils.system_info (conditional), numpy.distutils.npy_pkg_config (conditional), pymysql.optionfile (conditional) missing module named scipy - imported by numpy.testing._private.nosetester (delayed, conditional), pandas.core.missing (delayed) missing module named numexpr - imported by pandas.core.computation.expressions (conditional), pandas.core.computation.engines (delayed) missing module named 'scipy.stats' - imported by pandas.plotting._matplotlib.hist (delayed), pandas.plotting._matplotlib.misc (delayed, conditional), pandas.core.nanops (delayed, conditional) missing module named 'scipy.signal' - imported by pandas.core.window (delayed, conditional) missing module named commands - imported by numpy.distutils.cpuinfo (conditional) missing module named setuptools.extern.packaging - imported by setuptools.extern (top-level), setuptools.dist (top-level), setuptools.command.egg_info (top-level) missing module named 'setuptools.extern.six' - imported by setuptools 
(top-level), setuptools.extension (top-level) missing module named setuptools.extern.six.moves.filterfalse - imported by setuptools.extern.six.moves (top-level), setuptools.dist (top-level), setuptools.msvc (top-level) missing module named setuptools.extern.six.moves.filter - imported by setuptools.extern.six.moves (top-level), setuptools.dist (top-level), setuptools.ssl_support (top-level), setuptools.command.py36compat (top-level) missing module named _manylinux - imported by setuptools.pep425tags (delayed, optional) missing module named wincertstore - imported by setuptools.ssl_support (delayed, optional) missing module named backports - imported by setuptools.ssl_support (optional) missing module named 'setuptools._vendor.six.moves' - imported by 'setuptools._vendor.six.moves' (top-level) missing module named 'setuptools.extern.pyparsing' - imported by setuptools._vendor.packaging.requirements (top-level), setuptools._vendor.packaging.markers (top-level) missing module named 'setuptools.extern.packaging.version' - imported by setuptools.msvc (top-level) missing module named setuptools.extern.six.moves.map - imported by setuptools.extern.six.moves (top-level), setuptools.dist (top-level), setuptools.command.easy_install (top-level), setuptools.sandbox (top-level), setuptools.package_index (top-level), setuptools.ssl_support (top-level), setuptools.command.egg_info (top-level), setuptools.namespaces (top-level) runtime module named setuptools.extern.six.moves - imported by setuptools.dist (top-level), setuptools.py33compat (top-level), setuptools.command.easy_install (top-level), setuptools.sandbox (top-level), setuptools.command.setopt (top-level), setuptools.package_index (top-level), setuptools.ssl_support (top-level), setuptools.command.egg_info (top-level), setuptools.command.py36compat (top-level), setuptools.namespaces (top-level), setuptools.msvc (top-level), 'setuptools._vendor.six.moves' (top-level) missing module named setuptools.extern.six - imported by setuptools.extern (top-level), setuptools.monkey (top-level), setuptools.dist (top-level), setuptools.extern.six.moves (top-level), setuptools.py33compat (top-level), setuptools.config (top-level), setuptools.command.easy_install (top-level), setuptools.sandbox (top-level), setuptools.py27compat (top-level), setuptools.package_index (top-level), setuptools.wheel (top-level), setuptools.command.egg_info (top-level), setuptools.command.sdist (top-level), setuptools.command.bdist_egg (top-level), setuptools.unicode_utils (top-level), setuptools.glob (top-level), setuptools.command.develop (top-level) missing module named 'numpy_distutils.cpuinfo' - imported by numpy.f2py.diagnose (delayed, conditional, optional) missing module named 'numpy_distutils.fcompiler' - imported by numpy.f2py.diagnose (delayed, conditional, optional) missing module named 'numpy_distutils.command' - imported by numpy.f2py.diagnose (delayed, conditional, optional) missing module named numpy_distutils - imported by numpy.f2py.diagnose (delayed, optional) missing module named 'nose.plugins' - imported by numpy.testing._private.noseclasses (top-level), numpy.testing._private.nosetester (delayed) missing module named numpy.core.number - imported by numpy.core (delayed), numpy.testing._private.utils (delayed) missing module named numpy.core.signbit - imported by numpy.core (delayed), numpy.testing._private.utils (delayed) missing module named numpy.core.float64 - imported by numpy.core (delayed), numpy.testing._private.utils (delayed) missing module named 
numpy.core.integer - imported by numpy.core (top-level), numpy.fft.helper (top-level) missing module named numpy.core.conjugate - imported by numpy.core (top-level), numpy.fft.pocketfft (top-level) missing module named numpy.core.sign - imported by numpy.core (top-level), numpy.linalg.linalg (top-level) missing module named numpy.core.divide - imported by numpy.core (top-level), numpy.linalg.linalg (top-level) missing module named numpy.core.object_ - imported by numpy.core (top-level), numpy.linalg.linalg (top-level) missing module named numpy.core.geterrobj - imported by numpy.core (top-level), numpy.linalg.linalg (top-level) missing module named numpy.core.sqrt - imported by numpy.core (top-level), numpy.linalg.linalg (top-level), numpy.fft.pocketfft (top-level) missing module named numpy.core.add - imported by numpy.core (top-level), numpy.linalg.linalg (top-level) missing module named numpy.core.complexfloating - imported by numpy.core (top-level), numpy.linalg.linalg (top-level) missing module named numpy.core.inexact - imported by numpy.core (top-level), numpy.linalg.linalg (top-level) missing module named numpy.core.cdouble - imported by numpy.core (top-level), numpy.linalg.linalg (top-level) missing module named numpy.core.csingle - imported by numpy.core (top-level), numpy.linalg.linalg (top-level) missing module named numpy.core.double - imported by numpy.core (top-level), numpy.linalg.linalg (top-level) missing module named numpy.core.single - imported by numpy.core (top-level), numpy.linalg.linalg (top-level) missing module named numpy.core.float32 - imported by numpy.core (top-level), numpy.testing._private.utils (top-level) missing module named numpy.core.intp - imported by numpy.core (top-level), numpy.testing._private.utils (top-level), numpy.linalg.linalg (top-level) missing module named numpy.eye - imported by numpy (delayed), numpy.core.numeric (delayed) missing module named dummy_thread - imported by numpy.core.arrayprint (conditional, optional) missing module named 'nose.util' - imported by numpy.testing._private.noseclasses (top-level) missing module named nose - imported by numpy.testing._private.utils (delayed, optional), numpy.testing._private.decorators (delayed), numpy.testing._private.noseclasses (top-level) missing module named win32pdh - imported by numpy.testing._private.utils (delayed, conditional) missing module named __svn_version__ - imported by numpy.f2py.__version__ (optional) missing module named numarray - imported by numpy.distutils.system_info (delayed, conditional, optional) missing module named Numeric - imported by numpy.distutils.system_info (delayed, conditional, optional) missing module named win32con - imported by distutils.msvccompiler (optional) missing module named _curses - imported by curses (top-level), curses.has_key (top-level) missing module named pytest - imported by numpy._pytesttester (delayed), pandas.util._tester (delayed, optional), pandas.util.testing (delayed, conditional, optional) missing module named future_builtins - imported by numpy.lib.npyio (conditional) missing module named cpickle - imported by numpy.compat.py3k (conditional) missing module named pickle5 - imported by numpy.compat.py3k (conditional, optional) missing module named numpy.histogramdd - imported by numpy (delayed), numpy.lib.twodim_base (delayed) missing module named numpy.lib.i0 - imported by numpy.lib (top-level), numpy.dual (top-level) missing module named 'scipy.sparse' - imported by pandas.core.sparse.scipy_sparse (delayed), 
pandas.core.arrays.sparse (delayed), pandas.core.dtypes.common (delayed, conditional, optional) missing module named botocore - imported by pandas.io.s3 (delayed) missing module named 'pyarrow.parquet' - imported by pandas.io.parquet (delayed) missing module named pyarrow - imported by pandas.io.feather_format (delayed) missing module named contextmanager - imported by dateutil.tz.tz (optional) runtime module named six.moves - imported by dateutil.tz.tz (top-level), dateutil.tz.win (top-level), dateutil.rrule (top-level) missing module named six.moves.range - imported by six.moves (top-level), dateutil.rrule (top-level) missing module named dateutil.tz.tzfile - imported by dateutil.tz (top-level), dateutil.zoneinfo (top-level) missing module named dateutil.tz.tzlocal - imported by dateutil.tz (top-level), dateutil.rrule (top-level) missing module named dateutil.tz.tzutc - imported by dateutil.tz (top-level), dateutil.rrule (top-level) missing module named PyQt4 - imported by pandas.io.clipboard.clipboards (delayed, optional), pandas.io.clipboard (delayed, conditional, optional) missing module named PyQt5 - imported by pandas.io.clipboard.clipboards (delayed, optional), pandas.io.clipboard (delayed, conditional, optional) missing module named qtpy - imported by pandas.io.clipboard.clipboards (delayed, optional), pandas.io.clipboard (delayed, conditional, optional) missing module named 'sqlalchemy.types' - imported by pandas.io.sql (delayed, conditional) missing module named 'sqlalchemy.schema' - imported by pandas.io.sql (delayed, conditional) missing module named sqlalchemy - imported by pandas.io.sql (delayed, conditional, optional) missing module named tables - imported by pandas.io.pytables (delayed, conditional) missing module named xlwt - imported by pandas.io.excel._xlwt (delayed) missing module named xlsxwriter - imported by pandas.io.excel._xlsxwriter (delayed) missing module named 'openpyxl.styles' - imported by pandas.io.excel._openpyxl (delayed) missing module named 'openpyxl.style' - imported by pandas.io.excel._openpyxl (delayed) missing module named openpyxl - imported by pandas.io.excel._openpyxl (delayed, conditional) missing module named xlrd - imported by pandas.io.excel._xlrd (delayed) missing module named 'odf.namespaces' - imported by pandas.io.excel._odfreader (delayed) missing module named 'odf.table' - imported by pandas.io.excel._odfreader (delayed) missing module named 'odf.opendocument' - imported by pandas.io.excel._odfreader (delayed) missing module named odf - imported by pandas.io.excel._odfreader (delayed) missing module named matplotlib - imported by pandas.plotting._matplotlib.boxplot (top-level), pandas.plotting._matplotlib.compat (delayed, optional), pandas.plotting._matplotlib.timeseries (delayed), pandas.plotting._matplotlib.core (delayed), pandas.io.formats.style (optional) missing module named 'matplotlib.pyplot' - imported by pandas.plotting._matplotlib.style (delayed), pandas.plotting._matplotlib.tools (delayed), pandas.plotting._matplotlib.core (delayed), pandas.plotting._matplotlib.timeseries (delayed), pandas.plotting._matplotlib.boxplot (delayed), pandas.plotting._matplotlib.hist (delayed), pandas.plotting._matplotlib.misc (delayed), pandas.plotting._matplotlib (delayed), pandas.io.formats.style (optional), pandas.util.testing (delayed) missing module named numpy.array - imported by numpy (top-level), numpy.ma.core (top-level), numpy.ma.extras (top-level), numpy.ma.mrecords (top-level), numpy.ctypeslib (top-level) missing module named 
numpy.recarray - imported by numpy (top-level), numpy.ma.mrecords (top-level) missing module named numpy.ndarray - imported by numpy (top-level), numpy.ma.core (top-level), numpy.ma.extras (top-level), numpy.ma.mrecords (top-level), numpy.ctypeslib (top-level), pandas.compat.numpy.function (top-level) missing module named numpy.dtype - imported by numpy (top-level), numpy.ma.mrecords (top-level), numpy.ctypeslib (top-level) missing module named numpy.bool_ - imported by numpy (top-level), numpy.ma.core (top-level), numpy.ma.mrecords (top-level) missing module named 'matplotlib.ticker' - imported by pandas.plotting._matplotlib.converter (top-level), pandas.plotting._matplotlib.tools (top-level), pandas.plotting._matplotlib.core (delayed) missing module named 'matplotlib.table' - imported by pandas.plotting._matplotlib.tools (top-level) missing module named 'matplotlib.colors' - imported by pandas.plotting._matplotlib.style (top-level) missing module named 'matplotlib.cm' - imported by pandas.plotting._matplotlib.style (top-level) missing module named 'matplotlib.patches' - imported by pandas.plotting._matplotlib.misc (top-level) missing module named 'matplotlib.lines' - imported by pandas.plotting._matplotlib.misc (top-level) missing module named 'matplotlib.axes' - imported by pandas.plotting._matplotlib.core (delayed) missing module named 'matplotlib.units' - imported by pandas.plotting._matplotlib.converter (top-level) missing module named 'matplotlib.transforms' - imported by pandas.plotting._matplotlib.converter (top-level) missing module named 'matplotlib.dates' - imported by pandas.plotting._matplotlib.converter (top-level) missing module named numpy.expand_dims - imported by numpy (top-level), numpy.ma.core (top-level) missing module named numpy.iscomplexobj - imported by numpy (top-level), numpy.ma.core (top-level) missing module named numpy.amin - imported by numpy (top-level), numpy.ma.core (top-level) missing module named numpy.amax - imported by numpy (top-level), numpy.ma.core (top-level) missing module named 'IPython.core' - imported by pandas.io.formats.printing (delayed, conditional) missing module named IPython - imported by pandas.io.formats.printing (delayed) missing module named s3fs - imported by pandas.io.common (delayed, optional) missing module named sets - imported by pytz.tzinfo (optional) missing module named numpy.random.randn - imported by numpy.random (top-level), pandas.util.testing (top-level) missing module named numpy.random.rand - imported by numpy.random (top-level), pandas.util.testing (top-level) missing module named hypothesis - imported by pandas.util._tester (delayed, optional) missing module named 'lxml.etree' - imported by pandas.io.html (delayed) missing module named 'lxml.html' - imported by pandas.io.html (delayed)
openpyxl: error when saving
As soon as save() is called it raises: TypeError: got invalid input value of type <class 'xml.etree.ElementTree.Element'>, expected string or Element. Is one of the libraries installed incorrectly?
After a Python crawler stores data in an Excel file, how do I visualize the data?
I used Python to crawl some data from Qidian and saved it to Excel; now I want to visualize that data. How should I write it?

import requests
from lxml import etree
from openpyxl import Workbook

class Book():
    def __init__(p):
        p.url = 'https://www.qidian.com/rank/hotsales?page={页数}'
        p.wb = Workbook()  # instantiate the workbook
        p.ws = p.wb.active  # activate the worksheet
        p.ws.append(['书名', '作者', '类型', '连载状态'])  # add the header row

    def geturl(p):
        url = [p.url.format(页数 =i) for i in range(1,15)]
        return url

    def parse_url(p,url):
        response =requests.get(url,timeout = 5)
        return response.content.decode('utf-8','ignore')

    def get_list(p,html_str):
        html = etree.HTML(html_str)
        connect_list = []
        lists = html.xpath("//div[@class='book-img-text']/ul/li//div[@class='book-mid-info']")
        for list in lists:
            item = {}
            item['书名'] = ''.join(list.xpath("./h4/a/text()"))
            item['作者'] = ''.join(list.xpath("./p[@class='author']/a[1]/text()"))
            item['类型'] = ''.join(list.xpath("./p[@class='author']/a[2]/text()"))
            item['连载状态'] = ''.join(list.xpath("./p[@class='author']/span/text()"))
            connect_list.append(item)
        return connect_list

    def save_list(p, connects):
        for connect in connects:
            p.ws.append([connect['书名'], connect['作者'], connect['类型'], connect['连载状态']])
        print('保存小说信息成功')

    def run(p):
        url_list = p.geturl()
        for url in url_list:
            html_url =p.parse_url(url)
            connects = p.get_list(html_url)
            p.save_list(connects[:])
        p.wb.save('book.xlsx')

if __name__=='__main__':
    spider = Book()
    spider.run()
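A minimal sketch of one way to visualize the result, assuming pandas and matplotlib are installed: read the workbook the spider produced and plot how many books fall into each genre (the 类型 column).

```python
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_excel('book.xlsx')          # columns: 书名, 作者, 类型, 连载状态

counts = df['类型'].value_counts()        # number of books per genre
counts.plot(kind='bar')
plt.xlabel('genre')
plt.ylabel('number of books')
plt.tight_layout()
plt.savefig('books_by_genre.png')        # or plt.show() in an interactive session
```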
openpyxl: error when generating an xlsx file
python 3.6, openpyxl 3.0.2
```python
from openpyxl import Workbook
from openpyxl.utils import get_column_letter

wb = Workbook()
dest_filename = 'empty_book.xlsx'

ws1 = wb.active
ws1.title = "range names"
for row in range(1, 40):
    ws1.append(range(600))

ws2 = wb.create_sheet(title="Pi")
ws2['F5'] = 3.14

ws3 = wb.create_sheet(title="Data")
for row in range(10, 20):
    for col in range(27, 54):
        _ = ws3.cell(column=col, row=row, value="{0}".format(get_column_letter(col)))
print(ws3['AA10'].value)

wb.save(filename = dest_filename)
```
Traceback (most recent call last):
  File "C:/Users/Administrator/Desktop/python 报价单生成脚本/111.py", line 36, in <module>
    wb.save(filename = dest_filename)
  File "D:\Anaconda3\lib\site-packages\openpyxl\workbook\workbook.py", line 408, in save
    save_workbook(self, filename)
  File "D:\Anaconda3\lib\site-packages\openpyxl\writer\excel.py", line 293, in save_workbook
    writer.save()
  File "D:\Anaconda3\lib\site-packages\openpyxl\writer\excel.py", line 275, in save
    self.write_data()
  File "D:\Anaconda3\lib\site-packages\openpyxl\writer\excel.py", line 75, in write_data
    self._write_worksheets()
  File "D:\Anaconda3\lib\site-packages\openpyxl\writer\excel.py", line 215, in _write_worksheets
    self.write_worksheet(ws)
  File "D:\Anaconda3\lib\site-packages\openpyxl\writer\excel.py", line 200, in write_worksheet
    writer.write()
  File "D:\Anaconda3\lib\site-packages\openpyxl\worksheet\_writer.py", line 354, in write
    self.write_top()
  File "D:\Anaconda3\lib\site-packages\openpyxl\worksheet\_writer.py", line 98, in write_top
    self.write_properties()
  File "D:\Anaconda3\lib\site-packages\openpyxl\worksheet\_writer.py", line 60, in write_properties
    self.xf.send(props.to_tree())
  File "D:\Anaconda3\lib\site-packages\openpyxl\worksheet\_writer.py", line 294, in get_stream
    xf.write(el)
  File "src/lxml/serializer.pxi", line 1230, in lxml.etree._IncrementalFileWriter.write
TypeError: got invalid input value of type <class 'xml.etree.ElementTree.Element'>, expected string or Element
Scrapy: creating the first project fails
cannot import name 'etree' from 'lxml'
What is wrong with the crawler below? The scraped data won't save into the MySQL database.
1. What is wrong with the crawler below? The scraped data won't save into the MySQL database. I have already created the table in MySQL:
```
mysql> CREATE TABLE `VERSION`(
    -> `index` INT,
    -> `code` INT,
    -> `name` VARCHAR(20)
    -> )ENGINE=InnoDB DEFAULT CHARSET=UTF8MB4;
```
2. The full code:
```
import requests
from lxml import etree
import pymysql
import re

class GovementSpider(object):
    def __init__(self):
        self.one_url = 'http://www.mca.gov.cn/article/sj/xzqh/2019/'
        self.headers = {
            "User-Agent": "Mozilla/5.0 (Windows NT 6.3; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.79 Safari/537.36"
        }
        self.db = pymysql.connect('localhost', '***', ***', 'reptile_db', charset='utf8')
        self.cursor = self.db.cursor()

    # extract the second-level page link (the "fake" link)
    def get_false_link(self):
        html = requests.get(url=self.one_url, headers=self.headers).content.decode('utf-8', 'ignore')
        parse_html = etree.HTML(html)
        # xpath: //a[@class='artitlelist']
        r_list = parse_html.xpath("//a[@class='artitlelist']")
        for r in r_list:
            # alternatively: title = r.get('title')
            title = r.xpath("./@title")[0]
            # use a regex to find the first title that matches (the first one is always the newest)
            if re.findall(r'.*?中华人民共和国县以上行政区划代码.*?', title, re.RegexFlag.S):
                # stop at the first match; it is always the newest link
                two_link = 'http://www.mca.gov.cn' + r.xpath('./@href')[0]
                return two_link

    # extract the real second-level link (the one that returns the data)
    def get_true_link(self):
        two_false_link = self.get_false_link()
        html = requests.get(url=two_false_link, headers=self.headers).text
        pattern = re.compile(r'window.location.href="(.*?)"', re.RegexFlag.S)
        real_link = pattern.findall(html)[0]
        self.get_data(real_link)

    # the function that actually extracts the data
    def get_data(self, real_link):
        html = requests.get(url=real_link, headers=self.headers).text
        # base xpath: //tr[@height="19"]
        parse_html = etree.HTML(html)
        tr_list = parse_html.xpath('//tr[@height="19"]')
        k = 0
        index = []
        for tr in tr_list:
            # code: ./td[2]/text()
            code = tr.xpath('./td[2]/text()')[0]
            # name: ./td[3]/text()
            name = tr.xpath('./td[3]/text()')[0]
            print(code, name)
            k += 1
            index.append(k)
        self.save_sql(index, code, name)

    def save_sql(self, index, code, name):
        n = 0
        for index in index:
            code = code[n].strip()
            name = name[n].strip()
            self.cursor.execute("insert into version(index,code,name) values (%s,%s,%s)", (index, code, name))
            self.db.commit()
            n += 1

    # main
    def main(self):
        self.get_true_link()
        self.cursor.close()
        self.db.close()

if __name__ == "__main__":
    spider = GovementSpider()
    spider.main()
```
3. The data is scraped correctly, but it cannot be saved into the database; it raises:
pymysql.err.ProgrammingError: (1064, "You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'index,code,name) values (1,'8','澳')' at line 1")
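The 1064 error points at the word index: INDEX is a reserved word in MySQL, so it parses in the CREATE TABLE statement (where it is back-quoted) but not in the un-quoted INSERT. A sketch of the likely fix inside save_sql(); note the loop there also treats code and name as lists even though they are single strings, which is worth revisiting separately:

```python
# Back-quote the reserved column name, exactly as in the CREATE TABLE statement.
sql = "INSERT INTO version(`index`, `code`, `name`) VALUES (%s, %s, %s)"
self.cursor.execute(sql, (index, code, name))
self.db.commit()
```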
CNN: I can't tell what format the dataset to load should be in
CNN初学者,最近自己在github上拿了个项目练手,问题是数据集不公开,只能自己做数据集,但是却看不懂数据集应该怎么制作。 代码如下 应该就是DAC_DATASET类中 class DAC_Dataset(RNGDataFlow): def __init__(self, dataset_dir, train, all_classes): self.images = [] if all_classes == 1: for directory in listdir(dataset_dir): for file in listdir(dataset_dir + '/' + directory): if '.jpg' in file: for c in classes: if c[0] in directory: label = c[1] break self.images.append([dataset_dir + '/' + directory + '/' + file, label]) else: for file in listdir(dataset_dir): if '.jpg' in file: self.images.append([dataset_dir + '/' + file, 0]) shuffle(self.images) if train == 0: self.images = self.images[0:1000] def get_data(self): for image in self.images: xml_name = image[0].replace('jpg','xml') im = cv2.imread(image[0], cv2.IMREAD_COLOR) im = cv2.resize(im, (square_size, square_size)) im = im.reshape((square_size, square_size, 3)) meta = None if os.path.isfile(image[0].replace('jpg','xml')): meta = xml.etree.ElementTree.parse(xml_name).getroot() label = np.array(image[1]) bndbox = {} bndbox['xmin'] = 0 bndbox['xmax'] = 0 bndbox['ymin'] = 0 bndbox['ymax'] = 0 if meta is not None: obj = meta.find('object') if obj is not None: box = obj.find('bndbox') if box is not None: bndbox['xmin'] = int(box.find('xmin').text) bndbox['xmax'] = int(box.find('xmax').text) bndbox['ymin'] = int(box.find('ymin').text) bndbox['ymax'] = int(box.find('ymax').text) bndbox['xmin'] = int(bndbox['xmin']*(square_size/IMAGE_WIDTH)) bndbox['xmax'] = int(bndbox['xmax']*(square_size/IMAGE_WIDTH)) bndbox['ymin'] = int(bndbox['ymin']*(square_size/IMAGE_HEIGHT)) bndbox['ymax'] = int(bndbox['ymax']*(square_size/IMAGE_HEIGHT)) iou = np.zeros( (height_width, height_width) ) for h in range(0, height_width): for w in range(0, height_width): rect = {} rect['xmin'] = int(w*down_sample_factor) rect['xmax'] = int((w+1)*down_sample_factor) rect['ymin'] = int(h*down_sample_factor) rect['ymax'] = int((h+1)*down_sample_factor) if DEMO_DATASET == 0: if intersection(rect, bndbox) == 0.0: iou[h,w] = 0.0 else: iou[h,w] = 1.0 else: if intersection(rect, bndbox) < 0.5: iou[h,w] = 0.0 else: iou[h,w] = 1.0 # if iou[h,w] > 0: # cv2.rectangle(im, (int(rect['xmin']),int(rect['ymin'])), (int(rect['xmax']),int(rect['ymax'])), (0,0,iou[h,w]*255), 1) iou = iou.reshape( (height_width, height_width, 1) ) valid = np.zeros((height_width, height_width, 4), dtype='float32') relative_bndboxes = np.zeros((height_width, height_width, 4), dtype='float32') for h in range(0, height_width): for w in range(0, height_width): if iou[h, w] > 0.0: valid[h,w,0] = 1.0 valid[h,w,1] = 1.0 valid[h,w,2] = 1.0 valid[h,w,3] = 1.0 relative_bndboxes[h, w, 0] = bndbox['xmin'] - w*down_sample_factor relative_bndboxes[h, w, 1] = bndbox['ymin'] - h*down_sample_factor relative_bndboxes[h, w, 2] = bndbox['xmax'] - w*down_sample_factor relative_bndboxes[h, w, 3] = bndbox['ymax'] - h*down_sample_factor else: relative_bndboxes[h, w] = np.zeros(4) # cv2.rectangle(im, (bndbox['xmin'],bndbox['ymin']), (bndbox['xmax'],bndbox['ymax']), (255,0,0), 1) # cv2.imshow('image', im) # cv2.waitKey(1000) yield [im, label, iou, valid, relative_bndboxes] def size(self): return len(self.images) class Model(ModelDesc): def _get_inputs(self): return [InputDesc(tf.float32, [None, square_size, square_size, 3], 'input'), InputDesc(tf.int32, [None], 'label'), InputDesc(tf.float32, [None, height_width, height_width, 1], 'ious'), InputDesc(tf.float32, [None, height_width, height_width, 4], 'valids'), InputDesc(tf.float32, [None, height_width, height_width, 4], 'bndboxes')] def _build_graph(self, 
inputs): image, label, ious, valids, bndboxes = inputs image = tf.round(image) fw, fa, fg = get_dorefa(BITW, BITA, BITG) old_get_variable = tf.get_variable def monitor(x, name): if MONITOR == 1: return tf.Print(x, [x], message='\n\n' + name + ': ', summarize=1000, name=name) else: return x def new_get_variable(v): name = v.op.name if not name.endswith('W') or 'conv1' in name or 'conv_obj' in name or 'conv_box' in name: return v else: logger.info("Quantizing weight {}".format(v.op.name)) if MONITOR == 1: return tf.Print(fw(v), [fw(v)], message='\n\n' + v.name + ', Quantized weights are:', summarize=100) else: return fw(v) def activate(x): if BITA == 32: return tf.nn.relu(x) else: return fa(tf.nn.relu(x)) def bn_activate(name, x): x = BatchNorm(name, x) x = monitor(x, name + '_noact_out') return activate(x) def halffire(name, x, num_squeeze_filters, num_expand_3x3_filters, skip): out_squeeze = Conv2D('squeeze_conv_' + name, x, out_channel=num_squeeze_filters, kernel_shape=1, stride=1, padding='SAME') out_squeeze = bn_activate('bn_squeeze_' + name, out_squeeze) out_expand_3x3 = Conv2D('expand_3x3_conv_' + name, out_squeeze, out_channel=num_expand_3x3_filters, kernel_shape=3, stride=1, padding='SAME') out_expand_3x3 = bn_activate('bn_expand_3x3_' + name, out_expand_3x3) if skip == 0: return out_expand_3x3 else: return tf.add(x, out_expand_3x3) def halffire_noact(name, x, num_squeeze_filters, num_expand_3x3_filters): out_squeeze = Conv2D('squeeze_conv_' + name, x, out_channel=num_squeeze_filters, kernel_shape=1, stride=1, padding='SAME') out_squeeze = bn_activate('bn_squeeze_' + name, out_squeeze) out_expand_3x3 = Conv2D('expand_3x3_conv_' + name, out_squeeze, out_channel=num_expand_3x3_filters, kernel_shape=3, stride=1, padding='SAME') return out_expand_3x3 with remap_variables(new_get_variable), \ argscope([Conv2D, FullyConnected], use_bias=False, nl=tf.identity), \ argscope(BatchNorm, decay=0.9, epsilon=1e-4): image = monitor(image, 'image_out') l = Conv2D('conv1', image, out_channel=32, kernel_shape=3, stride=2, padding='SAME') l = bn_activate('bn1', l) l = monitor(l, 'conv1_out') l = MaxPooling('pool1', l, shape=3, stride=2, padding='SAME') l = monitor(l, 'pool1_out') l = halffire('fire1', l, NUM_SQUEEZE_FILTERS, NUM_EXPAND_FILTERS, 0) l = monitor(l, 'fire1_out') l = MaxPooling('pool2', l, shape=3, stride=2, padding='SAME') l = monitor(l, 'pool2_out') l = halffire('fire2', l, NUM_SQUEEZE_FILTERS, NUM_EXPAND_FILTERS, 0) l = monitor(l, 'fire2_out') l = MaxPooling('pool3', l, shape=3, stride=2, padding='SAME') l = monitor(l, 'pool3_out') l = halffire('fire3', l, NUM_SQUEEZE_FILTERS, NUM_EXPAND_FILTERS, 0) l = monitor(l, 'fire3_out') l = halffire('fire4', l, NUM_SQUEEZE_FILTERS, NUM_EXPAND_FILTERS, 0) l = monitor(l, 'fire4_out') l = halffire('fire5', l, NUM_SQUEEZE_FILTERS, NUM_EXPAND_FILTERS, 0) l = monitor(l, 'fire5_out') l = halffire('fire6', l, NUM_SQUEEZE_FILTERS, NUM_EXPAND_FILTERS, 0) l = monitor(l, 'fire6_out') l = halffire('fire7', l, NUM_SQUEEZE_FILTERS, NUM_EXPAND_FILTERS, 0) l = monitor(l, 'fire7_out') # Classification classify = Conv2D('conv_class', l, out_channel=12, kernel_shape=1, stride=1, padding='SAME') classify = bn_activate('bn_class', classify) classify = monitor(classify, 'conv_class_out') logits = GlobalAvgPooling('pool_class', classify) class_loss = tf.nn.sparse_softmax_cross_entropy_with_logits(logits=logits, labels=label) class_loss = tf.reduce_mean(class_loss, name='cross_entropy_loss') wrong = prediction_incorrect(logits, label, 1, name='wrong-top1') 
add_moving_summary(tf.reduce_mean(wrong, name='train-error-top1')) # Object Detection l = tf.concat([l, classify], axis=3) objdetect = Conv2D('conv_obj', l, out_channel=1, kernel_shape=1, stride=1, padding='SAME') objdetect = tf.identity(objdetect, name='objdetect_out') objdetect_loss = tf.losses.hinge_loss(labels=ious, logits=objdetect) bndbox = Conv2D('conv_box', l, out_channel=4, kernel_shape=1, stride=1, padding='SAME') bndbox = tf.identity(bndbox, name='bndbox_out') bndbox = tf.multiply(bndbox, valids, name='mult0') bndbox_loss = tf.losses.mean_squared_error(labels=bndboxes, predictions=bndbox) # weight decay on all W of fc layers # reg_cost = regularize_cost('(fire7|conv_obj|conv_box).*/W', l2_regularizer(1e-5), name='regularize_cost') # cost = class_loss*objdetect_loss*bndbox_loss # cost = class_loss + objdetect_loss + bndbox_loss + reg_cost cost = class_loss + 10*objdetect_loss + bndbox_loss add_moving_summary(class_loss, objdetect_loss, bndbox_loss, cost) self.cost = cost tf.get_variable = old_get_variable def _get_optimizer(self): lr = tf.get_variable('learning_rate', initializer=1e-2, trainable=False) opt = tf.train.AdamOptimizer(lr, epsilon=1e-5) # lr = tf.get_variable('learning_rate', initializer=1e-1, trainable=False) # opt = tf.train.MomentumOptimizer(lr, momentum=0.9) return opt def get_data(dataset_dir, train): if DEMO_DATASET == 0: all_classes = 1 else: all_classes = 0 ds = DAC_Dataset(dataset_dir, train, all_classes) ds = BatchData(ds, BATCH_SIZE, remainder=False) ds = PrefetchDataZMQ(ds, nr_proc=8, hwm=6) return ds def get_config(): logger.auto_set_dir() data_train = get_data(args.data, 1) data_test = get_data(args.data, 0) if DEMO_DATASET == 0: return TrainConfig( dataflow=data_train, callbacks=[ ModelSaver(max_to_keep=10), HumanHyperParamSetter('learning_rate'), ScheduledHyperParamSetter('learning_rate', [(40, 0.001), (60, 0.0001), (90, 0.00001)]) ,InferenceRunner(data_test, [ScalarStats('cross_entropy_loss'), ClassificationError('wrong-top1', 'val-error-top1')]) ], model=Model(), max_epoch=150 ) else: return TrainConfig( dataflow=data_train, callbacks=[ ModelSaver(max_to_keep=10), HumanHyperParamSetter('learning_rate'), ScheduledHyperParamSetter('learning_rate', [(100, 0.001), (200, 0.0001), (250, 0.00001)]) ], model=Model(), max_epoch=300 ) def run_image(model, sess_init, image_dir): print('Running image!') output_names = ['objdetect_out', 'bndbox_out'] pred_config = PredictConfig( model=model, session_init=sess_init, input_names=['input'], output_names=output_names ) predictor = OfflinePredictor(pred_config) images = [] metas = [] for file in listdir(image_dir): if '.jpg' in file: images.append(file) if '.xml' in file: metas.append(file) images.sort() metas.sort() THRESHOLD = 0 index = 0 for image in images: meta = xml.etree.ElementTree.parse(image_dir + '/' + metas[index]).getroot() true_bndbox = {} true_bndbox['xmin'] = 0 true_bndbox['xmax'] = 0 true_bndbox['ymin'] = 0 true_bndbox['ymax'] = 0 if meta is not None: obj = meta.find('object') if obj is not None: box = obj.find('bndbox') if box is not None: true_bndbox['xmin'] = int(box.find('xmin').text) true_bndbox['xmax'] = int(box.find('xmax').text) true_bndbox['ymin'] = int(box.find('ymin').text) true_bndbox['ymax'] = int(box.find('ymax').text) index += 1 im = cv2.imread(image_dir + '/' + image, cv2.IMREAD_COLOR) im = cv2.resize(im, (square_size, square_size)) im = im.reshape((1, square_size, square_size, 3)) outputs = predictor([im]) im = cv2.imread(image_dir + '/' + image, cv2.IMREAD_COLOR) objdetect = 
    outputs[0]
    bndboxes = outputs[1]
    max_pred = -100
    max_h = -1
    max_w = -1
    for h in range(0, objdetect.shape[1]):
        for w in range(0, objdetect.shape[2]):
            if objdetect[0, h, w] > max_pred:
                max_pred = objdetect[0, h, w]
                max_h = h
                max_w = w
    sum_labels = 0
    bndbox = {}
    bndbox['xmin'] = 0
    bndbox['ymin'] = 0
    bndbox['xmax'] = 0
    bndbox['ymax'] = 0
    for h in range(0, objdetect.shape[1]):
        for w in range(0, objdetect.shape[2]):
            if (objdetect[0, h, w] > THRESHOLD and (h == max_h-1 or h == max_h or h == max_h+1) and (w == max_w-1 or w == max_w or w == max_w+1)) or (h == max_h and w == max_w):
                sum_labels += 1
                bndbox['xmin'] += int( (bndboxes[0,h,w,0] + w*down_sample_factor) )
                bndbox['ymin'] += int( (bndboxes[0,h,w,1] + h*down_sample_factor) )
                bndbox['xmax'] += int( (bndboxes[0,h,w,2] + w*down_sample_factor) )
                bndbox['ymax'] += int( (bndboxes[0,h,w,3] + h*down_sample_factor) )
                temp_xmin = int( (bndboxes[0,h,w,0] + w*down_sample_factor) * (IMAGE_WIDTH/square_size) )
                temp_ymin = int( (bndboxes[0,h,w,1] + h*down_sample_factor) * (IMAGE_HEIGHT/square_size) )
                temp_xmax = int( (bndboxes[0,h,w,2] + w*down_sample_factor) * (IMAGE_WIDTH/square_size) )
                temp_ymax = int( (bndboxes[0,h,w,3] + h*down_sample_factor) * (IMAGE_HEIGHT/square_size) )
                cv2.rectangle(im, (temp_xmin, temp_ymin), (temp_xmax, temp_ymax), (255, 0, 0), 1)
    bndbox['xmin'] = int(bndbox['xmin']*(1/sum_labels))
    bndbox['ymin'] = int(bndbox['ymin']*(1/sum_labels))
    bndbox['xmax'] = int(bndbox['xmax']*(1/sum_labels))
    bndbox['ymax'] = int(bndbox['ymax']*(1/sum_labels))
    bndbox['xmin'] = int(bndbox['xmin']*(IMAGE_WIDTH/square_size))
    bndbox['ymin'] = int(bndbox['ymin']*(IMAGE_HEIGHT/square_size))
    bndbox['xmax'] = int(bndbox['xmax']*(IMAGE_WIDTH/square_size))
    bndbox['ymax'] = int(bndbox['ymax']*(IMAGE_HEIGHT/square_size))
    bndbox2 = {}
    bndbox2['xmin'] = int( bndboxes[0,max_h,max_w,0] + max_w*down_sample_factor)
    bndbox2['ymin'] = int( bndboxes[0,max_h,max_w,1] + max_h*down_sample_factor)
    bndbox2['xmax'] = int( bndboxes[0,max_h,max_w,2] + max_w*down_sample_factor)
    bndbox2['ymax'] = int( bndboxes[0,max_h,max_w,3] + max_h*down_sample_factor)
    bndbox2['xmin'] = int(bndbox2['xmin']*(IMAGE_WIDTH/square_size))
    bndbox2['ymin'] = int(bndbox2['ymin']*(IMAGE_HEIGHT/square_size))
    bndbox2['xmax'] = int(bndbox2['xmax']*(IMAGE_WIDTH/square_size))
    bndbox2['ymax'] = int(bndbox2['ymax']*(IMAGE_HEIGHT/square_size))
    print('----------------------------------------')
    print(str(max_h*14+max_w))
    print('xmin: ' + str(bndbox2['xmin']))
    print('xmax: ' + str(bndbox2['xmax']))
    print('ymin: ' + str(bndbox2['ymin']))
    print('ymax: ' + str(bndbox2['ymax']))
    cv2.rectangle(im,
                  (int(max_w*down_sample_factor*(IMAGE_WIDTH/square_size)), int(max_h*down_sample_factor*(IMAGE_HEIGHT/square_size))),
                  (int((max_w+1)*down_sample_factor*(IMAGE_WIDTH/square_size)), int((max_h+1)*down_sample_factor*(IMAGE_HEIGHT/square_size))),
                  (0, 0, 255), 1)
    cv2.rectangle(im, (true_bndbox['xmin'], true_bndbox['ymin']), (true_bndbox['xmax'], true_bndbox['ymax']), (255, 0, 0), 2)
    cv2.rectangle(im, (bndbox2['xmin'], bndbox2['ymin']), (bndbox2['xmax'], bndbox2['ymax']), (0, 255, 0), 2)
    cv2.imshow('image', im)
    cv2.imwrite('images_log/' + image, im)
    cv2.waitKey(800)


def run_single_image(model, sess_init, image):
    print('Running single image!')
    if MONITOR == 1:
        monitor_names = ['conv_class_out', 'image_out', 'conv1_out', 'pool1_out', 'fire1_out',
                         'pool2_out', 'pool3_out', 'fire5_out', 'fire6_out', 'fire7_out']
    else:
        monitor_names = []
    output_names = ['objdetect_out', 'bndbox_out']
    output_names.extend(monitor_names)
    pred_config = PredictConfig(
        model=model,
        session_init=sess_init,
        input_names=['input'],
        output_names=output_names
    )
    predictor = OfflinePredictor(pred_config)
    if REAL_IMAGE == 1:
        im = cv2.imread(image, cv2.IMREAD_COLOR)
        im = cv2.resize(im, (square_size, square_size))
        cv2.imwrite('test_image.png', im)
        im = im.reshape((1, square_size, square_size, 3))
    else:
        im = np.zeros((1, square_size, square_size, 3))
        k = 0
        for h in range(0, square_size):
            for w in range(0, square_size):
                for c in range(0, 3):
                    # im[0][h][w][c] = 0
                    im[0][h][w][c] = k % 256
                    k += 1
    outputs = predictor([im])
    objdetect = outputs[0]
    bndboxes = outputs[1]
    max_pred = -100
    max_h = -1
    max_w = -1
    for h in range(0, objdetect.shape[1]):
        for w in range(0, objdetect.shape[2]):
            if objdetect[0, h, w] > max_pred:
                max_pred = objdetect[0, h, w]
                max_h = h
                max_w = w
    bndbox2 = {}
    bndbox2['xmin'] = int( bndboxes[0,max_h,max_w,0] + max_w*down_sample_factor)
    bndbox2['ymin'] = int( bndboxes[0,max_h,max_w,1] + max_h*down_sample_factor)
    bndbox2['xmax'] = int( bndboxes[0,max_h,max_w,2] + max_w*down_sample_factor)
    bndbox2['ymax'] = int( bndboxes[0,max_h,max_w,3] + max_h*down_sample_factor)
    bndbox2['xmin'] = int(bndbox2['xmin']*(640/square_size))
    bndbox2['ymin'] = int(bndbox2['ymin']*(360/square_size))
    bndbox2['xmax'] = int(bndbox2['xmax']*(640/square_size))
    bndbox2['ymax'] = int(bndbox2['ymax']*(360/square_size))
    # im = cv2.imread(image, cv2.IMREAD_COLOR)
    # cv2.rectangle(im, (bndbox2['xmin'], bndbox2['ymin']), (bndbox2['xmax'],bndbox2['ymax']), (0,255,0), 2)
    # cv2.imshow('image', im)
    # cv2.waitKey(2000)
    print('max_h: ' + str(max_h))
    print('max_w: ' + str(max_w))
    print('objdetect: ' + str(objdetect))
    print('bndboxes: ' + str(bndboxes[0,max_h,max_w]))
    index = 2
    for o in monitor_names:
        print(o + ', shape: ' + str(outputs[index].shape))
        if 'image' not in o:
            print(str(outputs[index]))
        if len(outputs[index].shape) == 4:
            file_name = o.split('/')[-1]
            print('Writing file... ' + file_name)
            if not os.path.exists('./log'):
                os.makedirs('./log')
            with open('./log/' + file_name + '.log', 'w') as f:
                for sample in range(0, outputs[index].shape[0]):
                    for h in range(0, outputs[index].shape[1]):
                        for w in range(0, outputs[index].shape[2]):
                            res = ''
                            for c in range(0, outputs[index].shape[3]):
                                if 'image' in file_name:
                                    res = hexFromInt( int(outputs[index][sample, h, w, c]), 8 ) + '_' + res
                                elif 'noact' in file_name:
                                    temp = (2**FACTOR_SCALE_BITS)*outputs[index][sample, h, w, c]
                                    res = hexFromInt( int(temp), 32 ) + '_' + res
                                else:
                                    res = hexFromInt( int(outputs[index][sample, h, w, c]), BITA) + '_' + res
                            f.write('0x' + res + '\n')
        index += 1


def dump_weights(meta, model, output):
    fw, fa, fg = get_dorefa(BITW, BITA, BITG)
    with tf.Graph().as_default() as G:
        tf.train.import_meta_graph(meta)
        init = get_model_loader(model)
        sess = tf.Session(config=tf.ConfigProto(allow_soft_placement=True))
        sess.run(tf.global_variables_initializer())
        init.init(sess)
        with sess.as_default():
            if output:
                if output.endswith('npy') or output.endswith('npz'):
                    varmanip.dump_session_params(output)
                else:
                    var = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES)
                    var.extend(tf.get_collection(tf.GraphKeys.MODEL_VARIABLES))
                    var_dict = {}
                    for v in var:
                        name = varmanip.get_savename_from_varname(v.name)
                        var_dict[name] = v
                    logger.info("Variables to dump:")
                    logger.info(", ".join(var_dict.keys()))
                    saver = tf.train.Saver(
                        var_list=var_dict,
                        write_version=tf.train.SaverDef.V2)
                    saver.save(sess, output, write_meta_graph=False)
            network_model = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES)
            network_model.extend(tf.get_collection(tf.GraphKeys.MODEL_VARIABLES))
            target_frequency = 200000000
            target_FMpS = 300
            non_quantized_layers = ['conv1/Conv2D', 'conv_obj/Conv2D', 'conv_box/Conv2D']
            json_out, layers_list, max_cycles = generateLayers(sess, BITA, BITW, non_quantized_layers,
                                                               target_frequency, target_FMpS)
            achieved_FMpS = target_frequency/max_cycles
            if DEMO_DATASET == 0:
                generateConfig(layers_list, 'halfsqueezenet-config.h')
                genereateHLSparams(layers_list, network_model, 'halfsqueezenet-params.h', fw)
            else:
                generateConfig(layers_list, 'halfsqueezenet-config_demo.h')
                genereateHLSparams(layers_list, network_model, 'halfsqueezenet-params_demo.h', fw)
            print('|---------------------------------------------------------|')
            print('target_FMpS: ' + str(target_FMpS))
            print('achieved_FMpS: ' + str(achieved_FMpS))


if __name__ == '__main__':
    print('Start')
    parser = argparse.ArgumentParser()
    parser.add_argument('dump2_train1_test0', help='dump(2), train(1) or test(0)')
    parser.add_argument('--model', help='model file')
    parser.add_argument('--meta', help='metagraph file')
    parser.add_argument('--output', help='output for dumping')
    parser.add_argument('--gpu', help='the physical ids of GPUs to use')
    parser.add_argument('--data', help='DAC dataset dir')
    parser.add_argument('--run', help='directory of images to test')
    parser.add_argument('--weights', help='weights file')
    args = parser.parse_args()
    print('Using GPU ' + str(args.gpu))
    if args.gpu:
        os.environ['CUDA_VISIBLE_DEVICES'] = args.gpu
    print(str(args.dump2_train1_test0))
    if args.dump2_train1_test0 == '1':
        if args.data == None:
            print('Provide DAC dataset path with --data')
            sys.exit()
        config = get_config()
        if args.model:
            config.session_init = SaverRestore(args.model)
        SimpleTrainer(config).train()
    elif args.dump2_train1_test0 == '0':
        if args.run == None:
            print('Provide images with --run ')
            sys.exit()
        if args.weights == None:
            print('Provide weights file (.npy) for testing!')
            sys.exit()
        assert args.weights.endswith('.npy')
        run_image(Model(), DictRestore(np.load(args.weights, encoding='latin1').item()), args.run)
    elif args.dump2_train1_test0 == '2':
        if args.meta == None:
            print('Provide meta file (.meta) for dumping')
            sys.exit()
        if args.model == None:
            print('Provide model file (.data-00000-of-00001) for dumping')
            sys.exit()
        dump_weights(args.meta, args.model, args.output)
    elif args.dump2_train1_test0 == '3':
        if args.run == None:
            print('Provide image with --run ')
            sys.exit()
        if args.weights == None:
            print('Provide weights file (.npy) for testing!')
            sys.exit()
        assert args.weights.endswith('.npy')
        run_single_image(Model(), DictRestore(np.load(args.weights, encoding='latin1').item()), args.run)
Using PyQt5 to load a page from a URL and capture a screenshot of the whole page
import sys
from PyQt5.QtCore import QUrl
from PyQt5.QtWidgets import QApplication
from PyQt5.QtWebEngineWidgets import QWebEnginePage, QWebEngineView

app = QApplication(sys.argv)
browser = QWebEngineView()
browser.load(QUrl("http://news.baidu.com/?tn=news"))
browser.show()
app.exec_()

r = WebRender(url)
html = r.frame.toHtml()
page = etree.HTML(html.encode('utf-8'))

How can I capture a screenshot of this entire rendered page?
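Note that the snippet above mixes QtWebEngine with the older PyQt4/QtWebKit recipe: WebRender and r.frame.toHtml() are not defined anywhere here. Below is a minimal sketch of one common QtWebEngine approach, not a confirmed solution: wait for loadFinished, resize the view to the page's contentsSize (available in Qt 5.7+), then grab the widget as a pixmap. The one-second delay and the output filename fullpage.png are arbitrary choices for illustration.

```python
# Sketch only: resize the QWebEngineView to the full document size, then grab it.
# Assumes PyQt5 >= 5.7 (QWebEnginePage.contentsSize); extremely long pages may
# still be clipped by what the widget can render at once.
import sys
from PyQt5.QtCore import QUrl, QTimer
from PyQt5.QtWidgets import QApplication
from PyQt5.QtWebEngineWidgets import QWebEngineView

app = QApplication(sys.argv)
view = QWebEngineView()

def capture():
    pixmap = view.grab()          # renders the whole (resized) widget into a QPixmap
    pixmap.save('fullpage.png')   # hypothetical output path
    app.quit()

def on_load_finished(ok):
    size = view.page().contentsSize().toSize()  # full document size, not just the viewport
    view.resize(size)
    QTimer.singleShot(1000, capture)            # let the engine repaint at the new size

view.loadFinished.connect(on_load_finished)
view.load(QUrl('http://news.baidu.com/?tn=news'))
view.show()
sys.exit(app.exec_())
```

If the HTML string is also needed (the etree.HTML part), QWebEnginePage.toHtml(callback) is the QtWebEngine counterpart of frame.toHtml(), but it is asynchronous and delivers the markup to a callback instead of returning it.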
Error while crawling: ValueError: can only parse strings
The source code is as follows:

import requests
import json
from requests.exceptions import RequestException
import time
from lxml import etree

def get_one_page(url):
    try:
        headers = {
            'User-Agent': 'Mozilla/5.0(Macintosh;Intel Mac OS X 10_13_3) AppleWebKit/537.36(KHTML,like Gecko) Chorme/65.0.3325.162 Safari/537.36'
        }
        response = requests.get(url, headers=headers)
        if response.status_code == 200:
            return response.text
        return None
    except RequestException:
        return None

def parse_one_page(html):
    html_coner = etree.HTML(html)
    pattern = html_coner.xpath('//div[@id="container"]/div[@id="main"/div[@class = "ywnr_box"]//a/text()')
    return pattern

def write_to_file(content):
    with open('results.txt', 'a', encoding='utf-8') as f:
        f.write(json.dumps(content, ensure_ascii=False) + '\n')

def main(offset):
    url = 'http://www.cdpf.org.cn/yw/index_' + str(offset) + '.shtml'
    html = get_one_page(url)
    for item in parse_one_page(html):
        print(item)
        write_to_file(item)

if __name__ == '__main__':
    for i in range(6):
        main(offset=i*10)
        time.sleep(1)

Could anyone tell me where this goes wrong?
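A guess rather than a confirmed diagnosis: etree.HTML() raises "can only parse strings" when it is handed None, and get_one_page() returns None for any non-200 response or request exception; the XPath is also missing the closing bracket after @id="main". A minimal guarded version of the parser, under those assumptions:

```python
# Sketch assuming the ValueError comes from etree.HTML(None);
# the XPath below also closes each [@...] predicate.
from lxml import etree

def parse_one_page(html):
    if not html:      # request failed or the page did not return 200
        return []
    root = etree.HTML(html)
    return root.xpath(
        '//div[@id="container"]/div[@id="main"]/div[@class="ywnr_box"]//a/text()'
    )
```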
Scrapy beginner: Scrapy reports the error below, what is causing it?
What is going on with this error? I searched the web all day yesterday and could not find an answer.

[scrapy] ERROR: Spider error processing <GET https://www.douban.com/doulist/1264675/> (referer: None)
Traceback (most recent call last):
  File "F:\PythonPacket\lib\site-packages\scrapy\utils\defer.py", line 102, in iter_errback
    yield next(it)
  File "F:\PythonPacket\lib\site-packages\scrapy\spidermiddlewares\offsite.py", line 29, in process_spider_output
    for x in result:
  File "F:\PythonPacket\lib\site-packages\scrapy\spidermiddlewares\referer.py", line 22, in <genexpr>
    return (_set_referer(r) for r in result or ())
  File "F:\PythonPacket\lib\site-packages\scrapy\spidermiddlewares\urllength.py", line 37, in <genexpr>
    return (r for r in result or () if _filter(r))
  File "F:\PythonPacket\lib\site-packages\scrapy\spidermiddlewares\depth.py", line 58, in <genexpr>
    return (r for r in result or () if _filter(r))
  File "F:\doubanbook\doubanbook\spiders\dbbook.py", line 22, in parse
    author = re.search('<div class="abstract">(.*?)<br', each.extract(), re.S).group(1)
  File "F:\PythonPacket\lib\site-packages\parsel\selector.py", line 251, in extract
    with_tail=False)
  File "lxml.etree.pyx", line 2624, in lxml.etree.tostring (src/lxml/lxml.etree.c:49461)
  File "serializer.pxi", line 105, in lxml.etree._tostring (src/lxml/lxml.etree.c:79144)
LookupError: unknown encoding: 'unicode'
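The traceback ends inside lxml's serializer while parsel calls tostring(..., encoding='unicode') on each.extract(). Current lxml releases accept 'unicode' as an encoding name, so one plausible cause, offered as an assumption rather than a verified diagnosis, is an outdated or broken lxml/parsel installation. A quick local check:

```python
# If this minimal call also raises LookupError: unknown encoding: 'unicode',
# the lxml build itself is the problem and upgrading is the first thing to try
# (e.g. pip install --upgrade lxml parsel).
from lxml import etree

root = etree.fromstring('<div class="abstract">text<br/></div>')
print(etree.__version__)                         # lxml version string
print(etree.tostring(root, encoding='unicode'))  # should print the markup as str
```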