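URL: http://georoc.mpch-mainz.gwdg.de/georoc/Start.asp
I want to scrape all the csv files listed under "rock" in the left-hand navigation of this page, but the source I fetch contains no csv links at all. How can I solve this? Thanks in advance to anyone who can help!
Here is my code: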
import requests
from bs4 import BeautifulSoup

archive_url = "http://georoc.mpch-mainz.gwdg.de/georoc/Start.asp"  # page URL


def get_video_links():
    r = requests.get(archive_url)
    soup = BeautifulSoup(r.content, 'html.parser')
    print(soup)
    item = soup.find_all('td', class_="arialtb12")  # where the csv links should be
    print(item)
    return item


if __name__ == "__main__":
    video_links = get_video_links()
Here is the HTML I get back:
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Frameset//EN" "http://www.w3.org/TR/html4/frameset.dtd">
<!-- saved from url=(0049)http://georoc.mpch-mainz.gwdg.de/georoc/Start.asp -->
<html><head><meta http-equiv="Content-Type" content="text/html; charset=windows-1252">
<meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes">
<meta name="description" content="GEOROC - Geochemical Database on magmatic Rocks">
<meta name="keywords" lang="en-us" content="database, analyses, volcanic rocks, mantle xenoliths, major elements, trace element,
concentrations, radiogenic, nonradiogenic, isotope ratios, analytical, ages, whole rocks, volcanic, glasses, minerals, inclusions">
<meta name="keywords" lang="en" content="database, analyses, volcanic, rocks, mantle xenoliths, major elements, trace element,
concentrations, radiogenic, nonradiogenic, isotope ratios, igneous, analytical, ages, whole rocks, volcanic, glasses, minerals, inclusions">
<meta name="keywords" lang="de" content="Datenbank, geochemie, spurenelement, oxid, gehalte, vulkanite, isotopenverhältnisse, magmatite, xenolithe, analytik, minerale">
<meta name="keywords" lang="it" content="Database , geochimica , contenuti, ossido, oligoelementi , vulcanici , rapporti isotopici , rocce ignee , xenoliti , analisi , minerali">
<title>Geochemical Rock Database-Query</title>
</head>
<frameset rows="100%" cols="18%,82%">
<frame frameborder="0" marginwidth="0" src="./Geochemical Rock Database-Query_files/Query.html" name="Query">
<frame frameborder="0" marginwidth="0" src="./Geochemical Rock Database-Query_files/QueryBlank.html" name="Search">
<noframes>
<body>
<table style="background-color:#99CCFF; padding:1px; border-width:3px; border-color: #000099; height: 8%; width: 40%; margin-left:auto; margin-right:auto; ">
<tr>
<td style="text-align:center;">
<a href="Start.asp"><b>Home</b> |</a> <a href="Content.htm"><b>Content</b> |</a>
</td>
</tr>
</table>
<h1 style="text-align:center;"><b>Query by</b></h1>
<p style="text-align:center;">
</p>
<h2 style="text-align:center;">
<a href="Authors.asp?Frames=no">1. Bibliography</a></h2>
<br/>
<h2 style="text-align:center;">
<a href="QueryLoc.asp?Frames=no">2. Location</a></h2>
<br/>
<h2 style="text-align:center;">
<a href="QueryChem.asp?Frames=no">3. Chemistry</a></h2>
<br/>
<br/>
<br/>
<table style="height: 8%; width:auto; margin-left:auto; margin-right:auto;">
<tr>
<td>
<a href="http://www.mpic.de">© MPI für Chemie, Mainz, Germany</a>
</td>
</tr>
</table>
<p style="text-align:center;">
<span style="font-family: Arial; color: #000033"> State: 05/01/2018 </span></p>
</body>
</noframes>
</frameset>
</html>
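One thing the dump above suggests: the page is a frameset, so the navigation (and any csv links) would live in the documents named by the `<frame src=...>` attributes, not in Start.asp itself. Below is a minimal sketch of following those frames, assuming the csv files are plain `<a href="...csv">` links inside a frame document. It uses only the standard library so it runs without extra packages; the same idea should work with `soup.find_all('frame')` in BeautifulSoup. The helper names (`frame_urls`, `csv_links`, `crawl_csv_links`) are made up for illustration, and the windows-1252 decoding is taken from the meta tag in the dump.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class AttrCollector(HTMLParser):
    """Collect one attribute's value from every occurrence of one tag."""

    def __init__(self, tag, attr):
        super().__init__()
        self.tag = tag
        self.attr = attr
        self.values = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag == self.tag:
            value = dict(attrs).get(self.attr)
            if value:
                self.values.append(value)


def collect(html, tag, attr, base_url):
    """Return the absolute URLs found in `attr` of every `tag` element."""
    parser = AttrCollector(tag, attr)
    parser.feed(html)
    return [urljoin(base_url, value) for value in parser.values]


def frame_urls(html, base_url):
    """Absolute URLs of the documents loaded into each <frame>."""
    return collect(html, "frame", "src", base_url)


def csv_links(html, base_url):
    """Absolute URLs of all <a href> links that end in .csv."""
    links = collect(html, "a", "href", base_url)
    return [url for url in links if url.lower().endswith(".csv")]


def fetch(url):
    """Download a page; the encoding is an assumption based on the meta tag."""
    with urlopen(url, timeout=30) as resp:
        return resp.read().decode("windows-1252", errors="replace")


def crawl_csv_links(start_url):
    """Fetch the frameset, then look for csv links in each frame document."""
    found = []
    for frame_url in frame_urls(fetch(start_url), start_url):
        found.extend(csv_links(fetch(frame_url), frame_url))
    return found
```

Calling `crawl_csv_links("http://georoc.mpch-mainz.gwdg.de/georoc/Start.asp")` would be the entry point. If the frame documents only link to further pages that hold the csv files, the same `collect` step would need to be repeated one level deeper, and if the links are built by JavaScript this approach will not see them.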