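To support students' learning during the epidemic: a micro-lecture site hosts many micro-lecture albums, and I would like to scrape the hyperlinks and the corresponding titles from an album in order to build mobile QR-code lessons. The album page and the relevant part of its source are shown below.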
http://wk.zjer.cn/v-play-40037-98669.htm
<div class="class_hg">
<a href="/v-cplay1-40037-154639.htm" target="_blank">
<img src="http://wkfile.zjer.cn/upload_dir/40037/vod/2019/10/18/87ce7d0330bfc9517270381e3b167cb5.jpg"/>
</a>
<a href="/v-cplay1-40037-154639.htm" class="class_hg_a">乙烯</a>
</div>
<div class="class_hg">
<a href="/v-cplay1-40037-154922.htm" target="_blank">
<img src="http://wkfile.zjer.cn/upload_dir/40037/vod/2019/10/19/d5fca1495fed68702ad55756d1983883.jpg"/>
</a>
<a href="/v-cplay1-40037-154922.htm" class="class_hg_a">乙炔</a>
</div>
<div class="class_hg">
<a href="/v-cplay1-40037-154923.htm" target="_blank">
<img src="http://wkfile.zjer.cn/upload_dir/40037/vod/2019/10/19/cbe3fc59e44dc4abab193e7daa2a1a36.jpg"/>
</a>
<a href="/v-cplay1-40037-154923.htm" class="class_hg_a">苯</a>
</div>
<div class="class_hg">
<a href="/v-cplay1-40037-154760.htm" target="_blank">
<img src="http://wkfile.zjer.cn/upload_dir/40037/vod/2019/10/19/f5145b48f6457698e04ed67f50301e80.jpg"/>
</a>
<a href="/v-cplay1-40037-154760.htm" class="class_hg_a">乙醇</a>
</div>
<div class="class_hg">
<a href="/v-cplay1-40037-154761.htm" target="_blank">
<img src="http://wkfile.zjer.cn/upload_dir/40037/vod/2019/10/19/6b952046bba05b91d363abdfe79b0e21.jpg"/>
</a>
<a href="/v-cplay1-40037-154761.htm" class="class_hg_a">乙醛</a>
</div>
<div class="class_hg">
<a href="/v-cplay1-40037-154924.htm" target="_blank">
<img src="http://wkfile.zjer.cn/upload_dir/40037/vod/2019/10/19/ba40caf43ed4d510d8257746364411d6.jpg"/>
</a>
<a href="/v-cplay1-40037-154924.htm" class="class_hg_a">乙酸</a>
</div>
<div class="class_hg">
<a href="/v-cplay1-40037-154961.htm" target="_blank">
<img src="http://wkfile.zjer.cn/upload_dir/40037/vod/2019/10/19/2c56ade872e083b3e71952ad0629c047.jpg"/>
</a>
<a href="/v-cplay1-40037-154961.htm" class="class_hg_a">乙酸乙酯</a>
</div>
<div class="class_hg">
<a href="/v-cplay1-40037-154937.htm" target="_blank">
<img src="http://wkfile.zjer.cn/upload_dir/40037/vod/2019/10/19/32659b9bf297f66bfc0798fcbdabcfeb.jpg"/>
</a>
<a href="/v-cplay1-40037-154937.htm" class="class_hg_a">糖类</a>
</div>
<div class="class_hg">
<a href="/v-cplay1-40037-154977.htm" target="_blank">
<img src="http://wkfile.zjer.cn/upload_dir/40037/vod/2019/10/19/1862de78c7f8981e8bba7a5f587f0ba7.jpg"/>
</a>
<a href="/v-cplay1-40037-154977.htm" class="class_hg_a">氨基酸</a>
</div>
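I have been learning Python 3.7 for only a few days. Using the 斗牛 web page link extraction tool I was able to pull the micro-lecture hyperlinks out of the album, and with the script below I also got the page title. However, an album contains many micro-lectures, so doing this one page at a time is inefficient. I would like to:
1. Scrape the hyperlinks and the corresponding titles in the album and write them to a txt file, for easy sharing and for building mobile QR-code lessons.
2. Batch-download the micro-lecture videos (it is fine if this cannot be done).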
#!/usr/bin/python
# -*- coding: UTF-8 -*-
from urllib.request import Request, urlopen
from bs4 import BeautifulSoup

# Browser-like request header; the header name is "User-Agent"
headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:74.0) Gecko/20100101 Firefox/74.0"}
url = "http://wk.zjer.cn/v-play-40037-98669.htm"
# Send the header with the request and fetch the album page
html = urlopen(Request(url, headers=headers))
bsObj = BeautifulSoup(html.read(), 'lxml')
# Print the page's <title> tag together with the page URL
print(bsObj.title, url)
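For the first wish, one possible approach is sketched below. In the page source quoted earlier, each lesson title sits in an `<a>` tag with `class="class_hg_a"`, and its `href` is relative to the site root. The sketch collects those tags, resolves each `href` against the album URL, and writes one "title, tab, URL" line per lesson to a txt file. It uses the standard-library `html.parser` instead of BeautifulSoup only so that it runs without extra packages; the names `LessonLinkParser`, `extract_lessons`, and `save_lessons` are made up for this illustration, and the tab-separated layout is an assumption about what is convenient to share.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LessonLinkParser(HTMLParser):
    """Collect (title, absolute_url) pairs from <a class="class_hg_a"> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.lessons = []   # finished (title, url) pairs
        self._href = None   # href of the title link currently being read
        self._text = []     # text fragments seen inside that link

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        # Only the title links carry class_hg_a; the thumbnail links do not.
        if tag == "a" and "class_hg_a" in classes and attrs.get("href"):
            self._href = attrs["href"]
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            title = "".join(self._text).strip()
            # Turn "/v-cplay1-....htm" into a full http://wk.zjer.cn/... URL.
            self.lessons.append((title, urljoin(self.base_url, self._href)))
            self._href = None


def extract_lessons(html_text, base_url):
    """Return a list of (title, absolute_url) pairs found in html_text."""
    parser = LessonLinkParser(base_url)
    parser.feed(html_text)
    parser.close()
    return parser.lessons


def save_lessons(lessons, path):
    """Write one 'title<TAB>url' line per lesson to a UTF-8 text file."""
    with open(path, "w", encoding="utf-8") as f:
        for title, url in lessons:
            f.write(f"{title}\t{url}\n")
```

To try it on the live page, pass the decoded page text and the album URL to `extract_lessons`, then hand the result to `save_lessons`. The page's character encoding is not stated in this post, so the decoding step would need to be checked against the real response.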
The hyperlinks of the micro-lecture videos in the album are:
http://wk.zjer.cn/v-cplay1-40037-154638.htm
http://wk.zjer.cn/v-cplay1-40037-154639.htm
http://wk.zjer.cn/v-cplay1-40037-154922.htm
http://wk.zjer.cn/v-cplay1-40037-154923.htm
http://wk.zjer.cn/v-cplay1-40037-154760.htm
http://wk.zjer.cn/v-cplay1-40037-154761.htm
http://wk.zjer.cn/v-cplay1-40037-154924.htm
http://wk.zjer.cn/v-cplay1-40037-154961.htm
http://wk.zjer.cn/v-cplay1-40037-154937.htm
http://wk.zjer.cn/v-cplay1-40037-154977.htm
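For the second wish, note that the links above end in `.htm`, so they point at lesson pages rather than at video files. Where each page keeps the actual video address is not shown in this post, so that step is left open here. Assuming a list of (title, video file URL) pairs has somehow been obtained, a minimal sketch of the batch download could look like the following. The names `download` and `download_all`, the `items` list, and the choice to name each file after its title are all hypothetical.

```python
import os
import shutil
from urllib.parse import urlsplit
from urllib.request import Request, urlopen

USER_AGENT = ("Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:74.0) "
              "Gecko/20100101 Firefox/74.0")


def download(url, dest_path, chunk_size=64 * 1024):
    """Stream one URL to dest_path without holding the whole file in memory."""
    request = Request(url, headers={"User-Agent": USER_AGENT})
    with urlopen(request) as response, open(dest_path, "wb") as out:
        shutil.copyfileobj(response, out, chunk_size)
    return dest_path


def download_all(items, dest_dir):
    """Download every (title, file_url) pair in items; return the saved paths.

    items is an assumed input: the real video URLs are not given in the post.
    """
    os.makedirs(dest_dir, exist_ok=True)
    saved = []
    for title, file_url in items:
        # Reuse the file extension from the URL path (for example ".mp4").
        ext = os.path.splitext(urlsplit(file_url).path)[1]
        saved.append(download(file_url, os.path.join(dest_dir, title + ext)))
    return saved
```

Titles are used as file names unchanged, so a title containing characters that the file system rejects (such as `/` or `:`) would need to be cleaned first.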