对于这个问题,我写的代码是这样的
# coding: utf-8
import csv
TITLE = "【来源篇名】"
AUTHOR = "【来源作者】"
ORG = "【第一机构】"
KEYWORD = "【关 键 词】"
REFER = "【参考文献】"
# By WilliamsCarl。2019/3/24 2:21。
# 作者:王木槿天下第一。Copyright.
item_start = True
refer_start = False
headers = [TITLE, AUTHOR, ORG, KEYWORD, REFER]
items = []
with open('E:\\textOut.txt', encoding='gbk') as f:
row = {
'refer': []
}
for line in f.readlines():
line = line.strip()
# if not line:
# continue
if TITLE in line:
column = 0
tl = line.split(TITLE)[1]
refer_start = False
row['tl'] = tl
if AUTHOR in line:
column = 1
au = line.split(AUTHOR)[1]
row['au'] = au
if ORG in line:
column = 2
og = line.split(ORG)[1]
row['og'] = og
if KEYWORD in line:
column = 3
kw = line.split(KEYWORD)[1]
row['kw'] = kw
if REFER in line:
refer_start = True
continue
if refer_start:
row['refer'].append(line)
if refer_start and (not line):
refer_start = False
item = [row['tl'], row['au'], row['og'], row['kw']]
for r in row['refer']:
item.append(r)
items.append(item)
row = {
'refer': []
}
with open('E:\\result.csv','w') as c:
f_csv = csv.writer(c)
f_csv.writerow(headers)
f_csv.writerows(items)
目前的问题是,
比如【来源篇名】某一组不存在,他就会说keyerror,
当我将 i= [row['tl'], row['au'], row['og'], row['kw']] 变成row.setdefult,他的过滤又将一些存在的东西过滤掉了,如下图,