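Question: I need to extract certain data from the Wikidata data described below and save the results as JSON, XLSX, or similar files.
1. The file is too large to upload; its contents come from Wikidata.
2. Specifically, for a given person I want to extract: father, mother, sibling, spouse, child, relative, sex or gender, and country of citizenship.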
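import json

import ijson

# Map each user login to the set of repository names seen for that user.
user_to_repos = {}
with open("large-file.json", "rb") as f:
    # Stream the top-level array one element at a time instead of loading it all.
    for record in ijson.items(f, "item"):
        user = record["actor"]["login"]
        repo = record["repo"]["name"]
        if user not in user_to_repos:
            user_to_repos[user] = set()
        user_to_repos[user].add(repo)

# Assumption: `data` was meant to be the mapping built above.
# Sets are not JSON-serializable, so convert them to sorted lists first.
data = {user: sorted(repos) for user, repos in user_to_repos.items()}
with open("big_json_array.json", "w") as out:
    json.dump(data, out)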
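The snippet above reads GitHub-style records (`actor`, `repo`), not Wikidata entities. As a rough sketch of the extraction the question asks for, the code below assumes the input is the standard Wikidata JSON dump, where the file is one large array with a single entity per line. The property IDs are the Wikidata properties for the listed fields. The function names (`extract_relations`, `iter_dump_entities`, `extract_to_json`) and the file paths are hypothetical and only for illustration. Only the standard library is used here; `ijson.items(f, "item")` could supply the entities instead.

```python
import json

# Wikidata property IDs for the fields named in the question.
PROPERTIES = {
    "P22": "father",
    "P25": "mother",
    "P3373": "sibling",
    "P26": "spouse",
    "P40": "child",
    "P1038": "relative",
    "P21": "sex or gender",
    "P27": "country of citizenship",
}


def extract_relations(entity):
    """Return the entity ID plus the target item IDs for each property of interest."""
    claims = entity.get("claims", {})
    row = {"id": entity.get("id")}
    for pid, label in PROPERTIES.items():
        values = []
        for statement in claims.get(pid, []):
            snak = statement.get("mainsnak", {})
            # Skip "novalue" / "somevalue" statements, which carry no datavalue.
            if snak.get("snaktype") != "value":
                continue
            value = snak.get("datavalue", {}).get("value")
            if isinstance(value, dict):
                if "id" in value:
                    values.append(value["id"])
                elif "numeric-id" in value:
                    # Fallback for values that only carry the numeric ID.
                    values.append("Q" + str(value["numeric-id"]))
        row[label] = values
    return row


def iter_dump_entities(lines):
    """Yield one entity dict per line, assuming one entity per line in the dump."""
    for line in lines:
        line = line.strip().rstrip(",")
        if line in ("", "[", "]"):
            continue
        yield json.loads(line)


def extract_to_json(dump_path, out_path):
    """Stream the dump and write one JSON object per line to out_path."""
    with open(dump_path, encoding="utf-8") as src, \
            open(out_path, "w", encoding="utf-8") as out:
        for entity in iter_dump_entities(src):
            out.write(json.dumps(extract_relations(entity)) + "\n")
```

Each output row holds Q-IDs rather than names; resolving them to labels would need a second pass over the entities' `labels` field. For XLSX output, the rows could be collected into a list and written with something like `pandas.DataFrame(rows).to_excel(...)`, if the filtered result fits in memory.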