Problem description and background
After reading the data from three files, I want to combine it with the merge method. The second merge raises a KeyError.
Relevant code
import pandas as pd

unit_data = pd.read_excel('../data/unit.xlsx', header=None, names=['id', 'name'])
group_data = pd.read_excel('../data/group.xlsx', header=None, names=['id', 'name', 'unit_id'])
person_data = pd.read_excel('../data/person.xlsx', header=None, names=['id', ' team_id', ' name', ' id_number', ' project_id', ' native_place', ' nation_code', ' birthday', ' type', ' is_duty', ' is_attendance', 'status', ' profession', ' manager_station'])
print('--------------------')
print(person_data.head())
df1 = pd.merge(left=group_data, right=unit_data, how='left', left_on='unit_id', right_on='id').rename(columns={'id_x': 'team_id', 'name_x': 'team_name', 'id_y': 'unit_id', 'name_y': 'unit_name'})
print(df1.head())
df2 = pd.merge(left=person_data, right=df1, how='left', left_on='team_id', right_on='team_id')
print(df2)
Output and error message
Traceback (most recent call last):
File "/Users/dengdai/workspace/bigdata/bigdata/pandas_test/code/data_contact.py", line 28, in <module>
df2 = pd.merge(left=person_data, right=df1, how='left', left_on='team_id', right_on='team_id')
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pandas/core/reshape/merge.py", line 87, in merge
validate=validate,
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pandas/core/reshape/merge.py", line 668, in __init__
) = self._get_merge_keys()
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pandas/core/reshape/merge.py", line 1046, in _get_merge_keys
left_keys.append(left._get_label_or_level_values(lk))
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pandas/core/generic.py", line 1684, in _get_label_or_level_values
raise KeyError(key)
KeyError: 'team_id'
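A likely cause, judging only from the code shown: most entries in the `names` list passed for `person_data` start with a space (`' team_id'`, `' name'`, and so on), so the frame has no column called exactly `'team_id'`. The sketch below uses a small in-memory frame as a hypothetical stand-in for person.xlsx, with made-up rows, and prints the column labels through `repr()` so the whitespace becomes visible.

```python
import pandas as pd

# Hypothetical stand-in for person.xlsx; the rows are made up, but the
# column names copy the leading spaces from the names list in the question.
person_data = pd.DataFrame(
    [[1, 10, "Alice"], [2, 20, "Bob"]],
    columns=["id", " team_id", " name"],
)

# repr() keeps the quotes, so leading or trailing whitespace is easy to spot.
column_reprs = [repr(c) for c in person_data.columns]
print(column_reprs)

# The merge key must match a column label exactly, spaces included.
has_plain_key = "team_id" in person_data.columns
has_spaced_key = " team_id" in person_data.columns
print(has_plain_key, has_spaced_key)
```

If the check reports that only the spaced label exists, that would explain why `left_on='team_id'` cannot find the key.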
My approach and what I have tried
I tried changing the column names of person_data, defining id as person_id and team_id as ID, but it still raises an error.
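The renaming attempt is not shown, so the following is only a guess at why it may not have helped. If the rename was keyed on `'team_id'` while the actual label is `' team_id'`, `DataFrame.rename` silently skips the missing key by default. A minimal sketch, again on a made-up frame, compares that with stripping whitespace from every column label:

```python
import pandas as pd

# Hypothetical frame with the same leading-space labels as in the question.
person_data = pd.DataFrame([[1, 10, "Alice"]], columns=["id", " team_id", " name"])

# rename() ignores keys that do not exist (errors="ignore" is the default),
# so mapping 'team_id' leaves the spaced label untouched.
renamed = person_data.rename(columns={"team_id": "ID"})
print(list(renamed.columns))

# Stripping whitespace from all labels yields the names the merge expects.
cleaned = person_data.copy()
cleaned.columns = cleaned.columns.str.strip()
print(list(cleaned.columns))
```

Alternatively, the spaces can be removed directly from the `names` list passed to `pd.read_excel`.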
Expected result
I want to merge person_data and df1 on the value of team_id to get a complete user information table.
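One way the intended result could be reached is sketched below, under the assumption that the leading spaces in the column names are the only problem. All frames and rows here are hypothetical stand-ins for the Excel files, and the `person_id` / `person_name` labels are illustrative choices. The sketch renames columns before each merge instead of after it. With the original approach, renaming `id_y` to `unit_id` after the first merge leaves two columns called `unit_id`, because the left frame already has one.

```python
import pandas as pd

# Hypothetical stand-ins for unit.xlsx, group.xlsx and person.xlsx.
unit_data = pd.DataFrame([[100, "Unit A"], [200, "Unit B"]], columns=["id", "name"])
group_data = pd.DataFrame(
    [[10, "Team X", 100], [20, "Team Y", 200]], columns=["id", "name", "unit_id"]
)
person_data = pd.DataFrame(
    [[1, 10, "Alice"], [2, 20, "Bob"], [3, 30, "Carol"]],
    columns=["id", " team_id", " name"],
)

# Remove stray whitespace so ' team_id' becomes 'team_id'.
person_data.columns = person_data.columns.str.strip()

# Give every frame unambiguous names up front; no _x/_y suffixes are needed.
teams = group_data.rename(columns={"id": "team_id", "name": "team_name"})
units = unit_data.rename(columns={"id": "unit_id", "name": "unit_name"})
people = person_data.rename(columns={"id": "person_id", "name": "person_name"})

# Teams with their unit, then people with their team.
df1 = teams.merge(units, how="left", on="unit_id")
df2 = people.merge(df1, how="left", on="team_id")
print(df2)
```

Because both merges use `how="left"`, a person whose `team_id` has no matching team (the third row in this made-up data) is kept, with missing values in the team and unit columns.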