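A question for everyone:
# Check the '单号' column for duplicate values and drop every duplicated row, keeping none of them, to get df2. Then take the difference (loosely speaking) between the original df1 and df2 as a separate table, df3. Why do I find that len(df3) + len(df2) < len(df1)?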
import pandas as pd
df1 = pd.read_pickle(r'E:\data.pickle')
df2 = df1.drop_duplicates(subset='单号', keep=False)
df2_list = df2.index.tolist() # list of df2's index labels, used for matching
df3 = df1[~df1.index.isin(df2_list)] # loosely, the set difference df1 - df2
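
One possible cause, offered as a guess since the pickle's contents aren't shown: if df1's index has repeated labels (for example after a pd.concat without ignore_index=True), then index.isin matches every row that shares a label with a row in df2, so df3 loses rows it should have kept. The toy data below is hypothetical and only reproduces that situation. The names dup_mask, df2_fixed and df3_fixed are made up for the illustration.

```python
import pandas as pd

# Hypothetical toy data (an assumption, not the real df1):
# the index labels repeat (0, 1, 0, 1).
df1 = pd.DataFrame(
    {"单号": ["A", "A", "B", "C"]},
    index=[0, 1, 0, 1],
)

# Same steps as in the question.
df2 = df1.drop_duplicates(subset="单号", keep=False)  # keeps rows B and C (labels 0 and 1)
df3 = df1[~df1.index.isin(df2.index.tolist())]        # labels 0 and 1 match every row of df1

print(df1.index.is_unique)            # False
print(len(df2), len(df3), len(df1))   # 2 0 4

# Label-independent split: one boolean mask computed from the column itself.
dup_mask = df1.duplicated(subset="单号", keep=False).to_numpy()
df2_fixed = df1[~dup_mask]  # rows whose '单号' occurs exactly once
df3_fixed = df1[dup_mask]   # every row whose '单号' occurs more than once
print(len(df2_fixed) + len(df3_fixed) == len(df1))  # True
```

If df1.index.is_unique comes back True on the real data, this is not the cause and something else is going on. If it is False, calling df1.reset_index(drop=True) before the split should also make the original index-based approach add up.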