import pandas as pd
from statsmodels.graphics.mosaicplot import mosaic
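data = pd.read_csv('adult.csv')
# Keep only the numeric columns plus the target label
cols = ['age', 'education_num', 'capital_gain', 'capital_loss', 'hours_per_week', 'label']
data = data[cols]
# Bin education_num into quartiles and cross-tabulate against label, with totals (margins)
cross1 = pd.crosstab(pd.qcut(data['education_num'], [0, 0.25, 0.5, 0.75, 1]), data['label'], margins=True)
print(cross1)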
The final output looks like this:
label          <=50K  >50K    All
education_num
(0.999, 9.0]   12835  1919   7291
(9.0, 10.0]     5904  1387   2449
(10.0, 12.0]    1823   626   8067
(12.0, 16.0]    4158  3909  14754
All            24720  7841  32561
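Why are the values in the All column on the right attached to the wrong rows? The row sums should be 14754, 7291, 2449 and 8067 from top to bottom, but the All column shows 7291, 2449, 8067 and 14754, i.e. each total is shifted up by one row, while the grand total of 32561 is correct.
Also, PyCharm's "View as DataFrame" does not work on the cross1 object: after clicking it, the view is empty.
In addition, one of the arguments to qcut seems to be flagged as a problem: PyCharm keeps showing a highlight box around it, and I don't know why.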
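
As a sanity check on the All column, here is a minimal sketch that builds the table without `margins=True` and adds the row totals by hand, so they can be compared with what `crosstab` reports. The `toy` DataFrame is made-up data standing in for adult.csv, and the names `toy`, `bins` and `counts` are only for illustration. This sidesteps the margin calculation rather than explaining why it goes wrong.

```python
import pandas as pd

# Hypothetical toy data standing in for adult.csv (values are made up).
toy = pd.DataFrame({
    "education_num": [1, 5, 9, 9, 10, 10, 11, 12, 13, 14, 16, 16],
    "label": ["<=50K", "<=50K", "<=50K", ">50K", "<=50K", ">50K",
              "<=50K", ">50K", "<=50K", ">50K", ">50K", ">50K"],
})

# Same quartile binning as in the question.
bins = pd.qcut(toy["education_num"], [0, 0.25, 0.5, 0.75, 1])

# Cross-tabulate WITHOUT margins, then compute the row totals directly
# from the counts so each total is tied to its own row.
counts = pd.crosstab(bins, toy["label"])
counts["All"] = counts.sum(axis=1)

print(counts)
```

If the hand-computed totals line up with their rows on the real data while the `margins=True` version does not, the counts themselves are fine and only the margin is being attached to the wrong rows.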