Gustab.M2 2021-12-29 07:23 Acceptance rate: 100%
Views: 88
Closed

Column names don't line up with the data after reading a file with pandas

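When I run tests with COSBench, the results are written to a CSV file. I planned to analyse them with pandas but got stuck at the very first step. I have worked around it for now by reading the file as plain text, but the pandas problem itself is still unsolved, so I'd like to ask for help here.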

I have a file named w108-8K-80%Read20%Write-160Thread.csv with 7 lines in total. Its contents are:

Stage,Op-Name,Op-Type,Op-Count,Byte-Count,Avg-ResTime,Avg-ProcTime,60%-ResTime,80%-ResTime,90%-ResTime,95%-ResTime,99%-ResTime,100%-ResTime,Throughput,Bandwidth,Succ-Ratio,Status,Detailed Status
s1-init,init-write,init,0,0,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,0,0,N/A,completed,waiting @ 2021-12-28 17:50:53,booting @ 2021-12-28 17:50:53,submitting @ 2021-12-28 17:50:53,authing @ 2021-12-28 17:50:53,launching @ 2021-12-28 17:50:53,running @ 2021-12-28 17:50:53,closing @ 2021-12-28 17:50:58,completed @ 2021-12-28 17:50:58
s2-prepare,prepare-write,prepare,100000,800000000,18.4,18.34,20,30,30,40,110,440,8870.62,70964947.96,100%,completed,waiting @ 2021-12-28 17:50:53,booting @ 2021-12-28 17:51:00,submitting @ 2021-12-28 17:50:59,authing @ 2021-12-28 17:50:59,launching @ 2021-12-28 17:50:59,running @ 2021-12-28 17:50:59,closing @ 2021-12-28 17:51:14,completed @ 2021-12-28 17:51:14
s3-main,read,read,15249196,121993568000,8.96,8.57,10,20,20,30,40,670,8471.9,67775177.22,100%,completed,waiting @ 2021-12-28 17:50:53,booting @ 2021-12-28 17:51:17,submitting @ 2021-12-28 17:51:17,authing @ 2021-12-28 17:51:17,launching @ 2021-12-28 17:51:17,running @ 2021-12-28 17:51:17,closing @ 2021-12-28 18:21:19,completed @ 2021-12-28 18:21:19
s3-main,write,write,3812554,30500432000,39.58,39.53,20,40,150,180,270,1430,2118.12,16944927.76,100%,completed,waiting @ 2021-12-28 17:50:53,booting @ 2021-12-28 17:51:17,submitting @ 2021-12-28 17:51:17,authing @ 2021-12-28 17:51:17,launching @ 2021-12-28 17:51:17,running @ 2021-12-28 17:51:17,closing @ 2021-12-28 18:21:19,completed @ 2021-12-28 18:21:19
s4-cleanup,cleanup-delete,cleanup,200000,0,31.32,31.32,20,40,70,170,260,830,5126.7,0,100%,completed,waiting @ 2021-12-28 17:50:53,booting @ 2021-12-28 18:21:20,submitting @ 2021-12-28 18:21:19,authing @ 2021-12-28 18:21:19,launching @ 2021-12-28 18:21:19,running @ 2021-12-28 18:21:19,closing @ 2021-12-28 18:22:04,completed @ 2021-12-28 18:22:05
s5-dispose,dispose-delete,dispose,0,0,N/A,N/A,N/A,N/A,N/A,N/A,N/A,N/A,0,0,N/A,completed,waiting @ 2021-12-28 17:50:53,booting @ 2021-12-28 18:22:07,submitting @ 2021-12-28 18:22:07,authing @ 2021-12-28 18:22:07,launching @ 2021-12-28 18:22:07,running @ 2021-12-28 18:22:07,closing @ 2021-12-28 18:22:17,completed @ 2021-12-28 18:22:17

Reading it with pandas:

import pandas as pd
import numpy as np
import os

fpath = r"E:\Python\AdvancedPython\data\w108-8K-80%Read20%Write-160Thread.csv"  # raw string so the backslashes are not treated as escapes
w108 = pd.read_csv(fpath, sep=",", header=0)
print(w108.head())
columns = w108.columns  # get the column names
print("="*20)
print(columns)
print("="*20)
print(w108['Stage'])

The output looks very strange:

E:\Python\AdvancedPython\venv\Scripts\python.exe E:/Python/AdvancedPython/test1.py
                                                                     Stage  ...                  Detailed Status
s1-init    init-write     init    0        0            NaN   NaN      NaN  ...  completed @ 2021-12-28 17:50:58
s2-prepare prepare-write  prepare 100000   800000000    18.40 18.34   20.0  ...  completed @ 2021-12-28 17:51:14
s3-main    read           read    15249196 121993568000 8.96  8.57    10.0  ...  completed @ 2021-12-28 18:21:19
           write          write   3812554  30500432000  39.58 39.53   20.0  ...  completed @ 2021-12-28 18:21:19
s4-cleanup cleanup-delete cleanup 200000   0            31.32 31.32   20.0  ...  completed @ 2021-12-28 18:22:05

[5 rows x 18 columns]
====================
Index(['Stage', 'Op-Name', 'Op-Type', 'Op-Count', 'Byte-Count', 'Avg-ResTime',
       'Avg-ProcTime', '60%-ResTime', '80%-ResTime', '90%-ResTime',
       '95%-ResTime', '99%-ResTime', '100%-ResTime', 'Throughput', 'Bandwidth',
       'Succ-Ratio', 'Status', 'Detailed Status'],
      dtype='object')
====================
s1-init     init-write      init     0         0             NaN    NaN       NaN
s2-prepare  prepare-write   prepare  100000    800000000     18.40  18.34    20.0
s3-main     read            read     15249196  121993568000  8.96   8.57     10.0
            write           write    3812554   30500432000   39.58  39.53    20.0
s4-cleanup  cleanup-delete  cleanup  200000    0             31.32  31.32    20.0
s5-dispose  dispose-delete  dispose  0         0             NaN    NaN       NaN
Name: Stage, dtype: float64

Process finished with exit code 0

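In theory, with the first line used as the column names, the data under Stage should be s1-init, s2-prepare, s3-main and so on, but what actually comes back is a mess that I can't make sense of. I hope someone who knows pandas can clear this up.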

2 answers

  • bekote 2021-12-29 07:56

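    The file is read with commas as the separator. I counted 18 fields in the header, but the data rows have more than 18, so one of the columns probably contains commas in its content. You could preprocess the file first and wrap the values that contain commas in double quotes, or read it in and merge the extra fields afterwards.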
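
The "merge the extra fields" suggestion could look roughly like the sketch below. It assumes that only the last column (Detailed Status) contains unquoted commas, so every field beyond the header width belongs to it. The helper name read_ragged_csv and the shortened sample rows are made up for illustration and are not from the original file.

```python
import csv
import io

import pandas as pd


def read_ragged_csv(lines):
    """Read CSV text whose LAST column contains unquoted commas.

    `lines` is any iterable of text lines (an open file works).
    Hypothetical helper: fields beyond the header width are joined
    back into the last column before pandas parses the data.
    """
    reader = csv.reader(lines)
    header = next(reader)
    width = len(header)

    # Re-write the rows into an in-memory CSV; csv.writer quotes any
    # field that contains a comma, so pandas sees `width` fields per row.
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(header)
    for fields in reader:
        if not fields:  # skip blank lines
            continue
        merged_tail = ",".join(fields[width - 1:])
        writer.writerow(fields[:width - 1] + [merged_tail])

    buffer.seek(0)
    return pd.read_csv(buffer)


# Shortened, illustrative sample with the same defect as the real file.
sample = io.StringIO(
    "Stage,Op-Count,Status,Detailed Status\n"
    "s1-init,0,completed,waiting @ 2021-12-28 17:50:53,completed @ 2021-12-28 17:50:58\n"
    "s2-prepare,100000,completed,waiting @ 2021-12-28 17:50:53,completed @ 2021-12-28 17:51:14\n"
)
df = read_ragged_csv(sample)
print(df["Stage"].tolist())  # ['s1-init', 's2-prepare']
```

For the real file, the same helper could be called on a handle opened with open(fpath, newline=""). Because the repaired text still goes through pd.read_csv, the numeric columns keep their inferred types and the N/A entries should still be read as missing values.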

    Selected by the asker as the best answer.
    Gustab.M2 2021-12-29 08:43

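    I see. I hadn't noticed that the first line and the lines after it have different numbers of commas.
    No wonder something so strange was happening.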
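
The mismatch described in this comment can be checked before loading anything into pandas. The sketch below counts the fields on each line using only the standard csv module. The function name field_counts and the two-column sample are illustrative assumptions. For the file in the question, the same count gives 18 for the header and 25 for each data row. When the data rows are wider than the header, pandas uses the leading extra fields as the index, which fits the output in the question: the first seven fields show up as row labels and the column named Stage holds values from a later field.

```python
import csv
import io


def field_counts(lines):
    """Return the number of fields in each non-empty CSV row."""
    return [len(row) for row in csv.reader(lines) if row]


# Illustrative sample: the header has 2 fields, the data row has 3
# because the last value contains an unquoted comma.
sample = io.StringIO(
    "Status,Detailed Status\n"
    "completed,waiting @ 17:50:53,completed @ 17:50:58\n"
)
counts = field_counts(sample)
print(counts)  # [2, 3]

# A mismatch between the header and any data row signals the problem.
has_mismatch = any(n != counts[0] for n in counts[1:])
print(has_mismatch)  # True
```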

    Gustab.M2 2021-12-29 08:44

    COSBench has led me astray!


Question events

  • Closed by the system on January 5
  • Answer accepted on December 29
  • Question created on December 29
