Why is reading lines from stdin much slower in C++ than Python?

I wanted to compare reading lines of string input from stdin using Python and C++ and was shocked to see my C++ code run an order of magnitude slower than the equivalent Python code. Since my C++ is rusty and I'm not yet an expert Pythonista, please tell me if I'm doing something wrong or if I'm misunderstanding something.

(TLDR answer: include the statement cin.sync_with_stdio(false), or just use fgets instead.

TLDR results: scroll all the way down to the bottom of my question and look at the table.)

C++ code:

#include <iostream>
#include <string>
#include <time.h>

using namespace std;

int main() {
    string input_line;
    long line_count = 0;
    time_t start = time(NULL);
    int sec;
    int lps;

    while (cin) {
        getline(cin, input_line);
        if (!cin.eof())
            line_count++;
    };

    sec = (int) time(NULL) - start;
    cerr << "Read " << line_count << " lines in " << sec << " seconds.";
    if (sec > 0) {
        lps = line_count / sec;
        cerr << " LPS: " << lps << endl;
    } else
        cerr << endl;
    return 0;
}

// Compiled with:
// g++ -O3 -o readline_test_cpp foo.cpp

Python Equivalent:

#!/usr/bin/env python
import time
import sys

count = 0
start = time.time()

for line in  sys.stdin:
    count += 1

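delta_sec = int(time.time() - start)
if delta_sec > 0:
    lines_per_sec = int(round(count / delta_sec))
    print("Read {0} lines in {1} seconds. LPS: {2}".format(count, delta_sec,
       lines_per_sec))
else:
    # Elapsed time rounded down to 0 seconds: skip LPS to avoid dividing by zero.
    print("Read {0} lines in {1} seconds.".format(count, delta_sec))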

Here are my results:

$ cat test_lines | ./readline_test_cpp
Read 5570000 lines in 9 seconds. LPS: 618889

$ cat test_lines | ./readline_test.py
Read 5570000 lines in 1 seconds. LPS: 5570000

I should note that I tried this both under Mac OS X v10.6.8 (Snow Leopard) and Linux 2.6.32 (Red Hat Linux 6.2). The former is a MacBook Pro, and the latter is a very beefy server, not that this is too pertinent.

$ for i in {1..5}; do echo "Test run $i at `date`"; echo -n "CPP:"; cat test_lines | ./readline_test_cpp ; echo -n "Python:"; cat test_lines | ./readline_test.py ; done
Test run 1 at Mon Feb 20 21:29:28 EST 2012
CPP:   Read 5570001 lines in 9 seconds. LPS: 618889
Python:Read 5570000 lines in 1 seconds. LPS: 5570000
Test run 2 at Mon Feb 20 21:29:39 EST 2012
CPP:   Read 5570001 lines in 9 seconds. LPS: 618889
Python:Read 5570000 lines in 1 seconds. LPS: 5570000
Test run 3 at Mon Feb 20 21:29:50 EST 2012
CPP:   Read 5570001 lines in 9 seconds. LPS: 618889
Python:Read 5570000 lines in 1 seconds. LPS: 5570000
Test run 4 at Mon Feb 20 21:30:01 EST 2012
CPP:   Read 5570001 lines in 9 seconds. LPS: 618889
Python:Read 5570000 lines in 1 seconds. LPS: 5570000
Test run 5 at Mon Feb 20 21:30:11 EST 2012
CPP:   Read 5570001 lines in 10 seconds. LPS: 557000
Python:Read 5570000 lines in  1 seconds. LPS: 5570000

Tiny benchmark addendum and recap

For completeness, I thought I'd update the read speed for the same file on the same box with the original (synced) C++ code. Again, this is for a 100M line file on a fast disk. Here's the comparison, with several solutions/approaches:

Implementation      Lines per second
python (default)           3,571,428
cin (default/naive)          819,672
cin (no sync)             12,500,000
fgets                     14,285,714
wc (not fair comparison)  54,644,808

Reposted from: https://stackoverflow.com/questions/9371238/why-is-reading-lines-from-stdin-much-slower-in-c-than-python

Answer by ℡Wang Yan, 2012-02-21 03:24:
By default, cin is synchronized with stdio, which causes it to avoid any input buffering. If you add this to the top of your main, you should see much better performance:

std::ios_base::sync_with_stdio(false);

Normally, when an input stream is buffered, instead of reading one character at a time, the stream will be read in larger chunks. This reduces the number of system calls, which are typically relatively expensive. However, since the FILE* based stdio and iostreams often have separate implementations and therefore separate buffers, this could lead to a problem if both were used together. For example:

int myvalue1;
cin >> myvalue1;
int myvalue2;
scanf("%d", &myvalue2);

If more input was read by cin than it actually needed, then the second integer value wouldn't be available for the scanf function, which has its own independent buffer. This would lead to unexpected results.

To avoid this, by default, streams are synchronized with stdio. One common way to achieve this is to have cin read each character one at a time as needed using stdio functions. Unfortunately, this introduces a lot of overhead. For small amounts of input, this isn't a big problem, but when you are reading millions of lines, the performance penalty is significant.

Fortunately, the library designers decided that you should also be able to disable this feature to get improved performance if you knew what you were doing, so they provided the sync_with_stdio method.
(This answer was selected by the asker as the best answer.)
