C++: Improving ifstream binary file reading speed

I am rewriting a small program from PHP to C++. The idea is basically to read through a 32 GB file on an SSD and do some simple operations on it.

I am using Visual Studio 2012 with an x64 release build. PHP is 5.3, 32-bit.

The problem is that the raw reading speed in PHP is higher than in C++, and this really puzzles me. PHP does ~350 MB/s, while the C++/ifstream code does 180 MB/s.

The code is really simple:

ifstream datafile("data.txt", ios::binary);

while (datafile.read((char*)buffer, data_per_chunk)) {
    // do stuff
}

I've tried different buffer sizes up to 16 MB and it made little difference. I also tried to set the internal buffer via datafile.rdbuf()->pubsetbuf(...), but that didn't make a difference either.

Are there any hints on how to speed ifstream up without reverting to the ancient C-level interface? I would like to at least reach PHP's level of performance. Maybe some fancy read-ahead / cache settings or something.

I understand that memory-mapped files would likely help, but I would prefer to tweak the settings of ifstream if possible, to keep things simple: the file is significantly larger than physical RAM and larger than 4 GB, i.e. mapping it is a no-go for potential 32-bit builds.


It turned out that you can reach maximum SSD reading speed even with ifstream.

To do so, you need to set the internal ifstream read buffer to ~2 MB, which is where peak SSD read speed occurs, while still fitting nicely in the CPU's L2 cache. Then you need to read the data out in chunks smaller than the internal buffer. I got the best results reading data in 8-16 kB chunks, but that was only about 1% faster than reading in 1 MB chunks.

Setting the ifstream internal buffer:

ifstream datafile("base.txt", ios::binary);
datafile.rdbuf()->pubsetbuf(iobuf, sizeof iobuf);
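The two steps described above (a large stream buffer, then reads in smaller chunks) could be put together roughly as follows. This is only a sketch: read_all and its parameter names are hypothetical, and the default sizes simply mirror the numbers mentioned in this answer. Note that what pubsetbuf does on a file stream is implementation-defined; the call order below follows the snippet above, but some standard libraries only honor the call if it is made before the file is opened, so it is worth measuring on your own toolchain.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <vector>

// Hypothetical helper (not from the original post): streams through `path`
// and returns the number of bytes read. The default sizes are assumptions
// taken from the figures quoted in the answer (~2 MB buffer, 16 kB chunks).
std::size_t read_all(const char* path,
                     std::size_t stream_buf_size = 2 * 1024 * 1024,
                     std::size_t chunk_size = 16 * 1024)
{
    assert(stream_buf_size > 0 && chunk_size > 0);

    // Declared before the stream so the buffer outlives it.
    std::vector<char> iobuf(stream_buf_size);
    std::vector<char> chunk(chunk_size);

    std::ifstream datafile(path, std::ios::binary);
    if (!datafile) {
        return 0;  // could not open the file
    }

    // Same order as in the answer: set the buffer right after opening,
    // before any read. Whether this takes effect is implementation-defined.
    datafile.rdbuf()->pubsetbuf(iobuf.data(),
                                static_cast<std::streamsize>(iobuf.size()));

    std::size_t total = 0;
    for (;;) {
        datafile.read(chunk.data(), static_cast<std::streamsize>(chunk.size()));
        const std::streamsize got = datafile.gcount();
        if (got <= 0) {
            break;  // end of file or error
        }
        // gcount() also covers the final, shorter chunk, which a plain
        // `while (datafile.read(...))` loop would skip.
        total += static_cast<std::size_t>(got);
        // ... process chunk[0 .. got) here ...
    }
    return total;
}
```

Checking gcount() instead of the stream state matters when the file size is not a multiple of the chunk size, because read() sets failbit on the last short read even though it delivered data.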

With all these tweaks I got a read speed of 495 MB/s, which is close to the theoretical maximum of the 480 GB M500 SSD. During execution the CPU load was 5%, which means that it was not really limited by ifstream implementation overhead.

I found no observable speed difference between ifstream and std::basic_filebuf.
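For anyone wanting to reproduce figures like these, a throughput measurement could be sketched as below. The names Throughput and measure are made up for illustration; the reader argument is any callable that performs the read loop and returns the number of bytes it consumed. Keep in mind that repeated runs over the same file may be served from the operating system's file cache rather than the SSD, which inflates the numbers.

```cpp
#include <chrono>
#include <cstddef>

// Hypothetical result type: bytes read, elapsed time, and MB/s
// (1 MB taken as 1024 * 1024 bytes here, which is an assumption).
struct Throughput {
    std::size_t bytes;
    double seconds;
    double mb_per_s;
};

// Times a callable that does the actual reading and returns a byte count.
template <typename Reader>
Throughput measure(Reader reader)
{
    const auto start = std::chrono::steady_clock::now();
    const std::size_t bytes = reader();
    const auto stop = std::chrono::steady_clock::now();

    const double seconds = std::chrono::duration<double>(stop - start).count();
    const double megabytes = static_cast<double>(bytes) / (1024.0 * 1024.0);

    Throughput result;
    result.bytes = bytes;
    result.seconds = seconds;
    // Guard against a zero-length interval on very fast runs.
    result.mb_per_s = seconds > 0.0 ? megabytes / seconds : 0.0;
    return result;
}
```

A steady clock is used so that system clock adjustments during a long run do not distort the result.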

I don't see the point of using ifstream when you're reading it all into a buffer. Either basic_filebuf or the "ancient" C interface will work. You need to compare ifstream to the C interface first, so that you know it's really ifstream that is to blame.

I see the following options, in order of increasing performance:

  • std::ifstream: read, etc.
  • std::basic_filebuf: open, sgetn, etc.
  • C: fopen, fread, etc.
  • WinApi: CreateFile (not OpenFile!), ReadFileEx, etc.

Perhaps PHP is not using the C interface internally but the WinAPI, and that is where the difference comes from.
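To make the suggested comparison against the C interface concrete, a baseline read loop using fopen/fread might look like this. It is a minimal sketch: read_all_c and its chunk_size parameter are hypothetical names, and the default chunk size is just an assumption to be tuned.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <vector>

// Hypothetical baseline (not from the original post): reads the whole file
// through C stdio and returns the number of bytes read.
std::size_t read_all_c(const char* path, std::size_t chunk_size = 16 * 1024)
{
    assert(chunk_size > 0);

    std::FILE* f = std::fopen(path, "rb");  // "b": binary mode, no translation
    if (!f) {
        return 0;  // could not open the file
    }

    std::vector<char> chunk(chunk_size);
    std::size_t total = 0;
    std::size_t got = 0;
    // fread returns the number of items read; with item size 1 that is bytes.
    while ((got = std::fread(chunk.data(), 1, chunk.size(), f)) > 0) {
        total += got;
        // ... process chunk[0 .. got) here ...
    }

    std::fclose(f);
    return total;
}
```

Running this and the ifstream loop over the same file, with the same chunk size, shows whether the stream layer itself accounts for the gap or whether both sit at the same level below PHP.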
