Read whole ASCII file into C++ std::string [duplicate]

I need to read a whole file into memory and place it in a C++ std::string.

If I were to read it into a char[], the answer would be very simple:

std::ifstream t;
int length;
t.open("file.txt");           // open input file
t.seekg(0, std::ios::end);    // go to the end
length = t.tellg();           // report location (this is the length)
t.seekg(0, std::ios::beg);    // go back to the beginning
char* buffer = new char[length];  // allocate a buffer of the appropriate size
t.read(buffer, length);       // read the whole file into the buffer
t.close();                    // close file handle

// ... Do stuff with buffer here ...

Now, I want to do the exact same thing, but using a std::string instead of a char[]. I want to avoid loops, i.e. I don't want to:

std::ifstream t;
t.open("file.txt");
std::string buffer;
std::string line;
while (std::getline(t, line)) {
    // ... Append line to buffer and go on
}
t.close();

Any ideas?

Reposted from: https://stackoverflow.com/questions/2602013/read-whole-ascii-file-into-c-stdstring

csdnceshi74
7*4 Casting the streampos returned by tellg() into an int is not guaranteed to return the length of the file. If you subtract the streampos at the start of the file from that at the end of the file, you will get a streamoff which is guaranteed to be of an integral type and represent an offset in the file, at least in C++11. See cplusplus.com/reference/ios/streamoff and the comment in stackoverflow.com/a/10135341/1908650. See stackoverflow.com/a/2409527/1908650 for a safe version.
about 4 years ago
csdnceshi76
斗士狗 "the answer would be very simple". Understandable yes, simple no ;-)
more than 4 years ago
csdnceshi73
喵-见缝插针 Side note for anyone looking at this code: The code presented as an example for reading into char[] does not null-terminate the array (read does not do this automatically), which may not be what you expect.
more than 4 years ago
csdnceshi72
谁还没个明天 (1) Your link is outdated (C++11 made it contiguous) and (2) even if it wasn't, std::getline(istream&, std::string&) would still do the right thing.
almost 5 years ago
csdnceshi59
ℙℕℤℝ This code is buggy, in the event that the std::string doesn't use a continuous buffer for its string data (which is allowed): stackoverflow.com/a/1043318/1602642
about 7 years ago
weixin_41568110
七度&光 I believe that the poster knew that reading bytes involved looping. He just wanted an easy, perl-style gulp equivalent. That involved writing little code.
more than 8 years ago
csdnceshi79
python小菜 There will always be a loop involved, but it can be implicit as part of the standard library. Is that acceptable? Why are you trying to avoid loops?
more than 10 years ago

9 answers

Update: Turns out that this method, while following STL idioms well, is actually surprisingly inefficient! Don't do this with large files. (See: http://insanecoding.blogspot.com/2011/11/how-to-read-in-file-in-c.html)

You can make a streambuf iterator out of the file and initialize the string with it:

#include <string>
#include <fstream>
#include <streambuf>

std::ifstream t("file.txt");
std::string str((std::istreambuf_iterator<char>(t)),
                 std::istreambuf_iterator<char>());

Not sure where you're getting the t.open("file.txt", "r") syntax from. As far as I know that's not a method that std::ifstream has. It looks like you've confused it with C's fopen.

Edit: Also note the extra parentheses around the first argument to the string constructor. These are essential. They prevent the problem known as the "most vexing parse", which in this case won't actually give you a compile error like it usually does, but will give you interesting (read: wrong) results.
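
As a rough sketch of an alternative, assuming a C++11 compiler: brace initialization cannot be parsed as a function declaration, so the extra parentheses are not needed there. The helper name slurp_with_iterators is made up for this illustration.

```cpp
#include <fstream>
#include <iterator>
#include <string>

// Hypothetical helper (not from the answer): reads all of `path`
// through stream-buffer iterators.
std::string slurp_with_iterators(const std::string& path) {
    std::ifstream in(path);
    // Braces cannot be read as a declaration, so no extra parentheses here.
    // The iterators are not convertible to char, so the iterator-range
    // constructor is selected rather than the initializer_list one.
    std::string contents{std::istreambuf_iterator<char>(in),
                         std::istreambuf_iterator<char>()};
    return contents;  // empty if the file could not be opened
}
```

The performance caveat from the update at the top of this answer applies to this form as well, since it still walks the stream one character at a time.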

Following KeithB's point in the comments, here's a way to do it that allocates all the memory up front (rather than relying on the string class's automatic reallocation):

#include <string>
#include <fstream>
#include <streambuf>

std::ifstream t("file.txt");
std::string str;

t.seekg(0, std::ios::end);   
str.reserve(t.tellg());
t.seekg(0, std::ios::beg);

str.assign((std::istreambuf_iterator<char>(t)),
            std::istreambuf_iterator<char>());
weixin_41568196
撒拉嘿哟木头 With C++17 you can shorten the std::string initialization line quite nicely (and similarly for the str.assign method): std::string str{std::istreambuf_iterator{in}, {}};. This uses C++11 brace initialization syntax and C++17 deduction guides (to omit the <char>).
almost 2 years ago
csdnceshi65
larry*wei Yep, this answer is quite messy. In particular: does the update ("don't do this with large files") refer to the first code? What exactly is the inefficiency? Does the second code fix it?
about 2 years ago
csdnceshi56
lrony* While this answer is highly ranked, with the updates and edits stating that the method is slow now it's a mess. Do you have a method that is using stl facilities that is also fast? If so, clean all the mess and just write it in a concise way.
more than 2 years ago
weixin_41568184
叼花硬汉 Luke, use std::ios::ate: std::ifstream t("file.txt", std::ios::in | std::ios::binary | std::ios::ate); str.reserve(t.tellg());
almost 3 years ago
csdnceshi64
游.程 The insanecoding blog post is benchmarking solutions to a slightly different problem: it is reading the file as binary not text, so there's no translation of line endings. As a side effect, reading as binary makes ftell a reliable way to get the file length (assuming a long can represent the file length, which is not guaranteed). For determining the length, ftell is not reliable on a text stream. If you're reading a file from tape (e.g., a backup), the extra seeking may be a waste of time. Many of the blog post implementations don't use RAII and can therefore leak if there's an error.
almost 7 years ago
csdnceshi63
elliott.david You're right. About a year after I wrote this post, somebody did some benchmarking of various approaches to this problem and found that reserve+assign unfortunately does not seem to work the way that you would hope it did. And it turns out that in general iterators produce a surprising amount of overhead. Disappointing. Edited this into the post.
almost 8 years ago
csdnceshi58
Didn"t forge Benchmarked: both Tyler's solutions take about 21 seconds on a 267 MB file. Jerry's first takes 1.2 seconds and his second 0.5 (+/- 0.1), so clearly there's something inefficient about Tyler's code.
almost 8 years ago
weixin_41568110
七度&光 He probably means dereferencing. For a 1MB file this would require about 1M compares, which doesn't seem really efficient.
about 8 years ago
csdnceshi62
csdnceshi62 N. You're going to have to explain that a little more, it looks like once to me.
about 8 years ago
csdnceshi74
7*4 Not sure why people are voting this up. Here is a quick question: say I have a 1MB file, how many times will the "end" passed to the std::string constructor or assign method be invoked? People think these kinds of solutions are elegant when in fact they are excellent examples of HOW NOT TO DO IT.
more than 9 years ago
csdnceshi57
perhaps? '\n' is merely a portable way of specifying newline in C code. Down below the compiler will still translate '\n' to what's appropriate for a newline for the compiler's operating system.
more than 10 years ago
csdnceshi64
游.程 Note that the file may be longer than the string. If your OS uses <CR><LF> (two chars) as a line separator, the string will use '\n' (one char). Text streams do conversions to and from '\n' to the underlying representation.
more than 10 years ago
csdnceshi57
perhaps? In the str.assign() approach the first argument's parentheses are unnecessary, because it can't parse as a declaration.
more than 10 years ago
csdnceshi80
胖鸭 Of course, the read() method undoubtedly has lots of looping going on. The question is not whether it loops but where and how explicitly.
more than 10 years ago
csdnceshi63
elliott.david If efficiency is important, you could find the file length the same way as in the char* example and call std::string::reserve to preallocate the necessary space.
more than 10 years ago
csdnceshi60
℡Wang Yan No second parameter is required: ifstreams are input streams.
more than 10 years ago
csdnceshi51
旧行李 Yep, I am starting off with C++ and I'm still quite illiterate. Thanks for the answer, though, it is exactly what I needed. +1.
more than 10 years ago
csdnceshi73
喵-见缝插针 This is just making the explicit loop implicit. Since the iterator is a forward iterator, it will be read one character at a time. Also, since there is no way for the string constructor to know the final length, it will probably lead to several allocations and copies of the data.
more than 10 years ago
csdnceshi63
elliott.david Right. I was saying that ifstream doesn't have a method with the signature open(const char*, const char*)
more than 10 years ago
csdnceshi66
必承其重 | 欲带皇冠 open is definitely a method of ifstream, however the 2nd parameter is wrong. cplusplus.com/reference/iostream/ifstream/open
more than 10 years ago

There are a couple of possibilities. One I like is to use a stringstream as a go-between:

#include <fstream>
#include <sstream>
std::ifstream t("file.txt");
std::stringstream buffer;
buffer << t.rdbuf();

Now the contents of "file.txt" are available in a string as buffer.str().
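
For anyone who wants this as a reusable function, here is a minimal sketch. The function name read_file_via_stringstream and the choice to throw std::runtime_error on a failed open are assumptions of mine, not part of the answer.

```cpp
#include <fstream>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical helper: returns the contents of `path`,
// or throws if the file cannot be opened (an assumed error policy).
std::string read_file_via_stringstream(const std::string& path) {
    std::ifstream in(path);
    if (!in) {
        throw std::runtime_error("cannot open " + path);
    }
    std::ostringstream buffer;
    buffer << in.rdbuf();  // copies the whole stream buffer in one expression
    return buffer.str();
}
```

Both streams are closed by their destructors when the function returns, so no explicit close() call is needed.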

Another possibility (though I certainly don't like it as well) is much more like your original:

std::ifstream t("file.txt");
t.seekg(0, std::ios::end);
size_t size = t.tellg();
std::string buffer(size, ' ');
t.seekg(0);
t.read(&buffer[0], size); 

Officially, this isn't required to work under the C++98 or 03 standard (string isn't required to store data contiguously) but in fact it works with all known implementations, and C++11 and later do require contiguous storage, so it's guaranteed to work with them.

As to why I don't like the latter as well: first, because it's longer and harder to read. Second, because it requires that you initialize the contents of the string with data you don't care about, then immediately write over that data (yes, the time to initialize is usually trivial compared to the reading, so it probably doesn't matter, but to me it still feels kind of wrong). Third, in a text file, position X in the file doesn't necessarily mean you'll have read X characters to reach that point -- it's not required to take into account things like line-end translations. On real systems that do such translations (e.g., Windows) the translated form is shorter than what's in the file (i.e., "\r\n" in the file becomes "\n" in the translated string) so all you've done is reserved a little extra space you never use. Again, doesn't really cause a major problem but feels a little wrong anyway.
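
One way to sidestep the size mismatch, sketched under the assumption that raw bytes (no line-ending translation) are acceptable: open the file in binary mode, compute the length as the difference of two stream positions, and trim the string to what read() actually delivered. The name read_file_binary and the empty-string-on-failure policy are illustrative choices only.

```cpp
#include <cstddef>
#include <fstream>
#include <string>

// Hypothetical helper: reads `path` in binary mode, so "\r\n" stays "\r\n"
// and the stream length matches the number of bytes read.
std::string read_file_binary(const std::string& path) {
    std::ifstream in(path, std::ios::in | std::ios::binary);
    if (!in) return std::string();  // assumption: unopenable file yields ""

    in.seekg(0, std::ios::end);
    const std::ifstream::pos_type end_pos = in.tellg();
    in.seekg(0, std::ios::beg);
    const std::ifstream::pos_type begin_pos = in.tellg();

    // Subtracting two positions yields a std::streamoff, an integral offset.
    const std::streamoff size = end_pos - begin_pos;
    if (size <= 0) return std::string();

    std::string contents(static_cast<std::size_t>(size), '\0');
    in.read(&contents[0], static_cast<std::streamsize>(size));
    // Keep only what was actually read, in case the read came up short.
    contents.resize(static_cast<std::size_t>(in.gcount()));
    return contents;
}
```

The string is still value-initialized before being overwritten, so the second objection above remains; this only addresses the length question.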

csdnceshi55
~Onlooker It is possible to get the total number of characters read (not bytes/chars!) by using the std::basic_istream::gcount function. I believe one should strip off the unused bytes by adding a buffer.resize(t.gcount());.
about 2 years ago
weixin_41568134
MAO-EYE After puzzling over this for a few minutes (compiler errors -- Windows 10, VS2015), I found I need to include BOTH #include <sstream> and #include <fstream>. Best of luck!
more than 3 years ago
csdnceshi62
csdnceshi62 No. It probably doesn't make much sense to read into a string unless your data is actually a string, but that's not really a limitation, just good sense.
almost 4 years ago
csdnceshi66
必承其重 | 欲带皇冠 OP asked for code to read an ASCII file into a string. Will this read any file, or is there something ASCII-specific lurking under the hood?
almost 4 years ago
csdnceshi72
谁还没个明天 The suggestion was wrong, but your description seems off. One cannot reserve then read into the reserved space because no elements exist in the new space. Operations are only valid on elements between begin and begin + size - 1. The reserve only increases capacity, beyond size. Only the space exists there; elements do not. To create elements, one must use resize, emplace_back, etc. That's why, if using the 2nd method here, the container must first have its entire size declared and all elements default constructed... just so that they can immediately be overwritten.
about 4 years ago
csdnceshi62
csdnceshi62 Not really--you want to set the length before you do the read, so you'll have enough space to read into. By the time you call read, it's too late to set the size.
about 4 years ago
csdnceshi54
hurriedly% Can you get the number of chars read in the t.read() call and use that to set the string length?
about 4 years ago
csdnceshi60
℡Wang Yan You should not use reserve(), because the size() information is not correctly maintained and the string is in a broken state!
almost 5 years ago
csdnceshi76
斗士狗 fwiw, on OSX 10.10, I needed to #include <fstream> instead of <sstream>
about 5 years ago
csdnceshi69
YaoRaoLov If you want to get the file as a std::string see stackoverflow.com/questions/116038/… for a one liner solution.
almost 6 years ago
csdnceshi65
larry*wei If anyone is still interested, the answer to dhardy's question can be found in the ifstream doc: "This function simply copies a block of data, without checking its contents nor appending a null character at the end."
about 6 years ago
csdnceshi61
derek5. Wouldn't constructing an empty string and then calling reserve(size) on it be more efficient?
about 6 years ago
csdnceshi53
Lotus@ This worked perfectly for my needs! thanks.
more than 7 years ago
csdnceshi62
csdnceshi62 Into buffer.
more than 7 years ago
csdnceshi74
7*4 Where is the data stored in example 2?
more than 7 years ago
csdnceshi59
ℙℕℤℝ According to my testing (GCC 4.7), the buffer contains the same number of characters as the file size no matter which line endings are used. I'm guessing read(buf, size) turns off these conversions — anyone know?
almost 8 years ago
csdnceshi62
csdnceshi62 Most of the time, you're fine not testing whether the file has opened (the other operations will simply fail). As a rule, you should avoid printing out error messages on the spot, unless you're sure that fits with the rest of the program -- if you must do something, throwing an exception is usually preferable. You should almost never explicitly close a file either -- the destructor will do that automatically.
about 8 years ago
csdnceshi80
胖鸭 Should also check to see if the file has opened, e.g., if (!t) std::cerr << "Error opening file." << std::endl;. Of course, don't forget to close the file as well when you are done.
about 8 years ago
weixin_41568110
七度&光 Make sure to #include <sstream>.
about 8 years ago
csdnceshi57
perhaps? Important note for some: at least on my implementation, the three-liner works at least as well as the C fopen alternative for files under 50KB. Past that, it seems to lose performance fast, in which case just use the second solution.
almost 9 years ago
csdnceshi58
Didn"t forge This should've been marked as the answer.
about 9 years ago
csdnceshi63
elliott.david The three-liner works like a charm!
about 9 years ago

I think the best way is to use a string stream. Simple and quick!

std::ifstream inFile;
inFile.open(inFileName);              // open the input file (inFileName is defined elsewhere)

std::stringstream strStream;
strStream << inFile.rdbuf();          // read the whole file
std::string str = strStream.str();    // str holds the content of the file

std::cout << str << std::endl;        // you can do anything with the string
csdnceshi66
必承其重 | 欲带皇冠 But in Cfree 5, string is not identified as a name type. In that case, will 'char' type work?
almost 4 years ago
csdnceshi69
YaoRaoLov Yes, as Jerry said 3 years earlier.
about 4 years ago
weixin_41568131
10.24 Or let the destructor do it automatically - take advantage of C++!
more than 4 years ago
csdnceshi52
妄徒之命 Remember to close the stream afterwards...
more than 4 years ago
csdnceshi71
Memor.の Simple and quick, right! insanecoding.blogspot.com/2011/11/how-to-read-in-file-in-c.html
about 6 years ago

Try one of these two methods:

std::string get_file_string(){
    std::ifstream ifs("path_to_file");
    return std::string((std::istreambuf_iterator<char>(ifs)),
                       (std::istreambuf_iterator<char>()));
}

std::string get_file_string2(){
    std::ifstream inFile;
    inFile.open("path_to_file"); // open the input file

    std::stringstream strStream;
    strStream << inFile.rdbuf(); // read the file
    return strStream.str();      // the stream's string holds the content of the file
}

I figured out another way that works with most istreams, including std::cin!

std::string readFile()
{
    std::stringstream str;
    std::ifstream stream("Hello_World.txt");
    if (stream.is_open())
    {
        while (stream.peek() != EOF)
        {
            str << (char) stream.get();
        }
        stream.close();
    }
    return str.str();  // empty if the file could not be opened
}
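
Since the point here is that the approach is not tied to files, a version that takes the stream as a parameter may make that clearer. This is only a sketch: the name read_stream is made up, and it drains the stream through rdbuf() rather than the peek()/get() loop above.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: drains any std::istream (a file, std::cin,
// a string stream) into a std::string.
std::string read_stream(std::istream& in) {
    std::ostringstream out;
    out << in.rdbuf();  // copy everything that is left in the stream
    return out.str();
}
```

Calling read_stream(std::cin) would then read standard input until end of input, and passing an open std::ifstream reads the rest of that file.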

I could do it like this:

void readfile(const std::string &filepath,std::string &buffer){
    std::ifstream fin(filepath.c_str());
    getline(fin, buffer, char(-1));
    fin.close();
}

If this is something to be frowned upon, please let me know why.

csdnceshi78
程序go char(-1) is probably not a portable way to denote EOF. Also, getline() implementations are not required to support the "invalid" EOF pseudo-character as a delimiter character, I think.
more than 7 years ago

If you happen to use glibmm you can try Glib::file_get_contents.

#include <iostream>
#include <glibmm.h>

int main() {
    auto filename = "my-file.txt";
    try {
        std::string contents = Glib::file_get_contents(filename);
        std::cout << "File data:\n" << contents << std::endl;
    } catch (const Glib::FileError& e) {
        std::cout << "Oops, an error occurred:\n" << e.what() << std::endl;
    }

    return 0;
}

You may not find this in any book or site but I found out that it works pretty well:

ifstream ifs ("filename.txt");
string s;
getline (ifs, s, (char) ifs.eof());
weixin_41568110
七度&光 This will only work, as long as there are no "eof" (e.g. 0x00, 0xff, ...) characters in your file. If there are, you will only read part of the file.
about 3 years ago
csdnceshi67
bug^君 Good point about converting eof() to a char. I suppose for old-school ASCII character sets, passing any negative value (msb set to 1) would work. But passing \0 (or a negative value) won't work for wide or multi-byte input files.
more than 4 years ago
csdnceshi66
必承其重 | 欲带皇冠 Casting eof to (char) is a bit dodgy, suggesting some kind of relevance and universality which is illusory. For some possible values of eof() and signed char, it will give implementation-defined results. Directly using e.g. char(0) / '\0' would be more robust and honestly indicative of what's happening.
more than 4 years ago

I don't think you can do this without an explicit or implicit loop, without reading into a char array (or some other container) first and then constructing the string. If you don't need the other capabilities of a string, it could be done with vector<char> the same way you are currently using a char *.
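
A rough sketch of the vector<char> variant, assuming C++11 (for data()) and binary mode so that the reported length matches the bytes read. The function name read_file_bytes is hypothetical.

```cpp
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: reads `path` into a vector<char> using the same
// seek/tell/read pattern as the char* code in the question.
std::vector<char> read_file_bytes(const std::string& path) {
    std::ifstream in(path, std::ios::in | std::ios::binary);
    in.seekg(0, std::ios::end);
    // tellg() reports -1 if the file could not be opened.
    const std::streamoff size = static_cast<std::streamoff>(in.tellg());
    in.seekg(0, std::ios::beg);

    std::vector<char> bytes(size > 0 ? static_cast<std::size_t>(size) : 0);
    if (!bytes.empty()) {
        in.read(bytes.data(), static_cast<std::streamsize>(bytes.size()));
        bytes.resize(static_cast<std::size_t>(in.gcount()));
    }
    return bytes;  // the vector frees its memory automatically
}
```

If a string is needed afterwards, std::string s(bytes.begin(), bytes.end()); copies the data over.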

csdnceshi74
7*4 -1 Not true... See above
about 9 years ago