I just started using Go after years of using Perl, and from initial tests it seems that reading a text file from a hard drive into a hash is not as fast as in Perl.
In Perl I use the "File::Slurp" module, which reads a file into memory (into a string variable, array, or hash) really fast, within the limits of the hard drive's read throughput.
I am not sure what the best way is in Go to read e.g. a 500MB CSV file with 10 columns into memory (into a hash, i.e. a Go map), where the key of the hash is the 1st column and the value is the remaining 9 columns.
What is the fastest way to achieve this? The goal is to read the data and store it in a Go variable as fast as the hard drive can deliver it.
This is one line from the input file; there are around 20 million similar lines:
1341,2014-11-01 00:01:23.588,12000,AV7WN259SEH1,1133922,SingleOven/HCP/-PRODUCTION/-23C_30S,0xd8d2a106d44bea07,8665456.006,5456-02,3010-30 N- PHOTO,AV7WN259SEH1
The platform is Windows 7 on an Intel i7 processor with 16GB of RAM. I can install Go on Linux as well if there are benefits in doing so.
Edit:
So one use case is: load the whole file into memory, into a single variable, as fast as possible. Later I can scan that variable, split it (all in memory), etc.
Another approach is to store each line as a key-value pair during load time (e.g. after X bytes have been read or after a newline (\n) character arrives).
To me, these 2 approaches can yield different performance results. But since I am very new to Golang, it will probably take me days of trying different techniques to arrive at the best-performing algorithm.
I would like to learn all possible ways to do the above in Golang, as well as the recommended ways. At this point I am not concerned about memory usage, since this process will be repeated 10,000 times as soon as the first file's processing is finished (each file will be erased from memory as soon as its processing is done). Files range from 50MB to 500MB. Since there are several thousand files, any performance gain (even a 1-second gain per file) is a significant overall gain.
I do not want to complicate the question with what will be done with the data later; I just want to learn the fastest way to read a file from the drive and store it in a hash. I will post more detailed benchmarks of my findings as I learn more about the different ways to do this in Golang and as I hear more recommendations. I am hoping someone has already researched this topic.