- Import archive/zip.
- Open and read the archive file as shown in the example right there in the docs.
  Note that in order to mimic the behaviour of zcat you have to first check the length of the File field of the zip.ReadCloser instance returned by a call to zip.OpenReader, and fail if it is not equal to 1, that is, if there are no files in the archive or there are two or more files in it¹.
- Note that you have to check the error value returned by a call to zip.OpenReader for being equal to zip.ErrFormat, and if it is:
  - Skip closing the returned zip.ReadCloser: it is nil when the error is zip.ErrFormat, so there is nothing to close (and calling Close on it would panic).
  - Try to reinterpret the file as being gzip-formatted (step 4).
- Take the first (and sole) File member and call Open on it.
  You can then read the file's contents from the returned io.ReadCloser.
  After reading, you need to call Close() on that instance and then close the zip file as well. That's all. ∎
- If step (2) failed because the file did not have the zip format, you'd test whether it's gzip-formatted.
  In order to do this, you do basically the same steps using the compress/gzip package.
  Note that contrary to the zip format, gzip does not provide file archival: it's merely a compressor, so there's no meta information on any files in the gzip stream, just the compressed data.
  (This fact is underlined by the difference in the names of the packages.)
  If an attempt to open the same file as a gzip stream returns the gzip.ErrHeader error, you bail out; otherwise you read the data, after which you close the reader. That's all. ∎
To process just the specific lines from the decompressed file, you'd need to:
- Skip the lines before the first one to process.
- Process the lines up to and including the last one to process.
- Stop processing.
To interpret the data read from an io.Reader or io.ReadCloser, it's best to use bufio.Scanner; see the "Example (Lines)" there.
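As a rough sketch of the skip / process / stop sequence with bufio.Scanner: processLines and linesBetween are hypothetical names, and the 1-based, inclusive line numbering is an assumption, not something the question or the packages prescribe.

```go
package main

import (
	"bufio"
	"io"
	"strings"
)

// processLines (hypothetical helper) calls fn for every line numbered
// first..last (1-based, inclusive) read from r, then stops reading.
func processLines(r io.Reader, first, last int, fn func(line string)) error {
	sc := bufio.NewScanner(r)
	// The n <= last check runs before sc.Scan, so nothing past the last
	// wanted line is scanned.
	for n := 1; n <= last && sc.Scan(); n++ {
		if n >= first {
			fn(sc.Text())
		}
	}
	// Note: with the default buffer, a line longer than
	// bufio.MaxScanTokenSize makes Err return bufio.ErrTooLong.
	return sc.Err()
}

// linesBetween is a small convenience wrapper for trying this out on a string.
func linesBetween(s string, first, last int) ([]string, error) {
	var out []string
	err := processLines(strings.NewReader(s), first, last, func(line string) {
		out = append(out, line)
	})
	return out, err
}
```

In the real program the io.Reader would be the io.ReadCloser obtained from the zip member or the gzip reader, rather than a string.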
P.S.
Please read this essay thoroughly
to try to make your next question better than this one.
¹ You might as well read all the files and interpret their contents
as a contiguous stream — that would deviate from the behaviour of zcat
but that might be better. It really depends on your data.