doutong6814 2017-01-30 23:35

How can I return a hash and the bytes in one step in Go?

I'm trying to understand how I can read the content of a file, calculate its hash, and return its bytes in one go. So far, I'm doing this in two steps, e.g.

// calculate file checksum
hasher := sha256.New()
f, err := os.Open(fname)
if err != nil {
    msg := fmt.Sprintf("Unable to open file %s, %v", fname, err)
    panic(msg)
}
defer f.Close()
// the byte count returned by io.Copy is not needed here
if _, err := io.Copy(hasher, f); err != nil {
    panic(err)
}
cksum := hex.EncodeToString(hasher.Sum(nil))

// read again (!!!) to get data as a byte slice
data, err := ioutil.ReadFile(fname)

Obviously this is not the most efficient way to do it, since the file is read twice: once by io.Copy to feed the hasher, and again by ioutil.ReadFile to return the bytes. I'm struggling to see how to combine these steps so that the data is read only once, a hash (of any kind) is calculated, and both the hash and the byte slice are returned to another layer.


  • dongxu7121 2017-01-31 16:57

    If you want to read a file, without creating a copy of the entire file in memory, and at the same time calculate its hash, you can do so with a TeeReader:

    hasher := sha256.New()
    f, err := os.Open(fname)
    if err != nil {
        panic(err)
    }
    defer f.Close()
    data := io.TeeReader(f, hasher)
    // Now read from data as usual, which is still a stream.
    

    What happens here is that any bytes read from data (which is a Reader, just like the file object f) are written to hasher as well.

    Note, however, that hasher will produce the correct hash only once you have read the entire file through data, and not before. So if you need the hash before you decide whether or not to read the file, you are left with two options: do it in two passes (as you are doing now), or always read the file and discard the result if the hash check fails.

    If you do read the file in two passes, you could of course buffer the entire file data in a byte buffer in memory. However, the operating system will typically cache the file you just read in RAM anyway (if possible), so the performance benefit of doing a buffered two-pass solution yourself rather than just doing two passes over the file is probably negligible.
