dongwuzun4630 2017-08-06 08:33

How to prevent malformed uploads?

I have fairly simple code for uploading files to Google Cloud Storage using Go.

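import (
    "context"

    "cloud.google.com/go/storage"
)

// upload writes b to the given object and returns any error from Write or Close.
func upload(object *storage.ObjectHandle, b []byte) error {
    w := object.NewWriter(context.Background())

    if _, err := w.Write(b); err != nil {
        return err
    }
    // Close flushes the remaining data and reports whether the upload succeeded.
    return w.Close()
}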

I have uploaded a large number of files without any problems, but yesterday I noticed that one of the files was damaged. I'm fairly certain that it was damaged during the upload, because I name the files based on the MD5 hash of their contents. I believe Google Cloud Storage should have returned an error when I called w.Close(), but it didn't. What's the best way to make sure that the upload always fails when the transfer is interrupted or the data is damaged?

1 answer

  • dongyulan6251 2017-08-06 09:02

    You could record the following before the upload and check them again afterwards:

    • the length of the data, len(b)
    • the SHA-256 hash of the data

    Read the object back from Cloud Storage directly after the upload and verify that both values are unchanged. This costs an extra download per upload, so it could affect performance, but it ensures that what you get out of GCS is what you put in.
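The check described above could be sketched roughly as follows. This is a minimal illustration, not the poster's code: the `fingerprint` type and the `fingerprintOf` and `verify` helpers are hypothetical names, and the read-back step is simulated in memory rather than performed against GCS.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// fingerprint holds the two values recorded before upload (hypothetical type).
type fingerprint struct {
	size int
	sum  [sha256.Size]byte
}

// fingerprintOf computes the length and SHA-256 hash of b.
func fingerprintOf(b []byte) fingerprint {
	return fingerprint{size: len(b), sum: sha256.Sum256(b)}
}

// verify returns an error if readBack does not match the fingerprint
// that was taken before the upload.
func verify(want fingerprint, readBack []byte) error {
	got := fingerprintOf(readBack)
	if got.size != want.size {
		return fmt.Errorf("size mismatch: uploaded %d bytes, read back %d", want.size, got.size)
	}
	if got.sum != want.sum {
		return fmt.Errorf("sha256 mismatch: uploaded %x, read back %x", want.sum, got.sum)
	}
	return nil
}

func main() {
	original := []byte("example payload")
	want := fingerprintOf(original)

	// Assumption: in real code readBack would be the bytes read back from
	// the uploaded object; here both cases are simulated in memory.
	fmt.Println(verify(want, original))
	fmt.Println(verify(want, []byte("example payl0ad")))
}
```

In the upload function from the question, `fingerprintOf(b)` would be taken before `w.Write(b)`, and `verify` would run on the bytes read back once `w.Close()` has returned without error.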

    That isn't the only place where corruption could occur, though. If the client stopped transmitting, or transmitted bad data to your server, the check above would not detect it, because the data was already damaged before you hashed it. In that case, checking integrity in some other way before the upload might be your best bet. If your files are of a known type, you could also verify that each one really is a valid file of that type, for example a valid JPEG.
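As one possible form of the file-type check, the sketch below decodes the data with Go's standard `image/jpeg` package and treats any decode error as a sign of damage. The `validJPEG` and `sampleJPEG` helpers are hypothetical names, and a full decode is only one way to define "valid"; it catches truncation and malformed structure, not every kind of corruption.

```go
package main

import (
	"bytes"
	"fmt"
	"image"
	"image/jpeg"
)

// validJPEG reports whether b decodes completely as a JPEG image.
func validJPEG(b []byte) bool {
	_, err := jpeg.Decode(bytes.NewReader(b))
	return err == nil
}

// sampleJPEG encodes a small blank image so the example has known-good input.
func sampleJPEG() []byte {
	var buf bytes.Buffer
	img := image.NewRGBA(image.Rect(0, 0, 8, 8))
	if err := jpeg.Encode(&buf, img, nil); err != nil {
		panic(err)
	}
	return buf.Bytes()
}

func main() {
	good := sampleJPEG()

	fmt.Println("intact file valid:", validJPEG(good))
	// Simulate an interrupted transfer by cutting the file in half.
	fmt.Println("truncated file valid:", validJPEG(good[:len(good)/2]))
	fmt.Println("non-image data valid:", validJPEG([]byte("not an image")))
}
```

Running such a check before calling the upload function would reject data that arrived damaged from the client, which the length and hash comparison cannot do on its own.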

    It would be best to first try to reproduce the problem and find out exactly where the corruption occurs. That would verify your assumption that GCS silently corrupted the data given to it when it should have returned an error.
