2017-05-21 16:44
Views: 206


We're currently doing a transition from Google Storage to Amazon S3 storage.

On Google Storage I've used this function https://godoc.org/cloud.google.com/go/storage#Writer.Write to write to files. It streams bytes of data into a file through the io.Writer interface and saves the file when Close() is called on the writer. That allows us to stream data into a file all day long and finalize it at the end of the day without ever creating a local copy of the file.

I've examined aws-sdk-go s3 documentation on godoc and can't seem to find a similar function that would allow us to just stream data to file without creating a file locally first. All I've found are functions that stream data from already existing local files like PutObject().

So my question is: is there a way to stream data to Amazon S3 files using aws-sdk-go that is similar to the Google Storage Write() method?

1 answer

  • doushan1863 2017-05-21 20:18

    The S3 HTTP API doesn't have any append-like write method; instead it uses multipart uploads. You upload chunks, each with an index number, and S3 stores them internally as separate pieces and concatenates them into a single object once the upload is completed. Every chunk except the last must be at least 5 MB, which is also the default chunk size in s3manager (it can be raised), and an upload can have at most 10,000 chunks (this can't be changed).

    Unfortunately it doesn't look like the aws-sdk-go API provides any convenient interface for working with chunks to achieve the streaming behaviour.

    You would have to work with the chunks manually (called parts in aws-sdk-go) directly using CreateMultipartUpload to initialize the transfers, create UploadPartInput instances for the data you want to send and send it with UploadPart. When the final chunk has been sent you need to close the transaction with CompleteMultipartUpload.
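
    The flow above can be wrapped in an io.Writer so that it feels like the Google Storage writer. What follows is only a rough sketch under stated assumptions, not aws-sdk-go code: partStore, memoryStore and multipartWriter are hypothetical names made up for illustration. In a real program Begin would wrap CreateMultipartUpload, Put would wrap UploadPart and Finish would wrap CompleteMultipartUpload. The in-memory fake and the 4-byte part size exist only so the sketch runs without AWS credentials.

    ```go
    package main

    import (
    	"bytes"
    	"fmt"
    	"io"
    )

    // partStore is a hypothetical stand-in for the three S3 calls named above:
    // Begin ~ CreateMultipartUpload, Put ~ UploadPart, Finish ~ CompleteMultipartUpload.
    type partStore interface {
    	Begin() error
    	Put(partNumber int, body io.ReadSeeker) error
    	Finish() error
    }

    // multipartWriter buffers written bytes and sends them as numbered parts.
    type multipartWriter struct {
    	store    partStore
    	partSize int
    	buf      bytes.Buffer
    	next     int // number of the next part; S3 part numbers start at 1
    }

    func newMultipartWriter(store partStore, partSize int) (*multipartWriter, error) {
    	if err := store.Begin(); err != nil {
    		return nil, err
    	}
    	return &multipartWriter{store: store, partSize: partSize, next: 1}, nil
    }

    // Write implements io.Writer and sends a part whenever partSize bytes are buffered.
    func (w *multipartWriter) Write(p []byte) (int, error) {
    	w.buf.Write(p)
    	for w.buf.Len() >= w.partSize {
    		if err := w.flush(w.partSize); err != nil {
    			return 0, err
    		}
    	}
    	return len(p), nil
    }

    // flush copies n buffered bytes into a fresh slice and hands them to the store.
    func (w *multipartWriter) flush(n int) error {
    	part := make([]byte, n)
    	copy(part, w.buf.Next(n))
    	if err := w.store.Put(w.next, bytes.NewReader(part)); err != nil {
    		return err
    	}
    	w.next++
    	return nil
    }

    // Close sends the remaining bytes as the final (possibly short) part
    // and then completes the upload.
    func (w *multipartWriter) Close() error {
    	if w.buf.Len() > 0 {
    		if err := w.flush(w.buf.Len()); err != nil {
    			return err
    		}
    	}
    	return w.store.Finish()
    }

    // memoryStore is an in-memory fake used only to make the sketch runnable.
    type memoryStore struct {
    	parts [][]byte
    	done  bool
    }

    func (m *memoryStore) Begin() error { return nil }

    func (m *memoryStore) Put(partNumber int, body io.ReadSeeker) error {
    	data, err := io.ReadAll(body)
    	if err != nil {
    		return err
    	}
    	m.parts = append(m.parts, data)
    	return nil
    }

    func (m *memoryStore) Finish() error { m.done = true; return nil }

    func main() {
    	store := &memoryStore{}
    	// Tiny part size for the demo only; real parts must be at least 5 MB.
    	w, err := newMultipartWriter(store, 4)
    	if err != nil {
    		panic(err)
    	}
    	fmt.Fprint(w, "hello ")
    	fmt.Fprint(w, "world")
    	if err := w.Close(); err != nil {
    		panic(err)
    	}
    	for i, p := range store.parts {
    		fmt.Printf("part %d: %q\n", i+1, p)
    	}
    }
    ```

    A real implementation would also need to remember the identifiers returned for each uploaded part so they can be passed to CompleteMultipartUpload, and should probably abort the upload when a part fails; both are left out here to keep the sketch short.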

    Regarding the question of how to stream directly from e.g. []byte data instead of a file: the Body field of the UploadPartInput struct is where you put the content you want to send to S3. Note that Body is of type io.ReadSeeker. This means you can create an io.ReadSeeker from e.g. your []byte content with something like bytes.NewReader([]byte) and set UploadPartInput.Body to that.
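
    As a small illustration using only the standard library, the sketch below shows that a []byte wrapped with bytes.NewReader can be used wherever an io.ReadSeeker is expected, so it can be measured and rewound without touching the filesystem. The helper names partBody and bodyLen are hypothetical and not part of aws-sdk-go.

    ```go
    package main

    import (
    	"bytes"
    	"fmt"
    	"io"
    )

    // partBody turns an in-memory chunk into an io.ReadSeeker, the type that
    // UploadPartInput.Body expects. No local file is involved.
    func partBody(chunk []byte) io.ReadSeeker {
    	return bytes.NewReader(chunk)
    }

    // bodyLen measures a body by seeking to its end, then rewinds it to the start.
    func bodyLen(body io.ReadSeeker) (int64, error) {
    	n, err := body.Seek(0, io.SeekEnd)
    	if err != nil {
    		return 0, err
    	}
    	_, err = body.Seek(0, io.SeekStart)
    	return n, err
    }

    func main() {
    	body := partBody([]byte("log line 1\nlog line 2\n"))

    	n, err := bodyLen(body)
    	if err != nil {
    		panic(err)
    	}
    	fmt.Println("bytes in body:", n)

    	// The body is still readable from the beginning after being measured.
    	data, err := io.ReadAll(body)
    	if err != nil {
    		panic(err)
    	}
    	fmt.Printf("%s", data)
    }
    ```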

    The s3manager upload utility could be a good starting point to see how the multipart functions are used; it uses the multipart API to upload a single large file as smaller chunks concurrently.

    Keep in mind that you should set a lifecycle policy that removes unfinished multipart uploads. If you don't send the final CompleteMultipartUpload all the chunks that have been uploaded will stay in S3 and incur costs. The policy can be set through AWS console/CLI or programmatically with aws-sdk-go.
