提高AWS日志下载速度

I'm going to start with showing the code and then what I'm trying to do, code:

package main

import (
    "fmt"
    "os"
    "path/filepath"

    "github.com/aws/aws-sdk-go/aws"
    "github.com/aws/aws-sdk-go/aws/session"
    "github.com/aws/aws-sdk-go/service/s3"
    "github.com/aws/aws-sdk-go/service/s3/s3manager"
)

var (
    // empty strings for security reasons
    Bucket         = ""                                               // Download from this bucket
    Prefix         = "" // Using this key prefix
    LocalDirectory = "s3logs"                                                    // Into this directory
)

func main() {
    sess := session.New()
    client := s3.New(sess, &aws.Config{Region: aws.String("us-west-1")})
    params := &s3.ListObjectsInput{Bucket: &Bucket, Prefix: &Prefix}

    manager := s3manager.NewDownloaderWithClient(client, func(d *s3manager.Downloader) {
        d.PartSize = 64 * 1024 * 1024 // 64MB per part
        d.Concurrency = 8
    }) // works

    //manager := s3manager.NewDownloaderWithClient(client) //works

    d := downloader{bucket: Bucket, dir: LocalDirectory, Downloader: manager}

    client.ListObjectsPages(params, d.eachPage)

}

type downloader struct {
    *s3manager.Downloader
    bucket, dir string
}

func (d *downloader) eachPage(page *s3.ListObjectsOutput, more bool) bool {
    for _, obj := range page.Contents {
        d.downloadToFile(*obj.Key)
    }

    return true
}

func (d *downloader) downloadToFile(key string) {
    // Create the directories in the path
    file := filepath.Join(d.dir, key)

    if err := os.MkdirAll(filepath.Dir(file), 0775); err != nil {
        panic(err)
    }
    fmt.Printf("Downloading " + key)
    // Setup the local file
    fd, err := os.Create(file)
    if err != nil {
        panic(err)
    }

    defer fd.Close()

    // Download the file using the AWS SDK
    fmt.Printf("Downloading s3://%s/%s to %s...
", d.bucket, key, file)
    params := &s3.GetObjectInput{Bucket: &d.bucket, Key: &key}
    _, e := d.Download(fd, params)
    if e != nil {
        panic(e)
    }

}

I'm trying to download the log files from a particular bucket and eventually many buckets. I need the download to be as fast as possible because. There is lots of data. My question is what is the most effective way to download huge amounts of data quickly? The whole process is nil if those logs can't be downloaded at a reasonable speed. Is there a faster way, it's already concurrent according to amazons doc? Any ideas? Also, i've noticed a curious thing. It doesn't matter if I set the Concurrency to 1, 4, or 20. Everything is still downloading at ~.70 - ~/.80 gb / min

写回答
好问题 0 提建议
追加酬金
关注问题
分享
邀请回答
编辑收藏删除结题
收藏举报

报告相同问题？

关注问题

使用Golang的AWS S3并行下载
2019-01-29 11:34

回答 1 已采纳 Try altering your NewDownLoader() to this. See https://docs.aws.amazon.com/sdk-for-go/api/service
aws亚马逊云 health check一直失败 aws
2023-01-06 11:13

回答 4 已采纳首先，确保 VPS 上的防火墙规就正确地配置，并允许相应的端口。可以使用命令 ufw status 来查看防火墙状态和已配置的规就。其次需要确保在 VPS 上运行的服务器能够正确地监听指定的端口。可
从AWS S3紧急状态下载日志文件：运行时错误：
2015-12-17 22:38

回答 1 已采纳 You're passing nil to s3manager.NewDownloader where it requires a Session sess := session.New() m
AWS概述
2023-06-27 17:56

还是转转的博客从2006年开始，AWS就开始以web服务的形式对外提供IT基础设施的商业化服务–即现在的云计算。云计算的一个重要的优势就是以与业务规模匹配的较少变化的成本取代了前期为了搭建基础设施而投入的巨量资金。有了云平台，...
来自外部软件包的AWS Lambda golang日志记录
2018-07-26 06:37

回答 1 已采纳 In the serverless framework anything printed to std.out/std err will be written to the cloudwatch
使用AWS开发工具包Go的完整URI从S3下载文件
2018-03-27 22:37

回答 4 已采纳 There is no way to do what you want. The only ways to get a private object are: Use the bucket a
使用AWS SDK for PHP下载多个数据 aws php
2015-01-22 10:28

回答 1 已采纳 I found myself the solution. Turns out that I was sending an empty string to the second petition,
AWS 常见使用服务
2023-03-08 15:37

y75674952的博客如 mysql 等 ElastiCache：托管内存缓存服务，用于提高应用程序的响应速度和性能。如 redis,集群等 Route 53：可扩展的域名系统（DNS），用于注册和管理域名，并将域名与 AWS 资源和其他网络连接。 VPC：虚拟私有云...
无法访问aws的rabbitmq 15672端口 aws rabbitmq
2022-07-29 15:33

回答 1 已采纳检查一下你访问的ip写对了没，看下你的aws实例的公网ip
进行AWS测试以进行无效登录
2019-02-26 19:12

回答 2 已采纳 Here's what I ended up doing. Test that credentials are loaded, test a known service such as an S
为什么我的AWS Glue 爬网程序正常结束执行但生成的表格个数为0
2021-08-05 12:14

回答 1 已采纳我自己找到啦！S3的桶名一定要以aws-glue开头，AWS令人无语参考网站 https://stackoverflow.com/questions/68309438/crawl-is-not-
.NET LINQ分析AWS ELB日志
2019-10-18 20:12

漫步星辰575654643的博客 “小明，分析一下我们超牛逼网站上个月的所有AWS ELB流量日志，这些日志保存在AWS S3上，你分析下，看哪个API的响应时间中位数最长。” “对了，别用Excel，哥给你写好了一段Python脚本，可以自动解析统计一个AWS ...
验证AWS访问和密钥
2019-04-11 02:48

回答 1 已采纳 Based on the above comments that the system takes users credentials and uses them to create EC2 is
AWS 服务简介
2021-02-26 10:18

ddwbzmb的博客 AWS 服务简介 Amazon S3 Glacier 和 S3 Glacier Deep Archive 用于数据存档的长期、安全且持久的 Amazon S3 对象存储类，每月每 TB 低至 1 USD Amazon S3 Glacier 入门 Amazon S3 ...
谈谈AWS 的 SNS SQS Lambda分别是什么
2023-11-17 18:24

weixin_45047825的博客日志和审计事件：场景: 当您需要记录和审计系统中的关键事件时，SNS 可以用于将日志信息发送到相关团队或存储中。这有助于及时发现和解决潜在的问题。多通道通知：场景: SNS 不仅支持通过电子邮件、短信、HTTP/S...
没有解决我的问题, 去提问

悬赏问题

¥15 装 pytorch 的时候出了好多问题，遇到这种情况怎么处理？
¥20 IOS游览器某宝手机网页版自动立即购买JavaScript脚本
¥15 手机接入宽带网线，如何释放宽带全部速度
¥30 关于#r语言#的问题：如何对R语言中mfgarch包中构建的garch-midas模型进行样本内长期波动率预测和样本外长期波动率预测
¥15 ETLCloud 处理json多层级问题
¥15 matlab中使用gurobi时报错
¥15 这个主板怎么能扩出一两个sata口
¥15 不是，这到底错哪儿了😭
¥15 2020长安杯与连接网探
¥15 关于#matlab#的问题：在模糊控制器中选出线路信息，在simulink中根据线路信息生成速度时间目标曲线（初速度为20m/s，15秒后减为0的速度时间图像）我想问线路信息是什么

码龄粉丝数原力等级 --

提高AWS日志下载速度

0条回答默认最新

悬赏问题

提高AWS日志下载速度

0条回答 默认 最新

悬赏问题

0条回答默认最新