doubi8383 2019-05-14 15:03

Uploading multiple files to Amazon S3 in parallel using Goroutines and Channels

I'm trying to upload a directory to an Amazon S3 bucket. However, the only way to upload a directory is to iterate through all the files inside it and upload them one by one.

I'm using Go to iterate over the files in the directory. For each file, I want to spin off a goroutine that uploads it while the main goroutine moves on to the next element in the directory and spins off another goroutine to upload that one.

Any idea how I can upload all the files in the directory in parallel using Goroutines and Channels?

Here is a revised code snippet that uses a goroutine and a channel to upload files in parallel, but I'm not sure this is the right implementation.

func uploadDirToS3(dir string, svc *s3.S3) {
    fileList := []string{}
    filepath.Walk(dir, func(path string, f os.FileInfo, err error) error {
        fmt.Println("PATH ==> " + path)
        fileList = append(fileList, path)
        return nil
    })
    for _, pathOfFile := range fileList[1:] {
        channel := make(chan bool)
        go uploadFiletoS3(pathOfFile, svc, channel)
        <-channel
    }
}

func uploadFiletoS3(path string, svc *s3.S3, channel chan bool) {
    file, err := os.Open(path)
    if err != nil {
        fmt.Println(err)
    }
    defer file.Close()
    fileInfo, _ := file.Stat()
    size := fileInfo.Size()

    buffer := make([]byte, size)
    file.Read(buffer)
    fileBytes := bytes.NewReader(buffer)
    fileType := http.DetectContentType(buffer)

    s3Path := file.Name()

    params := &s3.PutObjectInput{
        Bucket:        aws.String("name-of-bucket"),
        Key:           aws.String(s3Path),
        Body:          fileBytes,
        ContentLength: aws.Int64(size),
        ContentType:   aws.String(fileType),
    }

    resp, err := svc.PutObject(params)
    if err != nil {
        fmt.Println(err)
    }
    fmt.Printf("response %s", awsutil.StringValue(resp))
    close(channel)
}

Any ideas on how I could implement this better? I've looked into WaitGroups but for some reason, I found Channels much easier to understand and implement in this situation.


  • dtrovwl75780 2019-05-15 02:08

    So, you are looking for concurrency, which in Go starts with the go statement. To synchronize the goroutines started in the loop, you can use channels or sync.WaitGroup. The second option is a little easier. You also have to refactor your function and move the body of the for loop into a separate function.

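    func uploadDirToS3(dir string, svc *s3.S3) {
        fileList := []string{}
        filepath.Walk(dir, func(path string, f os.FileInfo, err error) error {
            if err != nil {
                return err
            }
            // Collect regular files only; directories cannot be uploaded as objects.
            if !f.IsDir() {
                fileList = append(fileList, path)
            }
            return nil
        })
        var wg sync.WaitGroup
        // Add must match the number of goroutines started below,
        // otherwise wg.Wait() never returns.
        wg.Add(len(fileList))
        for _, pathOfFile := range fileList {
            // Each upload runs in its own goroutine.
            go putInS3(pathOfFile, svc, &wg)
        }
        wg.Wait()
    }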
    
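    // putInS3 uploads one file and signals the WaitGroup when it returns.
    // It needs the io package in addition to the imports used above.
    func putInS3(pathOfFile string, svc *s3.S3, wg *sync.WaitGroup) {
        defer wg.Done()
        file, err := os.Open(pathOfFile)
        if err != nil {
            fmt.Println(err)
            return
        }
        defer file.Close()
        fileInfo, err := file.Stat()
        if err != nil {
            fmt.Println(err)
            return
        }
        size := fileInfo.Size()
        buffer := make([]byte, size)
        // A single Read may return fewer bytes than requested; ReadFull fills the buffer.
        if _, err := io.ReadFull(file, buffer); err != nil {
            fmt.Println(err)
            return
        }
        fileBytes := bytes.NewReader(buffer)
        fileType := http.DetectContentType(buffer)
        path := file.Name()
        params := &s3.PutObjectInput{
            Bucket:        aws.String("bucket-name"),
            Key:           aws.String(path),
            Body:          fileBytes,
            ContentLength: aws.Int64(size),
            ContentType:   aws.String(fileType),
        }

        resp, err := svc.PutObject(params)
        if err != nil {
            fmt.Println(err)
            return
        }
        fmt.Printf("response %s\n", awsutil.StringValue(resp))
    }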
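
The answer above names channels as the other way to synchronize the goroutines. Below is a minimal sketch of that variant, not a definitive implementation. It assumes a hypothetical upload callback (uploadFunc) standing in for the real svc.PutObject call, and a hypothetical limit parameter that caps how many uploads run at once. Neither name comes from the AWS SDK or from the code above. One buffered channel acts as a semaphore, and a second one carries each upload's result back to the caller.

```go
package main

import (
	"errors"
	"fmt"
)

// uploadFunc is a hypothetical stand-in for the real S3 upload
// (for example, a function that wraps svc.PutObject for one path).
type uploadFunc func(path string) error

// errSimulated is used only by the fake upload in main.
var errSimulated = errors.New("simulated failure")

// uploadAll starts one goroutine per path, lets at most limit of them
// run at the same time, and returns the errors that upload reported.
// The order of the returned errors is not defined.
func uploadAll(paths []string, limit int, upload uploadFunc) []error {
	if limit < 1 {
		limit = 1 // a zero-capacity semaphore would block forever
	}
	sem := make(chan struct{}, limit)       // bounds concurrent uploads
	results := make(chan error, len(paths)) // one slot per path, so sends never block

	for _, p := range paths {
		sem <- struct{}{} // wait for a free slot
		go func(p string) {
			defer func() { <-sem }() // release the slot
			if err := upload(p); err != nil {
				results <- fmt.Errorf("%s: %w", p, err)
				return
			}
			results <- nil
		}(p)
	}

	// Receive exactly one result per path; this is the synchronization point.
	var failed []error
	for range paths {
		if err := <-results; err != nil {
			failed = append(failed, err)
		}
	}
	return failed
}

func main() {
	paths := []string{"a.txt", "b.txt", "c.txt"}
	// Fake upload for illustration only: it fails for b.txt.
	fake := func(path string) error {
		if path == "b.txt" {
			return errSimulated
		}
		return nil
	}
	for _, err := range uploadAll(paths, 2, fake) {
		fmt.Println("upload failed:", err)
	}
}
```

Receiving one result per path replaces wg.Wait(), and the semaphore keeps a large directory from opening every file and connection at once. In real use, the fake callback would be replaced by a function that opens the file and calls svc.PutObject.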
    