
I'm using the Goamz package and could use some help getting bucket.Multi to stream an HTTP GET response to S3.

I'll be downloading a 2+ GB file via chunked HTTP and I'd like to stream it directly into an S3 bucket.

It appears that I need to wrap resp.Body in something so I can pass an implementation of s3.ReaderAtSeeker to multi.PutAll.

// set up s3
auth, _ := aws.EnvAuth()
s3Con := s3.New(auth, aws.USEast)
bucket := s3Con.Bucket("bucket-name")

// make http request to URL
resp, err := http.Get(export_url)
if err != nil {
    fmt.Printf("Get error %v\n", err)
    return
}

defer resp.Body.Close()

// set up multipart upload
multi, err := bucket.InitMulti(s3Path, "text/plain", s3.Private, s3.Options{})
if err != nil {
    fmt.Printf("InitMulti error %v\n", err)
    return
}

// Need struct that implements: s3.ReaderAtSeeker
// type ReaderAtSeeker interface {
//  io.ReaderAt
//  io.ReadSeeker
// }

rs := // Question: what can I wrap `resp.Body` in?

parts, err := multi.PutAll(rs, 5120)
if err != nil {
    fmt.Printf("PutAll error %v\n", err)
    return
}

err = multi.Complete(parts)
if err != nil {
    fmt.Printf("Complete error %v\n", err)
    return
}

Currently I get the following (expected) error when trying to run my program:

./main.go:50: cannot use resp.Body (type io.ReadCloser) as type s3.ReaderAtSeeker in argument to multi.PutAll:
    io.ReadCloser does not implement s3.ReaderAtSeeker (missing ReadAt method)

2 Answers


You haven't indicated exactly which Goamz fork you're using to access the S3 API, but I'm assuming it's this one: https://github.com/mitchellh/goamz/.

Since your file is of significant size, a possible solution is to use multi.PutPart, which gives you more control than multi.PutAll. Using a bytes.Reader from the standard library, your approach would be:

  1. Get the Content-Length from the response header.
  2. Compute the number of parts needed from the Content-Length and your part size.
  3. Loop over the number of parts, read a chunk of resp.Body into a bytes.Reader, and call multi.PutPart.
  4. Get the uploaded parts from multi.ListParts.
  5. Call multi.Complete with those parts.

I don't have access to S3 so I can't test my hypothesis, but the above could be worth exploring if you haven't already; a rough sketch follows.
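
Here is a rough, untested sketch of those steps, reusing resp and multi from your snippet and assuming the mitchellh/goamz API where PutPart takes a part number and an io.ReadSeeker (the partSize value and the error handling are just illustrative):

// Also needs "bytes" and "io" imported.
const partSize = 5 * 1024 * 1024 // S3 parts (except the last) must be at least 5 MB

// 1. total size from the Content-Length header
// (note: with chunked transfer encoding this may be -1, in which
// case you'd loop until io.EOF instead of computing numParts)
totalSize := resp.ContentLength

// 2. number of parts
numParts := int((totalSize + partSize - 1) / partSize)

// 3. read each chunk into memory and upload it as one part
buf := make([]byte, partSize)
for i := 1; i <= numParts; i++ {
    n, err := io.ReadFull(resp.Body, buf) // the last part may be short
    if err != nil && err != io.ErrUnexpectedEOF {
        fmt.Printf("read error %v\n", err)
        return
    }

    // bytes.Reader implements io.ReaderAt and io.ReadSeeker
    if _, err := multi.PutPart(i, bytes.NewReader(buf[:n])); err != nil {
        fmt.Printf("PutPart error %v\n", err)
        return
    }
}

// 4. collect the uploaded parts
parts, err := multi.ListParts()
if err != nil {
    fmt.Printf("ListParts error %v\n", err)
    return
}

// 5. finish the multipart upload
err = multi.Complete(parts)
if err != nil {
    fmt.Printf("Complete error %v\n", err)
    return
}

Only one part's worth of data is buffered at a time, so even a 2+ GB download should stay at roughly partSize of memory.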




A simpler approach is to use http://github.com/minio/minio-go.

It implements PutObject(), a fully managed, self-contained operation for uploading large files. It automatically performs multipart uploads, in parallel, for anything over 5 MB of data. If no pre-defined ContentLength is specified, it keeps uploading until it reaches EOF.

The following example shows how to do this when you don't have a pre-defined input length, only a streaming io.Reader. In this example I've used os.Stdin as a stand-in for your chunked input.

package main

import (
    "log"
    "os"

    "github.com/minio/minio-go"
)

func main() {
    config := minio.Config{
        AccessKeyID:     "YOUR-ACCESS-KEY-HERE",
        SecretAccessKey: "YOUR-PASSWORD-HERE",
        Endpoint:        "https://s3.amazonaws.com",
    }
    s3Client, err := minio.New(config)
    if err != nil {
        log.Fatalln(err)
    }

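    // A size of 0 with a plain io.Reader means the length is unknown;
    // the client keeps uploading (multipart for larger payloads) until EOF.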
    err = s3Client.PutObject("mybucket", "myobject", "application/octet-stream", 0, os.Stdin)
    if err != nil {
        log.Fatalln(err)
    }
}
$ echo "Hello my new-object" | go run stream-object.go

