6

We're looking to begin using S3 for some of our storage needs and I'm looking for a way to perform a batch upload of 'N' files. I've already written code using the Java API to perform single file uploads, but is there a way to provide a list of files to pass to an S3 bucket?

I did look at the question is-it-possible-to-perform-a-batch-upload-to-amazon-s3, but it is two years old and I'm curious whether the situation has changed. I can't find a way to do this in code.

What we'd like to do is set up an internal job (probably a scheduled task in Spring) to move groups of files every night. I'd like a way to do this other than looping over the files and issuing a put request for each one, or zipping batches up to place on S3.

4
  • Can you script it with awscli or s3cmd, rather than write it in Java? Using Java seems heavy-handed here. Commented Jul 28, 2015 at 21:31
  • Things haven't changed in this regard. People have developed libraries that use the S3 APIs and parallelize the uploads. Commented Jul 29, 2015 at 4:06
  • @TJ- Can you provide an example? Commented Jul 29, 2015 at 12:00
  • github.com/tj---/s3-parallel Commented Jul 29, 2015 at 12:10

2 Answers

5

The easiest way to go if you're using the AWS SDK for Java is the TransferManager. Its uploadFileList method takes a list of files and uploads them to S3 in parallel, or uploadDirectory will upload all the files in a local directory.


2 Comments

Does it spawn N upload processes that run in parallel, or a single upload process for all of the objects (and therefore need only one connection)? I hope it's the latter.
It performs N independent uploads; how many run at a time depends on the ExecutorService you pass to the constructor. S3 does not expose a way to upload multiple objects in a single HTTP request, short of zipping them up yourself. Even then, you'd probably want a multipart upload that splits the zip across several HTTP requests, so that a transient failure halfway through doesn't force you to restart the whole upload from scratch.
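To make the "N independent uploads, bounded by the executor" point concrete, here is a minimal sketch using only the JDK. It is not the SDK's implementation: `BoundedBatch`, `runAll`, and the `uploadOne` callback are hypothetical names, and `uploadOne` stands in for whatever performs a single-object upload. The pool size caps how many tasks run at once, and one failed item does not stop the others.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Consumer;

class BoundedBatch {
    /**
     * Submits one independent task per item and runs at most maxParallel of
     * them at a time. Returns how many tasks finished without throwing.
     * uploadOne is a hypothetical stand-in for a single-object upload.
     */
    static <T> int runAll(List<T> items, int maxParallel, Consumer<T> uploadOne) {
        ExecutorService pool = Executors.newFixedThreadPool(maxParallel);
        try {
            List<Future<?>> pending = new ArrayList<>();
            for (T item : items) {
                pending.add(pool.submit(() -> uploadOne.accept(item)));
            }
            int succeeded = 0;
            for (Future<?> task : pending) {
                try {
                    task.get(); // wait for this task to finish
                    succeeded++;
                } catch (ExecutionException e) {
                    // This item failed; the remaining items still run.
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted while waiting", e);
                }
            }
            return succeeded;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("a.txt", "b.txt", "c.txt");
        int ok = runAll(names, 2, name -> System.out.println("uploading " + name));
        System.out.println(ok + " of " + names.size() + " tasks succeeded");
    }
}
```

With a pool of 2, at most two of the stand-in uploads are in flight at any moment, which mirrors how the executor's size bounds the parallelism described above.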
0
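import java.io.File;
import java.util.List;
import com.amazonaws.auth.AWSCredentialsProvider;
import com.amazonaws.auth.AWSStaticCredentialsProvider;
import com.amazonaws.auth.BasicAWSCredentials;
import com.amazonaws.regions.Regions;
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;
import com.amazonaws.services.s3.transfer.MultipleFileUpload;
import com.amazonaws.services.s3.transfer.TransferManager;
import com.amazonaws.services.s3.transfer.TransferManagerBuilder;

public void uploadDocuments(List<File> filesToUpload) throws InterruptedException {
    AmazonS3 s3 = AmazonS3ClientBuilder.standard()
            .withCredentials(getCredentials())
            .withRegion(Regions.AP_SOUTH_1)
            .build();
    TransferManager transfer = TransferManagerBuilder.standard().withS3Client(s3).build();
    try {
        // Keys are built from each file's path relative to the common parent
        // directory (here the working directory), with an empty key prefix.
        MultipleFileUpload upload =
                transfer.uploadFileList(Constants.BUCKET_NAME, "", new File("."), filesToUpload);
        // Blocks until every upload finishes; throws if any upload fails.
        upload.waitForCompletion();
    } finally {
        // Release the transfer threads (this also shuts down the S3 client).
        transfer.shutdownNow();
    }
}

private AWSCredentialsProvider getCredentials() {
    // Static keys read from the poster's Constants class.
    BasicAWSCredentials awsCredentials =
            new BasicAWSCredentials(Constants.ACCESS_KEY, Constants.SECRET_KEY);
    return new AWSStaticCredentialsProvider(awsCredentials);
}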

1 Comment

I know this is an old post, but do you know whether this can be done with version 2 of the AWS SDK for Java? If so, could you share some details?
