Tracking Upload Progress of File to S3 Using Ruby aws-sdk

Question

Firstly, I am aware that there are quite a few questions similar to this one on SO. I have read most, if not all, of them over the past week, but I still can't make this work for me.

I am developing a Ruby on Rails app that allows users to upload mp3 files to Amazon S3. The upload itself works perfectly, but a progress bar would greatly improve user experience on the website.

I am using the aws-sdk gem which is the official one from Amazon. I have looked everywhere in its documentation for callbacks during the upload process, but I couldn't find anything.

The files are uploaded one at a time, directly to S3, so the app doesn't need to load a file into memory. Multiple-file upload is not necessary either.

I figured that I may need to use jQuery to make this work, and I am fine with that. I found this plugin, which looked very promising: https://github.com/blueimp/jQuery-File-Upload. I even tried following the example here: https://github.com/ncri/s3_uploader_example

But I just could not make it work for me.

The documentation for aws-sdk also BRIEFLY describes streaming uploads with a block:

  obj.write do |buffer, bytes|
     # writing fewer than the requested number of bytes to the buffer
     # will cause write to stop yielding to the block
  end

But this is barely helpful. How does one "write to the buffer"? I tried a few intuitive options that would always result in timeouts. And how would I even update the browser based on the buffering?

Is there a better or simpler solution to this?

Thank you in advance. I would appreciate any help on this subject.

Trevor Rowe · Accepted Answer · 2012-08-27 18:45:42Z

The "buffer" object yielded when passing a block to #write is an instance of StringIO. You can write to the buffer using #write or #<<. Here is an example that uses the block form to upload a file.

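require 'aws-sdk' # aws-sdk v1 (AWS:: namespace)

s3  = AWS::S3.new
obj = s3.buckets['my-bucket'].objects['object-key']

# The block form of File.open closes the file even if the upload raises
File.open('/path/to/file', 'rb') do |file|
  obj.write(:content_length => file.size) do |buffer, bytes|
    # copy up to `bytes` bytes from the file into the StringIO buffer
    buffer.write(file.read(bytes))
    # you could do some interesting things here to track progress
  end
end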


5 Comments

DaedalusCoder Over a year ago

Thanks very much for this. It appears to be working, although I'm still not sure how I'll use the loop to update the page in real time. One thing: is streaming going to slow down the upload process by any considerable amount?

Trevor Rowe Over a year ago

One option would be to track the progress in some other location (like memcache/db/etc). Then you can have the web browser poll a separate action that reports the progress. Streaming itself should not slow down the upload much, but anything you do inside the block will, so make sure those operations are fast.
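A minimal sketch of that idea, not tied to any particular store: `ProgressStore` below is a hypothetical in-memory stand-in for memcache or a database table, and `upload_id` is an assumed key that both the upload action and the polling action would know. The upload loop would call `update` after each chunk, and the polling action would call `percent`.

```ruby
# Hypothetical in-memory stand-in for memcache or a DB table.
class ProgressStore
  def initialize
    @data  = {}
    @mutex = Mutex.new
  end

  # Record how many bytes of `total` have been sent for this upload.
  def update(upload_id, sent, total)
    @mutex.synchronize { @data[upload_id] = { sent: sent, total: total } }
  end

  # Integer percentage for the polling action; nil if the id is unknown.
  def percent(upload_id)
    entry = @mutex.synchronize { @data[upload_id] }
    return nil unless entry
    return 100 if entry[:total].zero?
    [(entry[:sent] * 100) / entry[:total], 100].min
  end
end

STORE = ProgressStore.new
STORE.update('upload-1', 5 * 1024 * 1024, 18 * 1024 * 1024)
puts STORE.percent('upload-1') # 27
```

An in-process hash like this is not shared between server processes, which is why a shared store such as memcache or the database is the more realistic choice in a multi-worker deployment.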

Andy Triggs Over a year ago

This behaviour seems to be problematic in Ruby 2.0.0, and deprecated (though I can't find a deprecation notice in the code). See github.com/aws/aws-sdk-ruby/issues/192, where Trevor says "The block form is deprecated. That said, we do support Ruby 2 and I'll take a look at why this is failing."

Andy Triggs Over a year ago

I have had some success with the above in 1.9.3, though the total of bytes uploaded sometimes ends up greater than file.size, for reasons I don't understand.

ggez44 Over a year ago

@AndyTriggs I assume you might be summing "bytes"? That's just the requested chunk size, so if you're doing 5M chunks, the bytes variable will be 5M in every iteration, even the last. For example, for an 18M file you'll get 5M+5M+5M+5M, which adds up to 20M even though only 18M was uploaded.
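To illustrate the point without touching S3, here is a small simulation. `each_chunk` is a hypothetical helper that mimics the block form by requesting a fixed chunk size on every iteration, and a `StringIO` stands in for the file. Summing the requested size overshoots; summing the size of the data actually read does not.

```ruby
require 'stringio'

MB = 1024 * 1024

# Hypothetical helper: yields the requested chunk size and the data read.
def each_chunk(io, chunk_size)
  until io.eof?
    yield chunk_size, io.read(chunk_size)
  end
end

file      = StringIO.new('x' * (18 * MB))
requested = 0
actual    = 0

each_chunk(file, 5 * MB) do |bytes, data|
  requested += bytes          # what a naive counter adds up
  actual    += data.bytesize  # what was really read
end

puts requested / MB # 20
puts actual / MB    # 18
```

In the real block form, the same fix would be to keep the result of `file.read(bytes)` in a variable and add its `bytesize` to the running total.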

emartini · Accepted Answer · 2013-07-25 23:00:34Z

After reading the source code of the AWS gem, I adapted (or mostly copied) the multipart upload method to yield the current progress, based on how many chunks have been uploaded:

s3 = AWS::S3.new.buckets['your_bucket']

file = File.open(filepath, 'r', encoding: 'BINARY')
file_to_upload = "#{s3_dir}/#{filename}"
upload_progress = 0

opts = {
  content_type: mime_type,
  cache_control: 'max-age=31536000',
  estimated_content_length: file.size,
}

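# No explicit `self.` receiver: if compute_part_size is private (as a
# top-level def is), calling it through `self.` raises NoMethodError
part_size = compute_part_size(opts)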

parts_number = (file.size.to_f / part_size).ceil.to_i
obj          = s3.objects[file_to_upload]

begin
  obj.multipart_upload(opts) do |upload|
    until file.eof?
      break if upload.aborted?

      upload.add_part(file.read(part_size))
      upload_progress += 1.0 / parts_number

      # Yields the Float progress (0.0 to 1.0) and the multipart upload
      # object for the file that's currently being uploaded
      yield(upload_progress, upload) if block_given?
    end
  end
ensure
  # Close the file whether the upload finished, aborted, or raised
  file.close
end

The compute_part_size method comes from the gem's source, and I've modified it to this:

def compute_part_size(options)
  max_parts = 10000
  min_size  = 5242880 # 5 MB
  estimated_size = options[:estimated_content_length]

  [(estimated_size.to_f / max_parts).ceil, min_size].max.to_i
end

This code was tested on Ruby 2.0.0p0
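As a worked check of the arithmetic, the sketch below copies the same formula into a standalone method (`part_size_for` and `parts_for` are hypothetical names, and they take the size directly instead of an options hash) and evaluates it for two assumed file sizes. It does not talk to S3.

```ruby
MAX_PARTS     = 10_000
MIN_PART_SIZE = 5 * 1024 * 1024 # 5 MB

# Same formula as compute_part_size: at most MAX_PARTS parts,
# and never a part smaller than MIN_PART_SIZE.
def part_size_for(estimated_size)
  [(estimated_size.to_f / MAX_PARTS).ceil, MIN_PART_SIZE].max
end

# Number of parts the upload loop would send for a file of this size.
def parts_for(size)
  (size.to_f / part_size_for(size)).ceil
end

small = 18 * 1024 * 1024          # 18 MB
large = 100 * 1024 * 1024 * 1024  # 100 GB

puts part_size_for(small) # 5242880
puts parts_for(small)     # 4
puts part_size_for(large) # 10737419
puts parts_for(large)     # 10000
```

For the 18 MB file the minimum part size wins, so progress advances in four steps of 0.25. Only for very large files does the 10,000-part limit push the part size above 5 MB.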
