2

I would like to download some files programmatically, but I get a MemoryError for the larger ones. Small files download fine; when I try a larger file, I catch a MemoryError.

Here is my code:

def __download_gpl_file(accession):
    try:
        bin_string = __get_response(accession)
        if bin_string is None:
            return False
        string = __unzip(bin_string)
    except MemoryError:
        print 'Out of memory for: ' + accession
        return False

    if string:
        filename = DOWNLOADED + accession + '.txt'
        with open(filename, 'w+') as f:
            f.write(string)
        return True
    return False


def __get_response(accession, attempts=5):
    url = __construct_gpl_url(accession)  # Not shown
    while attempts > 0:
        try:
            response = urllib2.urlopen(url)
            if response and response.getcode() < 201:
                return response.read()
        except urllib2.URLError:
            print 'URLError with: ' + url
        attempts -= 1  # count every failed attempt, including URLError
    return None  # all attempts failed; the caller checks for None


def __unzip(bin_string):
    f = StringIO(bin_string)
    decompressed = gzip.GzipFile(fileobj=f)
    return decompressed.read()

Is there anything I can do to download larger files? Thanks in advance.

1
  • Can the downvoter please explain how to improve this question? Commented Nov 21, 2014 at 15:02

2 Answers

6

Instead of reading and writing the whole file at once, write it line by line:

response = urllib2.urlopen('url')
with open('filename', 'wb') as f:  # binary mode, so the bytes are written unchanged
    for line in response:
        f.write(line)

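If you want it to be faster, read fixed-size chunks instead of lines:

response = urllib2.urlopen('url')
with open('filename', 'wb') as f:  # binary mode, so the bytes are written unchanged
    while True:
        tmp = response.read(1024)  # at most 1024 bytes in memory at a time
        if not tmp:
            break
        f.write(tmp)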
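
The loops above save the bytes exactly as they arrive, and the file in the question is gzip-compressed, so it would still have to be decompressed afterwards. Below is a minimal sketch that decompresses each chunk as it is read, so neither the compressed nor the decompressed data is ever held in memory in full. The names stream_gunzip, src, dst_path and chunk_size are made up for illustration. It uses zlib rather than gzip.GzipFile because, as far as I know, GzipFile in Python 2 expects a seekable file object, which a urllib2 response is not.

```python
import zlib


def stream_gunzip(src, dst_path, chunk_size=64 * 1024):
    """Decompress a gzip stream from src into dst_path, one chunk at a time.

    src is any object with a read(size) method that returns bytes,
    for example the response returned by urllib2.urlopen(url).
    """
    # 16 + MAX_WBITS tells zlib to expect a gzip header and trailer.
    decompressor = zlib.decompressobj(16 + zlib.MAX_WBITS)
    with open(dst_path, 'wb') as out:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            out.write(decompressor.decompress(chunk))
        # Write whatever the decompressor is still buffering.
        out.write(decompressor.flush())
```

With the question's code this could be called as stream_gunzip(urllib2.urlopen(url), DOWNLOADED + accession + '.txt'), assuming the server sends a single gzip member.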

2

I don't have enough points to comment on Hackaholic's answer so my answer is just his first example but with a slight correction.

response = urllib2.urlopen('url')
with open('filename', 'wb') as f:  # binary mode, so the bytes are written unchanged
    for line in response:
        f.write(line)

I think he wrote f.write(f) by accident.
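
For what it's worth, the standard library already has a chunked copy loop in shutil.copyfileobj, so the download can be written without a hand-rolled loop. A small sketch, where save_stream, src, dst_path and chunk_size are names I made up for illustration:

```python
import shutil


def save_stream(src, dst_path, chunk_size=16 * 1024):
    """Copy a file-like object to dst_path without loading it all into memory."""
    with open(dst_path, 'wb') as out:
        # copyfileobj reads and writes chunk_size bytes at a time.
        shutil.copyfileobj(src, out, chunk_size)
```

Here src would be the object returned by urllib2.urlopen('url'). The bytes are written as received, so a gzip file stays compressed on disk.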
