
Hey, I'm trying to download images from a site. I've set up a basic filter, which works fine, but my aim is to automate this, and one of the steps is repeatedly re-downloading the site. I'm using wget for that, which works fine from the terminal, but it seems os.system() in Python creates its own 'terminal' (can't think of the proper name at the moment), which means I can't use things that I've installed, such as wget. I've tried gnome-terminal, but I might be doing something wrong :/ Any other solutions would be greatly appreciated, thanks!

  • Did you try specifying the full path to wget? You can find the path with which wget. It would also help if you posted your code. Commented Nov 2, 2014 at 23:07
  • Why not just use an HTML library to download the images? Commented Nov 2, 2014 at 23:21
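To make the full-path suggestion concrete, here is a minimal sketch. os.system() hands the command to a shell that inherits the Python process's environment, so if that PATH differs from the one in your interactive terminal, wget may not be found. The helper name run_tool and the example URL are made up for illustration; shutil.which and subprocess.run are the standard-library calls doing the work.

```python
import shutil
import subprocess
import sys


def run_tool(name, *args):
    """Resolve `name` on PATH and run it.

    Returns the exit code, or None if the program cannot be found.
    """
    path = shutil.which(name)
    if path is None:
        print(name, 'was not found on PATH', file=sys.stderr)
        return None
    # Passing a list runs the program directly, without going through a shell
    completed = subprocess.run([path, *args])
    return completed.returncode


# Example call (hypothetical URL):
# run_tool('wget', 'https://example.com/')
```

If run_tool returns None, the Python process really cannot see wget, and comparing os.environ['PATH'] with the PATH in your terminal is a reasonable next step.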

1 Answer

Why are you trying to download the site by calling wget from the terminal? I think a better idea is to download it the Python way:

import sys
import urllib.error
import urllib.request

def get_raw_webpage(url):
    """
        Download a web url as raw bytes
    """
    try:
        req = urllib.request.Request(url)
        # The with block closes the connection once the body has been read
        with urllib.request.urlopen(req) as response:
            return response.read()

    except urllib.error.HTTPError as e:
        # HTTPError subclasses URLError, so it has to be caught first
        print('HTTPError:', e.code, file=sys.stderr)
        return None

    except urllib.error.URLError as e:
        print('URLError:', e.args, file=sys.stderr)
        return None

    except ValueError as e:
        # Raised for malformed URLs, e.g. a missing scheme
        print('Invalid url.', e.args, file=sys.stderr)
        return None


def get_webpage(url):
    """
    Get webpage as raw bytes and then
    convert to readable form
    """
    data = get_raw_webpage(url)
    if data is None:
        return None

    return data.decode('utf-8')

You can also pass the get_raw_webpage function a link to an image to download it!
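As a rough sketch of the image case: fetch the bytes, then write them to a file in binary mode. The snippet is self-contained, so it uses its own compact fetch helper in place of get_raw_webpage. The names fetch_bytes, save_binary and download_image, and the URL in the comment, are made up for illustration.

```python
import sys
import urllib.error
import urllib.request
from pathlib import Path


def fetch_bytes(url):
    """Return whatever `url` points to as raw bytes, or None on failure."""
    try:
        with urllib.request.urlopen(url) as response:
            return response.read()
    except (urllib.error.URLError, ValueError) as e:
        # URLError also covers HTTPError; ValueError means a malformed URL
        print('Download failed:', e, file=sys.stderr)
        return None


def save_binary(data, filename):
    """Write raw bytes to `filename` unchanged; return the byte count."""
    return Path(filename).write_bytes(data)


def download_image(url, filename):
    """Fetch `url` and save it to `filename`; return True on success."""
    data = fetch_bytes(url)
    if data is None:
        return False
    save_binary(data, filename)
    return True


# Example call (hypothetical URL):
# download_image('https://example.com/picture.png', 'picture.png')
```

The important part is writing the data as bytes. Decoding it as text first, the way get_webpage does, would corrupt an image.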


3 Comments

Thanks for your answer, the code works great. You mentioned being able to use the get_raw_webpage function to download an image? Is it possible to get some more detail on that? Thanks!
Yes, get_raw_webpage actually downloads whatever your link points to as raw byte data. So if you give it a link to an image, sound, or any other file and then save that data to a file in binary mode, you have downloaded that file.
Thanks heaps! Played around with it and got it to work perfectly.
