Python script for fetching daylerees colour schemes for PHPStorm from Github

I wanted to download all of the colour schemes for PhpStorm and was too lazy to do it manually, so I wrote a little script for it in Python, because why not.

This is my very first Python script and I'm still learning through tutorials and whatnot.

Things it does:

  1. Asks for the path where the files should be saved
  2. If the destination folder does not exist, creates it
  3. Makes sure the path ends with '/'
  4. Instantiates a crawler instance
  5. The crawler fetches the initial JSON payload
  6. It loops over the items to check whether each item is a file or a directory
  7. If it is a file, it downloads it; if it is a directory, it instantiates a new crawler instance with that directory's path
  8. Before downloading a file, it checks whether the file already exists and prints a message accordingly
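
To illustrate steps 5–7, here is a minimal sketch of how the JSON payload can be split into files and sub-directories. The payload below is hand-written and abridged to the shape the script relies on (`name`, `type`, `download_url`, `url`), not a real API response:

```python
import json

# Hand-written sample mimicking the shape of the GitHub contents payload;
# only the fields the script actually reads are included.
sample_payload = """[
    {"name": "lavender.icls", "type": "file",
     "download_url": "https://example.com/lavender.icls"},
    {"name": "patterns", "type": "dir",
     "url": "https://api.github.com/repos/daylerees/colour-schemes/contents/jetbrains/patterns"}
]"""

collection = json.loads(sample_payload)
files = [item["name"] for item in collection if item.get("type") == "file"]
dirs = [item["name"] for item in collection if item.get("type") == "dir"]
print(files)  # ['lavender.icls']
print(dirs)   # ['patterns']
```

Each `dir` entry carries its own API `url`, which is what makes the recursive crawler-per-directory approach work.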

Things I can already see that I will go back and improve:

  1. Make the link dynamic instead of hard-coding it.
  2. Implement better path checking / validation.
  3. Use try / except where applicable.
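
For points 2 and 3, one possible direction (a sketch, not the script's current code) is to let `os.makedirs` create nested folders and to wrap the call in try / except rather than checking for existence first:

```python
import errno
import os
import tempfile


def ensure_directory(path):
    """Create path (including missing parent folders) if it does not exist.

    try/except avoids the check-then-create race: if the directory appears
    between a would-be os.path.exists check and mkdir, nothing breaks.
    """
    try:
        os.makedirs(path)
    except OSError as exc:
        if exc.errno != errno.EEXIST:
            raise  # a real problem, e.g. a permissions error


# Demonstrate with a throwaway temp directory.
target = os.path.join(tempfile.mkdtemp(), "themes", "nested")
ensure_directory(target)
ensure_directory(target)  # second call is a no-op, not an error
print(os.path.isdir(target))  # True
```

On Python 3.2+ this collapses to `os.makedirs(path, exist_ok=True)`; the errno check is the portable spelling for the Python 2 style the script currently uses.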

And the code:

import urllib
import json
import time
import os
import re


class Crawler:
    def __init__(self, url, path):
        self.url = url
        self.path = path
        self.counter = 0
        self.total_files = 0

    def crawl(self):
        self.get_contents()

    def get_contents(self):
        response = urllib.urlopen(self.url).read()
        collection = json.loads(response)
        self.total_files = len(collection)
        for data in collection:
            if data.get('type') == 'file':
                self.download(data.get('name'), data.get('download_url'))
            elif data.get('type') == 'dir':
                dir_crawler = Crawler(data.get('url'), self.path)
                print('\nFound a directory. Pausing the main download and crawling the directory - ' + data.get(
                    'name') + '\n')
                dir_crawler.crawl()
                print('\n' + data.get('name') + ' directory is done. Continuing from where I left off.\n')
            time.sleep(1)

    def download(self, name, link):
        if name and not os.path.exists(self.path + name):
            download = urllib.URLopener()
            download.retrieve(link, self.path + name)
            self.counter += 1
            print('Downloading ' + name + ' (' + str(self.counter) + ' out of ' + str(self.total_files) + ')')
        else:
            print('File: ' + name + ' already exists in ' + self.path)


download_location = raw_input(
    "Where would you like me to save the files? (e.g. ./ or ../ or ./themes/ or /path/to/folder): ")
if download_location and re.match(r'.+/$', download_location):
    if not os.path.exists(download_location):
        os.mkdir(download_location)
    crawler = Crawler('https://api.github.com/repos/daylerees/colour-schemes/contents/jetbrains',
                      download_location)
    crawler.crawl()
    print('\n\nAll done. Themes have been saved into ' + download_location +
          '. Thank you for your cooperation. Have a nice day.')
else:
    print('The path must end with a forward slash.')
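
As written, the script targets Python 2 (`urllib.urlopen`, `urllib.URLopener`, `raw_input`). For anyone running Python 3, the equivalents are `urllib.request.urlopen`, `urllib.request.urlretrieve`, and `input()`. A rough sketch of the `download` method in Python 3 terms (not the author's code):

```python
import os
import tempfile
import urllib.request


def download(name, link, path):
    """Python 3 sketch of the script's download method.

    urllib.URLopener().retrieve -> urllib.request.urlretrieve
    """
    target = os.path.join(path, name)
    if os.path.exists(target):
        print('File: ' + name + ' already exists in ' + path)
        return False
    urllib.request.urlretrieve(link, target)  # network call
    return True


# Demonstrate the already-exists branch without touching the network.
tmp = tempfile.mkdtemp()
open(os.path.join(tmp, 'lavender.icls'), 'w').close()
print(download('lavender.icls', 'https://example.com/x', tmp))  # False
```

`os.path.join` also removes the need to require a trailing slash on the input path, which would simplify the regex check at the bottom of the script.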