
I'm trying to do some exploratory data analysis on the data provided by CSSE at Johns Hopkins University. They host it on GitHub at https://github.com/CSSEGISandData/COVID-19/tree/master/csse_covid_19_data/csse_covid_19_daily_reports. I want to download all of the files with Python and save them to my current directory, so that I have the up-to-date data locally and can reload it whenever I need it. I'm planning two functions: fetch_covid_daily_data(), which goes to the site and downloads all the CSV files, and load_covid_daily_data(), which reads the data back from the current directory so I can process it with pandas.

I'm doing it this way because when I come back to my code, I can call fetch_covid_daily_data() again and it will pick up any new changes, such as another daily CSV being added.
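A minimal sketch of those two functions, assuming the GitHub contents API is used to list the directory (the `daily_reports` folder name and the `is_daily_report` helper are my own choices; note that the contents API caps a directory listing at 1,000 files, so a very long history may need the git tree API instead):

```python
import json
import re
import urllib.request
from pathlib import Path

import pandas as pd

API_URL = ("https://api.github.com/repos/CSSEGISandData/COVID-19/"
           "contents/csse_covid_19_data/csse_covid_19_daily_reports")
DATA_DIR = Path("daily_reports")


def is_daily_report(name):
    """Daily report files in the repo are named MM-DD-YYYY.csv."""
    return re.fullmatch(r"\d{2}-\d{2}-\d{4}\.csv", name) is not None


def fetch_covid_daily_data():
    """Download every daily-report CSV into DATA_DIR (re-run to pick up new files)."""
    DATA_DIR.mkdir(exist_ok=True)
    with urllib.request.urlopen(API_URL) as resp:
        listing = json.load(resp)
    for entry in listing:
        if is_daily_report(entry["name"]):
            urllib.request.urlretrieve(entry["download_url"], DATA_DIR / entry["name"])


def load_covid_daily_data():
    """Read the downloaded CSVs back into one DataFrame, tagged by report date."""
    frames = []
    for path in sorted(DATA_DIR.glob("*.csv")):
        df = pd.read_csv(path)
        df["report_date"] = path.stem  # e.g. "01-22-2020"
        frames.append(df)
    return pd.concat(frames, ignore_index=True)
```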

2 Answers


You can read data directly from an online CSV into a pandas DataFrame.

Example:

import pandas as pd

CONFIRMED_URL = 'https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_confirmed_global.csv'

df = pd.read_csv(CONFIRMED_URL)

# df now contains data from time of call.

You can also create a class to fetch and manipulate all the data:


import pandas as pd


class Corona:

    def __init__(self):
        BASE_URL = 'https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series'

        self.URLS = {
            'confirmed': f'{BASE_URL}/time_series_covid19_confirmed_global.csv',
            'deaths': f'{BASE_URL}/time_series_covid19_deaths_global.csv',
            'recovered': f'{BASE_URL}/time_series_covid19_recovered_global.csv',
        }

        self.data = {case: pd.read_csv(url) for case, url in self.URLS.items()}

    # create other useful methods to work with the data
    def current_status(self):
        # function to show current status
        pass


To get current data:

# corona.data is a dictionary with DataFrames as values
corona = Corona()
confirmed_df = corona.data['confirmed']

# if you want to save one to CSV
confirmed_df.to_csv('confirmed.csv', index=False)

# show the first five rows
print(confirmed_df.head())

# check which other DataFrames are available
print(corona.data.keys())
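The `current_status` stub above could, for example, report the latest worldwide total for each case type. A sketch, assuming (as in these time-series CSVs) that each day adds one column, so the rightmost column is the newest date:

```python
import pandas as pd


def current_status(data):
    """Latest worldwide total per case type.

    `data` is the dict of DataFrames built in __init__; each time-series
    CSV adds one column per day, so the last column holds the newest date.
    """
    return {case: int(df.iloc[:, -1].sum()) for case, df in data.items()}
```

Inside the class this would read `self.data` instead of taking a parameter.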

3 Comments

Hi, yes, I have done your first example multiple times, but my problem is that I want to collect all of those CSVs in the daily reports and join them together myself. I want to know if there's an easy way to do this, in case I come across data that's spread across multiple CSV files and I need to join them. I'm doing this on Google Colab, so I don't want to download the data.
I also love your idea of using a class!
You can easily do that too. What I love about classes is that they help organise your code. To answer your multiple-CSV question: if there is a pattern in the CSV names, you can still use the class above with a list comprehension to get all the CSVs and then merge/concat/join them into one. I'm happy to help if you provide a sample URL of the CSVs and what you would like to do. See stackoverflow.com/questions/20906474/…
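The list-comprehension-plus-concat approach from the comment above, as a sketch for the daily reports (the MM-DD-YYYY file-name pattern comes from the repository; `combine_daily` and the `report_date` column are my own naming):

```python
import pandas as pd

BASE = ('https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/'
        'csse_covid_19_data/csse_covid_19_daily_reports')


def combine_daily(frames_by_date):
    """Stack per-day DataFrames into one, tagging each row with its report date."""
    frames = [df.assign(report_date=date) for date, df in frames_by_date.items()]
    return pd.concat(frames, ignore_index=True)


# On Colab you can read each day straight from GitHub, no local download:
# dates = ['01-22-2020', '01-23-2020']
# combined = combine_daily({d: pd.read_csv(f'{BASE}/{d}.csv') for d in dates})
```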

Assuming you have Git installed, you can clone the repository from your terminal:

git clone https://github.com/CSSEGISandData/COVID-19

hope this helps!
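If you'd rather drive this from Python so that a re-run picks up new daily CSVs, a sketch (the function names are my own; `git pull` updates an existing clone):

```python
import subprocess
from pathlib import Path

REPO_URL = 'https://github.com/CSSEGISandData/COVID-19'


def git_sync_command(repo_dir):
    """Clone on the first run, pull to update on later runs."""
    if Path(repo_dir).is_dir():
        return ['git', '-C', str(repo_dir), 'pull']
    return ['git', 'clone', REPO_URL, str(repo_dir)]


def sync_repo(repo_dir='COVID-19'):
    subprocess.run(git_sync_command(repo_dir), check=True)
```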

Comments
