3

I have a long list of csv files that I want to read as dataframes and name them by their file name. For example, I want to read in the file status.csv and assign its dataframe the name status. Is there a way I can efficiently do this using Pandas?

Looking at this, I still have to write the name of each csv in my loop. I want to avoid that.

Looking at this, that allows me to read multiple csv into one dataframe instead of many.

2
  • You can get all csv under current directory using os.listdir("."), combined with os.path.basename to parse file name. Commented Mar 19, 2019 at 16:29
  • Are you open to using dask? You could read in all the separate dataframes and have them contained in one data structure, i.e., a dask dataframe, partitioned by their original file name. Docs are here Commented Mar 19, 2019 at 16:36

2 Answers

9

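You can list all CSV files in a directory with os.listdir(dirname), then use os.path.basename and os.path.splitext to get each file name without its extension. Storing the dataframes in a dictionary keyed by that name lets you look one up as d['status'] without writing out each file name.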

import os

import pandas as pd

# CSV files in the current directory
csvs = [x for x in os.listdir('.') if x.endswith('.csv')]
# status.csv -> status
fns = [os.path.splitext(os.path.basename(x))[0] for x in csvs]

# map each base name to its dataframe
d = {}
for fn, csv in zip(fns, csvs):
    d[fn] = pd.read_csv(csv)
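
The same idea can be sketched with pathlib, which handles both the listing and the name parsing. This is only an illustration: the helper name read_csvs_by_name and its directory parameter are made up here, and it assumes every *.csv file in the directory can be read with the default pd.read_csv settings.

```python
from pathlib import Path

import pandas as pd


def read_csvs_by_name(directory="."):
    """Return a dict mapping each CSV's base name (no extension) to its DataFrame.

    Hypothetical helper for illustration; not part of pandas.
    """
    frames = {}
    for path in sorted(Path(directory).glob("*.csv")):
        # Path.stem drops the directory and the final suffix: status.csv -> status
        frames[path.stem] = pd.read_csv(path)
    return frames
```

With a file status.csv in the directory, read_csvs_by_name(".")["status"] would then be its dataframe. A dictionary is used here rather than creating one variable per file, since the set of names is only known at run time.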


1

You could create a dictionary of DataFrames:

import pandas as pd

d = {}  # dictionary that will hold the dataframes

for file_name in list_of_csvs:  # list_of_csvs: paths of the CSV files to read
    # read the CSV into a dataframe and store it in the dict with file_name as its key
    d[file_name] = pd.read_csv(file_name)
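
The loop above takes list_of_csvs as given. One way to build that list without typing each file name is the standard glob module. The sketch below is an assumption about how the list might be produced; the helper name list_csvs and its directory parameter are hypothetical.

```python
import glob
import os


def list_csvs(directory="."):
    """Return the paths of all .csv files directly inside `directory`, sorted.

    Hypothetical helper for illustration.
    """
    # glob expands the wildcard; sorting makes the order predictable
    return sorted(glob.glob(os.path.join(directory, "*.csv")))
```

Setting list_of_csvs = list_csvs(".") would feed the loop above. Note that the dictionary keys are then paths such as ./status.csv, extension included, so d["./status.csv"] is the lookup unless the key is trimmed first.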

