Read different .csv files into different dataframes with a loop with Python Pandas [duplicate]

Question

Please read the whole post before marking this question as a duplicate. I know that this post has a similar question, but what I'm looking for is somewhat different.

I have a list of file names:

files = ['first.csv', 'second.csv', 'third.csv']

And I want to read them inside a loop with pandas. What I expect is to create a different dataframe on each iteration of the loop:

first = pd.read_csv('first.csv')
second = pd.read_csv('second.csv')
third = pd.read_csv('third.csv')

But inside a loop. Something like:

for i in range(len(files)):
    csv = re.split('.', files[i])[0]
    csv = pd.read_csv(files[i])

IMPORTANT: Each CSV has different rows and columns, so I don't want to read the three files and combine them into one with pd.concat. I want to read them separately.

I tried to read them into a list with:

dataframe_list = [pd.read_csv(file_name) for file_name in files]

But that raises the following error:

UnicodeDecodeError: 'utf-8' codec can't decode byte 0x85 in position 59: invalid start byte

"Something like" is exactly what you need (except for the second line, which is useless). Did you try?
– DYZ, Commented Aug 22, 2018 at 19:08
@DYZ, won't what they have simply result in csv being the dataframe corresponding to third.csv? It sounds like they want three different dataframes.
– Zain Patel, Commented Aug 22, 2018 at 19:10
More efficiently, you can get a list of dataframes with frames=[pd.read_csv(f) for f in files] or even frames=list(map(pd.read_csv, files)).
– DYZ, Commented Aug 22, 2018 at 19:11
@Rubén, that error is an issue with reading the CSV, not with storing the dataframes in a list. If the files have different encodings, you can either specify the encoding for each file in a dictionary or, more haphazardly, use a try/except clause: catch UnicodeDecodeError and then retry the bad files with the added argument encoding='latin-1' in pd.read_csv.
– ALollz, Commented Aug 22, 2018 at 20:04
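
A minimal sketch of the try/except fallback described in the last comment. The helper name read_csv_with_fallback and its fallback_encoding parameter are hypothetical, not part of pandas, and the sketch assumes the default UTF-8 read is the one that fails. Keep in mind that latin-1 can decode any byte sequence, so the fallback never raises, but it may give wrong characters if the file actually uses another encoding.

```python
import pandas as pd


def read_csv_with_fallback(path, fallback_encoding='latin-1'):
    """Read a CSV as UTF-8 first; retry with a fallback encoding if decoding fails.

    Hypothetical helper for illustration, not a pandas function.
    """
    try:
        # pandas decodes as UTF-8 when no encoding is given
        return pd.read_csv(path)
    except UnicodeDecodeError:
        # The file is not valid UTF-8, so retry with the fallback encoding
        return pd.read_csv(path, encoding=fallback_encoding)


# Usage, assuming the files from the question exist in the working directory:
# frames = [read_csv_with_fallback(name) for name in ['first.csv', 'second.csv', 'third.csv']]
```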

Zain Patel · Accepted Answer · 2018-08-22 19:12:28Z

2

You can do something like this:

import pandas as pd
files = ['file1.csv', 'file2.csv', 'file3.csv']
dataframe_list = [pd.read_csv(file_name) for file_name in files]

Then you can use dataframe_list[0] to get the first dataframe, and so on. You might want to use a dictionary instead, with the keys being the labels you want for the dataframes.
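
A rough sketch of the dictionary variant, assuming you want each dataframe keyed by its file name without the extension. The function name read_frames is made up for illustration; Path(...).stem drops both the directory and the .csv suffix.

```python
from pathlib import Path

import pandas as pd


def read_frames(files):
    """Return a dict mapping each file's stem (name without extension) to its DataFrame.

    Hypothetical helper for illustration.
    """
    return {Path(name).stem: pd.read_csv(name) for name in files}


# Usage, assuming the three files from the question exist in the working directory:
# frames = read_frames(['first.csv', 'second.csv', 'third.csv'])
# frames['first']   # dataframe read from first.csv
```

This keeps the dataframes separate, each with its own rows and columns, without creating variable names dynamically.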

Quick tip: the construct for i in range(0, len(files)) and then only caring about files[i] is ugly. files is a list, so you can iterate over it using for file in files.

answered Aug 22, 2018 at 19:12


1 Comment

rubengura Over a year ago

Thanks for the tip! I tried your solution but it raises this error: UnicodeDecodeError: 'utf-8' codec can't decode byte 0x85 in position 59: invalid start byte

mad_ · Accepted Answer · 2018-08-22 19:58:58Z

0

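import pandas as pd

files = ['first.csv', 'second.csv', 'third.csv']
list_of_df = []
# Read each file into its own dataframe and collect them in a list
for file_name in files:
    df = pd.read_csv(file_name, encoding="utf-8")  # change the encoding if the files are not UTF-8
    list_of_df.append(df)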

edited Aug 22, 2018 at 19:58

answered Aug 22, 2018 at 19:12


2 Comments

rubengura Over a year ago

I tried but got this error: UnicodeDecodeError: 'utf-8' codec can't decode byte 0x85 in position 59: invalid start byte

mad_ Over a year ago

You need to pass the encoding parameter.
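
To illustrate why the encoding matters, here is a small standard-library sketch using the byte from the traceback. Which encoding the files really use is not stated in the question; cp1252 (Windows-1252) is only a guess, based on 0x85 being the ellipsis character in that code page.

```python
raw = b'\x85'  # the byte named in the traceback

try:
    raw.decode('utf-8')
    utf8_ok = True
except UnicodeDecodeError:
    # 0x85 is a continuation byte, so it cannot start a UTF-8 character
    utf8_ok = False

# Assumption: the file was saved as Windows-1252, where 0x85 is an ellipsis (U+2026)
as_cp1252 = raw.decode('cp1252')
```

If that guess holds, passing encoding='cp1252' to pd.read_csv should read the file. That is an assumption about the data, not something the traceback confirms.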
