Import multiple excel files, create a column and get values from excel file's name

Question

I need to load multiple Excel files, each named after its starting date, e.g. "20190114", and then append them into one DataFrame. For this, I use the following code:

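import glob
import pandas as pd

all_data = pd.DataFrame()
for f in glob.glob('C:\\path\\*.xlsx'):
    df = pd.read_excel(f)
    # DataFrame.append was removed in pandas 2.0; pd.concat does the same job
    all_data = pd.concat([all_data, df], ignore_index=True)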

In fact, I do not need all of the data, only the rows filtered on multiple columns. I would then like to create an additional column ('from') holding the file name (which is the date) for each respective file.

Example:

Data from the excel file, named '20190101'

Data from the excel file, named '20190115'

The final DataFrame must contain only the rows where 'price' is not equal to 0 and 'code' equals 'r'. (I do not know whether it is possible to import the data already filtered, to avoid loading a huge volume of data.) I then need to add a 'from' column with the respective date taken from the file's name, like this:

dataframes for trial:

import pandas as pd

df1 = pd.DataFrame({'id':['id_1', 'id_2','id_3', 'id_4','id_5'],
               'price':[0,12.5,17.5,24.5,7.5],
               'code':['r','r','r','c','r'] })

df2 = pd.DataFrame({'id':['id_1', 'id_2','id_3', 'id_4','id_5'],
               'price':[7.5,24.5,0,149.5,7.5],
               'code':['r','r','r','c','r'] })

anky · Accepted Answer · 2019-10-25 08:48:36Z


IIUC, you can filter the necessary rows and then concat. For the file name, you can use os.path.split() and strip the '.xlsx' extension with string slicing:

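import glob
import os
import pandas as pd

frames = []
for f in glob.glob('C:\\path\\*.xlsx'):
    df = pd.read_excel(f)
    # file name without the 5-character '.xlsx' extension, e.g. '20190101'
    df['from'] = os.path.split(f)[1][:-5]
    # keep only rows with code 'r' and a non-zero price
    frames.append(df[df['code'].eq('r') & df['price'].ne(0)])
pd.concat(frames, ignore_index=True)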

     id  price code      from
0  id_2   12.5    r  20190101
1  id_3   17.5    r  20190101
2  id_5    7.5    r  20190101
3  id_1    7.5    r  20190115
4  id_2   24.5    r  20190115
5  id_5    7.5    r  20190115
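To check the filter and the file-name column without any Excel files on disk, here is a minimal sketch that runs the same logic on the trial DataFrames from the question. The helper name `tag_and_filter` and the two paths in `files` are hypothetical and only stand in for the results of `pd.read_excel(f)`. It also swaps the `[:-5]` slice for `.stem` from `pathlib`, which drops the extension whatever its length; that is an alternative, not what the answer above uses.

```python
from pathlib import PureWindowsPath

import pandas as pd


def tag_and_filter(df, path):
    """Keep rows with code 'r' and a non-zero price, tagged with the file stem."""
    out = df[df['code'].eq('r') & df['price'].ne(0)].copy()
    # .stem drops both the directory and the extension ('20190101.xlsx' -> '20190101')
    out['from'] = PureWindowsPath(path).stem
    return out


# Trial frames from the question, standing in for pd.read_excel(f)
df1 = pd.DataFrame({'id': ['id_1', 'id_2', 'id_3', 'id_4', 'id_5'],
                    'price': [0, 12.5, 17.5, 24.5, 7.5],
                    'code': ['r', 'r', 'r', 'c', 'r']})
df2 = pd.DataFrame({'id': ['id_1', 'id_2', 'id_3', 'id_4', 'id_5'],
                    'price': [7.5, 24.5, 0, 149.5, 7.5],
                    'code': ['r', 'r', 'r', 'c', 'r']})

# Hypothetical paths, keyed to the frames they would have been read from
files = {'C:\\path\\20190101.xlsx': df1, 'C:\\path\\20190115.xlsx': df2}

result = pd.concat(
    [tag_and_filter(df, path) for path, df in files.items()],
    ignore_index=True,
)
print(result)
```

PureWindowsPath is used here only so that the backslash paths parse the same way on any operating system; with real paths returned by glob on your own machine, pathlib.Path(f).stem should behave the same.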

edited Oct 25, 2019 at 8:48

answered Oct 24, 2019 at 14:19



2 Comments

Vero Over a year ago

Thank you @anky_91! it's working! I just needed to exclude those that are not equal to 'r', but it's easy to fix

anky Over a year ago

@Vero sorry for missing that out, fixed the code now :)
