I need to load multiple Excel files, each named after its starting date, e.g. "20190114", and append them into one DataFrame. For this, I use the following code:

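import glob
import pandas as pd

all_data = pd.DataFrame()
for f in glob.glob('C:\\path\\*.xlsx'):
    df = pd.read_excel(f)
    # DataFrame.append was removed in pandas 2.0; pd.concat is the replacement
    all_data = pd.concat([all_data, df], ignore_index=True)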

In fact, I do not need all of the data, only rows filtered on multiple columns. I would then like to create an additional column ('from') holding the file name (the date) for each respective file.

Example:

Data from the excel file, named '20190101'

[image]

Data from the excel file, named '20190115'

[image]

The final DataFrame must contain only rows where 'price' is not equal to 0 and 'code' is 'r' (I do not know whether it is possible to load the data already filtered, to avoid loading a huge volume of data), and then I need to add a column 'from' with the respective date taken from the file's name:

like this:

[image]

DataFrames for trial:

import pandas as pd

df1 = pd.DataFrame({'id':['id_1', 'id_2','id_3', 'id_4','id_5'],
               'price':[0,12.5,17.5,24.5,7.5],
               'code':['r','r','r','c','r'] })

df2 = pd.DataFrame({'id':['id_1', 'id_2','id_3', 'id_4','id_5'],
               'price':[7.5,24.5,0,149.5,7.5],
               'code':['r','r','r','c','r'] })

1 Answer

IIUC, you can filter the necessary rows and then concat. For the file name, you can use os.path.split() and strip the '.xlsx' extension with string slicing:

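import glob
import os
import pandas as pd

l = []
for f in glob.glob('C:\\path\\*.xlsx'):
    df = pd.read_excel(f)
    # file name without the 5-character '.xlsx' extension, e.g. '20190101'
    df['from'] = os.path.split(f)[1][:-5]
    # keep only rows with code 'r' and a non-zero price
    l.append(df[df['code'].eq('r') & df['price'].ne(0)])
pd.concat(l, ignore_index=True)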

     id  price code      from
0  id_2   12.5    r  20190101
1  id_3   17.5    r  20190101
2  id_5    7.5    r  20190101
3  id_1    7.5    r  20190115
4  id_2   24.5    r  20190115
5  id_5    7.5    r  20190115
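
If you want to try the filter-and-tag step without any Excel files, here is a minimal sketch on the trial DataFrames from the question. The paths in `frames` and the helper name `filter_and_tag` are made up for illustration, and `os.path.splitext` is swapped in for the `[:-5]` slice so the extension length does not matter:

```python
import os
import pandas as pd

# Trial data from the question, keyed by hypothetical file paths
# (the 'data/' directory is an assumption, not part of the question).
frames = {
    'data/20190101.xlsx': pd.DataFrame({
        'id': ['id_1', 'id_2', 'id_3', 'id_4', 'id_5'],
        'price': [0, 12.5, 17.5, 24.5, 7.5],
        'code': ['r', 'r', 'r', 'c', 'r'],
    }),
    'data/20190115.xlsx': pd.DataFrame({
        'id': ['id_1', 'id_2', 'id_3', 'id_4', 'id_5'],
        'price': [7.5, 24.5, 0, 149.5, 7.5],
        'code': ['r', 'r', 'r', 'c', 'r'],
    }),
}


def filter_and_tag(df, path):
    """Keep rows with code 'r' and a non-zero price, then add the file stem as 'from'."""
    # splitext drops the extension whatever its length ('.xlsx', '.xls', ...)
    stem = os.path.splitext(os.path.basename(path))[0]
    out = df[df['code'].eq('r') & df['price'].ne(0)].copy()
    out['from'] = stem
    return out


result = pd.concat(
    [filter_and_tag(df, path) for path, df in frames.items()],
    ignore_index=True,
)
print(result)
```

In the real loop you would call the helper on `pd.read_excel(f)` for each path returned by `glob.glob`. Filtering before the concat keeps only the needed rows in memory, although each file is still read in full first.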

2 Comments

Thank you @anky_91! it's working! I just needed to exclude those that are not equal to 'r', but it's easy to fix
@Vero sorry for missing that out, fixed the code now :)
