Accommodating variable filename for python script

Question

I need to read multiple files as input for my Python script, but the filenames vary depending on when each file was generated.

File1: RM_Sales_Japan_2011201920191124194200.xlsx
File2: RM_Volume_Australia_201120192019154321194200.xlsx

How can I accommodate these changes when reading a file, instead of specifying the exact filename every time I run the script?

Things I tried: I used the method below in my previous scripts, because there was only one file with a known extension:

xlsxfile = "*.xlsx"
filelocation = "/user/script/" + xlsxfile

But with multiple files sharing the same extension, I am not sure how to define the file location.

EDIT1:

I was trying to get more clarity on using glob with read_excel. Please see my example code below:

import os
import glob
import pandas as pd
os.chdir ('D:\\Users\\RMoharir\\Downloads\\Smart Spend\\Input')

fls=glob.glob("Medical*.*")

df1 = pd.read_excel(fls, parse_cols = 'A:H', skiprows = 10, header = None)

But this gives me an error:

ValueError: Invalid file path or buffer object type: <class 'list'>

Any help is appreciated.

The filename will change depending on the generation time of the input file, so I just want to use a partial filename.
– RSM, Commented Nov 25, 2019 at 6:22

Hymns For Disco · Accepted Answer · 2019-11-25 05:16:44Z

2

If you simply need to find all the files in a directory that match a given pattern, the os and re modules have you covered.

import os
import re

files = os.listdir()

for file in files:
    if re.match(r".*\.xlsx$", file):
        print(file)

This short program will print out every file in the current directory whose name ends with .xlsx. If you need to match a more complicated pattern, you may need to read up on regular expressions.
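As an illustration of a more specific pattern, here is a minimal sketch that assumes the filenames follow the shape shown in the question: RM_, a report name, a country, a run of digits, then .xlsx. The names FILENAME_PATTERN and parse_filename are made up for this example, and the pattern itself is a guess at the naming scheme, not something the question confirms.

```python
import re

# Assumed naming scheme: RM_<report>_<country>_<digits>.xlsx
FILENAME_PATTERN = re.compile(
    r"RM_(?P<report>[A-Za-z]+)_(?P<country>[A-Za-z]+)_(?P<stamp>\d+)\.xlsx"
)


def parse_filename(name):
    """Return the named parts of a matching filename, or None if it does not match."""
    match = FILENAME_PATTERN.fullmatch(name)
    return match.groupdict() if match else None


# One of the example filenames from the question
parsed = parse_filename("RM_Sales_Japan_2011201920191124194200.xlsx")
print(parsed)
```

Using fullmatch means the whole name has to fit the pattern, so no trailing $ is needed.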

Note that os.listdir takes an optional string argument naming the path to look in. If it is not given, it looks in the current working directory, which is usually the directory the program was run from.
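A rough sketch of passing the directory explicitly is below. The helper name find_files and the shortened demo filenames are hypothetical; the demo uses a temporary directory only so that the snippet runs anywhere. Since os.listdir returns bare names, each one is joined back onto the directory to get a usable path.

```python
import os
import re
import tempfile


def find_files(directory, pattern):
    """Return sorted full paths of files in `directory` whose names match `pattern`."""
    regex = re.compile(pattern)
    return sorted(
        os.path.join(directory, name)
        for name in os.listdir(directory)
        if regex.fullmatch(name)
    )


# Demo with made-up filenames in a throwaway directory
with tempfile.TemporaryDirectory() as demo_dir:
    for name in ("RM_Sales_Japan_1.xlsx", "RM_Volume_Australia_2.xlsx", "notes.txt"):
        open(os.path.join(demo_dir, name), "w").close()
    found = [os.path.basename(p) for p in find_files(demo_dir, r"RM_.*\.xlsx")]

print(found)
```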

answered Nov 25, 2019 at 5:16


4 Comments

RSM Over a year ago

Thanks for your answer. I modified your solution as per my need:

for fl in os.listdir():
    if re.match(r"RM_.*\.xlsx$", fl):
        print(fl)
        df = pd.read_excel(fl, parse_cols = 'A:E', skiprows = 13, header = None)
        df.head()

However, I am doing this for every file, which makes my code a bit bigger. I am sure there is another way where each filename can be assigned to a different variable, and that variable can be used instead of running the loop again and again.

RSM Over a year ago

Formatting in comments is not working. If there is any other link than stackoverflow.com/editing-help#comment-formatting, it would be great.

Hymns For Disco Over a year ago

@RahulMoharir You can't format multi-line code in comments. I'm not sure I understand your problem. Do you want to first store a list of filtered filenames, then re-use that list instead of iterating through the directory multiple times? You can use a site like pastebin.com to link to your current code, or just add it to your question.

RSM Over a year ago

Yes, that's what I am trying to do: use the loop once and save each file in its own variable to use later in my code.
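One possible way to do that, sketched under the assumption that the second part of each filename (Sales, Volume, ...) identifies the report: scan the directory once and keep the paths in a dictionary keyed by that part, rather than in separate variables. The function name collect_by_report and the demo filenames are invented for this example.

```python
import os
import re
import tempfile


def collect_by_report(directory):
    """Scan `directory` once and map each report name to its matching file path."""
    pattern = re.compile(r"RM_([A-Za-z]+)_.*\.xlsx")
    files = {}
    for name in sorted(os.listdir(directory)):
        match = pattern.fullmatch(name)
        if match:
            # If two files share a report name, the later one (in sorted order) wins
            files[match.group(1)] = os.path.join(directory, name)
    return files


# Demo with made-up filenames in a throwaway directory
with tempfile.TemporaryDirectory() as demo_dir:
    for name in ("RM_Sales_Japan_1.xlsx", "RM_Volume_Australia_2.xlsx", "other.csv"):
        open(os.path.join(demo_dir, name), "w").close()
    files = collect_by_report(demo_dir)
    report_names = sorted(files)
    sales_name = os.path.basename(files["Sales"])

# Later in the script, each entry can be read on its own, for example:
# df_sales = pd.read_excel(files["Sales"], skiprows=13, header=None)
print(report_names, sales_name)
```

Each dictionary value is a single path, which matters because pd.read_excel expects one file at a time; passing it a whole list is what raises the ValueError shown in the question.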
