
This is a two-part question. I have a folder (in the same directory as the Python script) containing many CSV files, and I want to read only specific CSV files from it into Python so that I can later merge them into one DataFrame.

  1. I do not want to read all the CSV files in the folder, only those whose names match the pattern "BANK_NIFTY_5MINs*". Can someone please help with how to do that?

The files I want to read are named as:

BANK_NIFTY_5MINs_2020-01-01.csv, BANK_NIFTY_5MINs_2020-01-02.csv, BANK_NIFTY_5MINs_2020-01-03.csv and so on.

I have tried the following code, but it does not work: the 'files' variable ends up empty. (The code works if the folder contains only the files I want to work with, but that is not what I want.)

import glob
import pandas as pd

path = 'C:/Users/User1/Desktop/Test/2020/'
files = sorted(glob.glob(path + "/BANK_NIFTY_5MINs*.csv"),reverse=True)

#Code to merge the files
df = pd.merge(
    pd.read_csv(files.pop()),
    pd.read_csv(files.pop()),
    left_index=True,
    right_index=True
)
while files:
    df = pd.merge(
        df,
        pd.read_csv(files.pop()),
        left_index=True,
        right_index=True,
        how='outer'
    )

  2. Also, is there a way to access the folder without specifying the absolute path of the directory, so that the script runs on another system without having to change the path?

Any help would be appreciated. Thank you in advance.

2
  • Maybe first use print() (and print(type(...)), print(len(...)), etc.) to see which part of the code is executed and what you really have in your variables. This is called "print debugging", and it helps you see what the code is really doing. Commented Oct 26, 2022 at 0:29
  • Maybe first check os.listdir("C:/Users/User1/Desktop/Test/2020/") to see what you really have in the folder. Maybe the files have slightly different names. Or maybe you should use r"C:\...\..." instead of "C:/.../...". BTW: when you add ".../" + "/..." you get "...//...", and this can also cause problems. Commented Oct 26, 2022 at 0:30
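
The checks suggested in these comments could be sketched roughly as follows. This is only a diagnostic aid, not a fix: the helper names `report_matches` and `print_report` are made up for illustration, and the default pattern is taken from the file names in the question.

```python
import fnmatch
import os


def report_matches(folder, pattern="BANK_NIFTY_5MINs*.csv"):
    """Return (all_names, matching_names) for the entries in `folder`.

    Hypothetical helper: it only inspects file names, it does not read the files.
    """
    all_names = sorted(os.listdir(folder))
    # fnmatch.filter applies the same shell-style wildcard rules that glob uses.
    matching = fnmatch.filter(all_names, pattern)
    return all_names, matching


def print_report(folder, pattern="BANK_NIFTY_5MINs*.csv"):
    """Print every entry in `folder` and whether it matches `pattern`."""
    all_names, matching = report_matches(folder, pattern)
    print(f"{len(all_names)} entries in {folder!r}, {len(matching)} match {pattern!r}")
    for name in all_names:
        marker = "match" if name in matching else "skip "
        # repr() makes stray spaces or unexpected characters in a name visible.
        print(f"  [{marker}] {name!r}")
```

Calling `print_report` on the folder from the question would show whether the names on disk really start with "BANK_NIFTY_5MINs" and end in ".csv", or differ in some small way (case, extra spaces, a different extension).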

1 Answer


If you want to use a relative path:

import os

currentDir = os.path.dirname(__file__) 
absolutePath = os.path.join(currentDir, "the/relative/path")

For the file name, a regex would work fine, but a simple solution like this may work better:

# List the data folder (absolutePath), not the script's own directory
for filename in os.listdir(absolutePath):
    if filename.startswith("BANK_NIFTY_5MINs") and filename.endswith(".csv"):
        # do the thing that you want here
        pass
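
Putting both parts together, here is a minimal sketch using `pathlib`, which is a swapped-in alternative to the `os` calls above rather than part of the original answer. The helper names `script_dir` and `find_csvs` are hypothetical, and the default prefix comes from the file names in the question.

```python
from pathlib import Path


def script_dir():
    """Folder containing this script.

    Falls back to the current working directory when __file__ is not
    defined (for example in an interactive session).
    """
    try:
        return Path(__file__).resolve().parent
    except NameError:
        return Path.cwd()


def find_csvs(folder, prefix="BANK_NIFTY_5MINs"):
    """Return the sorted CSV paths in `folder` whose names start with `prefix`.

    Assumes `prefix` contains no glob wildcard characters such as * or [.
    """
    # Path.glob only looks inside `folder` itself, not in subfolders.
    return sorted(Path(folder).glob(f"{prefix}*.csv"))
```

Assuming the data folder is called "2020" and sits next to the script (a guess based on the path in the question), `files = find_csvs(script_dir() / "2020")` would give the list to pass to `pd.read_csv`. Because the dates in the file names are in YYYY-MM-DD form, the sorted order is also chronological.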