
I have a list of 200 or so files in a folder. Each has the same number of columns, but there can be some variation in the naming. For instance, I can have Global ID or Global id or Global Id. Is there a way to control for case in pandas column names so that it doesn't matter what it equals? Currently it gets through the first 15 or so files out of 200 and then errors because it doesn't find Global ID.

Caveat that I'm a beginner and still learning.

import pandas as pd
import glob

with open('test99.txt', 'a') as out:
    # Raw string so the backslashes in the Windows path are not treated as escape sequences
    list_of_files = glob.glob(r'M:\AD HOC Docs\Client\Blinded\*')
    for file_name in list_of_files:
        df = pd.read_table(file_name, low_memory=False)
        # Take the third underscore-separated piece of the file path as the client name
        df['Client'] = file_name.split("_")[2].strip()
        Final = df[['Client', 'ClientID', 'Global ID', 'Internal ID', 'campaign type', 'engagement type', 'file_name']]
        Final.to_csv(out, index=False)
  • Have you tried looping through and renaming the columns? Commented Oct 11, 2016 at 19:30
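
A minimal sketch of what that comment suggests, assuming the only variation in the headers is case and stray whitespace (the lower-cased names below mirror columns from the question):

import pandas as pd

df = pd.read_table(file_name, low_memory=False)  # file_name comes from the loop in the question
# Normalise the headers so 'Global ID', 'Global id' and 'Global Id' all become 'global id'
df.columns = df.columns.str.strip().str.lower()
# Refer to the lower-cased names from here on
Final = df[['clientid', 'global id', 'internal id']]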

1 Answer


Use header=None, names=[list of column names you want to use] as additional arguments to read_table to ignore the header row and to get consistent names.
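
A minimal sketch of that call, assuming the columns appear in the same order in every file (names= assigns names by position) and adding skiprows=1 so the file's own header row is not read in as a data row; the column list is taken from the question:

import pandas as pd

# The names every file should end up with, regardless of how its own header is cased
cols = ['ClientID', 'Global ID', 'Internal ID', 'campaign type', 'engagement type', 'file_name']

df = pd.read_table(file_name,          # file_name comes from the loop in the question
                   low_memory=False,
                   header=None,        # do not take column names from the file
                   names=cols,         # use these names instead
                   skiprows=1)         # skip the file's own header row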


2 Comments

OK, I can try that. What would happen if, for example, I have 1 file with an extra column by mistake?
Pass the same list to the usecols argument as well.
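
Following that suggestion, the only change to the call sketched above is one extra argument (same assumptions as before); how well this copes with a mis-ordered file still depends on the data, since with header=None the names are assigned by position:

df = pd.read_table(file_name,
                   low_memory=False,
                   header=None,
                   names=cols,
                   usecols=cols,       # restrict the result to the columns named in cols
                   skiprows=1)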
