Merging CSV Using pandas dataframe

Question

I am using the code below. All my CSV files have the same structure, but the resulting dataframe contains two date columns even though each CSV has only one.

For a few rows the date value is in the first date column, while for the rest of the data it is in the second date column.

Any idea why two date columns are generated from a single column in the source CSV files?

import glob
import pandas as pd

all_data = pd.DataFrame()
for f in glob.glob("/Users/tcssig/Desktop/Files/*.csv"):
    df = pd.read_csv(f)
    all_data = all_data.append(df,ignore_index=True)

In [76]: all_data.columns
Out[76]: Index(['0', '0.1', 'Channel_ID', 'Date', 'Date ', 'Duration (HH:MM)',
       'Episode #', 'Image', 'Language', 'Master House ID', 'Parental Rating',
       'Program Category', 'Program Title', 'StartTime_ET', 'StartTime_ET2',
       'Synopsis'],
      dtype='object')

Probably some of your CSV files have a Date column with a trailing space in its name.
– Anton Protopopov, Commented Sep 6, 2016 at 12:39

EdChum · Accepted Answer · 2016-09-06 12:39:34Z

5

Because the second column name has a trailing space:

'Date', 'Date '
             ^

So you need to normalise the column names before appending:

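import glob
import pandas as pd

all_data = pd.DataFrame()
for f in glob.glob("/Users/tcssig/Desktop/Files/*.csv"):
    df = pd.read_csv(f)
    # Strip leading/trailing whitespace so 'Date ' and 'Date' become one column
    df.columns = df.columns.str.strip()
    # DataFrame.append was removed in pandas 2.0; newer versions need pd.concat
    all_data = all_data.append(df,ignore_index=True)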

Here I use str.strip to remove any leading and trailing whitespace from the column names.
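
On pandas 2.0 and later, DataFrame.append no longer exists, so the same idea can be written with pd.concat. This is a minimal sketch rather than the original answer's code: the helper name merge_csvs and its pattern argument are hypothetical, chosen only for illustration.

```python
import glob

import pandas as pd


def merge_csvs(pattern):
    """Read every CSV matching `pattern` and stack them into one DataFrame.

    `merge_csvs` is a hypothetical helper name used for illustration.
    """
    frames = []
    for path in sorted(glob.glob(pattern)):
        df = pd.read_csv(path)
        # Strip leading/trailing whitespace so 'Date ' and 'Date' line up
        df.columns = df.columns.str.strip()
        frames.append(df)
    if not frames:
        # pd.concat raises ValueError on an empty list, so return an empty frame
        return pd.DataFrame()
    # A single concat at the end avoids re-copying the data on every iteration
    return pd.concat(frames, ignore_index=True)
```

With the path from the question, the call would be all_data = merge_csvs("/Users/tcssig/Desktop/Files/*.csv"). Collecting the frames in a list and concatenating once is also generally cheaper than growing a dataframe inside the loop.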

Sarang Manjrekar Over a year ago

Thanks a lot, I corrected that from some of my CSV files, and it worked.
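
For anyone who would rather locate the offending files than strip every header, a small check along these lines might help. It is only a sketch: find_untrimmed_headers is a hypothetical helper name, and it assumes the first row of each CSV is the header.

```python
import glob

import pandas as pd


def find_untrimmed_headers(pattern):
    """Map each CSV path to the header names that carry stray whitespace.

    `find_untrimmed_headers` is a hypothetical helper name.
    """
    offenders = {}
    for path in sorted(glob.glob(pattern)):
        # nrows=0 parses only the header row, so large files stay cheap
        header = pd.read_csv(path, nrows=0).columns
        bad = [name for name in header if name != name.strip()]
        if bad:
            offenders[path] = bad
    return offenders
```

Printing the returned dictionary shows each affected file next to its quoted column names, so a trailing space such as the one in 'Date ' is visible.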
