append multiple pandas data frames to single csv, but only include header on first append

Question

I need to create a .csv file and append subsets of multiple dataframes into it.

All the dataframes are structured identically, however I need to create the output data set with headers, and then append all the subsequent data frames without headers.

I know I could just create the output file using the headers from the first data frame and then do an append loop with no headers from there, but I'd really like to learn how to do this in a more efficient way.

import glob

import numpy as np
import pandas as pd

path = '/Desktop/NYC TAXI/Green/*.csv'
allFiles = glob.glob(path)

for file in allFiles:
    df = pd.read_csv(file, skiprows=[1,2], usecols=np.arange(20))
    metsdf = df.loc[df['Stadium_Code'] == 2]
    yankdf = df.loc[df['Stadium_Code'] == 1]
    with open('greenyankeetaxi.csv','a') as yankeetaxi:
        yankdf.to_csv(yankeetaxi,header=False)
    with open('greenmetstaxi.csv','a') as metstaxi:
        metsdf.to_csv(metstaxi,header=False)
    print(file + " done")

Ikram Ul Haq · Accepted Answer · 2021-10-13 10:41:18Z

3

An efficient way to append multiple subsets of a dataframe to one large file, with only a single header, is the following:

    import os

    for df in dataframes:
        if not os.path.isfile(filename):
            # first write: create the file and include the header row
            df.to_csv(filename, header=True, index=False)
        else:  # the file already exists, so append without writing the header
            df.to_csv(filename, mode='a', header=False, index=False)

In the code above, the file is written with a header the first time. On every later pass the file already exists, so the dataframe is appended without the header.

You can use this code in any scenario where you need to append multiple dataframes to the same file without repeating the header.
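One thing to watch with the existence check: if the output file is left over from an earlier run, new rows are appended to the old data and no header is written. A minimal sketch of one way around that, applied to the two output files from the question, is to remember which paths have already been written in the current run. The helper `append_csv`, the `started` set, and the sample frames are hypothetical names and data introduced only for illustration.

```python
import os
import tempfile

import pandas as pd


def append_csv(df, path, started):
    """Append df to path, writing the header only on the first call per path."""
    first = path not in started
    # mode="w" on the first call overwrites any file left from an earlier run
    df.to_csv(path, mode="w" if first else "a", header=first, index=False)
    started.add(path)


# Hypothetical stand-ins for the frames read from each input file.
chunks = [
    pd.DataFrame({"Stadium_Code": [1, 2, 1], "fare": [10.0, 7.5, 12.0]}),
    pd.DataFrame({"Stadium_Code": [2, 1], "fare": [9.0, 8.0]}),
]

out_dir = tempfile.mkdtemp()
yankee_path = os.path.join(out_dir, "greenyankeetaxi.csv")
mets_path = os.path.join(out_dir, "greenmetstaxi.csv")

started = set()
for df in chunks:
    append_csv(df.loc[df["Stadium_Code"] == 1], yankee_path, started)
    append_csv(df.loc[df["Stadium_Code"] == 2], mets_path, started)

# Reading the files back shows one header row followed by all appended rows.
yankee = pd.read_csv(yankee_path)
mets = pd.read_csv(mets_path)
```

Passing file paths to `to_csv` with `mode='a'` also avoids having to open the file handles by hand, as the question's loop does.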

Leb · Accepted Answer · 2015-11-17 23:24:29Z

2

To do it efficiently, you can use one of the methods described in the pandas "Merge, join, and concatenate" documentation so that you end up with two complete dataframes (yankdf and metsdf), then write each one to CSV using to_csv as you have been doing.

Current data

Here we have two dataframes, one from each file:

First dataframe df

   a  b  c
0  1  2  3
1  4  5  6

Second dataframe df2

   a   b   c
0  7   6   8
1  9  10  11

Using append

df = df.append(df2)  # DataFrame.append was removed in pandas 2.0; use pd.concat([df, df2]) there

The above line will result in a single df, which can be written to file:

   a   b   c
0  1   2   3
1  4   5   6
0  7   6   8
1  9  10  11
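For readers on pandas 2.0 or later, where `DataFrame.append` is no longer available, the table above can be reproduced with `pd.concat`. This is a small sketch using the two sample frames from this answer; the variable names `combined` and `renumbered` are chosen here for illustration.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 4], "b": [2, 5], "c": [3, 6]})
df2 = pd.DataFrame({"a": [7, 9], "b": [6, 10], "c": [8, 11]})

# pd.concat keeps each frame's original index (0, 1, 0, 1), as append did.
combined = pd.concat([df, df2])
print(combined)

# Pass ignore_index=True to get a fresh 0..n-1 index instead.
renumbered = pd.concat([df, df2], ignore_index=True)
```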

In short:

Loop through files in directory
Add data to dataframe using append instead of re-assigning every time
Write a single dataframe to file
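The three steps above can be sketched as follows for the question's two outputs. This is only an illustration under some assumptions: the in-memory `sources` stand in for the CSV files on disk, the `fare` column is invented sample data, and the per-file subsets are collected in lists and joined with a single `pd.concat` per output rather than appended one by one.

```python
import io

import pandas as pd

# Hypothetical in-memory stand-ins for the CSV files on disk.
sources = [
    io.StringIO("Stadium_Code,fare\n1,10.0\n2,7.5\n"),
    io.StringIO("Stadium_Code,fare\n1,8.0\n2,9.0\n"),
]

yank_parts, mets_parts = [], []
for src in sources:
    df = pd.read_csv(src)
    yank_parts.append(df.loc[df["Stadium_Code"] == 1])
    mets_parts.append(df.loc[df["Stadium_Code"] == 2])

# One concat per output instead of growing a frame inside the loop.
yankdf = pd.concat(yank_parts, ignore_index=True)
metsdf = pd.concat(mets_parts, ignore_index=True)

# A single to_csv call per output writes the header exactly once.
yank_csv = yankdf.to_csv(index=False)
mets_csv = metsdf.to_csv(index=False)
```

With real files, `to_csv` would be given an output path instead of returning a string. Note that this approach holds every subset in memory until the final write, which may matter when the inputs run to millions of lines each.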

5 Comments

Ben Price Over a year ago

That definitely helps efficiency wise, but I was more stuck on how to import the headers from the first iteration of the loop and then only the data from there on

Leb Over a year ago

Having one dataframe will take care of that for you. The goal is to minimize the loops.

Ben Price Over a year ago

sorry... maybe I'm confused, but I'm sorta new to python... The directory I'm looking in has about 20 files in it, so the loop has to happen for each of those larger files, both of which create two unique data frames (mets and yankee). So instead of having 40 writes, there would be 20, but I still think I would run into the issue of the headers.

Leb Over a year ago

No worries, you're only looping to read the files then as you're reading you append to dataframe. Once all the files are done being read and you have a single df, then write to csv without loops. How big are all the files combined?

Ben Price Over a year ago

oh, I thought it was performing all of those steps for each loop... so would recreate a new mets and yankees data frame every loop and overwrite what I had. I'll try implementing what you said and see what happens. The files are about 1.5M-6M lines each
