I am walking through a directory tree. In each subdirectory, when I come across a file called portfolio-insts-summary.csv, I read its contents into a DataFrame called df. I then append df to another DataFrame called df_final. Once the code has finished walking the directory tree, df_final is saved to a file called final.csv.

I have printed the head of each df and they all contain data. However, when I write df_final to final.csv, the file is created but is empty. What have I done wrong, and why is final.csv empty even though df has data from each file?

The code is below:

# -*- coding: utf-8 -*-
"""
Created on Wed Jul 18 22:30:05 2018

@author: stacey
"""


import pandas as pd
import os

from pandas.tseries.offsets import BDay

def main():        

    folder = '/home/stacey/work/jp_aus_bk_tests/port_100k/'

    df_final = pd.DataFrame()

    for dirname, dirs, files in os.walk(folder):
        for filename in files:
            filename_without_extension, extension = os.path.splitext(filename)
            if filename_without_extension == 'portfolio-insts-summary':

                df = pd.read_csv(dirname + '/' +filename)

                df_final.append(df)

    df_final.to_csv('final.csv', index=False)



if __name__ == "__main__":

    print ("Processing_Results...17/07/18")


    try:

        main()



    except KeyboardInterrupt:

        print ("Ctrl+C pressed. Stopping...")  
  • Didn't my answer resolve your problem? Commented Nov 9, 2018 at 11:46

2 Answers

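This happens because DataFrame.append returns a new DataFrame instead of modifying the original in place, so the result of df_final.append(df) is discarded on every iteration and df_final stays empty. If you change your code to df_final = df_final.append(df) it should work as expected. Note that DataFrame.append was deprecated in pandas 1.4 and removed in pandas 2.0; on current versions, use pd.concat instead.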
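The difference between discarding the returned frame and rebinding it can be seen in a small sketch. It uses pd.concat, which also returns a new object and runs on pandas versions where DataFrame.append no longer exists. The column names inst and pnl are made up for illustration and are not taken from the question's files.

```python
import pandas as pd

# Hypothetical data; the real columns in portfolio-insts-summary.csv are unknown.
df_final = pd.DataFrame({"inst": ["A"], "pnl": [1.0]})
df = pd.DataFrame({"inst": ["B", "C"], "pnl": [2.0, 3.0]})

# Return value discarded: df_final is left exactly as it was.
pd.concat([df_final, df])
unchanged_len = len(df_final)

# Return value rebound to the name: df_final now holds all rows.
df_final = pd.concat([df_final, df], ignore_index=True)
rebound_len = len(df_final)

print(unchanged_len, rebound_len)  # prints: 1 3
```

The same rebinding rule applies to df_final.append(df) on older pandas versions.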

1 Comment

Yes that is why I didn't spot that one.

You can collect all the frames in a list and combine them once with pd.concat. This also avoids copying the accumulated data on every iteration. I also had to call main() before the final print.

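import os

import pandas as pd

def main():

    folder = './dir/'

    # Collect one DataFrame per matching file, then combine them once at the end.
    frames = []

    for dirname, dirs, files in os.walk(folder):
        for filename in files:
            filename_without_extension, extension = os.path.splitext(filename)
            if filename_without_extension == 'portfolio-insts-summary':

                # os.path.join builds the path portably instead of hard-coding '/'.
                df = pd.read_csv(os.path.join(dirname, filename))
                frames.append(df)

    # pd.concat raises ValueError on an empty list, so guard against no matches.
    if not frames:
        print("No portfolio-insts-summary files found under", folder)
        return

    final = pd.concat(frames, ignore_index=True)
    final.to_csv('final.csv', index=False)


if __name__ == "__main__":
    main()
    print("Processing_Results...17/07/18")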
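To try the list-then-concat approach without touching real data, here is a minimal sketch that builds a throwaway directory tree and combines the files found in it. The helper name collect_summaries, the subdirectory names, and the inst/pnl columns are hypothetical. Unlike the code above, it uses pathlib's rglob and matches the exact file name portfolio-insts-summary.csv rather than any extension.

```python
import tempfile
from pathlib import Path

import pandas as pd


def collect_summaries(root, target="portfolio-insts-summary.csv"):
    """Read every file named `target` under `root` and stack the rows."""
    # sorted() makes the row order deterministic across platforms.
    paths = sorted(Path(root).rglob(target))
    frames = [pd.read_csv(path) for path in paths]
    if not frames:
        # pd.concat would raise ValueError on an empty list.
        return pd.DataFrame()
    return pd.concat(frames, ignore_index=True)


# Demo on a temporary tree with made-up contents.
with tempfile.TemporaryDirectory() as tmp:
    for sub, value in (("run_a", 1), ("run_b", 2)):
        subdir = Path(tmp) / sub
        subdir.mkdir()
        pd.DataFrame({"inst": [sub], "pnl": [value]}).to_csv(
            subdir / "portfolio-insts-summary.csv", index=False
        )
    combined = collect_summaries(tmp)

print(len(combined))  # prints: 2
```

Returning an empty DataFrame when nothing matches is one possible design choice; raising an error instead may be preferable if a missing file should never go unnoticed.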
