1

This is a multi-part question. I just can't seem to combine everything together. The goal is to to create one DataFrame (guessing using MultiIndex) that I can access as follows:

ticker = 'GOLD'
date = pd.to_datetime('1978/03/31')
current_bar = df.ix[ticker].ix[date]

Can I then just say: current_bar.Last ?

Anyway, here are the files, and how I load them.

In [108]: df = pd.read_csv('GOLD.csv', parse_dates='Date', index_col='Date')
In [109]: df
Out[109]: 
            Exp       Last     Volume
Date
1978-03-30  198002    995.6    54
1978-03-31  198002    999.5    78

In [110]: df2 = pd.read_csv('SPX.csv', parse_dates='Date', index_col='Date')
In [111]: df2
Out[111]: 
            Exp       Last     Volume
Date
1978-03-30  198003    215.5    25
1978-03-31  198003    214.1    99

Ideally, I want it to look like this (I think):

ticker      GOLD                            SPX
values      Exp       Last     Volume       Exp       Last     Volume
Date
1978-03-30  198002    995.6    54           198003    215.5    25
1978-03-31  198002    999.5    78           198003    214.1    99
  1. I guess my questions are:
    • How do I make this Hierarchical (the actual data has 20+ identical columns for each file)
    • How do I then combine the files (I have about 100 that need to all go in 1 DataFrame)
    • Is my assumption correct that I can then just do: current_bar.Last to get values?

Thanks so much.

0

1 Answer 1

4

You can use pd.concat to concatenate DataFrames. (Concatenating smushes DataFrames together, while merging joins DataFrames based on common indices or columns). When you supply the keys parameter, you get a hierarchical index:

import pandas as pd
df = pd.read_csv('GOLD.csv', parse_dates='Date', index_col='Date', sep='\s+')
df2 = pd.read_csv('SPX.csv', parse_dates='Date', index_col='Date', sep='\s+')
result = pd.concat([df, df2], keys=['GOLD', 'SPX'], names=['ticker']).unstack('ticker')
result = result.reorder_levels([1, 0], axis=1).sortlevel(level=0, axis=1)
print(result)

yields

ticker        GOLD                    SPX               
               Exp   Last  Volume     Exp   Last  Volume
Date                                                    
1978-03-30  198002  995.6      54  198003  215.5      25
1978-03-31  198002  999.5      78  198003  214.1      99

result['Last'] yields the DataFrame:

In [147]: result['Last']
Out[147]: 
ticker       GOLD    SPX
Date                    
1978-03-30  995.6  215.5
1978-03-31  999.5  214.1

I'd recommend avoiding the syntax result.Last because it is too close to result.last, which returns a DataFrame method.


To handle more files, you might use code like this:

import pandas as pd
dfs = list()
for filename in filenames:
    df = pd.read_csv(filename, parse_dates='Date', index_col='Date')
    # compute moving_mean
    dfs.append(df)

keys = [filename[:-4] for filename in filenames]
result = pd.concat(dfs, keys=keys, names=['ticker']).unstack('ticker')

Note that this does require enough memory to hold a list of all the DataFrames in memory plus enough memory to hold result.

Sign up to request clarification or add additional context in comments.

5 Comments

Got it. What if I want to add another df to 'result' (ie 'NDX'). what would I use for the keys, given that result already has 2? (I just need to do this for 100 files, so need a way to do it iteratively. Thanks
The first argument to pd.concat can be a list of 100 DataFrames, and keys can be a list of 100 ticker names. For example, to concatenate 3 DataFrames: result = pd.concat([df, df2, df3], keys=['GOLD', 'SPX', 'NDX'], names=['ticker']).unstack('ticker')
Is there a way to do something like this: result = pd.concat([result, df3], keys =[result.keys, 'NDX'], names=['ticker']).unstack('ticker')?
The reason for my above question is that I need to perform some functions (moving_mean) on the files before adding them to the 'result' df. As far as I know I can't do those when in MultiIndex, correct?
I've edited the code (above) to show how you might compute the moving_mean before concatening.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.