Splitting dataframe python

Question

I have a relatively large (9 MB) JSON file. It's a list of dicts (I don't know if that's the convention for JSON). Anyway, I've been able to read it in and turn it into a data frame.

The data is a backtest for a predictive model and has the following format:

[{"assetname":"xxx", "return":0.9, "timestamp":1451080800},{"assetname":"xxx", "return":0.9, "timestamp":1451080800}...{"assetname":"yyy", "return":0.9, "timestamp":1451080800},{"assetname":"yyy", "return":0.9, "timestamp":1451080800}]

I would like to separate all the assets into their own data frames. Can anyone help?

Here's the data, by the way: http://www.mediafire.com/view/957et8za5wv56ba/test_predictions.json

What is your expected output? You could do it with pandas.Series, using [pd.Series(x) for x in l], where l is your list of dicts.
– Anton Protopopov, Commented Jan 26, 2016 at 11:10

George Petrov · Accepted Answer · 2016-01-26 10:37:10Z

1

Just put your data into a DataFrame:

import pandas as pd

df = pd.DataFrame([{"assetname":"xxx", 'return':0.9, "timestamp":1451080800},
                   {"assetname":"xxx", 'return':0.9, "timestamp":1451080800}, 
                   {"assetname":"yyy", 'return':0.9, "timestamp":1451080800},
                   {"assetname":"yyy", 'return':0.9, "timestamp":1451080800}])
print(df)

Output:

  assetname  return   timestamp
0       xxx     0.9  1451080800
1       xxx     0.9  1451080800
2       yyy     0.9  1451080800
3       yyy     0.9  1451080800
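
If the data starts out as JSON text rather than a Python literal, the same constructor can be fed the parsed list. The following is only a minimal sketch on toy data shaped like the sample in the question; the variable names `raw` and `records` are made up for illustration and the real file is not used here.

```python
import json

import pandas as pd

# Toy JSON text in the shape described in the question (not the real file).
raw = """[
  {"assetname": "xxx", "return": 0.9, "timestamp": 1451080800},
  {"assetname": "xxx", "return": 0.9, "timestamp": 1451080800},
  {"assetname": "yyy", "return": 0.9, "timestamp": 1451080800},
  {"assetname": "yyy", "return": 0.9, "timestamp": 1451080800}
]"""

records = json.loads(raw)    # parse the text into a Python list of dicts
df = pd.DataFrame(records)   # one row per dict, one column per key
print(df)
```

Note that this only builds the single combined frame; splitting it per asset still needs a filter or a groupby on the asset column.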

rapto · Accepted Answer · 2016-01-26 12:15:54Z

You can load a DataFrame from a JSON file like this:

In [9]: from pandas.io.json import read_json

In [10]: d = read_json('Descargas/test_predictions.json')

In [11]: d.head()
Out[11]: 
  market_trading_pair  next_future_timestep_return  ohlcv_start_date  \
0    Poloniex_ETH_BTC                     0.003013        1450753200   
1    Poloniex_ETH_BTC                    -0.006521        1450756800   
2    Poloniex_ETH_BTC                     0.003171        1450760400   
3    Poloniex_ETH_BTC                    -0.003083        1450764000   
4    Poloniex_ETH_BTC                    -0.001382        1450767600   

   prediction_at_ohlcv_end_date  
0                     -0.157053  
1                     -0.920074  
2                      0.999806  
3                      0.627140  
4                      0.999857

You may split it like this:

Poloniex_ETH_BTC = d[d['market_trading_pair'] == 'Poloniex_ETH_BTC']
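
As a self-contained illustration of the load-then-filter idea, here is a rough sketch that uses the top-level `pd.read_json` on an in-memory buffer instead of the file. The column names are taken from the output above; the second pair name `Poloniex_XMR_BTC` and its values are invented purely for the example.

```python
from io import StringIO

import pandas as pd

# Toy stand-in for test_predictions.json. The XMR row is hypothetical.
text = """[
  {"market_trading_pair": "Poloniex_ETH_BTC", "next_future_timestep_return": 0.003013, "ohlcv_start_date": 1450753200},
  {"market_trading_pair": "Poloniex_ETH_BTC", "next_future_timestep_return": -0.006521, "ohlcv_start_date": 1450756800},
  {"market_trading_pair": "Poloniex_XMR_BTC", "next_future_timestep_return": 0.001, "ohlcv_start_date": 1450753200}
]"""

d = pd.read_json(StringIO(text))   # accepts a path or a file-like object

# Boolean Series with one True/False entry per row.
mask = d["market_trading_pair"] == "Poloniex_ETH_BTC"

eth_btc = d[mask].copy()   # rows for one pair; .copy() gives an independent frame
others = d[~mask]          # every remaining row
print(eth_btc)
```

This works well for pulling out one or two known pairs. With many pairs, writing one filter per name gets repetitive, and a groupby is usually the more convenient route.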

majr · Accepted Answer · 2016-01-27 09:52:52Z

0

Extending rapto's answer, you can split the whole dataframe by the value of one column like this:

df_dict = dict()
for name,df in d.groupby('market_trading_pair'):
    df_dict[name]=df
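
The same loop can be written as a dict comprehension. This is just a sketch on a toy frame, assuming the column names from rapto's output; `Poloniex_XMR_BTC` and its values are hypothetical, and the `reset_index(drop=True)` call is an optional extra that renumbers each piece from 0.

```python
import pandas as pd

# Toy frame; the XMR row is made up for illustration.
d = pd.DataFrame({
    "market_trading_pair": ["Poloniex_ETH_BTC", "Poloniex_ETH_BTC", "Poloniex_XMR_BTC"],
    "next_future_timestep_return": [0.003013, -0.006521, 0.001],
    "ohlcv_start_date": [1450753200, 1450756800, 1450753200],
})

# groupby yields (key, sub-frame) pairs, one per distinct value of the column.
df_dict = {
    name: group.reset_index(drop=True)
    for name, group in d.groupby("market_trading_pair")
}

for name, frame in df_dict.items():
    print(name, len(frame))
```

Each asset's frame is then available by name, for example `df_dict["Poloniex_ETH_BTC"]`.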
