I am trying to concatenate some time series with pd.concat. Each series uses the date as its index. For datasets where every series has the same ts.size, pd.concat works perfectly. But when the sizes differ among the series, I get the error: cannot reindex from a duplicate axis. So I assume this happens because of the difference in size. If so, should I pad the shorter series with zeros?

ts.head():

date
2017-03-09    24.6245
2017-03-10    24.5765
2017-03-13    24.5767
2017-03-14    24.5344
2017-03-15    24.5440

I have been stuck on this for a day, so any help is appreciated. Thanks. Here is the original question I posted, where you can see my code: ValueError: cannot reindex from a duplicate axis Pandas. I just want to know if this is the problem.

My Code:

import ffn
import pandas as pd

# Manager method (called as NAV.objects.get_adj_nav).
def get_adj_nav(self, fund_id):
    # Load the NAV history for one fund, indexed by valuation date.
    df_nav = read_frame(
        super(__class__, self).filter(fund__id=fund_id, nav__gt=0).exclude(fund__account_class=0).order_by(
            'valuation_period_end_date'), coerce_float=True,
        fieldnames=['income_payable', 'valuation_period_end_date', 'nav', 'outstanding_shares_par'],
        index_col='valuation_period_end_date')
    df_dvd, skip = self.get_dvd(fund_id=fund_id)
    # Join the dividends onto the NAV rows, then compute the adjusted NAV.
    df_nav_adj = calculate_adjusted_prices(
        df_nav.join(df_dvd).fillna(0).rename(columns={'payout_per_share': 'dividend'}), column='nav')
    return df_nav_adj

def json_total_return_table(request, fund_account_id):
    ts_list = []
    for fund_id in Fund.objects.get_fund_series(fund_account_id=fund_account_id):
        if NAV.objects.filter(fund__id=fund_id, income_payable__lt=0).exists():
            # One adjusted-NAV series per share class, named after the class.
            ts = NAV.objects.get_adj_nav(fund_id)['adj_nav']
            ts.name = Fund.objects.get(id=fund_id).account_class_description
            ts_list.append(ts.copy())
            print(ts)
        df_adj_nav = pd.concat(ts_list, axis=1)  # ====> Throws error
        cols_to_datetime(df_adj_nav, 'index')
        df_adj_nav = ffn.core.calc_stats(df_adj_nav.dropna()).to_csv(sep=',')
  • Can you paste some of your code? Commented Aug 25, 2017 at 15:44
  • @cᴏʟᴅsᴘᴇᴇᴅ Sure and done Commented Aug 25, 2017 at 15:45
  • @anderish I think you answered your own question. Since you're concatenating along axis=1 (adding more columns), you'll want to keep the "column" lengths the same. If that criterion isn't met, concat wouldn't know how to fill in the missing data. Another pattern that you might want to consider is merge(). Commented Aug 25, 2017 at 16:06

1 Answer

So I think I was correct when I said that it fails because of the difference in size, so I used merge instead. I simply changed this line:

df_adj_nav = pd.concat(ts_list, axis=1)

to this line (on Python 3, reduce has to be imported from functools):

df_adj_nav = reduce(lambda x, y: pd.merge(x, y, left_index=True, right_index=True, how='outer'), ts_list)
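For reference, here is a minimal, self-contained sketch of the merge approach. The two series and their names ("Class A", "Class B") are made-up stand-ins for the real adjusted-NAV series, and outer_merge is just a named version of the lambda. It assumes a pandas version in which pd.merge accepts named Series.

```python
from functools import reduce

import pandas as pd

# Hypothetical adjusted-NAV series for two share classes with different lengths.
ts_list = [
    pd.Series([24.6245, 24.5765, 24.5767],
              index=pd.to_datetime(["2017-03-09", "2017-03-10", "2017-03-13"]),
              name="Class A"),
    pd.Series([10.10, 10.20],
              index=pd.to_datetime(["2017-03-10", "2017-03-13"]),
              name="Class B"),
]


def outer_merge(left, right):
    # Align on the date index and keep every date found on either side.
    return pd.merge(left, right, left_index=True, right_index=True, how="outer")


# Fold the list pairwise into one wide frame, one column per series.
df_adj_nav = reduce(outer_merge, ts_list)
```

Dates missing from a series come out as NaN in that column. One thing to watch: if a date appears more than once in a series, an index merge returns one row per matching combination instead of raising, so the result can contain repeated dates.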

Thanks to @HodgePodge for the hint :)
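A note for anyone who lands here with the same message: as far as I can tell, pd.concat with axis=1 aligns series of different lengths without complaint as long as each index is unique, and this error usually points to a date that appears more than once in one of the series. The sketch below uses made-up data (the series names are hypothetical) to show both the alignment and a quick duplicate check. I have not confirmed that duplicates were the cause in my own data.

```python
import pandas as pd

# Hypothetical series of different lengths, each with a unique date index.
a = pd.Series([24.62, 24.58, 24.58],
              index=pd.to_datetime(["2017-03-09", "2017-03-10", "2017-03-13"]),
              name="class_a")
b = pd.Series([10.10, 10.20],
              index=pd.to_datetime(["2017-03-10", "2017-03-13"]),
              name="class_b")

# Different sizes alone are fine: concat aligns on the union of the dates
# and fills the gaps with NaN.
wide = pd.concat([a, b], axis=1)

# A series with a repeated date, which is worth detecting before concatenating.
c = pd.Series([1.0, 2.0, 3.0],
              index=pd.to_datetime(["2017-03-09", "2017-03-09", "2017-03-10"]),
              name="class_c")
has_duplicates = not c.index.is_unique
repeated_dates = c.index[c.index.duplicated(keep=False)].unique()
```

Running the same is_unique check on every entry of ts_list before the concat would show whether any series carries repeated dates.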
