How to create a MultiIndex object from a Series?

Question

I have a Series 'rpt_date':

>>> rpt_date
STK_ID
000002    [u'20060331', u'20060630']
000005    [u'20061231', u'20070331', u'20070630']
>>> type(rpt_date)
<class 'pandas.core.series.Series'>
>>>

How can I create a MultiIndex object (pandas.core.index.MultiIndex) from it with a call like:

'my_index = gen_index_by_series(rpt_date)'

where 'my_index' should look like:

>>> my_index
MultiIndex
[('000002', '20060331') ('000002', '20060630') ('000005', '20061231')
 ('000005', '20070331') ('000005', '20070630')]
>>> type(my_index)
<class 'pandas.core.index.MultiIndex'>
>>>

So how do I write 'gen_index_by_series(series)'?

Bakuriu · Accepted Answer · 2012-09-15 11:27:02Z


To pair each key with every element of its list, you can use itertools.repeat and zip, like this:

>>> import itertools as it
>>> L = [['000002', [u'20060331', u'20060630']],
...      ['000005', [u'20061231', u'20070331', u'20070630']]]
>>> # list(...) is needed on Python 3, where zip returns a lazy iterator
>>> couples = [list(zip(it.repeat(key), rest)) for key, rest in L]
>>> couples
[[('000002', u'20060331'), ('000002', u'20060630')],
[('000005', u'20061231'), ('000005', u'20070331'), ('000005', u'20070630')]]

It shouldn't be too hard to obtain a list like L from the Series object.

To create a MultiIndex, I believe you have to use the from_tuples method:

MultiIndex.from_tuples(sum(couples, []), names=('first', 'second'))

Since I'm not a pandas user I can't help much with the remaining tasks, even though they are probably easy. It's a matter of iterating over the Series in the correct way.
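As a rough sketch of how the pieces above might fit together, the function below iterates over the Series and builds the tuples directly. It assumes every value in the Series is a list of date strings; the level names 'STK_ID' and 'RPT_Date' are borrowed from the question and comments and are only placeholders.

```python
import pandas as pd


def gen_index_by_series(series, names=("STK_ID", "RPT_Date")):
    """Build a MultiIndex pairing each index label with each item of its list.

    Assumes every value in `series` is an iterable (e.g. a list of dates).
    The default level names are illustrative, not required by pandas.
    """
    tuples = [
        (key, value)
        for key, values in series.items()  # iterate (label, list) pairs
        for value in values                # one tuple per list element
    ]
    return pd.MultiIndex.from_tuples(tuples, names=names)


# Sample data mirroring the question
rpt_date = pd.Series(
    [["20060331", "20060630"], ["20061231", "20070331", "20070630"]],
    index=pd.Index(["000002", "000005"], name="STK_ID"),
)
my_index = gen_index_by_series(rpt_date)
print(my_index)
```

This still loops in Python, so it is not vectorized; it only removes the intermediate list L and the sum(couples, []) flattening step.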


3 Comments

bigbug Over a year ago

I tried it and it works, thanks. But it is not fast, and not 'vectorized'. Is there a more vectorized method, or some pandas magic function?

Wes McKinney Over a year ago

The original way the data is stored (as a Series of lists) is very inefficient (since getting the data out requires expensive iteration / unboxing of the values). Is there any way for you to change it?

bigbug Over a year ago

'rpt_date' comes from 'rpt_date = ori_rpt.groupby('STK_ID').RPT_Date.apply(makeup_rpt_date_list)'. I have an 'ori_rpt' DataFrame containing cumulative financial report data in which some report dates are missing; 'makeup_rpt_date_list' fills in the date list accordingly. I then build a multilevel index object 'full_rpt_idx' and call 'rpt = ori_rpt.reindex(index = full_rpt_idx)' to add the missing rows (although all their columns are NaN). Then I can safely use 'rpt - rpt.shift(1)' to get the quarterly data. Yes, the above is sluggish, but I can't find a more pandas-like way to improve it.
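For readers on a newer pandas (0.25 or later, where Series.explode is available), one possible way to avoid the Python-level loop is sketched below. The 'ori_rpt' frame and its 'revenue' column are made-up sample data, and the assumption that 'ori_rpt' is indexed by (STK_ID, RPT_Date) is mine, not something stated in the thread.

```python
import pandas as pd

# Series of lists, as in the question
rpt_date = pd.Series(
    [["20060331", "20060630"], ["20061231", "20070331", "20070630"]],
    index=pd.Index(["000002", "000005"], name="STK_ID"),
)

# explode() yields one row per list element and repeats the index label
exploded = rpt_date.explode()
full_rpt_idx = pd.MultiIndex.from_arrays(
    [exploded.index, exploded.values],
    names=["STK_ID", "RPT_Date"],
)

# Hypothetical report data with one missing date: ('000005', '20070331')
ori_rpt = pd.DataFrame(
    {"revenue": [10.0, 25.0, 5.0, 30.0]},
    index=pd.MultiIndex.from_tuples(
        [
            ("000002", "20060331"),
            ("000002", "20060630"),
            ("000005", "20061231"),
            ("000005", "20070630"),
        ],
        names=["STK_ID", "RPT_Date"],
    ),
)

# Reindexing inserts the missing row, with NaN in every column
rpt = ori_rpt.reindex(full_rpt_idx)
print(rpt)
```

Whether this is actually faster depends on the data; as noted above, storing lists inside a Series is itself the costly part.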
