0

Here is the example

    name        year     date      start opp
0   A.J. Price  2015     2014-12-02 No  MIL
1   A.J. Price  2015     2014-12-04 No  NYK
2   A.J. Price  2015     2014-12-05 No  TOR
3   A.J. Price  2015     2014-12-08 No  BRK
4   A.J. Price  2015     2014-12-09 No  TOR
318 Aaron       2015     2014-12-15 No  ATL
319 Aaron       2015     2014-12-18 No  NYK
320 Aaron       2015     2014-12-19 No  MEM

How to make the data frame above into something hierarchy like below

0   A.J. Price  2015     2014-12-02 No  MIL
                2015     2014-12-04 No  NYK
                2015     2014-12-05 No  TOR 
                2015     2014-12-08 No  BRK
                2015     2014-12-09 No  TOR  
318 Aaron       2015     2014-12-15 No  ATL
                2015     2014-12-18 No  NYK
                2015     2014-12-19 No  MEM
1
  • 1
    IIUC then df.set_index('name', append=True, inplace=True) should work, see the docs Commented Apr 8, 2016 at 16:55

1 Answer 1

1

With the help given by EdChum here is how it can be done:

In [11]: df
Out[11]:
   name         year       date        start  opp
0  A.J. Price      2015    2014-12-02   No    MIL
1  A.J. Price      2015    2014-12-04   No    NYK
2  A.J. Price      2015    2014-12-05   No    TOR
3  A.J. Price      2015    2014-12-08   No    BRK
4  A.J. Price      2015    2014-12-09   No    TOR
5  Aaron           2015    2014-12-15   No    ATL
6  Aaron           2015    2014-12-18   No    NYK
7  Aaron           2015    2014-12-19   No    MEM

In [12]: df.set_index('name',inplace=True)
In [13]: df.set_index('year',append=True, inplace=True)

In [14]: df
Out[14]:
                          date  start  opp
name        year
A.J. Price  2015    2014-12-02   No    MIL
            2015    2014-12-04   No    NYK
            2015    2014-12-05   No    TOR
            2015    2014-12-08   No    BRK
            2015    2014-12-09   No    TOR
Aaron       2015    2014-12-15   No    ATL
            2015    2014-12-18   No    NYK
            2015    2014-12-19   No    MEM

Update:

While writing out multi-index tables (or pivot_tables) to_csv will write the hierarchical index of each row, resulting in data as follows:

enter image description here

However using to_excel will write the output like this by merging the rows of hierarchical index:

enter image description here

So if the concern is how to get back the multi-index while reading the csv back, use the index_col argument of the function:

pd.read_csv('input.csv', index_col=[0,1])

Here is another link that will help you to write the csv the way you wanted.

Sign up to request clarification or add additional context in comments.

1 Comment

This works in the console (print df). But when I output this dataframe to csv file (df.to_csv), all the repeated names are still there in csv.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.