pandas DataFrame print index value only once

Question

import pandas as pd
li = [{"employee_id":1,"project_handled": "pas"},{"employee_id":1,"project_handled": "asap"},{"employee_id":2,"project_handled": "trimm"},{"employee_id":2,"project_handled": "fat"}]
df = pd.DataFrame(li)
df.set_index("employee_id",inplace=True)
print(df)

gives:

            project_handled
employee_id                
1                       pas
1                      asap
2                     trimm
2                       fat

What I want is for repeated index values to be printed only once:

            project_handled
employee_id                
1                       pas
                       asap
2                     trimm
                        fat

I want to serialise this and share it as an Excel file using the DataFrame.to_excel API. The requirement is that index values must not repeat in the employee_id column.

zipa · Accepted Answer · 2018-04-13 09:42:33Z

You need to set a MultiIndex:

import pandas as pd
li = [{"employee_id":1,"project_handled": "pas"},{"employee_id":1,"project_handled": "asap"},{"employee_id":2,"project_handled": "trimm"},{"employee_id":2,"project_handled": "fat"}]
df = pd.DataFrame(li)
df['Something'] = 1
df.set_index(["employee_id", "project_handled"],inplace=True)
print(df)

I've added the Something column because otherwise both columns move into the index, no data columns are left, and you'd get:

Empty DataFrame
Columns: []
Index: [(1, pas), (1, asap), (2, trimm), (2, fat)]

EDIT

To get the same display without putting project_handled in the index, you'd need an empty column as the second level of the MultiIndex:

df["another"] = ""
df.set_index(["employee_id", "another"],inplace=True)
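
A minimal sketch of the edit above put together end to end, assuming the same toy data as in the question. With a MultiIndex, pandas hides repeated outer-level labels when printing (controlled by the display.multi_sparse option, which is on by default), and DataFrame.to_excel merges them into one cell by default (merge_cells=True). The file name employees.xlsx is made up for illustration, and the to_excel call is left commented out because it needs an Excel writer such as openpyxl.

```python
import pandas as pd

li = [
    {"employee_id": 1, "project_handled": "pas"},
    {"employee_id": 1, "project_handled": "asap"},
    {"employee_id": 2, "project_handled": "trimm"},
    {"employee_id": 2, "project_handled": "fat"},
]
df = pd.DataFrame(li)

# Add an empty helper column and use it as the second index level,
# so project_handled stays a regular data column.
df["another"] = ""
df = df.set_index(["employee_id", "another"])

# Repeated employee_id labels are left blank in the text output.
text = df.to_string()
print(text)

# Hypothetical file name; repeated outer labels become merged cells.
# df.to_excel("employees.xlsx", merge_cells=True)
```

The helper level shows up as an extra, empty index column in the output, which is the price of keeping project_handled out of the index.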

edited Apr 13, 2018 at 9:42

answered Apr 13, 2018 at 9:33

zipa


2 Comments

claudius Over a year ago

But aren't you abusing multi index? I don't want an index on project handled... I just want the first index to be printed only once.... The data I have given here is toy data... my use case is more complex....

zipa Over a year ago

If you want to see the data like this, you will have to go with MultiIndex. Another way without involving project_handled is in the edit.

0x51ba · Accepted Answer · 2018-04-14 07:50:33Z

If your only goal is to write the DataFrame to a CSV file in the required layout, and you don't need a single merged cell for each employee_id value, you can do something like this:

import pandas as pd

li = [{"employee_id":1,"project_handled": "pas"},{"employee_id":1,"project_handled": "asap"},{"employee_id":2,"project_handled": "trimm"},{"employee_id":2,"project_handled": "fat"}]
df = pd.DataFrame(li)

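def custom_func(s):
    # Work on a copy of the group's ids and blank out all but the first one.
    # Assigning through .iloc on the copy avoids chained assignment,
    # which pandas may not write back to the original data.
    s = s.copy()
    s.iloc[1:] = ''
    return s

# Convert the ids to strings so the column can hold '' alongside them.
df['employee_id'] = df['employee_id'].apply(str)
df['employee_id'] = df.groupby('employee_id')['employee_id'].transform(custom_func)
df = df.set_index('employee_id')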
print(df)

Output:

            project_handled
employee_id
1                       pas
                       asap
2                     trimm
                        fat

The result of df.to_csv('test.csv') then has each employee_id written only once, with an empty first field on the repeated rows.
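
As a rough sketch of the same idea without groupby, the repeats can be blanked with Series.duplicated and Series.mask. This is an alternative technique, not the code from the answer above, and it assumes the same toy data. The CSV text is built in memory here rather than written to test.csv so it can be inspected directly.

```python
import pandas as pd

li = [
    {"employee_id": 1, "project_handled": "pas"},
    {"employee_id": 1, "project_handled": "asap"},
    {"employee_id": 2, "project_handled": "trimm"},
    {"employee_id": 2, "project_handled": "fat"},
]
df = pd.DataFrame(li)

# True for every row whose employee_id already appeared in an earlier row.
repeated = df["employee_id"].duplicated()

# Convert to strings so the column can hold '' next to the ids,
# then replace the repeated ids with empty strings.
df["employee_id"] = df["employee_id"].astype(str).mask(repeated, "")

out = df.set_index("employee_id")

# With no path given, to_csv returns the CSV text as a string.
csv_text = out.to_csv()
print(csv_text)
```

Note that duplicated() flags any earlier occurrence, not only the row directly above, so this blanks an id even when its rows are not adjacent. If only consecutive repeats should be blanked, comparing the column with its own shift() would be one way to build the mask instead.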
