import pandas as pd
li = [{"employee_id":1,"project_handled": "pas"},{"employee_id":1,"project_handled": "asap"},{"employee_id":2,"project_handled": "trimm"},{"employee_id":2,"project_handled": "fat"}]
df = pd.DataFrame(li)
df.set_index("employee_id",inplace=True)
print(df)

gives:

            project_handled
employee_id                
1                       pas
1                      asap
2                     trimm
2                       fat

What I want is for repeated index values to be printed only once:

            project_handled
employee_id                
1                       pas
                       asap
2                     trimm
                        fat

I want to serialise this and share it as an Excel file using the DataFrame.to_excel API. The requirement is that an index value must not repeat in the employee_id column.

2 Answers

You need to set a MultiIndex:

import pandas as pd
li = [{"employee_id":1,"project_handled": "pas"},{"employee_id":1,"project_handled": "asap"},{"employee_id":2,"project_handled": "trimm"},{"employee_id":2,"project_handled": "fat"}]
df = pd.DataFrame(li)
df['Something'] = 1
df.set_index(["employee_id", "project_handled"],inplace=True)
print(df)

I've added the Something column because otherwise both columns would move into the index, no data columns would be left, and you'd get:

Empty DataFrame
Columns: []
Index: [(1, pas), (1, asap), (2, trimm), (2, fat)]

EDIT

To create it without project_handled in the index, you'd need an empty column as the second MultiIndex level:

df["another"] = ""
df.set_index(["employee_id", "another"],inplace=True)
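As a variant of the same idea, here is a minimal sketch that uses a per-employee row counter as the second index level instead of an empty string, which keeps the MultiIndex unique. The seq column name is an assumption made for illustration and is not part of the original data. The to_excel call is left commented out because it needs an Excel engine such as openpyxl to be installed.

```python
import pandas as pd

li = [
    {"employee_id": 1, "project_handled": "pas"},
    {"employee_id": 1, "project_handled": "asap"},
    {"employee_id": 2, "project_handled": "trimm"},
    {"employee_id": 2, "project_handled": "fat"},
]
df = pd.DataFrame(li)

# Hypothetical helper level: position of each row within its employee_id group.
df["seq"] = df.groupby("employee_id").cumcount()
indexed = df.set_index(["employee_id", "seq"])

# Repeated outer-level labels are hidden when the frame is printed.
print(indexed)

# to_excel writes MultiIndex rows as merged cells by default (merge_cells=True).
# indexed.to_excel("projects.xlsx")
```

The extra seq level does appear as its own column in the output, so this only fits cases where a row counter next to employee_id is acceptable.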

2 Comments

But aren't you abusing MultiIndex? I don't want an index on project_handled. I just want the first index to be printed only once. The data I have given here is toy data; my use case is more complex.
If you want to see the data like this, you will have to go with a MultiIndex. Another way, without involving project_handled, is in the edit.

If your only goal is to write your DataFrame to a CSV in the required fashion, and you don't need a single merged cell for each employee_id value, then you can do something like this:

import pandas as pd

li = [{"employee_id":1,"project_handled": "pas"},{"employee_id":1,"project_handled": "asap"},{"employee_id":2,"project_handled": "trimm"},{"employee_id":2,"project_handled": "fat"}]
df = pd.DataFrame(li)

# Convert the ids to strings so the column can hold blank cells.
df['employee_id'] = df['employee_id'].astype(str)
# Blank every occurrence of an id after its first one.
df.loc[df['employee_id'].duplicated(), 'employee_id'] = ''
df = df.set_index('employee_id')
print(df)

Output:

            project_handled
employee_id
1                       pas
                       asap
2                     trimm
                        fat

Writing this frame with df.to_csv('test.csv') then leaves the employee_id cell blank on every repeated row.
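To inspect the CSV text without opening a file, a small sketch along these lines may help. It relies on to_csv returning a string when no path is given. As an assumption that differs from the code above, it blanks an id only when it matches the row directly above it, so ids that reappear later in unsorted data stay visible.

```python
import pandas as pd

li = [
    {"employee_id": 1, "project_handled": "pas"},
    {"employee_id": 1, "project_handled": "asap"},
    {"employee_id": 2, "project_handled": "trimm"},
    {"employee_id": 2, "project_handled": "fat"},
]
df = pd.DataFrame(li)

# True where a row has the same employee_id as the row directly above it.
repeat = df["employee_id"].eq(df["employee_id"].shift())

# Convert to str so blanks are allowed, then blank the consecutive repeats.
df["employee_id"] = df["employee_id"].astype(str).mask(repeat, "")

# With no path argument, to_csv returns the CSV content as a string.
csv_text = df.set_index("employee_id").to_csv()
print(csv_text)
```

Because the ids become strings, this is best treated as a presentation step applied to a copy just before export, not as a change to the working data.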
