
I would like to add an index column based on existing columns, with duplicate rows sharing the same index. For example:

[image: example table]

If the values in the two columns ['old_index', 'year'] are the same, then the new index should be the same. The value in the column 'num' does not matter.

I'm wondering if anyone can help. Thank you very much!

  • Hello, welcome to SO. Consider taking the Tour and reading the How to Ask section before asking another question. Commented Jul 9, 2021 at 16:04
  • Thanks! I'll be sure to follow this next time. Commented Jul 10, 2021 at 17:55

1 Answer

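# Number each distinct combination of values (across all columns) in order of first appearance, starting at 1
df['new_id'] = df.groupby(df.columns.tolist(), sort=False).ngroup() + 1
df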


   index  year  id  new_id
0      1  2000   5       1
1      2  1996   3       2
2      2  1996   3       2
3      4  1994   2       3
4      4  1999   4       4
5      4  1999   4       4
6     12  1989   1       5
7     12  1989   1       5
8     12  1985   0       6
9     12  2011   6       7

Give this a try, but let me know if it isn't fully what you are looking for.
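If only some columns should decide which rows count as duplicates, the same idea can be applied to a subset of columns. The following is a minimal sketch, assuming the column names from the question ('old_index', 'year', 'num'); the sample values and the name `key_cols` are made up for illustration and are not the asker's actual data.

```python
import pandas as pd

# Hypothetical sample data; column names follow the question,
# the values are invented for this illustration.
df = pd.DataFrame({
    "old_index": [1, 2, 2, 4, 4, 4, 12, 12, 12, 12],
    "year": [2000, 1996, 1996, 1994, 1999, 1999, 1989, 1989, 1985, 2011],
    "num": [7, 3, 9, 1, 4, 8, 2, 6, 5, 0],
})

# Only these columns decide whether two rows share an index; 'num' is ignored.
key_cols = ["old_index", "year"]

# sort=False numbers the groups in order of first appearance;
# ngroup() is zero-based, so add 1 to start counting at 1.
df["new_index"] = df.groupby(key_cols, sort=False).ngroup() + 1

print(df)
```

With `sort=False`, the numbering follows the order in which each (old_index, year) pair first appears in the frame. Leaving `sort` at its default would instead number the groups by their sorted key values.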


5 Comments

Thank you for your help. It did work. I just edited the question, could you please check it? I'm looking for a way that works on selected columns.
@Arist I just edited my answer. Try the new line of code.
Yes, it works. I really appreciate your help :)
Be sure to click the check mark next to my answer to indicate that your question has been answered. @Arist
Just did. I'm new to Stack Overflow. Thanks for reminding me.
