3

Assuming the following DataFrame:

import pandas as pd

df = pd.DataFrame({'id': [8, 16, 23, 8, 23], 'count': [5, 8, 7, 1, 2]}, columns=['id', 'count'])

   id  count
0   8      5
1  16      8
2  23      7
3   8      1
4  23      2

...is there a pandas idiom that remaps the ids so that they become sequential? I'm looking for a result like:

   id  count
0   0      5
1   1      8
2   2      7
3   0      1
4   2      2

where the original ids [8,16,23] were remapped to [0,1,2].

Note: the remapping doesn't have to maintain the original order of the ids. For example, the remapping [8,16,23] -> [2,0,1] would also be fine, as long as the id space after remapping is contiguous.

I'm currently using a for loop and a dict to keep track of the remapping, but it feels like Pandas might have a better solution.

3 Answers

3

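Use pd.factorize. It returns a tuple of (codes, uniques); the codes are the sequential ids, assigned in order of first appearance: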

>>> df
   id  count
0   8      5
1  16      8
2  23      7
3   8      1
4  23      2
>>> df['id'] = pd.factorize(df['id'])[0]
>>> df
   id  count
0   0      5
1   1      8
2   2      7
3   0      1
4   2      2
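
The numbering above follows first appearance, which happens to match sorted order for this data. As a minimal sketch of the difference, assuming a reordered copy of the question's ids (the column names id_first_seen and id_sorted are made up for illustration), the sort=True argument of pd.factorize numbers the ids by sorted value instead:

```python
import pandas as pd

# Same ids as in the question, but reordered so the two numberings differ.
df = pd.DataFrame({"id": [23, 8, 16, 8, 23], "count": [5, 8, 7, 1, 2]})

# Default: codes follow order of first appearance (23 -> 0, 8 -> 1, 16 -> 2).
codes, uniques = pd.factorize(df["id"])

# sort=True: codes follow the sorted unique ids (8 -> 0, 16 -> 1, 23 -> 2).
sorted_codes, sorted_uniques = pd.factorize(df["id"], sort=True)

# Hypothetical column names, kept side by side for comparison.
df["id_first_seen"] = codes
df["id_sorted"] = sorted_codes
```

Either numbering is contiguous, so both satisfy the question's requirement; sort=True only matters if the new ids should preserve the relative order of the old ones.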

1

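You can do this via a groupby's labels (note that grouper.labels comes from an older pandas API and may not be available in current releases):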

In [11]: df
Out[11]:
   id  count
0   8      5
1  16      8
2  23      7
3   8      1
4  23      2

In [12]: g = df.groupby("id")

In [13]: g.grouper.labels
Out[13]: [array([0, 1, 2, 0, 2])]

In [14]: df["id"] = g.grouper.labels[0]

In [15]: df
Out[15]:
   id  count
0   0      5
1   1      8
2   2      7
3   0      1
4   2      2
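
The session above reaches into the grouper object, which is closer to an internal detail than a public interface. A rough sketch of the same idea using the public GroupBy.ngroup() method, assuming a pandas version that provides it, might look like this:

```python
import pandas as pd

df = pd.DataFrame({"id": [8, 16, 23, 8, 23], "count": [5, 8, 7, 1, 2]})

# ngroup() numbers the groups 0..n_groups-1; with groupby's default
# sort=True the numbering follows the sorted group keys.
df["id"] = df.groupby("id").ngroup()
```

Because groupby sorts its keys by default, this numbering matches sorted id order rather than order of first appearance.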


0

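If you also need a lookup from the new ids back to the original ones, you can build it from the second value returned by factorize. Pair each unique id with its position rather than zipping it against the per-row codes, which only lines up by coincidence for this data:

codes, uniques = pd.factorize(df['id'])
# new id -> original id, e.g. {0: 8, 1: 16, 2: 23}
remap = dict(enumerate(uniques.tolist()))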
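
The question's loop-and-dict approach goes in the other direction, from original id to new id. As a sketch under that assumption (the names forward, reverse, new_id and restored are illustrative, not from the answers), both directions can be derived from the factorize output and applied with Series.map:

```python
import pandas as pd

df = pd.DataFrame({"id": [8, 16, 23, 8, 23], "count": [5, 8, 7, 1, 2]})

codes, uniques = pd.factorize(df["id"])

# original id -> new id, using each unique id's position as its new id
forward = {orig: new for new, orig in enumerate(uniques.tolist())}

# new id -> original id, for undoing the remapping later
reverse = dict(enumerate(uniques.tolist()))

# Apply the mapping, then round-trip back to the original ids.
df["new_id"] = df["id"].map(forward)
restored = df["new_id"].map(reverse)
```

Keeping the forward dict around is useful when the same remapping has to be applied to a second DataFrame that shares the id column.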
