Concatenate text & numbers in Python/pandas

Question

I have a DataFrame as shown below:

+---+---+---+
| A | B | C |
+---+---+---+
| 1 | 0 | 0 |
+---+---+---+
| 0 | 0 | 1 |
+---+---+---+
| 2 | 1 | 1 |
+---+---+---+
| 3 | 1 | 2 |
+---+---+---+
| 4 | 2 | 3 |
+---+---+---+

import pandas as pd

df = pd.DataFrame({
    'A':[1,0,2,3,4],
    'B':[0,0,1,1,2],
    'C':[0,1,1,2,3]
})

My objective is to concatenate each element with its corresponding column name and produce a Series.

I tried the following:

df.dot(df.columns +', ').str[:-2]

What I get is:

+---------------------------+
| A                         |
+---------------------------+
| C                         |
+---------------------------+
| A, A, B, C                |
+---------------------------+
| A, A, A, B, C, C          |
+---------------------------+
| A, A, A, A, B, B, C, C, C |
+---------------------------+

But what I want is:

+------------+
| A          |
+------------+
| C          |
+------------+
| 2A, B, C   |
+------------+
| 3A, B, 2C  |
+------------+
| 4A, 2B, 3C |
+------------+

How should I change my code to achieve this?

Please have a look at my answer as well.
– Mayank Porwal, Commented Feb 23, 2021 at 7:37

jezrael · Accepted Answer · 2021-02-23 06:49:40Z

1

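One idea is to apply a lambda function to each row:

# For each row, keep the positive values and prefix each column name with its
# value, leaving the number out when it is 1.
f = lambda x: ', '.join(f'{v}{k}' if v != 1 else k for k, v in x[x > 0].items())
# Store the result under a new name so the original df stays available.
out = df.apply(f, axis=1)
print(out)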
0             A
1             C
2      2A, B, C
3     3A, B, 2C
4    4A, 2B, 3C
dtype: object
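
The same row-wise idea can be written as a named function, which some readers find easier to follow and to test. This is only a sketch: the function name `label_row` and the variable `labels` are made up for illustration and are not part of the answer above.

```python
import pandas as pd


def label_row(row: pd.Series) -> str:
    """Join 'value + column name' for every positive entry; omit a value of 1."""
    parts = []
    for name, count in row[row > 0].items():
        parts.append(name if count == 1 else f"{count}{name}")
    return ", ".join(parts)


df = pd.DataFrame({
    "A": [1, 0, 2, 3, 4],
    "B": [0, 0, 1, 1, 2],
    "C": [0, 1, 1, 2, 3],
})

# Apply row by row; df itself is left unchanged.
labels = df.apply(label_row, axis=1)
print(labels.tolist())
# ['A', 'C', '2A, B, C', '3A, B, 2C', '4A, 2B, 3C']
```

Note that a row containing only zeros would produce an empty string here, because nothing is left to join.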

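Another idea, starting again from the original DataFrame: melt it to long format, remove the rows whose value is 0, prefix each column name with its value, and finally join the pieces per original row with groupby:

# Reshape to long format, keeping the original row index
out = df.melt(ignore_index=False)
# Drop the zero entries
out = out[out['value'].ne(0)].copy()
# Prefix each column name with its value, leaving the number out when it is 1
out['variable'] = out['value'].mask(out['value'].eq(1), '').astype(str) + out['variable']

# Join the pieces back together per original row
out = out.groupby(level=0)['variable'].agg(', '.join)
print(out)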
0             A
1             C
2      2A, B, C
3     3A, B, 2C
4    4A, 2B, 3C
Name: variable, dtype: object
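
One detail of the melt approach is worth checking on your own data: a row that contains only zeros disappears after the filtering step, so it has no entry in the grouped result. The sketch below is a variant, not the answer's exact code. It converts the values to text before masking, writes to a new column called `label`, and uses `reindex` to restore the missing rows as empty strings. The small frame with an all-zero last row is invented for the demonstration.

```python
import pandas as pd

df = pd.DataFrame({
    "A": [1, 0, 2, 0],
    "B": [0, 0, 1, 0],
    "C": [0, 1, 1, 0],  # the last row is all zeros on purpose
})

# Long format: one row per (original index, column) pair.
long = df.melt(ignore_index=False)
long = long[long["value"].ne(0)].copy()

# Build the label as text and blank out the number when it is 1.
count_text = long["value"].astype(str).mask(long["value"].eq(1), "")
long["label"] = count_text + long["variable"]

joined = long.groupby(level=0)["label"].agg(", ".join)

# Rows with only zeros vanished during filtering; bring them back as "".
joined = joined.reindex(df.index, fill_value="")
print(joined.tolist())
# ['A', 'C', '2A, B, C', '']
```

Whether an empty string is the right placeholder depends on what you do with the Series afterwards; the point is only that the row count can change silently without the `reindex`.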

edited Feb 23, 2021 at 6:49

answered Feb 23, 2021 at 6:41

jezrael


Mayank Porwal · Accepted Answer · 2021-02-24 05:16:04Z

1

Another way to solve this uses collections.Counter and a list comprehension:

In [416]: from collections import Counter

In [403]: y = df.dot(df.columns).tolist()

In [420]: ans = [', '.join({k: (str(v)+k if v > 1 else k) for k,v in Counter(i).items()}.values()) if len(i) > 1 else i for i in y]

In [421]: pd.DataFrame(ans)
Out[421]: 
            0
0           A
1           C
2    2A, B, C
3   3A, B, 2C
4  4A, 2B, 3C
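
To see why this works, it helps to look at the two steps without pandas. The dot product of integers with column names repeats each name as many times as its value, and Counter then counts the characters of that string. The sketch below imitates both steps in plain Python; the helper names `repeated` and `compress` are invented for illustration. It also shows a limitation: because Counter counts single characters, the approach assumes one-character column names.

```python
from collections import Counter


def repeated(row, names):
    """Imitate the integer-times-string sum: 2*'A' + 1*'B' gives 'AAB'."""
    return "".join(count * name for count, name in zip(row, names))


def compress(text):
    """Count each character and format it: 'AAABCC' gives '3A, B, 2C'."""
    return ", ".join(
        name if count == 1 else f"{count}{name}"
        for name, count in Counter(text).items()
    )


print(compress(repeated([3, 1, 2], ["A", "B", "C"])))  # 3A, B, 2C

# With a two-character column name the characters are counted separately:
print(compress(repeated([2, 1], ["AB", "C"])))  # 2A, 2B, C  (not "2AB, C")
```

If your real column names are longer than one character, one of the other approaches is probably a safer starting point.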

Performance of the solutions:

@jezrael's solutions:

In [427]: def j():
     ...:     f = lambda x: ', '.join(f'{v}{k}' if v != 1 else k for k, v in x[x > 0].items())
     ...:     df.apply(f, axis=1)
     ...: 

In [428]: %timeit j()
1.22 ms ± 47.2 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

In [434]: def j1():
     ...:     x = df.melt(ignore_index=False)
     ...:     x = x[x['value'].ne(0)]
     ...:     x['variable'] = x['value'].mask(x['value'].eq(1), '').astype(str) + x['variable']
     ...:     x = x.groupby(level=0)['variable'].agg(', '.join)
     ...: 

In [435]: %timeit j1()
3.19 ms ± 139 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

My solution:

In [429]: def m():
     ...:     y = df.dot(df.columns).tolist()
     ...:     ans = [', '.join({k: (str(v)+k if v > 1 else k) for k,v in Counter(i).items()}.values()) if len(i) > 1 else i for i in y]
     ...:     pd.DataFrame(ans)
     ...: 

In [430]: %timeit m()
213 µs ± 3.87 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
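
The timings above were taken on the five-row example frame, so it may be worth repeating the comparison on data closer to your own size before choosing. Below is a rough sketch using the standard `timeit` module outside IPython. The frame size (1000 rows), the random seed, the value range, and the function names `with_apply` and `with_counter` are all assumptions made for the example, and no particular result is implied.

```python
import timeit
from collections import Counter

import numpy as np
import pandas as pd

# Hypothetical larger test frame: 1000 rows of small non-negative integers.
rng = np.random.default_rng(0)
big = pd.DataFrame(rng.integers(0, 5, size=(1000, 3)), columns=list("ABC"))


def with_apply(frame):
    # Row-wise lambda, as in the first answer.
    f = lambda x: ", ".join(f"{v}{k}" if v != 1 else k for k, v in x[x > 0].items())
    return frame.apply(f, axis=1).tolist()


def with_counter(frame):
    # dot() repeats each column name by its value; Counter counts the letters.
    rows = frame.dot(frame.columns).tolist()
    return [
        ", ".join(k if v == 1 else f"{v}{k}" for k, v in Counter(s).items())
        for s in rows
    ]


# Check that both produce the same labels before comparing speed.
same = with_apply(big) == with_counter(big)

t_apply = timeit.timeit(lambda: with_apply(big), number=3)
t_counter = timeit.timeit(lambda: with_counter(big), number=3)
print(f"same output: {same}; apply: {t_apply:.3f}s, Counter: {t_counter:.3f}s")
```

The absolute numbers depend on the machine and the pandas version, so treat them only as a relative comparison.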

edited Feb 24, 2021 at 5:16

answered Feb 23, 2021 at 7:37

Mayank Porwal


3 Comments

Tommy Over a year ago

Upvoted! This works. This is the first time I have come across `from collections import Counter`. Thank you!

Mayank Porwal Over a year ago

@Tommy I've also added performance. My solution works the fastest.

Tommy Over a year ago

Awesome! Cheers!
