All possible combinations of columns in dataframe depending on value in another column

Question

My df looks like this:

sent  token  token2
1     word1  word1
1     word2  word2
1     word3  word3
1     word4  word4
1     word5  word5
2     word6  word6

Now I want to get all possible combinations of tokens, as lists, for rows that share the same value of sent. The output should look something like this:

[1, word1, word2, n]
[1, word1, word3, n]
[1, word1, word4, n]
[1, word1, word5, n]
[1, word2, word3, n]
...

I tried using itertools and crosstab constructions, but I can't figure out how to add a condition to them.

What is n?

– jpp, Commented May 14, 2018 at 15:32
It's just a useless column I forgot to add in the frame.

– Mi., Commented May 14, 2018 at 15:34

BENY · Accepted Answer · 2018-05-14 15:36:52Z


You can use merge here: self-merge on sent, sort each pair of values with np.sort, then drop the self-pairs and the duplicates with drop_duplicates.

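import numpy as np
# Self-merge on 'sent' to pair every token with every token2 in the same sentence
s = df.loc[:, ['sent', 'token2']].merge(df.loc[:, ['sent', 'token']], on='sent')
# Sort each pair row-wise so (a, b) and (b, a) become identical rows
s[['token', 'token2']] = np.sort(s[['token', 'token2']], axis=1)
# Drop self-pairs and the now-duplicated mirrored pairs
s = s.loc[s.token2 != s.token].drop_duplicates()
s.head()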

Out[213]: 
   sent token2  token
1     1  word2  word1
2     1  word3  word1
3     1  word4  word1
4     1  word5  word1
7     1  word3  word2
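
Since the question mentions itertools, here is a minimal sketch of an alternative that groups by sent and applies itertools.combinations within each group. It is not part of the answer above; the sample frame is rebuilt from the question (without the unused n column), and the variable name pairs is only illustrative.

```python
from itertools import combinations

import pandas as pd

# Sample frame reconstructed from the question (the 'n' column is omitted).
df = pd.DataFrame({
    "sent": [1, 1, 1, 1, 1, 2],
    "token": ["word1", "word2", "word3", "word4", "word5", "word6"],
})

# For each sentence, emit every unordered pair of its tokens.
# int(sent) converts the group key to a plain Python int.
pairs = [
    [int(sent), a, b]
    for sent, tokens in df.groupby("sent")["token"]
    for a, b in combinations(tokens, 2)
]

print(pairs[:5])
```

A sentence with a single token (sent 2 here) contributes no pairs. Note that combinations works by position, so a token repeated within one sentence would produce repeated pairs, whereas the merge approach removes them with drop_duplicates.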

