I am getting empty lists after intersecting in Python

Question

I have two columns, and each row of each column holds five words separated by |.

For example:
x=[dog|cat|mouse|new|world]
y=[fish|cat|new|thing|nice]

I need to find the intersection between them: [cat|new].

But my code gives me an empty list. Do you know why?

import pandas as pd

data = pd.read_csv('data.csv')

intersect1 = []

for j in range(len(data)):
    x = str(data.iloc[:, 2]).split("|")
    y = str(data.iloc[:, 3]).split("|")

    intersect = list(set(x) & set(y))
    intersect1.append(intersect)

print(intersect1)

kelvt · Accepted Answer · 2021-09-28 03:04:22Z

2

The issue is in your loop: data.iloc[:, 2] selects the whole column on every iteration, so str() returns the printed form of the entire Series rather than one cell. You want to select a single value per row instead. Replace the : with the loop counter, j.

import pandas as pd

df = pd.DataFrame({'x': ['dog|cat|mouse|new|world'],
                   'y': ['fish|cat|new|thing|nice']})

for j in range(len(df)):
    # Index with the row counter j so only one cell is read per iteration.
    x = str(df.iloc[j, 0]).split("|")
    y = str(df.iloc[j, 1]).split("|")
    intersect = list(set(x) & set(y))

print(intersect)

Output:

['new', 'cat']
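The loop above keeps only the last row's result. To collect one intersection per row, as the intersect1 list in the question tries to do, something along these lines might work. This is a minimal sketch: the two-row frame and the name intersections are made up for illustration, and the results are sorted only to give a stable order.

```python
import pandas as pd

# Hypothetical two-row frame standing in for the asker's CSV.
df = pd.DataFrame({
    "x": ["dog|cat|mouse|new|world", "red|green|blue"],
    "y": ["fish|cat|new|thing|nice", "black|white"],
})

intersections = []
# zip walks both columns in step, so each iteration sees one row's pair of cells.
for left, right in zip(df["x"], df["y"]):
    common = set(str(left).split("|")) & set(str(right).split("|"))
    intersections.append(sorted(common))  # sorted for a stable order

print(intersections)  # [['cat', 'new'], []]
```

Rows with no shared words yield an empty list, so an empty result for some rows is not necessarily a bug.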

edited Sep 28, 2021 at 3:04

answered Sep 27, 2021 at 8:29

kelvt


3 Comments

d12 Over a year ago

I still get empty lists this way.

kelvt Over a year ago

You will need to share a more concrete dataset for us to look into. The code above worked well for me.

d12 Over a year ago

I realized I had picked the wrong columns in Excel; that's why. Thank you!

DeX97 · Accepted Answer · 2021-09-27 08:26:33Z

2

I just did a test using the code below:

data1 = "dog|cat|mouse|new|world"
data2 = "fish|cat|new|thing|nice"

x = data1.split("|")
y = data2.split("|")

intersect= list(set(x) & set(y))

print(intersect)

This outputs ['cat', 'new'] (the order may vary, since sets are unordered), exactly what you'd expect. Note that x and y are lists containing the words as separate strings, i.e.:

['dog', 'cat', 'mouse', 'new', 'world'] # this is x
['fish', 'cat', 'new', 'thing', 'nice'] # this is y

Make sure that this is also the case in your code!
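One way to check this is to compare what str() returns for a whole column against what it returns for a single cell. The sketch below uses a made-up one-column frame, not the asker's data; the exact printed form of the Series may differ between pandas versions.

```python
import pandas as pd

# Hypothetical frame for illustration only.
df = pd.DataFrame({"x": ["dog|cat|mouse|new|world", "a|b"]})

whole_column = str(df.iloc[:, 0])  # printed form of the entire Series
single_cell = str(df.iloc[0, 0])   # just the first row's string

print(repr(whole_column))          # includes index labels, newlines and a dtype footer
print(single_cell.split("|"))      # ['dog', 'cat', 'mouse', 'new', 'world']
```

Splitting the whole-column string on | produces tokens that carry index labels and line breaks, which is why they can fail to match across columns.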

answered Sep 27, 2021 at 8:26

DeX97


vtasca · Accepted Answer · 2021-09-27 08:29:12Z

Even though you added the code in a loop, you are not actually traversing your dataframe. Assuming your data is of this shape:

    one two
0   [dog|cat|mouse|new|world]   [fish|cat|new|thing|nice]
1   [dog|cat|mouse|new|world]   [fish|cat|new|thing|nice]
2   [dog|cat|mouse|new|world]   [fish|cat|new|thing|nice]
3   [dog|cat|mouse|new|world]   [fish|cat|new|thing|nice]
4   [dog|cat|mouse|new|world]   [fish|cat|new|thing|nice]
5   [dog|cat|mouse|new|world]   [fish|cat|new|thing|nice]
6   [dog|cat|mouse|new|world]   [fish|cat|new|thing|nice]
7   [dog|cat|mouse|new|world]   [fish|cat|new|thing|nice]
8   [dog|cat|mouse|new|world]   [fish|cat|new|thing|nice]
...

Then, assuming the columns you're interested in are 2 and 3, modifying your loop like this would work (the [0] assumes each cell is a one-element list; drop it if the cells are plain strings):

for j in range(len(data)):
    x = data.iloc[j, 2][0].split('|')
    y = data.iloc[j, 3][0].split('|')
    intersect = list(set(x) & set(y))
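Because positional indices such as 2 and 3 are easy to get wrong, it may help to print the column positions first, or to select by name. The following is a sketch under assumptions: the frame and the column names id, title, one and two are invented for illustration, and the cells are taken to be plain strings.

```python
import pandas as pd

# Hypothetical frame; the column names are made up for illustration.
data = pd.DataFrame({
    "id": [1],
    "title": ["example"],
    "one": ["dog|cat|mouse|new|world"],
    "two": ["fish|cat|new|thing|nice"],
})

# Show which name sits at each position before relying on iloc[j, 2] / iloc[j, 3].
for position, name in enumerate(data.columns):
    print(position, name)

# Selecting by name avoids mistakes with positional indices.
x = data.loc[0, "one"].split("|")
y = data.loc[0, "two"].split("|")
print(sorted(set(x) & set(y)))  # ['cat', 'new']
```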
