Using pandas, how do I loc a value where my column contains lists?

Question

I have a DataFrame with a column that contains lists. For example:

import pandas as pd

df = pd.DataFrame({'name': ['name1', 'name2', 'name3', 'name4'],
                   'age': [21, 23, 24, 28],
                   'occupation': ['data scientist', 'doctor', 'data analyst', 'engineer'],
                   'knowledge': [['python', 'c++'], ['python', 'c#'], ['css', 'js', 'html'], ['c#']],
                   })

Now I want to select only the rows where 'python' is one of the values in the 'knowledge' list. How do I do that?

I tried pd.loc[(pd['knowledge'].isin['python'])], but it didn't work.

(edited to fix the code)

mozway · Accepted Answer · 2022-04-18 10:48:05Z

4

You need to loop over the lists, for example with a list comprehension:

df[['python' in l for l in df['knowledge']]]

output:

    name  age      occupation      knowledge
0  name1   21  data scientist  [python, c++]
1  name2   23          doctor   [python, c#]
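
For reference, here is a minimal self-contained sketch of the same idea, using the DataFrame from the question. The names `mask` and `result` are only illustrative, and the sketch assumes every cell in 'knowledge' is a list (a missing value such as NaN would raise a TypeError in the `in` test). Since the question asked about loc, the boolean list is passed to df.loc here, which selects the same rows as df[...].

```python
import pandas as pd

df = pd.DataFrame({
    'name': ['name1', 'name2', 'name3', 'name4'],
    'age': [21, 23, 24, 28],
    'occupation': ['data scientist', 'doctor', 'data analyst', 'engineer'],
    'knowledge': [['python', 'c++'], ['python', 'c#'], ['css', 'js', 'html'], ['c#']],
})

# One boolean per row: True where that row's list contains 'python'
mask = ['python' in l for l in df['knowledge']]

# A plain list of booleans (same length as the frame) works with df.loc
result = df.loc[mask]
print(result)
```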

alternatives

finding any element of a set

keep rows with at least one match

search = set(['python', 'js'])
df[[bool(search.intersection(l)) for l in df['knowledge']]]

output:

    name  age      occupation        knowledge
0  name1   21  data scientist    [python, c++]
1  name2   23          doctor     [python, c#]
2  name3   24    data analyst  [css, js, html]

matching all elements of a set

all elements need to match

search = set(['python', 'c++'])
df[[search <= set(l) for l in df['knowledge']]]

output:

    name  age      occupation      knowledge
0  name1   21  data scientist  [python, c++]
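
The two set-based variants could be wrapped in one small function. This is only a sketch: `rows_matching` and its `require_all` parameter are hypothetical names, not part of pandas, and it assumes every cell in the column is a list of hashable values.

```python
import pandas as pd

def rows_matching(df, column, search, require_all=False):
    """Hypothetical helper: keep rows whose list in `column` matches `search`."""
    search = set(search)
    if require_all:
        # Every searched element must be present in the row's list
        mask = [search <= set(l) for l in df[column]]
    else:
        # At least one searched element must be present
        mask = [not search.isdisjoint(l) for l in df[column]]
    return df[mask]

df = pd.DataFrame({
    'name': ['name1', 'name2', 'name3', 'name4'],
    'knowledge': [['python', 'c++'], ['python', 'c#'], ['css', 'js', 'html'], ['c#']],
})

any_match = rows_matching(df, 'knowledge', {'python', 'js'})
all_match = rows_matching(df, 'knowledge', {'python', 'c++'}, require_all=True)
print(any_match)
print(all_match)
```

`set.isdisjoint` accepts any iterable, so the row's list does not need to be converted to a set for the "any" case.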


3 Comments

Lajos Arpad Over a year ago

@NightHawk you can accept this answer as the correct answer so future visitors will see at a glance that the problem was solved and that they are in the right place if their issue is similar to yours.

NightHawk Over a year ago

@LajosArpad Could you help with stackoverflow.com/questions/71912044/…

Lajos Arpad Over a year ago

@NightHawk unfortunately no, I'm not very experienced with Python.

Ynjxsjmh · Accepted Answer · 2022-04-18 12:25:51Z

0

You can join each list into a space-separated string, then check whether it contains the wanted word using word boundaries.

m = df['knowledge'].str.join(' ').str.contains(r'\bpython\b')

Or you can try Series.apply

m = df['knowledge'].apply(lambda l: 'python' in l)

print(m)

0     True
1     True
2    False
3    False
Name: knowledge, dtype: bool

Then use boolean indexing to select the True rows

print(df[m])

    name  age      occupation      knowledge
0  name1   21  data scientist  [python, c++]
1  name2   23          doctor   [python, c#]
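
One caveat with the word-boundary approach, shown as a small sketch: `\b` treats `+` and `#` as non-word characters, so a short search term such as 'c' (a hypothetical term, not from the question) also matches 'c++' and 'c#'. The exact membership test with Series.apply does not have this problem. The names `regex_mask` and `exact_mask` are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'name': ['name1', 'name2', 'name3', 'name4'],
    'knowledge': [['python', 'c++'], ['python', 'c#'], ['css', 'js', 'html'], ['c#']],
})

# Join each list into one string, then search with word boundaries.
# 'c' is followed by '+' or '#', which count as a boundary, so it matches.
regex_mask = df['knowledge'].str.join(' ').str.contains(r'\bc\b')

# Exact membership: 'c' must be an element of the list itself.
exact_mask = df['knowledge'].apply(lambda l: 'c' in l)

print(regex_mask.tolist())  # [True, True, False, True]
print(exact_mask.tolist())  # [False, False, False, False]
```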

