I'm trying to count, for each column (there are around 7), how many times a single word appears. For example, across the columns Name, Fruit, Country, etc., I want to know how many times the word 'The' appears in each one.

I can use df3['Name'].str.count('The').sum(), which gives this result:

Out[121]: 3522

But when I add the next string column, so that the code is

df3['Name'].str.count('The').sum()
df3['Fruit'].str.count('The').sum()

only the result of the last expression is shown (as expected):

Out[122]: 27

What I want is output like this:

Name: 3522
Fruit: 27

But I can't find a way to use str.count or str.contains that reports the count per column like this.

If the data is something like the following:

Name | Year | Score | 2nd Score | % of People | Country | Fruit | Export Countries | Language | Transit Duration | Quality | Taste | Freshness | Packaging
Andes, The | 2021 | 8 | 8.8 | 87% | The Netherlands | The Apple | United States,United Kingdom | English,Japanese,French | 148.0 | 1.0 | 0.0 | 0.0 | 0.0
Phil | 2021 | 8 | 8.4 | 87% | Spain | The Banana | United Kingdom, Germany | English,German,French,Italian | 165.0 | 1.0 | 0.0 | 0.0 | 0.0
Sarah | 2021 | 9 | 8.3 | 89% | Greece | The Plum | Germany,United States | English,German,French,Italian | 153.0 | 1.0 | 0.0 | 0.0 | 0.0

The expected output is:

Name: 1
Year: 0
Score: 0
2nd Score: 0
Country: 1
Fruit: 3
Transit Duration: 0
Quality: 0
Taste: 0
Freshness: 0
Packaging: 0
  • Kindly share a reproducible example, with expected output. Commented Aug 15, 2021 at 10:41
  • Have made an edit to the original post; is that what you're talking about? Commented Aug 15, 2021 at 10:45

1 Answer

You could use applymap to get your output; it applies a function to every cell of the DataFrame:

In [477]: df.applymap(lambda cell: 'The' in cell).sum()
Out[477]: 
Name        1
 Fruit      2
 Country    1
dtype: int64

The first part, the applymap call, returns a DataFrame of booleans, one for each cell:

In [476]: df.applymap(lambda cell: 'The' in cell)
Out[476]: 
   Name    Fruit    Country
0   True     True      True
1  False     True     False

From here, you can sum the booleans column by column; True counts as 1 and False as 0, so each sum is the number of matching cells in that column.
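The two steps above can be sketched end to end on the text columns from the question's sample rows. This is a minimal sketch, not a definitive implementation: the variable names `cellwise`, `mask`, and `counts` are illustrative, and the `hasattr` fallback is an assumption added because newer pandas releases (2.1 and later) rename `applymap` to `DataFrame.map`.

```python
import pandas as pd

# Text columns taken from the three sample rows in the question.
df = pd.DataFrame({
    "Name": ["Andes, The", "Phil", "Sarah"],
    "Country": ["The Netherlands", "Spain", "Greece"],
    "Fruit": ["The Apple", "The Banana", "The Plum"],
})

# DataFrame.map is the newer name for applymap (pandas 2.1+);
# fall back to applymap on older versions.
cellwise = df.map if hasattr(df, "map") else df.applymap

# Step 1: one boolean per cell, True where the cell's text contains 'The'.
mask = cellwise(lambda cell: "The" in cell)

# Step 2: True counts as 1 and False as 0, so each column sum
# is the number of matching cells in that column.
counts = mask.sum()
print(counts)
```

Note that this counts cells containing the word, not total occurrences; a cell with 'The' twice would still count once.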

You could also use transform or apply, which work column by column, to get the same result:

In [482]: df.transform(lambda col: col.str.contains('The')).sum()
Out[482]: 
Name        1
 Fruit      2
 Country    1
dtype: int64
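A column-wise variant along the same lines, as a rough sketch. The `regex=False` and `na=False` arguments are additions not present in the answer: the first matches 'The' as literal text rather than as a regular expression, and the second treats missing cells as non-matches. The `None` entry is a hypothetical missing value included only to show that behaviour.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Andes, The", "Phil", None],  # None: hypothetical missing value
    "Country": ["The Netherlands", "Spain", "Greece"],
    "Fruit": ["The Apple", "The Banana", "The Plum"],
})

# apply hands each column to the function as a Series, so .str is available.
# regex=False matches the literal text; na=False turns missing cells into False.
mask = df.apply(lambda col: col.str.contains("The", regex=False, na=False))

# Sum the booleans per column to get the number of matching cells.
counts = mask.sum()
print(counts)
```

Like the original, this only works when every column passed in holds text; numeric columns have no .str accessor.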

Based on your comments, you could restrict the check to text columns with the select_dtypes method:

In [483]: df.select_dtypes('object').applymap(lambda cell: 'The' in cell).sum()
Out[483]: 
Name        1
 Fruit      2
 Country    1
dtype: int64

Alternatively, thanks to @ShubhamSharma, the solution below converts every column to strings first, so the int columns no longer raise an error:

 df.astype(str).applymap(lambda s: 'The' in s).sum()
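As a hedged illustration of the string-conversion approach on mixed column types, the sketch below uses a subset of the question's columns and prints the result in the "Column: count" layout the question asks for. The names `as_text`, `cells_with_word`, and `occurrences` are illustrative, and the `hasattr` fallback is an assumption to cover pandas versions where `applymap` has been renamed to `DataFrame.map`.

```python
import pandas as pd

# A subset of the question's columns, mixing text, int, and float values.
df = pd.DataFrame({
    "Name": ["Andes, The", "Phil", "Sarah"],
    "Year": [2021, 2021, 2021],
    "Score": [8, 8, 9],
    "Country": ["The Netherlands", "Spain", "Greece"],
    "Fruit": ["The Apple", "The Banana", "The Plum"],
    "Transit Duration": [148.0, 165.0, 153.0],
})

# Cast every column to text so the membership test never sees an int or float.
as_text = df.astype(str)

# Use DataFrame.map where it exists (pandas 2.1+), otherwise applymap.
cellwise = as_text.map if hasattr(as_text, "map") else as_text.applymap

# Number of cells per column that contain 'The'.
cells_with_word = cellwise(lambda s: "The" in s).sum()

# str.count tallies occurrences rather than cells, like the question's first attempt.
occurrences = as_text.apply(lambda col: col.str.count("The")).sum()

for column, n in cells_with_word.items():
    print(f"{column}: {n}")
```

On this sample the two tallies agree because no cell contains 'The' more than once; they would differ if one did.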

7 Comments

That would be perfect based on the info originally given, but I've realised that a catch-all will not work: a couple of the columns are int, so this threw an error. Does your syntax allow for nominating specific columns (i.e. ones that aren't int)?
Neither of those runs for me. The first, df.applymap(lambda df: 'The' in df), gives the error: argument of type 'int' is not iterable. Running the second, df.transform(lambda df: df.str.contains('The')).sum(), I first get: AttributeError: Can only use .str accessor with string values! During handling of the above exception, another exception occurred: and then, at the bottom, ValueError: Transform function failed. I have edited the original post to include an int value.
Did you apply select_dtypes? df.select_dtypes('object').applymap(lambda df: 'The' in df).sum()
@Aemonar Check df.astype(str).applymap(lambda s: 'The' in s).sum()
df.astype(str).applymap(lambda s: 'The' in s).sum() absolutely worked @ShubhamSharma, thank you! @sammywemmy - I edited the original post to include the new info, but the above has worked for me.