I have a DataFrame like this:

import pandas as pd

names = ["Internal medicine, Gastroenterology", "Internal medicine, Family and general medicine, Endocrinology", "Pediatrics, Medical genetics, Laboratory medicine", "Internal medicine"]
df = pd.DataFrame(names, columns=['names'])

I would like to know how often each medical term occurs. For example, here:

  • Internal medicine: 3
  • Gastroenterology: 1
  • etc

Counter works for single words, but how do I get it working for phrases such as "Internal medicine"? The phrases are separated by ", ".

2 Answers

Split on ", " (a comma followed by a space), then explode the resulting lists into rows and call value_counts:

df['names'].str.split(", ").explode().value_counts()

1 Comment

should be split(", ") otherwise it miscounts but thanks otherwise great!! accepting in 5 min
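To illustrate the miscount the comment refers to: splitting on a bare "," leaves a leading space on every term after the first, so the same phrase can end up under two different keys. Here is a minimal sketch of that effect and one way around it, using a small made-up sample (not the asker's data) and assuming the delimiter spacing may be inconsistent:

```python
import pandas as pd

# Hypothetical sample chosen so one term appears both first and later in a row.
sample = [
    "Internal medicine, Gastroenterology",
    "Pediatrics, Internal medicine",
    "Internal medicine",
]
df_sample = pd.DataFrame(sample, columns=["names"])

# Splitting on "," alone keeps the leading space, so "Internal medicine"
# and " Internal medicine" are counted as different terms.
naive = df_sample["names"].str.split(",").explode().value_counts()

# Stripping whitespace after exploding merges them again, and also
# tolerates rows where the space after the comma is missing.
robust = df_sample["names"].str.split(",").explode().str.strip().value_counts()

print(naive)
print(robust)
```

If every row is known to use exactly ", " as the separator, split(", ") is enough; the strip step only matters when the spacing varies.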

If you want to use collections.Counter, you can do this:

In [1945]: from collections import Counter

In [1946]: d = Counter(df['names'].str.split(", ").explode().tolist())

In [1947]: d
Out[1947]: 
Counter({'Internal medicine': 3,
         'Gastroenterology': 1,
         'Family and general medicine': 1,
         'Endocrinology': 1,
         'Pediatrics': 1,
         'Medical genetics': 1,
         'Laboratory medicine': 1})
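If the phrases are already in a plain Python list, Counter can also be fed directly without going through explode. This is only a sketch of that alternative, reusing the names list from the question; the variable names counts and top are just for illustration:

```python
from collections import Counter

names = [
    "Internal medicine, Gastroenterology",
    "Internal medicine, Family and general medicine, Endocrinology",
    "Pediatrics, Medical genetics, Laboratory medicine",
    "Internal medicine",
]

# Split each row on commas, strip surrounding whitespace from each phrase,
# and count the resulting phrases.
counts = Counter(term.strip() for row in names for term in row.split(","))

# most_common(n) returns the n most frequent phrases with their counts.
top = counts.most_common(1)
print(top)  # [('Internal medicine', 3)]
```

The same generator should work on the DataFrame column by iterating over df['names'] instead of names.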
