0

What is the most efficient way (using the least amount of lines possible) to locate and drop multiple strings in a specified column?

Information regarding the .tsv dataset that may help:

'tconst' = movie ID

'region' = region in which the movie was released in

'language' = language of movie

Here is what I have right now:

origin.drop(origin.columns[[1,2,5,6,7]], axis=1, inplace=True)
origin.columns = ['tconst','region','language']
origin.drop(origin.loc[origin['region']!=('US')].index, inplace=True)

I am trying to drop all rows under the 'region' column that contain a string value abbreviation of countries that do not speak English. For example, drop all rows under region that are not equal to 'US','UK','AUS',etc. I have tried the & and or operands within the parenthesis and they tend to select and drop only one of the string values that I place inside.

ADDITIONAL QUESTION:

The 'language' column contains numerous amounts of null values (I don't really care about that), however there are some rows that contain 'en' for English. IF the 'region' happens to be a non-English speaking country BUT the language is in English, how do I prevent those rows from being removed as well?

1 Answer 1

1

Assuming I understood your question correctly and you want to drop the entire row if your specified conditions holds, you can write them as masks like

cond1 = origin['region'].isin(['US', 'UK', 'AUS'])
cond2 = origin['language'] == 'en'

Combining them (you want at least one of the conditions to be true in order to keep a row) with | you get

result = origin[cond1 | cond2]

Of course, this can also be written in a single line if you care to do so:

result = origin[(origin['region'].isin(['US', 'UK', 'AUS'])) | (origin['language'] == 'en')]
Sign up to request clarification or add additional context in comments.

1 Comment

This works. Very much appreciated. Dropped the DT from 2mil+ entries to barely over 100K haha

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.