I want to do some language detection using the Python package TextBlob. I created a new column in a pandas DataFrame which should contain the detected language:

from textblob import TextBlob
posts['Language']=posts['Caption'].apply(TextBlob.detect_language)

This code works. However, with one DataFrame it stops and throws an exception ('TranslatorError') where the respective row contains fewer than 3 characters. Therefore, I'd like to write a function which ensures that 'TextBlob.detect_language' gets applied to the full DataFrame even when an exception occurs.

I thought about something like this:

def get_language(r):
    try:
        return r.TextBlob.detect_language()
    # except (r.TextBlob.detect_language==TranslatorError):
        return np.nan # where textblob was not able to detect language -> nan

However, I don't know what to write after the (commented-out) "except" clause. Any help?

The current function (with the except clause uncommented), applied as

posts['Language']=posts['Caption'].apply(get_language)

returns

AttributeError: 'TextBlob' object has no attribute 'TextBlob'

If I try

def get_language(r):
    try:
        return r.TextBlob.detect_language()
    except:
        pass # (or np.nan)

it just passes over all the rows, i.e. it doesn't detect the language for any row...

Thanks for the help, guys!


1 Answer


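In your function, `r` is the value taken from the `Caption` column, and it has no `TextBlob` attribute, so `r.TextBlob` raises an AttributeError (as in your traceback). The bare `except` then swallows that error for every row, which is why no language is detected. Assuming the column holds plain strings, build the TextBlob from the text inside the `try` block instead: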

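from textblob import TextBlob
import pandas

def detect_language(text):
    # Build the blob inside the try block so any failure is handled per row
    try:
        b = TextBlob(text)
        return b.detect_language()
    except Exception:
        # e.g. TranslatorError for captions shorter than 3 characters
        return "Language Not Detected"

# Small demo frame; the last caption is too short to be detected
df = pandas.DataFrame(data=[("na", "hello"), ("na", "bonjour"), ("na", "_")], columns=['Language', 'Caption'])
df['Language'] = df['Caption'].apply(detect_language)
df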
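
If you would rather catch only the expected error and get NaN back, as in the question, a small wrapper along these lines might work. This is only a sketch: `TranslatorError` and `fake_detect_language` below are stand-ins defined locally so the example runs without TextBlob or network access, and `with_fallback` is a hypothetical helper name, not part of any library.

```python
import math
from functools import wraps


class TranslatorError(Exception):
    """Local stand-in for the TranslatorError mentioned in the question (assumption)."""


def fake_detect_language(text):
    """Hypothetical detector that mimics the 'fewer than 3 characters' failure."""
    if len(text) < 3:
        raise TranslatorError("Must provide a string with at least 3 characters.")
    return "fr" if text == "bonjour" else "en"


def with_fallback(func, exceptions, fallback=math.nan):
    """Wrap func so that the listed exceptions yield fallback instead of propagating."""
    @wraps(func)
    def wrapper(value):
        try:
            return func(value)
        except exceptions:
            return fallback
    return wrapper


# Only TranslatorError is converted to NaN; any other error still surfaces.
safe_detect = with_fallback(fake_detect_language, (TranslatorError,))

captions = ["hello", "bonjour", "_"]
languages = [safe_detect(caption) for caption in captions]
print(languages)
```

With pandas, the same wrapper could be passed to `posts['Caption'].apply(safe_detect)`. Catching a specific exception class rather than everything means a mistake such as the AttributeError from the question is raised instead of being silently hidden.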