I built a DataFrame in Python from a user-entered SQL query. After this I name my columns, so that it is easy to isolate the columns with NaN values:

cursor.execute(raw_input("Enter your SQL query: "))
records = cursor.fetchall()
import pandas as pd
dframesql = pd.DataFrame(records)
dframesql.columns = [i[0] for i in cursor.description]

The problem comes afterwards, when I want to compare the number of non-null rows in each column with the total number of rows in the data frame:

dframelines = len(dframesql)
dframedesc = pd.DataFrame(dframesql.count())

When I try to compare dframedesc with dframelines, I get an error:

nancol = []
for line in dframedesc:
    if dframedesc < dframelines:
        nancol.append(line)

ValueError: The truth value of a DataFrame is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Thanks in advance!

1 Answer

If you want to do it with a for loop, loop through the DataFrame's index:

nancol = []
for index in dframedesc.index:
    # 'a_column' is a placeholder; pd.DataFrame(dframesql.count()) has a single column labelled 0
    if dframedesc.loc[index, 'a_column'] < dframelines:
        nancol.append(dframedesc.loc[index, :])

But why not just:

dframedesc[dframedesc['col_to_compare'] < dframelines]
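
For context on the error: `dframedesc < dframelines` compares every cell at once and returns a whole boolean DataFrame, which `if` cannot reduce to a single True/False. Also, iterating directly over a DataFrame yields its column labels, not its rows. Here is a minimal sketch of the vectorized approach that ends with the column names themselves. The sample data is made up to stand in for the SQL result, and the names `counts` and `nancol_alt` are my own; treat it as an illustration, not the only way to do it.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the frame built from cursor.fetchall()
dframesql = pd.DataFrame({
    "id": [1, 2, 3],
    "name": ["a", None, "c"],
    "score": [1.5, np.nan, np.nan],
})

dframelines = len(dframesql)   # total number of rows
counts = dframesql.count()     # Series: non-null count per column

# Boolean-mask the Series, then keep the labels of the columns that fall short
nancol = counts[counts < dframelines].index.tolist()

# Equivalent check that skips the counting step
nancol_alt = dframesql.columns[dframesql.isnull().any()].tolist()

print(nancol)
```

Keeping `count()` as a Series (without wrapping it in `pd.DataFrame`) means the index already holds the column names, so no placeholder column label is needed.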