11

I am attempting to create a graph by querying values in a pandas DataFrame.

In this line:

data1 = [np.array(df.query('type == i')['continuous'])
         for i in ('Type1', 'Type2', 'Type3', 'Type4')]

I get the error:

UndefinedVariableError: name 'i' is not defined

What am I missing?

3 Answers

16

The i in your query expression

df.query('type == i')

is just the literal name i inside the query string, not the value of your Python loop variable. Since there are no quotes around it, pandas interprets it as the name of another column in your DataFrame, i.e. it looks for rows where

df['type'] == df['i']

Since there is no i column, you get an UndefinedVariableError.
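A minimal sketch of that behaviour, using a made-up DataFrame (the column named other and all values are hypothetical, not from the question). A bare name that matches a column is compared row by row; a bare name that matches nothing raises the error, even though a Python variable called i exists.

```python
import pandas as pd

# Hypothetical data: 'type' and 'other' are both columns.
df = pd.DataFrame({
    "type": ["Type1", "Type2", "Type1"],
    "other": ["Type1", "Type1", "Type2"],
})

# A bare name that matches a column is compared row by row.
same = df.query("type == other")

# A bare name that matches no column cannot be resolved,
# even though a Python variable with that name exists.
i = "Type1"
try:
    df.query("type == i")
    error_name = None
except NameError as exc:  # UndefinedVariableError is a NameError subclass
    error_name = type(exc).__name__

print(len(same), error_name)
```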

It looks like you intended to query where the values in the type column are equal to the string variable named i, i.e. where

df['type'] == 'Type1'
df['type'] == 'Type2' # etc.

In this case you need to actually insert the value of i into the query expression:

df.query('type == "%s"' % i)

The extra set of quotes is necessary if 'Type1', 'Type2' etc. are values within the type column, but not if they are the names of other columns in the DataFrame.
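Put together, the full line from the question might look like the sketch below. The DataFrame contents are invented for illustration; only the column names type and continuous come from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the asker's DataFrame.
df = pd.DataFrame({
    "type": ["Type1", "Type2", "Type1", "Type3", "Type4"],
    "continuous": [0.5, 1.5, 2.5, 3.5, 4.5],
})

type_names = ("Type1", "Type2", "Type3", "Type4")

# On each iteration the current value of i is formatted into the query
# string, so the first query that pandas sees is: type == "Type1"
data1 = [np.array(df.query('type == "%s"' % i)["continuous"])
         for i in type_names]
```

Each element of data1 is then a NumPy array holding the continuous values for one type.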

6 Comments

Hi ali_m, I must be missing something in your explanation. Apologies, I am still very new at this! This line now looks like this: data1 = [np.array(df.query('type == %s' %type) ['continous']) for i in ('Type1', 'Type2', 'Type3', 'Type4') It tells me that Type1 is undefined. Type1 is a value within column 'type.' What am I doing wrong?
Sorry, that's partly my fault - it wasn't clear whether 'Type1', 'Type2' etc. were values in the type column or the names of other columns. In the former case you also need to enclose the string value in quotes within the expression: df.query('type == "%s"' % i)
Thanks for working through this with me. I am still struggling with this. Would you be able to post a full line example? My line now looks like this: data1 = [np.array(df.query('type == "%s"' %Type1 %Type2 %Type3 %Type4)['continuous'])] and I still get the same error that Type1 is undefined.
No. i is the name of a string variable, whose value will change from 'Type1' to 'Type2' etc. as you iterate over the tuple of strings. On each loop iteration you want to insert the current value of the variable i into the query string, hence df.query('type == "%s"' % i) for i in ('Type1', 'Type2', ..., )
That did it! I have another error further down that I will try and solve now, but this definitely answered my question. Thank you again for your patience.
3
data1 = [np.array(df.query('type == @i')['continuous'])
         for i in ('Type1', 'Type2', 'Type3', 'Type4')]

Use '@' to refer to variables.

Please refer to the documentation, which says:

You can refer to variables in the environment by prefixing them with an ‘@’ character like @a + b.
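A small runnable sketch of the '@' form, again with invented data. The helper values_for and its parameter names are my own additions for illustration; wrapping the query in a function is just one way to keep the referenced variable an ordinary local of the code that calls query.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the asker's DataFrame.
df = pd.DataFrame({
    "type": ["Type1", "Type2", "Type1", "Type3", "Type4"],
    "continuous": [0.5, 1.5, 2.5, 3.5, 4.5],
})


def values_for(frame, type_name):
    """Return the 'continuous' values for one type (hypothetical helper)."""
    # @type_name is looked up in the calling Python scope,
    # not among the DataFrame's columns.
    return np.array(frame.query("type == @type_name")["continuous"])


data1 = [values_for(df, i) for i in ("Type1", "Type2", "Type3", "Type4")]
```

Because the value is passed as a variable rather than pasted into the string, no extra quoting is needed inside the query.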

2 Comments

Hi, and thanks for the answer. It would really help our readers if you could explain why and how your answer solves the OP's problem.
@SimasJoneliunas Thanks for your suggestion. I have added the link to the documentation.
2

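I know this is late, but maybe it helps somebody: use double quotes for i.

data1 = [np.array(df.query('type == "i"')['continuous'])
         for i in ('Type1', 'Type2', 'Type3', 'Type4')]

Note that this only makes the error go away: inside the quotes, "i" is a string literal, so the query matches rows whose type is literally "i" rather than the current value of the loop variable.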
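A quick check of what a quoted "i" matches, using hypothetical data. Inside the query string, "i" is a string literal, so no error is raised, but the loop variable is never consulted.

```python
import pandas as pd

# Hypothetical data; no row has the literal type "i".
df = pd.DataFrame({
    "type": ["Type1", "Type2", "Type1"],
    "continuous": [0.5, 1.5, 2.5],
})

i = "Type1"

# Compares the column against the one-character string "i".
literal_match = df.query('type == "i"')

# Compares the column against the value held by the variable i.
value_match = df.query('type == "%s"' % i)

print(len(literal_match), len(value_match))
```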
