1

I am learning Python (Pandas) and working with the Pokemon DB (following Youtube tutorial) I am currently learning .loc and query methods so that I can access data in different ways. I encounter an error while working with df.query.

Please find my information below:

  1. I imported all my libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
  1. Read my DB file
df = pd.read_csv(r"C:\XXXX\pokemon.csv")
  1. I print the head of my df. Please see the results here

When I try to access the Pokemons that are type 1 = fire and type 2= flying using .loc it works and I have no errors as shown below:

df.loc[(df['Type 1'] == "Fire") & (df['Type 2'] == "Flying")]

Please see the results here

But, when I try to do the same using .query I get an error

df.query("'Type 1' == 'Fire' and 'Type 2' == 'Flying'")

The error is the following:

KeyError                                  Traceback (most recent call last)
Cell In[30], line 3
      1 #if we want to bring their rows to see their names. We need to be aware that it is 6 Pokemons
      2 #df.loc[(df['Type 1'] == "Fire") & (df['Type 2'] == "Flying")]
----> 3 df.query("'Type 1' == 'Fire' and 'Type 2' == 'Flying'")

File ~\anaconda3\envs\env1\lib\site-packages\pandas\core\frame.py:4605, in DataFrame.query(self, expr, inplace, **kwargs)
   4602 res = self.eval(expr, **kwargs)
   4604 try:
-> 4605     result = self.loc[res]
   4606 except ValueError:
   4607     # when res is multi-dimensional loc raises, but this is sometimes a
   4608     # valid query
   4609     result = self[res]

File ~\anaconda3\envs\env1\lib\site-packages\pandas\core\indexing.py:1153, in _LocationIndexer.__getitem__(self, key)
   1150 axis = self.axis or 0
   1152 maybe_callable = com.apply_if_callable(key, self.obj)
-> 1153 return self._getitem_axis(maybe_callable, axis=axis)

File ~\anaconda3\envs\env1\lib\site-packages\pandas\core\indexing.py:1393, in _LocIndexer._getitem_axis(self, key, axis)
   1391 # fall thru to straight lookup
   1392 self._validate_key(key, axis)
-> 1393 return self._get_label(key, axis=axis)

File ~\anaconda3\envs\env1\lib\site-packages\pandas\core\indexing.py:1343, in _LocIndexer._get_label(self, label, axis)
   1341 def _get_label(self, label, axis: AxisInt):
   1342     # GH#5567 this will fail if the label is not present in the axis.
-> 1343     return self.obj.xs(label, axis=axis)

File ~\anaconda3\envs\env1\lib\site-packages\pandas\core\generic.py:4236, in NDFrame.xs(self, key, axis, level, drop_level)
   4234             new_index = index[loc]
   4235 else:
-> 4236     loc = index.get_loc(key)
   4238     if isinstance(loc, np.ndarray):
   4239         if loc.dtype == np.bool_:

File ~\anaconda3\envs\env1\lib\site-packages\pandas\core\indexes\range.py:418, in RangeIndex.get_loc(self, key)
    416         raise KeyError(key) from err
    417 if isinstance(key, Hashable):
--> 418     raise KeyError(key)
    419 self._check_indexing_error(key)
    420 raise KeyError(key)

KeyError: False

I searched for the error, and I understand that KeyError happens when I am calling the wrong key or unexisting key. Why I am getting this issue when trying to retrieve Keys that I already use before?

**What I have tried until now? **

  1. I used other queries in my documents and they run just fine. such as:
df.query("Speed <= 10")
  1. I tried different syntaxes for my query thinking that maybe the "" and '' were wrong, but it did not work. I checked the documentation for that.

  2. I checked if maybe the names are wrong and verify the names of my columns with:

print(df.columns.tolist()) 
  1. I restarted my kernels and loaded my file again, thinking I moved or did something to change some settings.

  2. Updated my libraries in Anaconda just in case is something related to that

  3. I haven't tried the exceptions yet, as I am very curious why the video I am following has no issues

Please let me know if you have any ideas,

Thank you in advance

2
  • Note that your error is in a different query than that you posted, i.e. in df.query("'Type 1' == 'Poison' ") Commented Jan 1, 2024 at 9:42
  • @gtomer so sorry. Already changed to the right error. tried so many things. Commented Jan 1, 2024 at 9:45

1 Answer 1

1

You have to use backticks because Type 1 and Type 2 are not valid Python variable names:

>>> df.query("`Type 1` == 'Fire' and `Type 2` == 'Flying'")

                          Name Type 1  Type 2  Total   HP  Attack  Defense  Sp. Atk  Sp. Def  Speed  Generation  Legendary
#                                                                                                                         
6                    Charizard   Fire  Flying    534   78      84       78      109       85    100           1      False
6    CharizardMega Charizard Y   Fire  Flying    634   78     104       78      159      115    100           1      False
146                    Moltres   Fire  Flying    580   90     100       90      125       85     90           1       True
250                      Ho-oh   Fire  Flying    680  106     130       90      110      154     90           2       True
662                Fletchinder   Fire  Flying    382   62      73       55       56       52     84           6      False
663                 Talonflame   Fire  Flying    499   78      81       71       74       69    126           6      False

From documentation:

You can refer to column names that are not valid Python variable names by surrounding them in backticks.

Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.