I have a DataFrame and want to extract 3 columns from it, one of which is chosen by the user. I put the columns in a list, but I need something I can iterate over with a for loop. So far I have managed by building a dictionary from 2 of the columns, making a list of each and zipping them, but I really need all 3 columns.

My code:

Data=pd.read_csv(----------)
selec=input("What month would you want to show?")
NewData=[(Data['Country']),(Data['City']),(Data[selec].astype('int64'))]

#here I try to iterate:
iteration=[i for i in NewData if NewData[i]<=25] 
print (iteration)
*TypeError: list indices must be integers or slices, not Series*

My CSV is the following:

I want to be able to choose the month with the variable "selec" and filter the results of the month I've chosen... so the output for selec="Feb" would be:

I also tried loc/iloc, but had no luck (unhashable type: 'list').

1 Answer

The example below shows how you can:

  • select specific columns from a DataFrame by passing a list of column names between the selection brackets
  • select specific rows from a DataFrame by passing a condition between the selection brackets
  • iterate over the rows of a DataFrame, although I don't think you need to: if you'd like to keep working with the DataFrame after filtering it, the condition-based selection is the better choice (you won't have to put the rows back together, and it will likely be faster because pandas is optimized for bulk operations)

import pandas as pd

# this is just for testing, instead of pd.read_csv(...)
df = pd.DataFrame([
    dict(Country="Spain", City="Madrid", Jan="15", Feb="16", Mar="17", Apr="18", May=""),
    dict(Country="Spain", City="Galicia", Jan="1", Feb="2", Mar="3", Apr="4", May=""),
    dict(Country="France", City="Paris", Jan="0", Feb="2", Mar="3", Apr="4", May=""),
    dict(Country="Algeria", City="Argel", Jan="20", Feb="28", Mar="29", Apr="30", May=""),
])

print("---- Original df:")
print(df)

selec = "Feb"  # let's pretend this comes from input()

print("\n---- Just the 3 columns:")
df = df[["Country", "City", selec]].copy()  # narrow down the df to just the 3 columns; .copy() avoids SettingWithCopyWarning on older pandas
df[selec] = df[selec].astype("int64")  # convert the selec column to proper type
print(df)

print("\n---- Filtered dataframe:")
df1 = df[df[selec] <= 25]
print(df1)

print("\n---- Iterated & filtered rows:")
for row in df.itertuples():
    # we could also use row[3] instead of getattr(...)
    if getattr(row, selec) <= 25:
        print(row)

Output:

---- Original df:
   Country     City Jan Feb Mar Apr May
0    Spain   Madrid  15  16  17  18
1    Spain  Galicia   1   2   3   4
2   France    Paris   0   2   3   4
3  Algeria    Argel  20  28  29  30

---- Just the 3 columns:
   Country     City  Feb
0    Spain   Madrid   16
1    Spain  Galicia    2
2   France    Paris    2
3  Algeria    Argel   28

---- Filtered dataframe:
  Country     City  Feb
0   Spain   Madrid   16
1   Spain  Galicia    2
2  France    Paris    2

---- Iterated & filtered rows:
Pandas(Index=0, Country='Spain', City='Madrid', Feb=16)
Pandas(Index=1, Country='Spain', City='Galicia', Feb=2)
Pandas(Index=2, Country='France', City='Paris', Feb=2)
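
As for the error in the question: NewData is a plain list holding three Series, so in the list comprehension each i is a whole Series, and NewData[i] tries to index the list with a Series, which raises the TypeError. If you do want a plain Python loop over the three columns, as you did before with two of them, zip accepts three iterables just as well. A minimal sketch, assuming the same column names as above; the sample data and the name rows are only for illustration:

```python
import pandas as pd

# hypothetical sample data standing in for pd.read_csv(...)
df = pd.DataFrame({
    "Country": ["Spain", "Spain", "France", "Algeria"],
    "City": ["Madrid", "Galicia", "Paris", "Argel"],
    "Feb": ["16", "2", "2", "28"],
})

selec = "Feb"  # stands in for the value read with input()

# zip walks the three columns in parallel and yields one
# (country, city, value) tuple per row
rows = zip(df["Country"], df["City"], df[selec].astype("int64"))

# keep only the rows whose value in the chosen month is at most 25
iteration = [(country, city, value) for country, city, value in rows if value <= 25]
print(iteration)
```

This gives you a list of tuples rather than a DataFrame, so the boolean filter shown earlier is still the better fit if you need to keep working with pandas afterwards.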