See the example below for how you can:
- select specific columns from a DataFrame by providing a list of column names between the selection brackets (link to tutorial)
- select specific rows from a DataFrame by providing a condition between the selection brackets (link to tutorial)
- iterate over the rows of a DataFrame, although I doubt you need it: if you'd like to keep working with the DataFrame after filtering it, the condition-based selection above is the better choice (you won't have to put the rows back together, and it will likely be faster because pandas is optimized for bulk operations)
import pandas as pd
# this is just for testing, instead of pd.read_csv(...)
df = pd.DataFrame([
    dict(Country="Spain", City="Madrid", Jan="15", Feb="16", Mar="17", Apr="18", May=""),
    dict(Country="Spain", City="Galicia", Jan="1", Feb="2", Mar="3", Apr="4", May=""),
    dict(Country="France", City="Paris", Jan="0", Feb="2", Mar="3", Apr="4", May=""),
    dict(Country="Algeria", City="Argel", Jan="20", Feb="28", Mar="29", Apr="30", May=""),
])
print("---- Original df:")
print(df)
selec = "Feb" # let's pretend this comes from input()
print("\n---- Just the 3 columns:")
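# narrow down the df to just the 3 columns; .copy() makes the result independent,
# so the assignment on the next line does not trigger a SettingWithCopyWarning
df = df[["Country", "City", selec]].copy()
df[selec] = df[selec].astype("int64") # convert the selec column from str to int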
print(df)
print("\n---- Filtered dataframe:")
df1 = df[df[selec] <= 25]
print(df1)
print("\n---- Iterated & filtered rows:")
for row in df.itertuples():
    # we could also use row[3] instead of getattr(...)
    if getattr(row, selec) <= 25:
        print(row)
Output:
---- Original df:
Country City Jan Feb Mar Apr May
0 Spain Madrid 15 16 17 18
1 Spain Galicia 1 2 3 4
2 France Paris 0 2 3 4
3 Algeria Argel 20 28 29 30
---- Just the 3 columns:
Country City Feb
0 Spain Madrid 16
1 Spain Galicia 2
2 France Paris 2
3 Algeria Argel 28
---- Filtered dataframe:
Country City Feb
0 Spain Madrid 16
1 Spain Galicia 2
2 France Paris 2
---- Iterated & filtered rows:
Pandas(Index=0, Country='Spain', City='Madrid', Feb=16)
Pandas(Index=1, Country='Spain', City='Galicia', Feb=2)
Pandas(Index=2, Country='France', City='Paris', Feb=2)
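One caveat with the example above: astype("int64") raises an error if the chosen column contains empty strings, as the May column does in the test data. Below is a minimal sketch of one way around that, assuming you are happy to treat empty cells as missing values. The helper name filter_by_month is hypothetical (it is not part of pandas), and the test data is trimmed to two month columns to keep it short.

```python
import pandas as pd

# trimmed-down test data, standing in for pd.read_csv(...)
df = pd.DataFrame([
    dict(Country="Spain", City="Madrid", Feb="16", May=""),
    dict(Country="France", City="Paris", Feb="2", May=""),
    dict(Country="Algeria", City="Argel", Feb="28", May=""),
])

# hypothetical helper, named here only for illustration
def filter_by_month(frame, month, limit):
    """Return Country, City and `month` for rows where `month` <= limit."""
    out = frame[["Country", "City", month]].copy()
    # errors="coerce" turns values that cannot be parsed (such as "") into NaN
    out[month] = pd.to_numeric(out[month], errors="coerce")
    # a comparison against NaN is False, so rows without data are left out
    return out[out[month] <= limit]

feb = filter_by_month(df, "Feb", 25)
may = filter_by_month(df, "May", 25)
print(feb)
print(may)
```

With this data, feb keeps the Madrid and Paris rows, while may comes back empty because none of its cells can be parsed as a number.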