3

I have the following dataframe:

>>> d = {'route1': ['a', 'b'], 'route2': ['c', 'd'], 'val1': [1,2]}
>>> df = pd.DataFrame(data=d)
>>> df
  route1 route2  val1
0      a      c     1
1      b      d     2

What I am trying to do is pass a list of column names and print the row values associated with those columns:

>>> def row_eval(row, cols):
...     print(row.loc[cols])

In the dataframe above, I first find all the columns whose names contain "route" and then apply the row_eval function to each row. However, I get the following error:

>>> route_cols = [col for col in df.columns if 'route' in col]
>>> route_cols
['route1', 'route2']
>>> df.apply(lambda row: row_eval(row, route_cols))
KeyError: "None of [Index(['route1', 'route2'], dtype='object')] are in the [index]"

Result should look like this:

route1 a
route2 c

route1 b
route2 d

3
  • Will you please add what the resulting df should look like? Commented Feb 8, 2021 at 19:39
  • In your sample output, is that 2 dataframes? Commented Feb 8, 2021 at 19:45
  • @dfundako No, the function is being applied to each row individually, so the outputs are the values of row['route1'] and row['route2']. Commented Feb 8, 2021 at 19:54

3 Answers

2

Add axis=1 to the apply function:

df.apply(lambda row: row_eval(row, route_cols), axis=1)

Without axis=1, apply uses the default axis=0 and passes each column to the function as a Series. The index of that Series is the row labels 0 and 1, not the column names route1 and route2 that row.loc is trying to match, hence the KeyError. For a row-wise operation, where each row is passed to the function as a Series indexed by column name, you need axis=1.
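
To see the difference for yourself, here is a small sketch that records which index labels apply hands to the function under each axis setting. The helper record_labels and the two lists are names made up for this illustration; the dataframe is the one from the question.

```python
import pandas as pd

df = pd.DataFrame({'route1': ['a', 'b'], 'route2': ['c', 'd'], 'val1': [1, 2]})
route_cols = [col for col in df.columns if 'route' in col]

def record_labels(series, seen):
    # Hypothetical helper: remember the index labels of the Series apply() passes in.
    seen.append(list(series.index))

seen_default = []   # filled with the default axis=0
seen_rowwise = []   # filled with axis=1
df.apply(lambda s: record_labels(s, seen_default))
df.apply(lambda s: record_labels(s, seen_rowwise), axis=1)

print(seen_default[0])  # [0, 1]
print(seen_rowwise[0])  # ['route1', 'route2', 'val1']

# With axis=1 the column names are in the index, so .loc[route_cols] works.
picked = df.apply(lambda row: ' '.join(row.loc[route_cols]), axis=1)
print(picked.tolist())  # ['a c', 'b d']
```

The first print shows the row labels, which is why the lookup by column name fails without axis=1.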

1

One way to print all of the values in these columns is to loop over the rows and, inside that loop, over the route columns, printing each column name together with its value. The if statement is optional; it adds a break between rows, like in your example.

for idx in df.index:
    for column in route_cols:
        print(f'{column} {df.loc[idx, column]}')
        if column == route_cols[-1]:
            print('\n')
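
If you would rather build the text first and print it once, the same nested loop could be sketched with iterrows as below. The function name format_rows is an assumption made for this example, not something from the question.

```python
import pandas as pd

df = pd.DataFrame({'route1': ['a', 'b'], 'route2': ['c', 'd'], 'val1': [1, 2]})
route_cols = [col for col in df.columns if 'route' in col]

def format_rows(frame, cols):
    """Return one text block per row, with a 'column value' line per column."""
    blocks = []
    # iterrows() yields (row label, row Series); only the selected columns are kept.
    for _, row in frame[cols].iterrows():
        blocks.append('\n'.join(f'{col} {val}' for col, val in row.items()))
    return blocks

# A blank line between blocks separates the rows.
print('\n\n'.join(format_rows(df, route_cols)))
```

Returning the blocks instead of printing inside the loop makes the result easier to reuse or check.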

0

To get you started, you can use either .melt() or .stack() to get very close to your expected output. I'm not 100% sure whether you're looking for two dataframes or not.

df[route_cols].stack()

0  route1    a
   route2    c
1  route1    b
   route2    d
dtype: object
    
df[route_cols].melt()

  variable value
0   route1     a
1   route1     b
2   route2     c
3   route2     d
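
If you want the values grouped per row rather than in one long result, one possible follow-up to .stack() is to group on the outer index level. This is only a sketch; the variable per_row is a name chosen for this example.

```python
import pandas as pd

df = pd.DataFrame({'route1': ['a', 'b'], 'route2': ['c', 'd'], 'val1': [1, 2]})
route_cols = [col for col in df.columns if 'route' in col]

# stack() gives a Series with a two-level index: (row label, column name).
stacked = df[route_cols].stack()

per_row = {}
for row_label, group in stacked.groupby(level=0):
    # Drop the row-label level so each group is indexed by column name only.
    per_row[row_label] = group.droplevel(0).to_dict()

for values in per_row.values():
    for column, value in values.items():
        print(column, value)
    print()
```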
