
I have the following three dataframes:

import pandas as pd

df1 = pd.DataFrame(
{
"A_price": [10, 12, 15],
"B_price": [20, 19, 29],
"C_price": [23, 21, 4],
"D_price": [45, 47, 44],
},
index = ['01-01-2020', '01-02-2020', '01-03-2020']
)
df2 = pd.DataFrame(
{
"A_mid": [10, 12, 15],
"B_mid": [20, 19, 29],
"C_mid": [23, 21, 4],
"D_mid": [45, 47, 44],
},
index = ['01-01-2020', '01-02-2020', '01-03-2020']
)
df3 = pd.DataFrame(
{
"A_weight": [0.1, 0.2, 0.4],
"B_weight": [0.2, 0.5, 0.1],
"C_weight": [0.3, 0.2, 0.1],
"D_weight": [0.4, 0.1, 0.4],
},
index = ['01-01-2020', '01-02-2020', '01-03-2020']
)

I have defined the following function:

def price_weight(df1, df3):

    df_price_weight = pd.merge(df1, df3, left_index=True, right_index=True)
    if 'close' in df_price_weight.columns:
        df_price_weight.filter(regex=('close|weight'))
        df_price_weight.columns = df_price_weight.columns.str.split('_', expand=True)
        df_price_weight = df_price_weight.sort_index(axis=1)

    elif 'price' in df_price_weight.columns:
        df_price_weight.filter(regex=('price|weight'))
        df_price_weight.columns = df_price_weight.columns.str.split('_', expand=True)
        df_price_weight.rename(columns={'price':'close'}, inplace=True)
        df_price_weight = df_price_weight.sort_index(axis=1)
    
    else:
        df_price_weight.filter(regex=('mid|weight'))
        df_price_weight.columns = df_price_weight.columns.str.split('_', expand=True)
        df_price_weight.rename(columns={'mid':'close'}, inplace=True)
        df_price_weight = df_price_weight.sort_index(axis=1)

    return df_price_weight

For some reason, when I call price_weight(df1, df3), I don't get the right output. I should receive a dataframe with columns ['close', 'weight'], but I receive ['price', 'weight'].

How do I successfully define a function with multiple if statements to return the desired output?

UPDATE: I am trying to pass another function

def wmedian(dtfrm):
    df = dtfrm.unstack().sort_values('close')
    return df.loc[df['weight'].cumsum() > 0.5, 'close'].iloc[0]

where

dtfrm = price_weight(df1, df3)

The wmedian function should return a dataframe with close prices, but I am getting KeyError: 'close'.

What should I change in the function?

Thank you.

    What result do you get from df_price_weight.columns? When you look at that result, do you expect if 'close' in df_price_weight.columns: to be satisfied? Is it? Hint: what happens if you try 'x' in ['xx']? Commented Dec 28, 2021 at 20:47

1 Answer


The condition 'price' in df_price_weight.columns is never going to be True, because the exact string 'price' is not the name of a column.
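To make this concrete, here is a minimal sketch on a cut-down frame (the two-column df below is an assumption for illustration, not the merged frame from the question). Membership on the columns compares against whole labels, so a fragment of a label does not match:

```python
import pandas as pd

# Cut-down stand-in for the merged frame (illustrative data only).
df = pd.DataFrame({"A_price": [10, 12, 15], "A_weight": [0.1, 0.2, 0.4]})

# `in` on the columns Index looks for an exact label, not a substring.
has_price_label = "price" in df.columns      # no column is named exactly "price"
has_a_price_label = "A_price" in df.columns  # this exact label does exist

print(has_price_label, has_a_price_label)
```

This is why the function in the question always falls through to the else branch, whatever the column names contain.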

Instead, I suggest:

any(('price' in column_name) for column_name in df_price_weight.columns)
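As a rough sketch of how that check could slot into the function from the question: the helper name has_word and the restructuring into a single field variable are my own assumptions, not something from the original post. The sketch also assigns the result of filter(), since filter() returns a new frame rather than modifying the existing one.

```python
import pandas as pd

def has_word(columns, word):
    """Hypothetical helper: True if any column label contains `word`."""
    return any(word in name for name in columns)

def price_weight(df_prices, df_weights):
    merged = pd.merge(df_prices, df_weights, left_index=True, right_index=True)
    # Pick the price-like field by substring, not by exact label.
    if has_word(merged.columns, "close"):
        field = "close"
    elif has_word(merged.columns, "price"):
        field = "price"
    else:
        field = "mid"
    # filter() is not in-place, so keep its return value.
    merged = merged.filter(regex=f"{field}|weight")
    # "A_price" becomes ("A", "price"): a two-level column index.
    merged.columns = merged.columns.str.split("_", expand=True)
    merged = merged.rename(columns={field: "close"})
    return merged.sort_index(axis=1)

# Small illustrative inputs in the same shape as df1 and df3.
dates = ["01-01-2020", "01-02-2020"]
prices = pd.DataFrame({"A_price": [10, 12], "B_price": [20, 19]}, index=dates)
weights = pd.DataFrame({"A_weight": [0.1, 0.2], "B_weight": [0.9, 0.8]}, index=dates)

result = price_weight(prices, weights)
print(result.columns.tolist())
```

Choosing the field once and sharing the remaining steps avoids repeating the same four lines in every branch, but the original three-branch layout works too once each condition uses any(...).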

3 Comments

please see the update in my post. Your solution works with regards to the first function, but I am trying to pass a second function with no luck. Do you know what seems to be the problem? Thank you.
@MathMan99 Please don't edit your question to add a new, seemingly-unrelated, question into it. Post it as a new question instead. But try to include a minimal reproducible example in the new question. An example rebuilt from scratch which is focused on the problem and nothing else. We're not here to debug your code for you.
@MathMan99 For instance, your first question contained a lot of complicated stuff: merging pandas dataframes, filtering with regexps, splitting strings, renaming columns, etc. But the actual question was only about the behaviour of the in operator on a list of strings. It would have been much better if you had made some effort to isolate this precise issue and focus the question on that. In fact, if you had done that, you probably would have found the solution on your own.
