1

I have the following dataframe

import pandas as pd

df = pd.DataFrame({
    'Column_1': ['Position', 'Start', 'End', 'Position'], 
    'Original_1': ['Open', 'Barn', 'Grass', 'Bubble'], 
    'Latest_1': ['Shut', 'Horn', 'Date', 'Dinner'], 
    'Column_2': ['Start', 'Position', 'End', 'During'], 
    'Original_2': ['Sky', 'Hold', 'Car', 'House'], 
    'Latest_2': ['Pedal', 'Lap', 'Two', 'Force'], 
    'Column_3': ['Start', 'End', 'Position', 'During'], 
    'Original_3': ['Leave', 'Dog', 'Block', 'Hope'], 
    'Latest_3': ['Sear', 'Crawl', 'Enter', 'Night']
})

For every instance where the word Position is in 'Column_1', 'Column_2', or 'Column_3', I want to capture the associated values in 'Original_1', 'Original_2', 'Original_3' and assign them to the new column named 'Original_Values'.

The following code accomplishes that, but only one column at a time, and it produces three separate result columns.

df['Original_Value1'] = df.loc[df['Column_1'] == 'Position', 'Original_1']
df['Original_Value2'] = df.loc[df['Column_2'] == 'Position', 'Original_2']
df['Original_Value3'] = df.loc[df['Column_3'] == 'Position', 'Original_3']

Is there a way to rewrite the above code so that it works across the whole data frame, rather than one specified column at a time?

I'm hoping to create one column ('Original_Values') with the following result:

0      Open
1      Hold
2     Block
3    Bubble
Name: Original_Values, dtype: object
2
  • Is this the same as your question from yesterday? Commented Jun 20, 2018 at 18:24
  • I think this is the kind of question where you need to explain why you are doing it. The question lacks the context needed to understand where you want to apply the code. Commented Jun 20, 2018 at 19:07

3 Answers

2

One way to do it, with df.apply():

def choose_orig(row):
    if row['Column_1'] == 'Position':
        return row['Original_1']
    elif row['Column_2'] == 'Position':
        return row['Original_2']
    elif row['Column_3'] == 'Position':
        return row['Original_3']
    return ''

df['Original_Values'] = df.apply(choose_orig, axis=1)

The axis=1 argument to df.apply() causes the choose_orig() function to be called once for each row of the dataframe.

Note that this uses a default value of the empty string, '', when none of the columns match the word 'Position'.
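If you don't want to hard-code each column in the if/elif chain, the same first-match-wins logic can be sketched with numpy.select. This is only a sketch: it assumes every Column_N has a matching Original_N with the same suffix, and the suffixes, conditions, and choices names are introduced here for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Column_1': ['Position', 'Start', 'End', 'Position'],
    'Original_1': ['Open', 'Barn', 'Grass', 'Bubble'],
    'Column_2': ['Start', 'Position', 'End', 'During'],
    'Original_2': ['Sky', 'Hold', 'Car', 'House'],
    'Column_3': ['Start', 'End', 'Position', 'During'],
    'Original_3': ['Leave', 'Dog', 'Block', 'Hope'],
})

# Assumption: each 'Column_<suffix>' has a partner named 'Original_<suffix>'.
suffixes = [c.split('_', 1)[1] for c in df.columns if c.startswith('Column_')]

# One boolean condition and one candidate value array per column pair.
conditions = [(df[f'Column_{s}'] == 'Position').to_numpy() for s in suffixes]
choices = [df[f'Original_{s}'].to_numpy() for s in suffixes]

# np.select picks the first condition that is True in each row,
# mirroring the if/elif order; rows with no match get ''.
df['Original_Values'] = np.select(conditions, choices, default='')

print(df['Original_Values'])
```

Adding a Column_4/Original_4 pair to the frame would be picked up without changing the code.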


2

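How about building a boolean mask from the three Column_* columns and multiplying it by the values in the three Original_* columns? Multiplying a string by False gives an empty string, so taking max() across each row keeps the one matching value. The positional slices below assume the columns are sorted alphabetically (Column_*, then Latest_*, then Original_*), which is how older pandas versions ordered a DataFrame built from a dict; if your columns are in the order written in the question, select them by name instead.

df['Original_Values'] = ((df.iloc[:,:3] == 'Position') * df.iloc[:,6:9].values).max(1)

print(df['Original_Values'])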

Returns:

0      Open
1      Hold
2     Block
3    Bubble
Name: Original_Values, dtype: object
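Here is a variant of the mask idea that selects columns by name, so it does not depend on column order. It is a rough sketch, not the exact method above: it swaps the multiply-and-max step for DataFrame.where plus a backfill, and mask_cols and val_cols are names made up for this example.

```python
import pandas as pd

df = pd.DataFrame({
    'Column_1': ['Position', 'Start', 'End', 'Position'],
    'Original_1': ['Open', 'Barn', 'Grass', 'Bubble'],
    'Latest_1': ['Shut', 'Horn', 'Date', 'Dinner'],
    'Column_2': ['Start', 'Position', 'End', 'During'],
    'Original_2': ['Sky', 'Hold', 'Car', 'House'],
    'Latest_2': ['Pedal', 'Lap', 'Two', 'Force'],
    'Column_3': ['Start', 'End', 'Position', 'During'],
    'Original_3': ['Leave', 'Dog', 'Block', 'Hope'],
    'Latest_3': ['Sear', 'Crawl', 'Enter', 'Night'],
})

mask_cols = ['Column_1', 'Column_2', 'Column_3']
val_cols = ['Original_1', 'Original_2', 'Original_3']

# Boolean array, True where the mask column holds 'Position'.
mask = (df[mask_cols] == 'Position').to_numpy()

# Keep values only where the mask is True; everything else becomes NaN.
picked = df[val_cols].where(mask)

# Backfill along each row so the first kept value lands in the first column.
df['Original_Values'] = picked.bfill(axis=1).iloc[:, 0]

print(df['Original_Values'])
```

Rows where no column equals 'Position' end up as NaN rather than an empty string.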

0

Here's a kinda silly way to do it with some stacking, which might perform better if you have a very large df and need to avoid axis=1.

  • Stack the first three columns to create a list of the index and which 'Original' column the value corresponds to
  • Stack the columns from which you want to get the values. Use the above list to reindex it, so you return the appropriate value.
  • Bring those values back to the original df based on the original row index.

Here's the code:

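import re

mask_list = ['Column_1', 'Column_2', 'Column_3']
val_list = ['Original_1', 'Original_2', 'Original_3']

# (row, column) pairs where one of the mask columns equals 'Position'
stacked = df[mask_list].stack()
idx = stacked[stacked == 'Position'].index.tolist()
# Map each 'Column_N' label to the matching 'Original_N' label
idx = [(x, re.sub('(.*_)', 'Original_', y)) for x, y in idx]

# Look up those (row, 'Original_N') pairs, then align back on the row index
df['Original_Values'] = df[val_list].stack().reindex(idx).reset_index(level=1).drop(columns='level_1')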

df is now:

   Column_1  Column_2  Column_3 ... Original_Values
0  Position     Start     Start ...            Open
1     Start  Position       End ...            Hold
2       End       End  Position ...           Block
3  Position    During    During ...          Bubble

If 'Position' is not found in any of the columns in mask_list, Original_Values becomes NaN for that row. If you need to scale it to more columns, simply add them to mask_list and val_list.
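The same reshape-then-filter idea can also be sketched with pd.wide_to_long, which avoids the regex step. This is a different technique from the stack/reindex code above, offered only as a sketch: it assumes the column names follow the stub_number pattern from the question, and long_df and matches are names introduced for the example.

```python
import pandas as pd

df = pd.DataFrame({
    'Column_1': ['Position', 'Start', 'End', 'Position'],
    'Original_1': ['Open', 'Barn', 'Grass', 'Bubble'],
    'Latest_1': ['Shut', 'Horn', 'Date', 'Dinner'],
    'Column_2': ['Start', 'Position', 'End', 'During'],
    'Original_2': ['Sky', 'Hold', 'Car', 'House'],
    'Latest_2': ['Pedal', 'Lap', 'Two', 'Force'],
    'Column_3': ['Start', 'End', 'Position', 'During'],
    'Original_3': ['Leave', 'Dog', 'Block', 'Hope'],
    'Latest_3': ['Sear', 'Crawl', 'Enter', 'Night'],
})

# Reshape to one row per (original row, suffix number) with columns
# 'Column', 'Original' and 'Latest'. reset_index() exposes the row
# labels as a column named 'index' so they can serve as the id.
long_df = pd.wide_to_long(
    df.reset_index(),
    stubnames=['Column', 'Original', 'Latest'],
    i='index',
    j='n',
    sep='_',
)

# Keep the rows whose 'Column' value is 'Position' and take 'Original'.
matches = long_df.loc[long_df['Column'] == 'Position', 'Original']

# Drop the suffix level so the Series aligns with df on the row index.
df['Original_Values'] = matches.droplevel('n')

print(df['Original_Values'])
```

As with the stacking version, rows with no 'Position' come out as NaN, and a row with more than one 'Position' would produce duplicate index labels that need handling before the assignment.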
