Hi, I want to set a column value in a Spark DataFrame by checking whether each row's name matches a row in another DataFrame.

Example:

df1:
average name
3.5      n1
1.2      n2
4.2      n3

df2:
name    
n1     
n1        
n1    
n2
n3
n1
n2
n3
n3

df_i_want:
average name
3.5      n1
3.5      n1
3.5      n1
1.2      n2
4.2      n3
3.5      n1
1.2      n2
4.2      n3
4.2      n3
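
Here is roughly how the example frames above could be built, as a minimal sketch assuming an existing SparkSession named spark:

# build the example DataFrames (assumes a SparkSession called `spark`)
df1 = spark.createDataFrame([(3.5, 'n1'), (1.2, 'n2'), (4.2, 'n3')], ['average', 'name'])
df2 = spark.createDataFrame([('n1',), ('n1',), ('n1',), ('n2',), ('n3',), ('n1',), ('n2',), ('n3',), ('n3',)], ['name'])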

2 Answers

All you need here is a join.

Join your DataFrame df2 with df1 on name, then select the columns in the order you want:

df3 = df2.join(df1, on = 'name').select('average', 'name')

The above snippet should give you the desired result.
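
For the sample data, you can inspect the result with show(); keep in mind that a join in Spark does not guarantee that df2's original row order is preserved:

# inspect the joined result (row order after a join is not guaranteed)
df3.show()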

You need a join for this:

# join the two DataFrames on name
df3 = df2.join(df1, on='name', how='left')

# reorder the columns
df3 = df3.select('average', 'name')

# sort by name
df3 = df3.orderBy('name', ascending=True)
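
If you prefer, these three steps can be chained into a single expression (same logic, just written in one statement):

# chained version of the steps above
df3 = df2.join(df1, on='name', how='left') \
         .select('average', 'name') \
         .orderBy('name', ascending=True)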
