I'm having trouble passing a variable that holds my expressions to selectExpr on a DataFrame.

So I have my variable as:

expression = '"substr(value,1,1) as qffffffffbf3ef0cf","substr(value,2,1) as q6a0aaf20"'

And trying to use this on the dataframe as:

ascii_df.selectExpr(expression).show(1)

However, I keep getting a mismatched input error. If I put the expressions in directly, as follows, it works:

ascii_df.selectExpr("substr(value,1,1) as qffffffffbf3ef0cf","substr(value,2,1) as q6a0aaf20").show(1)

Is there a way of doing this in PySpark?

1 Answer

You are in fact using two separate expressions. When using them directly in selectExpr you are using the expressions as two separate arguments to selectExpr:

selectExpr("substr(value,1,1) as qffffffffbf3ef0cf","substr(value,2,1) as q6a0aaf20")

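However, expression is a single string, so selectExpr receives only one argument and Spark tries to parse the whole thing, embedded quotes and comma included, as one SQL expression. That is why you get the mismatched input error. Instead, put the expressions in a list: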

expressions = ["substr(value,1,1) as qffffffffbf3ef0cf","substr(value,2,1) as q6a0aaf20"]
ascii_df.selectExpr(expressions).show(1)
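
If the expressions are generated rather than typed by hand, one option is to build the list in plain Python and only then hand it to selectExpr. The following is a minimal sketch, not part of the original answer: build_substr_exprs is a hypothetical helper name, and it assumes each alias corresponds to one character position of the value column, as in the question.

```python
# Hypothetical helper (not a PySpark API): build one
# "substr(column,pos,1) as alias" string per output column.
def build_substr_exprs(aliases, column="value"):
    # Positions are 1-based, matching Spark SQL's substr.
    return [
        f"substr({column},{pos},1) as {alias}"
        for pos, alias in enumerate(aliases, start=1)
    ]

expressions = build_substr_exprs(["qffffffffbf3ef0cf", "q6a0aaf20"])

# With the DataFrame from the question, each list element becomes
# its own expression:
# ascii_df.selectExpr(*expressions).show(1)   # unpacked into separate arguments
# ascii_df.selectExpr(expressions).show(1)    # PySpark also accepts a single list
```

Unpacking with * makes it explicit that each string is a separate argument, which is the same shape as the direct call that worked in the question.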