I have two functions, foo and bar, that I want to write as follows:

import org.apache.spark.sql.{Column, DataFrame}

def foo(df: DataFrame, conditionString: String) = {
  val conditionColumn: Column = something(conditionString) // help me define "something"
  bar(df, conditionColumn)
}
def bar(df: DataFrame, conditionColumn: Column) = {
  df.where(conditionColumn)
}

Here conditionString is a SQL expression string such as "person.age >= 18 AND person.citizen == true".

Because reasons, I don't want to change the type signatures here. I feel this should work because if I could change the type signatures, I could just write:

def foobar(df : DataFrame, conditionString : String) = {
  df.where(conditionString)
}

This works because .where happily accepts a SQL expression string.

So, how can I turn a string representing a column expression into a column? If the expression were just the name of a single column in df I could just do col(colName), but that doesn't seem to take the range of expressions that .where does.

If you need more context for why I'm doing this: I'm working on a Databricks notebook that can only accept string arguments (and needs to take a condition as an argument), and it calls a library whose functions I want to take Column-typed arguments.

1 Answer

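You can use functions.expr (from org.apache.spark.sql.functions):

def expr(expr: String): Column

Parses the expression string into the column that it represents.

In your foo, that means something(conditionString) becomes expr(conditionString); the type signatures of foo and bar stay unchanged.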
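As a minimal sketch of how this could slot into the question's foo and bar: the ExprExample object, the local SparkSession, and the sample data with flat age and citizen columns are all assumptions made for illustration, not part of the original question. In a Databricks notebook the spark session already exists, so the builder call would not be needed there.

```scala
import org.apache.spark.sql.{Column, DataFrame, SparkSession}
import org.apache.spark.sql.functions.expr

object ExprExample {
  // bar keeps the Column-typed signature from the question.
  def bar(df: DataFrame, conditionColumn: Column): DataFrame =
    df.where(conditionColumn)

  // foo keeps the String-typed signature and parses the SQL string with expr.
  def foo(df: DataFrame, conditionString: String): DataFrame = {
    val conditionColumn: Column = expr(conditionString)
    bar(df, conditionColumn)
  }

  def main(args: Array[String]): Unit = {
    // Assumption: a local session for the demo; Databricks provides `spark` already.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("expr-example")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical sample data with flat columns instead of a nested `person` struct.
    val df = Seq(("Ann", 34, true), ("Bob", 17, true), ("Cy", 40, false))
      .toDF("name", "age", "citizen")

    // Only the row for Ann satisfies both parts of the condition.
    foo(df, "age >= 18 AND citizen = true").show()

    spark.stop()
  }
}
```

Because expr produces an ordinary Column, the result can be combined with other Column expressions (for example with && or ||) before it is passed on to bar.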
