I have string values in one column, and I would like to replace the substrings in that column with the values from other columns and replace all the plus signs with spaces (as shown below).
I have two List[String] mappings that are passed in dynamically; mapFrom and mapTo correspond to each other by index.
Description values: mapFrom: ["Child", "ChildAge", "ChildState"]
Column names: mapTo: ["name", "age", "state"]
Input example:
name, age, state, description
tiffany, 10, virginia, Child + ChildAge + ChildState
andrew, 11, california, ChildState + Child + ChildAge
tyler, 12, ohio, ChildAge + ChildState + Child
Expected result:
name, age, state, description
tiffany, 10, virginia, tiffany 10 virginia
andrew, 11, california, california andrew 11
tyler, 12, ohio, 12 ohio tyler
How can I achieve this using Spark Scala?
When I try the solution from "How to replace string values in one column with actual column values from other columns in the same dataframe?", the output becomes:
name, age, state, description
tiffany, 10, virginia, tiffany tiffanyAge tiffanyState
andrew, 11, california, andrewState andrew andrewAge
tyler, 12, ohio, tylerAge tylerState tyler
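The output above is consistent with the replacements being applied in list order: "Child" is a prefix of "ChildAge" and "ChildState", so replacing it first turns them into "tiffanyAge" and "tiffanyState", and the later keys no longer match. Below is a minimal sketch of that effect on plain Scala strings rather than a DataFrame, assuming the linked solution folds over the mappings in order. The names PlaceholderSketch, fill and fillInGivenOrder are hypothetical, and the row is modelled as a Map[String, String] purely for illustration.

```scala
object PlaceholderSketch {
  // Apply the (placeholder -> column value) pairs in the order given,
  // then collapse each " + " separator into a single space.
  private def substitute(description: String,
                         pairs: List[(String, String)],
                         row: Map[String, String]): String = {
    val replaced = pairs.foldLeft(description) { case (acc, (from, to)) =>
      acc.replace(from, row(to)) // String.replace is literal, not a regex
    }
    replaced.replace(" + ", " ")
  }

  // Naive version: uses mapFrom in the order it was passed in.
  // "Child" is replaced first and also hits the prefix of "ChildAge"/"ChildState".
  def fillInGivenOrder(description: String,
                       mapFrom: List[String],
                       mapTo: List[String],
                       row: Map[String, String]): String =
    substitute(description, mapFrom.zip(mapTo), row)

  // Sketch of a fix: replace the longest placeholders first so that a shorter
  // placeholder can never consume the prefix of a longer one.
  def fill(description: String,
           mapFrom: List[String],
           mapTo: List[String],
           row: Map[String, String]): String = {
    val longestFirst = mapFrom.zip(mapTo).sortBy { case (from, _) => -from.length }
    substitute(description, longestFirst, row)
  }
}
```

With mapFrom = List("Child", "ChildAge", "ChildState"), mapTo = List("name", "age", "state") and the tiffany row, fillInGivenOrder gives "tiffany tiffanyAge tiffanyState", while fill gives "tiffany 10 virginia". On a DataFrame, the same idea would presumably be to sort the zipped pairs by descending placeholder length before folding over them; that part is not shown here. One caveat: because the replacements are still sequential, a column value that itself contains a placeholder (for example a name equal to "Child") could be rewritten by a later step.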
ChildState + Child + Child: which one is age and which is name? How do you know?
tyler, 12, ohio, ChildAge + ChildState + ChildName and that this should be tyler, 12, ohio, ChildAge + ChildState + Child, is that correct?
ChildName in mapFrom are actually Child while all ChildName in the input are actually only Child. I edited the question to reflect this, please tell me if it's wrong.