Another solution, using the stack() function:
// toDF needs spark.implicits._ (already imported in spark-shell)
val df = Seq(
("John", "Jameson", "TRUE", "TRUE", "FALSE"),
("Kevin", "Smith", "TRUE", "FALSE", "TRUE")
).toDF("First Name", "Last Name", "Married", "Employed", "Children")
df.show(false)
df.createOrReplaceTempView("df")
+----------+---------+-------+--------+--------+
|First Name|Last Name|Married|Employed|Children|
+----------+---------+-------+--------+--------+
|John      |Jameson  |TRUE   |TRUE    |FALSE   |
|Kevin     |Smith    |TRUE   |FALSE   |TRUE    |
+----------+---------+-------+--------+--------+
spark.sql("""
select `First Name`, `Last Name`,
       stack(3, Married, "Married", Employed, "Employed", Children, "Children") as (Value, Criteria)
from df
""").show(false)
+----------+---------+-----+--------+
|First Name|Last Name|Value|Criteria|
+----------+---------+-----+--------+
|John      |Jameson  |TRUE |Married |
|John      |Jameson  |TRUE |Employed|
|John      |Jameson  |FALSE|Children|
|Kevin     |Smith    |TRUE |Married |
|Kevin     |Smith    |FALSE|Employed|
|Kevin     |Smith    |TRUE |Children|
+----------+---------+-----+--------+
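To see why the arguments are written as value/name pairs: stack(n, e1, ..., ek) splits its k expressions into n rows, left to right. Here is a rough plain-Scala model of that behaviour, just for illustration. stackModel is a made-up name, not a Spark function, and it assumes n > 0 and at least one expression:

```scala
// Plain-Scala model of stack(n, e1, ..., ek); illustration only, not Spark code.
def stackModel(n: Int, exprs: Seq[Any]): Vector[Seq[Any]] = {
  require(n > 0 && exprs.nonEmpty, "model assumes n > 0 and at least one expression")
  // Number of columns in each output row.
  val width = math.ceil(exprs.size.toDouble / n).toInt
  // Pad with null so every row has `width` cells, then cut into rows.
  exprs.padTo(n * width, null).grouped(width).toVector
}

// John's row from the example: three (value, name) pairs become three rows.
val johnRows = stackModel(3, Seq("TRUE", "Married", "TRUE", "Employed", "FALSE", "Children"))
```

So with n = 3 and six arguments, each output row gets two cells, which is what the (Value, Criteria) alias names.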
If you prefer the DataFrame API instead of a temp view:
df.selectExpr(
  "`First Name`",
  "`Last Name`",
  """stack(3, Married, "Married", Employed, "Employed", Children, "Children") as (value, criteria)"""
).show(false)
+----------+---------+-----+--------+
|First Name|Last Name|value|criteria|
+----------+---------+-----+--------+
|John      |Jameson  |TRUE |Married |
|John      |Jameson  |TRUE |Employed|
|John      |Jameson  |FALSE|Children|
|Kevin     |Smith    |TRUE |Married |
|Kevin     |Smith    |FALSE|Employed|
|Kevin     |Smith    |TRUE |Children|
+----------+---------+-----+--------+
Or:
import org.apache.spark.sql.functions.expr  // $"..." also needs spark.implicits._

df.select(
  $"First Name",
  $"Last Name",
  expr("""stack(3, Married, "Married", Employed, "Employed", Children, "Children") as (value, criteria)""")
).show(false)
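If there are many columns to unpivot, writing the stack(...) string by hand gets tedious. A minimal sketch of building it from a list of column names follows. stackExpr is a hypothetical helper, not part of Spark; it assumes all listed columns have the same type and that the names contain no backticks or single quotes:

```scala
// Hypothetical helper (not a Spark API): builds the stack(...) expression string
// for the given columns. Assumes the columns share one type and have plain names.
def stackExpr(cols: Seq[String],
              valueName: String = "value",
              keyName: String = "criteria"): String = {
  // Each column contributes a pair: the column reference, then its name as a string literal.
  val pairs = cols.map(c => s"`$c`, '$c'").mkString(", ")
  s"stack(${cols.size}, $pairs) as ($valueName, $keyName)"
}

val unpivot = stackExpr(Seq("Married", "Employed", "Children"))
```

With the df above, it could then be passed along with the id columns, for example df.selectExpr("`First Name`", "`Last Name`", unpivot).show(false).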