
I have a sequence of strings:

val listOfString : Seq[String] = Seq("a","b","c")

How can I write a transform like the following?

def addColumn(example: Seq[String]): DataFrame => DataFrame = {
  // some code which returns a transform that adds these strings as columns to a DataFrame
}
input

+---+
| id|
+---+
|  1|
+---+

output

+---+---+---+---+
| id|  a|  b|  c|
+---+---+---+---+
|  1|  0|  0|  0|
+---+---+---+---+

I am specifically interested in expressing this as a transform (a DataFrame => DataFrame function).


2 Answers


You can use the transform method of Dataset together with a single select statement:

import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.{col, lit}

def addColumns(extraCols: Seq[String])(df: DataFrame): DataFrame = {
  // keep the existing columns and append each extra name as a literal-0 column
  val selectCols = df.columns.map(col(_)) ++ extraCols.map(c => lit(0).as(c))
  df.select(selectCols: _*)
}


// usage example
val yourExtraColumns : Seq[String] = Seq("a","b","c")

df.transform(addColumns(yourExtraColumns))

Resources

https://towardsdatascience.com/dataframe-transform-spark-function-composition-eb8ec296c108

https://mungingdata.com/apache-spark/chaining-custom-dataframe-transformations/


2 Comments

Thanks for posting, but can you make an intermediate function with type DataFrame => DataFrame? The reason is that in my code I have a List[DataFrame => DataFrame].
You can also write the function above as def addColumns(extraCols: Seq[String]): DataFrame => DataFrame = { df => val selectCols = df.columns.map(col(_)) ++ extraCols.map(c => lit(0).as(c)); df.select(selectCols: _*) } — is this what you mean? It is the same as above, just declared with a different type in Scala.
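Spelled out fully, the curried variant discussed in this comment thread could be sketched as below (the `pipeline` value is an illustrative example of holding such transforms in a List, not part of the original answer; it assumes an existing DataFrame named `df`):

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.{col, lit}

// Returns a DataFrame => DataFrame, so it can be stored in a
// List[DataFrame => DataFrame] and composed or chained later.
def addColumns(extraCols: Seq[String]): DataFrame => DataFrame = { df =>
  val selectCols = df.columns.map(col(_)) ++ extraCols.map(c => lit(0).as(c))
  df.select(selectCols: _*)
}

// Illustrative usage: fold a list of transforms over a starting DataFrame.
val pipeline: List[DataFrame => DataFrame] = List(addColumns(Seq("a", "b", "c")))
val result = pipeline.foldLeft(df)((acc, t) => t(acc))
```

Note the body closes over the `df` parameter introduced by the lambda, so `df.columns` is evaluated per DataFrame rather than at function-construction time.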

Use .toDF() and pass your listOfString to rename the existing columns.

Example:

//sample dataframe
df.show()
//+---+---+---+
//| _1| _2| _3|
//+---+---+---+
//|  0|  0|  0|
//+---+---+---+


df.toDF(listOfString:_*).show()
//+---+---+---+
//|  a|  b|  c|
//+---+---+---+
//|  0|  0|  0|
//+---+---+---+

UPDATE:

Use foldLeft to add the columns with default values to the existing DataFrame.

import org.apache.spark.sql.functions.lit

val df = Seq(("1")).toDF("id")

val listOfString: Seq[String] = Seq("a", "b", "c")

val new_df = listOfString.foldLeft(df)((df, colName) => df.withColumn(colName, lit("0")))

new_df.show()
//+---+---+---+---+
//| id|  a|  b|  c|
//+---+---+---+---+
//|  1|  0|  0|  0|
//+---+---+---+---+

//or creating a function
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.lit

def addColumns(extraCols: Seq[String], df: DataFrame): DataFrame = {
  extraCols.foldLeft(df)((df, colName) => df.withColumn(colName, lit("0")))
}

addColumns(listOfString,df).show()
//+---+---+---+---+
//| id|  a|  b|  c|
//+---+---+---+---+
//|  1|  0|  0|  0|
//+---+---+---+---+
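To get the DataFrame => DataFrame shape requested in the comments below, the same foldLeft approach could be written in curried form (a sketch; it assumes `df` and `listOfString` are defined as in the snippet above):

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.lit

// Curried: addColumns(names) is itself a DataFrame => DataFrame,
// so it works with df.transform and fits in a List[DataFrame => DataFrame].
def addColumns(extraCols: Seq[String])(df: DataFrame): DataFrame =
  extraCols.foldLeft(df)((acc, colName) => acc.withColumn(colName, lit("0")))

df.transform(addColumns(listOfString))
```

Spark's Dataset.transform takes exactly a `DataFrame => DataFrame`, which is what partially applying `addColumns(listOfString)` produces.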

6 Comments

I want to make a transform of type DataFrame => DataFrame
@AdityaSeth, use toDF() on the existing DataFrame and the new DataFrame will have the new column names!
How are you initializing the value? I mean the default value.
@AdityaSeth, check my updated answer, which uses foldLeft to add the columns.
Thanks for posting, but can you make an intermediate function with type DataFrame => DataFrame? The reason is that in my code I have a List[DataFrame => DataFrame].
