
I'm trying to pivot rows to columns. I'm using the pivot function, but it just gives me back the exact same table without any changes. The code runs without any errors, but I would like to reshape the data into attribute and value columns as shown below. Any help is greatly appreciated!

 // current database table

    Census_block_group  B08007e1    B08007m1    B08007e2    B08007m2
    010010201001          291         95         291         95
    010010201002          678        143         663         139

// what i need

    Census_block_group   attribute      value
     010010201001           B08007e1      291

//code

import org.apache.spark.sql.SQLContext

spark.conf.set("spark.sql.pivotMaxValues", 999999)

val df = censusBlocks.toDF
df.groupBy("B08007e1").pivot("census_block_group")
display(df)

2 Answers


What you are actually trying to do is an "unpivot" rather than a pivot. Spark does not have an unpivot function, but you can get the same result with the stack function.

import org.apache.spark.sql.functions._
import spark.implicits._ // for the $"..." column syntax (already in scope in notebooks / spark-shell)

// stack(n, name1, col1, name2, col2, ...) emits n (attribute, value) rows per input row
val unPivotDF = df.select($"Census_block_group",
  expr("stack(4, 'B08007e1', B08007e1, 'B08007m1', B08007m1, 'B08007e2', B08007e2, 'B08007m2', B08007m2) as (attribute, value)"))
unPivotDF.show()
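
With the two sample rows from the question, unPivotDF.show() should print something like this:

+------------------+---------+-----+
|Census_block_group|attribute|value|
+------------------+---------+-----+
|      010010201001| B08007e1|  291|
|      010010201001| B08007m1|   95|
|      010010201001| B08007e2|  291|
|      010010201001| B08007m2|   95|
|      010010201002| B08007e1|  678|
|      010010201002| B08007m1|  143|
|      010010201002| B08007e2|  663|
|      010010201002| B08007m2|  139|
+------------------+---------+-----+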

You can find more details about using the stack function here - https://sparkbyexamples.com/how-to-pivot-table-and-unpivot-a-spark-dataframe/


You actually want to "transpose", not "pivot". Here is another solution (sorry, it's a bit lengthy) :-)

import org.apache.spark.sql.functions._
import org.apache.spark.sql.types.StringType
import org.apache.spark.sql.{Column, DataFrame, SparkSession}

object Stackoverflow3 {

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("Test").master("local").getOrCreate()

    val df = <YOUR ORIGINAL DATAFRAME>

    val transposed = transform(df, Array("Census_block_group"))

    transposed
      .withColumn("Attribute", col("ColVal.col1"))
      .withColumn("Value", col("ColVal.col2"))
      .drop("ColVal")
      .show()
  }

  // Explode every non-fixed column into a (name, value) struct so each becomes its own row.
  def transform(df: DataFrame, fixedColumns: Array[String]): DataFrame = {

    val colsToTranspose = df.columns.diff(fixedColumns)

    // One struct(columnName, columnValue) per column to transpose; the fields are named col1 and col2.
    val createCols = colsToTranspose.foldLeft(Array.empty[Column]) {
      case (acc, name) => acc :+ struct(lit(name).cast(StringType), col(name).cast(StringType))
    }

    df
      .withColumn("colVal", explode(array(createCols: _*)))
      .select(Array("Census_block_group", "colVal").map(col): _*)
  }

}
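
For what it's worth, here is a minimal sketch of how you could exercise this with the sample data from the question, assuming it replaces the <YOUR ORIGINAL DATAFRAME> placeholder inside main above:

import spark.implicits._

// Sample data copied from the table in the question
val df = Seq(
  ("010010201001", 291, 95, 291, 95),
  ("010010201002", 678, 143, 663, 139)
).toDF("Census_block_group", "B08007e1", "B08007m1", "B08007e2", "B08007m2")

transform(df, Array("Census_block_group"))
  .withColumn("Attribute", col("colVal.col1"))
  .withColumn("Value", col("colVal.col2"))
  .drop("colVal")
  .show()

Like the stack approach in the other answer, this yields one (Attribute, Value) row per original column for each Census_block_group.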
