val df = sc.parallelize(Seq((201601, 100.5),
  (201602, 120.6),
  (201603, 450.2),
  (201604, 200.7),
  (201605, 121.4))).toDF("date", "volume")

val w = org.apache.spark.sql.expressions.Window.orderBy("date")    
val leadDf = df.withColumn("new_col", lag("volume", 1, 0).over(w))
leadDf.show()

+------+------+-------+
|  date|volume|new_col|
+------+------+-------+
|201601| 100.5|    0.0|
|201602| 120.6|  100.5|
|201603| 450.2|  120.6|
|201604| 200.7|  450.2|
|201605| 121.4|  200.7|
+------+------+-------+

This works fine.

But now I have one more column, territory, as shown below.

val df = sc.parallelize(Seq((201601, "ter1", 10.1),
  (201601, "ter2", 10.6),
  (201602, "ter1", 10.7),
  (201603, "ter3", 10.8),
  (201603, "ter4", 10.8),
  (201603, "ter3", 10.8),
  (201604, "ter4", 10.9))).toDF("date", "territory", "volume")

My requirement: for each row, I want the volume of the previous month for the same territory if it exists; if it does not exist, just assign 0.0.

  • How can I do this? Commented Jan 23, 2017 at 16:55
  • I tried doing it this way: Commented Jan 23, 2017 at 16:56
  • val w = org.apache.spark.sql.expressions.Window.orderBy("date", "territory"); val leadDf = df.withColumn("new_col", lag("volume", 1, 0).over(w)). But it doesn't work. Commented Jan 23, 2017 at 16:57
  • I just included territory in the orderBy clause, and it doesn't give the correct results. Commented Jan 23, 2017 at 16:57
  • I am using Spark 1.6.2, Scala 2.10 Commented Jan 23, 2017 at 17:24

1 Answer

If I understand correctly, you want the value from the previous date for the same territory.

If so, just add partitionBy, i.e. redefine your window spec as follows:

val w = org.apache.spark.sql.expressions.Window.partitionBy("territory").orderBy("date")
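
For completeness, here is a minimal sketch of how the partitioned window might be applied to the sample data from the question. It assumes a Spark 1.6 spark-shell session where `sc` and `sqlContext` are already defined and where the question's first example already runs; the name `lagDf` and the `0.0` default are choices made for this illustration, not part of the original post.

```scala
import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.lag
import sqlContext.implicits._ // needed for toDF; spark-shell normally imports this already

// Sample data from the question, with the territory values quoted as strings.
val df = sc.parallelize(Seq(
  (201601, "ter1", 10.1),
  (201601, "ter2", 10.6),
  (201602, "ter1", 10.7),
  (201603, "ter3", 10.8),
  (201603, "ter4", 10.8),
  (201603, "ter3", 10.8),
  (201604, "ter4", 10.9))).toDF("date", "territory", "volume")

// One window per territory, ordered by date inside each territory.
val w = Window.partitionBy("territory").orderBy("date")

// lag looks one row back within the same territory; 0.0 is used when there is no earlier row.
val lagDf = df.withColumn("new_col", lag("volume", 1, 0.0).over(w))
lagDf.show()
```

With this data, the ter1 row for 201602 should pick up 10.1 from 201601, and the first row of every territory should get 0.0. Note that lag returns the previous row in the partition, which is not necessarily the previous calendar month: if a territory skips a month, or has two rows for the same month (as ter3 does for 201603), an extra check on the date would be needed to enforce a strict "previous month" rule.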