2

In scala/spark code I have 1 Dataframe which contains some rows:

col1      col2
Abc       someValue1 
xyz       someValue2
lmn       someValue3
zmn       someValue4
pqr       someValue5
cda       someValue6 

And i have a variable of ArrayBuffer[String] which contains [xyz,pqr,abc];

I want to filter given dataframe based on given values in arraybuffer at col1.

In SQL it would be like:

select * from tableXyz where col1 in("xyz","pqr","abc");

1 Answer 1

1

Assuming you have your dataframe:

val df = sc.parallelize(Seq(("abc","someValue1"),
                            ("xyz","someValue2"),
                            ("lmn","someValue3"),
                            ("zmn","someValue4"),
                            ("pqr","someValue5"),
                            ("cda","someValue6")))
                 .toDF("col1","col2") 

+----+----------+
|col1|      col2|
+----+----------+
| abc|someValue1|
| xyz|someValue2|
| lmn|someValue3|
| zmn|someValue4|
| pqr|someValue5|
| cda|someValue6|
+----+----------+ 

Then you can define an UDF to filter the dataframe based on array's values:

val array = ArrayBuffer[String]("xyz","pqr","abc") 

val function: (String => Boolean) = (arg: String) => array.contains(arg)
val udfFiltering = udf(function)

val filtered = df.filter(udfFiltering(col("col1")))
filtered.show()
+----+----------+
|col1|      col2|
+----+----------+
| abc|someValue1|
| xyz|someValue2|
| pqr|someValue5|
+----+----------+

Alternately you can register your dataframe and sql-query it by SQLContext:

var elements = ""
array.foreach { el => elements += "\"" + el + "\"" + "," }
elements = elements.dropRight(1)    
val query = "select * from tableXyz where col1 in(" + elements + ")"

df.registerTempTable("tableXyz")
val filtered = sqlContext.sql(query)
filtered.show()
+----+----------+
|col1|      col2|
+----+----------+
| abc|someValue1|
| xyz|someValue2|
| pqr|someValue5|
+----+----------+
Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.