I'm new to Spark and have a question about using a map function with a DataFrame. I have a Spark SQL DataFrame named df that looks like:
+----------+------------+------+
| time| tag| value|
+----------+------------+------+
|1399766400|A00000000001|1000.0|
|1399766401|A00000000002|1001.0|
+----------+------------+------+
I can select a subset of rows based on the tag value with:
temp = sqlContext.sql("SELECT * FROM df WHERE tag = 'A00000000001'")
temp.show(1)
which gives:
+----------+------------+------+
| time| tag| value|
+----------+------------+------+
|1399766400|A00000000001|1000.0|
+----------+------------+------+
Now I have a list of queries
x = ["SELECT * FROM df WHERE tag = 'A00000000001'", "SELECT * FROM df WHERE tag = 'A00000000002'"]
which I have stored as an RDD, and I would like to map over it to count the rows each query selects. I tried:
y = x.map(lambda x: sqlContext.sql(x).count())
y.take(2)
I expected the result to be [1, 1], but instead it raises:
TypeError: 'JavaPackage' object is not callable
Is it possible to run SQL queries on a DataFrame inside a map like this? If not, how should I do it?
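For reference, here is a plain-Python stand-in for the logic I am trying to express (rows copied from the example table above, no Spark involved), just to make the expected [1, 1] result concrete:

```python
# Plain-Python stand-in for the intended logic: count rows matching each tag.
# The rows mirror the example DataFrame above; no SparkContext is needed here.
rows = [
    {"time": 1399766400, "tag": "A00000000001", "value": 1000.0},
    {"time": 1399766401, "tag": "A00000000002", "value": 1001.0},
]

def count_tag(tag):
    """Count how many rows carry the given tag."""
    return sum(1 for r in rows if r["tag"] == tag)

counts = [count_tag(t) for t in ["A00000000001", "A00000000002"]]
print(counts)  # [1, 1] -- the output I expected from y.take(2)
```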