I am fetching data from a MySQL table using PySpark like below.
query = ("(select ifnull(max(id), 0) as maxval, ifnull(min(id), 0) as minval, "
         "ifnull(min(test_time), '1900-01-01 00:00:00') as mintime, "
         "ifnull(max(test_time), '1900-01-01 00:00:00') as maxtime "
         "FROM `{}`) as `{}`".format(table, table))

df = sqlContext.read.format("jdbc") \
    .option("url", "{}:{}/{}".format(domain, port, mysqldb)) \
    .option("driver", "com.mysql.jdbc.Driver") \
    .option("dbtable", query) \
    .option("user", mysql_user) \
    .option("password", password) \
    .load()
The result of df.show() is below:
+------+------+-------------------+-------------------+
|maxval|minval| mintime| maxtime|
+------+------+-------------------+-------------------+
| 1721| 1|2017-03-09 22:15:49|2017-12-14 05:17:04|
+------+------+-------------------+-------------------+
Now I want to get each column and its value separately, like this:
max_val = 1721
min_val = 1
min_time = 2017-03-09 22:15:49
max_time = 2017-12-14 05:17:04
So far I have done it like below.
max_val = df.select('maxval').collect()[0].asDict()['maxval']
min_val = df.select('minval').collect()[0].asDict()['minval']
max_time = df.select('maxtime').collect()[0].asDict()['maxtime']
min_time = df.select('mintime').collect()[0].asDict()['mintime']
Is there a better way to do this in PySpark?