I have the script below and I am stuck on this specific line:
datex = datetime.datetime.strptime(df1.start_time,'%Y-%m-%d %H:%M:%S')
I can't figure out how to extract the actual value from the start_time field and store it in the datex variable.
Can anyone help me, please? The full script is:
while iters < 10:
    time_to_add = iters * 900
    time_to_checkx = time_to_check + datetime.timedelta(seconds=time_to_add)
    iters = iters + 1
    session = 0
    for row in df1.rdd.collect():
        datex = datetime.datetime.strptime(df1.start_time,'%Y-%m-%d %H:%M:%S')
        print(datex)
        filterx = df1.filter(datex < time_to_checkx)
        session = session + filterx.count()
        print('current session value' + str(session))
    print(session)
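
To make the goal concrete, here is a minimal sketch of the per-row logic I am after, written in plain Python so it runs without Spark. It assumes start_time is a string in the '%Y-%m-%d %H:%M:%S' format and that each collected row exposes it as an attribute (row.start_time). The Row namedtuple, the sample timestamps and the count_sessions_before helper are made up for illustration and are not part of my real script.

```python
import datetime
from collections import namedtuple

# Hypothetical stand-in for the rows returned by df1.rdd.collect();
# the sample timestamps are invented for illustration.
Row = namedtuple("Row", ["start_time"])
rows = [
    Row("2020-01-01 00:05:00"),
    Row("2020-01-01 00:20:00"),
    Row("2020-01-01 00:50:00"),
]

FMT = "%Y-%m-%d %H:%M:%S"


def count_sessions_before(rows, cutoff):
    """Count rows whose start_time parses to a datetime earlier than cutoff."""
    count = 0
    for row in rows:
        # Parse the value held by the current row, not the whole column.
        datex = datetime.datetime.strptime(row.start_time, FMT)
        if datex < cutoff:
            count += 1
    return count


# Assumed starting point; in the real script this is time_to_check.
time_to_check = datetime.datetime(2020, 1, 1, 0, 0, 0)
counts = []
for iters in range(4):
    # Move the cutoff forward in 900-second (15-minute) steps.
    cutoff = time_to_check + datetime.timedelta(seconds=iters * 900)
    counts.append(count_sessions_before(rows, cutoff))

print(counts)
```

The difference from my script is that strptime receives the string stored in one row (row.start_time) instead of the column object df1.start_time, and the comparison is done on Python datetime values instead of inside df1.filter.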