pyspark prompts an error for udf not defined

Question

Here is the code:

from py4j.protocol import Py4JJavaError
def parse_clf_time(s):
    try:
    #return "{0:04d}-{1:02d}-{2:02d} {3:02d}:{4:02d}:{5:02d}".format(int(s[7:11]),month_map[s[3:6]],int(s[0:2]),int(s[12:14]),int(s[15:17]),int(s[18:20]))
        return "{0:04d}-{1:02d}-{2:02d} {3:02d}:{4:02d}:{5:02d}".format(
            int(s[7:11]),
            month_map[s[3:6]],
            int(s[0:2]),
            int(s[12:14]),
            int(s[15:17]),
            int(s[18:20])
            )
    except Py4JJavaError as e:
        return "2016-08-11 00:00:01".format(
            int(s[7:11]),
            month_map[s[3:6]],
            int(s[0:2]),
            int(s[12:14]),
            int(s[15:17]),
            int(s[18:20])

u_parse_time = udf(parse_clf_time)

final_df = cleaned_df.select('*', u_parse_time(cleaned_df['timestamp']).cast('timestamp').alias('time')).drop('timestamp')
total_log_entries = final_df.count()

The df may contain bad data, so I want to use a simple try/except to handle it. Please let me know what the best practice is for excluding bad data.

For some unknown reason, I got an error:

So what's wrong with the code? It works in another project on the same environment so I am pretty sure the error should not be from the code itself.

Thank you very much, any clue is appreciated.

shuaiyuancn · Accepted Answer · 2016-08-15 14:46:07Z

6

You are missing a closing ) on the return "2016-08-11 00:00:01".format( call in the except branch.

Also, you never imported udf:

from pyspark.sql.functions import udf
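
For reference, here is a minimal sketch of the parsing function with the parenthesis problem fixed. Two things in it are assumptions, not taken from the question: the contents of month_map (the question uses it but never shows it) and the choice to return None for bad input instead of a placeholder date. As far as I can tell, Py4JJavaError is raised on the driver when a JVM call fails, so it would not be thrown by the parsing itself; malformed strings inside the function raise ordinary Python exceptions such as ValueError or KeyError, which is what this version catches.

```python
# Assumed month lookup: the question references month_map but never defines it.
month_map = {
    'Jan': 1, 'Feb': 2, 'Mar': 3, 'Apr': 4, 'May': 5, 'Jun': 6,
    'Jul': 7, 'Aug': 8, 'Sep': 9, 'Oct': 10, 'Nov': 11, 'Dec': 12,
}


def parse_clf_time(s):
    """Convert 'dd/Mon/yyyy:HH:MM:SS ...' to 'yyyy-MM-dd HH:MM:SS'.

    Returns None when the input cannot be parsed (an assumption; the
    question returned a fixed placeholder date instead).
    """
    try:
        return "{0:04d}-{1:02d}-{2:02d} {3:02d}:{4:02d}:{5:02d}".format(
            int(s[7:11]),        # year
            month_map[s[3:6]],   # month name -> number
            int(s[0:2]),         # day
            int(s[12:14]),       # hour
            int(s[15:17]),       # minute
            int(s[18:20]),       # second
        )
    except (KeyError, ValueError, TypeError):
        # Unknown month name, non-numeric or missing field, or a None input.
        return None
```

With the import above in place, u_parse_time = udf(parse_clf_time) works as in the question. Because bad rows come back as null, one way to exclude them afterwards would be final_df.filter(final_df['time'].isNotNull()).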


user6510402 · Accepted Answer · 2016-08-27 18:36:32Z

1

Missing parentheses or brackets are indeed very common. I would suggest double-checking with a text editor in cases like this. I use UltraEdit, which works well for me.
