I have a DataFrame with more than 50 columns. I want to select all of them as they are, renaming only a few, and keep the order shown below. I tried the following:
cols = list(set(df.columns) - {'id','starttime','endtime'})
df.select(col("id").alias("eventid"),col("starttime").alias("eventstarttime"),col("endtime").alias("eventendtime"),*cols,lit(proceessing_time).alias("processingtime"))
and got the error:
SyntaxError: only named arguments may follow *expression
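This SyntaxError looks like a Python restriction rather than a Spark one: before Python 3.5 (PEP 448), positional arguments could not follow a `*expression` in a call. A minimal sketch of one way around it is to build a single list and unpack it once. The `select` function, the column names `user` and `status`, and the plain strings standing in for Column objects are all hypothetical, used only so the snippet runs without Spark. The list comprehension also keeps the original column order, which `set` subtraction does not guarantee.

```python
# Hypothetical stand-in for DataFrame.select: it just returns the arguments it received.
def select(*columns):
    return list(columns)

# Assumed column list; with a real DataFrame this would come from df.columns.
all_columns = ["id", "starttime", "endtime", "user", "status"]

# Columns to rename, mapped to their new names.
renamed = {"id": "eventid", "starttime": "eventstarttime", "endtime": "eventendtime"}

# A list comprehension keeps the original order, unlike set subtraction.
cols = [c for c in all_columns if c not in renamed]

# Build one list first, then unpack it as the only starred argument,
# so nothing positional has to follow the *expression.
args = [renamed[c] for c in ("id", "starttime", "endtime")] + cols + ["processingtime"]
result = select(*args)
print(result)
```

With a real DataFrame the same shape would presumably apply: concatenate the aliased columns, `[col(c) for c in cols]`, and the `lit(...)` column into one list, then pass that list to `df.select`.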
Instead of *cols, I also tried passing a list of Column objects:
df.select(col("id").alias("eventid"),col("starttime").alias("eventstarttime"),col("endtime").alias("eventendtime"),([col(x) for x in cols]),lit(proceessing_time).alias("processingtime"))
which gives the following error:
`TypeError: 'Column' object is not callable`
Any help is highly appreciated.