Thanks for your time.
I need to read JSON files from several paths, which are organized by month and day (/mm/dd/*.json).
I've been trying to loop over the day directories, but after the loop my DataFrame only holds the last day that was read:
for i_dia in range(1, 9):
    df_json = spark.read.json('/mnt/datalake/' + Year + '/' + Month + '/' + str(0) + str(i_dia) + '/' + '*', mode="PERMISSIVE", multiLine="true")
return df_json
display(df_json)
What is the correct way to read these? I want to load all the files into a single large DataFrame.
Thank you very much in advance.
Regards
"but my loop always sticks with the last read"
Can you clarify this part? What's going wrong? PS: Python range is not inclusive, so if you do range(1, 9) you will get 1 through 8. This may be the cause of your problem.
pd.concat() in order to achieve that.
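One likely reason only the last day survives is that the loop assigns a new DataFrame to df_json on every iteration, so each read replaces the previous one. A minimal sketch of one way around it, assuming the paths look like the ones in the question: build the list of day paths first, then pass the whole list to spark.read.json, which accepts a list of paths. The helper names build_day_paths and read_days, and the example values "2020" and "01", are made up for illustration and are not from the original post.

```python
def build_day_paths(year, month, first_day=1, last_day=8, root="/mnt/datalake"):
    """Return one glob path per day, e.g. /mnt/datalake/2020/01/03/*."""
    # range() excludes its stop value, hence last_day + 1.
    # {day:02d} zero-pads the day, so days 10 and above also come out right.
    return [
        f"{root}/{year}/{month}/{day:02d}/*"
        for day in range(first_day, last_day + 1)
    ]


def read_days(spark, paths):
    """Read every path into a single DataFrame with one call."""
    # Passing a list of paths avoids overwriting the result inside a loop.
    return spark.read.json(paths, mode="PERMISSIVE", multiLine=True)


# Usage in a Databricks notebook, where `spark` and `display` already exist
# (the year and month values here are placeholders):
# df_json = read_days(spark, build_day_paths("2020", "01"))
# display(df_json)
```

If you prefer to keep the per-day loop, another option would be to collect each day's DataFrame in a list and combine them with DataFrame.unionByName. Note that pd.concat works on pandas DataFrames, not Spark ones.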