
How can I create dynamic DataFrame names in PySpark? With the code below I am not able to create a new DataFrame for each file. It gives me only the last DataFrame name, but I need all of the DataFrame names.

for prime2 in pdf2:
    ol2 =  Bucket_path + prime2['S3_File_with_Path']
    t = 1
    sd = {}  
    testR = "df" + str(t)
    print("testR",testR)
    sd[testR] = spark.read.format("parquet").load(ol2).cache() 
    t = t + 1 

1 Answer


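It seems you're creating the dict inside the loop, so it is replaced on every iteration and you end up with a dict that has only one (the last) entry. The counter t is also reset to 1 inside the loop, so every key would be "df1" even with a shared dict. Move both initializations above the loop, something like this:

sd = {}
t = 1
for prime2 in pdf2:
    ol2 = Bucket_path + prime2['S3_File_with_Path']
    testR = "df" + str(t)
    print("testR", testR)
    # Read one parquet file per iteration and store it under its own key
    df = spark.read.format("parquet").load(ol2).cache()
    sd[testR] = df
    t = t + 1

# sd dict is available here, all the dataframes are inside
print(len(sd))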
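
As a variation on the same idea, the manual counter can be dropped by using enumerate. This is only a sketch under some assumptions: load_frames and its loader parameter are hypothetical names introduced here so the example runs without a Spark session, and each item of the input is assumed to support item['S3_File_with_Path'] as in the question's code. The sample rows and bucket path are made up for illustration.

```python
def load_frames(rows, bucket_path, loader):
    """Return a dict mapping 'df1', 'df2', ... to the result of loader(path).

    `loader` is a hypothetical callable; with Spark it could be
    lambda p: spark.read.format("parquet").load(p).cache()
    """
    frames = {}  # created once, outside the loop
    # enumerate(..., start=1) yields 1, 2, 3, ... so no counter needs resetting
    for i, row in enumerate(rows, start=1):
        path = bucket_path + row["S3_File_with_Path"]
        frames[f"df{i}"] = loader(path)
    return frames


if __name__ == "__main__":
    # Stand-in data and loader (assumptions, not from the question)
    sample_rows = [
        {"S3_File_with_Path": "a/file1.parquet"},
        {"S3_File_with_Path": "b/file2.parquet"},
    ]

    def fake_loader(path):
        return "loaded:" + path

    frames = load_frames(sample_rows, "s3://example-bucket/", fake_loader)
    print(list(frames))
```

With a real session, the call would presumably look like sd = load_frames(pdf2, Bucket_path, lambda p: spark.read.format("parquet").load(p).cache()), after which each DataFrame is reachable as sd["df1"], sd["df2"], and so on.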

