I have code that reads multiple tables (more than 10) into separate DataFrames in PySpark. I would like to optimize this code by using a for loop and a reference variable, or something similar. My code is as follows:

Features_PM = (spark.read
          .jdbc(url=jdbcUrl, table='Features_PM',
                properties=connectionProperties))

Features_CM = (spark.read
          .jdbc(url=jdbcUrl, table='Features_CM',
                properties=connectionProperties))

I tried something like this but it didn't work:

table_list = ['table1', 'table2','table3', 'table4']

for table in table_list:
     jdbcDF = spark.read \
         .format("jdbc") \
         .option("url", "jdbc:postgresql:dbserver") \
         .option("dbtable", "schema.{}".format(table)) \
         .option("user", "username") \
         .option("password", "password") \
         .load()

Source for the above snippet: https://community.cloudera.com/t5/Support-Questions/read-multiple-table-parallel-using-Spark/td-p/286498

Any help would be appreciated. Thanks

  • Get all the table names for that DB in a list, then create a generic function and read each table by iterating over the list. That way you have one reusable function for reading all the tables. Commented Nov 24, 2020 at 16:14
  • Can someone help me with the code please. Commented Jan 3, 2021 at 11:18

1 Answer

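Your loop assigns every result to the same name, jdbcDF, so each iteration overwrites the previous one and only the last table remains. You can use the following code instead. It gives you a dictionary of DataFrames in which each key is a table name and each value is the corresponding DataFrame.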

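# Assumes an active SparkSession is available as `spark`.
def read_table(opts):
    # Load one JDBC table using the given reader options.
    return spark.read.format("jdbc").options(**opts).load()

table_list = ['table1', 'table2', 'table3', 'table4']

# Build {table name: DataFrame}, one JDBC read per table.
table_df_dict = {table: read_table({"url": "jdbc:postgresql:dbserver",
                                    "dbtable": "schema.{}".format(table),
                                    "user": "username",
                                    "password": "password"})
                 for table in table_list}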
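
The same idea can be written against the spark.read.jdbc call from the question. This is only a minimal sketch: the helper name read_tables is made up for illustration, and it assumes that spark, jdbcUrl and connectionProperties are already defined as in the question.

```python
def read_tables(spark, jdbc_url, table_names, connection_properties):
    """Return a dict mapping each table name to the DataFrame read from it.

    `read_tables` is a hypothetical helper, not part of PySpark. It calls
    spark.read.jdbc(url=..., table=..., properties=...) once per table,
    exactly as the question does for Features_PM and Features_CM.
    """
    return {
        name: spark.read.jdbc(url=jdbc_url, table=name,
                              properties=connection_properties)
        for name in table_names
    }


# Usage with the names from the question (assumed to exist in your session):
# dfs = read_tables(spark, jdbcUrl, ['Features_PM', 'Features_CM'],
#                   connectionProperties)
# Features_PM = dfs['Features_PM']
```

Keeping the DataFrames in a dictionary avoids creating variables dynamically, and you can still look up any table by name when you need it.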