In Python, I have an existing Spark DataFrame with ~135 columns, called sc_df1. I also have a Pandas DataFrame with the exact same columns that I want to convert to a Spark DataFrame and then union with the first by column name, i.e. sc_df1.unionByName(sc_df2).
Does anyone know how to use the schema of sc_df1 when converting the Pandas DataFrame to a Spark DataFrame, so that the two Spark DataFrames will have the same schema when unioning?
I know this doesn't work, but below is essentially what I'm trying to do:
    # dtypes is a list of (column, type-string) tuples, not an actual schema
    sc_df2 = sc.createDataFrame(df2, schema=sc_df1.dtypes)
Would sc_df1.schema work?
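For reference, here is a minimal sketch of what I have in mind, assuming spark is the active SparkSession and that createDataFrame accepts an existing StructType as its schema argument:

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()

    # Reuse the StructType from the existing Spark DataFrame when
    # converting the Pandas DataFrame, so the column types line up.
    sc_df2 = spark.createDataFrame(df2, schema=sc_df1.schema)

    # With matching schemas, the union by column name should succeed.
    combined = sc_df1.unionByName(sc_df2)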