I created a DataFrame by reading data from DB2, and it looks like this:
df1.show()
| Table_Name | Source_count | Target_Count |
|---|---|---|
| Test_tab | 12750 | 12750 |
After that, I added four columns with hardcoded values using `withColumn` and `lit`. After adding these columns, the Source_count value changed from 12750 to 12600.
df2 = df1.withColumn("").withColumn("").withColumn("").withColumn("")
df2.show()
| Table_Name | Source_count | Target_Count | batch | source | test_type | Record_ts |
|---|---|---|---|---|---|---|
| Test_tab | 12600 | 12750 | -1 | p1 | count | 2022-05-12 20:20:15 |
I don't understand why this happens; df2 was created immediately after df1.
Can someone explain what could cause this change?
`df` in your first snippet, but `df2` is built from `df1`. Is `df1` calculated from other data or read directly from files? Also, can you add the output of `df2.explain()` to your question?
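
One possible explanation, assuming the DB2 table is being modified between the two `show()` calls (the question does not confirm this): Spark DataFrames are lazy, so `df2.show()` does not reuse the rows that `df1.show()` printed; it re-runs the whole plan, including the read from the source. The sketch below is a toy model in plain Python, not Spark code. `LazyFrame`, `with_column`, `read_counts` and the changing counts are all hypothetical names and values chosen to illustrate the effect.

```python
# Toy model of lazy evaluation. LazyFrame is hypothetical, not a Spark class.
class LazyFrame:
    def __init__(self, compute):
        self._compute = compute  # a recipe for producing rows, not the rows themselves
        self._cached = None

    def with_column(self, name, value):
        # The new frame's recipe calls the parent's collect() again when it runs.
        return LazyFrame(lambda: [{**row, name: value} for row in self.collect()])

    def cache(self):
        # Run the recipe once now and keep the rows for later actions.
        self._cached = self._compute()
        return self

    def collect(self):
        # An "action": re-runs the recipe unless cached rows exist.
        return self._cached if self._cached is not None else self._compute()


# Stand-in for the DB2 table (assumption: its count changes between reads).
source_counts = iter([12750, 12600, 12500])

def read_counts():
    return [{"Table_Name": "Test_tab",
             "Source_count": next(source_counts),
             "Target_Count": 12750}]

df1 = LazyFrame(read_counts)
first = df1.collect()                         # first read of the source
df2 = df1.with_column("batch", -1)
second = df2.collect()                        # triggers a second read of the source
print(first[0]["Source_count"], second[0]["Source_count"])  # 12750 12600

df3 = LazyFrame(read_counts).cache()          # third read, rows are kept
third = df3.with_column("batch", -1).collect()
print(df3.collect()[0]["Source_count"], third[0]["Source_count"])  # 12500 12500
```

If this is what is happening, the analogous step in PySpark would be to call `df1.cache()` (or `persist()`) and run an action on `df1` before building `df2`. Caching is an optimization, not a strict snapshot guarantee, since evicted partitions can be recomputed, so writing `df1` out and reading it back is the more robust option. The output of `df2.explain()` would show whether the plan goes back to the DB2 source.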