I created a DataFrame by reading data from DB2, and it looks like this:

df1.show()


Table_Name | Source_count | Target_Count
----------------------------------------
Test_tab   |  12750       | 12750

After that, I added 4 columns with hardcoded values using withColumn and lit. After adding these columns, the Source_count value changed from 12750 to 12600.

df2 = df1.withColumn("").withColumn("").withColumn("").withColumn("")

df2.show()
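Table_Name | Source_count | Target_Count | batch | source | test_type | Record_ts
----------------------------------------------------------------------------------------
Test_tab   |  12600       | 12750        | -1    | p1     | count     | 2022-05-12 20:20:15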

I don't understand why this happens, since df2 was created immediately after df1.

Can someone explain what could cause this change?

  • Did you type your code correctly? You have df in your first snippet, but df2 is built from df1. Commented Nov 16, 2022 at 10:48
  • Yes, it is df1, not df. Commented Nov 16, 2022 at 10:51
  • Is your df1 calculated from other data or read directly from files? Also, could you add the output of df2.explain() to your question? Commented Nov 16, 2022 at 11:08
  • df2.show() is another action that is executed on the underlying files. It seems the underlying files might have changed. Did you try running df1.show() again? Commented Nov 16, 2022 at 11:43
  • Thank you... I did this and it is working fine: df2 = df1, then df2 = df2.withColumn()....., then df2.show(). Commented Nov 17, 2022 at 9:39
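The comment about df2.show() being a separate action can be illustrated with a small toy model. This is plain Python, not Spark: the LazyFrame class, its with_column and collect methods, and the in-memory table standing in for the DB2 source are all hypothetical names made up for this sketch. It assumes the source changed between the two actions, which is only one possible explanation and is not confirmed by the question.

```python
# Toy model of lazy evaluation. This is NOT Spark, only an illustration
# of why two actions on related frames can see different source data.
class LazyFrame:
    """Holds a recipe for producing rows instead of the rows themselves."""

    def __init__(self, read_source, transforms=()):
        self._read_source = read_source        # callable that re-reads the source
        self._transforms = tuple(transforms)   # recorded row -> row steps

    def with_column(self, name, value):
        # Loosely mirrors withColumn(name, lit(value)): records the step, reads nothing.
        def add(row):
            return {**row, name: value}
        return LazyFrame(self._read_source, self._transforms + (add,))

    def collect(self):
        # An "action": re-reads the source and replays every recorded step.
        rows = [dict(r) for r in self._read_source()]
        for transform in self._transforms:
            rows = [transform(r) for r in rows]
        return rows


# Hypothetical mutable source standing in for the DB2 table (assumption).
table = [{"Table_Name": "Test_tab", "Source_count": 12750, "Target_Count": 12750}]

df1 = LazyFrame(lambda: table)
first = df1.collect()                  # first action reads the source now

table[0]["Source_count"] = 12600       # assumed: the source changes in between

df2 = df1.with_column("batch", -1)     # nothing is read here
second = df2.collect()                 # second action reads the source again

print(first[0]["Source_count"], second[0]["Source_count"])  # prints: 12750 12600
```

In real Spark, calling df1.cache() (or persist()) before the first action is a common way to avoid re-reading the source for later actions, though cached data can be evicted and recomputed, so it is not a strict snapshot guarantee.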
