I am trying to read a file and build a DataFrame, with the same column names, that holds the distinct values of each column.
The file has, say, 3 columns:
EMP ID  DEPT   Salary
=====================
100     Sales  10000
200     MFG    10000
300     IT     10000
400     Sales  10000
500     MFG    10000
600     IT     10000
Expected output:
EMP ID  DEPT   Salary
=====================
100     Sales  10000
200     MFG
300     IT
400
500
600
I have read the file and collected the unique values of each column as below:
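import pandas as pd

df = pd.read_csv('C:/Users/jaiveeru/Downloads/run_test1.csv')
cols = df.columns.tolist()
df1 = pd.DataFrame()
df2 = pd.DataFrame()
for i in cols:
    # Unique values of the current column
    lst = df[i].unique().tolist()
    # Join them into one comma-separated string
    str1 = ','.join(lst)
    lst2 = [str1]
    df1[i] = lst2
    df2 = pd.concat([df2, df1])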
However, as each column can have a different number of unique values, I am getting the error below:
ValueError: Length of values does not match length of index
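
One possible way around the length mismatch, offered only as a minimal sketch: wrap each column's unique values in a pd.Series and build the new DataFrame from a dict of those Series. pandas aligns the Series on their index and pads the shorter columns with NaN, so the columns no longer need the same length. The inline sample data, the comma delimiter, and the names sample_csv and unique_df are assumptions made for illustration; the real file would be read with pd.read_csv as in the code above.

```python
import io

import pandas as pd

# Hypothetical inline copy of the sample data (assumed comma-separated),
# standing in for the real CSV file.
sample_csv = io.StringIO(
    "EMP ID,DEPT,Salary\n"
    "100,Sales,10000\n"
    "200,MFG,10000\n"
    "300,IT,10000\n"
    "400,Sales,10000\n"
    "500,MFG,10000\n"
    "600,IT,10000\n"
)
df = pd.read_csv(sample_csv)

# Each column's unique values become a Series with its own 0..n-1 index.
# Building a DataFrame from a dict of Series aligns them on the union of
# those indexes and fills the missing positions with NaN.
unique_df = pd.DataFrame(
    {col: pd.Series(df[col].unique()) for col in df.columns}
)

print(unique_df)
```

Because of the NaN padding, a numeric column such as Salary is stored as float in the result. If blank cells are preferred for display, something like unique_df.fillna('') could be applied afterwards.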