I have data in a CSV file in the following format:
"/some/page-1.md","title","My title 1"
"/some/page-1.md","description","My description 1"
"/some/page-1.md","type","Tutorial"
"/some/page-1.md","index","True"
"/some/page-2.md","title","My title 2"
"/some/page-2.md","description","My description 2"
"/some/page-2.md","type","Tutorial"
"/some/page-2.md","index","False"
"/some/page-2.md","custom_1","abc"
"/some/page-3.md","title","My title 3"
"/some/page-3.md","description","My description 3"
"/some/page-3.md","type","Tutorial"
"/some/page-3.md","index","True"
"/some/page-3.md","custom_2","def"
I read it into a pandas DataFrame:
import pandas as pd

# csvFile holds the path to the CSV file shown above
df = pd.read_csv(csvFile, index_col=False, dtype=object, header=None)
print(df)
The output is as follows:
                  0            1                 2
0   /some/page-1.md        title        My title 1
1   /some/page-1.md  description  My description 1
2   /some/page-1.md         type          Tutorial
3   /some/page-1.md        index              True
4   /some/page-2.md        title        My title 2
5   /some/page-2.md  description  My description 2
6   /some/page-2.md         type          Tutorial
7   /some/page-2.md        index             False
8   /some/page-2.md     custom_1               abc
9   /some/page-3.md        title        My title 3
10  /some/page-3.md  description  My description 3
11  /some/page-3.md         type          Tutorial
12  /some/page-3.md        index              True
13  /some/page-3.md     custom_2               def
I'd like to transform it into a DataFrame in the format below, where the first header is "file" and its values come from column 0. The other headers are taken from column 1, and their values from column 2:
              file       title       description      type  index  custom_1  custom_2
0  /some/page-1.md  My title 1  My description 1  Tutorial   True       NaN       NaN
1  /some/page-2.md  My title 2  My description 2  Tutorial  False       abc       NaN
2  /some/page-3.md  My title 3  My description 3  Tutorial   True       NaN       def
Is there a way to do this with Pandas?
df.pivot(index=0, columns=1, values=2).rename_axis(index='file', columns=None).reset_index()
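For anyone who wants to try the pivot end to end, here is a minimal self-contained sketch. It assumes the CSV content from the question is loaded from an in-memory string instead of `csvFile`. The names `csv_text`, `key_order` and `wide` are made up for this illustration. One thing to be aware of: `pivot` sorts the new columns alphabetically, so the sketch adds a `reindex` step to restore the order in which the keys first appear in column 1, which matches the layout asked for in the question.

```python
import io

import pandas as pd

# Sample data copied from the question (held in a string for this sketch).
csv_text = '''"/some/page-1.md","title","My title 1"
"/some/page-1.md","description","My description 1"
"/some/page-1.md","type","Tutorial"
"/some/page-1.md","index","True"
"/some/page-2.md","title","My title 2"
"/some/page-2.md","description","My description 2"
"/some/page-2.md","type","Tutorial"
"/some/page-2.md","index","False"
"/some/page-2.md","custom_1","abc"
"/some/page-3.md","title","My title 3"
"/some/page-3.md","description","My description 3"
"/some/page-3.md","type","Tutorial"
"/some/page-3.md","index","True"
"/some/page-3.md","custom_2","def"
'''

df = pd.read_csv(io.StringIO(csv_text), index_col=False, dtype=object, header=None)

# pivot() sorts the resulting columns, so remember first-appearance order.
key_order = df[1].drop_duplicates().tolist()

wide = (
    df.pivot(index=0, columns=1, values=2)        # one row per file, one column per key
      .reindex(columns=key_order)                 # restore the original key order
      .rename_axis(index="file", columns=None)    # name the index, drop the columns label
      .reset_index()                              # turn "file" into a regular column
)

print(wide)
```

Note that `pivot` raises a `ValueError` if the same file/key pair occurs more than once. If that can happen in your data, `df.pivot_table(index=0, columns=1, values=2, aggfunc="first")` is one possible alternative that keeps the first value for each pair.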