I am working on a project that uses Learning to Rank. Below is an example of the dataset format (taken from https://www.microsoft.com/en-us/research/project/letor-learning-rank-information-retrieval/). The first column is the rank, the second column is the query id, and the remaining columns are [feature number]:[feature value] pairs.
1008 qid:10 1:0.004356 2:0.080000 3:0.036364 4:0.000000 … 46:0.00000
1007 qid:10 1:0.004901 2:0.000000 3:0.036364 4:0.333333 … 46:0.000000
1006 qid:10 1:0.019058 2:0.240000 3:0.072727 4:0.500000 … 46:0.000000
So far, I have successfully converted my data into the following format in a pandas DataFrame.
10 qid:354714443278337 3500 1 122.0 156.0 13.0 1698.0 1840.0 92.28260 ...
...
The first two columns are already fine. What I need next is to prepend the feature number to each of the remaining columns (e.g. the first feature, 3500, becomes 1:3500).
I know I can prepend a string to a single column with the following command.
df['col'] = 'str' + df['col'].astype(str)
The first feature, 3500, is located at column index 2, so my idea is to prefix each feature column with its column index minus 1. How do I prepend a string that depends on the column number?
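To make the idea concrete, here is a minimal sketch of the loop I have in mind. It assumes that every column from position 2 onward is a feature and that the columns are already in feature order. The small frame is a stand-in for my real data: the first row is copied from the sample above, and the second row is made up purely for illustration.

```python
import pandas as pd

# Stand-in for the real DataFrame: column 0 is the rank, column 1 is the
# query id, and columns 2+ hold raw feature values.
# The second row is hypothetical and exists only to show multiple rows.
df = pd.DataFrame([
    [10, "qid:354714443278337", 3500, 1, 122.0],
    [8, "qid:354714443278337", 1200, 0, 98.5],
])

# Feature numbers start at 1 and the first feature sits at position 2,
# so the column at position i gets the prefix "<i - 1>:".
for i in range(2, df.shape[1]):
    col = df.columns[i]
    df[col] = f"{i - 1}:" + df[col].astype(str)

print(df)
```

This works by position rather than by column name, so it should not matter what the columns are actually called, but I am not sure whether looping over the columns like this is the idiomatic way to do it.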
Any help would be appreciated.