I have following dataframe as an output of my python script. I would like to add another column with count per pmid and add the counter to the first row, keeping the other rows.
The dataframe looks like this:
df
PMID gene_symbol gene_label gene_mentions
0 33377242 MTHFR Matched Gene 2
1 33414971 CSF3R Matched Gene 13
2 33414971 BCR Other Gene 2
3 33414971 ABL1 Matched Gene 1
4 33414971 ESR1 Matched Gene 1
5 33414971 NDUFB3 Other Gene 1
6 33414971 CSF3 Other Gene 1
7 33414971 TP53 Matched Gene 2
8 33414971 SRC Matched Gene 1
9 33414971 JAK1 Matched Gene 1
Expected out is:
PMID gene_symbol gene_label gene_mentions count
0 33377242 MTHFR Matched Gene 2 1
1 33414971 CSF3R Matched Gene 13 9
2 33414971 BCR Other Gene 2 9
3 33414971 ABL1 Matched Gene 1 9
4 33414971 ESR1 Matched Gene 1 9
5 33414971 NDUFB3 Other Gene 1 9
6 33414971 CSF3 Other Gene 1 9
7 33414971 TP53 Matched Gene 2 9
8 33414971 SRC Matched Gene 1 9
9 33414971 JAK1 Matched Gene 1 9
10 33414972 MAK2 Matched Gene 1 1
How can I achieve this output?
Thanks
