I am new to Python and was wondering if somebody could help me with the task below.
I have a pandas dataframe `df` with the following columns:
- Primary ID
- Secondary ID
- Entity type
- Value
Each primary entity (entity type A) may have some secondary entities (entity type X or Y). For rows of entity type A, the primary ID is the same as the secondary ID. Each primary entity and each secondary entity also has a value.
In two new columns, 'Sum of values Secondary id X' and 'Sum of values Secondary id Y', I want the total value of the secondary entities of type X and of type Y that belong to each primary entity. The totals should appear only in the row of the primary entity (type A); all other rows should show 0.
So, my initial df is this:
| Primary ID | Secondary ID | Entity type | Value |
|---|---|---|---|
| 0109 | 0109 | A | 200 |
| 0109 | A234 | X | 100 |
| 0109 | A234 | X | 50 |
| 9996 | 9996 | A | 400 |
| 9996 | AAGT | X | 120 |
| 9996 | AABG | X | 30 |
| 9996 | 0082 | Y | 50 |
| A765 | A765 | A | 50 |
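
For reproducibility, the table above could be built roughly like this. This is a minimal sketch that assumes the IDs are stored as strings (so that leading zeros such as `0109` are kept) and the values as integers; the real data may be typed differently.

```python
import pandas as pd

# Assumption: IDs are strings (to keep leading zeros), values are integers.
df = pd.DataFrame(
    {
        "Primary ID": ["0109", "0109", "0109", "9996", "9996", "9996", "9996", "A765"],
        "Secondary ID": ["0109", "A234", "A234", "9996", "AAGT", "AABG", "0082", "A765"],
        "Entity type": ["A", "X", "X", "A", "X", "X", "Y", "A"],
        "Value": [200, 100, 50, 400, 120, 30, 50, 50],
    }
)

print(df)
```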
I just want to add the two columns, without changing the rows or their order in the initial dataframe:
| Primary ID | Secondary ID | Entity type | Value | Sum of values Secondary id X | Sum of values Secondary id Y |
|---|---|---|---|---|---|
| 0109 | 0109 | A | 200 | 150 | 0 |
| 0109 | A234 | X | 100 | 0 | 0 |
| 0109 | A234 | X | 50 | 0 | 0 |
| 9996 | 9996 | A | 400 | 150 | 50 |
| 9996 | AAGT | X | 120 | 0 | 0 |
| 9996 | AABG | X | 30 | 0 | 0 |
| 9996 | 0082 | Y | 50 | 0 | 0 |
| A765 | A765 | A | 50 | 0 | 0 |
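
To make the expected result concrete, here is a rough sketch of one way it might be computed: sum the values per primary ID and entity type with `groupby`, then map the totals back onto the type A rows only. It is only an illustration under the same assumptions as the table (string IDs, integer values), and the names `sums` and `is_primary` are made up for this example; I am not sure it is the idiomatic approach.

```python
import pandas as pd

# Sample data from the table above (assumption: IDs are strings).
df = pd.DataFrame(
    {
        "Primary ID": ["0109", "0109", "0109", "9996", "9996", "9996", "9996", "A765"],
        "Secondary ID": ["0109", "A234", "A234", "9996", "AAGT", "AABG", "0082", "A765"],
        "Entity type": ["A", "X", "X", "A", "X", "X", "Y", "A"],
        "Value": [200, 100, 50, 400, 120, 30, 50, 50],
    }
)

# Total value per primary ID and entity type, one column per type.
# reindex guarantees that both X and Y columns exist even if a type is absent.
sums = (
    df.groupby(["Primary ID", "Entity type"])["Value"]
    .sum()
    .unstack(fill_value=0)
    .reindex(columns=["X", "Y"], fill_value=0)
)

# Only rows of the primary entity (type A) should carry the totals.
is_primary = df["Entity type"].eq("A")

for entity_type in ["X", "Y"]:
    column = f"Sum of values Secondary id {entity_type}"
    # Look up the total for each row's primary ID; missing IDs count as 0.
    totals = df["Primary ID"].map(sums[entity_type]).fillna(0)
    # Keep the total on type A rows, put 0 everywhere else.
    df[column] = totals.where(is_primary, 0).astype(int)

print(df)
```

The original rows and their order are left untouched; only the two new columns are appended.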
Thank you!