I have a dataframe with 4 columns: id, external_id, value, and description.
Example table:
| ID | external_id | value | description |
|---|---|---|---|
| 1 | 65 | Pendent | Account Onboarding All iN |
| 1 | 65 | Lais | Gestor |
| 2 | 93 | Maria | Account Onboarding All iN |
| 3 | 454 | Renan | Gestor |
| 4 | 535 | Osiris | Modelo de negocio |
| 5 | 999 | Togo | Account Onboarding All iN |
| 6 | 25 | João | Gestor |
| 7 | 85 | Lima | Account CS SM |
| 8 | 22 | Teixeira | Account Onboarding SM |
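
For anyone who wants to reproduce this, the example table can be built roughly as follows. This is a minimal sketch: the variable name `df` and the integer dtype of the two id columns are assumptions, not something confirmed by my real data.

```python
import pandas as pd

# Sample data copied from the example table; dtypes are assumed.
df = pd.DataFrame(
    {
        "ID": [1, 1, 2, 3, 4, 5, 6, 7, 8],
        "external_id": [65, 65, 93, 454, 535, 999, 25, 85, 22],
        "value": [
            "Pendent", "Lais", "Maria", "Renan", "Osiris",
            "Togo", "João", "Lima", "Teixeira",
        ],
        "description": [
            "Account Onboarding All iN",
            "Gestor",
            "Account Onboarding All iN",
            "Gestor",
            "Modelo de negocio",
            "Account Onboarding All iN",
            "Gestor",
            "Account CS SM",
            "Account Onboarding SM",
        ],
    }
)

print(df)
```
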
I need to turn the description values into columns, with each new column filled from the corresponding entry in value.
More precisely, I need a filter that creates the columns below whenever the description matches one of them:
Account Onboarding All iN
Account Onboarding SM
Account CS All iN
Account CS SM
Account Operacional All iN
Gestor
Natureza
Modelo de negocio
Vertical
Each description above needs to become a column; where a record with that description exists, the cell should be filled from the value column.
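
To illustrate the "matches description" part, the filter could look something like the sketch below. The names `wanted`, `sample`, and `filtered` are hypothetical, and the row with description "Outro" is invented purely to show a non-matching record being dropped.

```python
import pandas as pd

# The descriptions that should become columns, in the desired order.
wanted = [
    "Account Onboarding All iN",
    "Account Onboarding SM",
    "Account CS All iN",
    "Account CS SM",
    "Account Operacional All iN",
    "Gestor",
    "Natureza",
    "Modelo de negocio",
    "Vertical",
]

# Hypothetical sample; "Outro" is an invented description not in the list.
sample = pd.DataFrame(
    {
        "ID": [1, 1, 2],
        "external_id": [65, 65, 93],
        "value": ["Pendent", "Lais", "X"],
        "description": ["Account Onboarding All iN", "Gestor", "Outro"],
    }
)

# Keep only the rows whose description is one of the wanted columns.
filtered = sample[sample["description"].isin(wanted)]
print(filtered)
```
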
Expected output:
| id | external_id | Account Onboarding All iN | Account Onboarding SM | Account CS All iN | Account CS SM | Account Operacional All iN | Gestor | Natureza | Modelo de negocio | Vertical |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 65 | Pendent | None | None | None | None | Lais | None | None | None |
| 2 | 93 | Maria | None | None | None | None | None | None | None | None |
| 3 | 454 | None | None | None | None | None | Renan | None | None | None |
| 4 | 535 | None | None | None | None | None | None | None | Osiris | None |
| 5 | 999 | Togo | None | None | None | None | None | None | None | None |
| 6 | 25 | None | None | None | None | None | João | None | None | None |
| 7 | 85 | None | None | None | Lima | None | None | None | None | None |
| 8 | 22 | None | Teixeira | None | None | None | None | None | None | None |
One note: records that share the same ID should end up on the same row of the output.
How can I do this? Would something like `df.pivot(index=['ID', 'external_id'], columns='description', values='value')` work?
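
For reference, here is a self-contained sketch of that pivot idea on the sample data. It assumes pandas 1.1 or later (where `DataFrame.pivot` accepts a list for `index`) and assumes each (ID, external_id, description) combination appears at most once, since `pivot` raises a `ValueError` on duplicates. The `wanted` list and the `reindex` step are my own additions so that descriptions absent from the data still get a column; missing cells come out as NaN rather than None.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "ID": [1, 1, 2, 3, 4, 5, 6, 7, 8],
        "external_id": [65, 65, 93, 454, 535, 999, 25, 85, 22],
        "value": [
            "Pendent", "Lais", "Maria", "Renan", "Osiris",
            "Togo", "João", "Lima", "Teixeira",
        ],
        "description": [
            "Account Onboarding All iN", "Gestor",
            "Account Onboarding All iN", "Gestor", "Modelo de negocio",
            "Account Onboarding All iN", "Gestor", "Account CS SM",
            "Account Onboarding SM",
        ],
    }
)

# Every column that must exist in the result, in the desired order.
wanted = [
    "Account Onboarding All iN", "Account Onboarding SM",
    "Account CS All iN", "Account CS SM", "Account Operacional All iN",
    "Gestor", "Natureza", "Modelo de negocio", "Vertical",
]

wide = (
    df.pivot(index=["ID", "external_id"], columns="description", values="value")
    .reindex(columns=wanted)  # add absent descriptions, drop any others
    .reset_index()            # turn ID and external_id back into columns
)
wide.columns.name = None      # remove the leftover "description" axis label

print(wide)
```

If the real data can contain the same description twice for one ID, `pivot_table` with `aggfunc="first"` (or another explicit rule for picking a value) might be a safer substitute for `pivot`.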