PANDAS dataframe concat and pivot data

Question

I'm learning Python pandas and playing with some example data. I have a CSV file with net worth by percentile of the US population, by quarter. I've successfully subsetted the data by percentile to create three scatter plots of net worth by year, one plot for each of three population segments. However, I'm trying to combine those three series into one data frame so I can draw the lines on a single figure.

Data here: https://www.federalreserve.gov/releases/z1/dataviz/download/dfa-income-levels.csv

Code thus far:

import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

df = pd.read_csv("dfa-income-levels.csv")
df99th = df.loc[df['Category']=="pct99to100"]

df99th.plot(x='Date',y='Net worth', title='Net worth by percentile')

dfmid = df.loc[df['Category']=="pct40to60"]

dfmid.plot(x='Date',y='Net worth')
dflow = df.loc[df['Category']=="pct00to20"]

dflow.plot(x='Date',y='Net worth')

data = dflow['Net worth'], dfmid['Net worth'], df99th['Net worth']
headers = ['low', 'mid', '99th']
newdf = pd.concat(data, axis=1, keys=headers)

And that yields a dataframe shown below, which is not what I want for plotting the data.

           low        mid        99th
0          NaN        NaN   3514469.0
3          NaN  2503918.0         NaN
5     585550.0        NaN         NaN
6          NaN        NaN   3602196.0
9          NaN  2518238.0         NaN
...        ...        ...         ...
747        NaN  8610343.0         NaN
749  3486198.0        NaN         NaN
750        NaN        NaN  32011671.0
753        NaN  8952933.0         NaN
755  3540306.0        NaN         NaN

Any recommendations for other ways to approach this?

I've edited your link to include the income-levels.csv. Waiting for approval.
– el_oso, Commented May 24, 2021 at 17:17

el_oso · Accepted Answer · 2021-05-24 17:13:48Z

1

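# Filter the dataframe to only the categories you're interested in
categories = ['pct99to100', 'pct40to60', 'pct00to20']
filtered_df = df[df['Category'].isin(categories)]
filtered_df = filtered_df[['Date', 'Category', 'Net worth']]

fig, ax = plt.subplots()  # ax is an Axes object; reusing it puts every line on the same plot
# Draw one line per category, with Date on the x-axis and the category as the legend label
for category, group in filtered_df.groupby('Category'):
    group.plot(x='Date', y='Net worth', ax=ax, label=category)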


MonkeyDLuffy · Accepted Answer · 2021-05-24 17:03:10Z

1

I don't see the categories mentioned in your code in the CSV file you shared. To combine dataframes column-wise, you can use pd.concat with axis=1. It aligns rows by index label, which is why concatenating on the original row numbers leaves NaN in most cells. So first set the Date column as the index, then concatenate, and finally bring Date back as a regular column:

1. Set the Date column as the index of each dataframe: df1 = df1.set_index('Date') and df2 = df2.set_index('Date')
2. Concatenate df1 and df2 using df_merge = pd.concat([df1, df2], axis=1) or df_merge = pd.merge(df1, df2, on='Date')
3. Bring Date back as a column with df_merge = df_merge.reset_index()
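
As a rough illustration of these steps with the column names from the question, here is a minimal sketch. The small dataframe stands in for the CSV and its values are made up; the helper net_worth_by_date is a hypothetical name introduced only for this example.

```python
import pandas as pd

# Hypothetical stand-in for the CSV: three categories, two quarters each.
# The numbers are invented for illustration only.
df = pd.DataFrame({
    "Date": ["1989:Q3", "1989:Q3", "1989:Q3", "1989:Q4", "1989:Q4", "1989:Q4"],
    "Category": ["pct99to100", "pct40to60", "pct00to20"] * 2,
    "Net worth": [100.0, 50.0, 10.0, 110.0, 55.0, 11.0],
})


def net_worth_by_date(frame, category):
    """Return the 'Net worth' Series for one category, indexed by Date."""
    subset = frame.loc[frame["Category"] == category]
    return subset.set_index("Date")["Net worth"]


# Every Series now shares the Date index, so concat lines the rows up
# by date rather than by the original row numbers.
newdf = pd.concat(
    {
        "low": net_worth_by_date(df, "pct00to20"),
        "mid": net_worth_by_date(df, "pct40to60"),
        "99th": net_worth_by_date(df, "pct99to100"),
    },
    axis=1,
).reset_index()  # turn Date back into an ordinary column

print(newdf)
```

With the frame in this shape, newdf.plot(x='Date', y=['low', 'mid', '99th']) should draw all three lines on one figure.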
