I'm leaning python pandas and playing with some example data. I have a CSV file of a dataset with net worth by percentile of US population by quarter of year. I've successfully subseted the data by percentile to create three scatter plots of net worth by year, one plot for each of three population sections. However, I'm trying to combine those three plots to one data frame so I can combine the lines on a single plot figure.
Data here: https://www.federalreserve.gov/releases/z1/dataviz/download/dfa-income-levels.csv
Code thus far:
import pandas as pd
import matplotlib.pyplot as plt
# importing numpy as np
import numpy as np
df = pd.read_csv("dfa-income-levels.csv")
df99th = df.loc[df['Category']=="pct99to100"]
df99th.plot(x='Date',y='Net worth', title='Net worth by percentile')
dfmid = df.loc[df['Category']=="pct40to60"]
dfmid.plot(x='Date',y='Net worth')
dflow = df.loc[df['Category']=="pct00to20"]
dflow.plot(x='Date',y='Net worth')
data = dflow['Net worth'], dfmid['Net worth'], df99th['Net worth']
headers = ['low', 'mid', '99th']
newdf = pd.concat(data, axis=1, keys=headers)
And that yields a dataframe shown below, which is not what I want for plotting the data.
low mid 99th
0 NaN NaN 3514469.0
3 NaN 2503918.0 NaN
5 585550.0 NaN NaN
6 NaN NaN 3602196.0
9 NaN 2518238.0 NaN
... ... ... ...
747 NaN 8610343.0 NaN
749 3486198.0 NaN NaN
750 NaN NaN 32011671.0
753 NaN 8952933.0 NaN
755 3540306.0 NaN NaN
Any recommendations for other ways to approach this?
