I have the following DataFrame:
import pandas as pd

df = pd.DataFrame(index=['2018-01-01', '2018-01-02', '2018-01-03', '2018-01-04'])
df["ticker"] = ['TSLA', 'TSLA', 'IBM', 'IBM']
df["price"] = ['1000', '1200', '101', '108']
df["volume"] = ['100000', '123042', '1087878', '108732']
df["marketcap"] = ['1.2T', '1.4T', '30B', '35B']
df.index.rename('Date', inplace=True)
# Assign the result back: set_index/unstack return a new DataFrame
df = df.set_index('ticker', append=True).unstack('ticker').swaplevel(axis=1).sort_index(axis=1, level=0, sort_remaining=False)
df:
             TSLA                      IBM
            price  volume marketcap  price   volume marketcap
Date
2018-01-01   1000  100000      1.2T    NaN      NaN       NaN
2018-01-02   1200  123042      1.4T    NaN      NaN       NaN
2018-01-03    NaN     NaN       NaN    101  1087878       30B
2018-01-04    NaN     NaN       NaN    108   108732       35B
How can I loop through the tickers (e.g. TSLA) and take only the price column for each date? Something like this:
for col in df.columns(level=0):
    for i in df.index:
        if df.columns(level=1)=="price":
            df_price=df[col].loc[i]
And df_price looks something like this:
            TSLA
Date
2018-01-01  1000
and so on for the rest of the prices and tickers. Thank you.
df.loc[:, pd.IndexSlice[:, 'price']] should do it. Is that what you are after?
df.loc(axis=1)[:, 'price'] or df.loc[:, df.columns.isin(['price'], level=1)] are other options.
df.loc(axis=1)[('TSLA', 'price')]
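
To tie the suggestions together, here is a minimal sketch that rebuilds the frame from the question and pulls out the prices without looping over rows. The names wide, prices, prices_flat and price_by_ticker are only illustrative (they are not in the question), and the per-ticker loop with dropna() is one possible reading of the desired df_price, not the only one.

```python
import pandas as pd

# Rebuild the long-format frame from the question.
df = pd.DataFrame(index=['2018-01-01', '2018-01-02', '2018-01-03', '2018-01-04'])
df["ticker"] = ['TSLA', 'TSLA', 'IBM', 'IBM']
df["price"] = ['1000', '1200', '101', '108']
df["volume"] = ['100000', '123042', '1087878', '108732']
df["marketcap"] = ['1.2T', '1.4T', '30B', '35B']
df.index.name = 'Date'

# Same reshaping as in the question, with the result kept in a new
# (hypothetical) variable called `wide`.
wide = (
    df.set_index('ticker', append=True)
      .unstack('ticker')
      .swaplevel(axis=1)
      .sort_index(axis=1, level=0, sort_remaining=False)
)

# Price for every ticker; columns remain a (ticker, 'price') MultiIndex.
prices = wide.loc[:, pd.IndexSlice[:, 'price']]

# Same selection, but with the 'price' level dropped so that the columns
# are just the ticker names.
prices_flat = wide.xs('price', axis=1, level=1)

# One small frame per ticker, keeping only the dates where that ticker
# actually has a price (assumption: the NaN rows are not wanted).
price_by_ticker = {}
for ticker in wide.columns.get_level_values(0).unique():
    price_by_ticker[ticker] = wide[ticker]['price'].dropna().to_frame(name=ticker)

print(price_by_ticker['TSLA'])
```

Iterating over the unique values of the first column level avoids the row-by-row loop entirely: each wide[ticker] is already a per-ticker sub-frame, so selecting 'price' from it gives the whole column at once.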