I'm a newbie to Pandas and I'm trying to apply it to a script that I have already written. I have a csv file from which I extract the data, and use the columns 'candidate', 'final track' and 'status' for my data frame.
My problem is that I would like to filter the data, perhaps using the method shown in Wes McKinney's 10min tutorial ('http://nbviewer.ipython.org/urls/gist.github.com/wesm/4757075/raw/a72d3450ad4924d0e74fb57c9f62d1d895ea4574/PandasTour.ipynb'). In cell In [80], he uses aapl_bars.close_price['2009-10-15'].
I would like to use a similar method to select all the rows that have * as their status. Rows without a * in the status column should be dropped entirely, including their values in the other columns.
My code at the moment:
    import pandas as pd

    def establish_current_tacks(filename):
        df = pd.read_csv(filename)
        # Keep columns 0, 10 and 11: candidate, final track and status.
        cols = [df.iloc[:, 0], df.iloc[:, 10], df.iloc[:, 11]]
        current_tracks = pd.concat(cols, axis=1)
        return current_tracks
My DataFrame:
>>> current_tracks
<class 'pandas.core.frame.DataFrame'>
Int64Index: 707 entries, 0 to 706
Data columns (total 3 columns):
candidate 695 non-null values
final track 670 non-null values
status 670 non-null values
dtypes: float64(1), object(2)
I would like to use something such as current_tracks.status['*'], but this does not work.
Apologies if this is obvious; I'm struggling a little to get my head around it.
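For reference, the kind of selection described above is usually done in pandas with a boolean mask rather than a label lookup. The following is only a minimal sketch: the sample values are made up to stand in for the real CSV, and the names mask and starred are hypothetical.

```python
import pandas as pd

# Hypothetical sample data standing in for the three columns in the question.
current_tracks = pd.DataFrame({
    "candidate": [1.0, 2.0, 3.0, None],
    "final track": ["A", "B", None, "D"],
    "status": ["*", None, "*", "x"],
})

# Comparing the column to "*" gives a boolean Series; missing values compare as False.
mask = current_tracks["status"] == "*"

# Indexing the frame with the mask keeps only the rows where it is True,
# so the other columns are filtered along with status.
starred = current_tracks[mask]
print(starred)
```

A lookup such as current_tracks.status['*'] searches the index labels for '*', not the values, which is probably why it fails here: the index is the integers 0 to 706.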
You can pass usecols to read_csv like so:
    df = read_csv(filename, usecols=[0, 10, 11])
or you can pass a list of the column names:
    df = read_csv(filename, usecols=['candidate', 'final track', 'status'])
It will load much quicker.
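A small sketch of the usecols suggestion, using an in-memory CSV instead of the real file. The csv_text contents and the positions [0, 2, 3] are assumptions made up for this example; the real file would use positions 0, 10 and 11.

```python
import io
import pandas as pd

# Hypothetical CSV with one extra column, used in place of the real file.
csv_text = (
    "candidate,other,final track,status\n"
    "1,a,T1,*\n"
    "2,b,T2,\n"
    "3,c,T3,*\n"
)

# Select columns by position...
by_position = pd.read_csv(io.StringIO(csv_text), usecols=[0, 2, 3])

# ...or by name; both read only the requested columns.
by_name = pd.read_csv(
    io.StringIO(csv_text),
    usecols=["candidate", "final track", "status"],
)
print(by_name)
```

Either form returns the three-column frame directly, so the iloc and concat step is no longer needed.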