15

Okay, so say I have a pandas dataframe x, and I'm interested in extracting a value from it:

> x.loc[bar==foo]['variable_im_interested_in']

Let's say that returns the following, of type pandas.core.series.Series:

24    Boss
Name: ep_wb_ph_brand, dtype: object

But all I want is the string 'Boss'. Wrapping the first line of code in str() doesn't help either, I just get:

'24    Boss\nName: ep_wb_ph_brand, dtype: object'

How do I just extract the string?

6
  • Can you add the output of type(x.loc[bar==foo]['variable_im_interested_in']) ... it's unclear to me what is being returned. If 'Boss' is the expected value stored in the relevant cell, there's no reason why that other index number, name and dtype stuff should be part of the value. Commented Feb 26, 2015 at 1:09
  • yeah @Mr. F it's a pandas.core.series.Series Commented Feb 26, 2015 at 1:11
  • 1
    Ah, it's a length-1 Series. So just access the 0th entry! Try this: x.loc[bar==foo]['variable_im_interested_in'][0]. Commented Feb 26, 2015 at 1:12
  • Hm, that totally makes sense, although adding [0] onto the end throws a pandas key error, and adding [:1] onto the end (to get rid of the error) returns the same pandas series, instead of the string... (also +1000 if your name references Arrested Development. Please say it does.) Commented Feb 26, 2015 at 1:16
  • @HillarySanders That's a typo, my bad. The index number is 24 in your case. Try [24] instead of [0], or try the .values[0] option I put in my answer below. Commented Feb 26, 2015 at 1:18

3 Answers

10

Based on your comments, this code is returning a length-1 pandas Series:

x.loc[bar==foo]['variable_im_interested_in']

If you assign this value to a variable, then you can just access the 0th element to get what you're looking for:

my_value_as_series = x.loc[bar==foo]['variable_im_interested_in']

# Label-based lookup: this assumes the index label is 0, but from your
# example, it might be 24 instead.
plain_value = my_value_as_series[0]

# Likewise, this needs the actual index label, not necessarily 0.
# (Note: .ix was removed in pandas 1.0; .loc is the label-based equivalent.)
also_plain_value = my_value_as_series.ix[0]

# This one works with zero, since `values` is a new ndarray.
plain_value_too = my_value_as_series.values[0]

You don't have to assign to a variable to do this, so you could just write x.loc[bar==foo]['variable_im_interested_in'][0] (or similar for the other options), but cramming more and more accessor and fancy indexing syntax onto a single expression is usually a bad idea.

Also note that you can directly index the column of interest inside of the call to loc:

x.loc[bar==foo, 'variable_im_interested_in'][24]
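
For reference, here is a small self-contained sketch of the options above. The frame, the column names `brand_id` and `brand`, and the data are made up for illustration and are not from the question. It also shows `.iloc[0]`, which selects by position and so should work whatever the index label happens to be:

```python
import pandas as pd

# Hypothetical data: index labels start at 23 so the matching row has label 24.
x = pd.DataFrame(
    {"brand_id": [7, 8, 9], "brand": ["Acme", "Boss", "Zed"]},
    index=[23, 24, 25],
)

# Boolean mask and column selected in a single .loc call.
matches = x.loc[x["brand_id"] == 8, "brand"]  # length-1 Series, index label 24

by_label = matches.loc[24]       # needs the actual index label
by_position = matches.iloc[0]    # first element by position, any label
via_ndarray = matches.values[0]  # the underlying array is indexed from 0

print(by_label, by_position, via_ndarray)
```

If the label is not known in advance, the positional forms (`.iloc[0]` or `.values[0]`) are probably the safer choice.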

2 Comments

Thanks Mr. F. The first two throw errors ([0] and .ix[0]), but the third strategy works (.values[0]).
@HillarySanders Yes, the first two errors are expected. For your case, it is printing out that the number of the index is 24, so you'll need to use 24 instead of 0. You will not need to do that for the case when you use .values since that is a new ndarray re-indexed from 0.
9

Code to get the last value of a column (run in a Jupyter notebook; input lines are marked with >):

> import pandas
> df = pandas.DataFrame(data=['a', 'b', 'c'], columns=['name'])
> df
    name
0   a
1   b
2   c
> df.tail(1)['name'].values[0]
'c'

1 Comment

That is beautiful and pythonic! Why that syntax vs "df['name'].tail(1).values[0]" Same?
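
On the question in the comment above: for a simple frame like this one, both orderings should return the same value, since one takes the last row and then the column while the other takes the column and then its last element. A minimal sketch that checks this on the same toy frame, with `.iloc[-1]` added as a positional alternative that is not in the original answer:

```python
import pandas as pd

df = pd.DataFrame(data=["a", "b", "c"], columns=["name"])

# Last row first, then the column.
rows_then_column = df.tail(1)["name"].values[0]

# Column first, then its last element.
column_then_rows = df["name"].tail(1).values[0]

# Positional access to the last element of the column.
by_position = df["name"].iloc[-1]

print(rows_then_column, column_then_rows, by_position)
```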
0

You could use the string's split function.

>>> s = '24    Boss\nName: ep_wb_ph_brand, dtype: object'
>>> s.split()[1]
'Boss'

1 Comment

Yeah. Sort of my last resort; it seems inelegant. But you are right.
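
One caveat with the split approach, sketched here with a made-up value (`'Hugo Boss'` is an assumption, not from the question): if the cell itself contains whitespace, splitting the printed representation keeps only the first word, whereas reading the element directly returns the whole string.

```python
import pandas as pd

# Hypothetical length-1 Series whose value contains a space.
s = pd.Series(["Hugo Boss"], index=[24], name="ep_wb_ph_brand")

# Splitting the printed form on whitespace truncates the value.
from_split = str(s).split()[1]

# Reading the element by position returns the full string.
from_iloc = s.iloc[0]

print(repr(from_split), repr(from_iloc))
```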
