Pandas python: getting one value from a DataFrame

Question

I implemented a function that finds the first occurrence of a value in a pandas DataFrame, but I feel the implementation is kind of ugly. Is there a nicer way to implement it?

[mots] is an array of strings

# Probably the worst implementation in the world...
def find_singular_value(self, mots):
    bool_table = self.document.isin(mots)
    for i in range(bool_table.shape[0]):
        for j in range(bool_table.shape[1]):
            boolean = bool_table.iloc[i, j]
            if boolean:
                return self.document.iloc[i, j + 1]

Andy Pender · Accepted Answer · 2018-10-24 09:06:15Z

1

Here's a solution for getting the j+1 value. It uses df.unstack and df.shift

df = self.document.unstack()
vals = df[df.isin(mots).shift().fillna(False)]

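vals will contain the value that follows each match in the unstacked Series. Be aware that unstack() lays the frame out column by column, so shift() moves each match down to the next row of the same column (and a match at the bottom of a column spills into the top of the next one); to land on the next column of the same row, shift the mask along axis=1 instead. You can then select the first one as in my earlier answer. Hopefully this works for you.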
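One way to land on the cell to the right of a match (same row, next column, as in the question's loop) is to shift the isin mask along axis=1. The following is only a sketch: the function name and the sample data are made up for illustration, and unlike the original loop it silently ignores a match in the last column instead of raising an IndexError.

```python
import pandas as pd

def first_value_right_of_match(document, mots):
    """Return the cell just right of the first match (row by row), else None."""
    # True where a cell matches; shifting along axis=1 moves each True one
    # column to the right, so matches in the last column fall off the edge.
    mask = document.isin(mots).shift(periods=1, axis=1, fill_value=False)
    mask = mask.astype(bool).to_numpy()
    # NumPy boolean indexing returns the selected cells in row-major order.
    hits = document.to_numpy()[mask]
    return hits[0] if hits.size else None

# Hypothetical sample data, not taken from the question.
sample = pd.DataFrame({
    "key": ["x", "name", "y"],
    "val": ["1", "Bob", "2"],
    "extra": ["name", "z", "w"],
})
result = first_value_right_of_match(sample, ["name"])  # "Bob"
```

Working on the NumPy arrays keeps the scan order explicit (row by row), which is the order the nested loops in the question use.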


1 Comment

Antoine Dussarps Over a year ago

Thanks! You helped me pick up a lot of new pandas keywords :)

Andy Pender · Accepted Answer · 2018-10-23 12:53:05Z

1

This one-liner should give you what you need.

self.document[self.document.isin(mots)].melt()["value"].dropna().values[0]

It applies your isin mask to the original df, then finds the first non-NaN value using pd.melt and df.dropna.

Here's a simple breakdown:

>>> import pandas as pd
>>> df = pd.DataFrame({"a":[1,2,3],"b":[4,5,6],"c":[7,8,9]})
>>> df.isin([4,6])
       a      b      c
0  False   True  False
1  False  False  False
2  False   True  False
>>> df[df.isin([4,6])]
    a    b   c
0 NaN  4.0 NaN
1 NaN  NaN NaN
2 NaN  6.0 NaN
>>> df[df.isin([4,6])].melt()
  variable  value
0        a    NaN
1        a    NaN
2        a    NaN
3        b    4.0
4        b    NaN
5        b    6.0
6        c    NaN
7        c    NaN
8        c    NaN
>>> df[df.isin([4,6])].melt()["value"]
0    NaN
1    NaN
2    NaN
3    4.0
4    NaN
5    6.0
6    NaN
7    NaN
8    NaN
Name: value, dtype: float64
>>> df[df.isin([4,6])].melt()["value"].dropna()
3    4.0
5    6.0
Name: value, dtype: float64
>>> df[df.isin([4,6])].melt()["value"].dropna().values
array([ 4.,  6.])
>>> df[df.isin([4,6])].melt()["value"].dropna().values[0]
4.0
>>>
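One detail worth checking in the breakdown above: melt() stacks the columns one after another, so the "first" match is the first one going down column a, then b, then c, whereas the loop in the question scans row by row. The sketch below contrasts the two orders on the same df; first_match_row_major is a hypothetical helper name, not part of pandas.

```python
import pandas as pd

def first_match_row_major(df, wanted):
    """Return the first matching value scanning row by row, else None."""
    mask = df.isin(wanted).to_numpy()
    # Boolean indexing on a 2-D array yields values in row-major order.
    hits = df.to_numpy()[mask]
    return hits[0] if hits.size else None

df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6], "c": [7, 8, 9]})

# Column-by-column order (melt): 2 sits in column "a", so it comes first.
column_first = df[df.isin([2, 4])].melt()["value"].dropna().values[0]  # 2.0

# Row-by-row order: 4 sits in row 0, so it comes first.
row_first = first_match_row_major(df, [2, 4])  # 4
```

When only one cell can match, both orders give the same answer; the difference matters only when several cells match.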

edited Oct 23, 2018 at 12:53

answered Oct 23, 2018 at 8:09

6 Comments

Antoine Dussarps Over a year ago

Hmm, it doesn't seem to output anything (empty value). I'll investigate this later with pd.melt. Thanks for your answer anyway!

Antoine Dussarps Over a year ago

Is "value" supposed to be a string like this? Shouldn't it be True?

Andy Pender Over a year ago

Yes, "value" should be a string. Using melt transforms the dataframe into two columns: 'variable' and 'value'. Then I take the 'value' series, drop the NaN values and return the first result.

Andy Pender Over a year ago

I've added a breakdown of the operations to my answer. Does this help?

Antoine Dussarps Over a year ago

Oh yes, that looks good, but I'm actually taking the cell just after the one I found (j+1)! I can't find a way to do that with your method... Is there a way to get the coordinates of the cell found with isin?
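For the coordinates asked about in the last comment, one possible approach (a sketch, not something proposed in the answers) is to hand the isin mask to numpy.argwhere, which returns the integer (row, column) position of every True cell in row-major order. The helper name match_positions is made up for illustration.

```python
import numpy as np
import pandas as pd

def match_positions(df, wanted):
    """Return an array of [row, col] integer positions of matching cells."""
    return np.argwhere(df.isin(wanted).to_numpy())

df = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6], "c": [7, 8, 9]})
positions = match_positions(df, [4, 6])  # [[0, 1], [2, 1]]

# Take the first match and read the cell one column to its right, if any.
i, j = (int(k) for k in positions[0])
label = (df.index[i], df.columns[j])  # (0, "b")
neighbour = df.iat[i, j + 1] if j + 1 < df.shape[1] else None  # 7
```

The positions are integer offsets, so they pair with iat/iloc; df.index and df.columns translate them back to labels when needed.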
