
I am trying to pass a dataframe to a function and compute mean and std dev from different columns of the dataframe. When I execute each line of the function step by step (without writing a function as such) it works fine. However, when I try to write a function to compute, I keep getting this error:

TypeError: 'float' object has no attribute '__getitem__'

This is my code:

def computeBias(data):        

    meandata = np.array(data['mean'])
    sddata = np.array(data.sd)
    ni = np.array(data.numSamples)      

    mean = np.average(meandata, weights=ni)
    pooled_sd = np.sqrt((np.sum(np.multiply((ni - 1), np.array(sddata)**2)))/(np.sum(ni) - 1))

    return mean, pooled_sd


mean,sd = df.apply(computeBias)

This is sample data:

id           type             mean           sd              numSamples
------------------------------------------------------------------------
1             33              -0.43          0.40               101
2             23              -0.76          0.1                100
3             33               0.89          0.56               101
4             45               1.4           0.9                100

This is the full error traceback:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-134-f4dc392140dd> in <module>()
----> 1 mean,sd = df.apply(computeBias)

C:\Users\AppData\Local\Continuum\Anaconda2\lib\site-packages\pandas\core\series.pyc in apply(self, func, convert_dtype, args, **kwds)
   2353             else:
   2354                 values = self.asobject
-> 2355                 mapped = lib.map_infer(values, f, convert=convert_dtype)
   2356 
   2357         if len(mapped) and isinstance(mapped[0], Series):

pandas\_libs\src\inference.pyx in pandas._libs.lib.map_infer (pandas\_libs\lib.c:66440)()

<ipython-input-133-2af38e3e29f0> in computeBias(data)
      1 def computeBias(data):
      2 
----> 3     meandata = np.array(data['mean'])
      4     sddata = np.array(data.sd)
      5     ni = np.array(data.numSamples)

TypeError: 'float' object has no attribute '__getitem__'

Does anyone know of any workaround? TIA!

  • Please edit in the full error traceback Commented Jul 7, 2017 at 19:06
  • @OferSadan: Done. Commented Jul 7, 2017 at 19:08
  • Did you google the error? There are quite a few stackoverflow.com/questions/25950113/… questions referencing that error. Commented Jul 7, 2017 at 19:12
  • What type do you expect data to be here? From the error message, it seems to be of type float. Make sure you are passing in a dictionary or other type with a __getitem__ method. Commented Jul 7, 2017 at 19:16
  • @gobrewers14: I did. But I really couldn't understand what was going on here. Because when I do the same step by step without calling a function, it works and gives me an output. Not quite sure what is going wrong inside the function. Commented Jul 7, 2017 at 19:16

1 Answer

meandata = np.array(data['mean'])
TypeError: 'float' object has no attribute '__getitem__'

__getitem__ is the method that Python tries to call when you use indexing. In the marked line that means data['mean'] is producing the error. Evidently data is a number, a float object. You can't index a number.
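As a minimal sketch of that point, the snippet below indexes a plain float the same way the function indexes data. The variable name value and the number -0.43 are illustrative only (the number is borrowed from the sample data). The exact wording of the message depends on the Python version: Python 2 reports the missing __getitem__ attribute as in the traceback above, while Python 3 words it differently.

```python
# A plain float, standing in for whatever `data` turned out to be.
value = -0.43

try:
    # Square-bracket indexing asks the object for __getitem__,
    # which floats do not implement.
    value['mean']
except TypeError as exc:
    message = str(exc)

# The message names the offending type; its wording varies by Python version.
print(message)
```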

data['mean'] looks like you are either trying to get an item from a dictionary, or from a dataframe, using a named index. I won't dig into the rest of your code to determine what you intend.

What you need to do is understand what data really is, and what produces it.


You are using this in a df.apply(....), and apparently think that it just means

computeBias(df)   # or
computeBias(df.data)

Rather, I suspect apply is iterating over the dataframe along some dimension and passing individual values or Series to your function. It isn't passing the whole dataframe.
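One way to check this is to record what apply hands to the function, and then compare with calling the function on the whole dataframe. The sketch below assumes a recent pandas and rebuilds the sample data by hand; the list name seen is hypothetical, and computeBias is a lightly tidied copy of the function from the question.

```python
import numpy as np
import pandas as pd

# Sample data from the question, rebuilt by hand for illustration.
df = pd.DataFrame({
    "mean": [-0.43, -0.76, 0.89, 1.4],
    "sd": [0.40, 0.1, 0.56, 0.9],
    "numSamples": [101, 100, 101, 100],
})

# Record the type of each piece that DataFrame.apply passes to the function.
seen = []
df.apply(lambda piece: seen.append(type(piece).__name__))
print(seen)  # one entry per column, none of them the whole dataframe


def computeBias(data):
    # Expects the whole dataframe, so the column lookups succeed.
    meandata = np.array(data["mean"])
    sddata = np.array(data["sd"])
    ni = np.array(data["numSamples"])

    mean = np.average(meandata, weights=ni)
    pooled_sd = np.sqrt(np.sum((ni - 1) * sddata**2) / (np.sum(ni) - 1))
    return mean, pooled_sd


# Call the function directly instead of going through apply.
mean, sd = computeBias(df)
print(mean, sd)
```

Calling the function directly mirrors what happens when the lines are run step by step outside a function, which is why that version worked.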


1 Comment

I figured it was some problem with the apply function. I used apply because I wanted to pass groupedby objects as well to the function. But it doesn't work anyway. So, I am just iterating over groups and computing the statistics by sending the respective columns.
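The loop over groups described in this comment might look like the sketch below. It assumes the type column from the sample data is the grouping key, which the comment does not actually state; the dictionary name results is hypothetical, and computeBias is a lightly tidied copy of the function from the question.

```python
import numpy as np
import pandas as pd

# Sample data from the question, rebuilt by hand for illustration.
df = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "type": [33, 23, 33, 45],
    "mean": [-0.43, -0.76, 0.89, 1.4],
    "sd": [0.40, 0.1, 0.56, 0.9],
    "numSamples": [101, 100, 101, 100],
})


def computeBias(data):
    # Works on any dataframe that has these three columns.
    meandata = np.array(data["mean"])
    sddata = np.array(data["sd"])
    ni = np.array(data["numSamples"])

    mean = np.average(meandata, weights=ni)
    pooled_sd = np.sqrt(np.sum((ni - 1) * sddata**2) / (np.sum(ni) - 1))
    return mean, pooled_sd


# Iterating over a groupby yields (key, sub-dataframe) pairs, so each
# call receives a dataframe rather than a single value.
results = {}
for key, group in df.groupby("type"):
    results[key] = computeBias(group)

for key, (group_mean, group_sd) in results.items():
    print(key, group_mean, group_sd)
```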
