2

I have a pandas DataFrame with a column that contains lists. I am trying to get the mean of each list in this column.

Here is an example of what my DataFrame looks like:

    Loc         Background
0   115227854   [0.000120481927711]
1   115227854   [0.000129117642312, 0.000131429072111, 0.00016...
2   115227855   [0.000123193166886]
3   115227855   [0.000142845482001, 0.000184789750329, 0.00018...
4   115227856   [0.000173490631506]

I would like to do something like this to set a new Mean column equal to the mean of the data in each of the lists found in the Background column:

sig_vars['Mean'] = sig_vars['Background'].mean()

And here is the DataFrame's data as a dict, if needed (pass it to pd.DataFrame to rebuild the frame):

df = {'Background': {0: [0.00012048192771084337],
  1: [0.00012911764231185137,
   0.0001314290721107509,
   0.000163015792154865,
   0.00018832391713747646,
   0.00019627513412134165,
   0.00020383723596708027,
   0.0002114408734430263,
   0.00022564565426983117,
   0.000247843759294141],
  2: [0.00012319316688567673],
  3: [0.00014284548200146926,
   0.00018478975032851512,
   0.00018864365214110544,
   0.00019392685725367248,
   0.00022931689046296532,
   0.00023965141612200435,
   0.00036566589684372596,
   0.00043096760847454704,
   0.0004584752423369138],
  4: [0.00017349063150589867]},
 'Loc': {0: 115227854, 1: 115227854, 2: 115227855, 3: 115227855, 4: 115227856}}

5 Answers

5

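You can also use np.mean to achieve the same, but only when every list has the same length, so that the lists stack into a 2-D array (the question's sample data has lists of different lengths, for which axis=1 fails):

import numpy as np
np.mean(df['Background'].tolist(), axis=1)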
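A minimal sketch of where the axis=1 call applies, using made-up frames (`rect` and `ragged` are hypothetical names, not from the question). When all lists have the same length they stack into a 2-D array and axis=1 gives one mean per row; when the lengths differ, averaging each list on its own is a simple fallback.

```python
import numpy as np
import pandas as pd

# Hypothetical frame in which every list has the same length.
rect = pd.DataFrame({"Background": [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]})
# The lists stack into a 2 x 3 array, so axis=1 yields one mean per row.
row_means = np.mean(rect["Background"].tolist(), axis=1)

# Hypothetical frame with lists of different lengths, as in the question.
ragged = pd.DataFrame({"Background": [[1.0], [2.0, 4.0]]})
# Average each list separately instead of stacking them.
ragged_means = [np.mean(values) for values in ragged["Background"]]
```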

4

Use tolist to rebuild the lists as a DataFrame, then take the row-wise mean:

pd.DataFrame(sig_vars['Background'].values.tolist()).mean(1)
Out[498]: 
0    0.000120
1    0.000189
2    0.000123
3    0.000270
4    0.000173
dtype: float64

#sig_vars['Mean'] = pd.DataFrame(sig_vars['Background'].values.tolist()).mean(1)
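This seems to work with lists of different lengths because pandas pads the shorter rows with NaN, and mean skips NaN by default. A small sketch with made-up values (`sample` and `wide` are hypothetical names); passing `index=sample.index` is an added precaution so the result lines up even when the frame does not have a default index.

```python
import pandas as pd

# Hypothetical data: one short list and one longer list.
sample = pd.DataFrame({"Background": [[1.0], [2.0, 4.0, 6.0]]})

# Each list becomes a row; the shorter row is padded with NaN.
wide = pd.DataFrame(sample["Background"].tolist(), index=sample.index)

# mean(axis=1) ignores NaN by default, so padding does not skew the result.
sample["Mean"] = wide.mean(axis=1)
```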


4

Use pandas.Series.apply:

    df['Mean'] = df['Background'].apply(np.mean)
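Note that the question posts its data as a nested dict, so it has to be turned into a DataFrame before apply is available. A sketch under that assumption, with a shortened, made-up stand-in for the posted dict (`data` and `frame` are hypothetical names):

```python
import numpy as np
import pandas as pd

# Shortened, made-up stand-in for the nested dict posted in the question:
# outer keys are column names, inner keys are row labels.
data = {
    "Background": {0: [1.0], 1: [2.0, 4.0], 2: [3.0]},
    "Loc": {0: 10, 1: 10, 2: 11},
}
frame = pd.DataFrame(data)

# apply calls np.mean once per cell, i.e. once per list.
frame["Mean"] = frame["Background"].apply(np.mean)
```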


1

Use a list comprehension, converting each list to an array:

df['Mean'] = [np.array(x).mean() for x in df.Background.values]


1

Here is what I can think of.

  1. Iterate through the column and store each list's mean in a new DataFrame (this needs import numpy, and reuses the index of sig_vars so the join in step 2 lines up).

    df = pandas.DataFrame([numpy.mean(x) for x in sig_vars['Background']], columns=['mean'], index=sig_vars.index)
    
  2. Join the column with the main dataframe.

    sig_vars = sig_vars.join(df)
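The two steps above can be sketched end to end as follows. The values and the non-default index are made up for illustration, and `means` is a hypothetical name. Because join aligns rows on the index, the helper frame is built with the same index as `sig_vars`.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with a non-default index to show index alignment.
sig_vars = pd.DataFrame(
    {"Loc": [10, 10, 11], "Background": [[1.0], [2.0, 4.0], [3.0]]},
    index=[5, 6, 7],
)

# Step 1: one mean per list, stored under the same index as sig_vars.
means = pd.DataFrame(
    {"mean": [np.mean(values) for values in sig_vars["Background"]]},
    index=sig_vars.index,
)

# Step 2: join matches rows by index label.
sig_vars = sig_vars.join(means)
```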
    
