1

I want to calculate the daily mean of an array of values without considering negative values.

I use this array of data:

Val=['45','25','45','26','-999','87','9','5','4','5','78','98','14','25',
     '34','15','15','14'...]

that represents the hourly values of one month (30 days).

I tried to remove the negative values from the average but I didn't succeed.

What is the simplest way in Python to calculate the daily mean and get an array of 30 values?

Thanks for your help

Here is the code:

f=open('file.csv')
csv_f=csv.reader(f)
val=[]
for row in csv_f:
   val.append(row[0])
for i in range(0,len(val[:])-24,24):
   j=i+24
   mean(val[i:j])
  • Are the values stored as strings? Commented Sep 5, 2015 at 13:56
  • How did you attempt to do it? Commented Sep 5, 2015 at 14:14

4 Answers

1

Here is one option:

Say you have an array of 6 values and you're interested in the means of the non-negative values in 3-element blocks (24-element blocks in your case).

In [14]: a = np.array([3,4,-999,5,-100,6], dtype=float)

In [15]: a[a < 0] = np.nan

In [16]: np.nanmean(a.reshape((-1, 3)), axis=1)
Out[16]: array([ 3.5,  5.5])

2 Comments

Thanks, but what does a.reshape((-1, 3)) with axis=1 mean? When I try it, I get this error: np.nanmean(a.reshape((-1,24)),axis=1) ValueError: total size of new array must be unchanged
@user5276228, a.reshape(-1, 3) means that you are reshaping a 1D array into a 2D array with 3 columns and however many rows that requires. ValueError: total size of new array must be unchanged means that the total number of elements in your array is not divisible by 24.
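
If the array length is not a multiple of the block size (for example, a header row or a partial last day in the file), one way around the reshape error is to pad the tail with NaN first. A minimal sketch under that assumption; the helper name `daily_means` and the choice to pad rather than truncate are illustrative, not part of the answer above:

```python
import numpy as np

def daily_means(hourly, block=24):
    """Mean of the non-negative values in each block of `block` entries."""
    a = np.asarray(hourly, dtype=float).copy()
    a[a < 0] = np.nan                      # treat negative values as missing
    pad = (-a.size) % block                # entries needed to complete the last block
    a = np.concatenate([a, np.full(pad, np.nan)])
    return np.nanmean(a.reshape(-1, block), axis=1)

# 7 values in blocks of 3: the last block is padded with two NaNs
print(daily_means([3, 4, -999, 5, -100, 6, 8], block=3))  # means: 3.5, 5.5, 8.0
```

A block with no valid values at all comes out as nan, and NumPy emits a RuntimeWarning for it.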
1

Try a list comprehension with an if condition together with int(). You need to slice the original list of values into daily chunks first.

import random

values = random.sample(range(-5, 100), 96)

def mean(l):
    # keep only non-negative values, converting numeric strings to int
    l = [int(numeric_string) for numeric_string in l if int(numeric_string) >= 0]

    # a day with no valid values has no mean
    return sum(l) / len(l) if l else None

def chunk(l, n):
    # slice the values list into n sized chunks
    return [l[i:i + n] for i in range(0, len(l), n)]

y = [ mean(day) for day in chunk(values, 24)]

print(y)

1 Comment

But where do I specify that I want the daily average (one mean per 24-element block)?
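
The block size is the second argument to chunk, so chunk(values, 24) yields one list per day. A small sketch with a block size of 3 so the result can be checked by hand; the helpers are redefined here so the snippet runs on its own, and the name `mean_non_negative` is illustrative:

```python
def chunk(values, n):
    """Split values into consecutive blocks of n items (the last may be shorter)."""
    return [values[i:i + n] for i in range(0, len(values), n)]

def mean_non_negative(block):
    """Mean of the non-negative entries, or None if the block has none."""
    valid = [int(v) for v in block if int(v) >= 0]
    return sum(valid) / len(valid) if valid else None

hourly = ['45', '25', '-999', '26', '87', '7']
# n is the number of values per day: 3 here for illustration, 24 for hourly data
daily = [mean_non_negative(day) for day in chunk(hourly, 3)]
print(daily)  # [35.0, 40.0]
```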
0

Assuming Val contains strings, 24 values per day (so len(Val) is a multiple of 24):

# turn the 1D input array into a 2D array, one row per day
DailyVals = [[int(h) for h in Val[i:i+24]]
             for i in range(0, len(Val), 24)]
# prune negative values
ValidVals = [[h for h in day if h >= 0]
             for day in DailyVals]
# compute the mean (assumes every day has at least one valid value)
Mean = [sum(val) / len(val) for val in ValidVals]


0

Something like this avoids creating temporary lists (shown for a single block of values; apply it to each 24-value slice to get daily means):

vals = ['45','25','45','26','-999','87','9',
        '5','4','5','78','98','14','25',
        '34','15','15','14']

daily_mean = (sum(int(v) for v in vals if int(v) > -1) /
              sum(1 for v in vals if int(v) > -1))
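
To get one mean per day in the same spirit, a generator can keep a running sum and count and emit a value every 24 items, without building intermediate lists. This is a sketch, not the answer's own code; the function name `daily_means` and the `hours_per_day` parameter are assumptions made for illustration:

```python
def daily_means(values, hours_per_day=24):
    """Yield the mean of the non-negative values in each block of hours_per_day."""
    total = count = seen = 0
    for v in values:
        n = int(v)
        if n >= 0:              # skip negative (missing) values
            total += n
            count += 1
        seen += 1
        if seen == hours_per_day:
            yield total / count if count else None
            total = count = seen = 0
    if seen:                    # trailing partial day
        yield total / count if count else None

vals = ['45', '25', '45', '26', '-999', '87', '9', '5', '4',
        '5', '78', '98', '14', '25', '34', '15', '15', '14']
# blocks of 6 for illustration; the first block gives 228 / 5 = 45.6
print(list(daily_means(vals, hours_per_day=6)))
```

A block containing only negative values yields None rather than raising ZeroDivisionError.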

