How to calculate a moving average in Python for sensor data?

Question

I am trying to create a simple array to collect sensor data and calculate a moving average in Python. The final result would be stored in a variable which I could publish to MQTT. I have the code to collect the sensor data and to post it to MQTT; I just haven't figured out the part about the array and calculating the moving average.

I have searched for information on creating arrays and calculating moving averages, but I have not found an example that takes data from a variable, adds it to an array and, once 40 values have been collected, calculates a moving average and stores it in a different variable.

Any advice is appreciated!

Please be gentle as I am a newbie Python programmer...


#!/usr/bin/env python3

import math
import sys
import time

from grove.adc import ADC
import paho.mqtt.client as mqtt # Import the MQTT library

# Our "on message" event

def messageFunction (client, userdata, message):
    topic = str(message.topic)
    message = str(message.payload.decode("utf-8"))
    print(topic + message)

ourClient = mqtt.Client("openhab_mqtt") # Create a MQTT client object
ourClient.username_pw_set(username="AAAAA",password="BBBBB")
ourClient.connect("192.168.1.XXX", XXXX) # Connect to the test MQTT broker
ourClient.subscribe("openhab") # Subscribe to the topic TDS
ourClient.on_message = messageFunction # Attach the messageFunction to subscription
ourClient.loop_start() # Start the MQTT client

# CONSTANTS
Vref = 4.95   # volts
Vcal_mv = 1903  # milli-volts (using raw value 539 * 4.95/1023 * 1000)  note 10-bit A2D
mv_per_ph = 59.6
ph_per_mv = 1.0 / mv_per_ph


class GrovePH:

    def __init__(self, channel):
        self.channel = channel
        self.adc = ADC()

    @property
    def PH(self):
        value = self.adc.read(self.channel)
        if value != 0:
            voltage_mv = value * Vref / 1023.0 * 1000
            PHValue = 7 - ((voltage_mv - Vcal_mv) * ph_per_mv)
            return PHValue
        else:
            return


def main():
    if len(sys.argv) < 2:
        print('Usage: {} adc_channel'.format(sys.argv[0]))
        sys.exit(1)

    sensor = GrovePH(int(sys.argv[1]))
    print('Detecting PH...')

    while True:
        try:
            ourClient.publish("openhab/pH",'{:2.1f}'.format(sensor.PH)) # Publish message to MQTT broker
            time.sleep(1) # Sleep for a second
#            print('PH Value: {:2.1f}'.format(sensor.PH))
#            time.sleep(1)
        except KeyboardInterrupt:
            print("\nExiting...")
            sys.exit(0)


if __name__ == '__main__':
    main()

Mark Setchell · Accepted Answer · 2021-01-05 09:52:49Z

I find it easy to calculate a moving average of samples by using a deque with a maximum number of entries in it. Then you can just keep adding samples and the length looks after itself, because the deque discards the oldest sample once it is full:

#!/usr/bin/env python3

import collections
import random

# Ensure repeatable randomness ;-)
random.seed(42)

# Define a deque with max of 40 samples
samples = collections.deque(maxlen=40)

# Add 100 random samples to deque, each one between 10..20
for i in range(100):
    # Generate random sample
    value = random.randint(10,20)
    # Append to deque
    samples.append(value)
    # Show sample, number of samples, total and moving average
    N = len(samples)
    total = sum(samples)
    movingAvg = total/N
    print(f'value: {value}, num samples: {N}, total: {total}, moving average: {movingAvg}')

Check the number of samples in the output below and observe that it rises to 40 but goes no higher:

Sample Output

value: 20, num samples: 1, total: 20, moving average: 20.0
value: 11, num samples: 2, total: 31, moving average: 15.5
value: 10, num samples: 3, total: 41, moving average: 13.666666666666666
value: 14, num samples: 4, total: 55, moving average: 13.75
value: 13, num samples: 5, total: 68, moving average: 13.6
value: 13, num samples: 6, total: 81, moving average: 13.5
value: 12, num samples: 7, total: 93, moving average: 13.285714285714286
value: 11, num samples: 8, total: 104, moving average: 13.0
value: 20, num samples: 9, total: 124, moving average: 13.777777777777779
value: 18, num samples: 10, total: 142, moving average: 14.2
value: 11, num samples: 11, total: 153, moving average: 13.909090909090908
value: 19, num samples: 12, total: 172, moving average: 14.333333333333334
value: 16, num samples: 13, total: 188, moving average: 14.461538461538462
value: 10, num samples: 14, total: 198, moving average: 14.142857142857142
value: 10, num samples: 15, total: 208, moving average: 13.866666666666667
value: 11, num samples: 16, total: 219, moving average: 13.6875
value: 13, num samples: 17, total: 232, moving average: 13.647058823529411
value: 13, num samples: 18, total: 245, moving average: 13.61111111111111
value: 18, num samples: 19, total: 263, moving average: 13.842105263157896
value: 19, num samples: 20, total: 282, moving average: 14.1
value: 10, num samples: 21, total: 292, moving average: 13.904761904761905
value: 18, num samples: 22, total: 310, moving average: 14.090909090909092
value: 13, num samples: 23, total: 323, moving average: 14.043478260869565
value: 20, num samples: 24, total: 343, moving average: 14.291666666666666
value: 18, num samples: 25, total: 361, moving average: 14.44
value: 16, num samples: 26, total: 377, moving average: 14.5
value: 13, num samples: 27, total: 390, moving average: 14.444444444444445
value: 17, num samples: 28, total: 407, moving average: 14.535714285714286
value: 19, num samples: 29, total: 426, moving average: 14.689655172413794
value: 14, num samples: 30, total: 440, moving average: 14.666666666666666
value: 10, num samples: 31, total: 450, moving average: 14.516129032258064
value: 12, num samples: 32, total: 462, moving average: 14.4375
value: 16, num samples: 33, total: 478, moving average: 14.484848484848484
value: 15, num samples: 34, total: 493, moving average: 14.5
value: 14, num samples: 35, total: 507, moving average: 14.485714285714286
value: 12, num samples: 36, total: 519, moving average: 14.416666666666666
value: 13, num samples: 37, total: 532, moving average: 14.378378378378379
value: 15, num samples: 38, total: 547, moving average: 14.394736842105264
value: 11, num samples: 39, total: 558, moving average: 14.307692307692308
value: 11, num samples: 40, total: 569, moving average: 14.225
value: 16, num samples: 40, total: 565, moving average: 14.125
value: 11, num samples: 40, total: 565, moving average: 14.125
value: 15, num samples: 40, total: 570, moving average: 14.25
value: 15, num samples: 40, total: 571, moving average: 14.275
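
To tie this back to the question, the same deque can sit inside the publishing loop. The sketch below is one possible way to do that, under a few assumptions: `make_averager` and `run` are names invented for illustration, `read_ph` stands in for reading `sensor.PH`, and `publish` stands in for `ourClient.publish`. Readings of `None` (which the question's `PH` property returns when the ADC reads 0) are skipped, and an average is only reported once 40 samples have been collected, as the question asks.

```python
import collections

WINDOW = 40  # number of samples in the moving average, as in the question


def make_averager(window=WINDOW):
    """Return a function that accepts one reading and returns the moving
    average, or None until `window` valid readings have been collected."""
    samples = collections.deque(maxlen=window)

    def add(reading):
        # Skip missing readings (the question's PH property can return None)
        if reading is not None:
            samples.append(reading)
        if len(samples) < window:
            return None
        return sum(samples) / len(samples)

    return add


def run(read_ph, publish, count):
    """Hypothetical loop: read, average, publish. `read_ph` and `publish`
    stand in for reading sensor.PH and calling ourClient.publish."""
    add = make_averager()
    for _ in range(count):
        avg = add(read_ph())
        if avg is not None:
            publish("openhab/pH", "{:.1f}".format(avg))


if __name__ == "__main__":
    # Fake sensor: a constant pH of 7.0, so the demo needs no hardware
    run(lambda: 7.0, lambda topic, payload: print(topic, payload), count=42)
```

In the real script the loop would also keep its `time.sleep(1)` between readings, so the first average would appear after roughly 40 seconds.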

For the purists, I agree it is probably more efficient to keep my variable total as a running total of the last N samples, and subtract the oldest sample from the total as it leaves the window of interesting samples. That way you can avoid having to run sum(samples) on every iteration. But for small window sizes and non-performance-critical uses, the suggested method definitely has the benefit of simplicity.
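
The running-total variant described above could look something like this. It is a sketch rather than part of the original answer: the class name `RunningAverage` and its `add` method are made up for illustration. The deque still holds the window, but the total is updated incrementally, so each new sample costs a constant amount of work instead of a full `sum()`.

```python
import collections


class RunningAverage:
    """Moving average over the last `maxlen` samples using a running total."""

    def __init__(self, maxlen=40):
        self.samples = collections.deque(maxlen=maxlen)
        self.total = 0

    def add(self, value):
        # If the window is full, the oldest sample is about to be evicted
        # by append(), so take it out of the running total first
        if len(self.samples) == self.samples.maxlen:
            self.total -= self.samples[0]
        self.samples.append(value)
        self.total += value
        return self.total / len(self.samples)


if __name__ == "__main__":
    avg = RunningAverage(maxlen=3)
    for value in [10, 20, 30, 40]:
        print(value, avg.add(value))
```

With floating-point samples, a running total can slowly accumulate rounding error over a very long run; recomputing it from `sum(self.samples)` every so often would reset that drift.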

Keywords: Python, moving average, deque, fixed length, maximum length, window, windowed.

vaconingham · Accepted Answer · 2022-05-26 13:18:15Z

Here's a simple way to calculate moving averages using plain Python.

You may change the time window by changing the value subtracted in the window variable. For example, if you wanted a 30-minute time window, you would change the number to 3000000000.

In this example, the entries are saved in a dictionary named data. However, you may choose the data source that makes sense to you. The most important thing is that each entry within the collection has a timestamp with microseconds, stored as an integer in the same HHMMSSffffff form that the function compares against.

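from datetime import datetime

# Maps an integer timestamp in HHMMSSffffff form to a numeric reading
data = {}

def one_minute_averages():
    # Oldest timestamp still inside the window (now minus one minute)
    window = int(datetime.now().strftime("%H%M%S%f")) - 100000000
    # Keep only the entries recorded inside the window
    history = {stamp: value for stamp, value in data.items() if stamp >= window}
    avg = sum(history.values()) / len(history)
    return avg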

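Note: You may want to add some error handling to avoid division by zero (when no entries fall inside the window) or if the function can't access your data. Also be aware that subtracting from an HHMMSSffffff integer is not real time arithmetic: just after the top of an hour, or just after midnight, the computed cutoff no longer corresponds to a real clock time, so the window comes out wrong.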
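
A variant that does the time arithmetic with real `datetime` and `timedelta` objects, and returns `None` rather than dividing by zero when the window is empty, might look like the sketch below. It is an alternative to the code above rather than the answer's own method: `window_average` and its parameters are names chosen for illustration, and it assumes `data` maps `datetime` timestamps to numeric readings.

```python
from datetime import datetime, timedelta


def window_average(data, window=timedelta(minutes=1), now=None):
    """Average the values in `data` whose timestamp lies within `window`
    of `now`. `data` maps datetime objects to numeric readings.
    Returns None if no entry falls inside the window."""
    if now is None:
        now = datetime.now()
    cutoff = now - window
    recent = [value for stamp, value in data.items() if cutoff <= stamp <= now]
    if not recent:
        return None
    return sum(recent) / len(recent)


if __name__ == "__main__":
    now = datetime(2022, 5, 26, 13, 0, 30)  # just after the top of the hour
    data = {
        now - timedelta(seconds=90): 5.0,  # older than one minute, ignored
        now - timedelta(seconds=45): 7.0,  # 12:59:45, inside the window
        now - timedelta(seconds=10): 7.5,  # 13:00:20, inside the window
    }
    print(window_average(data, now=now))
```

Passing `now` explicitly is only there to make the demo repeatable; in normal use it can be left out. A different window is a matter of passing, for example, `window=timedelta(minutes=30)`.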

I welcome you to comment :)
