
I'm writing an app to monitor water consumption at different water meters. Simplifying a bit, I can get a list of consumption readings for each meter every 15 minutes. I have a query that returns something like this:

Date             | ReadingPoint | Consumption
-----------------+--------------+------------
03/05/2014 02:00 | 1            | 12
03/05/2014 02:00 | 2            | 12
03/05/2014 02:15 | 1            | 7
03/05/2014 02:15 | 2            | 7
03/05/2014 02:30 | 1            | 11
03/05/2014 02:30 | 2            | 11
03/05/2014 02:45 | 1            | 23
03/05/2014 02:45 | 2            | 23
....

I need to periodically store some pre-calculations over these readings. Some examples:

  • Store the total consumption of each day
  • Store the maximum reading of each day
  • Store the monthly average consumption

These calculations are user-defined, so I'm not sure how to model this. The pre-calculated tables will be queried intensively, so I think the ideal approach is to store each calculation in a different table, but I'm not sure whether Django supports that kind of dynamic model creation.
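Computing any single one of these is straightforward; a minimal sketch, assuming a hypothetical Reading model with date, reading_point and consumption fields (TruncDate needs a reasonably recent Django):

from django.db.models import Sum
from django.db.models.functions import TruncDate

# Group the 15-minute readings by calendar day and meter,
# then sum the consumption within each group.
daily_totals = (
    Reading.objects
    .annotate(day=TruncDate('date'))
    .values('day', 'reading_point')
    .annotate(total=Sum('consumption'))
)

My problem is how to store many of these user-defined results, not how to compute one of them.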

So here are my questions:

  • I'm sure there are tools for this kind of thing, but I've never needed them before, and I'd like to avoid days or weeks of research and testing. Is there any "standard" way to achieve this behaviour?

  • For launching the process(es) that run the pre-calculations, I'm thinking of using Celery. Is it the right choice? (A rough sketch of what I have in mind follows this list.)
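The task and schedule names in this sketch are placeholders; the idea is that Celery beat drives the periodic run:

from celery import shared_task
from celery.schedules import crontab

@shared_task
def run_precalculations():
    # Placeholder body: compute and store each user-defined aggregate.
    pass

# Celery 4+ style beat schedule (names are made up):
# app.conf.beat_schedule = {
#     'nightly-precalculations': {
#         'task': 'myapp.tasks.run_precalculations',
#         'schedule': crontab(hour=0, minute=5),
#     },
# }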

2 Answers


How about one generic computation table?

computation_id         | key                       | value
-----------------------+---------------------------+---------
monthly_average        | "reading_point_id+month"  | 1.2
weekly_maximum         | "reading_point_id+week"   | 9
monthly_global_maximum | "month"                   | 12

The key format would be defined alongside the user-defined computation. The computation_id and key together would be your compound primary key. The computation_id could be a numeric field (that would give you faster index lookups).
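A minimal Django sketch of that table (field types are my guesses, and since Django has no native compound primary keys, unique_together stands in for one):

from django.db import models

class Computation(models.Model):
    computation_id = models.IntegerField()   # numeric for fast index lookups
    key = models.CharField(max_length=100)   # e.g. "reading_point_id+month"
    value = models.FloatField()

    class Meta:
        unique_together = ('computation_id', 'key')  # the compound key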

I suspect performance would be quite good, despite the one large table.

I hope I'm not misunderstanding your requirements.


1 Comment

Thank you! I like this approach, although the computed keys could be a pain to generate/decode. Let me test it before accepting your answer.

Have you considered using a simple key-value cache like Redis or Memcached? You can store all the values in the cache on the fly, and if you ever need to recalculate, you can rebuild the cache. If all you want is fast lookup of the computed values, a cache would work well. Without knowing what your queries will look like, it may be premature to suggest any optimization of database tables.
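For example, with Django's cache framework and a Redis or Memcached backend configured in CACHES, storing and reading a value is one call each; rebuilding simply means calling set() again with a fresh result (the key format below is made up):

from django.core.cache import cache

# Store a computed value under a descriptive key, then read it back.
cache.set('daily_total:1:2014-05-03', 53)
total = cache.get('daily_total:1:2014-05-03')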

If caching feels premature, you can also write your own ModelManager method to compute the values dynamically:

from django.db import models
from django.db.models import Avg, Max, Min, Sum

class WaterMeterManager(models.Manager):
    def pre_calculate(self):
        # One query computing all four aggregates over 'value'
        return self.get_queryset().aggregate(Avg('value'), Max('value'), Min('value'), Sum('value'))

class WaterMeter(models.Model):
    value = models.IntegerField()  # the consumption reading
    objects = WaterMeterManager()

You can use this as:

WaterMeter.objects.pre_calculate().get('value__min')

1 Comment

Any easy way of rebuilding the cache?
