I am working with a dictionary where each key maps to a list of tuples. It looks like this:

dict1 = {'key1': [(time1, value1), (time2, value2), (time3, value3)],
         'key2': [(time4, value4), (time5, value5), (time6, value6)],
         'key3': [(time7, value7), (time8, value8), (time9, value9)], ...}

What I wish to do for each key is to find the largest drop in 'valueX' from 'timeX' to 'timeY'.

The tuples are ordered so that

time1 < time2 < time3

And it is (typically) true that

value1 > value2 > value3

Both things are true for all keys.

So looking at the first key, what I wish to do is to calculate

value2 - value1 and value3 - value2

And save the times between which the biggest drop occurs. Let's say that

value2 - value1 > value3 - value2

Then I wish to save time1 and time2, since it was between those two time values that the largest drop occurred.

I am thinking of using a for-loop like the following:

for key in dict1:
    for i in dict1[key]:

But I cannot figure out how to

1) loop through the values, calculate the difference between the present value and the previous value, save it, and compare it to the largest drop observed so far

2) save the times that correspond to the largest drop in 'value'.

I hope you can help me out here. Thanks a lot.

2 Answers

Assuming that the lists are already sorted by time, and that you always want to compare consecutive values (rather than, say, values separated by a fixed time difference), you can use the zip(lst, lst[1:]) recipe to iterate over consecutive pairs in the list, and use max with a custom key function to find the pair with the biggest difference.

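def biggest_drop(timeseries):
    # Pair each (time, value) tuple with its successor
    pairs = zip(timeseries, timeseries[1:])
    # Pick the pair whose value falls the most (earlier value minus later value)
    ((t1, v1), (t2, v2)) = max(pairs, key=lambda p: p[0][1] - p[1][1])
    return (t1, t2)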

dict1 = {'key1': [("time1", 23), ("time2", 22), ("time3", 24)],
         'key2': [("time4", 12), ("time5", 9), ("time6", 3)],
         'key3': [("time7", 43), ("time8", 50), ("time9", 30)]}
print({k: biggest_drop(v) for k, v in dict1.items()})
# {'key3': ('time8', 'time9'), 'key2': ('time5', 'time6'), 'key1': ('time1', 'time2')}

Or shorter (but not necessarily better):

def biggest_drop(timeseries):
    return next(zip(*max(zip(timeseries, timeseries[1:]), 
                         key=lambda p: p[0][1] - p[1][1])))

Also, note that if you are looking for the biggest drop, you have to find the maximum of value1 - value2 rather than value2 - value1.
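The same sign convention can be spelled out with an explicit loop, closer to the for-loop the question starts from. This is only a minimal sketch: the name `biggest_drop_loop` is hypothetical, and it assumes each list holds at least two (time, value) tuples already sorted by time.

```python
def biggest_drop_loop(timeseries):
    """Return (t1, t2) for the consecutive pair with the largest drop in value.

    Hypothetical helper for illustration; returns None if there are
    fewer than two (time, value) tuples.
    """
    best_drop = None
    best_times = None
    # Walk over consecutive pairs by index: (i - 1, i)
    for i in range(1, len(timeseries)):
        prev_time, prev_value = timeseries[i - 1]
        curr_time, curr_value = timeseries[i]
        drop = prev_value - curr_value  # positive when the value falls
        # Keep the first pair seen with the largest drop so far
        if best_drop is None or drop > best_drop:
            best_drop = drop
            best_times = (prev_time, curr_time)
    return best_times


series = [("time1", 23), ("time2", 22), ("time3", 24)]
print(biggest_drop_loop(series))  # ('time1', 'time2')
```

Because `drop` is computed as the earlier value minus the later one, a larger number means a steeper fall, so the comparison can stay a plain "greater than".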

For Python 3, this problem can be solved in one line using itertools.accumulate:

from itertools import accumulate
import operator
def get_times(d):
    final_data = {a:[(b[0][0], b[1][0]) if list(accumulate([i[-1] for i in b], func = operator.sub))[0] > list(accumulate([i[-1] for i in b], func = operator.sub))[1] else (b[1][0], b[2][0])] for a, b in d.items()}
    return final_data

dict1 = {'key1': [(1, 3), (23, 12), (3, 5)],
         'key2': [(4, 41), (5, 54), (4, 6)],
         'key3': [(7, 17), (8, 18), (9, 19)]}
print(get_times(dict1))

Output:

{'key2': [(4, 5)], 'key3': [(7, 8)], 'key1': [(1, 23)]}

Note that since the variables time1, value1, etc. were not specified, I used integers for both. Strings for the times and integers for the values would work as well.
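As a quick check on the intermediate values that the one-liner compares, the sketch below prints what accumulate with operator.sub yields for the values of 'key1', next to the pairwise drops between consecutive values. The variable names `values`, `running`, and `drops` are made up for this illustration.

```python
from itertools import accumulate
import operator

values = [3, 12, 5]  # the values of 'key1' in the example dict1

# accumulate with operator.sub keeps a running subtraction:
# [v1, v1 - v2, (v1 - v2) - v3]
running = list(accumulate(values, operator.sub))
print(running)  # [3, -9, -14]

# Pairwise drops between consecutive values (earlier minus later)
drops = [a - b for a, b in zip(values, values[1:])]
print(drops)  # [-9, 7]
```

The running subtraction and the pairwise drops are different sequences, so it is worth confirming which of the two a given comparison needs; the pairwise form also extends unchanged to lists with more than three tuples.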

6 Comments

in the above, where is it stated that we wish to run this piece of code for dict1?
@C.Refsgaard If you want to use different dictionaries with names other than dict1, then a function would be the best option. Please see my recent edit.
This gives invalid syntax, and also the variable b is not defined. Copy-paste error maybe? Also, why do you accumulate twice? And does this work for lists with more than three values?
@tobias_k Sorry for the error. I had the correct version in my text editor. In order for the code to be one line, I had to accumulate twice. Regarding input with lists of more than three values, this code will not work because the OP only specified two variables, time and value. However, this code can be modified to account for a list of tuples with length greater than two if the OP can update his post and mention his desired output.
I'm not talking about tuples with more than two values, but lists with more than three tuples. What this does is just check (in a very convoluted way) whether the difference between the first and second is greater than that of the second and third value.