Working with lists of tuples in Python

Question

I am working with a dictionary where each key contains a list of tuples. It looks like this:

dict1 = {'key1': [(time1, value1), (time2, value2), (time3, value3)],
         'key2': [(time4, value4), (time5, value5), (time6, value6)],
         'key3': [(time7, value7), (time8, value8), (time9, value9)], ...}

What I wish to do for each key is to find the largest drop in 'valueX' from 'timeX' to 'timeY'.

The tuples are ordered so that

time1 < time2 < time3

And it is (typically) true that

value1 > value2 > value3

Both things are true for all keys.

So looking at the first key, what I wish to do is to calculate

value2 - value1 and value3 - value2

And save the times that the biggest drop occurs. Let's say that

value2 - value1 > value3 - value2

Then I wish to save time1 and time2, since it was between those two time values that the largest drop occurred.

I am thinking to use a for-loop like the following:

for key in dict1:
    for i in dict1[key]:

But I cannot figure out how to

1) loop through the values, calculate the difference between the present value and the past value, save this and compare it to the largest drop that has been observed

2) to save the times that correspond to the largest drop in 'value'.

I hope you can help me out here. Thanks a lot.

tobias_k · Accepted Answer · 2017-09-21 18:03:03Z

Assuming that the lists are already sorted by time, and you always want to compare consecutive values (and not, e.g. values that have the same time difference in between), you can use the zip(lst, lst[1:]) recipe to iterate consecutive pairs in the list, and use max with a custom key function to find the pair with the biggest difference.

def biggest_drop(timeseries):
    pairs = zip(timeseries, timeseries[1:])
    ((t1, v1), (t2, v2)) = max(pairs, key=lambda p: p[0][1] - p[1][1])
    return (t1, t2)

dict1 = {'key1': [("time1", 23), ("time2", 22), ("time3", 24)],
         'key2': [("time4", 12), ("time5", 9), ("time6", 3)],
         'key3': [("time7", 43), ("time8", 50), ("time9", 30)]}
print({k: biggest_drop(v) for k, v in dict1.items()})
# {'key3': ('time8', 'time9'), 'key2': ('time5', 'time6'), 'key1': ('time1', 'time2')}
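
To make the recipe a bit more concrete, here is a small sketch of what zip(lst, lst[1:]) yields for one of the lists above, and what the key function computes for each pair. The names series, pairs and drops are only illustrative.

```python
# Illustrative only: inspect the consecutive pairs and the drop for each pair.
series = [("time7", 43), ("time8", 50), ("time9", 30)]

# Each element is ((earlier_time, earlier_value), (later_time, later_value)).
pairs = list(zip(series, series[1:]))

# Same quantity as the key function: earlier value minus later value.
drops = [earlier[1] - later[1] for earlier, later in pairs]

print(pairs)
print(drops)
```

A negative entry means the value rose over that interval. max selects the pair with the largest entry, so if every entry is negative it returns the smallest rise rather than a real drop.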

Or shorter (but not necessarily better):

def biggest_drop(timeseries):
    return next(zip(*max(zip(timeseries, timeseries[1:]), 
                         key=lambda p: p[0][1] - p[1][1])))

Also, note that if you are looking for the biggest drop, you have to find the maximum for value1 - value2 instead of value2 - value1.
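
For comparison, the explicit loop sketched in the question could look roughly like the following. It covers both numbered points: it tracks the largest drop seen so far and the two times that belong to it. The name biggest_drop_loop is made up for this sketch, and returning None for lists with fewer than two tuples is an assumption about the desired behaviour.

```python
def biggest_drop_loop(timeseries):
    """Return (t_before, t_after) for the largest drop, or None if there is no pair."""
    best_drop = None
    best_times = None
    # Start at index 1 so every value can be compared with the one before it.
    for i in range(1, len(timeseries)):
        t_prev, v_prev = timeseries[i - 1]
        t_curr, v_curr = timeseries[i]
        drop = v_prev - v_curr  # positive when the value fell
        if best_drop is None or drop > best_drop:
            best_drop = drop
            best_times = (t_prev, t_curr)
    return best_times

dict1 = {'key1': [("time1", 23), ("time2", 22), ("time3", 24)],
         'key2': [("time4", 12), ("time5", 9), ("time6", 3)],
         'key3': [("time7", 43), ("time8", 50), ("time9", 30)]}
result = {key: biggest_drop_loop(series) for key, series in dict1.items()}
print(result)
```

On ties, the strict comparison keeps the first pair encountered, which matches what max does.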

Ajax1234 · Accepted Answer · 2017-09-21 15:36:41Z


For Python3, this problem can be solved in one line using itertools.accumulate:

from itertools import accumulate
import operator
def get_times(d):
    final_data = {a:[(b[0][0], b[1][0]) if list(accumulate([i[-1] for i in b], func = operator.sub))[0] > list(accumulate([i[-1] for i in b], func = operator.sub))[1] else (b[1][0], b[2][0])] for a, b in d.items()}
    return final_data

dict1 = {'key1': [(1, 3), (23, 12), (3, 5)],
 'key2': [(4, 41), (5, 54), (4, 6)],
 'key3': [(7, 17), (8, 18), (9, 19)]}
print(get_times(dict1))

Output:

{'key2': [(4, 5)], 'key3': [(7, 8)], 'key1': [(1, 23)]}

Note that since the variables time1, value1, etc were not specified, I used integers for both, although a string value for time variables and an integer value for the value variables is also valid.
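
To see what the comparison inside get_times is working with, it may help to print what accumulate with operator.sub returns for one list of values, next to the pairwise differences between consecutive values. This is only an illustrative sketch; the names values, running and pairwise are not part of the code above.

```python
from itertools import accumulate
import operator

values = [3, 12, 5]  # the values stored under 'key1' above

# accumulate with operator.sub keeps a running subtraction:
# v1, v1 - v2, (v1 - v2) - v3
running = list(accumulate(values, operator.sub))

# Drops between consecutive values (earlier minus later), for comparison.
pairwise = [a - b for a, b in zip(values, values[1:])]

print(running)
print(pairwise)
```

The two sequences differ, so comparing the first two running entries is not the same as comparing the two consecutive drops.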

edited Sep 21, 2017 at 15:36

answered Sep 21, 2017 at 14:45


6 Comments

C. Refsgaard Over a year ago

in the above, where is it stated that we wish to run this piece of code for dict1?

Ajax1234 Over a year ago

@C.Refsgaard If you want to use different dictionaries with names other than dict1, then a function would be the best option. Please see my recent edit.

tobias_k Over a year ago

This gives invalid syntax, and also the variable b is not defined. Copy-paste error maybe? Also, why do you accumulate twice? And does this work for lists with more than three values?

Ajax1234 Over a year ago

@tobias_k Sorry for the error. I had the correct version in my text editor. In order for the code to be one line, I had to accumulate twice. Regarding input with lists of more than three values, this code will not work because the OP only specified two variables, time and value. However, this code can be modified to account for a list of tuples with length greater than two if the OP can update his post and mention his desired output.

tobias_k Over a year ago

I'm not talking about tuples with more than two values, but lists with more than three tuples. What this does is just check (in a very convoluted way) whether the difference between the first and second is greater than that of the second and third value.
