
I am trying to sort a nested list by date, which I can already do. If the list contains duplicate dates, I then need to sort those entries by time.

The first element of each inner list is the date or time; the second element is the index.

Entries with the same index in both lists belong together:

  • [b'05-07-2024', 0] belongs with [b'15-21-00', 0]
  • [b'16-08-2024', 1] belongs with [b'23-41-01', 1]

I can extract the indexes from one list at a time like this:

index_list = []
for _, index in date_list:
    index_list.append(index)

The index_list is:

[0, 1, 2]

But indexes 1 and 2 should be swapped in this case, because these are the lists:

date_list = [[b'05-07-2024', 0], [b'16-08-2024', 1], [b'16-08-2024', 2]]
time_list = [[b'15-20-55', 2], [b'15-21-00', 0], [b'23-41-01', 1]]

In the end I need a list of the indexes in the right order.

In this case that would be:

[0, 2, 1]
  • Make a better data structure where you have date and time belonging to the same index together. Then sorting it will be trivial. Commented Aug 17, 2024 at 19:15
  • Yes, but the solution will be to at least temporarily create that better structure even if you don't keep it. Commented Aug 17, 2024 at 19:20
  • Why? That's a horribly inconvenient way to hold the data. Commented Aug 17, 2024 at 19:29
  • From bad data model design comes inefficient and hard-to-maintain code. Why not do the right thing and revise your data source? Commented Aug 17, 2024 at 19:31
  • More likely it's a step or two further away. And see xyproblem.info Commented Aug 17, 2024 at 21:23

2 Answers

Combine the date and time into a datetime object; then you can simply sort by it.

from datetime import datetime

sorted_indexes = sorted(
    [idx for date, idx in date_list],
    key=lambda idx: datetime.strptime(
        f"{date_list[idx][0].decode('utf-8')} {next(t[0] for t in time_list if t[1] == idx).decode('utf-8')}",
        '%d-%m-%Y %H-%M-%S'
    )
)

Output:

[0, 2, 1]
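
The expression above looks up each date by position (date_list[idx]) and scans time_list once per key, so it relies on every index matching its position in date_list. A minimal sketch that avoids both by building lookup dictionaries first; the names dates_by_idx, times_by_idx, and to_datetime are illustrative and not part of the original answer:

```python
from datetime import datetime

date_list = [[b'05-07-2024', 0], [b'16-08-2024', 1], [b'16-08-2024', 2]]
time_list = [[b'15-20-55', 2], [b'15-21-00', 0], [b'23-41-01', 1]]

# Map each index to its date and to its time, independent of list order.
dates_by_idx = {idx: date for date, idx in date_list}
times_by_idx = {idx: time for time, idx in time_list}

def to_datetime(idx):
    """Combine the date and time that share an index into one datetime."""
    stamp = f"{dates_by_idx[idx].decode()} {times_by_idx[idx].decode()}"
    return datetime.strptime(stamp, "%d-%m-%Y %H-%M-%S")

# Iterating a dict yields its keys, so this sorts the indexes themselves.
sorted_indexes = sorted(dates_by_idx, key=to_datetime)
print(sorted_indexes)  # [0, 2, 1]
```

This assumes every index appears in both lists; a missing one would raise a KeyError in to_datetime.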

The key is to find a suitable data structure which allows you to use the built-in sorted function directly.

For example, if your date strings were in YYYY-MM-DD format and your data was structured like this,

data = [
    ('2024-07-05', b'15-21-00', 0),
    ('2024-08-16', b'23-41-01', 1),
    ('2024-08-16', b'15-20-55', 2)
]

then you could use

sorted_indexes = [i for date, time, i in sorted(data)]

because sorted sorts lists of tuples lexicographically (i.e., first by date, then equal dates by time, and as a bonus, equal times by index).
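
A small check of why the YYYY-MM-DD format matters here, using the dates from the question plus one made-up date from 2023 for contrast. String comparison goes character by character, so DD-MM-YYYY strings are ordered by day first, while YYYY-MM-DD strings sort chronologically:

```python
# DD-MM-YYYY: plain string order compares the day first, not the year.
dmy_sorted = sorted(['16-08-2023', '05-07-2024'])
print(dmy_sorted)  # ['05-07-2024', '16-08-2023'], not chronological

# YYYY-MM-DD: plain string order is also chronological order.
iso_sorted = sorted(['2024-07-05', '2023-08-16'])
print(iso_sorted)  # ['2023-08-16', '2024-07-05']

# Tuples compare element by element: date first, then time, then index.
data = [
    ('2024-08-16', b'23-41-01', 1),
    ('2024-07-05', b'15-21-00', 0),
    ('2024-08-16', b'15-20-55', 2),
]
sorted_indexes = [i for date, time, i in sorted(data)]
print(sorted_indexes)  # [0, 2, 1]
```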

Since you don't have this format and structure, you need to create it (temporarily).

You can use the datetime module to convert each date string:

from datetime import datetime

def convert_date(date):
    """Convert byte string in DD-MM-YYYY to Unicode string in YYYY-MM-DD format."""
    return datetime.strptime(date.decode(), "%d-%m-%Y").strftime("%Y-%m-%d")

You can use a dictionary to combine the corresponding dates and times:

tmp = {}
for date, i in date_list:
    tmp[i] = [convert_date(date)]
for time, i in time_list:
    tmp[i].append(time)  # assuming that i was already contained in date_list
data = [(date, time, i) for i, (date, time) in tmp.items()]

You can put everything in a function to keep it tidy:

def sort_dates_and_times(dates, times):
    tmp = {}
    for date, i in dates:
        tmp[i] = [convert_date(date)]
    for time, i in times:
        tmp[i].append(time)  # assuming that i was already contained in dates
    data = [(date, time, i) for i, (date, time) in tmp.items()]
    return [i for date, time, i in sorted(data)]

sorted_indexes = sort_dates_and_times(date_list, time_list)
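
The function above assumes that every index in times also occurs in dates. If that cannot be guaranteed, one possible variant checks the indexes up front and fails with a clear message. This is only a sketch: the name sort_dates_and_times_checked and the choice to raise ValueError are assumptions, not part of the original answer.

```python
from datetime import datetime

def convert_date(date):
    """Convert byte string in DD-MM-YYYY to Unicode string in YYYY-MM-DD format."""
    return datetime.strptime(date.decode(), "%d-%m-%Y").strftime("%Y-%m-%d")

def sort_dates_and_times_checked(dates, times):
    """Hypothetical variant: refuse input whose indexes do not pair up."""
    date_of = {i: convert_date(date) for date, i in dates}
    time_of = {i: time for time, i in times}
    if date_of.keys() != time_of.keys():
        # Symmetric difference: indexes present in only one of the lists.
        unmatched = sorted(date_of.keys() ^ time_of.keys())
        raise ValueError(f"indexes without a partner: {unmatched}")
    # Sort the indexes by (date, time) of the entry they refer to.
    return sorted(date_of, key=lambda i: (date_of[i], time_of[i]))

date_list = [[b'05-07-2024', 0], [b'16-08-2024', 1], [b'16-08-2024', 2]]
time_list = [[b'15-20-55', 2], [b'15-21-00', 0], [b'23-41-01', 1]]
print(sort_dates_and_times_checked(date_list, time_list))  # [0, 2, 1]
```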

Another suitable data structure would be

data = {
    0: ('2024-07-05', b'15-21-00'),
    1: ('2024-08-16', b'23-41-01'),
    2: ('2024-08-16', b'15-20-55')
}

which incidentally already exists as tmp above.

(The difference between tuples and lists as values doesn't matter now.)

Then instead of post-processing the output of sorted to get the indexes,

data = [(date, time, i) for i, (date, time) in tmp.items()]
return [i for date, time, i in sorted(data)]

you would pass a key argument so that the indexes are sorted not by their own values but by the corresponding dates and times:

def date_time_of_index(i):
    return tmp[i]

return sorted(tmp, key=date_time_of_index)

Or more succinctly:

return sorted(tmp, key=lambda i: tmp[i])

Or even more succinctly (as suggested by user no comment in a comment):

return sorted(tmp, key=tmp.get)
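
The three key variants can be compared outside the function with a hand-written tmp, filled in here with the values the loops above would produce for the question's data. Iterating a dictionary yields its keys, so sorted(tmp, ...) returns indexes, and tmp.get maps each index to its [date, time] list, which is compared element by element:

```python
# Hand-written stand-in for the tmp dictionary built from the question's data.
tmp = {
    0: ['2024-07-05', b'15-21-00'],
    1: ['2024-08-16', b'23-41-01'],
    2: ['2024-08-16', b'15-20-55'],
}

def date_time_of_index(i):
    return tmp[i]

by_function = sorted(tmp, key=date_time_of_index)
by_lambda = sorted(tmp, key=lambda i: tmp[i])
by_get = sorted(tmp, key=tmp.get)

print(by_function, by_lambda, by_get)  # [0, 2, 1] [0, 2, 1] [0, 2, 1]
```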

1 Comment

Hmm, ok. I'd maybe suggest the data structure tmp instead.
