
I have this nested list and want to sort the inner list based on each tuple's key and value, respectively.

data = [ 14, [('the', 3),
        ('governing', 1),
        ('wisdom', 1),
        ('about', 3),
        ('writing', 1)]]

The output I want is:

output = [ 14, [('about', 3),
             ('governing', 1),
             ('wisdom', 1),
             ('writing', 1),
             ('the', 3)]]

I currently sort it with something like this (data_structure is the same list as data above):
[data_structure[0], sorted(data_structure[1], key=lambda x: x[1])]

Is there a better way that avoids this slice, sort, and merge approach? If so, please share. I'm looking for a clean, Pythonic approach.

I'd be grateful for any help, as I'm a Python newbie.

  • It would be better if you wrote the output you want to achieve. Commented Dec 27, 2018 at 21:54
  • Thanks @prashantrana, your suggestion was helpful and I have edited the post. Commented Dec 27, 2018 at 22:02
  • Can you explain in what way your current approach does not suit your needs? To me it looks clean and Pythonic enough. Somebody is going to suggest pandas for this, but I think that's going to be overkill for what you're doing. Commented Dec 27, 2018 at 22:04
  • I am expecting other integer or string values to be part of the nested list items, and thought a manual slice, sort, and merge might not be efficient. Thank you. Commented Dec 27, 2018 at 22:12

2 Answers

Given the indices clearly have meaning, a minor tweak for readability would be to unpack the source list to named variables (self-documenting names!), then repack. If you're like me, you can also drop the lambda (which I only use when no built-in can do the job, precisely as a cue that there is more than meets the eye) in favor of an operator module helper made for this task.

For example:

# At top of file
from operator import itemgetter


# Unpack to useful names
doc_id, word_counts = data_structure

# Sort with self-documenting key function and repack
new_structure = [doc_id, sorted(word_counts, key=itemgetter(1))]

Performance-wise, I wouldn't expect major benefits or costs here; constructing an itemgetter has slightly higher fixed overhead than the lambda, but slightly lower per-item overhead when computing the keys. Unpacking to names is generally lower overhead than indexing, but of course you have to reload them, so it's probably a wash. Basically, I'm providing this answer solely to encourage more self-documenting code using useful names for both variables and functions.
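As a minimal, self-contained sketch of the above, the snippet below runs the unpack, sort, and repack steps on the question's data. The second variant, which passes two indices to itemgetter so that ties on the count are broken by the word, is my own illustrative addition and is not necessarily the ordering the question is after.

```python
from operator import itemgetter

# Sample data copied from the question
data_structure = [14, [('the', 3),
                       ('governing', 1),
                       ('wisdom', 1),
                       ('about', 3),
                       ('writing', 1)]]

# Unpack to useful names
doc_id, word_counts = data_structure

# Sort by count only; sorted() is stable, so ties keep their original order
by_count = [doc_id, sorted(word_counts, key=itemgetter(1))]

# Illustrative variant (an assumption about what might be wanted):
# itemgetter(1, 0) builds (count, word) keys, so ties on count
# are broken alphabetically by word
by_count_then_word = [doc_id, sorted(word_counts, key=itemgetter(1, 0))]

print(by_count)
print(by_count_then_word)
```

Because sorted() returns a new list, data_structure itself is left unchanged in both cases.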


Instead of using the sorted function, use the list's sort method, which sorts the inner list in place:

data = [ 14, [('the', 3),
        ('governing', 1),
        ('wisdom', 1),
        ('about', 3),
        ('writing', 1)]]

data[1].sort(key=lambda x:x[1])
print(data)

# output : [14, [('governing', 1), ('wisdom', 1), ('writing', 1), ('the', 3), ('about', 3)]]

This is more Pythonic and easier to read.

1 Comment

Of course, this does assume the OP doesn't want a brand new list.
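To make that distinction concrete, here is a small sketch (using the question's data; the variable names original_inner and new_structure are just for illustration) showing that sorted() builds a new inner list while list.sort() reorders the existing one:

```python
# Sample data copied from the question
data = [14, [('the', 3),
             ('governing', 1),
             ('wisdom', 1),
             ('about', 3),
             ('writing', 1)]]

original_inner = data[1]  # keep a reference to the original inner list

# sorted() returns a brand new list; data is left untouched
new_structure = [data[0], sorted(data[1], key=lambda x: x[1])]
print(new_structure[1] is original_inner)  # False: a separate list object
print(data[1][0])                          # still ('the', 3)

# list.sort() reorders the existing list in place and returns None
data[1].sort(key=lambda x: x[1])
print(data[1] is original_inner)           # True: same list object, now sorted
print(data[1][0])                          # ('governing', 1)
```

If other code holds a reference to the inner list, the in-place version changes what that code sees, whereas the sorted() version does not.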
