dictionary[pattern_key] = {"key": index_key, "document": index_source, "startPos":index_start, "endPos": index_end}

This is an extract of my list of dictionaries

{
'AGACAATCTC': {'startPos': '174', 'document': 'source-document01012.txt', 'endPos': '183', 'key': 'AGACAATCTC'}, 
'GGTCAGACAA': {'startPos': '18', 'document': 'source-document01012.txt', 'endPos': '27', 'key': 'GGTCAGACAA'}, 
'TAGATGAAGT': {'startPos': '102', 'document': 'source-document01012.txt', 'endPos': '111', 'key': 'TAGATGAAGT'}
}

How can I sort that by document and then by startPos?

I tried something like this, but it does not work:

sorted_dict = sorted(dictionary, key=itemgetter(pattern_key[document]))

script.py

#!/usr/bin/env python
import sys

dictionary = {};

for pattern in sys.stdin:

    if "," in pattern:
        pattern_key, pattern_source, pattern_start, pattern_end = pattern.strip().split(",")
        index_file =  open('index.txt', 'r')

        for line in index_file:
            if "," in line:
                index_key, index_source, index_start, index_end = line.strip().split(",")
                if pattern_key == index_key:
                    dictionary[pattern_key] = {"document": index_source, "startPos":index_start, "endPos": index_end}

sorted(dictionary.items(), key = lambda x: (x[1]['document'], int(x[1]['startPos'])))

for k, v in dictionary.items():
    print (k, '-->', v)
  • For output, do you want the entire dict values, or just the keynames? ['AGACAATCTC', 'TAGATGAAGT'] like that? Commented Mar 27, 2016 at 16:27
  • I need the entire output sorted by document and then by startPos. Commented Mar 27, 2016 at 16:28
  • Sorted does NOT update the dictionary to have sorted values. You need to use the list of (<key>, <dict-val>) tuple returned by sorted() command. If you convert list of tuples to dictionary and then use it, it may again disturb the sorted order. Commented Mar 27, 2016 at 16:56
  • I don't see a list of dicts, but rather a dict of dicts. Or am I missing something here? Commented Mar 27, 2016 at 17:14
  • If you want a dictionary to be sorted, you'll need to use collections.OrderedDict. Regular dictionaries do not guarantee the order of the keys. Commented Mar 27, 2016 at 17:17

2 Answers


You can use the entries of the inner dictionary as the sort key for sorted:

sorted(dictionary.items(), key = lambda x: (x[1]['document'], int(x[1]['startPos'])))

A tuple key will be sorted first by the 0th element, then 1st, and so on.

Note that this produces a list of tuples, where each tuple is (str, dict).
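To illustrate why the key above converts startPos with int(), here is a minimal sketch using the three sample entries from the question. The variable names (records, by_string, by_number) are made up for this example. Since startPos is stored as a string, sorting on it directly compares character by character, so '102' lands before '18'.

```python
# Sample entries from the question; startPos values are strings.
records = {
    'AGACAATCTC': {'document': 'source-document01012.txt', 'startPos': '174', 'endPos': '183'},
    'GGTCAGACAA': {'document': 'source-document01012.txt', 'startPos': '18', 'endPos': '27'},
    'TAGATGAAGT': {'document': 'source-document01012.txt', 'startPos': '102', 'endPos': '111'},
}

# Tuple key without int(): ties on 'document' are broken by string comparison.
by_string = sorted(records.items(),
                   key=lambda item: (item[1]['document'], item[1]['startPos']))

# Tuple key with int(): ties on 'document' are broken numerically.
by_number = sorted(records.items(),
                   key=lambda item: (item[1]['document'], int(item[1]['startPos'])))

string_order = [key for key, _ in by_string]
number_order = [key for key, _ in by_number]

print(string_order)  # ['TAGATGAAGT', 'AGACAATCTC', 'GGTCAGACAA']
print(number_order)  # ['GGTCAGACAA', 'TAGATGAAGT', 'AGACAATCTC']
```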

EDIT:
In your context, the correct implementation is the following:

sorted_values = sorted(dictionary.items(), key = lambda x: (x[1]['document'], int(x[1]['startPos'])))

for k, v in sorted_values:
    print (k, '-->', v)
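For the full script, one possible restructuring is sketched below. It is not the asker's code: the helper names build_index, match_patterns and sort_matches are hypothetical, and it assumes the same four-field comma-separated line format as in the question. It reads the index once instead of reopening index.txt for every pattern, and it iterates over the list returned by sorted() rather than over the original dictionary.

```python
def build_index(index_lines):
    """Map each index key to its document and positions (last occurrence wins)."""
    index = {}
    for line in index_lines:
        if "," in line:
            key, source, start, end = line.strip().split(",")
            index[key] = {"document": source, "startPos": start, "endPos": end}
    return index


def match_patterns(pattern_lines, index):
    """Keep only the index entries whose key also appears in the pattern lines."""
    matches = {}
    for line in pattern_lines:
        if "," in line:
            key = line.strip().split(",")[0]
            if key in index:
                matches[key] = index[key]
    return matches


def sort_matches(matches):
    """Return a list of (key, entry) tuples sorted by document, then numeric startPos."""
    return sorted(matches.items(),
                  key=lambda item: (item[1]["document"], int(item[1]["startPos"])))


if __name__ == "__main__":
    import sys

    # Assumes index.txt sits next to the script, as in the question.
    with open("index.txt") as index_file:
        index = build_index(index_file)

    for key, entry in sort_matches(match_patterns(sys.stdin, index)):
        print(key, "-->", entry)
```

The key point is the last loop: the sorted list is what gets printed, so the order survives.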

6 Comments

Running this on your sample data returns: ('GGTCAGACAA', {'document': 'source-document01012.txt', 'endPos': '27', 'startPos': '18', 'key': 'GGTCAGACAA'}) ('TAGATGAAGT', {'document': 'source-document01012.txt', 'endPos': '111', 'startPos': '102', 'key': 'TAGATGAAGT'}) ('AGACAATCTC', {'document': 'source-document01012.txt', 'endPos': '183', 'startPos': '174', 'key': 'AGACAATCTC'}). This is sorted by document and startPos.
Can you give an example, for what value it did not sort using 'startPos'? Because I tried similar approach and it worked for me.
Hmm, weird, it gives me randomized output: ('AAAGCTTACA', '-->', {'startPos': '132', 'document': 'source-document01012.txt', 'endPos': '141'}) ('GGAGAAATCT', '-->', {'startPos': '78', 'document': 'source-document01012.txt', 'endPos': '87'}) ('TCGGGAGCAA', '-->', {'startPos': '216', 'document': 'source-document01012.txt', 'endPos': '225'}) ('CGGTTTATGT', '-->', {'startPos': '204', 'document': 'source-document01012.txt', 'endPos': '213'}) ('TCACGTAGGA', '-->', {'startPos': '234', 'document': 'source-document01012.txt', 'endPos': '243'}). I removed "key" just for clarity; do not mind the "-->".
No, the sort works. Are you sure everything else is in place?
I will post my whole script.

Sort by your desired criteria, then create a new OrderedDict from the sorted list, since a plain dict does not guarantee any ordering (before Python 3.7):

>>> from collections import OrderedDict
>>>
>>> d = {'AGACAATCTC': {'endPos': '183', 'document': 'source-document01010.txt', 'key': 'AGACAATCTC', 'startPos': '174'}, 'GGTCAGACAA': {'endPos': '27', 'document': 'source-document01010.txt', 'key': 'GGTCAGACAA', 'startPos': '18'}, 'TAGATGAAGT': {'endPos': '111', 'document': 'source-document01011.txt', 'key': 'TAGATGAAGT', 'startPos': '102'}}
>>> 
>>> d_ordered = OrderedDict(sorted(d.items(), key=lambda t:(t[1]['document'], int(t[1]['startPos']))))
>>> 
>>> d_ordered
OrderedDict([('GGTCAGACAA', {'endPos': '27', 'document': 'source-document01010.txt', 'key': 'GGTCAGACAA', 'startPos': '18'}), ('AGACAATCTC', {'endPos': '183', 'document': 'source-document01010.txt', 'key': 'AGACAATCTC', 'startPos': '174'}), ('TAGATGAAGT', {'endPos': '111', 'document': 'source-document01011.txt', 'key': 'TAGATGAAGT', 'startPos': '102'})])
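As a side note, on Python 3.7 and later a plain dict preserves insertion order as a language guarantee, so the same result can be had without OrderedDict. A minimal sketch with the same sample data (the name d_sorted is made up here):

```python
d = {
    'AGACAATCTC': {'endPos': '183', 'document': 'source-document01010.txt', 'key': 'AGACAATCTC', 'startPos': '174'},
    'GGTCAGACAA': {'endPos': '27', 'document': 'source-document01010.txt', 'key': 'GGTCAGACAA', 'startPos': '18'},
    'TAGATGAAGT': {'endPos': '111', 'document': 'source-document01011.txt', 'key': 'TAGATGAAGT', 'startPos': '102'},
}

# Assumes Python 3.7+: dict() keeps the order in which the sorted pairs are inserted.
d_sorted = dict(sorted(d.items(),
                       key=lambda t: (t[1]['document'], int(t[1]['startPos']))))

print(list(d_sorted))  # ['GGTCAGACAA', 'AGACAATCTC', 'TAGATGAAGT']
```

OrderedDict remains the safer choice if the code also has to run on older interpreters.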
