I have two lists, and I need to extract data from them by name and then multiply the values.

I have these lists:

query_tfidf = [0.8465735902799727, 0.8465735902799727]
documents_query = [['Aftonbladet', 'play', 0.0], ['Aftonbladet', 'free', 0.0],
 ['Radiosporten Play', 'play', 0.10769448286014331], ['Radiosporten Play', 'free', 0.0]]

I need to organize them by name, for example:

{'Aftonbladet': {'play': 0.0, 'free': 0.0}, 'Radiosporten Play': {'play': 0.10769448286014331, 'free': 0.0}}

Then I need to extract the data for each name, multiply it by query_tfidf, and compute the following values. For example:

for each name:
    dot_product = (play_value * query_tfidf[0]) + (free_value * query_tfidf[1])
    query = sqrt((query_tfidf[0])**2 + (query_tfidf[1])**2)
    document = sqrt((play_value)**2 + (free_value)**2)

I'm a little bit desperate, so I wanted to ask here. I'm using Python 2.7.

  • Is documents_query always going to come in ordered nicely like that? [key1, 'play', val], [key1, 'free', val], [key2, 'play', val], ...? Commented Apr 25, 2014 at 22:43
  • Yes, always in the same order, but "play" and "free" are keywords, so their names can change. Commented Apr 25, 2014 at 22:48

2 Answers

Sorting the entries in your documents_query according to their name and keyword is very straightforward using dictionaries:

indexedValues = {}
for entry in documents_query:
    if entry[0] not in indexedValues:
        indexedValues[entry[0]] = {}
    indexedValues[entry[0]][entry[1]] = entry[2]

This will give you indexedValues that looks like what you asked for:

{'Aftonbladet': {'play': 0.0, 'free': 0.0}, 'Radiosporten Play': {'play': 0.10769448286014331, 'free': 0.0}}
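
Once the values are grouped, the three quantities from the question can be computed per name. The following is only a sketch: it assumes that query_tfidf[i] belongs to the i-th distinct keyword in order of first appearance in documents_query (the question does not state this mapping explicitly), and the names keywords, doc_vector and scores are introduced here for illustration.

```python
from math import sqrt

query_tfidf = [0.8465735902799727, 0.8465735902799727]
documents_query = [['Aftonbladet', 'play', 0.0], ['Aftonbladet', 'free', 0.0],
                   ['Radiosporten Play', 'play', 0.10769448286014331],
                   ['Radiosporten Play', 'free', 0.0]]

# Group the values by name and keyword, as in the loop above.
indexedValues = {}
for name, keyword, value in documents_query:
    indexedValues.setdefault(name, {})[keyword] = value

# Assumption: query_tfidf[i] corresponds to the i-th distinct keyword,
# taken in order of first appearance (here: 'play', then 'free').
keywords = []
for _, keyword, _ in documents_query:
    if keyword not in keywords:
        keywords.append(keyword)

# The query norm is the same for every name, so compute it once.
query = sqrt(sum(q ** 2 for q in query_tfidf))

scores = {}
for name, values in indexedValues.items():
    # Build the document vector in the same keyword order as query_tfidf.
    doc_vector = [values.get(k, 0.0) for k in keywords]
    dot_product = sum(d * q for d, q in zip(doc_vector, query_tfidf))
    document = sqrt(sum(d ** 2 for d in doc_vector))
    scores[name] = (dot_product, query, document)
```

Because the keyword list is derived from the data rather than hard-coded, this keeps working if "play" and "free" are replaced by other keywords, as long as the ordering assumption holds.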

Use collections.defaultdict to aggregate your data:

from collections import defaultdict

results = defaultdict(dict)
for main_key, key, value in documents_query:
    results[main_key][key] = value

# dict(results)
# Out[16]: 
# {'Aftonbladet': {'free': 0.0, 'play': 0.0},
#  'Radiosporten Play': {'free': 0.0, 'play': 0.10769448286014331}}

What you are going to do with it later is a bit unclear... but you should figure it out yourself, right?
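
The question does not say how dot_product, query and document are combined afterwards. They look like the ingredients of a cosine similarity, so here is a minimal sketch under that assumption. The helper cosine_scores and its keywords parameter are hypothetical names, and returning 0.0 for an all-zero vector is a choice made here to avoid dividing by zero.

```python
from collections import defaultdict
from math import sqrt


def cosine_scores(documents_query, query_tfidf, keywords):
    """Hypothetical helper: cosine similarity between each name and the query.

    keywords lists the keyword for each position of query_tfidf.
    """
    # Aggregate the flat list into {name: {keyword: value}}.
    results = defaultdict(dict)
    for main_key, key, value in documents_query:
        results[main_key][key] = value

    query = sqrt(sum(q ** 2 for q in query_tfidf))
    scores = {}
    for name, values in results.items():
        vector = [values.get(k, 0.0) for k in keywords]
        dot_product = sum(v * q for v, q in zip(vector, query_tfidf))
        document = sqrt(sum(v ** 2 for v in vector))
        denominator = query * document
        # An all-zero vector has no direction; report 0.0 instead of dividing by zero.
        scores[name] = dot_product / denominator if denominator else 0.0
    return scores


query_tfidf = [0.8465735902799727, 0.8465735902799727]
documents_query = [['Aftonbladet', 'play', 0.0], ['Aftonbladet', 'free', 0.0],
                   ['Radiosporten Play', 'play', 0.10769448286014331],
                   ['Radiosporten Play', 'free', 0.0]]

similarities = cosine_scores(documents_query, query_tfidf, ['play', 'free'])
```

With the sample data, 'Aftonbladet' has only zero values and therefore hits the zero-denominator guard.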
