
I have a list of dates (the last 30 days) that I build, and then I also have data returning from my database with dates and a count at those dates (I'll post some sample data after this description). I want to build a dictionary off of these two that will put in a placeholder value if the date is not returned from the database.

This is my list of dates - it also looks like this: http://screencast.com/t/VeB37A3k7KO

temp_dates = [
    datetime.date(2014, 4, 21),
    datetime.date(2014, 4, 22),
    datetime.date(2014, 4, 23),
    datetime.date(2014, 4, 24),
    ....
    datetime.date(2014, 5, 18),
    datetime.date(2014, 5, 19),
    datetime.date(2014, 5, 20),
    datetime.date(2014, 5, 21)
]

The data returned from my database is a list of dictionaries. It looks like this:

temp_data = [
    {u'daily_count': 3, u'total_count': 684, u'm_date': datetime.date(2014, 4, 21)},
    {u'daily_count': 2, u'total_count': 686, u'm_date': datetime.date(2014, 4, 22)},
    {u'daily_count': 32, u'total_count': 718, u'm_date': datetime.date(2014, 4, 23)},
    {u'daily_count': 1, u'total_count': 719, u'm_date': datetime.date(2014, 4, 25)},
    {u'daily_count': 1, u'total_count': 720, u'm_date': datetime.date(2014, 4, 26)},
    {u'daily_count': 17, u'total_count': 737, u'm_date': datetime.date(2014, 4, 29)},
    {u'daily_count': 1, u'total_count': 740, u'm_date': datetime.date(2014, 5, 2)},
    {u'daily_count': 1, u'total_count': 741, u'm_date': datetime.date(2014, 5, 4)},
    {u'daily_count': 1, u'total_count': 744, u'm_date': datetime.date(2014, 5, 6)},
    {u'daily_count': 2, u'total_count': 746, u'm_date': datetime.date(2014, 5, 8)}
    ...... etc.
]

I want to build a dictionary that will loop through the dates in temp_dates and if the date in temp_data matches, put the date as a new dictionary key with the total_count as the value. If there is a date that doesn't match then put in the previous value entered.

THIS IS WHAT I TRIED.

sql_info = {}
placeholder = 0

for i in temp_dates:
    for j in temp_data:
        if i == j['m_date']:
            sql_info[i] = j['total_count']
            placeholder = j['total_count']
            break
        else:
            sql_info[i] = placeholder

This doesn't work. The first pass through the loop stores the first value (684), and after that every date just gets the placeholder. Screenshot: http://screencast.com/t/BWUfFvYL

How can I fix this problem?


My working attempt

    for i in temp_dates:
        dd = i.strftime('%m-%d-%Y')
        sql_info[dd] = {}
        for j in temp_data:
            if i == j['m_date']:
                sql_info[dd]['total_count'] = j['total_count']
                placeholder = j['total_count']
                break
            else:
                if placeholder == 0:
                    placeholder = j['total_count'] - j['daily_count']
                sql_info[dd]['total_count'] = placeholder

If the date is not there the first time, I calculate total_count - daily_count to get the count that existed before that date. The expected output is this: http://screencast.com/t/0nCGTnAwJq. If a date is missing, I add it to the dict and put in the appropriate values (I store five different values per date).
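As a rough sketch of the seeding rule described above (not the code actually used here): the starting placeholder is the earliest row's total_count minus its daily_count, and each later day either takes its own row's total or carries the previous one forward. The names rows_by_date and first_row are introduced only for illustration, the data is a small subset of the sample, and the sketch assumes temp_data is not empty.

```python
import datetime

# A small subset of rows shaped like temp_data in the question.
temp_data = [
    {'daily_count': 32, 'total_count': 718, 'm_date': datetime.date(2014, 4, 23)},
    {'daily_count': 1, 'total_count': 719, 'm_date': datetime.date(2014, 4, 25)},
]
# Four consecutive days; the first one is earlier than any returned row.
temp_dates = [datetime.date(2014, 4, 22) + datetime.timedelta(days=n) for n in range(4)]

# Index the rows by date so each day needs only one dictionary lookup.
rows_by_date = {row['m_date']: row for row in temp_data}

# Seed value: the running total just before the earliest returned row.
first_row = min(temp_data, key=lambda row: row['m_date'])
placeholder = first_row['total_count'] - first_row['daily_count']  # 718 - 32 = 686

sql_info = {}
for day in temp_dates:
    row = rows_by_date.get(day)
    if row is not None:
        # A row exists for this day, so it becomes the new running total.
        placeholder = row['total_count']
    sql_info[day.strftime('%m-%d-%Y')] = {'total_count': placeholder}
```

With this data, 04-22-2014 gets the seed value 686, 04-23-2014 and 04-24-2014 get 718, and 04-25-2014 gets 719.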

  • what is the expected output of the data in your example? Commented May 21, 2014 at 21:09
  • what are new_dates and total_members etc..? Commented May 21, 2014 at 21:42
  • sorry i was messing around with some other stuff fixed :) Commented May 21, 2014 at 21:46
  • got a key error! what is placeholder_m? Commented May 21, 2014 at 21:55
  • its just placeholder fixed Commented May 21, 2014 at 21:55

3 Answers


I'm not fully sure I understand what you want, but this keeps every total_count seen so far in placeholder and, when a date doesn't match, uses placeholder[-2] to fill in the previous value.

If you don't want the value to change until another date matches, you can use a counter to keep track and use something like placeholder[-count].

sql_info = {}
placeholder = []
for i,j in zip(temp_data,temp_dates):
    placeholder.append(i['total_count'])
    if i['m_date'] in temp_dates:
        sql_info[j] = i['total_count']
    else:
        sql_info[j] = placeholder[-2]

This uses strftime to match your edited answer.

sql_info = {}
placeholder = []
count = 1
for i,j in zip(temp_data,temp_dates):
    dd = j.strftime('%m-%d-%Y')
    placeholder.append(i['total_count'])
    if i['m_date'] in temp_dates:
        sql_info[dd] = i['total_count']
    else:
        count += 1
        sql_info[dd] = placeholder[-count]
print sql_info
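
Since the question mentions several values per date, one possible way to avoid a separate loop per value is to carry all tracked fields forward in a single pass. The following is only a sketch under stated assumptions: forward_fill, its fields and initial parameters, and rows_by_date are hypothetical names, and it assumes every listed field is cumulative, so repeating the previous value on a missing day is meaningful.

```python
import datetime

def forward_fill(dates, rows, fields, initial=0):
    """Hypothetical helper: map each date to the latest known value of each field."""
    rows_by_date = {row['m_date']: row for row in rows}
    current = {field: initial for field in fields}
    filled = {}
    for day in dates:
        row = rows_by_date.get(day)
        if row is not None:
            # A row exists for this day: refresh every tracked field.
            current = {field: row[field] for field in fields}
        # Store a copy so each day owns its own dictionary.
        filled[day] = dict(current)
    return filled

temp_dates = [datetime.date(2014, 4, 21) + datetime.timedelta(days=n) for n in range(5)]
temp_data = [
    {'daily_count': 3, 'total_count': 684, 'm_date': datetime.date(2014, 4, 21)},
    {'daily_count': 2, 'total_count': 686, 'm_date': datetime.date(2014, 4, 22)},
    {'daily_count': 32, 'total_count': 718, 'm_date': datetime.date(2014, 4, 23)},
    {'daily_count': 1, 'total_count': 719, 'm_date': datetime.date(2014, 4, 25)},
]
sql_info = forward_fill(temp_dates, temp_data, fields=('total_count',))
```

Here April 24 has no row, so it repeats the 718 from April 23. Other cumulative columns could be added to the fields tuple.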

4 Comments

so there was more to this that i didn't realize at first. I have a working copy but its not that great in the sense of it does a lot of computations. i'll edit my answer with a sample to show.
When I saw the question I thought it would be easy, but it turned out not to be. The more you look at it, the harder it seems to get! How far off is my answer?
exactly... i did get it working but i have to do 5 for loops inside my temp_dates for loop.. bleh
I appreciate your attempt, but that doesn't do everything I need, and I still have to repeat that 5 times. I was looking for a more streamlined function to do it. anyways no worries i'll just stick with what I got :)

This is happening because you call "break" as soon as the function doesn't find i==j['m_date'] the first time.

In this example, because the first two values of i and j match each other, it sets placeholder to 684 and then assigns that value to sql_info[i] for the rest of the loop.
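
To illustrate the control flow, here is a minimal sketch that keeps the nested loops from the question but moves the placeholder assignment into a for/else clause. In Python, the else branch of a for loop runs only when the loop finishes without hitting break, so the placeholder is written once per unmatched date instead of once per non-matching row. The data is a shortened copy of the sample from the question.

```python
import datetime

temp_dates = [datetime.date(2014, 4, 21) + datetime.timedelta(days=n) for n in range(4)]
temp_data = [
    {'daily_count': 3, 'total_count': 684, 'm_date': datetime.date(2014, 4, 21)},
    {'daily_count': 2, 'total_count': 686, 'm_date': datetime.date(2014, 4, 22)},
    {'daily_count': 32, 'total_count': 718, 'm_date': datetime.date(2014, 4, 23)},
]

sql_info = {}
placeholder = 0
for i in temp_dates:
    for j in temp_data:
        if i == j['m_date']:
            sql_info[i] = j['total_count']
            placeholder = j['total_count']
            break
    else:
        # Reached only when the inner loop ended without break,
        # i.e. no row matched this date.
        sql_info[i] = placeholder
```

April 24 has no row in this shortened data, so it receives the 718 carried over from April 23.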

1 Comment

I realized that and removed it. it does what I wanted but it puts the place holder in the spot multiple times until it writes the new value in there.. is there a way to fix that?

The best choice is probably to alter your query so that it only selects rows whose m_date is in your list.

However I think

import bisect
def get_date_count_dict(list_of_dates,dates_count_dict):
    dates_items = sorted(dates_count_dict.items(),key=lambda item:item[0])
    sorted_dates,sorted_counts = zip(*dates_items)
    return dict([(a_date,sorted_counts[bisect.bisect(sorted_dates,a_date)])for a_date in list_of_dates])

new_data = dict([(d['m_date'],d['total_count']) for d in temp_data])
final_data = get_date_count_dict(temp_dates,new_data)

should work.
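
One caveat with the lookup above: bisect.bisect returns an insertion point, which equals the length of the sequence whenever the date is not earlier than the last known date, so indexing with it directly can go out of range. A possible variant, offered only as a sketch, subtracts one from the insertion point to pick the most recent known date and falls back to a default for dates before the first row. The default parameter is an assumption added for illustration, and the data is a small subset of the sample.

```python
import bisect
import datetime

def get_date_count_dict(list_of_dates, dates_count_dict, default=0):
    sorted_dates = sorted(dates_count_dict)
    result = {}
    for a_date in list_of_dates:
        # bisect_right returns how many known dates are <= a_date, so the
        # entry just before that position is the most recent known date.
        position = bisect.bisect_right(sorted_dates, a_date)
        if position == 0:
            # No known date on or before a_date: use the assumed default.
            result[a_date] = default
        else:
            result[a_date] = dates_count_dict[sorted_dates[position - 1]]
    return result

new_data = {
    datetime.date(2014, 4, 23): 718,
    datetime.date(2014, 4, 25): 719,
}
temp_dates = [datetime.date(2014, 4, 22) + datetime.timedelta(days=n) for n in range(5)]
final_data = get_date_count_dict(temp_dates, new_data)
```

In this variant a date after the last row (April 26 here) repeats the last known total instead of raising an error.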

3 Comments

this does not put in a placeholder. i am building this for a daily cumulative graph. so I need a placeholder on the days that aren't in the database.
I really like your approach.. theres just one issue with it still. the placeholder you put in is a 0, it needs to be the value of the previous row. that way it'll be cumulative. your update looks like this.. screencast.com/t/WssfpAg6xM -------- what i need is this... screencast.com/t/5El4uqB08LvL
i get an error: tuple index out of range.. I debugged it and on 5-18 is where it failed. not sure how to see what the exact index error was but maybe that'll help
