I'm new to Python and have a big memory issue. My script runs 24/7, and each day it allocates about 1 GB more memory. I could narrow it down to this function:
Code:
#!/usr/bin/env python
# coding: utf8
import gc
from pympler import muppy
from pympler import summary
from pympler import tracker

v_list = [{
    'url_base': 'http://www.immoscout24.de',
    'url_before_page': '/Suche/S-T/P-',
    'url_after_page': '/Wohnung-Kauf/Hamburg/Hamburg/-/-/50,00-/EURO--500000,00?pagerReporting=true',
}]

# returns the URL for the given page number
def get_url(v, page_num):
    return v['url_base'] + v['url_before_page'] + str(page_num) + v['url_after_page']

while True:
    gc.enable()
    for v_idx, v in enumerate(v_list):
        # mem test output
        all_objects = muppy.get_objects()
        sum1 = summary.summarize(all_objects)
        summary.print_(sum1)

        # magic happens here
        url = get_url(v, 1)

        # mem test output
        all_objects = muppy.get_objects()
        sum1 = summary.summarize(all_objects)
        summary.print_(sum1)

    # collects unreferenced objects
    gc.collect()
Output:
                   types |   # objects |   total size
======================== | =========== | ============
                    list |       26154 |     10.90 MB
                     str |       31202 |      1.90 MB
                    dict |         507 |    785.88 KB
Especially the list row is getting bigger each cycle, by around 600 KB, and I have no idea why. In my opinion I don't store anything here, and the url variable should be overwritten each time, so basically there shouldn't be any growing memory consumption at all.
What am I missing here? :-)
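In case it helps, here is a reduced test I would try next (just a sketch, not my real script): instead of keeping full muppy snapshots alive in all_objects and sum1 between cycles, it uses the tracker module that I already import and prints only the per-cycle difference.
Code:
#!/usr/bin/env python
# coding: utf8
# Sketch of a reduced test: track only what changed between cycles.
from pympler import tracker

def get_url(v, page_num):
    return v['url_base'] + v['url_before_page'] + str(page_num) + v['url_after_page']

v = {
    'url_base': 'http://www.immoscout24.de',
    'url_before_page': '/Suche/S-T/P-',
    'url_after_page': '/Wohnung-Kauf/Hamburg/Hamburg/-/-/50,00-/EURO--500000,00?pagerReporting=true',
}

tr = tracker.SummaryTracker()
for cycle in range(5):
    url = get_url(v, 1)
    # print_diff() reports only objects created or freed since the last
    # call, so a real leak shows up as a non-empty diff every cycle
    tr.print_diff()
If the diff stays close to empty with just get_url, then the growth I see is probably coming from the snapshots themselves (all_objects is a list holding a reference to every live object) rather than from the URL building.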
Is it really this stripped-down script, with just the get_url function, where you see 600 KB of list storage every loop? Or do you see that with your real program, which does "magic" that you're not showing us?