I wrote a Python script that dumps my database, gzips the dump and moves it to cloud storage.
Locally everything runs smoothly and consumes basically no memory (max 20 MB RAM; CPU usage is quite high and I/O is at its limit).
When I run it as a Job in my Kubernetes cluster, memory usage piles up to about 1.6 GB,
which is more or less the size of my gzipped dump file.
Here's my dumping logic:
mysqldump_command = ['mysqldump', f'--host={host}', f'--port={port}', f'--user={username}',
                     f'--password={password}', '--databases', '--compact',
                     '--routines', db_name, f'--log-error={self.errorfile}']

print(f'## Creating mysql dump')
with open(self.filename_gzip, 'wb', 0) as f:
    p1 = subprocess.Popen(mysqldump_command, stdout=subprocess.PIPE)
    p2 = subprocess.Popen('gzip', stdin=p1.stdout, stdout=f)
    p1.stdout.close()  # force write error (/SIGPIPE) if p2 dies
    p2.wait()
    p1.wait()
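
For comparison, here is roughly how the same pipeline would look with the compression done in-process instead of through an external gzip process. This is only a sketch, not my production code (the function name dump_to_gzip and the 64 KiB chunk size are made up); it streams mysqldump's stdout through Python's gzip module in fixed-size chunks, so in principle memory should stay bounded either way:

import gzip
import shutil
import subprocess

def dump_to_gzip(mysqldump_command, gzip_path, chunk_size=64 * 1024):
    """Stream the mysqldump output through gzip in fixed-size chunks."""
    with gzip.open(gzip_path, 'wb') as gz:
        p1 = subprocess.Popen(mysqldump_command, stdout=subprocess.PIPE)
        # copyfileobj reads at most chunk_size bytes at a time before writing,
        # so the whole dump is never held in memory at once
        shutil.copyfileobj(p1.stdout, gz, length=chunk_size)
        p1.stdout.close()
        p1.wait()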
I tried:
- setting PYTHONUNBUFFERED=1, with no effect
- this logic, but it was even worse
- creating the dump as a plain file first and gzipping it afterwards (roughly as in the sketch below), which was the worst of all my experiments
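
For completeness, that last attempt looked roughly like this (a simplified sketch; dump_then_gzip, dump_path and gzip_path are placeholder names, not the real code):

import gzip
import shutil
import subprocess

def dump_then_gzip(mysqldump_command, dump_path, gzip_path):
    """Write the raw dump to disk first, then compress it in a second pass."""
    with open(dump_path, 'wb') as raw:
        subprocess.run(mysqldump_command, stdout=raw, check=True)
    with open(dump_path, 'rb') as raw, gzip.open(gzip_path, 'wb') as gz:
        shutil.copyfileobj(raw, gz)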
Any further ideas?