I'm trying to open and read a .gz file, and I keep getting this error:
zlib.error: Error -3 while decompressing data: invalid distance too far back
The file that I'm trying to read was created line by line, using this code:
with gzip.open(output_path + out_fn, 'a') as fout:
    json_str = json.dumps(json.loads(data)) + "\n"
    json_bytes = json_str.encode('utf-8')
    fout.write(json_bytes)
which ran each time new streaming data arrived (in the 'data' variable).
This is the script that I used to read the .gz file:
with gzip.open(file_path) as f:
    new_lines = []
    for line in f:
        new_l = line.decode('utf-8')
        new_l = json.loads(new_l)
        new_lines.append(new_l)
The error is raised on the 'for' line, after some lines have been read successfully, so there might be something wrong with specific lines.
I'd be fine with just skipping the problematic lines if possible, or of course with fixing the entire file.
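A minimal sketch of how such skipping could work, under assumptions that have not been verified against the actual file: because the writer reopens the file in append mode for every write, the file should be a sequence of independent gzip members, so a member that fails to decompress can be dropped and the scan resumed at the next gzip header. The names decompress_member, iter_gzip_members and load_json_lines are hypothetical, and the whole file is held in memory.

```python
import json
import zlib

GZIP_MAGIC = b"\x1f\x8b\x08"  # gzip ID bytes followed by the deflate method byte


def decompress_member(raw, start, chunk_size=64 * 1024):
    """Decompress one gzip member at raw[start:]; return (payload, end offset)."""
    d = zlib.decompressobj(wbits=31)  # wbits=31: expect a gzip header and trailer
    parts = []
    pos = start
    while not d.eof and pos < len(raw):
        piece = raw[pos:pos + chunk_size]
        parts.append(d.decompress(piece))  # raises zlib.error on corrupt data
        pos += len(piece)
    if not d.eof:
        raise zlib.error("member is truncated")
    # unused_data holds the bytes of the last piece that follow this member.
    return b"".join(parts), pos - len(d.unused_data)


def iter_gzip_members(raw):
    """Yield the payload of every member that decompresses cleanly."""
    pos = 0
    while True:
        start = raw.find(GZIP_MAGIC, pos)
        if start == -1:
            return
        try:
            payload, pos = decompress_member(raw, start)
        except zlib.error:
            pos = start + 1  # drop this candidate, look for the next header
            continue
        yield payload


def load_json_lines(raw):
    """Parse the JSON lines found in all readable members."""
    records = []
    for payload in iter_gzip_members(raw):
        for line in payload.decode("utf-8").split("\n"):
            if line.strip():
                records.append(json.loads(line))
    return records


# Usage (file_path as in the reading script above):
# with open(file_path, "rb") as f:
#     new_lines = load_json_lines(f.read())
```

Whatever was written in a damaged member is lost with this approach. If the corruption has a different cause than a single bad member, the scan may recover less than expected.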
Edit:
I've uploaded the .gz file here - https://www.dropbox.com/s/wp34maf8n8wb5ur/tweets.gz?dl=0
I couldn't find a smaller example, sorry.