7

How to start creating my own filetype in Python ? I have a design in mind but how to pack my data into a file with a specific format ?

For example I would like my fileformat to be a mix of an archive ( like other format such as zip, apk, jar, etc etc, they are basically all archives ) with some room for packed files, plus a section of the file containing settings and serialized data that will not be accessed by an archive-manager application.

My requirement for this is about doing all this with the default modules for Cpython, without external modules.

I know that this can be long to explain and do, but I can't see how to start this in Python 3.x with Cpython.

2
  • what have you tried?. Anyway this is pretty normal way of doing things. Read the docs of zipfile module Commented Apr 2, 2013 at 6:07
  • @joojaa I can't see nothing really related to what I'm trying to do here in the official Python wiki, so I tried nothing python-specific and I looking for clues Commented Apr 2, 2013 at 6:10

3 Answers 3

5

Try this:

from zipfile import ZipFile
import json

data = json.dumps(['foo', {'bar': ('baz', None, 1.0, 2)}])

with ZipFile('foo.filetype', 'w') as myzip:
    myzip.writestr('digest.json', data)

The file is now a zip archive with a json file (thats easy to read in again in many lannguages) for data you can add files to the archive with myzip write or writestr. You can read data back with:

with ZipFile('foo.filetype', 'r') as myzip:
    json_data_read = myzip.read('digest.json')
    newdata = json.loads(json_data_read)

Edit: you can append arbitrary data to the file with:

f = open('foo.filetype', 'a')
f.write(data)
f.close()

this works for winrar but python can no longer process the zipfile.

Sign up to request clarification or add additional context in comments.

2 Comments

but this is basically a zip file, I need to attach something that is not available in the archive as serialized data
Thats what the json is for it serializes data into the zipfile. The problem is that if you add arbitrary data into the file then the file may be corrupted. But yes i suppose you could try to append your arbitrary data to this file. BUT theres no guarantee the archiver can process this file anymore. Theres no way to guarantee all archivers work the same way.
1

Use this:

import base64
import gzip
import ast

   
def save(data):
    data = "[{}]".format(data).encode()
    data = base64.b64encode(data)
    return gzip.compress(data)

def load(data):
    data = gzip.decompress(data)
    data = base64.b64decode(data)
    return ast.literal_eval(data.decode())[0]

How to use this with file:

open(filename, "wb").write(save(data)) # save data

data = load(open(filename, "rb").read()) # load data

This might look like this is able to be open with archive program but it cannot because it is base64 encoded and they have to decode it to access it.

Also you can store any type of variable in it! example:

open(filename, "wb").write(save({"foo": "bar"})) # dict
open(filename, "wb").write(save("foo bar")) # string
open(filename, "wb").write(save(b"foo bar")) # bytes

# there's more you can store!

Comments

0

This may not be appropriate for your question but I think this may help you.

I have a similar problem faced... but end up with some thing like creating a zip file and then renamed the zip file format to my custom file format... But it can be opened with the winRar.

1 Comment

Well technically, .docx and .pptx is also a zip file, so it's not that much of a problem.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.