41

Let's say I have something like this:

d = { "abc" : [1, 2, 3], "qwerty" : [4,5,6] }

What's the correct way to programmatically get that into a file that I can load from Python later?

Can I somehow save it as python source (from within a python script, not manually!), then import it later?

Or should I use JSON or something?

2
  • 1
    Here's a couple more: dataset and jsonpickle. Commented Mar 26, 2016 at 16:05
  • The easiest way would be JSON, because it structures data much like a Python dictionary. Luckily, Python ships with a json module, so all you need to do is import json. Commented Dec 25, 2020 at 3:55

7 Answers

73

Use the pickle module.

import pickle

d = { "abc" : [1, 2, 3], "qwerty" : [4,5,6] }

# write the object to a file (binary mode is required)
with open(r'C:\d.pkl', 'wb') as afile:
    pickle.dump(d, afile)

# reload object from file
with open(r'C:\d.pkl', 'rb') as file2:
    new_d = pickle.load(file2)

# print dictionary object loaded from file
print(new_d)

7 Comments

What's the r in front of the path mean?
The r'' denotes a raw string, described here: docs.python.org/reference/lexical_analysis.html#string-literals. Basically, it means that backslashes in the string are included as literal backslashes, not character escapes (though a raw string can't end in a backslash).
I've corrected the example—the file needs to be opened in binary mode. It still needs to be for Python 2, but it won't fail as dramatically.
Make sure you read the Python documentation (including for the appropriate version) and don't just rely on examples! :) docs.python.org/3.0/library/pickle.html (Sorry for the comment spam!)
Technically pickling will work for text mode files, so long as you're not using a binary pickle format (ie. protocol = 0) and you use it consistently (ie. also use text mode for reading back). Using binary is generally a better idea though, especially if you could be moving data between platforms.
16

Take your pick: Python Standard Library - Data Persistence. Which one is most appropriate depends on your specific needs.

pickle is probably the simplest and most capable as far as "write an arbitrary object to a file and recover it" goes—it can automatically handle custom classes and circular references.

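For the best pickling performance (speed and space), use cPickle at HIGHEST_PROTOCOL. On Python 3 there is no separate cPickle; the plain pickle module uses the C implementation automatically when it is available.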
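A minimal sketch of pickling at the highest protocol, assuming Python 3 (where the plain pickle module is used instead of cPickle). The temporary directory and the file name d.pkl are placeholders for illustration only.

```python
import os
import pickle
import tempfile

d = {"abc": [1, 2, 3], "qwerty": [4, 5, 6]}

# Hypothetical location: a throwaway directory that is removed on exit.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "d.pkl")

    # HIGHEST_PROTOCOL picks the newest binary format this interpreter supports.
    with open(path, "wb") as f:
        pickle.dump(d, f, protocol=pickle.HIGHEST_PROTOCOL)

    # Reading needs no protocol argument; it is detected from the file.
    with open(path, "rb") as f:
        restored = pickle.load(f)

print(restored == d)
```

Files written with a newer protocol may not load on older Python versions, so a lower protocol can be the better choice when the file has to be shared.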

Comments

8

Try the shelve module, which will give you a persistent dictionary, for example:

import shelve
d = { "abc" : [1, 2, 3], "qwerty" : [4,5,6] }

shelf = shelve.open('shelf_file')
for key in d:
    shelf[key] = d[key]

shelf.close()

....

# reopen the shelf
shelf = shelve.open('shelf_file')
print(dict(shelf)) # => {'qwerty': [4, 5, 6], 'abc': [1, 2, 3]} (key order may vary)
shelf.close()

Comments

5

JSON has faults, but when it meets your needs, it is also:

  • simple to use
  • included in the standard library as the json module
  • interface somewhat similar to pickle, which can handle more complex situations
  • human-editable text for debugging, sharing, and version control
  • valid Python code
  • well-established on the web (if your program touches any of that domain)
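For the dictionary in the question, a round trip through the json module might look like the sketch below. The temporary directory and the file name d.json are placeholders, not part of the original answer.

```python
import json
import os
import tempfile

d = {"abc": [1, 2, 3], "qwerty": [4, 5, 6]}

# Hypothetical location: a throwaway directory that is removed on exit.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "d.json")

    # indent=2 keeps the file readable and friendly to version control.
    with open(path, "w", encoding="utf-8") as f:
        json.dump(d, f, indent=2)

    with open(path, encoding="utf-8") as f:
        loaded = json.load(f)

print(loaded == d)
```

This works cleanly here because the dictionary only contains string keys, lists, and integers, all of which JSON can represent directly.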

1 Comment

JSON ain't valid Python. It looks so, superficially, but use some bools and you'll see the problem (JSON uses true and false, while Python uses True and False). Also: JSON objects (the equivalent of dicts) only have string keys. So it doesn't preserve the data structure correctly.
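A small sketch of those caveats, using a made-up dictionary (the keys and values below are illustrative only). The tuple case is an extra example not raised in the comment: JSON has no tuple type, so tuples come back as lists.

```python
import json

# Illustrative data: an integer key, a bool, and a tuple.
original = {1: "one", "flag": True, "pair": (1, 2)}

text = json.dumps(original)
round_tripped = json.loads(text)

print(text)           # {"1": "one", "flag": true, "pair": [1, 2]}
print(round_tripped)  # {'1': 'one', 'flag': True, 'pair': [1, 2]}
print(round_tripped == original)  # False
```

The bool survives the round trip, but the serialized text uses true rather than True, so it cannot be pasted back in as Python source.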
5

As your data gets more complex, you might also want to take a look at Zope's Object Database :-) It's probably overkill for what you have, but it scales well and is not too hard to use.

Comments

3

Just to add to the previous suggestions, if you want the file format to be easily readable and modifiable, you can also use YAML. It works extremely well for nested dicts and lists, and it also scales to more complex data structures (i.e. ones involving custom objects). Its big plus is that the format is readable.

Comments

1

If you want to save it in an easy to read JSON-like format, use repr to serialize the object and eval to deserialize it.

repr(object) -> string

Return the canonical string representation of the object. For most object types, eval(repr(object)) == object.

6 Comments

Consider ast.literal_eval() (docs.python.org/library/ast.html#ast.literal_eval) as an alternative to eval().
The main thing I don't like about this solution is that if you have an object in the structure where the eval(repr()) identity doesn't hold, repr() will "succeed" but then eval() will barf.
@John You will be pilloried for that answer... where's S.Lott?
pickle, YAML, JSON, etc. are all safer and work with more types than this method. IMO, eval() should be avoided whenever possible.
@Jason: Actually, pickle is not any safer than eval - malicious input can execute code just as easily, and here at least it is obvious that it is doing so, so I think downvoting this is a little unfair. There are other reasons to avoid eval() (eg. only handles objects with evalable repr()s and silently loses data if they don't self-eval, as Miles pointed out), but security wise, it's no worse than pickle.
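A minimal sketch of the repr approach combined with the ast.literal_eval() suggestion from the comments. The Custom class is a hypothetical stand-in for any object whose repr is not a Python literal.

```python
import ast

d = {"abc": [1, 2, 3], "qwerty": [4, 5, 6]}

# Serialize with repr; the result is plain text that could be written to a file.
text = repr(d)

# literal_eval only accepts Python literals, so it cannot run arbitrary code.
restored = ast.literal_eval(text)
print(restored == d)

# Hypothetical class with the default repr, e.g. <__main__.Custom object at 0x...>.
class Custom:
    pass

bad_text = repr({"obj": Custom()})  # repr() itself "succeeds"

failed = False
try:
    ast.literal_eval(bad_text)
except (ValueError, SyntaxError):
    # The default repr is not valid literal syntax, so parsing fails here.
    failed = True
print(failed)
```

The second half shows the failure mode described above: the problem only surfaces when the data is read back, not when it is written.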