I have a small, simple movie registration app that lets a user register a new movie in the registry. It currently stores only pickled objects. Saving the objects is not a problem, but reading an unknown number of pickled objects back from the file is more complicated, since I can't find any sequence of objects to iterate over when reading the file.

Is there any way to read an unknown number of pickled objects from a file in Python (into an unknown number of variables, or preferably a list)?

Since the volume of data is so low, I don't see the need for a fancier storage solution than a simple file.

When trying to use a list with this code:

film = Film(title, description, length)
film_list.append(film)
open_file = open(file, "ab")
try:
  save_movies = pickle.dump(film_list, open_file)
except pickle.PickleError:
  print "Error: Could not save film to file."

it works fine, and when I load it I get a list back, but no matter how many movies I register I still only get one element in the list. len(film_list) is 1, and that one element is the first movie that was saved to the file. When I look at the file, it does contain the other movies that were added to the list, but for some reason they are not included in the loaded list.

I'm using this code for loading the movies:

open_file = open(file, "rb")
try:
  film_list = pickle.load(open_file)
  print type(film_list) # displays a type of list
  print len(film_list) # displays that only 1 element is in the list
  for film in film_list: # only prints out one list item
    print film.name
except pickle.PickleError:
  print "Error: Unable to load one or more movies."
  • What about pickling a list? Commented Feb 28, 2015 at 12:15
  • Contrary to your final statement, you obviously need a fancier storage solution, or at least one where you can tell which sections of the file correspond to which pickled objects. The most obvious ways are: 1) store a dict/list of objects; 2) store each object in a different file; 3) store an index that gives you offsets into the blocks written to the file (this kind of approach can be fast but is brittle, since it means things like rebuilding the index on item removal); 4) try an off the shelve solution, e.g. something like the built-in shelve module (pun); 5) something fancier. Commented Feb 28, 2015 at 12:33
  • Andre: I updated the original post. Commented Feb 28, 2015 at 12:36
  • Preet: Your second point does not seem very practical, with hundreds of files everywhere, and the third seems very hardcoded. I don't think you would ever know the offsets for a given file, since each movie's description can be longer or shorter than the previous or the next one. The dictionary/list approach does seem like a good one, as Andre pointed out, but for some reason it doesn't quite work when loading the list; see my original post, which has now been updated. Commented Feb 28, 2015 at 12:39
  • I strongly believe in getting something working first. If you can hide it behind an interface, you can change the implementation/performance details later however you please. Having said that, I agree option 2 is stupidly impractical. But it's better to get a working slow POS than a fast program that doesn't do what it's meant to. In terms of getting up and running, the single-file-per-object approach is also the most debuggable. It was in this getting-up-and-running, sanity-check spirit that I made the suggestion; I did not intend it as a production/live suggestion. Commented Feb 28, 2015 at 12:43

1 Answer

You can read an unknown number of pickled objects from a file by repeatedly calling pickle.load on the open file object until it raises EOFError.

>>> import string
>>> # make a sequence of stuff to pickle          
>>> stuff = string.ascii_letters
>>> # iterate over the sequence, pickling one object at a time
>>> import pickle
>>> with open('foo.pkl', 'wb') as f:
...     for thing in stuff:
...         pickle.dump(thing, f)
... 
>>> 
>>> things = []
>>> f = open('foo.pkl', 'rb')
>>> # load the first two objects
>>> things.append(pickle.load(f))
>>> things.append(pickle.load(f))
>>> # get the remaining pickled items
>>> while True:
...     try:          
...         things.append(pickle.load(f))
...     except EOFError:
...         break
... 
>>> stuff 
'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'
>>> things
['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z', 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z']
>>> f.close()
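
The same loop can be wrapped in a small generator so the caller simply gets a list back. This is only a sketch in Python 3 syntax: the helper name load_all, the temporary file path, and the movie titles are made up for illustration and are not from the question. It also assumes each save appends exactly one pickled object in "ab" mode, as the question's code does.

```python
import os
import pickle
import tempfile


def load_all(path):
    """Yield every object that was pickled back to back into the file at `path`."""
    with open(path, "rb") as f:
        while True:
            try:
                yield pickle.load(f)
            except EOFError:
                # pickle.load raises EOFError once there is nothing left to read
                return


# Demo: append one pickled object per "save", reopening the file each time.
demo_path = os.path.join(tempfile.mkdtemp(), "films.pkl")
for title in ["Alien", "Brazil", "Casablanca"]:
    with open(demo_path, "ab") as f:
        pickle.dump(title, f)

films = list(load_all(demo_path))
print(films)
```

Note that a single pickle.load call returns only the first object in the file, which is consistent with the behaviour described in the question: every save appended another pickle to the file, but only the first one was ever read back.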

1 Comment

Hmm, that's interesting: an outer while loop that you just break out of when you encounter an EOFError. That seems to be a good solution. I will be trying it out, thanks.
