How to load Python 3 Pickled SKlearn Model in Python 2

Question

I have a Python 3.6 script that trains an SKLearn model and then saves the model using the following code:

import pickle

with open('filepath', 'wb') as f:
    pickle.dump(trained_model, f, protocol=2)

When I load the pickle in Python 3.6, everything works:

>>> with open('filepath', 'rb') as f:
...     model = pickle.load(f)
...
>>> model

RandomForestClassifier(bootstrap=True, class_weight=None, criterion='gini',
        max_depth=None, max_features='auto', max_leaf_nodes=None,
        min_impurity_decrease=0.0, min_impurity_split=None,
        min_samples_leaf=1, min_samples_split=2,
        min_weight_fraction_leaf=0.0, n_estimators=80, n_jobs=1,
        oob_score=False, random_state=None, verbose=0,
        warm_start=False)

When I run the same pickle.load command in Python 2.7, I get the following error:

>>> with open('filepath', 'rb') as f:
...     model = pickle.load(f)
...

ValueError: non-string names in Numpy dtype unpickling

According to the documentation and similar cases, setting the protocol to 2 should make the pickle file compatible with Python 2. What is causing this error, and how can I work around it?

I cannot say anything else at this point because you didn't provide a minimal reproducible example for me to diagnose.
– ivan_pozdeev, Commented Oct 31, 2017 at 2:28

Accepted Answer · answered by ivan_pozdeev · last edited by fersarr, 2019-08-13 10:00:51Z

You can use pickle._load() instead of pickle.load() to force the pure-Python implementation and get a more useful traceback.
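As a rough illustration of the idea, the sketch below runs the pure-Python unpickler on Python 3 and captures the traceback as text. The truncated payload and the helper name `load_pure_python` are hypothetical stand-ins, not part of the original question, and `pickle._load` is a private CPython name that may change between releases. On Python 2 the `pickle` module is already pure Python (`cPickle` is the C version), so there a plain `pickle.load()` already gives Python-level frames.

```python
import io
import pickle
import traceback

# Hypothetical stand-in for an incompatible pickle: a valid protocol-2
# pickle with its final STOP opcode cut off.
data = pickle.dumps({"n_estimators": 80}, protocol=2)[:-1]


def load_pure_python(raw):
    """Load bytes with the pure-Python unpickler.

    Returns (object, None) on success or (None, traceback_text) on failure.
    """
    try:
        # pickle._load is the pure-Python counterpart of pickle.load
        # (private CPython name, Python 3 only).
        return pickle._load(io.BytesIO(raw)), None
    except Exception:
        return None, traceback.format_exc()


obj, tb = load_pure_python(data)
print(tb)  # the frames point into pickle.py instead of stopping at the C layer
```

The traceback shows which `load_*` handler inside `pickle.py` was running when the error occurred, which narrows down the opcode involved.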

If the faulty part is in numpy's code, though, you are still left with using a C debugger or tracing the source code by hand. Alternatively, you can check the part of the data that is fed to numpy's unpickling routine against numpy's pickle format spec and try to work out what is wrong with it.

pickletools.dis() helps with this: it prints a disassembly of the pickle data, complete with offsets. You might still need the spec to find out the nature of the violation.
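A minimal sketch of such a disassembly, run on Python 3, might look like the following. The payload is a made-up tuple rather than a real model. One thing the output makes visible is that Python 3 writes text strings with the BINUNICODE opcode even under protocol 2, and Python 2 reads those back as `unicode` objects; whether that is what triggers the numpy error in the question is an assumption that the disassembly of the real file can help check.

```python
import io
import pickle
import pickletools

# Hypothetical payload standing in for a small piece of a pickled model.
payload = pickle.dumps(("left_child", 1.5), protocol=2)

# Human-readable disassembly with byte offsets, captured as text.
out = io.StringIO()
pickletools.dis(payload, out=out)
print(out.getvalue())

# Only the opcode names, which is handy for spotting string-type opcodes.
opcodes = [op.name for op, arg, pos in pickletools.genops(payload)]
print(opcodes)
```

For a real file, passing the open file object (mode 'rb') to pickletools.dis() works the same way; the offsets in the listing can then be matched against the position reported by a failing load.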

That said, the scikit-learn 0.19.1 documentation (section 3.4, "Model persistence") does warn that loading model data in another version and/or architecture is not supported, and suggests saving the source material instead.
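One way to read "saving source material" is to store a version-neutral recipe next to the training data and retrain in the target interpreter. The sketch below is only an assumption about how that could look: the file names and the `recipe` layout are hypothetical, and the hyperparameters are copied by hand from the repr in the question (with a real estimator, `trained_model.get_params()` would supply this dict).

```python
import json
import os
import sys
import tempfile

# Hyperparameters copied from the repr shown in the question.
params = {"n_estimators": 80, "criterion": "gini", "max_depth": None,
          "min_samples_leaf": 1, "min_samples_split": 2, "bootstrap": True}

recipe = {
    "estimator": "sklearn.ensemble.RandomForestClassifier",
    "params": params,
    "training_data": "train.csv",  # hypothetical path to the saved training set
    "python_version": list(sys.version_info[:3]),
}

# Written to a temporary directory here; a real script would pick a stable path.
recipe_path = os.path.join(tempfile.mkdtemp(), "model_recipe.json")
with open(recipe_path, "w") as f:
    json.dump(recipe, f, indent=2, sort_keys=True)

# JSON is readable from both Python 2 and Python 3, so the other
# interpreter can rebuild the estimator from the params and retrain.
with open(recipe_path) as f:
    restored = json.load(f)
```

Retraining costs time and, without a fixed random_state, will not reproduce the exact same forest, so this trades bit-for-bit fidelity for portability.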

edited Aug 13, 2019 at 10:00

fersarr


answered Oct 30, 2017 at 20:24

ivan_pozdeev



1 Comment

fersarr Over a year ago

The links to _load() and 'force using a pure python..' seem to point to the wrong lines. Permalinks should be used instead.
