
Is there some way to take a dataframe, say,

df = pd.DataFrame({'a':[1,2,3], 'b':[4,5,6]})

and store it in temp memory as a binary object that can then be opened with

open(df, 'rb')

So, rather than doing something like

open('/home/user/data.csv', 'rb')

the code would be

df = pd.DataFrame({'a':[1,2,3], 'b':[4,5,6]})

df_rb = *command to store in temp working memory as binary readable*

open(df_rb, 'rb')
  • Look into Python's pickle module: docs.python.org/3.8/library/pickle.html Commented Jun 2, 2020 at 19:39
  • I've tried that, but then I can't extract a file path from it. So I would need to pickle it, get a file path (without specifying one), and then use it with open. Is there some way to do that? Commented Jun 2, 2020 at 19:41
  • Simply dump the DataFrame to an in-memory byte stream (using e.g. BytesIO docs.python.org/3/library/io.html#io.BytesIO) instead of to a file. Commented Jun 2, 2020 at 19:46
  • What problem are you trying to solve? Commented Jun 2, 2020 at 19:58
  • I'm trying to build a workaround for a Django REST API issue. I posted an in-depth question about that, but I think it was too in-depth. This would give me a simple workaround without having to change my Django API source code. Commented Jun 2, 2020 at 20:02

2 Answers


You could pickle it to an io.BytesIO object, which lives entirely in memory:

import pandas as pd
import pickle, io
df = pd.DataFrame({'a':[1,2,3], 'b':[4,5,6]})
f = io.BytesIO()
pickle.dump(df,f)
f.seek(0)    # necessary to start reading at the beginning of the "file"
dg = pickle.load(f)

In [48]: dg==df
Out[48]: 
      a     b
0  True  True
1  True  True
2  True  True
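
Since the question is really about replacing open('/home/user/data.csv', 'rb'), the same idea should also work with CSV bytes instead of a pickle. The following is only a sketch under that assumption: the names csv_buffer, raw, and round_trip are made up for illustration, and whether a BytesIO object is accepted depends on the code that currently receives the result of open().

```python
import io
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6]})

# Render the DataFrame as CSV text, encode it, and wrap the bytes
# in an in-memory binary stream (no file path involved).
csv_buffer = io.BytesIO(df.to_csv(index=False).encode('utf-8'))

# csv_buffer supports read()/seek() like a file opened with 'rb'.
raw = csv_buffer.read()

# Rewind before handing the buffer to another reader.
csv_buffer.seek(0)
round_trip = pd.read_csv(csv_buffer)
```

Note that the buffer is used in place of the object returned by open(), not passed to open() itself, because open() expects a path rather than a stream.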

2 Comments

This was helpful. df.to_pickle will close the io object, though, so I preferred pickle.dump, which let me control when it was closed.
@NickBrady - thanx.

Pandas has a df.to_pickle() method.

From the docs:

Pickle (serialize) object to file.

df.to_pickle("./dummy.pkl")

Then read the pickled DataFrame back with read_pickle().

From the docs:

Load pickled pandas object (or any object) from file.

unpickled_df = pd.read_pickle("./dummy.pkl")

3 Comments

Will "./dummy.filetype" work on any computer? What I mean is: is there any risk that someone downloads a function I wrote, runs it (writing a file to './file'), and gets a "directory does not exist" error? It seems like that couldn't be the case.
I think it will run on any machine. You can always guard against that case by having pickle write to a directory that is guaranteed to exist.
This does not work if there is no filesystem, e.g. AWS Lambda.
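
On the concern about './dummy.pkl' raised in these comments: if a real file path is needed, one option is to let Python's tempfile module choose the location. This is a rough sketch, not part of the answer above. It assumes a writable temporary directory is available, and the names tmp_dir, pkl_path, and payload are illustrative.

```python
import os
import tempfile
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6]})

# Create a temporary directory that is removed when the block exits.
with tempfile.TemporaryDirectory() as tmp_dir:
    pkl_path = os.path.join(tmp_dir, 'dummy.pkl')
    df.to_pickle(pkl_path)

    # The path can now be opened like any other binary file.
    with open(pkl_path, 'rb') as fh:
        payload = fh.read()

    unpickled_df = pd.read_pickle(pkl_path)
```

The file only exists inside the with block, so anything that needs the path has to run before the block ends.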
