
I'm using Python 3.11 with Jupyter Notebook. Several notebooks run simultaneously, and they all read the same CSV file into a pandas DataFrame and use it; none of them write to or change that file. Since the file is very big and several notebooks use it at the same time, is it possible to read the file in only one notebook and somehow let the other notebooks use it without reading it separately in each one?

So far, each of those notebooks contains a line like this:

df = pd.read_csv('/home/data/bigi.csv')

It seems like a big waste of computational resources, memory, and time.

  • Why not use the same notebook? Also, what kind of pandas operations do you use? Most create copies, so you'll copy the data in memory several times anyway. Commented Oct 26, 2023 at 14:40
  • Those are separate pieces of code that do different things; it is not possible to combine them in one notebook. In each of them, some parts of the data are copied to create new, much smaller DataFrames, and the work is then done on those. Commented Oct 26, 2023 at 14:48
  • Yes, why not combine all the notebooks into one notebook? Commented Oct 26, 2023 at 15:09
  • In addition to %store magic mozway pointed out, you can use pickling of the dataframe read from the csv and then use the pandas.read_pickle() method elsewhere so that you only have one pandas.read_csv() command among the several notebooks. That has the advantage it will work in pure Python, too, which you'll probably want to be moving to if speed is a concern. Commented Oct 26, 2023 at 18:23
  • If each notebook needs just a subset, you could store the data in SQLite or similar and pick your slice. Commented Oct 27, 2023 at 9:47
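
The SQLite suggestion in the last comment could look roughly like the sketch below. It is only an illustration: the tiny DataFrame stands in for the big CSV, and the table name `bigi`, the column names, and the temporary database path are all assumptions made for the example.

```python
import sqlite3
import tempfile
from contextlib import closing
from pathlib import Path

import pandas as pd

# Hypothetical stand-in for the big CSV; in practice this would come
# from a single pd.read_csv(...) call done once.
df = pd.DataFrame(
    {"category": ["a", "b", "a", "c"], "value": [1, 2, 3, 4]}
)

# Hypothetical database location (a temp directory for this sketch).
db_path = Path(tempfile.mkdtemp()) / "bigi.sqlite"

# One-time step: load the data into a SQLite table.
with closing(sqlite3.connect(db_path)) as conn:
    df.to_sql("bigi", conn, index=False, if_exists="replace")

# In each notebook: pull only the slice that notebook needs.
with closing(sqlite3.connect(db_path)) as conn:
    subset = pd.read_sql_query(
        "SELECT * FROM bigi WHERE category = ? ORDER BY value",
        conn,
        params=("a",),
    )

print(subset)
```

Each notebook then only holds its own slice in memory instead of the whole file, which fits the case where every notebook works on a small part of the data.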

1 Answer


Go to one Jupyter notebook

Read your csv there:

import pandas as pd
df = pd.read_csv(r"C:\Users\xxx\xxx\xxx.csv")

I saved this notebook as demo1.ipynb; you can use any name.


Go to a fresh notebook:

pip install import-ipynb

import import_ipynb

then import the saved file:

import demo1  

#output
importing Jupyter notebook from demo1.ipynb

Now, you can read your df by:

print(demo1.df)

Edit: As @mozway suggested, you can also store a variable in one notebook and access it from another with the %store magic.

To store the variable, run the command below in a Jupyter cell:

%store df

#output
Stored 'df' (DataFrame)

To access this variable in a new notebook, run:

%store -r df

1 Comment

I'm pretty sure this will load a copy. You can already share objects with the %store magic, but this also creates copies.
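
Each notebook still ends up with its own in-memory copy either way. The pickling idea from the comments on the question at least limits the CSV parsing to a single notebook. A minimal sketch, assuming a small stand-in file and hypothetical paths in a temporary directory:

```python
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical paths; the real CSV would be the big file from the question.
work_dir = Path(tempfile.mkdtemp())
csv_path = work_dir / "bigi.csv"
pkl_path = work_dir / "bigi.pkl"

# Create a tiny stand-in CSV so the sketch runs on its own.
pd.DataFrame({"id": [1, 2, 3], "value": [0.5, 1.5, 2.5]}).to_csv(
    csv_path, index=False
)

# Notebook 1: parse the CSV once and save the DataFrame as a pickle.
df = pd.read_csv(csv_path)
df.to_pickle(pkl_path)

# Other notebooks: load the pickle instead of re-parsing the CSV.
df_shared = pd.read_pickle(pkl_path)
print(df_shared.equals(df))
```

The other notebooks still read from disk, but they skip the CSV parsing step and get the same dtypes back without repeating any read_csv options.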
