
Every Jupyter notebook I open seems to use a constant 900-1200 MB of RAM continuously. Where is most of this overhead coming from? It seems to be relatively independent of the contents of the notebook, as most of the jump in usage is when the notebook is first opened, before any cells are even executed. In fact, sometimes after executing a number of the cells the RAM usage actually goes down.

To give an idea, in the notebook I have open now, the largest variable I've defined is an ~3000-element list of tuples, each of which contains 5 ints and a float. So it's not like I have several million-row data tables open, and as I said, the RAM usage was already up near that 1 GB mark before I even ran the code that generated this list.

I'm gathering that the Python runtime itself is somehow using most of this RAM, but if I run Python from the command line it doesn't do this. Also, the memory usage shows up under the browser process, NOT the background Python process (even when the Jupyter notebook is the only thing running in the browser, and when the browser was closed before launching the notebook, so it isn't leftover data from previously browsed websites that never got cleared).

Jupyter is a complex program, much more complex than Python on the command line, and that needs RAM. Commented Sep 17 at 2:23

1 Answer


If the output of the notebook(s) is large and not cleared, that would likely explain this. Potential solutions include: 1) restarting the kernel and clearing all output; 2) limiting displays (use .head() on DataFrames instead of displaying them in full); 3) avoiding large, complex, and/or interactive plots (save them to files if needed).
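Option 1 can also be done outside Jupyter: an .ipynb file is plain JSON, so stored outputs can be stripped with just the standard library. A minimal sketch (the filename is hypothetical; `jupyter nbconvert --clear-output --inplace notebook.ipynb` achieves the same thing):

```python
import json

def clear_outputs(path):
    """Strip stored outputs and execution counts from a notebook file."""
    with open(path) as f:
        nb = json.load(f)
    for cell in nb.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []          # drop saved cell output
            cell["execution_count"] = None
    with open(path, "w") as f:
        json.dump(nb, f, indent=1)

# clear_outputs("notebook.ipynb")  # hypothetical filename
```

The data your cells computed is gone either way once the kernel stops; only the rendered output was being kept in the file.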

For example, running the code below in a new notebook prints values around 70 MB:

import psutil, os, time

# Attach to the current (kernel) process and sample its resident set size.
proc = psutil.Process(os.getpid())
for _ in range(10):
    print(proc.memory_info().rss / 1e6, "MB")  # RSS in MB
    time.sleep(5)

Adding the code below results in values of over 400 MB. NOTE: if memory is a concern on your machine, you probably do NOT want to test this.

import plotly.express as px
import pandas as pd
import numpy as np

# Create a big dataset
N = 2_000_000
df = pd.DataFrame({
    "x": np.random.randn(N),
    "y": np.random.randn(N),
    "color": np.random.choice(["A", "B", "C", "D"], size=N)
})

# Plotly scatter (this will be huge in output JSON)
fig = px.scatter(df, x="x", y="y", color="color", opacity=0.3)
fig.show()

When I save, close, and reopen this notebook, the values remain above 400 MB.
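Since an .ipynb file is just JSON, you can also measure how much stored output each cell contributes. A sketch using an in-memory stand-in for the notebook dict (a real file would be read with json.load instead; the 50 kB output string is fabricated for illustration):

```python
import json

# Stand-in for a loaded notebook; a real one comes from
# json.load(open("notebook.ipynb")).
nb = {
    "cells": [
        {"cell_type": "code", "source": "fig.show()",
         "outputs": [{"data": {"application/json": "x" * 50_000}}]},
        {"cell_type": "markdown", "source": "# notes", "outputs": []},
    ]
}

# Sum the serialized size of each code cell's stored outputs.
for i, cell in enumerate(nb["cells"]):
    if cell.get("cell_type") == "code":
        size = len(json.dumps(cell.get("outputs", [])))
        print(f"cell {i}: {size / 1e3:.1f} kB of stored output")
```

Cells with large figures or full DataFrame displays will dominate this total, which is what the browser has to hold when it renders the notebook.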


1 Comment

Thanks pixel-process. That likely explains it; I'd have thought plots would just be stored as relatively small images (since the data arrays being plotted don't persist, I have to regenerate them every time, even though the rendered graph is still visible).
