
I have a problem when a notebook's output is really long and gets saved into the notebook: any time I open that notebook again, the browser crashes and can't display it correctly.

To fix this I have to open it with a text editor and delete all output from the cell causing the problem.

I wonder if there is a way to clear all output from the notebook so one can open it again without problems. I want to delete all output, since deleting a specific one seems more troublesome.


12 Answers


nbconvert 6.0 should fix --clear-output

The option had been broken for a long time; see the bug report with the merged patch: https://github.com/jupyter/nbconvert/issues/822

For in-place operation:

jupyter nbconvert --clear-output --inplace my_notebook.ipynb

Or to save to another file called my_notebook_no_out.ipynb:

jupyter nbconvert --clear-output \
  --to notebook --output=my_notebook_no_out my_notebook.ipynb

This was brought to my attention by Harold in the comments.

Before nbconvert 6.0: --ClearOutputPreprocessor.enabled=True

Same usage as --clear-output:

jupyter nbconvert --ClearOutputPreprocessor.enabled=True --inplace my_notebook.ipynb
jupyter nbconvert --ClearOutputPreprocessor.enabled=True \
  --to notebook --output=my_notebook_no_out my_notebook.ipynb

Tested in Jupyter 4.4.0, notebook==5.7.6.
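One quick way to confirm that the outputs really were cleared is to count them with the standard-library json module. This is just an illustrative sketch; `count_outputs` is not part of nbconvert:

```python
import json

def count_outputs(path):
    """Count output entries across all cells of a notebook file."""
    with open(path, encoding="utf-8") as f:
        nb = json.load(f)
    # Markdown cells have no "outputs" key, hence the .get() default
    return sum(len(cell.get("outputs", [])) for cell in nb["cells"])

# After a successful clear, this should report 0:
# count_outputs("my_notebook.ipynb")
```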


12 Comments

This will convert the notebook to HTML, which does not seem to be what the OP wants.
@Jacquot What version of Jupyter are you on? I have just re-tested and it modifies the .ipynb in place without creating HTML.
I read your comment too quickly and didn't know about the --inplace option; I learned something. But it appears that in my version, 5.3.1, the option --clear-output is available, which subsumes --ClearOutputPreprocessor.enabled=True --inplace.
The option --clear-output was broken, see issue #822. This was fixed last month (July 2020), so it should work again in the next release.
Not to criticize the answer, but my recent experience (Dec 2024) with nbconvert is that hooking it up as a git filter slows down local git operations significantly. Other folks report the same in another similar question. Keep this performance impact in mind when using nbconvert with a git filter.

If you create a .gitattributes file, you can run a filter over certain files before they are added to git. This will leave the original file on disk as-is, but commit the "cleaned" version.

For this to work, add this to your local .git/config or global ~/.gitconfig:

[filter "strip-notebook-output"]
    clean = "jupyter nbconvert --ClearOutputPreprocessor.enabled=True --to=notebook --stdin --stdout --log-level=ERROR"

Then create a .gitattributes file in your directory with notebooks, with this content:

*.ipynb filter=strip-notebook-output

How this works:

  • The attribute tells git to run the filter's clean action on each notebook file before adding it to the index (staging).
  • The filter is our friend nbconvert, set up to read from stdin, write to stdout, strip the output, and only speak when it has something important to say.
  • When a file is extracted from the index, the filter's smudge action is run, but this is a no-op as we did not specify it. You could run your notebook here to re-create the output (nbconvert --execute).
  • Note that if the filter somehow fails, the file will be staged unconverted.
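As a rough sketch of what the clean action amounts to (not nbconvert's actual implementation, which also handles notebook-format details this ignores), the filter boils down to:

```python
import json
import sys

def clean(nb):
    """Empty code-cell outputs and reset execution counts,
    roughly what ClearOutputPreprocessor does."""
    for cell in nb.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return nb

# A git clean filter reads the staged file on stdin and writes the
# cleaned version to stdout:
# json.dump(clean(json.load(sys.stdin)), sys.stdout, indent=1)
```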

My only minor gripe with this process is that I can commit .gitattributes but I have to tell my co-workers to update their .git/config.

If you want a hackier but much faster version, try JQ:

  clean = "jq '.cells[].outputs = [] | .cells[].execution_count = null | .'"

6 Comments

This is the best of both worlds. Thanks for sharing this.
Didn’t know about this. This is super-useful.
A slightly improved alternative is as follows. It cleans the metadata, and doesn't add outputs and execution_count to non code cells like the proposed JQ solution (which results in a warning): clean = "jq '.cells |= map(if .\"cell_type\" == \"code\" then .outputs = [] | .execution_count = null else . end | .metadata = {}) | .metadata = {}'"
As a final step, you probably want to scrub and recommit all of your existing notebooks, otherwise you could get heinous merge conflicts later. To do that run git add --renormalize . and then commit.
Is there a way to temporarily turn off the filter for a specific commit? E.g., if my repository is closer to maturation than it used to be and now I want to use the notebook as a demonstration of using the code including outputs and figures.

nbstripout worked well for me.

Open the Jupyter terminal, navigate to the folder containing your notebook, and then run the following line:

nbstripout my_notebook.ipynb

2 Comments

Excellent - or even nbstripout *.ipynb :)
Might be obvious for most, but you need to first install nbstripout with something like: pip install nbstripout

Use --ClearOutputPreprocessor.enabled=True together with --clear-output, as in the following command:

jupyter nbconvert --ClearOutputPreprocessor.enabled=True --clear-output *.ipynb

Comments


To extend the answer from @dirkjot and resolve the issue of sharing the configuration:

Create a local .gitconfig file, rather than modifying .git/config. This makes the command that needs to be run on other machines slightly simpler. You can also create a script to run the git config command:

git config --local include.path ../.gitconfig

Note I have also changed the log level to INFO because I did want to see confirmation that the clean was running.

repo/.gitconfig

[filter "strip-notebook-output"]
    clean = "jupyter nbconvert --ClearOutputPreprocessor.enabled=True --to=notebook --stdin --stdout --log-level=INFO"

repo/.gitattributes

*.ipynb filter=strip-notebook-output

repo/git_configure.sh

git config --local include.path ../.gitconfig

Users then just need to run:

$ chmod u+x git_configure.sh
$ ./git_configure.sh

Comments


Use clean_ipynb, which not only clears notebook output but can also clean the code.

Install by pip install clean_ipynb

Run by clean_ipynb hello.ipynb

1 Comment

nbclean is a tool that can do that with some handy additional features, such as removing only certain blocks of code/text, which makes it handy for teaching.

I must say I find jupyter nbconvert painfully slow for the simple job of clearing some sub-arrays and resetting some execution numbers. It is the more maintainable solution, because that tool can be expected to track changes in the notebook file format. However, the alternative below is faster and may also be useful if you don't have nbconvert 6.0 (I have an environment running 5.6.1 at the moment…)

A very simple jq (a sort of sed for JSON) script does the trick very fast:

jq 'reduce path(.cells[]|select(.cell_type == "code")) as $cell (.; setpath($cell + ["outputs"]; []) | setpath($cell + ["execution_count"]; null))' notebook.ipynb > out-notebook.ipynb

Very simply, it identifies code cells, and replaces their outputs and execution_count attributes with [] and null respectively.


Or if you only want to remove the outputs and keep the execution numbers, an even simpler script will do:

jq 'del(.cells[]|select(.cell_type == "code").outputs[])' notebook.ipynb > out-notebook.ipynb
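If jq is not available, the outputs-only variant above can be sketched with Python's standard json module (the function name and file names here are illustrative):

```python
import json

def clear_outputs_only(nb):
    """Like the second jq command: empty code-cell outputs,
    leaving execution counts untouched."""
    for cell in nb["cells"]:
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
    return nb

# with open("notebook.ipynb") as f:
#     cleaned = clear_outputs_only(json.load(f))
# with open("out-notebook.ipynb", "w") as f:
#     json.dump(cleaned, f, indent=1)
```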

Comments


As mentioned in one of the previous answers, you can use the command-line JSON processor jq to perform this task notably more quickly than with nbconvert. A complete command for getting rid of metadata, outputs, and execution counts can be found in this blog post:

jq --indent 1 \
    '
    (.cells[] | select(has("outputs")) | .outputs) = []
    | (.cells[] | select(has("execution_count")) | .execution_count) = null
    | .metadata = {"language_info": {"name":"python", "pygments_lexer": "ipython3"}}
    | .cells[].metadata = {}
    ' 01-parsing.ipynb

If desired, you could modify it to clean just a specific part of the output, such as execution counts (recursively, wherever they occur in the JSON), and then add it as a git filter:

[filter "nbstrip"]
    clean = jq --indent 1 '(.. |."execution_count"? | select(. != null)) = null'
    smudge = cat

And add the following to ~/.config/git/attributes to have the filter applied globally to all your local repos:

*.ipynb filter=nbstrip

There is also nbstripout which is made for this purpose, but it's a bit slower.

Comments


I suggest a pre-commit approach, using something like:

  - repo: local
    hooks:
      - id: jupyter-nb-clear-output
        name: jupyter-nb-clear-output
        files: \.ipynb$
        stages: [commit]
        language: python
        entry: jupyter nbconvert --ClearOutputPreprocessor.enabled=True --inplace
        additional_dependencies: ['jupyterlab']

also explained more in this blog.

3 Comments

This update reminds me that there's also a GitHub action for cleaning notebooks, too. See here.
When I do it this way in GitHub Desktop, I get the following error: "jupyter-nb-clear-output..................................................Failed - hook id: jupyter-nb-clear-output - exit code: 1 Executable jupyter not found"
does this thread help?

Parse the JSON:

# LARGE notebook clean: make a copy FIRST and run this only on the COPY!

import json

filename = 'COPY_of_Huge_Notebook.ipynb'
with open(filename) as f:
    large_ntbk = json.load(f)

# Empty the outputs of every cell that has any
for cell in large_ntbk['cells']:
    if 'outputs' in cell:
        cell['outputs'] = []

with open('small.ipynb', 'w') as small:
    json.dump(large_ntbk, small, indent=2)

Comments


Here is my homebrew solution that I used from within a notebook to clear the 200MB output from another notebook:

with open('input.ipynb', 'r') as input_file, open('output.ipynb', 'w') as output_file:
    outblock=False 
    s2 = '   ],\n'
    s0 = '   "outputs": [],\n'
    s1 = '   "outputs": [\n'
    
    for line in input_file:
        if outblock:
            if line == s2:
                print(f'match{s2[:-1]}')
                outblock = False
                output_file.write(s0)
            continue     
        if line == s0:
            print(f'match{s0[:-1]}')
            output_file.write(line)
            continue
        if line == s1:
            print(f'match{s1[:-1]}')
            outblock = True
            continue
        output_file.write(line)

Comments


A function inspired by the preceding answers that can be called on a list of files:

import json

def ntbk_clean(nb_path):
    # nb_path: path to the notebook file
    with open(nb_path, 'rb') as f:
        notebook = json.load(f)

    # Replace each cell's outputs, where present, with an empty list
    for cell in notebook['cells']:
        if 'outputs' in cell:
            cell['outputs'] = []

    # Write the modified notebook back to the same path
    with open(nb_path, 'w') as f:
        json.dump(notebook, f, indent=2)

Comments
