Further to my comment to the OP, here is a minimal working example (MWE) of my suggested solution, which I have tested. I've included fairly verbose comments, docstrings, and output to explain both the approach and the code, so it should be self-explanatory. It also includes basic error handling, though you may want to add more robust validation for your own use.
Note that most of what follows is gleaned from the official documentation for nbformat, the standard/recommended library for programmatic notebook manipulation.
As mentioned, nbformat doesn't have built-in section slicing, but we can implement LaTeX-style \include functionality by:
- Reading notebooks with nbformat.read()
- Extracting cells between headings (by finding markdown cells starting with # Heading)
- Concatenating the desired sections
- Writing the result with nbformat.write()
This gives us a build script where we list sections as [(filename, heading_name)] tuples, similar to $\LaTeX$'s \include command.
For demonstration purposes, we first create the three (very simple) sample notebooks from the OP (lists.ipynb, functions.ipynb, modules.ipynb), each with one or more sections marked by level-1 headings (# Heading$^\ddagger$). Then we define a function, extract_section(), that reads a notebook and extracts all cells between the specified heading and the next level-1 heading. Finally, we define two merge functions: merge_notebooks() for the simple case of merging entire notebooks, and merge_sections() for merging specific sections, $\LaTeX$-style, from [(filename, heading)] tuples.
And of course the MWE would not be complete without showing how to use the code in practice.
import nbformat
from nbformat.v4 import new_notebook, new_markdown_cell, new_code_cell


def create_sample_notebooks():
    """Create sample notebooks for demonstration"""
    # Create lists.ipynb
    nb_lists = new_notebook()
    nb_lists.cells = [
        new_markdown_cell("# Lists\n\nIntroduction to Python lists."),
        new_code_cell("my_list = [1, 2, 3, 4, 5]\nprint(my_list)"),
        new_markdown_cell("Lists are mutable and can contain any type."),
        new_markdown_cell("# List Methods\n\nCommon list operations."),
        new_code_cell("my_list.append(6)\nprint(my_list)")
    ]
    with open('lists.ipynb', 'w', encoding='utf-8') as f:
        nbformat.write(nb_lists, f)

    # Create functions.ipynb
    nb_functions = new_notebook()
    nb_functions.cells = [
        new_markdown_cell("# Functions\n\nDefining and using functions in Python."),
        new_code_cell("def greet(name):\n    return f'Hello, {name}!'\n\nprint(greet('World'))"),
        new_markdown_cell("Functions help organise and reuse code."),
        new_markdown_cell("# Advanced Functions\n\nLambda functions and decorators."),
        new_code_cell("square = lambda x: x**2\nprint(square(5))")
    ]
    with open('functions.ipynb', 'w', encoding='utf-8') as f:
        nbformat.write(nb_functions, f)

    # Create modules.ipynb
    nb_modules = new_notebook()
    nb_modules.cells = [
        new_markdown_cell("# Modules\n\nImporting and using Python modules."),
        new_code_cell("import math\nprint(f'Pi is approximately {math.pi:.2f}')"),
        new_markdown_cell("Modules extend Python's functionality.")
    ]
    with open('modules.ipynb', 'w', encoding='utf-8') as f:
        nbformat.write(nb_modules, f)

    print("Created sample notebooks: lists.ipynb, functions.ipynb, modules.ipynb")

def extract_section(notebook_file, heading_name):
    """
    Extract cells between a specific heading and the next top-level heading.

    Args:
        notebook_file: Input notebook filename
        heading_name: The heading text to search for (without the # prefix)

    Returns:
        List of cells in the section, or empty list if heading not found
    """
    try:
        with open(notebook_file, 'r', encoding='utf-8') as f:
            nb = nbformat.read(f, as_version=4)
    except FileNotFoundError:
        print(f"Error: {notebook_file} not found")
        return []
    except Exception as e:
        print(f"Error reading {notebook_file}: {e}")
        return []

    section_cells = []
    in_section = False
    for cell in nb.cells:
        if cell.cell_type == 'markdown' and cell.source.startswith('# '):
            # Extract heading text: drop only the leading '# ' (a blanket
            # replace() would also mangle headings such as '# C# Basics'),
            # then strip surrounding whitespace
            cell_heading = cell.source.split('\n')[0][2:].strip()
            if cell_heading == heading_name:
                in_section = True
                section_cells.append(cell)
            elif in_section:
                # Hit the next top-level heading, stop
                break
        elif in_section:
            section_cells.append(cell)

    if not section_cells:
        print(f"Warning: Heading '{heading_name}' not found in {notebook_file}")
    return section_cells

def merge_notebooks(notebook_files, output_file, add_separators=False):
    """
    Merge multiple Jupyter notebooks into a single notebook.

    Args:
        notebook_files: List of input notebook filenames
        output_file: Output notebook filename
        add_separators: If True, add markdown separators between notebooks
    """
    if not notebook_files:
        print("Error: No notebook files specified")
        return

    merged = None
    for i, fname in enumerate(notebook_files):
        print(f" Reading {fname}...")
        try:
            with open(fname, 'r', encoding='utf-8') as f:
                nb = nbformat.read(f, as_version=4)
            if merged is None:
                merged = nb
            else:
                # Add separator between notebooks if requested
                if add_separators and i > 0:
                    merged.cells.append(
                        new_markdown_cell(f"\n---\n\n*Source: {fname}*\n")
                    )
                # Extend cells from subsequent notebooks
                merged.cells.extend(nb.cells)
        except FileNotFoundError:
            print(f"Error: {fname} not found, skipping...")
            continue
        except Exception as e:
            print(f"Error reading {fname}: {e}, skipping...")
            continue

    if merged is None:
        print("Error: No notebooks could be merged")
        return

    print(f"Writing merged notebook to {output_file}")
    with open(output_file, 'w', encoding='utf-8') as f:
        nbformat.write(merged, f)

def merge_sections(sections, output_file, add_separators=False):
    """
    Merge specific sections from multiple notebooks (LaTeX-style \\include).

    Args:
        sections: List of tuples [(filename, heading_name), ...]
        output_file: Output notebook filename
        add_separators: If True, add markdown separators between sections

    Example:
        merge_sections([
            ('lists.ipynb', 'Lists'),
            ('functions.ipynb', 'Functions')
        ], 'custom_lecture.ipynb')
    """
    if not sections:
        print("Error: No sections specified")
        return

    merged = new_notebook()
    for i, (fname, heading) in enumerate(sections):
        print(f" Extracting '{heading}' from {fname}...")
        section_cells = extract_section(fname, heading)
        if section_cells:
            # Add separator between sections if requested
            if add_separators and i > 0:
                merged.cells.append(
                    new_markdown_cell(f"\n---\n\n*Source: {fname} → {heading}*\n")
                )
            merged.cells.extend(section_cells)

    if not merged.cells:
        print("Error: No sections could be extracted")
        return

    print(f"Writing merged notebook to {output_file}")
    with open(output_file, 'w', encoding='utf-8') as f:
        nbformat.write(merged, f)

if __name__ == '__main__':
    # Step 1: Create sample notebooks
    print("Step 1: Creating sample notebooks...")
    create_sample_notebooks()

    # Step 2: Example 1 - Merge entire notebooks (basic approach)
    print("\nStep 2: Merging entire notebooks...")
    merge_notebooks(
        notebook_files=[
            'lists.ipynb',
            'functions.ipynb',
            'modules.ipynb'
        ],
        output_file='combined_lecture.ipynb',
        add_separators=False
    )

    # Step 3: Example 2 - Merge specific sections (LaTeX-style \include)
    print("\nStep 3: Merging specific sections (LaTeX-style)...")
    merge_sections(
        sections=[
            ('lists.ipynb', 'Lists'),
            ('functions.ipynb', 'Functions'),
        ],
        output_file='custom_lecture.ipynb',
        add_separators=True
    )

    # Step 4: Example 3 - Cherry-pick subsections
    print("\nStep 4: Cherry-picking specific subsections...")
    merge_sections(
        sections=[
            ('lists.ipynb', 'Lists'),
            ('functions.ipynb', 'Advanced Functions'),  # Only the advanced section
        ],
        output_file='advanced_topics.ipynb',
        add_separators=True
    )

When running the code, you should see the following output:
Step 1: Creating sample notebooks...
Created sample notebooks: lists.ipynb, functions.ipynb, modules.ipynb

Step 2: Merging entire notebooks...
 Reading lists.ipynb...
 Reading functions.ipynb...
 Reading modules.ipynb...
Writing merged notebook to combined_lecture.ipynb

Step 3: Merging specific sections (LaTeX-style)...
 Extracting 'Lists' from lists.ipynb...
 Extracting 'Functions' from functions.ipynb...
Writing merged notebook to custom_lecture.ipynb

Step 4: Cherry-picking specific subsections...
 Extracting 'Lists' from lists.ipynb...
 Extracting 'Advanced Functions' from functions.ipynb...
Writing merged notebook to advanced_topics.ipynb
After running the script once to generate the sample notebooks, you can customise it for your own needs:
# Merge entire notebooks
merge_notebooks(
    notebook_files=['intro.ipynb', 'advanced.ipynb'],
    output_file='complete_course.ipynb'
)

# LaTeX-style: select specific sections only
merge_sections(
    sections=[
        ('basics.ipynb', 'Variables'),
        ('basics.ipynb', 'Functions'),
        ('advanced.ipynb', 'Classes'),
    ],
    output_file='custom_lesson.ipynb'
)
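If you are unsure which heading names a notebook offers, it can help to list them before writing the sections list. The sketch below is not part of the tested script above: list_top_level_headings() is a hypothetical helper that reads the .ipynb file as plain JSON with the standard library only, assuming the v4 layout (a top-level cells list, with each cell's source stored either as one string or as a list of lines).

```python
import json


def list_top_level_headings(notebook_file):
    """Return the text of every level-1 heading ('# ...') in a notebook.

    Hypothetical helper: assumes the nbformat v4 JSON layout.
    """
    with open(notebook_file, 'r', encoding='utf-8') as f:
        nb = json.load(f)

    headings = []
    for cell in nb.get('cells', []):
        if cell.get('cell_type') != 'markdown':
            continue
        source = cell.get('source', '')
        # On disk, 'source' may be a single string or a list of lines
        if isinstance(source, list):
            source = ''.join(source)
        first_line = source.split('\n')[0]
        if first_line.startswith('# '):
            # Same matching rule as extract_section(): level-1 headings only
            headings.append(first_line[2:].strip())
    return headings
```

For the sample lists.ipynb created above, list_top_level_headings('lists.ipynb') should return ['Lists', 'List Methods'], which are exactly the names merge_sections() accepts for that file.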
$\ddagger$ Note that the extract_section() function extracts content under a level-1 heading (# Heading), as per the OP, and stops at the next level-1 heading. This means:
- Sub-headings (##, ###, etc.) within a section are correctly included as content, and
- it should work fine if you use # exclusively for major module boundaries.
However, if your actual notebooks use Level-2 headings (##) to separate major topics, you'll need to modify the stop condition in extract_section() to check for both # and ## headings. I hope that makes sense!
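One way to make that change, sketched under assumptions rather than tested against real notebooks: slice_section() below is a hypothetical variant of the loop inside extract_section(). It takes a max_level parameter (not in the original code) and treats any heading of that level or shallower as a boundary. It works directly on a list of cells using dict-style access, so it can be tried on plain dictionaries as well as on nb.cells.

```python
import re

# One to six '#' characters, at least one space, then the heading text
HEADING_RE = re.compile(r'^(#{1,6})\s+(.*)$')


def heading_of(cell):
    """Return (level, text) if a markdown cell starts with a heading, else None."""
    if cell['cell_type'] != 'markdown':
        return None
    first_line = cell['source'].split('\n')[0]
    match = HEADING_RE.match(first_line)
    if match is None:
        return None
    return len(match.group(1)), match.group(2).strip()


def slice_section(cells, heading_name, max_level=1):
    """Collect cells from the named heading up to the next boundary heading.

    A boundary is any heading whose level is <= max_level, so max_level=2
    stops at both '#' and '##' headings (an assumption about what you want).
    """
    section = []
    in_section = False
    for cell in cells:
        heading = heading_of(cell)
        is_boundary = heading is not None and heading[0] <= max_level
        if is_boundary:
            if in_section:
                break  # reached the next boundary heading
            if heading[1] == heading_name:
                in_section = True
        if in_section:
            section.append(cell)
    return section
```

Inside extract_section(), the for loop could then be replaced by something like section_cells = slice_section(nb.cells, heading_name, max_level=2). With the default max_level=1 it matches only level-1 headings, as the original loop does.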