
When installing libraries directly in Databricks notebook cells via %pip install, the Python interpreter needs to be restarted. My understanding is that, in order for newly installed packages to become visible and importable in the rest of the notebook cells, the interpreter must be restarted.

How can this interpreter restart be performed programmatically?

I am installing packages via a function call, based on requirements stored in a separate file, but I noticed that the newly installed packages are not present afterwards, even though the installation itself seems to succeed, at both notebook and cluster scope. I suspect the reason is that my installation code never restarts the interpreter.
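For reference, the helper I'm using looks roughly like the sketch below (the names `parse_requirements` and `install_requirements` and the file layout are illustrative, not my exact code):

```python
# Hypothetical sketch of the installation helper: read "name==version"
# pins from a requirements file and pip-install them into the notebook's
# Python environment. Without a subsequent interpreter restart, the new
# packages may still be invisible to later cells.
import subprocess
import sys

def parse_requirements(path):
    """Return the non-empty, non-comment lines of a requirements file."""
    with open(path) as f:
        return [line.strip() for line in f
                if line.strip() and not line.strip().startswith("#")]

def install_requirements(path):
    for pkg in parse_requirements(path):
        subprocess.check_call([sys.executable, "-m", "pip", "install", pkg])
```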

Is such a behavior possible?

2 Answers


You can use this:

dbutils.library.restartPython()

For more info, see https://learn.microsoft.com/en-us/azure/databricks/dev-tools/databricks-utils#dbutils-library-restartpython
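To use this from a function that also runs outside Databricks, you can guard the call; this is a sketch under the assumption that `dbutils` is injected by the Databricks runtime and simply doesn't exist elsewhere (the helper name is mine):

```python
# Hypothetical sketch: restart the interpreter after an install so new
# packages become importable. `dbutils` is a Databricks-runtime builtin
# and is undefined outside a notebook, hence the NameError guard.
def restart_python_if_possible():
    try:
        dbutils.library.restartPython()  # noqa: F821 - Databricks builtin
        return True
    except NameError:
        # Not running inside a Databricks notebook
        return False
```

Note that after the restart, any variables defined earlier in the notebook are gone, so this should run before your main logic.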




Firstly, you can preinstall packages at the cluster level or job level before you even start the notebook. I would recommend trying that before anything else. Trust me, I've run thousands of libraries, including custom-built ones, and have never needed to do this.

On to your actual question: yes, but it will throw an exception.


This code will kill the Python process on your Databricks workspace:

%sh

# Find the PID of the notebook's Python REPL process and kill it
pid=$(ps aux | grep PythonShell | grep -v grep | awk '{print $2}')
kill -9 $pid

However, this poses a problem for you: if you run it as a bash notebook cell, it cannot be wrapped in try/catch logic, which makes automated workflows impossible.

I would have suggested calling the shell commands from Python, but that wouldn't work either, as the exception would be thrown in that cell. You could perhaps use Scala and the scala.sys.process library to achieve it, but I'm no Scala expert, sadly.
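For illustration, the PID lookup half of the shell snippet can be done from Python like this (a sketch; `find_pids` is a name I made up, and actually killing the found PID would still raise in the calling cell, as noted above):

```python
# Hypothetical sketch: the Python equivalent of
#   ps aux | grep PythonShell | grep -v grep | awk '{print $2}'
# This only demonstrates process discovery, not the kill itself.
import subprocess

def find_pids(pattern):
    """Return PIDs of processes whose command line contains `pattern`."""
    out = subprocess.run(["ps", "aux"], capture_output=True, text=True).stdout
    pids = []
    for line in out.splitlines()[1:]:  # skip the ps header row
        fields = line.split(None, 10)  # 11th field is the full command
        if len(fields) == 11 and pattern in fields[10]:
            pids.append(int(fields[1]))
    return pids

# Killing would then be (and this is what raises in the running cell):
# import os
# for pid in find_pids("PythonShell"):
#     os.kill(pid, 9)
```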

4 Comments

That's very insightful @Scott Bell, thank you for the detailed answer. I'm aware I can manually install packages on the cluster, or use pip install directly in a notebook, but neither of these seems to let me programmatically install packages by retrieving package names and version numbers from an external file. That's why I had the idea to write a custom function (using the Libraries API and POST requests), but because the interpreter is not restarted, the newly installed packages were not (always) accessible to the subsequent cells in the notebook.
So I can help with that. You should look up cluster init scripts, and this answer covers that. I have achieved what you want before, not by using the above, but by keeping a file in source control and having my CI/CD process call the Databricks API while loading this file. That is programmatic and source-controlled.
@ScottBell, I use Databricks in an enterprise environment, so I can't modify package versions at the cluster level. In the article you linked, there was a hyperlink for "Notebook-scoped Python libraries", which installs pip packages with the magic command %pip, but that still warns me to restart the Python runtime. How do I install packages once (in something like a venv scoped to the current notebook, or to me alone) and use them throughout the notebook?
