
I have created a function in Python in a Databricks notebook:

%python

import numpy as np
from pyspark.sql.functions import udf

def get_work_day(start_date, work_days_to_be_added, site_work_days, holidays_list):
  holidays_list = list(holidays_list)
  # Build the numpy weekmask: the first `site_work_days` days of the week
  # (starting Monday) are working days, e.g. 5 -> '1111100'.
  work_days = '1' * site_work_days + '0' * (7 - site_work_days)
  dt = np.busday_offset(start_date, work_days_to_be_added, roll='forward',
                        weekmask=work_days, holidays=holidays_list)
  return str(dt)

# Register under the plain name, without parentheses, so SQL can resolve it.
spark.udf.register("get_work_day", get_work_day)
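For reference, `np.busday_offset` advances a date by the given number of business days as defined by the weekmask, skipping any dates listed in `holidays`. A quick standalone check (the dates below are illustrative, not from the question):

```python
import numpy as np

# Monday 2021-05-03 plus 2 working days on a Mon-Fri weekmask ('1111100'),
# with Wednesday 2021-05-05 declared a holiday:
# Tue 05-04 is +1, Wed 05-05 is skipped, Thu 05-06 is +2.
dt = np.busday_offset('2021-05-03', 2, roll='forward',
                      weekmask='1111100', holidays=['2021-05-05'])
print(str(dt))  # 2021-05-06
```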

It works fine when I call it from the same notebook, but it throws an error when I call it from other notebooks.

I am calling the function from SQL; the SQL runs fine when executed in the same notebook but breaks when I run it in another notebook:

SELECT column_with_date_value, get_work_day(column_with_date_value, 4, 4, ('2021-05-06','2021-05-07')) FROM db.samp

The error I get is:

DatabricksExceptions$SQLExecutionException: org.apache.spark.sql.AnalysisException: Undefined function: 'get_work_day'. This function is neither a registered temporary function nor a permanent function registered in the database 'default'.; line 1 pos 28

Can anyone please tell me how to register these functions so that they can be used across notebooks?

2 Answers


I found a couple of good resources that helped me figure it out.

A forum thread that explains using `%run 'path'`: https://community.databricks.com/s/question/0D53f00001HKHdNCAX/can-i-run-one-notebook-from-another-notebook

A YouTube video that explains how to use `dbutils.notebook.run('path', timeout_in_seconds)`: https://www.youtube.com/watch?v=B1DyJScg0-k&t=180s. Note that this only lets you run another notebook, passing it variables and reading its exit value; it does not let you call functions defined in that notebook (not that I could find).

To simplify what I did: Notebook1 (path = './Notebook1') contains a function:

def PrintFunction():
    print("Hello World")
    
dbutils.notebook.exit("SUCCESS")

Notebook2 must run:

%run './Notebook1' 

in its own command cell in Databricks, and that cell must come before the cell that uses the function.

In the following command cell you can simply call the function:

PrintFunction()

Result: Hello World
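Applied to the question's UDF, the same pattern would look roughly like this (the notebook path `./udf_defs` is an assumption, as is passing the holiday list with SQL's `array(...)`). Since this mixes Databricks magic commands, it is a sketch of notebook cells rather than standalone Python:

```
# Cell in notebook './udf_defs' — define and register the UDF once:
import numpy as np

def get_work_day(start_date, work_days_to_be_added, site_work_days, holidays_list):
    # (body as in the question)
    ...

spark.udf.register("get_work_day", get_work_day)

# First cell of any consuming notebook:
%run './udf_defs'

# A later cell — the temporary function now resolves in this notebook's session:
%sql
SELECT column_with_date_value,
       get_work_day(column_with_date_value, 4, 4, array('2021-05-06', '2021-05-07'))
FROM db.samp
```

Because `%run` executes the target notebook in the caller's context, the registration happens in the calling notebook's own Spark session, which is what makes the SQL lookup succeed.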


You can combine the %run command with widgets. For example:

pgm-01:

dbutils.widgets.dropdown("even", "2", ['2', '4', '6', '8'])
print('this is the even widget:', dbutils.widgets.get('even'))

dbutils.widgets.dropdown("odd", "1", ['1', '3', '5', '9'])
print('this is the odd widget:', dbutils.widgets.get('odd'))

So you can call pgm-01 from pgm-02 and pass it a value:

%run ./pgm1 $odd='100'

