
I'm submitting a one-time, throwaway notebook job with:

azuredatabricks.net/api/2.0/jobs/runs/submit

$json = @"
{
    "run_name": "integration testing notebook task",
    "existing_cluster_id": "$global:clusterID",
    "timeout_seconds": 3600,
    "notebook_task": {
        "notebook_path": "$global:notebookPath"
    }
}
"@

However, rather than specifying an existing cluster ID (for a cluster I had to create myself initially), I want the run to use a cluster from an existing instance pool. How is this possible? The schema doesn't seem to accept instance_pool_id for this request.

1 Answer


You need to use a new_cluster definition in the request instead, and specify the instance_pool_id inside it, the same way as for normal clusters. Note that when a pool is used, the node type is inherited from the pool, so you don't set node_type_id here. Something like this:

$json = @"
{
    "run_name": "integration testing notebook task",
    "new_cluster": : {
      "spark_version": "7.3.x-scala2.12",
      "node_type_id": "r3.xlarge",
      "aws_attributes": {
        "availability": "ON_DEMAND"
      },
      "num_workers": 10,
      "instance_pool_id": "$global:poolID"
    },
    "timeout_seconds": 3600,
    "notebook_task": {
        "notebook_path": "$global:notebookPath"
    }
}
"@

But note that this creates a new cluster using idle instances from the pool; it does not attach the run to a cluster that is already allocated from that pool.
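
If you don't already have the pool ID handy, one way to look it up by name is via the Instance Pools API (a rough sketch; the pool name "integration-testing-pool" and the $headers / $global:workspaceUrl variables are placeholders, reusing the same auth header as the submit call):

# List the workspace's instance pools and pick the one with the matching name
$pools = Invoke-RestMethod -Method Get `
    -Uri "$global:workspaceUrl/api/2.0/instance-pools/list" `
    -Headers $headers

$global:poolID = ($pools.instance_pools |
    Where-Object { $_.instance_pool_name -eq "integration-testing-pool" }).instance_pool_id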
