
I'm submitting a one-time, throwaway notebook job with:

azuredatabricks.net/api/2.0/jobs/runs/submit

$json = @"
{
    "run_name": "integration testing notebook task",
    "existing_cluster_id": "$global:clusterID",
    "timeout_seconds": 3600,
    "notebook_task": {
        "notebook_path": "$global:notebookPath"
    }
}
"@

However, rather than specifying an existing cluster ID (for a cluster I had to create myself initially), I want the run to use a cluster from an existing instance pool. How is this possible? The schema doesn't seem to accept instance_pool_id for this request.

1 Answer


You need to use a new_cluster definition in the request instead, and specify the instance_pool_id inside it, the same way as for normal clusters. Note that when a pool is used, the node type is inherited from the pool, so you don't set node_type_id here. Something like this:

$json = @"
{
    "run_name": "integration testing notebook task",
    "new_cluster": : {
      "spark_version": "7.3.x-scala2.12",
      "node_type_id": "r3.xlarge",
      "aws_attributes": {
        "availability": "ON_DEMAND"
      },
      "num_workers": 10,
      "instance_pool_id": "$global:poolID"
    },
    "timeout_seconds": 3600,
    "notebook_task": {
        "notebook_path": "$global:notebookPath"
    }
}
"@

But note that this creates a new cluster using idle instances from the pool; it does not attach the run to a cluster that is already allocated from that pool.
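
If you don't already have the pool ID handy, one way to look it up by name is via the Instance Pools API (a rough sketch; the pool name "integration-testing-pool" and the $headers / $global:workspaceUrl variables are placeholders, reusing the same auth header as the submit call):

# List the workspace's instance pools and pick the one with the matching name
$pools = Invoke-RestMethod -Method Get `
    -Uri "$global:workspaceUrl/api/2.0/instance-pools/list" `
    -Headers $headers

$global:poolID = ($pools.instance_pools |
    Where-Object { $_.instance_pool_name -eq "integration-testing-pool" }).instance_pool_id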
