I am trying to copy a file to Azure Databricks DBFS from an Azure DevOps pipeline. The following is a snippet from the YAML file I am using:

stages:
- stage: MYBuild
  displayName: "My Build"
  jobs:
    - job: BuildwhlAndRunPytest
      pool:
        vmImage: 'ubuntu-16.04'

      steps:
      - task: UsePythonVersion@0
        displayName: 'Use Python 3.7'
        inputs:
          versionSpec: '3.7'
          addToPath: true
          architecture: 'x64'

      - script: |
          pip install pytest requests setuptools wheel pytest-cov
          pip install -U databricks-connect==7.3.*
        displayName: 'Load Python Dependencies'

      - checkout: self
        persistCredentials: true
        clean: true

      - script: |
          echo "y
          $(databricks-host)
          $(databricks-token)
          $(databricks-cluster)
          $(databricks-org-id)
          8787" | databricks-connect configure
          databricks-connect test
        env:
          databricks-token: $(databricks-token)
        displayName: 'Configure DBConnect'

      - script: |
          databricks fs cp test-proj/pyspark-lib/configs/config.ini dbfs:/configs/test-proj/config.ini

I get the following error at the step where I invoke the databricks fs cp command:

/home/vsts/work/_temp/2278f7d5-1d96-4c4e-a501-77c07419773b.sh: line 7: databricks: command not found

However, databricks-connect test runs successfully in the same job. Am I missing a step somewhere?

1 Answer

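The databricks command is provided by the databricks-cli package, not by databricks-connect, so you need to install databricks-cli in your pip install step. databricks-connect test works because databricks-connect ships its own, separate executable.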

Also, the databricks command reads its connection settings from the DATABRICKS_HOST and DATABRICKS_TOKEN environment variables, so you can just set them on the step, like this:

- script: |
    pip install pytest requests setuptools wheel
    pip install -U databricks-cli
  displayName: 'Load Python Dependencies'

- script: |
    databricks fs cp ... dbfs:/...
  env:
    DATABRICKS_HOST: $(DATABRICKS_HOST)
    DATABRICKS_TOKEN: $(DATABRICKS_TOKEN)
  displayName: 'Copy artifacts'
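
If it helps to catch the "command not found" case earlier, a check along these lines could run before the copy step. This is only a minimal sketch: it assumes the CLI was installed with pip in an earlier step of the same job, and the step name 'Verify Databricks CLI' is made up for illustration.

```yaml
# Hypothetical extra step, placed before 'Copy artifacts'.
- script: |
    # Fail fast with a readable message if the CLI is not on PATH.
    if ! command -v databricks >/dev/null 2>&1; then
      echo "databricks CLI not found; install it with: pip install -U databricks-cli" >&2
      exit 1
    fi
    # Print the installed CLI version in the build log.
    databricks --version
  displayName: 'Verify Databricks CLI'
```

The check needs no credentials, so the env block is only required on the step that actually talks to the workspace.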

P.S. Here is an example of how to do CI/CD on Databricks with notebooks. You might also be interested in the cicd-templates project.
