
I'm trying to set up Databricks Connect to work with a remote Databricks cluster already running in a workspace on Azure. When I run the command 'databricks-connect test', it never finishes.

I followed the official documentation.

I've installed the most recent Anaconda distribution (Python 3.7) and created a local environment:

    conda create --name dbconnect python=3.5

I've installed 'databricks-connect' version 5.1, which matches the runtime of my cluster on Azure Databricks:

    pip install -U databricks-connect==5.1.*

I've already run 'databricks-connect configure' as follows:

    (base) C:\>databricks-connect configure
    The current configuration is:
    * Databricks Host: ******.azuredatabricks.net
    * Databricks Token: ************************************
    * Cluster ID: ****-******-*******
    * Org ID: ****************
    * Port: 8787
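
Independently of rerunning the CLI, these values can be inspected directly: as far as I know, 'databricks-connect configure' persists them as JSON in a '.databricks-connect' file in the home directory. A minimal sketch, where the path and the key names ("host", "token", "cluster_id", "org_id", "port") are assumptions based on the 5.x client:

```python
import json

def read_dbconnect_config(text):
    """Parse the JSON that 'databricks-connect configure' is assumed to write.

    The key names used here are an assumption based on the 5.x client;
    adjust them if your file differs.
    """
    cfg = json.loads(text)
    return {k: cfg.get(k) for k in ("host", "token", "cluster_id", "org_id", "port")}

# Typical usage (reads the assumed config path ~/.databricks-connect):
# from pathlib import Path
# print(read_dbconnect_config((Path.home() / ".databricks-connect").read_text()))
```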

After the above steps, I run the test command for Databricks Connect:

    databricks-connect test

The procedure starts but hangs after a warning about MetricsSystem, as shown below:

    (dbconnect) C:\>databricks-connect test
    * PySpark is installed at c:\users\miltad\appdata\local\continuum\anaconda3\envs\dbconnect\lib\site-packages\pyspark
    * Checking java version
    java version "1.8.0_181"
    Java(TM) SE Runtime Environment (build 1.8.0_181-b13)
    Java HotSpot(TM) 64-Bit Server VM (build 25.181-b13, mixed mode)
    * Testing scala command
    19/05/31 08:14:26 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
    Setting default log level to "WARN".
    To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
    19/05/31 08:14:34 WARN MetricsSystem: Using default name SparkStatusTracker for source because neither spark.metrics.namespace nor spark.app.id is set. 

I expect the process to continue to the next steps, as shown in the official documentation:

    * Testing scala command
    18/12/10 16:38:44 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
    Setting default log level to "WARN".
    To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
    18/12/10 16:38:50 WARN MetricsSystem: Using default name SparkStatusTracker for source because neither spark.metrics.namespace nor spark.app.id is set.
    18/12/10 16:39:53 WARN SparkServiceRPCClient: Now tracking server state for 5abb7c7e-df8e-4290-947c-c9a38601024e, invalidating prev state
    18/12/10 16:39:59 WARN SparkServiceRPCClient: Syncing 129 files (176036 bytes) took 3003 ms
    Welcome to
          ____              __
         / __/__  ___ _____/ /__
        _\ \/ _ \/ _ `/ __/  '_/
       /___/ .__/\_,_/_/ /_/\_\   version 2.4.0-SNAPSHOT
          /_/

    Using Scala version 2.11.12 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_152)
    Type in expressions to have them evaluated.
    Type :help for more information.

So my process hangs after 'WARN MetricsSystem: Using default name SparkStatusTracker'.

What am I doing wrong? Is there something more I should configure?

  • It looks like this feature is in private preview so I wonder if that is causing the issue. Commented May 31, 2019 at 11:24
  • @Jon yes, I can confirm the feature is in preview, but I've seen on forums that people do use it. My case seems to be a technical problem specific to my configuration, but I don't know what I should check or fix. Commented May 31, 2019 at 13:12
  • Oh interesting. I was going to try it myself when I got a chance today. I'll let you know how it goes :) Commented May 31, 2019 at 13:13
  • 1
    Fantastic, please let me know how you dealt with it. I have only my company laptop to test it, so at the same time I've a lot of security restrictions. I wonder how it might behave on another configuration. Good luck Jon. Commented May 31, 2019 at 13:39
  • Lots of people seem to be seeing this issue with the test command on Windows. But if you try to use Databricks Connect itself, it works fine. Commented May 31, 2019 at 16:02

3 Answers


It looks like this feature isn't officially supported on runtime 5.3 or below. If there are limitations on updating the runtime, I would make sure the Spark conf is set as follows: spark.databricks.service.server.enabled true. However, with the older runtimes things still might be flaky. I would recommend doing this with runtime 5.5, or 6.1 and above.
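
That flag goes into the cluster's Spark config (in the Databricks UI: Clusters → Edit → Advanced Options → Spark). A sketch of the entry, with a second, optional line shown purely as an assumption for clusters pinned to a non-default port:

    spark.databricks.service.server.enabled true
    # optional, and an assumption on my part -- only if your cluster uses a non-default port:
    # spark.databricks.service.port 15001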


Lots of people seem to be seeing this issue with the test command on Windows, but if you actually use Databricks Connect, it works fine. It seems safe to ignore.


Port 8787 was used for Azure in the past, but 15001 is now used for both Azure and AWS. Very old clusters may still be using 8787, but all new clusters use 15001. Change the port by running

    databricks-connect configure

again, entering the same configuration values but setting the port to 15001. After that, I ran the test command,

    databricks-connect test

and it worked.
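
The port change can also be scripted instead of retyping every value in the wizard. A hedged sketch, assuming (as above) that the 5.x client keeps its settings as JSON in ~/.databricks-connect with a "port" key:

```python
import json

def set_dbconnect_port(cfg_text, port=15001):
    """Return the config JSON with the port switched (15001 is the current default)."""
    cfg = json.loads(cfg_text)
    cfg["port"] = port
    return json.dumps(cfg)

# Usage (the path and the "port" key name are assumptions based on the 5.x client):
# from pathlib import Path
# p = Path.home() / ".databricks-connect"
# p.write_text(set_dbconnect_port(p.read_text()))
```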
