My goal is to run a Spark job on Databricks. My challenge is that I can't store files in the local filesystem, because a file saved there exists only on the driver: when my executors try to access it, it doesn't exist, since it lives in the driver's filesystem.
I want to use the Workspace to store my file; DBFS is not an option for me. In my notebook, storing the file with Python works perfectly, and I can see the file in the Databricks UI.
The following Python code works:
import os
from pathlib import Path
# Define the path where you want to create the directory and file
directory_path = Path("/Workspace/Shared/credentials/test")
file_path = directory_path / "abc.txt"
# Create the directory if it doesn't exist
os.makedirs(directory_path, exist_ok=True)
# Create and write to the file
with open(file_path, 'w') as file:
    file.write("This is a string stored in abc.txt")
I need to do the same thing in Scala, but the following code fails:
%scala
import java.nio.file.{Files, Paths, StandardOpenOption}
val directoryPath = "/Workspace/Shared/credentials/test2"
val filePath = Paths.get(directoryPath, "abc.txt")
// Create the directory if it doesn't exist
val directory = Paths.get(directoryPath)
if (!Files.exists(directory)) {
  Files.createDirectories(directory)
}
println(Files.exists(directory))
// Write the content to the file
val content = "This is a string stored in abc.txt"
Files.write(filePath, content.getBytes(), StandardOpenOption.CREATE, StandardOpenOption.TRUNCATE_EXISTING)
println(s"Directory created at: $directoryPath")
println(s"File created at: $filePath with content: '$content'")
However, I am receiving the following error:
FileSystemException: /Workspace/Shared/credentials: Operation not permitted
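One thing worth trying is to go through the classic java.io API instead of java.nio. Whether the Workspace FUSE mount accepts java.io writes depends on the Databricks Runtime version, so treat this as a sketch of something to try, not a verified fix (the helper name below is mine):

import java.io.{File, PrintWriter}

// Hypothetical helper: write a small text file via java.io instead of java.nio.
// Whether /Workspace accepts this depends on your Databricks Runtime version.
def writeTextFile(dirPath: String, fileName: String, content: String): File = {
  val dir = new File(dirPath)
  if (!dir.exists()) dir.mkdirs()   // create parent directories if needed
  val file = new File(dir, fileName)
  val writer = new PrintWriter(file)
  try writer.write(content)
  finally writer.close()
  file
}

writeTextFile("/Workspace/Shared/credentials/test2", "abc.txt",
  "This is a string stored in abc.txt")

If the JVM write is still rejected, `dbutils.fs.put("file:/Workspace/Shared/credentials/test2/abc.txt", content, true)` goes through the Databricks filesystem layer instead of the JVM and may succeed where `java.nio` does not.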
Ultimately, I want to use Spark with a Kafka configuration and specify the location of my JKS file:
spark.read
.format("kafka")
.option("includeHeaders", IncludeHeaders)
.option("kafka.bootstrap.servers", topic.innerSource.bootstrapServers.get)
.option("subscribe", parsedTopicName)
.option("kafka.security.protocol", jobConfig.kafka.get.securityProtocol)
.option("kafka.ssl.enabled.protocols", "TLSv1.2")
.option("kafka.ssl.keystore.location", MyLocationToTheWorkspace)
Do you have any suggestions on how I can use Scala to store my file in the Workspace, or in another location from which all executors can access my JKS file?
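Whatever location is chosen, it can be verified up front that every executor actually sees the file before wiring it into the Kafka options. A minimal diagnostic sketch, assuming a running Spark session (the path is a placeholder for the real JKS location):

// Probe the file from each partition, i.e. on the executors rather than the driver.
// Replace the path with wherever the JKS file ends up being stored.
val jksPath = "/Workspace/Shared/credentials/test/keystore.jks"

val visibleOnAllExecutors = spark.range(0, 1000, 1, numPartitions = 16)
  .rdd
  .mapPartitions(_ => Iterator(new java.io.File(jksPath).exists()))
  .collect()
  .forall(identity)

println(s"JKS visible on all executors: $visibleOnAllExecutors")

If this prints `false`, the Kafka consumer will fail on the executors that cannot resolve `kafka.ssl.keystore.location`, regardless of what the driver sees.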


Note that `whoami` locally isn't the same as when the code runs remotely in Databricks. On Mac/Linux, there generally is no /Workspace folder.
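A quick way to confirm that point is to probe for the mount before writing. A self-contained sketch that falls back to a temp directory when /Workspace is absent (the fallback is illustrative):

import java.nio.file.{Files, Paths}

// On a Databricks driver, /Workspace is a FUSE mount; on a local Mac/Linux
// machine it usually does not exist, so the same code takes the fallback branch.
val workspaceRoot = Paths.get("/Workspace")
val baseDir =
  if (Files.isDirectory(workspaceRoot)) Paths.get("/Workspace/Shared/credentials/test2")
  else Files.createTempDirectory("credentials-local-")  // local fallback for testing

println(s"Writing under: $baseDir")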