You should be able to load it straight from the PWD of the running driver. In cluster mode, YARN starts the master container process in the same folder that --files dumps the file into. Client mode would be different, but for cluster mode it should work fine. For example, this works for me:
driver.py
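    from pyspark import SparkContext, SparkFiles
    import os

    with SparkContext() as sc:
        # In cluster mode the driver's PWD is its YARN container directory.
        print("PWD: " + os.getcwd())
        print("SparkFiles: " + SparkFiles.getRootDirectory())
        # --files drops data.json into that directory, so a relative path works.
        with open('data.json') as f:
            data = f.read()
        print("Success!")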
spark submit
spark-submit --deploy-mode cluster --master yarn --files data.json driver.py
Updated (comparing paths):
I updated my code to print both the PWD (which worked) and SparkFiles.getRootDirectory() (which didn't work). For some reason the paths differ. I'm not sure why that is, but loading files directly from the PWD is what I have always done for accessing files from the driver.
These are the paths it printed:
PWD: /var/lib/hadoop-yarn/cache/yarn/nm-local-dir/usercache/rwidmaier/appcache/application_1539970334177_0004/container_1539970334177_0004_01_000001
SparkFiles: /var/lib/hadoop-yarn/cache/yarn/nm-local-dir/usercache/rwidmaier/appcache/application_1539970334177_0004/spark-e869ac40-66b4-427e-a928-deef73b34e40/userFiles-a1d8e17f-b8a5-4999-8
Update #2
Apparently, --files and its brethren only guarantee to provide the files in the SparkFiles.get(..) folder on the executors, not on the driver. However, in order to ship them to the executors, Spark first downloads them to the PWD on the driver, which lets you access them from there.
It actually only mentions the executors in the help text, not the driver.
--files FILES Comma-separated list of files to be placed in the working
directory of each executor. File paths of these files
in executors can be accessed via SparkFiles.get(fileName).
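To make the lookup order concrete, here is a minimal sketch, assuming you want one helper that works on both the driver and the executors. The names resolve_shipped_file and fallback are hypothetical and not part of Spark; the helper simply checks the PWD first and then asks an optional resolver.

```python
import os


def resolve_shipped_file(name, fallback=None):
    """Return a path to a file shipped with --files.

    Hypothetical helper: looks in the current working directory first
    (where the driver's YARN container receives the file in cluster mode),
    then tries an optional resolver such as SparkFiles.get.
    """
    local_path = os.path.join(os.getcwd(), name)
    if os.path.exists(local_path):
        return local_path
    if fallback is not None:
        # The resolver is assumed to map a file name to an absolute path.
        candidate = fallback(name)
        if os.path.exists(candidate):
            return candidate
    raise IOError("Could not locate %s in the PWD or via the fallback" % name)
```

In a PySpark job you could call it as resolve_shipped_file('data.json', SparkFiles.get), so the driver picks the file up from its PWD and code running on an executor falls back to the SparkFiles directory.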
For reference, here is where the files are downloaded to the driver.
sparkContext.addPyFile(filename). It would probably also help if the file was on S3.
addPyFile is just addFile that also adds it to the PYTHONPATH. Also, in the OP's case, he is trying to read the file from the driver, so adding it from the driver doesn't buy him anything. The driver would already have access at that point!