I am trying to migrate PySpark code from a Jupyter notebook to a Python script. However, when I try to use
from pyspark.sql import SparkSession
I get the error: No module named 'pyspark'
- I found every python3 and python2 on the system, ran each one as a shell, and tried to import pyspark in each. Every shell gave the same No module named 'pyspark' error.
- When I tried to import findspark with python3/python2, I got No module named 'findspark'.
- echo $PYTHONPATH and echo $SPARK_HOME both return an empty string.
- I found every spark-submit and ran my script with each of them instead of python3. However, I got an error at my argparse usage:

      File "/export/home/osvechkarenko/brdmp_10947/automation_001/py_dynamic_report.py", line 206
        if args.print:
                    ^
    SyntaxError: invalid syntax

When I ran my script with python3 (without pyspark), it worked fine.
pyspark.__file__? That helps us to identify which of your envs works.
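A rough sketch of one way to collect that information, assuming Python 3.4+ is available: run the snippet below once in the notebook kernel and once with each candidate interpreter, then compare the output. The helper name `describe_module` is made up for illustration; it reports where a module would be loaded from without failing when the module is missing.

```python
import importlib.util
import sys


def describe_module(name):
    """Return the path a top-level module would be loaded from, or None if it cannot be found."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    return spec.origin


# Which interpreter is running this code, and which version it is.
print("interpreter:", sys.executable)
print("version:", sys.version.split()[0])

# Where (if anywhere) this interpreter can find the two packages in question.
for name in ("pyspark", "findspark"):
    print(name, "->", describe_module(name))
```

If the notebook kernel prints a path for pyspark and the shell interpreters print None, the notebook is probably running in a different environment from the one used on the command line.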