Is there any direct way to run shell scripts on a Dataproc cluster? Currently I can run shell scripts through `PySparkOperator` (which calls another Python file, and that Python file then calls the shell script). I have searched many links but so far have not found any direct way.
It would be really helpful if anybody could tell me the easiest way.
There is no direct way, but in case you are not aware, you can: 1) find the running Dataproc master node name, 2) `gcloud compute ssh` to that instance. To do this programmatically, use `googleapiclient.discovery.build('dataproc', 'v1', credentials=GoogleCredentials.get_application_default())` to look up the running Dataproc cluster, then call `subprocess.Popen` with `gcloud compute ssh`, passing the correct instance name.
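The two steps above can be sketched roughly as follows. This is a minimal sketch, not a tested implementation: it assumes `google-api-python-client` and `oauth2client` are installed with application-default credentials configured, and the project, region, cluster, and script names are placeholders you would replace with your own.

```python
# Sketch: find a Dataproc cluster's master node, then run a shell
# command on it via `gcloud compute ssh` in a subprocess.
import subprocess


def build_ssh_command(instance, zone, remote_command):
    """Build the gcloud argv that runs `remote_command` on `instance` over SSH."""
    return [
        "gcloud", "compute", "ssh", instance,
        "--zone", zone,
        "--command", remote_command,
    ]


def find_master_instance(project, region, cluster_name):
    """Look up a running cluster via the Dataproc v1 API and return its
    first master instance name."""
    # Imported here so build_ssh_command stays usable without the client libs.
    from googleapiclient.discovery import build
    from oauth2client.client import GoogleCredentials

    dataproc = build(
        "dataproc", "v1",
        credentials=GoogleCredentials.get_application_default(),
    )
    cluster = dataproc.projects().regions().clusters().get(
        projectId=project, region=region, clusterName=cluster_name,
    ).execute()
    # Master node names are listed under config.masterConfig.instanceNames.
    return cluster["config"]["masterConfig"]["instanceNames"][0]


if __name__ == "__main__":
    # Placeholder project/region/cluster/zone/script values.
    master = find_master_instance("my-project", "us-central1", "my-cluster")
    cmd = build_ssh_command(master, "us-central1-a", "bash /tmp/my_script.sh")
    subprocess.Popen(cmd).wait()
```

If you are driving this from Airflow (as the `PySparkOperator` setup suggests), the same `subprocess` call can live inside a `PythonOperator` callable instead of going through a PySpark job.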