I'm getting the following exception when running a PyFlink application:

  • I'm using start-cluster.sh to start the Flink cluster
  • I'm using a Python virtual environment to run the Flink job (/root/Python3.6/venv.zip)
  • I've set the archive path in the application (t_env.add_python_archive(archive_path="/root/Python3.6/venv.zip", target_dir=None))
  • I'm using UDFs; if I take the UDFs out, the exception goes away and the job runs successfully

```
Caused by: java.io.IOException: Cannot run program "/root/Python3.6/venv.zip/venv/bin/python" (in directory "/tmp/python-dist-ffa89c4c-527b-49d8-bae3-fd2fd6d3cd67/python-archives"): error=20, Not a directory
    at java.lang.ProcessBuilder.start(ProcessBuilder.java:1048)
    at org.apache.flink.python.util.PythonEnvironmentManagerUtils.execute(PythonEnvironmentManagerUtils.java:193)
    at org.apache.flink.python.util.PythonEnvironmentManagerUtils.getPythonUdfRunnerScript(PythonEnvironmentManagerUtils.java:154)
    at org.apache.flink.python.env.beam.ProcessPythonEnvironmentManager.createEnvironment(ProcessPythonEnvironmentManager.java:156)
    at org.apache.flink.streaming.api.runners.python.beam.BeamPythonFunctionRunner.createPythonExecutionEnvironment(BeamPythonFunctionRunner.java:395)
    at org.apache.flink.streaming.api.runners.python.beam.BeamPythonFunctionRunner.lambda$open$0(BeamPythonFunctionRunner.java:243)
    at org.apache.flink.runtime.memory.MemoryManager.lambda$getSharedMemoryResourceForManagedMemory$5(MemoryManager.java:539)
    at org.apache.flink.runtime.memory.SharedResources.createResource(SharedResources.java:126)
    at org.apache.flink.runtime.memory.SharedResources.getOrAllocateSharedResource(SharedResources.java:72)
    at org.apache.flink.runtime.memory.MemoryManager.getSharedMemoryResourceForManagedMemory(MemoryManager.java:555)
    at org.apache.flink.streaming.api.runners.python.beam.BeamPythonFunctionRunner.open(BeamPythonFunctionRunner.java:246)
    at org.apache.flink.streaming.api.operators.python.AbstractPythonFunctionOperator.open(AbstractPythonFunctionOperator.java:131)
    at org.apache.flink.table.runtime.operators.python.AbstractStatelessFunctionOperator.open(AbstractStatelessFunctionOperator.java:110)
    at org.apache.flink.table.runtime.operators.python.scalar.AbstractPythonScalarFunctionOperator.open(AbstractPythonScalarFunctionOperator.java:100)
    at org.apache.flink.table.runtime.operators.python.scalar.PythonScalarFunctionOperator.open(PythonScalarFunctionOperator.java:62)
    at org.apache.flink.streaming.runtime.tasks.RegularOperatorChain.initializeStateAndOpenOperators(RegularOperatorChain.java:110)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.restoreGates(StreamTask.java:711)
    at org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$1.call(StreamTaskActionExecutor.java:55)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.restoreInternal(StreamTask.java:687)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.restore(StreamTask.java:654)
    at org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:958)
    at org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:927)
    at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:766)
    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:575)
    at java.lang.Thread.run(Thread.java:750)
Caused by: java.io.IOException: error=20, Not a directory
    at java.lang.UNIXProcess.forkAndExec(Native Method)
    at java.lang.UNIXProcess.<init>(UNIXProcess.java:247)
    at java.lang.ProcessImpl.start(ProcessImpl.java:134)
    at java.lang.ProcessBuilder.start(ProcessBuilder.java:1029)
```
  • This was a stupid mistake: I had provided the wrong path for the Python virtual environment. Additionally, I had to set python.client.executable to the same path:
    ```
    t_env.get_config().get_configuration().set_string("python.client.executable", "/root/Python3.6/venv/bin/python")
    t_env.get_config().set_python_executable("/root/Python3.6/venv/bin/python")
    ```
    Commented Mar 17, 2022 at 16:03

1 Answer

This was a careless mistake on my part: I had provided the wrong path for the Python virtual environment. I also didn't need to set the Python archive path in my case, as I'm not using the archive in the code. Additionally, I had to set the python.client.executable property to point to the same path:

t_env.get_config().get_configuration().set_string("python.client.executable", "/root/Python3.6/venv/bin/python")
t_env.get_config().set_python_executable("/root/Python3.6/venv/bin/python")
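
For context, error=20 in the trace is ENOTDIR: the path in the exception runs through venv.zip, which is a regular file, so the OS cannot descend into it to find bin/python. Below is a minimal sketch of a guard that could catch this kind of bad path before the job is submitted. The helper names check_python_executable and configure_python are hypothetical and not part of PyFlink, t_env is assumed to be an existing TableEnvironment, and the two configuration calls are the same ones shown above. The check only makes sense when the job runs on the same machine (as with start-cluster.sh here), since it inspects the local filesystem.

```python
import errno
import os


def check_python_executable(path):
    """Return `path` if it points at an executable file, otherwise raise.

    Hypothetical helper (not a PyFlink API). A path that goes through a
    regular file, such as ".../venv.zip/venv/bin/python", fails the
    isfile() check because the OS cannot traverse venv.zip.
    """
    if not os.path.isfile(path):
        raise FileNotFoundError(errno.ENOENT, "no interpreter at this path", path)
    if not os.access(path, os.X_OK):
        raise PermissionError(errno.EACCES, "interpreter is not executable", path)
    return path


def configure_python(t_env, python_path):
    """Point the client and the UDF workers at the same interpreter.

    `t_env` is assumed to be an existing TableEnvironment.
    """
    python_path = check_python_executable(python_path)
    config = t_env.get_config()
    # Interpreter used on the client side when the job is compiled.
    config.get_configuration().set_string("python.client.executable", python_path)
    # Interpreter used to launch the Python UDF workers.
    config.set_python_executable(python_path)
    return python_path


# Usage, assuming `t_env` already exists:
# configure_python(t_env, "/root/Python3.6/venv/bin/python")
```

With a guard like this, the wrong path from the question would fail on the client with a Python exception naming the path, rather than inside the task manager when the UDF operator opens.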