I want to call genetic variants with DeepVariant (DV) on an HPC cluster for about 1000 cereal lines. I successfully ran DV for one line using the Docker image they provide via Apptainer/Singularity, but for the full set I want to automate the process.

My question: if I build a script that runs DV for every line, does it make a difference whether I start the container for each iteration, or should I write the script so that the container is started once and DV is called inside it for each line?

1 Answer

If at all possible, your program should avoid directly interacting with the container system. If you have the option to start the service once and make a series of requests to it, that is probably better than repeatedly starting and stopping the container.

From a development point of view, things are generally a little easier if you don't have a hard dependency on Docker. It will be easier to run your program in environments without Docker (brand-new developer systems, your CI environment), and you can more easily switch to non-Docker setups (Kubernetes, or the service running on a large remote host without a container).

There are two big problems with trying to use the Docker API to directly manage a container (or running docker CLI commands):

  1. Anyone who can run docker commands can almost trivially docker run a container that takes over the entire host system, so giving your program that access is a huge security risk.
  2. This setup would be tied very specifically to Docker proper, and you would have to write different orchestration code if you wanted to run something similar in a clustered environment like Kubernetes, or have it work nicely in a Compose-based setup.

The best case here is to launch your service dependency just once, outside your application code, and use something like an HTTP client library to send calls to it. This is similar to how you use a containerized database: you (or a Compose file) create a database container, and your application uses an ordinary database client to talk to it without doing anything Docker-specific. Avoid anything that directly uses the Docker socket.
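As a rough illustration of the "ordinary client" idea, here is a minimal sketch using only the Python standard library. The service URL, the /jobs path, and the JSON payload are all hypothetical; they stand in for whatever interface an already-running service exposes, and nothing here refers to a real DeepVariant API.

```python
import json
import urllib.request

# Hypothetical address of a service that was started once, outside this program.
SERVICE_URL = "http://localhost:8080"


def build_request(base_url, payload):
    """Build a JSON POST request for a hypothetical /jobs endpoint."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/jobs",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def submit(base_url, payload, timeout=30):
    """Send the request and decode the JSON reply (assumes the service answers in JSON)."""
    request = build_request(base_url, payload)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

The program only knows a URL, so it behaves the same whether the service runs in a container, under Kubernetes, or directly on a host.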

1 Comment

I'm not sure we are talking about the same use case. I'm not really "developing" anything; I'm using software called DeepVariant that is delivered as a Docker image, generally following this with Singularity. The only difference is that I want to do this a thousand times, with just the input file changing. So I could either loop over the files and start the container for each one, or start the container once and have it run a script that executes the actual command for each file.
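The two options described in this comment can be sketched as command builders. This is only an illustration under assumptions: the image name deepvariant.sif, the reference path, the BAM file names, and the WGS model type are placeholders; /opt/deepvariant/bin/run_deepvariant is the entry point used in the official image, but check it against the version you pulled. The single-container variant also assumes bash is available inside the image.

```python
import shlex
from pathlib import Path

# Placeholder values: adjust to your own cluster and data.
IMAGE = "deepvariant.sif"        # assumed image built from the DeepVariant Docker image
REFERENCE = "reference.fasta"    # assumed reference genome
RUN_DV = "/opt/deepvariant/bin/run_deepvariant"  # entry point inside the image


def dv_args(bam):
    """DeepVariant arguments for one input file; only the input and output change."""
    sample = Path(bam).stem
    return [
        RUN_DV,
        "--model_type=WGS",  # assumption: pick the model that matches your data
        f"--ref={REFERENCE}",
        f"--reads={bam}",
        f"--output_vcf={sample}.vcf.gz",
    ]


def per_sample_commands(bams, runtime="apptainer"):
    """Option 1: one container start per line."""
    return [[runtime, "exec", IMAGE, *dv_args(bam)] for bam in bams]


def single_container_command(bams, runtime="apptainer"):
    """Option 2: one container start; a shell inside runs DeepVariant for each line in turn."""
    script = " && ".join(shlex.join(dv_args(bam)) for bam in bams)
    return [runtime, "exec", IMAGE, "bash", "-c", script]
```

Each returned list can be passed to subprocess.run or written into a job script. With the first option every line is an independent command, which probably maps more naturally onto separate scheduler jobs. The second option runs the lines one after another inside a single container and stops at the first failure.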
