Running Python Tensorflow on CPU and GPU in parallel

Question

I need to train a very large number of Neural Nets using Tensorflow with Python. My neural nets (MLP) are ranging from very small ones (~ 2 Hidden Layers with ~30 Neurons each) to large ones (3-4 Layers with >500 neurons each).

I am able to run all of them sequencially on my GPU, which is fine. But my CPU is almost idling. Additionally I found out, that my CPU is quicker than the GPU for my very small nets (I assume because of the GPU-Overhead etc...). Thats why I want to use both my CPU and my GPU in parallel to train my nets. The CPU should process the smaller networks to the larger ones, and my GPU should process from the larger to the smaller ones, until they meet somewhere in the middle... I thought, this is a good idea :-)

So I just simply start my consumers twice in different processes. The one with device = CPU, the other one with device = GPU. Both are starting and consuming the first 2 nets as expected. But then, the GPU-consumer throws an Exception, that his tensor is accessed/violated by another process on the CPU(!), which I find weird, because it is supposed to run on the GPU...

Can anybody help me, to fully segregate my to processes?

It won't work as you expect, if you are using both CPU and GPU. They need more communication which is much slower than calculation in GPU. So it's OK, and running a small one is slower using GPU because there will be more data transporting/communication. — Sraw
– Sraw, Commented Oct 25, 2017 at 9:51
Further, "3-4 Layers with >500 neurons each" is just a baby net. — Sraw
– Sraw, Commented Oct 25, 2017 at 9:55
@Sraw: Thanks for answering. But to clarify: I dont want to train ONE SINGLE net on CPU and GPU together. I want the CPU to train net1 and GPU to train net2 independently. So there is no need for communication between both processes. In fact I want to completely segregate them. I just want to use the calc-power of my idling CPU.. — Sauer
– Sauer, Commented Oct 25, 2017 at 10:09
@Sraw: You're probbably right with the baby-net post, but my dataset isn't challenging enough for larger networks... — Sauer
– Sauer, Commented Oct 25, 2017 at 10:10

Max F. · Accepted Answer · 2018-01-29 13:11:14Z

1

Do any of your networks share operators? E.g. they use variables with the same name in the same variable_scope which is set to variable_scope(reuse=True)

Then multiple nets will try to reuse the same underlying Tensor structures.

Also check it tf.ConfigProto.allow_soft_placement is set to True or False in your tf.Session. If True you can't be guaranteed that the device placement will be actually executed in the way you intended in your code.

answered Jan 29, 2018 at 13:11

Max F.

3382 silver badges8 bronze badges

Sign up to request clarification or add additional context in comments.

Collectives™ on Stack Overflow

Running Python Tensorflow on CPU and GPU in parallel

1 Answer 1

Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

1 Answer 1

Comments

Your Answer

Sign up or log in

Post as a guest

Related