
Yesterday I made my first attempt at training a model:

python object_detection/legacy/train.py --train_dir=CP --pipeline_config_path=faster_rcnn_inception_v2_coco.config

After a short time (10 or 20 seconds) I can no longer enter anything with the mouse or keyboard, and the nvidia-smi display stops updating. After a few minutes I reset the machine and checked the contents of CP: it was no longer empty. What I could also see is that the hard drive was working all the time.

I ran the same command a second time, but let the process continue until the morning. The CP directory had been updated (up to model.ckpt-491).

Now a few words to describe my configuration:

  • CPU: i5
  • RAM: 8 GB
  • OS: Ubuntu 18.04
  • GPU 1: GT 730, used for the display
  • GPU 2: GTX 1060

nvcc reports V9.0, and nvidia-smi gives:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 390.87                 Driver Version: 390.87                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GT 730      Off  | 00000000:01:00.0 N/A |                  N/A |
| N/A   34C    P0    N/A /  N/A |    703MiB /  2001MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce GTX 106...  Off  | 00000000:06:00.0 Off |                  N/A |
|  0%   33C    P8     4W / 120W |      2MiB /  6078MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+

Initially I installed everything to work with only one GPU (the GT 730, as I did not have the second card at that time). Yesterday I received the new video card. Without any further changes on my side, it was recognized by nvidia-smi and used directly by TensorFlow.

Now my questions:

  • Could the issue be that I did not install a driver for the new card (I do not use it for the display)?
  • Or could some setting in the config file be changed to avoid the issue? I have already reduced the maximum image size to 600*480 and lowered batch_size to 1.

Thank you for your help,
Jean-Marie

1 Answer

I bought more RAM (24 GB in total), and this time training runs fast. My computer no longer becomes unusable, and I am even able to increase the image size!

This is probably obvious to most of you, but I am posting it in case someone has the same issue.
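For anyone who wants to check for the same problem before buying RAM, here is a minimal sketch that reports available memory and swap usage. It assumes a Linux system that exposes /proc/meminfo (as Ubuntu does). The helper names parse_meminfo and summarize are only illustrative and are not part of TensorFlow or the Object Detection API. Constant disk activity together with a frozen desktop is consistent with heavy swapping, but that is an interpretation and was not measured in the original run.

```python
import os


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (kB for most keys)."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Key: value"
        fields = rest.split()
        if fields and fields[0].isdigit():
            info[key.strip()] = int(fields[0])
    return info


def summarize(info):
    """Return (available RAM in GiB, swap currently in use in GiB)."""
    kib_per_gib = 1024 ** 2
    available = info.get("MemAvailable", 0) / kib_per_gib
    swap_used = (info.get("SwapTotal", 0) - info.get("SwapFree", 0)) / kib_per_gib
    return available, swap_used


# Assumption: Linux only; on other systems this block does nothing.
if __name__ == "__main__" and os.path.exists("/proc/meminfo"):
    with open("/proc/meminfo") as f:
        available, swap_used = summarize(parse_meminfo(f.read()))
    print(f"available RAM: {available:.2f} GiB, swap in use: {swap_used:.2f} GiB")
```

Running it in a second terminal while training starts should show whether available RAM drops towards zero and swap usage grows. If so, memory rather than the GPU driver is the more likely bottleneck.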
