I have 8 GPUs and 64 CPU cores (multiprocessing.cpu_count() returns 64).
I am trying to run inference on multiple video files with a deep learning model. I want each of the 8 GPUs to process its own subset of the files, and I want each GPU to use a different set of 6 CPU cores.
Below is the Python file, inference_{gpu_id}.py. It has two inputs:
Input 1: GPU_id
Input 2: the files to process on GPU_id
import cv2
import numpy as np
from tqdm import tqdm
from torch.multiprocessing import Pool, Process, set_start_method

try:
    set_start_method('spawn', force=True)
except RuntimeError:
    pass

# gpu_id (a string such as '0') and load_model are assumed to be defined earlier in this script
model = load_model(device='cuda:' + gpu_id)

def pooling_func(file):
    preds = []
    count = 0  # number of read attempts; was never initialized before
    cap = cv2.VideoCapture(file)
    while cap.isOpened():
        ret, frame = cap.read()
        count += 1
        if ret:
            frame = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
            pred = model(frame)[0]
            preds.append(pred)
        else:
            break
    cap.release()
    np.save(file[:-4] + '.npy', preds)

def process_files():
    # all files to process on gpu_id
    files = np.load(gpu_id + '_files.npy')
    # I am hoping to use 6 cores for this gpu_id,
    # and a different 6 cores for a different GPU id
    pool = Pool(6)
    r = list(tqdm(pool.imap(pooling_func, files), total=len(files)))
    pool.close()
    pool.join()

if __name__ == '__main__':
    import multiprocessing
    multiprocessing.freeze_support()
    process_files()
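A note on the "different 6 cores per GPU" part: Pool(6) only sets the number of worker processes; it does not decide which cores they run on. A minimal sketch of one way to pin each script to its own cores, assuming Linux (os.sched_setaffinity is not available on every platform) and assuming cores are split contiguously, 6 per GPU. The helper names cores_for_gpu and pin_to_cores are made up for illustration:

```python
import os

def cores_for_gpu(gpu_id, cores_per_gpu=6):
    """Return the set of CPU core indices reserved for one GPU (hypothetical layout)."""
    start = int(gpu_id) * cores_per_gpu
    return set(range(start, start + cores_per_gpu))

def pin_to_cores(gpu_id, cores_per_gpu=6):
    """Restrict the calling process to its GPU's cores where the OS supports it."""
    cores = cores_for_gpu(gpu_id, cores_per_gpu)
    if hasattr(os, "sched_setaffinity"):
        # pid 0 means "the calling process"
        os.sched_setaffinity(0, cores)
    return cores
```

Calling pin_to_cores(gpu_id) at the top of process_files(), before the Pool is created, should be enough on Linux, since child processes inherit the parent's affinity mask.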
I am hoping to run the inference_{gpu_id}.py files on all GPUs simultaneously.
Currently, I can run it successfully on one GPU with 6 cores. But when I try to run it on all GPUs together, only GPU 0 runs; all the others stop with the error message below.
RuntimeError: CUDA error: invalid device ordinal.
The script I am running:
CUDA_VISIBLE_DEVICES=0 inference_0.py
CUDA_VISIBLE_DEVICES=1 inference_1.py
...
CUDA_VISIBLE_DEVICES=7 inference_7.py
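For reference, the same launch could be driven from Python so that all eight scripts start concurrently. This is only a sketch: build_jobs and launch_all are hypothetical helper names, and it assumes the scripts sit in the current directory and should run with the current interpreter.

```python
import os
import subprocess
import sys

def build_jobs(num_gpus=8):
    """Return one (command, environment) pair per GPU."""
    jobs = []
    for gpu_id in range(num_gpus):
        env = dict(os.environ)
        # Each child process sees exactly one physical GPU.
        env["CUDA_VISIBLE_DEVICES"] = str(gpu_id)
        cmd = [sys.executable, f"inference_{gpu_id}.py"]
        jobs.append((cmd, env))
    return jobs

def launch_all(num_gpus=8):
    """Start every script without blocking, then wait for all of them."""
    procs = [subprocess.Popen(cmd, env=env) for cmd, env in build_jobs(num_gpus)]
    return [p.wait() for p in procs]
```

launch_all() returns the exit codes, so a non-zero entry shows which GPU's script failed.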
What about setting os.environ["CUDA_VISIBLE_DEVICES"] = str(gpu_id) and then using device='cuda'?
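The likely reason this helps: once CUDA_VISIBLE_DEVICES restricts a process to one GPU, that GPU is renumbered as device 0 inside the process, so 'cuda:' + gpu_id only happens to be valid when gpu_id is 0. A small sketch of the renumbering, with no torch dependency; visible_device_names is a made-up helper and the gpu_id value is just an example:

```python
import os

def visible_device_names(visible_devices):
    """Map a CUDA_VISIBLE_DEVICES string to the device names valid inside the process."""
    ids = [d for d in visible_devices.split(",") if d.strip()]
    # Visible devices are renumbered from 0 in the order they are listed.
    return [f"cuda:{i}" for i in range(len(ids))]

gpu_id = "3"  # hypothetical value for illustration
# Set this before CUDA is initialized (safest: before importing torch).
os.environ["CUDA_VISIBLE_DEVICES"] = gpu_id
valid = visible_device_names(os.environ["CUDA_VISIBLE_DEVICES"])
# model = load_model(device=valid[0])  # 'cuda:0', which is physical GPU 3 here
```

So with the launch lines above, every script would pass device='cuda:0' (or just 'cuda') regardless of which physical GPU it was given.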