I'm using Google Colab's free GPUs for experimentation and want to know how much GPU memory is available to play with. torch.cuda.memory_allocated() returns the GPU memory currently occupied, but how do I determine the total available memory using PyTorch?
4 Answers
PyTorch can give you total, reserved, and allocated memory info:
import torch

t = torch.cuda.get_device_properties(0).total_memory  # total device memory
r = torch.cuda.memory_reserved(0)   # memory reserved by the caching allocator
a = torch.cuda.memory_allocated(0)  # memory occupied by tensors
f = r - a                           # free inside reserved
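For readability, these byte counts can be converted to GiB with a small helper (a sketch; the sample values below are illustrative, not taken from a real device):

```python
def cuda_mem_summary(total, reserved, allocated):
    """Format allocator statistics (all inputs in bytes) as a GiB summary."""
    gib = 1024 ** 3
    free_in_reserved = reserved - allocated  # same "free inside reserved" as above
    return (f"total={total / gib:.2f} GiB, reserved={reserved / gib:.2f} GiB, "
            f"allocated={allocated / gib:.2f} GiB, "
            f"free in reserved={free_in_reserved / gib:.2f} GiB")

# With a GPU present you would pass t, r, a from above; made-up values here:
print(cuda_mem_summary(16 * 1024**3, 4 * 1024**3, 3 * 1024**3))
# → total=16.00 GiB, reserved=4.00 GiB, allocated=3.00 GiB, free in reserved=1.00 GiB
```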
The Python bindings to NVML (pip install pynvml) can report memory for the whole GPU (the 0 here means the first GPU device):
from pynvml import *

nvmlInit()
h = nvmlDeviceGetHandleByIndex(0)
info = nvmlDeviceGetMemoryInfo(h)
print(f'total : {info.total}')
print(f'free  : {info.free}')
print(f'used  : {info.used}')
You can also check memory with nvidia-smi.
You may use nvtop, but that tool needs to be installed from source (at the time of writing).
Another tool that reports memory is gpustat (pip3 install gpustat).
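If you want nvidia-smi's numbers from Python, one approach is to run it with a CSV query and parse the result; a minimal sketch (the sample output line is illustrative, not from a real device):

```python
import csv
import io

def parse_smi_memory(csv_text):
    """Parse output of:
    nvidia-smi --query-gpu=memory.total,memory.free,memory.used \
               --format=csv,noheader,nounits
    One line per GPU; values are in MiB."""
    rows = []
    for total, free, used in csv.reader(io.StringIO(csv_text), skipinitialspace=True):
        rows.append({"total_mib": int(total),
                     "free_mib": int(free),
                     "used_mib": int(used)})
    return rows

# On a machine with an NVIDIA driver you would capture the text with
# subprocess.check_output([...the nvidia-smi command above...], text=True).
sample = "15360, 14900, 460\n"  # made-up line in the same format
print(parse_smi_memory(sample))
# → [{'total_mib': 15360, 'free_mib': 14900, 'used_mib': 460}]
```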
If you would like to use C++ CUDA (this compiles with nvcc):
#include <iostream>
#include "cuda.h"
#include "cuda_runtime_api.h"

using namespace std;

int main( void ) {
    int num_gpus;
    size_t free, total;
    cudaGetDeviceCount( &num_gpus );
    for ( int gpu_id = 0; gpu_id < num_gpus; gpu_id++ ) {
        cudaSetDevice( gpu_id );
        int id;
        cudaGetDevice( &id );
        cudaMemGetInfo( &free, &total );
        cout << "GPU " << id << " memory: free=" << free << ", total=" << total << endl;
    }
    return 0;
}
4 Comments
torch.cuda.memory_cached has been renamed to torch.cuda.memory_reserved.
Use import pynvml instead of from pynvml import *, or it may conflict with other code; for example, modeling_roberta.py throws TypeError: '_ctypes.UnionType' object is not subscriptable. Then call pynvml.nvmlInit(), h = pynvml.nvmlDeviceGetHandleByIndex(0), info = pynvml.nvmlDeviceGetMemoryInfo(h).
In recent versions of PyTorch you can also use torch.cuda.mem_get_info:
https://pytorch.org/docs/stable/generated/torch.cuda.mem_get_info.html#torch.cuda.mem_get_info
torch.cuda.mem_get_info()
It returns a tuple where the first element is the free memory and the second is the total memory of the device, both in bytes.
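A small guarded wrapper (a sketch, assuming a PyTorch version that has mem_get_info) avoids crashing on CPU-only machines:

```python
import torch

def vram_free_total_gib(device=0):
    """Return (free, total) GPU memory in GiB, or None without a CUDA device."""
    if not torch.cuda.is_available():
        return None
    free_b, total_b = torch.cuda.mem_get_info(device)  # (free, total) in bytes
    gib = 1024 ** 3
    return free_b / gib, total_b / gib

print(vram_free_total_gib())
```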
6 Comments
I prefer this over total_memory + reserved/allocated, as it provides correct numbers when other processes/users share the GPU and take up memory. To query a specific device, use with torch.cuda.device(device): info = torch.cuda.mem_get_info() — see https://github.com/pytorch/pytorch/issues/76224
I use this code:
import psutil
import torch

def get_ram():
    mem = psutil.virtual_memory()
    free = mem.available / 1024 ** 3
    total = mem.total / 1024 ** 3
    total_cubes = 24
    free_cubes = int(total_cubes * free / total)
    return f'RAM: {total - free:.2f}/{total:.2f}GB\t RAM:[' + (
        total_cubes - free_cubes) * '▮' + free_cubes * '▯' + ']'

def get_vram():
    free, total = torch.cuda.mem_get_info()  # (free, total) in bytes
    free /= 1024 ** 3
    total /= 1024 ** 3
    total_cubes = 24
    free_cubes = int(total_cubes * free / total)
    return f'VRAM: {total - free:.2f}/{total:.2f}GB\t VRAM:[' + (
        total_cubes - free_cubes) * '▮' + free_cubes * '▯' + ']'