Given the following glsl declarations (this is just an example):
struct S{
f16vec3 a;
float16_t b;
f16vec3_t c;
float16_t d;
};
shared float16_t my_float_array[100];
shared S my_S_array[100];
I have the following questions:
- How much shared memory will be used by a given declaration, in the above example for instance?
- Which memory layout is used for variables in shared memory? std140, std430 or something else?
- How does this play with bank conflicts?
I was able to get the total shared memory required by a program using glGetProgramBinary and skiping until the begining of the text part indicated by a line starting with "!!NV":
...
!!NVcp5.0
OPTION NV_shader_buffer_load;
OPTION NV_internal;
OPTION NV_gpu_program_fp64;
OPTION NV_shader_storage_buffer;
OPTION NV_bindless_texture;
OPTION NV_gpu_program5_mem_extended;
GROUP_SIZE 4 4 4;
SHARED_MEMORY 4480;
SHARED shared_mem[] = { program.sharedmem };
...
This is rather indirect though and does not tell much about the alignment/packing rules.