
I have a fairly simple fragment shader used to handle a situation with multiple lights (code below trimmed down for clarity, only two lights shown).

The broad idea is to sum the various lighting contributions for each fragment. It works fine, but I have found that it is unusably slow on my hardware (Android HTC Desire X).

Measuring the frame rate, it becomes apparent that a single vec4 addition line is causing a drop of 10 FPS.

What could be causing this performance hit on such an apparently simple operation?

void main (void)
{
    vec4 v = u_ViewModelMatrix * vec4(v_Vertex, 1.0);
    vec3 nv = normalize(-v.xyz);
    vec3 normalVector = normalize((u_ViewModelTransposeMatrix * vec4(normalize(v_Normal), 0.0)).xyz);

    vec4 finalColour = vec4(0.0, 0.0, 0.0, 1.0);

    // LIGHT 0
    lightPosition = vec4(u_LightData[2], u_LightData[3], u_LightData[4], 1);
    lightColour = vec4(u_LightData[5], u_LightData[6], u_LightData[7], 1.0) * u_LightData[0];

    lightVector = normalize((u_ViewMatrix * lightPosition).xyz - v.xyz);
    halfwayVector = normalize(lightVector + nv);

    facing = dot(normalVector, lightVector);
    if (facing >= 0.0) {
        finalColour = finalColour + diffuseColour * facing * lightColour;
    }

    // LIGHT 1
    lightPosition = vec4(u_LightData[LIGHTS_FLOATS_PER_LIGHT*1+2],
                         u_LightData[LIGHTS_FLOATS_PER_LIGHT*1+3],
                         u_LightData[LIGHTS_FLOATS_PER_LIGHT*1+4],
                         1);
    lightColour = vec4(u_LightData[LIGHTS_FLOATS_PER_LIGHT*1+5],
                       u_LightData[LIGHTS_FLOATS_PER_LIGHT*1+6],
                       u_LightData[LIGHTS_FLOATS_PER_LIGHT*1+7],
                       1.0) * u_LightData[LIGHTS_FLOATS_PER_LIGHT*1];

    lightVector = normalize((u_ViewMatrix * lightPosition).xyz - v.xyz);
    halfwayVector = normalize(lightVector + nv);

    facing = dot(normalVector, lightVector);
    if (facing >= 0.01) {
        vec4 qwe = diffuseColour * facing * lightColour;
// HERE .............
        finalColour = finalColour + qwe;  // takes 10 fps
// HERE ^^^^^^^^^^^^^
    }

    gl_FragColor = finalColour;
}

1 Answer

Branching causes this. Avoid using ifs and for loops. Replace

if (facing >= 0.0) {
    finalColour = finalColour + diffuseColour * facing * lightColour;
}

with

finalColour += max(0.0, facing) * diffuseColour * lightColour;

and

if (facing >= 0.01) {
    vec4 qwe = diffuseColour * facing * lightColour;
    // HERE .............
    finalColour = finalColour + qwe;  // takes 10 fps
    // HERE ^^^^^^^^^^^^^
}

with

finalColour += step(0.01, facing) * facing * diffuseColour * lightColour;

Don't worry about calculating some values even when you don't need them. Since shaders are executed in parallel, you can't get much faster than the slowest instance.
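As a rough sketch, both replacements above can be folded into one helper. The function name accumulateDiffuse and its parameter names are hypothetical and do not appear in the original shader; the sketch also assumes the fragment shader has already declared a default float precision.

```glsl
// Hypothetical helper (not part of the original shader): adds one light's
// diffuse contribution to the running colour without branching.
// threshold = 0.0 behaves like max(0.0, facing), matching the first replacement;
// threshold = 0.01 matches the second replacement.
vec4 accumulateDiffuse(vec4 colour, vec4 diffuse, vec4 light,
                       float facing, float threshold)
{
    // step(edge, x) returns 0.0 when x < edge and 1.0 otherwise, so a
    // contribution below the threshold is multiplied by zero instead of skipped.
    return colour + step(threshold, facing) * facing * diffuse * light;
}
```

With this helper, the light 1 block would read finalColour = accumulateDiffuse(finalColour, diffuseColour, lightColour, facing, 0.01);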

You should also move as much work as possible to the vertex shader, since it is executed once per vertex rather than once per pixel as in the fragment shader. Basically, calculate everything that interpolates well across the triangle in the vertex shader and pass it to the fragment shader as varyings:

  • Position and color of the lights
  • Vectors L, V and H (in this example at least)
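A minimal vertex shader sketch of this idea for light 0 only. The uniform names come from the question; the attribute names a_Vertex and a_Normal, the uniform u_ProjectionMatrix, the array size of u_LightData and all varying names are assumptions, since the question does not show them.

```glsl
// Vertex shader sketch. Everything marked "assumed" is not in the question.
attribute vec3 a_Vertex;                  // assumed attribute name
attribute vec3 a_Normal;                  // assumed attribute name

uniform mat4 u_ProjectionMatrix;          // assumed; the question does not show how gl_Position is set
uniform mat4 u_ViewModelMatrix;
uniform mat4 u_ViewModelTransposeMatrix;
uniform mat4 u_ViewMatrix;
uniform float u_LightData[8];             // assumed size: light 0 only, indices 0..7

varying vec3 v_NormalVector;              // assumed varying names
varying vec3 v_ViewVector;
varying vec3 v_LightVector0;
varying vec4 v_LightColour0;

void main(void)
{
    // Eye-space position, computed once per vertex instead of once per fragment.
    vec4 v = u_ViewModelMatrix * vec4(a_Vertex, 1.0);

    // Vectors are left unnormalized here; they are renormalized per fragment.
    v_NormalVector = (u_ViewModelTransposeMatrix * vec4(a_Normal, 0.0)).xyz;
    v_ViewVector = -v.xyz;

    // Light 0 position and colour, using the same layout as the question.
    vec4 lightPosition = vec4(u_LightData[2], u_LightData[3], u_LightData[4], 1.0);
    v_LightVector0 = (u_ViewMatrix * lightPosition).xyz - v.xyz;
    v_LightColour0 = vec4(u_LightData[5], u_LightData[6], u_LightData[7], 1.0) * u_LightData[0];

    gl_Position = u_ProjectionMatrix * v;
}
```

Interpolation does not preserve vector length, so the fragment shader still needs to call normalize() on v_NormalVector, v_ViewVector and v_LightVector0 before taking dot products. The matrix multiplications and light data lookups, however, no longer run per pixel.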

2 Comments

"Don't worry if you will be calculating some values even when you don't need it. Since shaders are executed in parallel you can't get much faster than the slowest instance." - Yes, but the same is true of the branching version. Branching is not evil in itself; the cost is that unnecessary computations might be done, just as in your branchless version. The difference is that the branch-based version can skip the computations when no fragment in the whole block takes the branch, while your branchless version always performs every operation.
This was very useful. Moving some of the redundant code from the fragment shader to the vertex shader made a big difference; removing the if statements helped less.
