Nicol Bolas

Parallelism in GPU's rasterization process

Based on this [article][1], I understood the basic principle of the rasterization algorithm:

    For every triangle
        Compute Projection, color at vertices
        Setup line equations
        Compute bbox, clip bbox to screen limits
        For all pixels in bbox
            Increment line equations
            Compute currentZ
            Compute currentColor
            If all line equations>0 //pixel [x,y] in triangle
                If currentZ<zBuffer[x,y] //pixel is visible
                    Framebuffer[x,y]=currentColor
                    zBuffer[x,y]=currentZ
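To make the pseudocode concrete, here is a minimal serial sketch in Python. It is only an illustration under a few assumptions that are not in the article: the names `edge` and `rasterize` are hypothetical, vertices are taken to be already projected to screen space, the line equations are evaluated directly at each pixel center instead of being incremented, and each triangle has one flat color instead of interpolated vertex colors.

```python
import math


def edge(a, b, p):
    """Line equation of edge a->b evaluated at p (twice the signed area of a, b, p)."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])


def rasterize(triangles, width, height):
    """Serial z-buffer rasterizer over (v0, v1, v2, color) tuples.

    Each vertex is an already projected (x, y, z) screen-space tuple.
    Returns (framebuffer, zbuffer), both indexed as [y][x].
    """
    framebuffer = [[None] * width for _ in range(height)]
    zbuffer = [[math.inf] * width for _ in range(height)]
    for v0, v1, v2, color in triangles:
        area = edge(v0, v1, v2)
        if area == 0:
            continue  # degenerate triangle covers no pixels
        if area < 0:
            v1, v2, area = v2, v1, -area  # flip winding so inside means all positive
        xs = (v0[0], v1[0], v2[0])
        ys = (v0[1], v1[1], v2[1])
        # Bounding box, clipped to the screen limits.
        x_min, x_max = max(math.floor(min(xs)), 0), min(math.ceil(max(xs)), width)
        y_min, y_max = max(math.floor(min(ys)), 0), min(math.ceil(max(ys)), height)
        for y in range(y_min, y_max):
            for x in range(x_min, x_max):
                p = (x + 0.5, y + 0.5)  # sample at the pixel center
                w0, w1, w2 = edge(v1, v2, p), edge(v2, v0, p), edge(v0, v1, p)
                if w0 > 0 and w1 > 0 and w2 > 0:  # pixel [x, y] is in the triangle
                    # Barycentric interpolation of depth across the triangle.
                    z = (w0 * v0[2] + w1 * v1[2] + w2 * v2[2]) / area
                    if z < zbuffer[y][x]:  # pixel is visible
                        framebuffer[y][x] = color
                        zbuffer[y][x] = z
    return framebuffer, zbuffer


# Two triangles covering the same pixels; blue is nearer (smaller z).
red = ((0, 0, 0.5), (4, 0, 0.5), (0, 4, 0.5), "red")
blue = ((0, 0, 0.25), (4, 0, 0.25), (0, 4, 0.25), "blue")
fb, zb = rasterize([red, blue], 4, 4)
```

Because everything runs in one thread, the depth test and the two buffer writes for a pixel can never interleave with another triangle's, which is exactly the property that is unclear in the parallel case.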

What I don't understand is how it is implemented in parallel inside the GPU.

I can think of two possible implementations of the algorithm inside the GPU:

  1. The first way is to draw the triangles one after another (in one thread), while all the pixels inside each triangle are processed in parallel. What bothers me about this approach is that it would be really slow for a large number of triangles.

  2. The second way is to draw all the triangles in parallel, with the pixels inside each triangle also processed in parallel. This looks efficient to me, but I see a problem in how the zBuffer and Framebuffer data are synchronized: if two pixels try to occupy the same spot, two threads will try to write to the same data at the same time. Since there are two buffers that need to be updated, I don't see how it could happen atomically.
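The hazard in the second option can be sketched by interleaving two fragments by hand. This is a toy model, not a description of real hardware: `fragment_steps` and `run` are hypothetical names, each generator stands in for one thread, and every `yield` marks a point where the other thread is allowed to run.

```python
import math


def fragment_steps(frag, framebuffer, zbuffer, xy):
    """One fragment's work on pixel xy, pausing between memory operations."""
    color, z = frag
    visible = z < zbuffer[xy]  # read the depth buffer
    yield
    if visible:
        framebuffer[xy] = color  # first write
        yield
        zbuffer[xy] = z  # second write
        yield


def run(schedule, frags):
    """Advance the fragment 'threads' one step at a time in the given order."""
    xy = (0, 0)
    framebuffer, zbuffer = {xy: None}, {xy: math.inf}
    threads = [fragment_steps(f, framebuffer, zbuffer, xy) for f in frags]
    for tid in schedule:
        next(threads[tid], None)  # a finished thread simply does nothing
    return framebuffer[xy], zbuffer[xy]


near, far = ("red", 0.25), ("blue", 0.75)

# One fragment at a time, in either order: the near fragment wins.
serial_a = run([0, 0, 0, 1, 1, 1], [near, far])
serial_b = run([1, 1, 1, 0, 0, 0], [near, far])

# Both fragments pass the depth test before either one writes.
far_wins = run([0, 1, 0, 0, 1, 1], [near, far])
mixed = run([0, 1, 0, 1, 1, 0], [near, far])
```

In this model `far_wins` ends up as `("blue", 0.75)` and `mixed` as `("blue", 0.25)`: the second case leaves the color of one triangle paired with the depth of the other, which is the two-buffer inconsistency described above.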

Another observation I have is that when I draw two triangles at the same coordinates, it's always the last one that gets drawn. This keeps me from thinking that there's some atomic way of doing the pixel calculations, because if there were, the output pixels would be picked at random from the colors of the two input triangles.
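The reasoning behind this observation can be written down as a small sketch. It assumes no depth test is involved, and `resolve` is a hypothetical helper that models a single pixel receiving color writes in some order.

```python
from itertools import permutations


def resolve(writes):
    """Apply color writes to one pixel in the given order; the last write stays."""
    pixel = None
    for color in writes:
        pixel = color
    return pixel


submitted = ["red", "blue"]  # two triangles covering the same pixel

# Writes applied in submission order: always the last triangle.
in_order = resolve(submitted)

# Writes applied in whatever order the threads happen to finish.
any_order = {resolve(order) for order in permutations(submitted)}
```

Here `in_order` is always `"blue"`, while `any_order` contains both colors. Since the observed output is always the last triangle, the writes to a pixel appear to be applied in submission order, whatever happens in parallel before that point.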

My guess is that the real implementation is somewhere in between my two guesses, but this is where I give up and ask here.

  [1]: https://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-837-computer-graphics-fall-2012/lecture-notes/MIT6_837F12_Lec21.pdf
