Now I have a different circuit, definitely simpler and smaller, but unable to interpolate color-pixels: it just fills a triangle with a solid color as fast as it can.
It's composed of:
- a circuit that calculates the min(x,y) of the triangle's vertices
- three line-fillers in parallel, each of which fills a table
- a horizontal line-filler that uses the table as input
How it works:
- you give the vertices v1(x,y), v2(x,y), v3(x,y) of the triangle (coordinates are positive numbers)
- the first stage calculates y.min as min({ v1.y, v2.y, v3.y })
- the line-filler1 draws line { v1-v2 }, for each calculated point {x,y} it fills table[y-y.min]={x,_,_}
- the line-filler2 draws line { v2-v3 }, for each calculated point {x,y} it fills table[y-y.min]={_,x,_}
- the line-filler3 draws line { v3-v1 }, for each calculated point {x,y} it fills table[y-y.min]={_,_,x}
- the line-fillers work in parallel, and this draws the outline of the triangle
- once they have completed, the last circuit scans the table, finds x_start and x_stop for each row, and draws the horizontal lines
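To make the table-filling steps concrete, here is a minimal software model in Python (not the circuit itself). Everything named here is an assumption for illustration: `line_points` uses Bresenham's algorithm as a stand-in for whatever the line-fillers actually implement, the three fillers run one after another instead of in parallel, `None` marks an empty slot, and when an edge produces several points on the same row the last one written simply wins.

```python
EMPTY = None  # assumed marker for an unused table slot


def line_points(x0, y0, x1, y1):
    """Yield the integer points of a line (Bresenham, all octants)."""
    dx = abs(x1 - x0)
    dy = -abs(y1 - y0)
    sx = 1 if x0 < x1 else -1
    sy = 1 if y0 < y1 else -1
    err = dx + dy
    while True:
        yield x0, y0
        if x0 == x1 and y0 == y1:
            return
        e2 = 2 * err
        if e2 >= dy:  # step along x
            err += dy
            x0 += sx
        if e2 <= dx:  # step along y
            err += dx
            y0 += sy


def build_table(v1, v2, v3):
    """Model of the first stage plus the three line-fillers.

    Each vertex is an (x, y) tuple. Returns (y_min, table), where
    table[y - y_min] is a 3-slot row, one slot per line-filler.
    """
    y_min = min(v1[1], v2[1], v3[1])
    y_max = max(v1[1], v2[1], v3[1])
    table = [[EMPTY] * 3 for _ in range(y_max - y_min + 1)]
    edges = ((v1, v2), (v2, v3), (v3, v1))  # line-filler1, 2, 3
    for slot, (a, b) in enumerate(edges):
        for x, y in line_points(a[0], a[1], b[0], b[1]):
            table[y - y_min][slot] = x  # last write per row wins (assumption)
    return y_min, table
```

For example, `build_table((0, 2), (5, 5), (0, 8))` gives `y_min = 2` and a 7-row table; row 0 holds only the x values written by line-filler1 and line-filler3, since edge v2-v3 never reaches that row.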
Each table line is either empty or shows exactly one of these configurations:
table[i]={x'1,x'2,__} <--- v1-v2: x_start=x'1, x_stop=x'2, y=i+y.min
table[i]={x'1,__,x'3} <--- v3-v1: x_start=x'1, x_stop=x'3, y=i+y.min
table[i]={__,x'2,x'3} <--- v2-v3: x_start=x'2, x_stop=x'3, y=i+y.min
x'1 means x-point calculated by the line-filler1
x'2 means x-point calculated by the line-filler2
x'3 means x-point calculated by the line-filler3
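The scan stage could be modelled roughly like this in Python. `scan_table` and `fill_spans` are hypothetical names, the table is taken to be a list of three-slot rows with `None` for an empty slot, and the framebuffer is a plain list of lists. The sketch takes the smallest and largest filled x in a row as x_start and x_stop, which puts the pair in order and also copes with a row where all three slots happen to be set (the row of the middle vertex can end up like that). None of this is stated in the description above, so treat it as an assumption.

```python
def scan_table(y_min, table):
    """Return one (x_start, x_stop, y) span per non-empty table row."""
    spans = []
    for i, row in enumerate(table):
        xs = [x for x in row if x is not None]
        if not xs:
            continue  # empty row: nothing to draw
        spans.append((min(xs), max(xs), i + y_min))
    return spans


def fill_spans(spans, color, framebuffer):
    """Draw each span as a horizontal line of a single solid color."""
    for x_start, x_stop, y in spans:
        for x in range(x_start, x_stop + 1):
            framebuffer[y][x] = color


# Usage: a hand-written 3-row table starting at y.min = 2
table = [[0, None, 0], [2, None, 0], [None, 3, 0]]
framebuffer = [[0] * 4 for _ in range(5)]
fill_spans(scan_table(2, table), 1, framebuffer)
```

In this model the horizontal fill is a plain inner loop over x, one pixel per iteration.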
pro: much faster, it takes at most 3 clocks per pixel, and the implementation takes less area
con: it requires an internal buffer for the table plus circuits to clear/set it, and it doesn't interpolate pixel colors