Remember, if you are using the GPU RAM for CPU code, or even for GPU instructions as well, without any kind of limiter on the geometry unit, carelessly drawing a line or rectangle somewhere will erase your code. In under a second, you can erase your entire Z80 4MB. (More like 3 seconds to erase all 512MB if you ever get around to redoing the pixel-writer and optimizing it for 128-bit access.)
Yes of course - with a little tweaking and careful design, I could have the entire Z80's memory space in the GPU. Does the pixel-writer need re-writing then to be more DDR3-optimised?
It's more like the pixel writer needs to be changed into a generic blitter copy function, with a few mask and mix functions. It needs to be designed smartly so that it can perform multiple tasks during this copy.
I'm thinking the controls will look like this:
------------------------------------------------------------------
copy length (set the number of cycles the copy should run)
copy source data A bits width (8,16,32,64,128)
copy beginning source address A
copy source address A inc/dec step size per copy cycle.
copy source data B bits width (8,16,32,64,128)
copy beginning source address B
copy source address B inc/dec step size per copy cycle.
copy A-B function: use source A only or B only / add A+B / subtract A-B / multiply A*B / divide A/B / float or int / XOR / AND / OR / mask-in and mask-out levels.
copy destination Y data bits width (8,16,32,64,128)
copy beginning destination Y address
copy destination Y address inc/dec step size per copy cycle.
--------------------------------------------------------
With the above, you should have enough to do graphics and blending; stereo/mono sound stream processing, including volume control, channel mixing at different depths and resampling/pitch bending; plus generic math co-processing, FFT, multidimensional transformations, vector array scaling and a bunch of other things the Z80 has no business attempting at such a scale.
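To make the control list a bit more concrete, here is a rough software model of it in C. This is only a sketch, not HDL and not a final register map: every name in it (blit_cmd_t, blit_op_t, blit_run) is made up for illustration, widths are capped at 64 bits to keep the model simple, only the integer ops are modelled, and the single mask field is my simplified stand-in for the mask in/out levels. It also refuses any access outside the given memory size, which is the kind of limiter that stops a stray copy from walking over code.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical op codes for the A-B function (integer ops only in this model). */
typedef enum {
    BLIT_SRC_A, BLIT_SRC_B, BLIT_ADD, BLIT_SUB, BLIT_XOR, BLIT_AND, BLIT_OR
} blit_op_t;

/* Hypothetical command descriptor mirroring the control list. */
typedef struct {
    uint32_t  length;   /* number of copy cycles */
    uint8_t   a_bytes;  /* source A width in bytes: 1, 2, 4 or 8 */
    uint32_t  a_addr;   /* beginning source address A */
    int32_t   a_step;   /* signed byte step per cycle (0 = hold address) */
    uint8_t   b_bytes;  /* source B width in bytes */
    uint32_t  b_addr;   /* beginning source address B */
    int32_t   b_step;
    blit_op_t op;       /* A-B function */
    uint64_t  mask;     /* result bits that are written; others keep dest */
    uint8_t   y_bytes;  /* destination Y width in bytes */
    uint32_t  y_addr;   /* beginning destination Y address */
    int32_t   y_step;
} blit_cmd_t;

/* Limiter: the access must be 1..8 bytes and lie fully inside memory. */
static int in_range(size_t mem_size, uint32_t addr, uint8_t n)
{
    return n >= 1 && n <= 8 && (uint64_t)addr + n <= (uint64_t)mem_size;
}

/* Little-endian read of n bytes. */
static uint64_t rd(const uint8_t *mem, uint32_t addr, uint8_t n)
{
    uint64_t v = 0;
    for (uint8_t i = 0; i < n; i++)
        v |= (uint64_t)mem[addr + i] << (8 * i);
    return v;
}

/* Little-endian write of the low n bytes of v. */
static void wr(uint8_t *mem, uint32_t addr, uint8_t n, uint64_t v)
{
    for (uint8_t i = 0; i < n; i++)
        mem[addr + i] = (uint8_t)(v >> (8 * i));
}

/* Runs one command. Returns 0 on success, or -1 as soon as a cycle would
   touch memory out of range (cycles before that have already been written). */
int blit_run(uint8_t *mem, size_t mem_size, const blit_cmd_t *c)
{
    uint32_t a = c->a_addr, b = c->b_addr, y = c->y_addr;
    int uses_a = (c->op != BLIT_SRC_B);
    int uses_b = (c->op != BLIT_SRC_A);

    for (uint32_t i = 0; i < c->length; i++) {
        if ((uses_a && !in_range(mem_size, a, c->a_bytes)) ||
            (uses_b && !in_range(mem_size, b, c->b_bytes)) ||
            !in_range(mem_size, y, c->y_bytes))
            return -1;

        uint64_t va = uses_a ? rd(mem, a, c->a_bytes) : 0;
        uint64_t vb = uses_b ? rd(mem, b, c->b_bytes) : 0;
        uint64_t r;
        switch (c->op) {
        case BLIT_SRC_B: r = vb;      break;
        case BLIT_ADD:   r = va + vb; break;
        case BLIT_SUB:   r = va - vb; break;
        case BLIT_XOR:   r = va ^ vb; break;
        case BLIT_AND:   r = va & vb; break;
        case BLIT_OR:    r = va | vb; break;
        default:         r = va;      break; /* BLIT_SRC_A */
        }

        /* Masked write: only bits set in mask are replaced in the destination. */
        uint64_t old = rd(mem, y, c->y_bytes);
        wr(mem, y, c->y_bytes, (r & c->mask) | (old & ~c->mask));

        /* Unsigned wrap-around makes a negative step below 0 fail the limiter. */
        a += (uint32_t)c->a_step;
        b += (uint32_t)c->b_step;
        y += (uint32_t)c->y_step;
    }
    return 0;
}
```

For example, mixing two 8-bit mono streams would be a single command with op = BLIT_ADD, all three widths set to 1 byte and all three steps set to +1. Setting a_step to 0 turns the same engine into a constant fill.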
You would then need to modify the geometry unit's pixel address generator so that it prepares the list of commands fed to this new generic memory processor to write the pixels, but now you can also drive this section directly from the Z80 to perform other functions.
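As a rough illustration of the geometry unit's side of this, here is a C sketch of how a filled rectangle could be broken into one copy command per scanline. All names here (fill_cmd_t, screen_t, rect_to_cmds) and the linear framebuffer layout are my assumptions for the example, not the existing design. The rectangle is clipped to the screen before any command is emitted, so a careless draw cannot generate addresses outside the framebuffer.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical cut-down copy command: just the fields a solid fill needs. */
typedef struct {
    uint32_t length;  /* copy cycles = pixels on this scanline */
    uint32_t a_addr;  /* source A: address holding the fill colour */
    int32_t  a_step;  /* 0 = re-read the same colour every cycle */
    uint32_t y_addr;  /* destination: first pixel of the span */
    int32_t  y_step;  /* bytes per pixel */
    uint8_t  bytes;   /* data width in bytes for both A and Y */
} fill_cmd_t;

/* Hypothetical linear framebuffer description. */
typedef struct {
    uint32_t base;       /* byte address of pixel (0,0) */
    uint32_t width;      /* pixels per line */
    uint32_t height;     /* lines */
    uint8_t  bpp_bytes;  /* bytes per pixel */
} screen_t;

/* Clips the rectangle (inclusive corners) to the screen and writes one
   command per visible scanline into out[]. Returns the number of commands
   written, at most max_cmds; 0 if nothing is visible. */
size_t rect_to_cmds(const screen_t *s,
                    int32_t x0, int32_t y0, int32_t x1, int32_t y1,
                    uint32_t colour_addr, fill_cmd_t *out, size_t max_cmds)
{
    /* Limiter: clamp all four edges to the screen. */
    if (x0 < 0) x0 = 0;
    if (y0 < 0) y0 = 0;
    if (x1 >= (int32_t)s->width)  x1 = (int32_t)s->width - 1;
    if (y1 >= (int32_t)s->height) y1 = (int32_t)s->height - 1;
    if (x0 > x1 || y0 > y1)
        return 0;

    size_t n = 0;
    for (int32_t y = y0; y <= y1 && n < max_cmds; y++) {
        fill_cmd_t *c = &out[n++];
        c->length = (uint32_t)(x1 - x0 + 1);
        c->a_addr = colour_addr;
        c->a_step = 0;
        c->y_addr = s->base
                  + ((uint32_t)y * s->width + (uint32_t)x0) * s->bpp_bytes;
        c->y_step = s->bpp_bytes;
        c->bytes  = s->bpp_bytes;
    }
    return n;
}
```

The same command format could just as well be filled in by the Z80 itself for non-graphics jobs; the geometry unit is then only one of several producers of commands.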