What did you do?
I changed this:
bitplane_memory_data_to_pixel_colour get_pixel_colour_1 (
.ram_data_rdy ( rd_data_rdy_a ), // use immediate values when HIGH, latched values when LOW
.latched_word ( rd_data_cache[15:0] ), // 16-bit word from the cache
.latched_colour ( rd_cache_col[7:0] ), // 8-bit cached colour value
.latched_bpp ( rd_cache_bpp[3:0] ), // cached bits-per-pixel value
.latched_target ( rd_cache_bit[3:0] ), // cached target word/byte/nybble/crumb/bit
.immediate_word ( rd_data_in[15:0] ), // 16-bit word from GPU RAM
.immediate_colour ( colour[7:0] ), // current colour value
.immediate_bpp ( bpp[3:0] ), // current bits-per-pixel value
.immediate_target ( target[3:0] ), // current target word/byte/nybble/crumb/bit
.pixel_colour ( PX_COPY_COLOUR ) // current pixel colour value from above parameters
);
... to this:
bitplane_memory_data_to_pixel_colour get_pixel_colour_1 (
.ram_data_rdy ( rd_data_rdy_a ), // use immediate values when HIGH, latched values when LOW
.latched_word ( rd_data_cache[15:0] ), // 16-bit word from the cache
.latched_colour ( rd_cache_col[7:0] ), // 8-bit cached colour value
.latched_bpp ( rd_cache_bpp[3:0] ), // cached bits-per-pixel value
.latched_target ( rd_cache_bit[3:0] ), // cached target word/byte/nybble/crumb/bit
.immediate_word ( rd_data_in[15:0] ), // 16-bit word from GPU RAM
.immediate_colour ( rd_cache_col[7:0] ), // current colour value
.immediate_bpp ( rd_cache_bpp[3:0] ), // current bits-per-pixel value
.immediate_target ( rd_cache_bit[3:0] ), // current target word/byte/nybble/crumb/bit
.pixel_colour ( PX_COPY_COLOUR ) // current pixel colour value from above parameters
);
... for the reasons outlined in my post. I haven't had a lot of time this weekend to spend on this and I feel like I've been fumbling for solutions and going round in circles, wasting time.
Your simulation is still wrong.
It doesn't look anything like mine.
Yep, I realise that now that I've compared the two. Back when I ran the simulation earlier today I saw the delay had gone and thought, "Eureka!" Clearly it was premature and I should have waited to check the results in more detail, but I didn't have time.
Why did you change these? These were correct originally:
.latched_word ( rd_data_cache[15:0] ), // 16-bit word from the cache
.latched_colour ( rd_cache_col[7:0] ), // 8-bit cached colour value
.latched_bpp ( rd_cache_bpp[3:0] ), // cached bits-per-pixel value
.latched_target ( rd_cache_bit[3:0] ), // cached target word/byte/nybble/crumb/bit
.immediate_word ( rd_data_in[15:0] ), // 16-bit word from GPU RAM
.immediate_colour ( rd_cache_col[7:0] ), // current colour value
.immediate_bpp ( rd_cache_bpp[3:0] ), // current bits-per-pixel value
.immediate_target ( rd_cache_bit[3:0] ), // current target word/byte/nybble/crumb/bit
You just made things 10x worse for you.
I think I explained "why" pretty well in my previous post?
I clearly don't understand enough about the hardware or HDL subtleties to be able to see the problem. How long have you worked with this stuff? You must be qualified and have gone to university and studied EE or something similar, not to mention however many years working in the industry? I obviously don't have that training or experience (other than what you've been patient enough to teach me).
Why did you not concentrate on my words and this block?:
// set source data according to RAM read
source_bpp = ( ram_data_rdy ) ? immediate_bpp : latched_bpp ;
source_colour = ( ram_data_rdy ) ? immediate_colour : latched_colour ;
source_target = ( ram_data_rdy ) ? immediate_target : latched_target ;
source_word = ( ram_data_rdy ) ? immediate_word : latched_word ;
Maybe think of why I called these 'LATCHED' and 'IMMEDIATE'. Think of what you wrote in your own post above. Every single clue is there for you to solve.
Well, that block - to me - is a closed book.
ram_data_rdy is a wire (rd_data_rdy_a, in fact) - there's no delay on it that I'm aware of. So the source_ values are assigned instantly to either their immediate values or their latched values, depending on whether ram_data_rdy is HIGH or LOW.
According to my limited understanding of the subject and my internal logic, there is an issue with the immediate values - not the latched values. That could be my first mistake, but I consider it a fair assumption based on what I know. My previous thinking was that there is no guarantee that the immediate_colour, target or bpp will have valid values by the time ram_data_rdy goes HIGH, as they all become invalid once cmd_rdy goes LOW. In fact, all their values go to zero.
This appears to be what is happening in the simulation: when rd_data_rdy_a goes HIGH, the immediate values (all zero) for colour, target and bpp are applied to the incoming data word from RAM. A valid value isn't returned by the function until rd_data_rdy_a goes LOW and the inputs are switched over to the latched values instead, which have all the correct settings saved, and only then is a valid value returned in PX_COPY_COLOUR.
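To pin down what is being described, here is a minimal stand-alone sketch of that mux with a tiny testbench. Only the four assignment lines come from the block quoted above. The module name source_mux_sketch, the testbench, and all the test values are hypothetical and exist purely for illustration. It also assumes, as described above, that the immediate colour, bpp and target have already dropped to zero by the time ram_data_rdy goes HIGH. It only reproduces the observation; it does not say which wiring is the right fix.

```verilog
// Hypothetical sketch: only the four mux assignments are taken from the thread.
module source_mux_sketch (
    input  wire        ram_data_rdy,      // HIGH = use immediate, LOW = use latched
    input  wire [15:0] latched_word,
    input  wire [7:0]  latched_colour,
    input  wire [3:0]  latched_bpp,
    input  wire [3:0]  latched_target,
    input  wire [15:0] immediate_word,
    input  wire [7:0]  immediate_colour,
    input  wire [3:0]  immediate_bpp,
    input  wire [3:0]  immediate_target,
    output reg  [15:0] source_word,
    output reg  [7:0]  source_colour,
    output reg  [3:0]  source_bpp,
    output reg  [3:0]  source_target
);
    // Purely combinational: the outputs follow ram_data_rdy with no clocked delay.
    always @(*) begin
        source_bpp    = ( ram_data_rdy ) ? immediate_bpp    : latched_bpp ;
        source_colour = ( ram_data_rdy ) ? immediate_colour : latched_colour ;
        source_target = ( ram_data_rdy ) ? immediate_target : latched_target ;
        source_word   = ( ram_data_rdy ) ? immediate_word   : latched_word ;
    end
endmodule

module source_mux_sketch_tb;
    reg         ram_data_rdy;
    reg  [15:0] latched_word,   immediate_word;
    reg  [7:0]  latched_colour, immediate_colour;
    reg  [3:0]  latched_bpp,    immediate_bpp;
    reg  [3:0]  latched_target, immediate_target;
    wire [15:0] source_word;
    wire [7:0]  source_colour;
    wire [3:0]  source_bpp;
    wire [3:0]  source_target;

    source_mux_sketch dut (
        .ram_data_rdy     ( ram_data_rdy     ),
        .latched_word     ( latched_word     ),
        .latched_colour   ( latched_colour   ),
        .latched_bpp      ( latched_bpp      ),
        .latched_target   ( latched_target   ),
        .immediate_word   ( immediate_word   ),
        .immediate_colour ( immediate_colour ),
        .immediate_bpp    ( immediate_bpp    ),
        .immediate_target ( immediate_target ),
        .source_word      ( source_word      ),
        .source_colour    ( source_colour    ),
        .source_bpp       ( source_bpp       ),
        .source_target    ( source_target    )
    );

    initial begin
        // Illustrative values only: latched copies hold the saved settings.
        latched_word   = 16'h1234;
        latched_colour = 8'h0F;
        latched_bpp    = 4'd3;
        latched_target = 4'd5;

        // Assumption from the post: the immediate settings are already zero
        // when the RAM word arrives, but the word itself is fresh.
        immediate_word   = 16'hABCD;
        immediate_colour = 8'h00;
        immediate_bpp    = 4'd0;
        immediate_target = 4'd0;

        // RAM data ready: the mux selects the immediate inputs.
        ram_data_rdy = 1'b1;
        #1 $display("rdy=%b word=%h colour=%h bpp=%0d target=%0d",
                    ram_data_rdy, source_word, source_colour, source_bpp, source_target);

        // RAM data no longer ready: the mux selects the latched inputs.
        ram_data_rdy = 1'b0;
        #1 $display("rdy=%b word=%h colour=%h bpp=%0d target=%0d",
                    ram_data_rdy, source_word, source_colour, source_bpp, source_target);

        $finish;
    end
endmodule
```

Under these assumed values, the first line printed pairs the fresh RAM word with zeroed settings, and the second pairs the saved settings with the cached word, which matches the behaviour seen in the waveforms as described.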
I'm still not 100% on why you say my previous change was wrong, because of my thinking above.
Now I'm clearly missing a connection or I'm not making a link because I can't see any further than that. I'm just staring at waveforms and numbers and code and it's all just staring back at me.
I'm going to bed. Maybe sleeping on the problem (again) will help.