Author Topic: FPGA VGA Controller for 8-bit computer (Read 493017 times)

BrianHG · « **Reply #2650 on:** July 16, 2021, 08:16:20 pm »

Quote from: nockieboy on July 16, 2021, 05:14:29 pm

Quote from: BrianHG on July 16, 2021, 04:49:02 pm
Does your Z80 system have a clock/timing chip somewhere on it?

IE, can you test/time the speed of your reads?

This way, if we artificially insert a read delay with wait, can you tell if the performance slows down?

What I am thinking is that, during a read cycle, we add a timer/counter to delay placing the correct data on the Z80 output bus while holding the wait, and see if you can still read the correct data.

It does - a Z80 CTC - but I'm wondering if it would be much easier to just add a constant delay in the GPU before returning any data, then up that delay until it becomes a noticeable slowdown when reading a 256-byte block of GPU RAM, for example?

Proceed.

nockieboy · « **Reply #2651 on:** July 21, 2021, 04:21:28 pm »

Update on progress - below are my musings, not really seeking help unless anyone can point me to a fast transistor to replace a 2N3904?

The GPU is able to pull WAIT low in response to memory accesses by the host Z80. It's not rock-solid yet, though - I can adjust the delay for each memory access by setting a value in a GPU register accessible via an IO port, and turn the test on or off via that same register (bit 0 controls whether or not to apply WAIT states).

Most of the time it works fine - but every now and again it locks up with frenetic activity on the WAIT line. I need to spend some more time looking into the specific timings of what's going on, but here's what I've found so far:

What you see above is a nanosecond-accurate timing chart for a typical memory read operation by the host Z80 (in the uCOM's case, a 10MHz Z80 running at 8MHz), with the FPGA's clock (asynchronous to the Z80's) at the bottom. Pay special attention to the red bar - this denotes the amount of time the FPGA has to detect a memory read op and set the WAIT signal high to trigger the 2N3904 transistor to pull the Z80_WAIT line low.

TsWAIT(Cf) is the critical timing, the time for WAIT to stabilise LOW before it is sampled by the Z80 on the falling edge of its T2 clock cycle - on the above diagram, it has been extended to include the 2N3904's Td (activation delay), so instead of 20ns as quoted in the Z80's specifications for the 10MHz variant, it is 55ns because of the additional 35ns required by the 2N3904 to pull the WAIT line low. This is an approximation - I'm no expert at reading or understanding datasheets, this is my best guess and is likely a ballpark figure at best.

So this explains some of the crashes I was experiencing when I first tried inserting WAIT states, as I was waiting to decode a memory read, which requires RD to go low, and as you can see in the chart above it doesn't leave enough time to get WAIT pulled low in time. Instead, the FPGA is now pulling WAIT low the cycle it sees MREQ go low and the address lines match the GPU RAM address range. It's a lot more stable now, but I'm still getting occasional crashes - or certain crashes if I do a memory op with LDIR, one of the Z80's fast block-memory op commands.

So timing is still an issue. It just surprises me that it is an issue at all, because (to me) the FPGA is fast, but I guess not as fast as dedicated, discrete chips decoding the memory op and address range (or a chip select, more likely) in old peripheral equipment back in the day.

Is there anything faster than a 2N3904, I wonder? That 35ns delay is a killer.

BrianHG · « **Reply #2652 on:** July 21, 2021, 05:00:37 pm »

Have you considered learning about using Quartus' Signal Tap?
Since the FPGA you are using has free ram, and space, you can use the built in logic analyzer and see what the Z80 bus is doing in real time.

SiliconWizard · « **Reply #2653 on:** July 21, 2021, 05:22:17 pm »

Quote from: BrianHG on July 21, 2021, 05:00:37 pm

Have you considered learning about using Quartus' Signal Tap?
Since the FPGA you are using has free ram, and space, you can use the built in logic analyzer and see what the Z80 bus is doing in real time.

Other vendors have something similar. Never used that though. Something I would expect is that it would tend to make routing more difficult - especially in complex or congested designs - and thus hinder Fmax. Do you have experience with this, and can you confirm or, OTOH, tell me that it usually has very little impact?

BrianHG · « **Reply #2654 on:** July 21, 2021, 07:10:10 pm »

Quote from: SiliconWizard on July 21, 2021, 05:22:17 pm

Quote from: BrianHG on July 21, 2021, 05:00:37 pm
Have you considered learning about using Quartus' Signal Tap?
Since the FPGA you are using has free ram, and space, you can use the built in logic analyzer and see what the Z80 bus is doing in real time.

Other vendors have something similar. Never used that though. Something I would expect is that it would tend to make routing more difficult - especially in complex or congested designs - and thus hinder Fmax. Do you have experience with this, and can you confirm or, OTOH, tell me that it usually has very little impact?

The FMAX is only hindered by the maximum M9K/M10K memory block speed unless your FPGA is too full, or you are grasping signals from multiple clock domains simultaneously. Nockieboy's design is only 125MHz for the Z80 bus and everything else. A breeze for the -6 Max10 he is using.

My guess is they are just sampling the data into one or more memory blocks with a programmable latch enable. The second side of that memory port is just JTAG scanned in real-time during operation.

nockieboy · « **Reply #2655 on:** July 21, 2021, 09:40:04 pm »

Quote from: BrianHG on July 21, 2021, 05:00:37 pm

Have you considered learning about using Quartus' Signal Tap?
Since the FPGA you are using has free ram, and space, you can use the built in logic analyzer and see what the Z80 bus is doing in real time.

Ah, no, I hadn't thought of that - certainly sounds promising.

Ted/KC9LKE · « **Reply #2656 on:** July 22, 2021, 04:44:44 pm »

"Is there anything faster than a 2N3904, I wonder? That 35ns delay is a killer."

Not to distract from the Signal Tap discussion, just a passing thought: how about the SN74LVC07A Hex Buffer and Driver With Open-Drain Outputs?
It has a propagation delay of around 4ns and can sink 24mA, I believe.
It is a hex package and might replace all six 2N3904s on your PCB.

Curious, 55ns seems a bit of a short response time, don't know the Z80 though

Best
Ted

nockieboy · « **Reply #2657 on:** July 22, 2021, 10:18:35 pm »

Quote from: Ted/KC9LKE on July 22, 2021, 04:44:44 pm

"Is there anything faster than a 2N3904, I wonder? That 35ns delay is a killer."

Not to distract from the Signal Tap discussion, just a passing thought: how about the SN74LVC07A Hex Buffer and Driver With Open-Drain Outputs?
It has a propagation delay of around 4ns and can sink 24mA, I believe.
It is a hex package and might replace all six 2N3904s on your PCB.

That sounds very promising, assuming I've identified the correct issue in the first place of course.

An SN74LVC07A sounds like it could be perfect for the task. The WAIT line is pulled high by a 10K resistor connected to 5V, so that's 0.5mA that the '07 would need to sink if my maths (and understanding of basic electronics) is correct?
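
As a quick check of that Ohm's-law estimate, here is a minimal Python sketch. It assumes the open-drain output pulls the line all the way to 0V (a real part has a small low-level output voltage, so the true figure is slightly lower), and the function name is purely illustrative. The 4k7 and 1k values are included only for comparison.

```python
def pullup_sink_current_ma(rail_volts: float, pullup_ohms: float) -> float:
    """Current in mA that an open-drain driver must sink to hold the line low.

    Assumes the driver pulls the line to 0 V, so the full rail voltage
    appears across the pull-up resistor (I = V / R).
    """
    return rail_volts / pullup_ohms * 1000.0


# 10K is the value from the schematic; the others are for comparison only.
for ohms in (10_000, 4_700, 1_000):
    current = pullup_sink_current_ma(5.0, ohms)
    print(f"{ohms:>6} ohm pull-up to 5V: {current:.2f} mA")
```

Even with a 1k pull-up the result stays well under the roughly 24mA sink capability quoted above for the SN74LVC07A.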

Quote from: Ted/KC9LKE on July 22, 2021, 04:44:44 pm

Curious, 55ns seems a bit of a short response time, don't know the Z80 though

Well, like I've said many times before, I'm no expert at this stuff and am purely self edu-ma-cated, so there's a good chance I've made a mistake somewhere or missed something, but I know enough to say that the Z80 is pretty tight on its timings, even for a 40+ year old chip running at 8MHz.

The 10MHz Z80 requires a minimum 20ns setup and hold for WAIT before the falling edge of T2, the clock cycle where the Z80 samples the WAIT line - this leaves about 45ns without any propagation delays in line drivers to pull WAIT low from when RD goes low. This could be up to 55-80ns (eyeballing from the timing chart) if I pull WAIT low the instant I see MREQ fall on its own, instead of waiting for MREQ then RD to fall. The reality is, I'm going to be applying WAIT states to memory writes AND reads, not just reads, so it makes sense to pull WAIT low the instant I see MREQ go low and have a relevant address - which is what I'm doing now as a result of making the timing chart above. It's still not perfectly stable though, so perhaps version 2 of the uCOM/DECA with the extra 30ns an SN74LVC07A would provide would make the difference.
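
The budget above can be written out as simple arithmetic. This is only a sketch using the ballpark figures from this thread (20ns WAIT setup, about 35ns for the 2N3904, about 4ns for the SN74LVC07A); the names are illustrative and the delays are not checked datasheet values.

```python
Z80_CLOCK_HZ = 8_000_000        # uCOM Z80 clock, from the thread
WAIT_SETUP_NS = 20.0            # WAIT setup before the falling edge of T2 (10MHz part)

# Approximate driver delays quoted in the thread, not verified figures.
DRIVER_DELAY_NS = {"2N3904": 35.0, "SN74LVC07A": 4.0}


def required_lead_ns(driver_delay_ns: float, setup_ns: float = WAIT_SETUP_NS) -> float:
    """Time before the sampling edge by which the FPGA must assert its WAIT output."""
    return setup_ns + driver_delay_ns


clock_period_ns = 1e9 / Z80_CLOCK_HZ
print(f"Z80 clock period: {clock_period_ns:.0f} ns")

for name, delay in DRIVER_DELAY_NS.items():
    print(f"{name}: FPGA must assert WAIT {required_lead_ns(delay):.0f} ns before the edge")

saving_ns = required_lead_ns(DRIVER_DELAY_NS["2N3904"]) - required_lead_ns(DRIVER_DELAY_NS["SN74LVC07A"])
print(f"Margin gained by swapping drivers: {saving_ns:.0f} ns")
```

With these numbers the swap buys back about 31ns, which is where the "extra 30ns" figure comes from.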

BrianHG · « **Reply #2658 on:** July 22, 2021, 10:27:09 pm »

I have a ver.0.95 of my DDR3 controller to get out tonight, then I can take a look at your code.

Also, what's the series resistor from the FPGA to the base of the 2N3904? A lower value means a faster turn-on time. Using a small value like 220 ohm will mean the 2N3904 will probably be on and pulling down within <10ns, or even closer to 5ns. It's truly a fast little bugger, capable of over 200MHz if you drive the base with enough current. If you are using a 1k series resistor to the base, when driving from 3V logic at the other end, expect a 15-20ns delay.

nockieboy · « **Reply #2659 on:** July 22, 2021, 10:37:43 pm »

Quote from: BrianHG on July 22, 2021, 10:27:09 pm

I have a ver.0.95 of my DDR3 controller to get out tonight, then I can take a look at your code.

Also, what's the series resistor from the FPGA to the base of the 2N3904? A lower value means a faster turn-on time. Using a small value like 220 ohm will mean the 2N3904 will probably be on and pulling down within <10ns, or even closer to 5ns. It's truly a fast little bugger, capable of over 200MHz if you drive the base with enough current. If you are using a 1k series resistor to the base, when driving from 3V logic at the other end, expect a 15-20ns delay.

It's stuff like this that I miss.

The transistor circuit was a cut 'n' paste from somewhere else - I think the series resistor was to prevent overloading the IO output of whatever was driving the transistor, but what you've said makes perfect sense now that I think about it.

BrianHG · « **Reply #2660 on:** July 22, 2021, 10:38:09 pm »

Go Google on how to use SignalTap I & II. Google tutorials and search on youtube for videos so you can set something up. It will be useful in the future as you add much more complicated things to your GPU.

BrianHG · « **Reply #2661 on:** July 22, 2021, 10:41:22 pm »

Quote from: nockieboy on July 22, 2021, 10:37:43 pm

Quote from: BrianHG on July 22, 2021, 10:27:09 pm
I have a ver.0.95 of my DDR3 controller to get out tonight, then I can take a look at your code.

Also, what's the series resistor from the FPGA to the base of the 2N3904? A lower value means a faster turn-on time. Using a small value like 220 ohm will mean the 2N3904 will probably be on and pulling down within <10ns, or even closer to 5ns. It's truly a fast little bugger, capable of over 200MHz if you drive the base with enough current. If you are using a 1k series resistor to the base, when driving from 3V logic at the other end, expect a 15-20ns delay.
It's stuff like this that I miss. The transistor circuit was a cut 'n' paste from somewhere else - I think the series resistor was to prevent overloading the IO output of whatever was driving the transistor, but what you've said makes perfect sense now that I think about it.

Just look at the 2N3904 data sheet. I believe delay and rise time is 35ns+35ns if you drive the base with 1mA. 220 ohm would drive the base with around 10mA. Don't go below 100 ohm, as the IO on the FPGA might not like a constant ~20mA current.
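
For a rough feel of those base currents, here is a minimal Python sketch. It assumes an ideal 3.3V FPGA output and a base-emitter drop of about 0.7V, and it ignores the FPGA pin's own output resistance, so real currents will come out somewhat lower than these figures. The function name is illustrative.

```python
def base_current_ma(drive_volts: float, series_ohms: float, vbe_volts: float = 0.7) -> float:
    """Approximate base current in mA through a series resistor.

    Assumes an ideal voltage source and a fixed base-emitter drop:
    I_b = (V_drive - V_be) / R.
    """
    return (drive_volts - vbe_volts) / series_ohms * 1000.0


for ohms in (1_000, 390, 220, 100):
    print(f"{ohms:>5} ohm series resistor: ~{base_current_ma(3.3, ohms):.1f} mA")
```

Under these idealised assumptions 220 ohm gives roughly 12mA and 1k roughly 2.6mA, in line with the "around 10mA" estimate. The ideal figure for 100 ohm comes out higher than ~20mA because the output pin's resistance is not modelled.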

nockieboy · « **Reply #2662 on:** July 23, 2021, 09:55:59 pm »

Quote from: BrianHG on July 22, 2021, 10:41:22 pm

Just look at the 2N3904 data sheet. I believe delay and rise time is 35ns+35ns if you drive the base with 1mA. 220 ohm would drive the base with around 10mA. Don't go below 100 ohm, as the IO on the FPGA might not like a constant ~20mA current.

390 ohms is the nearest I have to 220 without going below 100. Have tried it out and can't see any difference in the crash rate... will need to spend some time learning to use Signal Tap.

BrianHG · « **Reply #2663 on:** July 23, 2021, 11:23:20 pm »

What's the pull up resistor on the bus?

nockieboy · « **Reply #2664 on:** July 24, 2021, 09:35:25 am »

Quote from: BrianHG on July 23, 2021, 11:23:20 pm

What's the pull up resistor on the bus?

On the Z80 side? There's a 10K pullup.

I've got a 10K pull-down on the FPGA signal line to the transistor as I didn't want the WAIT line to default to LOW all the time the FPGA was starting up.

BrianHG · « **Reply #2665 on:** July 24, 2021, 01:06:06 pm »

Quote from: nockieboy on July 24, 2021, 09:35:25 am

Quote from: BrianHG on July 23, 2021, 11:23:20 pm
What's the pull up resistor on the bus?

On the Z80 side? There's a 10K pullup.

I've got a 10K pull-down on the FPGA signal line to the transistor as I didn't want the WAIT line to default to LOW all the time the FPGA was starting up.

Yes, on the Z80 side. Try a 1k pullup.

BrianHG · « **Reply #2666 on:** July 25, 2021, 06:47:00 pm »

Note that the 'turn-off' time of a 2N3904 may be unavoidable. Though, in the past, our Z80 problems have been narrowed down and connected to code in the bridge.

You need to get the SignalTap going so you can trap what happens before a freeze event.

My guess is that you are applying a wait when there is a read or write of an op-code elsewhere, outside the address range of the display GPU.

nockieboy · « **Reply #2667 on:** July 26, 2021, 01:05:22 pm »

Quote from: BrianHG on July 25, 2021, 06:47:00 pm

Note that the 'turn-off' time of a 2N3904 may be unavoidable. Though, in the past, our Z80 problems have been narrowed down and connected to code in the bridge.

I don't think the turn-off time is an issue - the WAIT line is designed to work asynchronously to the Z80's clock, so it shouldn't make any difference (to my inexperienced mind) if it takes 2ns to go high again or 20ns. I'd like to think there's some form of Schmitt-trigger effect on the Z80's WAIT input, so slow rises don't result in issues when the line is sampled, but either way it's read as either high or low when the WAIT line is sampled on the falling edge of T2.

Quote from: BrianHG on July 25, 2021, 06:47:00 pm

You need to get the SignalTap going so you can trap what happens before a freeze event.

My guess is that you are applying a wait when there is a read or write of an op-code elsewhere, outside the address range of the display GPU.

This makes a lot more sense to me - it's far more likely to be an HDL issue in z80_bridge. It does look as though the WAIT line is getting out of synch with what the z80_bridge thinks it should be doing.

Will start taking a look at Signal Tap this week hopefully.

nockieboy · « **Reply #2668 on:** July 28, 2021, 03:44:43 pm »

Wow, SignalTap is pretty handy - easy enough to understand and setup, too.

The results, on the other hand, require some interpretation, but I've fixed a glitch thanks to SignalTap already. I was using a wire (mem_in_range) that went HIGH whenever the Z80 address bus held an address in the GPU's RAM range. What I didn't realise before SignalTap was that there are some fast transitions in the address bus, about a quarter of a Z80 clock cycle long, in which the address is unstable/random and these sometimes trigger the mem_in_range signal, causing unwanted behaviour in the Z80_bridge. I've fixed that now, replacing the wire with a three-register pipeline so that mem_in_range is only HIGH when the address has been stable for three clocks. That's eliminated some unwanted signals.
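
To illustrate the idea, here is a small behavioural model of that kind of three-stage filter in Python. It is a sketch only, not the actual HDL in Z80_bridge: the class and function names are made up, and it assumes the filter watches the raw in-range flag and only outputs HIGH once that flag has been HIGH on three consecutive clocks.

```python
from collections import deque


class AddressValidFilter:
    """Behavioural model of an N-stage register pipeline on the in-range flag.

    The output is HIGH only when the raw flag was HIGH on each of the
    last N clocks, so glitches shorter than N clocks are rejected.
    """

    def __init__(self, stages: int = 3) -> None:
        # All pipeline registers start cleared.
        self.history = deque([False] * stages, maxlen=stages)

    def clock(self, in_range_raw: bool) -> bool:
        """Shift in one sample and return the filtered flag."""
        self.history.append(in_range_raw)
        return all(self.history)


def run_filter(samples, stages: int = 3):
    """Feed a sequence of raw samples through the filter, one per clock."""
    flt = AddressValidFilter(stages)
    return [flt.clock(bool(s)) for s in samples]


# A one-clock glitch followed by a genuinely stable in-range address.
raw = [0, 1, 0, 0, 1, 1, 1, 1, 0]
print(run_filter(raw))
```

In this model the single-clock glitch never reaches the output, while the stable address is accepted on its third consecutive clock. The trade-off is that every real access is recognised two clocks later than with the bare wire.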

The next thing I'm seeing is a little odd - but I guess it's explainable by delays in the WAIT line going HIGH after the transistor is switched off, which BrianHG alluded to in a previous post recommending the reduction of the pullup resistor on the WAIT line from 10K to something smaller. According to SignalTap, it's taking approximately 8 Z80 CLK cycles for WAIT to go high enough for the Z80 to detect it as HIGH:

That purple-ish area is nearly 900 nanoseconds long (45 GPU clock cycles, give or take), showing the delay from when the Z80_bridge's internal Z80_WAIT signal goes low (to turn the transistor off that is pulling the Z80's WAIT line low) and when the Z80's actual WAIT line gets high enough for the Z80 to detect it.

In itself, this shouldn't be an issue other than WAIT states being far longer than necessary, but I'm still getting the odd lockup with WAIT fluctuating on and off. Testing continues.

BrianHG · « **Reply #2669 on:** July 28, 2021, 05:37:06 pm »

Can I see your latest code for driving the 'wait' line?

BrianHG · « **Reply #2670 on:** July 28, 2021, 06:16:36 pm »

Quote from: BrianHG on July 28, 2021, 05:37:06 pm

Can I see your latest code for driving the 'wait' line?

If you have a spare TTL-3.3v input on your PCB-DECA board, try tying the bus side 'wait' to an input and add that input to the Signal Tap so you can see what the 'wait' timing looks like after the 2N3904. This will show you if all those TW wait states are due to the bus signal, or due to internal Z80 processing.

You might also be able to use a series 100k resistor from the Z80 bus wait line to an input, as a 100k series resistor won't load an input protection diode too much. But you would probably have to shrink the 10k pull-up, as that 100k would be equal to a pull-down to a 3.3v rail.

nockieboy · « **Reply #2671 on:** July 28, 2021, 09:53:13 pm »

Quote from: BrianHG on July 28, 2021, 06:16:36 pm

Quote from: BrianHG on July 28, 2021, 05:37:06 pm
Can I see your latest code for driving the 'wait' line?
If you have a spare TTL-3.3v input on your PCB-DECA board, try tying the bus side 'wait' to an input and add that input to the Signal Tap so you can see what the 'wait' timing looks like after the 2N3904. This will show you if all those TW wait states are due to the bus signal, or due to internal Z80 processing.

You might also be able to use a series 100k resistor from the Z80 bus wait line to an input, as a 100k series resistor won't load an input protection diode too much. But you would probably have to shrink the 10k pull-up, as that 100k would be equal to a pull-down to a 3.3v rail.

I have a WAIT input to the FPGA from the Z80 bus already as part of the control buffer on the uCOM/DECA interface card, so this is not a problem. I'll take a look at what's going on with the Z80's WAIT line tomorrow. Didn't have time today.

Uh, what am I talking about - no I don't have a WAIT input from the Z80 bus. Well, I think I'll start by reducing the 10K pullup on the WAIT line anyway, see if it makes a difference dropping it to 1K.

Quote from: BrianHG on July 28, 2021, 05:37:06 pm

Can I see your latest code for driving the 'wait' line?

The latest version of Z80_bridge_v2.sv is attached, but to save you sifting through the haystack:

| Line(s) | Notes |
| --- | --- |
| 205-207 | Sets up flags and address-valid pipeline |
| 225 | Sets up buffer to store read value from GPU before returning it to Z80 at end of WAIT period |
| 227-228 | Sets up signal to indicate when WAIT timer is zero |
| 252-253 | Update 3-clock address-valid pipeline to filter jitter on the address bus |
| 309-340 | Manages start of GPU RAM read and write ops |
| 432-436 | Manages updating the WAIT counter value via an IO port address |
| 438-455 | Manages updating the WAIT counter during a memory op and manages the end of a GPU RAM read or write op |
| 461-468 | Manages returned data from GPU RAM read |

Probably not the best way to do it, but it's the second iteration of my initial attempt and, hopefully, there aren't too many obvious mistakes.

nockieboy · « **Reply #2672 on:** July 29, 2021, 12:06:25 pm »

So my pullups on the Z80's lines are all 4k7s, not 10k as in the schematic, including the WAIT line. The resistors are all under the Z80 chip itself and I didn't remember that I'd put lower values in until I lifted the chip earlier this morning. I've still replaced the WAIT 4k7 pullup with a 1K resistor, but now I seem to be getting problems with the GPU. The system locks up with WAIT constantly asserted. I've made the WAIT-insertion HDL optional and compiled without it and the system runs fine - but I haven't changed anything since last night other than reducing the pull-up from 4k7 to 1k.

I'm still able to use SignalTap on the 'locked up' system, and it looks like changing to 1k may have pulled two WAIT states (Tw) back, but there are still 6 additional ones.

I've compared Z80_bridge_v2's code from today and last night (when it worked) and, if I remove the checks making WAIT-insertion optional, the HDL is identical. So it appears the change to a 1k pullup on WAIT is causing these issues? I'll confirm this later today if I get to replace the 4k7 resistor.

nockieboy · « **Reply #2673 on:** July 29, 2021, 02:44:42 pm »

Yes, I can confirm that fitting a 1k pullup to the WAIT line causes the system to lock up with the DECA/GPU attached, but only when the GPU tries to drive the WAIT line; as long as it leaves WAIT alone, the system runs fine.

I've refitted the 4k7 pullup and it's fine again now.

gcewing · « **Reply #2674 on:** July 31, 2021, 11:53:02 am »

Another trick you could try for speeding up the transistor is to put a smallish capacitor across the base resistor.

Quote from: nockieboy on July 29, 2021, 02:44:42 pm

Yes, I can confirm that fitting a 1k pullup to the WAIT line causes the system to lock up with the DECA/GPU attached, but only when the GPU tries to drive the WAIT line; as long as it leaves WAIT alone, the system runs fine.

Maybe you have some kind of logic error that's being masked when the wait state is unusually long? Could the FPGA be releasing the WAIT line too early, before it's actually ready for the Z80 to continue doing stuff?

