Author Topic: Poor man's FPGA debugging techniques (Read 4232 times)

PatrickCPE · « **Reply #25 on:** September 05, 2022, 02:00:00 am »

I'll preface this with the caveat that I have no idea what you're trying to debug beyond issues in the design. You also asked for simple, so this might be more in the realm of "complicated but extremely useful given the right scenario".

Here's my typical steps for writing something.

1. Lint design as I write it (I like Verilator's linting tool a lot)
2. Testbench for each individual module you have within the design (and the associated regression suite, something simple like Make with some grepping works fine)
3. Testbench for top level of design also added to regression.
4. Check timing analysis as you go (Xilinx has this, Lattice must have it in their real tool, I'm not sure if the opensource tools support this at all)
5. Gate level sim with generated .sdf file (depending on the scope of your project, doing GLS on the entire thing might be a bit much)
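
The "Make with some grepping" regression from step 2 could look roughly like the Python sketch below. It is only an illustration: the "TEST PASSED" marker and the FAIL/ERROR patterns are assumptions about what your testbenches print (e.g. via $display), not something any particular simulator emits by default.

```python
import re

# Hypothetical convention: a testbench prints "TEST PASSED" on success and
# a line containing FAIL/FAILED/ERROR otherwise. Adjust to your own benches.
FAIL_RE = re.compile(r"\b(FAIL(ED)?|ERROR)\b", re.IGNORECASE)
PASS_RE = re.compile(r"\bTEST PASSED\b")


def check_log(text):
    """Return (ok, offending_lines) for one simulation log."""
    bad = [line for line in text.splitlines() if FAIL_RE.search(line)]
    # A log with no failure line but no pass marker is also treated as a
    # failure, since the simulation may have stopped early.
    ok = not bad and PASS_RE.search(text) is not None
    return ok, bad


def run_regression(logs):
    """logs maps testbench name -> log text; returns the names that failed."""
    failed = []
    for name, text in sorted(logs.items()):
        ok, _ = check_log(text)
        if not ok:
            failed.append(name)
    return failed


if __name__ == "__main__":
    demo = {
        "tb_i2c": "starting\nTEST PASSED\n",
        "tb_divider": "starting\nERROR: count mismatch at 120 ns\n",
        "tb_reset": "starting\n",  # never reached the pass marker
    }
    print(run_regression(demo))
```

In practice each log would come from a file written by the simulator run, and the script's exit code would tell Make whether the regression passed.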

With all of those steps done when you put the design on the part it should work assuming your testbenches properly tested the behavior. If your design doesn't work and you truly can't sim it then there are a few options depending on the complexity of the design.

1. Simple Designs - UART or Blinky or Pins and logic analyzer for debug as you mentioned
2. Complicated Bus Based Design - Internal Bus Scope and UART.

Elaborating on 2, most designs make use of some internal bus like AXI or APB or Wishbone, etc. Designs this complex usually have issues within the communications of different masters and slave devices. The Bus Scope is effectively an internal logic analyzer storing every transaction that occurred on the bus for N cycles. The caveat is that it uses resources on your chip, and the depth is directly related to how much ram you can give up. Your UART will act as a master and allow you to probe the bus scope for its internal data. You can set an internal trigger wire to tell the scope to record. A simple python script then formats that data for you to debug.
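
As a rough idea of what that formatting script might do, here is a minimal sketch. The 32-bit word layout (stb, we, ack, 13-bit address, 16-bit data) is entirely made up for illustration and is not the format of any real scope core; you would match it to whatever signals you wire into the capture buffer.

```python
def decode_sample(word):
    """Split one 32-bit capture word into named fields (hypothetical layout)."""
    return {
        "stb": (word >> 31) & 1,        # bit 31: strobe
        "we": (word >> 30) & 1,         # bit 30: write enable
        "ack": (word >> 29) & 1,        # bit 29: acknowledge
        "addr": (word >> 16) & 0x1FFF,  # bits 28..16: address
        "data": word & 0xFFFF,          # bits 15..0: data
    }


def format_trace(words, trigger_index=0):
    """Return one text line per sample, numbered relative to the trigger."""
    lines = []
    for i, word in enumerate(words):
        s = decode_sample(word)
        cycle = i - trigger_index
        lines.append(
            f"{cycle:+5d} stb={s['stb']} we={s['we']} ack={s['ack']} "
            f"addr=0x{s['addr']:04x} data=0x{s['data']:04x}"
        )
    return lines


if __name__ == "__main__":
    # Words as they might be read back over the UART: idle, write, ack.
    captured = [0x00000000, 0xC012BEEF, 0x20120000]
    for line in format_trace(captured, trigger_index=1):
        print(line)
```

Numbering cycles relative to the trigger makes it easier to line the dump up with what you expected to happen around the event.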

ZipCPU has a nice writeup on his Wishbone bus scope here (https://github.com/ZipCPU/wbscope) that I enjoyed, and it worked when I tried it out. Alternatively, I know Xilinx has an internal scope IP you can drag and drop into your design (https://www.xilinx.com/products/intellectual-property/chipscope_ila.html), and I've been told Altera does as well. Maybe Lattice does too (this seems like it: https://www.latticesemi.com/-/media/LatticeSemi/Documents/UserManuals/RZ/Reveal34UserGuide.ashx?document_id=50887)? The Xilinx one even allows you to bind to individual wires rather than just the bus itself.

Again, this might be a bit much, but I don't really know what you're trying to debug here; knowing may help a bit.

DiTBho · « **Reply #26 on:** September 05, 2022, 06:43:10 am »

Quote from: PatrickCPE on September 05, 2022, 02:00:00 am

4. Check timing analysis as you go (Xilinx has this, Lattice must have it in their real tool, I'm not sure if the opensource tools support this at all)

This is an example of what I meant by "chip-specific means vendor-tool-specific", and why open-source tools need more hacking.

berke · « **Reply #27 on:** September 05, 2022, 07:37:11 am »

Quote from: PatrickCPE on September 05, 2022, 02:00:00 am

I'll preface this with the caveat that I have no idea what you're trying to debug beyond issues in the design. You also asked for simple, so this might be more in the realm of "complicated but extremely useful given the right scenario".

For the purpose at hand I did find the bug that prompted me to start this thread.

What I was doing is nothing too complicated FPGA-wise: a multi-channel GPS-controlled camera trigger controller listening on an I2C bus. An MCU could kind of do it, but it would be a bitch to get the timings tight to less than 100 ns. Because it's proprietary I couldn't just drop a GPL I2C core in; I had to write it myself, and the NXP I2C spec is not ultra clear (e.g. exactly when does the master release SDA so that the slave may ACK?).

Quote

Here's my typical steps for writing something.

1. Lint design as I write it (I like Verilator's linting tool a lot)

Check: I compile with yosys and iverilog with -Wall, but I couldn't get Verilator to like the SiliconBlue cell library simulation models.

Quote

2. Testbench for each individual module you have within the design (and the associated regression suite, something simple like Make with some grepping works fine)

Almost check, I have benches for complex modules but not for simple ones where I don't suspect bugs (e.g. clock divider or reset controller.)

Quote

3. Testbench for top level of design also added to regression.

Haven't been doing that but I think I should put the effort into it.

Quote

4. Check timing analysis as you go (Xilinx has this, Lattice must have it in their real tool, I'm not sure if the opensource tools support this at all)

icetime gives maximum frequencies and critical path delays, but with generated symbol names it's kind of unreadable and looks like this:

Code:

     2.793 ns net_26661 ($abc$52038$techmap\S_GRN.$3\q[31:0][14]_new_inv_)
        odrv_7_18_26661_26803 (Odrv4) I -> O: 0.372 ns
        t7449 (Span4Mux_v4) I -> O: 0.372 ns
        t7448 (LocalMux) I -> O: 0.330 ns
        inmux_9_17_38161_38188 (InMux) I -> O: 0.260 ns
        lc40_9_17_3 (LogicCell40) in0 -> lcout: 0.449 ns
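
One way to make a report like this more digestible is to ignore the generated names and total the delay per cell type. The sketch below assumes every hop line has the shape shown in the excerpt above ("name (Type) in -> out: delay ns"); I have not checked it against other icetime output variants, so treat the regular expression as a starting point.

```python
import re
from collections import defaultdict

# Matches hop lines such as "t7448 (LocalMux) I -> O: 0.330 ns".
# The net summary line ("2.793 ns net_26661 (...)") does not match.
HOP_RE = re.compile(
    r"^\s*(\S+)\s+\((\w+)\)\s+(\w+)\s*->\s*(\w+):\s*([0-9.]+)\s*ns"
)


def delay_by_cell_type(report):
    """Sum the per-hop delays (in ns) for each cell type in a path report."""
    totals = defaultdict(float)
    for line in report.splitlines():
        m = HOP_RE.match(line)
        if m:
            totals[m.group(2)] += float(m.group(5))
    return dict(totals)


SAMPLE = r"""
     2.793 ns net_26661 ($abc$52038$techmap\S_GRN.$3\q[31:0][14]_new_inv_)
        odrv_7_18_26661_26803 (Odrv4) I -> O: 0.372 ns
        t7449 (Span4Mux_v4) I -> O: 0.372 ns
        t7448 (LocalMux) I -> O: 0.330 ns
        inmux_9_17_38161_38188 (InMux) I -> O: 0.260 ns
        lc40_9_17_3 (LogicCell40) in0 -> lcout: 0.449 ns
"""

if __name__ == "__main__":
    totals = delay_by_cell_type(SAMPLE)
    # Largest contributors first.
    for cell_type, ns in sorted(totals.items(), key=lambda kv: -kv[1]):
        print(f"{cell_type:12s} {ns:.3f} ns")
```

Run over a whole critical path, this gives a quick feel for how much of the delay is routing (muxes, spans) versus the logic cells themselves.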

Quote

5. Gate level sim with generated .sdf file (depending on the scope of your project, doing GLS on the entire thing might be a bit much)

So with laugensalm's tip I can run LUT-level sims with iverilog but I'm not sure it really adds anything to a top-level Verilog simulation. Will have to check to see if the SiliconBlue cell libs have timing information and how well iverilog makes use of them.

Quote

With all of those steps done when you put the design on the part it should work assuming your testbenches properly tested the behavior. If your design doesn't work and you truly can't sim it then there are a few options depending on the complexity of the design.

Like I was saying the main issue is when your FPGA has to talk to the outside world (and usually has to, otherwise why are you using an FPGA?) and then what? You have to model all the external things in Verilog too if you can. Some people said "co-simulation" but I'm not sure I understand the concept.

Quote

1. Simple Designs - UART or Blinky or Pins and logic analyzer for debug as you mentioned

Done

Quote

2. Complicated Bus Based Design - Internal Bus Scope and UART.

UART done, for the next time I think I'll define a JTAG-like daisy chained debug bus with a fixed number of signals and some preprocessor macros to make it easier. If needed I can modify the Yosys code to have it output what's necessary.

Quote

Elaborating on 2, most designs make use of some internal bus like AXI or APB or Wishbone, etc. Designs this complex usually have issues within the communications of different masters and slave devices. The Bus Scope is effectively an internal logic analyzer storing every transaction that occurred on the bus for N cycles. The caveat is that it uses resources on your chip, and the depth is directly related to how much ram you can give up. Your UART will act as a master and allow you to probe the bus scope for its internal data. You can set an internal trigger wire to tell the scope to record. A simple python script then formats that data for you to debug.

Like I was saying there's a nice 512 kiB SRAM chip on the dev board, I could use it to store traces.

Another option is to debug it on an ECP5 which has the JTAG port.

Quote

ZipCPU has a nice writeup on his Wishbone bus scope here (https://github.com/ZipCPU/wbscope) that I enjoyed, and it worked when I tried it out.

I'll check. Instantiating a CPU (PicoRV32 or something) for debugging/self-test feels a bit excessive, but that's more of a gut feeling; rationally it's not necessarily a stupid idea.

Quote

Alternatively, I know Xilinx has an internal scope IP you can drag and drop into your design (https://www.xilinx.com/products/intellectual-property/chipscope_ila.html), and I've been told Altera does as well. Maybe Lattice does too (this seems like it: https://www.latticesemi.com/-/media/LatticeSemi/Documents/UserManuals/RZ/Reveal34UserGuide.ashx?document_id=50887)? The Xilinx one even allows you to bind to individual wires rather than just the bus itself.

Again, this might be a bit much but I don't really know what you're trying to debug here, knowing may help a bit

Thanks for the tips. I may end up switching to the proprietary toolchain, but I'm slightly allergic to the Windows-style clicky-feely interfaces; I'm more of an Emacs+make in a terminal guy.

BTW thanks to everyone for the useful information (and guilt trips about not writing full benches).

The core works, so I'll switch back to the analog side of things.

laugensalm · « **Reply #28 on:** September 05, 2022, 08:10:47 am »

Quote from: SiliconWizard on September 03, 2022, 07:55:08 pm

...

But I think it's definitely nice to have a software simulator - not only speeds up debugging considerably, but also helps trying various approaches and directly see the performance impact with very short development cycles.

Sure for debugging long sequences of code, you'll need a bit of work to set up proper HDL simulations - you usually can't afford simulating millions of cycles until it runs into the bug you're trying to pinpoint. So that requires some thought and observation in order to locate the bug, then trying to find minimal sequences that reproduce it. For this, software simulation helps tremendously. It may look like unnecessary extra work, but for a complex system, the investment definitely pays off.

Just a side note, before flooding the OP with too much info: the CXXRTL backend in yosys also lets you build pretty neat executables for virtual SoCs with full freedom, so you can indeed trigger on an event to debug a particular error scenario. And you'll skip the step of developing your own simulator. However, it has the same basic architecture and issues as Verilator: no built-in awareness of asynchronous/delay timing or delta cycles, and synthesizable code only. Thus it will *not* eat the vendor cell models, so you'll have to stay with synchronous functional simulation most of the time. Since you also have to write the (co-)simulation front end and stimuli drivers yourself (in C++ or using a Python wrapper), I'd rather not promote that method to begin with.

Speaking of timing optimization: yeah, yosys won't do that. But you can get a timing estimate from nextpnr that is pretty accurate. To be honest, at some point, novel methods of hardware generation outside the Verilog/VHDL domain are more effective than implicit attempts by the tools to optimize badly designed pipelines at mapping or even PnR level. In many cases in the past, I've found myself iterating through V-sources in order to avoid architecture-specific congestion. Anyhow, before this becomes too off-topic: a two-way approach isn't bad, with yosys for verification/debugging and vendor tools for the final optimization (their debugging capabilities being again mostly horrible). Systematic bugs can still appear at any layer, but having more options helps to narrow down errors.

josuah · « **Reply #29 on:** September 05, 2022, 08:18:30 pm »

There is simulation (I use Verilator for its rather interesting correctness), and then there is synthesis (Yosys here).
Verilator acts more or less as a "syntax guardian", but some differences between what is supported by Verilator and by Yosys led me to a simulation/hardware mismatch!

The reason: a feature missing from Yosys that still led to valid output (with a warning).

Debug method acquired: looking at the JSON output of Yosys...

If the content of "bits" are numbers, everything is all right: all "5" are connected together, all "6" are, all "7" are... they act as net names:

Code:

                "top.peri0.wb_dat_i": {
                    "hide_name": 0,
                    "bits": [
                        224,
                        223,
                        222,
                        221,
                        220,
                        219,
                        218,
                        216
                    ],

If the content of "bits" has any "x", it suggests there is something that could not be synthesized and is not connected to anything else; the "x" is a placeholder value:

Code:

                "top.peri0.wb_dat_o": {
                    "hide_name": 0,
                    "bits": [
                        "x",
                        "x",
                        "x",
                        "x",
                        "x",
                        "x",
                        "x",
                        "x"
                    ],

This might also be visible on Yosys's "dot" output format, showing a diagram of the design.
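
Eyeballing the JSON gets tedious on a larger design, so the check could be scripted. A minimal sketch follows, assuming the usual layout of Yosys's write_json output (a top-level "modules" object whose entries each have a "netnames" object); the module name "top" and the embedded sample are made up for illustration.

```python
import json


def nets_with_x(design):
    """Return (module, net, x_count, width) for nets whose "bits" contain "x"."""
    found = []
    for mod_name, module in design.get("modules", {}).items():
        for net_name, net in module.get("netnames", {}).items():
            bits = net.get("bits", [])
            # Connected bits are integers; constants are the strings
            # "0", "1", "x" and "z". Only "x" is reported here.
            x_count = sum(1 for b in bits if b == "x")
            if x_count:
                found.append((mod_name, net_name, x_count, len(bits)))
    return found


# Tiny stand-in for a real file loaded with json.load(open(path)).
SAMPLE = """
{
  "modules": {
    "top": {
      "netnames": {
        "top.peri0.wb_dat_i": {"hide_name": 0, "bits": [224, 223, 222, 221]},
        "top.peri0.wb_dat_o": {"hide_name": 0, "bits": ["x", "x", "x", "x"]}
      }
    }
  }
}
"""

if __name__ == "__main__":
    for mod, net, n, width in nets_with_x(json.loads(SAMPLE)):
        print(f"{mod}: {net} has {n}/{width} bits set to x")
```

A net reported here is not automatically a bug (some "x" bits can be intentional don't-cares), but it is a cheap thing to run after synthesis and before flashing the part.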

PatrickCPE · « **Reply #30 on:** September 06, 2022, 04:24:44 am »

I still use iverilog for a lot of my stuff, alongside Verilator if I want to easily model stuff in C++. For things like gate level test benches I fire up the GUI and run it there.

Same thing here on hating the tools' bloated GUIs. I myself use iverilog for the basic stuff more often than not. You can use the Xilinx tools completely via the command line, but it's a fair bit of work to set up the makefile. This ought to be true for all the toolchains. Verilog mode in Emacs allows you to specify custom targets, and I believe it defaults to whatever makefile is in the pwd, but I always just hop to my terminal and run make there. You could probably set up a simple variable in a makefile, with different flag definitions, to choose whether you run the open source tools or the proprietary tools via the command line.

Glad you figured out your problem!

