The DDR3 is coming.... Having a scope would have saved me almost a week, as the CLK output to the RAM wasn't oscillating due to some unknown BS. The design is sound and simulated fine at both logic and gate level, but something about the MAX10 IOs I chose just generated a dead output. It's a pain working around a bunch of Altera's eccentricities, as I avoid their built-in DDRx PHY and do everything in a way which can be supported across all Altera FPGAs as well as be ported to other FPGA vendors.
Wow, you're designing all this without a scope?
Damn clean coding and proper simulation with the DDR3 model...
Yes. Altera offers the 'SignalTap' logic analyzer through JTAG, just like the ones in Lattice Diamond and Xilinx. However, this stupid bug was me driving the CLK output pin as a DDR output, with one half of the clock cycle high and the other half low, to reconstruct a parallel clock. This is fine and works with most older Altera FPGAs like the Cyclone IV, but for some reason the signal gets muted on the MAX10 IOs chosen for the DDR3.
The MAX10/Cyclone V has a nice feature on the CK and DQS lines which offers 128x15ps taps on those IOs, with feedback, so you may tune your sample times specifically for DDR3/4 interfaces. This also means fewer taps are needed on the PLL. But if I use that function, my code would not work on older Cyclones and third-party vendors' FPGAs. These 2 chips also offer dedicated 2-way FIFOs on the DQ lines, which merge with DQS and CK for clocking, and which I am not using. All of this can be easily accessed, pre-packaged in 1 nice module, through the free DDR2/3/4 PHY available with Quartus. But if I designed using it, where would the fun be in doing it all manually in a way which is cross-vendor compatible?
Right now, I'm phase-stepping the PLL to create my 3 required clocks. This is not as fine as 15ps steps, but I do currently get a valid data window of around 10 steps out of 64. This should clean up, as I'm now working on the SDC file. Yes, I'm getting good data right now without any timing constraints other than 'multicycle path' for signals crossing clock domain boundaries.
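For anyone wondering how a window like "10 good steps out of 64" turns into a single phase setting, here is a rough Python sketch of the usual idea: sweep every phase step, record pass/fail, and park in the middle of the widest passing run. This is only an illustration and not my actual HDL. The function names and the pass/fail list are hypothetical, and it assumes the phase steps wrap around a full period so that step 63 is adjacent to step 0.

```python
def widest_passing_window(results):
    """Return (start, length) of the longest circular run of True values.

    `results` is a hypothetical list of pass/fail flags, one per PLL
    phase step, e.g. 64 entries from a read-test sweep.
    """
    n = len(results)
    if n == 0 or not any(results):
        return None          # no step produced valid data
    if all(results):
        return (0, n)        # every step passed
    best_start, best_len = 0, 0
    run_start, run_len = 0, 0
    # Walk the list twice so a run that wraps past the last step is seen whole.
    for i in range(2 * n):
        if results[i % n]:
            if run_len == 0:
                run_start = i % n
            run_len += 1
            if run_len > best_len:
                best_start, best_len = run_start, run_len
        else:
            run_len = 0
    return (best_start, best_len)


def window_center(results):
    """Return the phase step in the middle of the widest passing window."""
    window = widest_passing_window(results)
    if window is None:
        return None
    start, length = window
    return (start + length // 2) % len(results)


if __name__ == "__main__":
    # Made-up sweep: steps 20..29 pass, everything else fails.
    sweep = [20 <= step <= 29 for step in range(64)]
    print(widest_passing_window(sweep), window_center(sweep))
```

Sitting in the middle of the window rather than at the first good step leaves roughly equal margin on both sides for temperature and voltage drift.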
I've generated 3.5k lines of code, all documented and explained with comments on almost every line...