No doubt about using DDR3 @400MHz and its usefulness, but most of the basic dev boards only sport a 200MHz max chip.
200 MHz DDR3? Such chips don't exist. The DDR3 standard defines a minimum frequency of 303 MHz, and in reality you won't be able to find a chip slower than 800 MHz for sale (at least not via official channels).
You're right, I was probably thinking of DDR2 instead here... Regarding availability, the lowest-grade DDR3 listed at Digikey is 533MHz, but it's out of stock; the lowest in stock are 667MHz chips.
That said, whatever DDR3 chip you use, I highly doubt you'll manage to get higher than 400MHz on an ECP5. Lattice's DDR3 controller IP states that 400MHz is the highest clock rate achievable, and that's with the highest speed grade ECP5...
And with a hand-written controller, I wouldn't be hugely surprised if you really struggled to reach those 400MHz.
We're again talking about the ECP5 here. If I want to be able to use DDR3 at 800MHz+, I'll go see elsewhere.
Also, as you said, Lattice's DDR3 controller is not even that good, and the chances of getting/writing something better are slim, at least if you don't have years ahead of you and/or big bucks to shell out.
If the PHY that Clarity generates is any good, I don't doubt that it's possible to design a full controller yourself. It will take some time for sure, but it seems that Brian is determined to make it happen (and his track record for achieving stuff he wants is excellent, if the Z80 thread is anything to go by), so if we all help him out a little, we can have something really good in the end. In fact, this board might be a good platform for hardware validation of this project.
I don't doubt it either, as I said it has already been done (and with 100% open-source tools on top of that.) But it's going to take time, and performance and Fmax are likely going to be a little disappointing.
I'm not looking to develop the kind of applications you mentioned, at least not with such a board and an ECP5. So that would be much more modest.
That's why I think 2 x16 bit DDR3 is the optimal choice. Regular SDRAM is too much of a stone age tech for me to seriously consider.
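For a rough sense of what 2 x16 DDR3 would give at the 400MHz clock discussed above, here is a back-of-the-envelope sketch. It only computes the theoretical peak (clock x 2 transfers per clock x bus width); the function name is made up for illustration, and real throughput will be lower once refresh, row activation and read/write turnaround are accounted for.

```python
def peak_bandwidth_mb_s(clock_mhz, bus_width_bits, transfers_per_clock=2):
    """Theoretical peak bandwidth in MB/s (10^6 bytes per second).

    transfers_per_clock is 2 for any DDR memory (data on both clock edges)
    and would be 1 for single data rate SDRAM.
    """
    return clock_mhz * transfers_per_clock * bus_width_bits / 8


# Two x16 chips side by side form a 32-bit data bus.
ddr3_2x16 = peak_bandwidth_mb_s(400, 2 * 16)
# A single x16 chip at the same clock, for comparison.
ddr3_1x16 = peak_bandwidth_mb_s(400, 16)

print(f"2 x16 DDR3 @ 400MHz: {ddr3_2x16:.0f} MB/s peak")
print(f"1 x16 DDR3 @ 400MHz: {ddr3_1x16:.0f} MB/s peak")
```

So the second chip doubles the peak figure at the same clock, which matters when the clock itself is capped by the FPGA.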
Other choices are DDR2 or LPDDR1, which are a bit simpler in the sense that they don't require write/read leveling, but they require a balanced-tree layout for address/command and clock, so there's no way this can be accomplished on 4 layers. As a matter of fact, at this point I'm not convinced that it's possible to do a DDR3 fly-by on 4 layers, but that is definitely easier than a balanced tree.
I get it that SDR SDRAM is not for you. But it's still useful for a range of applications, and I for one see no reason to switch to a MUCH more complex overall design if I don't need the performance.
Note that I'm now figuring YOU would actually be interested in designing such a board yourself (I remember you've designed other FPGA boards before, the latest being a Spartan 7 one?) It's obvious that if you embark on this project, the choice will be entirely yours. I was just listing what *I* would currently be happy with, but I'm not going to make those choices for you!