Big news on the FPGA front!
BUFGMUX instances play hell with timing constraints. Adding a PLL did nothing to change the output phase noise, which should not have been the case. The problem was caused by the BUFGMUX that switches between the internal and external reference clocks, downstream of the 250 MHz DCM. Adding a timing ignore (TIG) constraint to the external reference connection fixed the constraint propagation issue. With the PLL, the overall timing uncertainty is less than 100 ps in the switched 250 MHz clock domain.
The MCB interface to DDR2 and the SPI SoC interface are both up and running. I have been able to write to and read from both DDR2 chips over the new SPI interface with my Bus Pirate. There are occasional signal integrity issues, but they are likely due to the rather long flying leads on my Bus Pirate: adjusting the lead positioning significantly affects the reliability of the connection. This should not be an issue on the PCB. Currently the SoC interface uses an oversampled SPI implementation that won't be able to run reliably faster than perhaps 30 or 40 MHz. I'm going to look into rewriting it to be source-synchronous so we can crank up the speed.
The SPI protocol that I implemented is a little bit funky, but here is how it works so far:
Address space
0 0000 0000 - 0 07ff ffff channel 1 memory
1 0000 0000 - 1 07ff ffff channel 2 memory
f 0000 0000 - f ffff ffff control and configuration
Commands
1010 bbbb - read from bank b
1011 bbbb - write to bank b
Read command example
Read AA BB CC DD from bank 0 (channel 1 memory) at address 0x00001234
Read data start indicated by leading nonzero byte
MOSI A0 00 00 12 34 00 00 00 00 00 00 00 00
MISO 00 00 00 00 00 00 01 AA BB CC DD xx xx
Write command example
Write AA BB CC DD to bank 0 (channel 1 memory) at address 0x00001234
MOSI B0 00 00 12 34 AA BB CC DD
MISO 00 00 00 00 00 00 00 00 00
Currently, only banks 0 and 1 are accessible (bank 0 accesses U8 and bank 1 accesses U12). To write, send 0xB0 or 0xB1, followed by the 32-bit address, MSB first, followed by the data. To read, send 0xA0 or 0xA1, followed by the 32-bit address, MSB first, and keep clocking. The data will then come out following the first nonzero byte. This is the only way I could think of to get all of the timing to work out. The reply comes back after an undefined number of bytes, but the delay will usually be one byte (you get one 0, then a 1, then the read data). The delay is necessary because DDR2 accesses through the MCB are relatively high latency (~30 clocks at 250 MHz, or about 120 ns), and this can vary due to contention between multiple MCB ports. Eventually I will add another bank for configuring the logic - DDS/DSP, modulation, frequency counters, etc.
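To make the framing concrete, here is a rough host-side sketch in Python of how the frames above could be built and how a read reply could be parsed. It is not the code I am actually running. The function names, the default of 4 spare padding bytes, and the choice to skip the 5 header bytes before looking for the marker are all assumptions made for illustration. Actually clocking the bytes out over SPI is left to whatever adapter is in use.

```python
HEADER_LEN = 5  # 1 command byte + 4 address bytes


def command_byte(write: bool, bank: int) -> int:
    """Return 1011bbbb for a write or 1010bbbb for a read."""
    if not 0 <= bank <= 0xF:
        raise ValueError("bank must fit in 4 bits")
    return (0xB0 if write else 0xA0) | bank


def build_write(bank: int, address: int, data: bytes) -> bytes:
    """MOSI frame for a write: command, 32-bit address (MSB first), data."""
    return bytes([command_byte(True, bank)]) + address.to_bytes(4, "big") + bytes(data)


def build_read(bank: int, address: int, length: int, padding: int = 4) -> bytes:
    """MOSI frame for a read: command, address, then zero bytes to clock out the reply.

    The reply delay is not fixed, so `padding` extra bytes (an assumed default)
    are clocked in addition to the requested data length.
    """
    header = bytes([command_byte(False, bank)]) + address.to_bytes(4, "big")
    return header + bytes(padding + length)


def parse_read_reply(miso: bytes, length: int) -> bytes:
    """Return the `length` data bytes that follow the first nonzero byte.

    Assumes MISO stays at 00 until the marker and that the bytes received
    while the header was being sent can be ignored.
    """
    for i in range(HEADER_LEN, len(miso)):
        if miso[i] != 0:
            data = miso[i + 1 : i + 1 + length]
            if len(data) < length:
                raise ValueError("reply truncated; clock more padding bytes")
            return data
    raise ValueError("no start marker found; clock more padding bytes")
```

With the example transfers above, build_read(0, 0x1234, 4) produces the 13-byte MOSI sequence shown, and feeding the example MISO bytes to parse_read_reply returns AA BB CC DD. If the marker arrives later than expected because of MCB contention, the parser raises an error and the read can be retried with more padding.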