So you do have data boundaries? Where do the desired 32 bits fall within those 200 bits? Anywhere, or aligned?
The first 32bits are the sync-pattern.
I really want to avoid any syncing on edges - 2x oversampling should resolve a lot of issues from the beginning.
The gap between the 220-bit bursts is about as long as the 200-bit burst itself.
I've got total control over the bitstream source.
Of course, you may not need an FPGA at all.
1MBps is not super fast, and the Parallax P2X8C4M64P has 8 cores / 180MHz, and has multiple Sync and Async channels. It can manage 16 RX and 16 clocks if needed.
Thank you for the hint!
This one sounds really interesting.
I'm assuming 'good' crystals set the data rates here.
If you have control of the 200/220? bits, and sync is always first, some modest tuning can make your life a whole lot simpler.
Edge sampling is not a problem. With oversampling you are effectively just making the edge decision later, and you are never really sure which sample to reject.
If you have gaps, and sync is the first bits, you already have a start edge, so you can define the first sync bit as zero (== start bit) and read the next 31/32 bits
(or you can make the stream fully 32b async and send 7 wide characters = 238 bit times).
It is not mandatory to make the stream async, it just puts you on a path more travelled.
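To show where the 238 comes from, here is a quick check of the framing arithmetic. It assumes each wide character is 1 start bit + 32 data bits + 1 stop bit; the constant names are just for illustration.

```python
# Assumed async framing per wide character: 1 start + 32 data + 1 stop.
START_BITS = 1
DATA_BITS = 32
STOP_BITS = 1
CHARS_PER_BURST = 7

bits_per_char = START_BITS + DATA_BITS + STOP_BITS   # bit times per character
burst_bit_times = CHARS_PER_BURST * bits_per_char    # total time on the wire
payload_bits = CHARS_PER_BURST * DATA_BITS           # usable data bits

print(bits_per_char, burst_bit_times, payload_bits)  # 34 238 224
```

So 7 characters carry 224 payload bits, enough for a 200 or 220 bit burst, at a cost of 238 bit times.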
eg I think the P2X8C4M64P is OK with an undefined STOP bit, so it could grab the first 32 bits of a sync stream, then pause until a break is found, then repeat.
The P2X8C4M64P has 60+ x 1-32b UARTs, and there are MCUs with UARTs up to 32 bits.
The P2X8C4M64P also has a detector for breaks/gaps in edges, which can give block-sync info; you just need to define the edge rules. A 200b gap is quite relaxed for break detect.
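As a rough illustration of the "grab 32 bits, wait for the gap, repeat" idea, here is a small software model. It is only a sketch and not P2 code. It assumes the line idles high in the gap, the first sync bit is a low start bit, data goes LSB first, and a gap is any idle run of at least 100 bit times. The function names and the threshold are made up for the example.

```python
SAMPLES_PER_BIT = 8   # resolution of this model only, not a P2 setting
DATA_BITS = 32        # bits grabbed after the start bit
GAP_BITS = 100        # assumed break threshold, in idle bit times


def expand(bits, spb=SAMPLES_PER_BIT):
    """Turn a list of bit values into a sampled line (spb samples per bit)."""
    return [b for b in bits for _ in range(spb)]


def word_bits(word, n=DATA_BITS):
    """Bits of `word`, LSB first, as a UART would send them."""
    return [(word >> k) & 1 for k in range(n)]


def capture_after_gaps(samples, spb=SAMPLES_PER_BIT):
    """Return the 32-bit word that follows each sufficiently long idle gap."""
    words = []
    idle_run = 0                      # consecutive idle (high) samples so far
    i = 0
    while i < len(samples):
        if samples[i] == 1:
            idle_run += 1
            i += 1
            continue
        # samples[i] is low; it is a start edge only if a long gap preceded it
        if idle_run >= GAP_BITS * spb:
            last_mid = i + spb * DATA_BITS + spb // 2
            if last_mid >= len(samples):
                break                 # truncated burst, nothing more to read
            word = 0
            for k in range(DATA_BITS):
                mid = i + spb * (k + 1) + spb // 2   # middle of data bit k
                word |= samples[mid] << k
            words.append(word)
            i += spb * (DATA_BITS + 1)  # skip the start bit and 32 data bits
        else:
            i += 1                    # an edge inside the burst: ignore it
        idle_run = 0
    return words


idle = [1] * 200                      # gap about as long as a burst
filler = [0, 1] * 84                  # rest of the burst, 168 more bits
line = expand(idle + [0] + word_bits(0xA5A5F00F) + filler
              + idle + [0] + word_bits(0x12345678) + filler + idle)
print([hex(w) for w in capture_after_gaps(line)])  # ['0xa5a5f00f', '0x12345678']
```

The edge rule here is simply that the payload must never hold the line idle for 100 bit times, which is the kind of rule you would need to guarantee at the source.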
If it was me, I'd change to 32b async, and then a single core in the P2X8C4M64P could manage all 16, or you could spread them over 4 cores to reduce the worst-case trigger jitter.
My reading is that the P2X8C4M64P UART syncs to the start edge and interrupts/flags at the middle of the last bit, so it has very low jitter.
Your worst case would be if all 16 channels are very close to phase locked, but I'd guess sub 200ns jitter on trigger out is possible, 4 channels per core.
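To put that guess in clock terms, here is a back-of-envelope budget. It takes the 180MHz figure quoted earlier, assumes "1MBps" means 1 Mbit/s on the wire, and uses a deliberately simplistic model where four coincident events are serviced back to back by one core. It is not a measurement.

```python
SYSCLK_HZ = 180_000_000   # from the 180MHz figure quoted above
BIT_RATE = 1_000_000      # assumption: "1MBps" means 1 Mbit/s on the wire
JITTER_NS = 200           # the sub-200ns target guessed above
CHANNELS_PER_CORE = 4

clocks_per_bit = SYSCLK_HZ // BIT_RATE                      # clocks in one bit time
jitter_clocks = SYSCLK_HZ * JITTER_NS // 1_000_000_000      # clocks in 200ns
clocks_per_channel = jitter_clocks // CHANNELS_PER_CORE     # naive even split

print(clocks_per_bit, jitter_clocks, clocks_per_channel)  # 180 36 9
```

So 200ns is 36 system clocks, against 180 clocks per bit time. If all four channels on a core fire together, each would have to be handled in roughly 9 clocks under this model.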