One of the hardest things to get right in a 'my first FPGA' design is the configuration interface, and it's the clocking that's at the root of the problem.
Chances are your FPGA has a clock input which is reasonably fast and which runs continuously. This master clock is used to drive the various counters, timers, state machines and other logic in your design, and unless you end up simply trying to do too much per clock cycle, it's unlikely to cause you too many problems. The timing relationships between all the various elements are known, and your synthesis tool contains a timing model of the device which allows it to fit your logic in a way which is guaranteed to work.
The problem comes when trying to receive information from another source, such as your MCU, which runs off a different clock. If the relationship between the two clocks is unknown or is not fixed, then it's impossible to know when it's "OK" to sample any input from that source, because there's always the chance that it'll be changing state at just the wrong moment.
When this happens, at best you'll get the wrong data. At worst, it'll break your design in a way which, literally, seems to defy logic. (Search for "metastability" and have a good read.)
So, at some point in your design, you need a mechanism to 'import' data from outside into the master clock domain, i.e. that region of your logic which is all driven from the nice, predictable, continuous clock.
For an SPI interface, and especially for one that doesn't need to run too quickly, the easiest way (IMHO) is to treat SCLK as an asynchronous logic signal, rather than actually using it as a true clock. Whenever you sample a signal which is not synchronous to a known clock, you can ensure it's read reliably by double-sampling. In VHDL:
    IF mclk'event AND mclk = '1' THEN
        sclk_meta <= sclk_pin; -- this is the one and only time that sclk_pin is referenced
        -- <other code involving sclk_meta>
    END IF;
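For reference, the double-sampling idea can be wrapped up as a small reusable block. This is only a minimal sketch: the entity name `sync_2ff` and its port and signal names are made up for illustration, and it assumes a plain two-register chain clocked by the master clock, which is the common way of doing it rather than anything specific to this design.

```vhdl
LIBRARY ieee;
USE ieee.std_logic_1164.ALL;

-- Hypothetical helper: brings one asynchronous input into the mclk domain.
ENTITY sync_2ff IS
    PORT (
        mclk     : IN  std_logic;  -- continuous master clock
        async_in : IN  std_logic;  -- e.g. sclk_pin, asynchronous to mclk
        sync_out : OUT std_logic   -- copy that is safe to use in mclk-clocked logic
    );
END ENTITY sync_2ff;

ARCHITECTURE rtl OF sync_2ff IS
    SIGNAL meta   : std_logic := '0';  -- first stage, may briefly go metastable
    SIGNAL stable : std_logic := '0';  -- second stage, has had a full cycle to settle
BEGIN
    PROCESS (mclk)
    BEGIN
        IF mclk'event AND mclk = '1' THEN
            meta   <= async_in;  -- the only place async_in is referenced
            stable <= meta;
        END IF;
    END PROCESS;

    sync_out <= stable;
END ARCHITECTURE rtl;
```

The cost is latency: `sync_out` follows the pin by up to two master clock cycles, which is part of why this approach only suits inputs that are slow relative to the master clock.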
The SPI interface updates when SCLK is recognised as having changed state. For example, if the interface is designed such that both master and slave set up on falling edges and sample on rising, your code might be:
    IF mclk'event AND mclk = '1' THEN
        sclk_meta <= sclk_pin;
        sclk_prev <= sclk_meta;
        IF sclk_prev = '0' AND sclk_meta = '1' THEN -- rising edge
            mosi_m <= mosi_pin; -- no need to double sample this because we know it's stable at this point
        END IF;
        IF sclk_prev = '1' AND sclk_meta = '0' THEN -- falling edge
            miso_pin <= miso_m;
        END IF;
    END IF;
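To show how that fragment might sit inside a complete design, here is a rough sketch of a receive-only slave built around the same edge detector. Several things in it are assumptions rather than part of the code above: the entity name `spi_slave_rx`, the active-low chip select `cs_n_pin`, the 8-bit MSB-first framing, and the `rx_data` / `rx_valid` outputs. It also assumes the master asserts chip select at least a couple of master clock cycles before the first SCLK rising edge.

```vhdl
LIBRARY ieee;
USE ieee.std_logic_1164.ALL;

-- Hypothetical receive-only SPI slave; names and framing are illustrative.
ENTITY spi_slave_rx IS
    PORT (
        mclk     : IN  std_logic;                     -- continuous master clock
        sclk_pin : IN  std_logic;                     -- SPI clock, treated as data
        cs_n_pin : IN  std_logic;                     -- assumed active-low chip select
        mosi_pin : IN  std_logic;
        rx_data  : OUT std_logic_vector(7 DOWNTO 0);  -- last complete byte
        rx_valid : OUT std_logic                      -- high for one mclk cycle per byte
    );
END ENTITY spi_slave_rx;

ARCHITECTURE rtl OF spi_slave_rx IS
    SIGNAL sclk_meta, sclk_prev : std_logic := '0';
    SIGNAL cs_n_meta, cs_n_sync : std_logic := '1';
    SIGNAL shift_reg : std_logic_vector(7 DOWNTO 0) := (OTHERS => '0');
    SIGNAL bit_count : integer RANGE 0 TO 7 := 0;
BEGIN
    PROCESS (mclk)
    BEGIN
        IF mclk'event AND mclk = '1' THEN
            -- Same double-sampling as in the fragment above
            sclk_meta <= sclk_pin;
            sclk_prev <= sclk_meta;
            -- Chip select is asynchronous too, so it gets the same treatment
            cs_n_meta <= cs_n_pin;
            cs_n_sync <= cs_n_meta;

            rx_valid <= '0';  -- default; overridden below when a byte completes

            IF cs_n_sync = '1' THEN
                bit_count <= 0;  -- not selected: restart the byte
            ELSIF sclk_prev = '0' AND sclk_meta = '1' THEN  -- rising edge
                shift_reg <= shift_reg(6 DOWNTO 0) & mosi_pin;  -- MSB first
                IF bit_count = 7 THEN
                    rx_data   <= shift_reg(6 DOWNTO 0) & mosi_pin;
                    rx_valid  <= '1';
                    bit_count <= 0;
                ELSE
                    bit_count <= bit_count + 1;
                END IF;
            END IF;
        END IF;
    END PROCESS;
END ARCHITECTURE rtl;
```

Everything downstream of `rx_data` and `rx_valid` can then be ordinary master-clock logic. A more cautious design might add one more register stage in front of the edge detector so that it only ever looks at fully settled signals, at the price of an extra cycle of delay and a lower maximum SCLK rate.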
This technique works well for fairly slow SPI clocks. The fastest SPI clock it can possibly work with reliably is about 1/4 the speed of the master clock; this is because there needs to be a master clock edge on which the following conditions are true:
- 2 clocks ago SCLK was low
- 1 clock ago SCLK was high
- on this edge, MOSI is OK to sample, i.e. is not changing
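Those three conditions can be turned into a rough bound. This is only a sketch: it assumes SCLK has a 50% duty cycle and ignores setup/hold times and pin delays, and the symbols $T_{\text{mclk}}$ (master clock period), $T_{\text{sclk}}$ (SCLK period) and $t_{\text{high}}$ (SCLK high time) are names introduced here.

```latex
% Worst case: SCLK rises just after a master clock edge that sampled it low.
% It is first seen high one edge later, and must still be high (so MOSI has
% not yet changed) on the edge after that, when MOSI is sampled.
t_{\text{high}} \ge 2\,T_{\text{mclk}}
\quad\Rightarrow\quad
T_{\text{sclk}} = 2\,t_{\text{high}} \ge 4\,T_{\text{mclk}}
\quad\Rightarrow\quad
f_{\text{sclk}} \le \frac{f_{\text{mclk}}}{4}
```

As a purely illustrative figure, a 50 MHz master clock would put the ceiling at 12.5 MHz, and in practice you'd want to stay comfortably below that.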