There are a few approaches, but first ask yourself if it needs to run as fast as 42MHz. Slowing things down will make life easier. Ground inductance is probably your biggest problem.
Only the clock and data need worrying about - as long as you leave sufficient settling time, CS and reset won't be an issue. If possible, try to live without readback. If you really need it, e.g. for a ready signal, do the readback at a slower clock rate.
If you want something that will be robust and reasonably flexible in terms of length, I'd put the clock and data over LVDS, which would require a tx/rx pair per display, and should have plenty of margin.
You could use multiple discrete LVDS drivers, but another option would be to use an FPGA, as it will probably have sufficient drivers in one chip. The coding would be trivial.
With carefully chosen drivers and cabling, and a good ground (at least every other core in a ribbon cable, maybe an additional heavy core to reduce inductance), you might just get away with single-ended drivers (ACT logic plus a series damping resistor), but at that frequency my gut feeling is it will be marginal as soon as you go over about 100mm.
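One way to sanity-check that gut feeling is the common rule of thumb that a wire starts behaving like a transmission line once its delay exceeds roughly a sixth of the signal rise time. The sketch below applies it. The 200 mm/ns velocity and the rise times are my assumptions for illustration, not measured values for any particular cable or logic family.

```python
def critical_length_mm(rise_time_ns, velocity_mm_per_ns=200.0, fraction=6.0):
    """Length above which a line should be treated as a transmission line.

    velocity_mm_per_ns: assumed signal velocity in the cable (about 5 ns/m).
    fraction: the rule-of-thumb divisor; 6 is a common conservative choice.
    """
    return rise_time_ns * velocity_mm_per_ns / fraction


def bit_period_ns(clock_mhz):
    """Clock period in nanoseconds for a clock frequency given in MHz."""
    return 1000.0 / clock_mhz


if __name__ == "__main__":
    print(f"Bit period at 42 MHz: {bit_period_ns(42.0):.1f} ns")
    for tr_ns in (1.0, 2.0, 3.0, 5.0):  # assumed driver rise times
        print(f"rise time {tr_ns:.0f} ns: critical length about "
              f"{critical_length_mm(tr_ns):.0f} mm")
```

Note that it is the edge rate of the driver, not the 42MHz clock itself, that sets this length. An assumed 3 ns edge gives about 100mm, and a faster driver makes it shorter.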
If this approach can be made to work then you can probably share drivers - e.g. one driver, with 4 separate damping resistors to 4 displays.
You may also be able to do a half-way solution using AC-series CMOS to drive differential signals, with LVDS receivers at the far end.
Take a look at the line driver section on TI's website to get an idea of what solutions & standards are out there - I'm sure there will be something useful.
For a one-off, your time is going to be worth more than the parts cost, so if you can find a solution that's a good fit in terms of channel count and bandwidth, it would be better to go with that than fiddle around with cheapskate options.
And obviously make sure your clock/data phasing is such that it gives you the maximum headroom for jitter.
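As a rough illustration of what the phasing buys you, the sketch below computes setup and hold margin at the receiver for a given sampling position within the bit. The setup, hold, jitter and skew figures are hypothetical placeholders. Take the real ones from the display datasheet and your own measurements.

```python
def timing_margins_ns(clock_mhz,
                      sample_phase=0.5,   # sampling edge position within the bit, 0..1
                      setup_ns=5.0,       # assumed receiver setup requirement
                      hold_ns=5.0,        # assumed receiver hold requirement
                      jitter_ns=1.0,      # assumed total clock/data jitter
                      skew_ns=0.0):       # data delay minus clock delay along the cable
    """Return (setup_margin, hold_margin) in ns; a negative value means a violation."""
    period_ns = 1000.0 / clock_mhz
    # Data becomes valid skew_ns after the launch edge and stays valid for one period.
    setup_margin = sample_phase * period_ns - skew_ns - jitter_ns - setup_ns
    hold_margin = (1.0 - sample_phase) * period_ns + skew_ns - jitter_ns - hold_ns
    return setup_margin, hold_margin


if __name__ == "__main__":
    for phase in (0.25, 0.5, 0.75):
        su, hd = timing_margins_ns(42.0, sample_phase=phase)
        print(f"phase {phase:.2f}: setup margin {su:+.2f} ns, hold margin {hd:+.2f} ns")
```

With equal setup and hold requirements, sampling mid-bit (launching data on the opposite clock edge) balances the two margins. Any clock-to-data skew in the cabling eats directly into one of them, which is another reason to keep clock and data on matched paths.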
You definitely want to have relatively few displays per driver. If nothing else, this makes it easier to test: once you have one driver working with, say, 4 displays, you can be fairly confident it will work on the full system, as it is just more copies of the same setup.