The quad SPI on some of the other STM32F4s (e.g. the F446) also lacks a maximum CS period register. The OCTOSPI peripheral does have a setting for that, so you could use that peripheral efficiently with this chip. But since you already have the boards etc., that point is moot.
With a 21 MHz clock and 8 us of CS-low time, you can reliably transfer just 20 bytes. That's 1 command, 3 address and 16 data bytes, for an effective speed of 16 Mbit/s. Painful? Yes. But reading EEPROM was similar back in the day, although those EEPROMs were only 64 Kbit and transferred data at tens or hundreds of kHz.
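To make that budget easy to re-check for other clocks or window lengths, here is a minimal sketch of the arithmetic. The function names are made up for illustration, and the one-byte safety margin is my assumption for what "reliably" costs (21 bytes fit on paper, 20 are used).

```python
def raw_bytes_per_window(clock_hz: int, cs_low_ns: int) -> int:
    """Whole bytes that fit in one CS-low window on a 1-bit SPI bus."""
    bits = clock_hz * cs_low_ns // 1_000_000_000  # integer math avoids float rounding
    return bits // 8


def payload_bytes(clock_hz: int, cs_low_ns: int,
                  overhead_bytes: int = 4, margin_bytes: int = 1) -> int:
    """Data bytes left after command + address bytes and a safety margin.

    overhead_bytes: 1 command + 3 address bytes (from the post).
    margin_bytes:   assumed slack so the window is never overrun.
    """
    return raw_bytes_per_window(clock_hz, cs_low_ns) - overhead_bytes - margin_bytes


def effective_mbit_s(data_bytes: int, cs_low_ns: int) -> float:
    """Payload throughput in Mbit/s, ignoring the CS-high gap between windows."""
    return data_bytes * 8 * 1000 / cs_low_ns


if __name__ == "__main__":
    n = payload_bytes(21_000_000, 8_000)
    print(raw_bytes_per_window(21_000_000, 8_000), n, effective_mbit_s(n, 8_000))
```

The real figure will be slightly lower once the CS-high time between windows is counted.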
You could perhaps contrive something with a timer to drive the CS pin high/low (incl. the 50 ns high time), fire an IRQ with the correct delay, and initiate the first 32 bits of a read/write transaction along with a DMA transfer so that the hardware can do the rest (until you receive the next interrupt). But you would also need to move data in and out of the DMA ping-pong buffers, so that's still quite a lot of CPU work to do every 8 us.
Unfortunately that SPI peripheral doesn't have a FIFO, so you may also need to wait for the first 16 bits to transfer through. Maybe you can get away with 2 sequential loads into DR (as the first 16 bits should move into the shift register immediately, leaving DR free and TXE high?), but I suppose that's undocumented behaviour.
Concerning the refreshing: I suspect the memory bandwidth for doing it with a continuous transfer is simply not there. E.g. if you have an 8 MB SDRAM chip with a 32 ms refresh interval, that means a refresh frequency of 31.25 Hz. That's a continuous bandwidth of 250 MB/s if you want to do it manually by sequentially reading all the data out in 1 continuous burst. You could just about do that with a 16-bit 133 MHz SDRAM chip (or 66 MHz DDR). But this 1-bit 21 MHz chip? No way, I'm afraid.
If you're doing random access transfers to each page, you might as well let the chip do the refreshing itself.
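The refresh numbers above can be checked with a few lines. This is only a sketch of the arithmetic: the function names are hypothetical, and it assumes the whole array has to be read once per refresh interval, as in the example above.

```python
def refresh_bandwidth_mb_s(size_mb: float, refresh_interval_ms: float) -> float:
    """Bandwidth needed to read the entire array once per refresh interval."""
    return size_mb * 1000 / refresh_interval_ms


def bus_bandwidth_mb_s(width_bits: int, clock_mhz: float,
                       transfers_per_clock: int = 1) -> float:
    """Peak bus bandwidth; transfers_per_clock is 2 for DDR."""
    return width_bits / 8 * clock_mhz * transfers_per_clock


if __name__ == "__main__":
    needed = refresh_bandwidth_mb_s(8, 32)      # 8 MB array, 32 ms interval
    sdram = bus_bandwidth_mb_s(16, 133)         # 16-bit 133 MHz SDRAM
    spi = bus_bandwidth_mb_s(1, 21)             # 1-bit 21 MHz SPI
    print(needed, sdram, spi)
```

The 1-bit 21 MHz bus comes out roughly two orders of magnitude short of what a full manual refresh would need.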
I suspect that ESP implementation is also doing 1 or 4 byte transfers (very inefficiently), or, which also wouldn't surprise me, completely disregarding the datasheet. We all know how products are QA tested. It looks like that datasheet is a direct copy-paste. Probably some Shenzhen OEM is baking those SPI RAM chips, and Espressif just drops them onto their designs.
How much memory do you need? Microchip also has SPI SRAM chips that are 1 Mbit in size and go up to 20 MHz, and ISSI has some that even go up to 45 MHz.