If the FPGA(s) etc. are approaching maximum capacity, then perhaps we should consider trade-offs.
The majority of tweaks on the list are UI-related, and I seriously doubt the FPGA is involved with the UI itself beyond triggering, accumulating waveform data, and calculating the measurements/stats/events that require full-sample-rate processing. The same goes for LXI, USB and related features - much too expensive to implement in RTL.
Things like correcting the input DC offsets are mostly a software job too, albeit one that reuses existing FPGA functions and other hardware: software switches the input muxes on the enabled channels to GND, reads the FPGA's DC measurements, adjusts the input bias until all active channels read as close to zero DC as possible, restores the input muxes' original settings, done. The 1000Z already does this during Self-Cal, but that is hopeless when the offsets vary with the channel combination or sometimes even with the order in which the channels were enabled. Many scopes run this sort of quick-cal whenever channels get turned on/off or a vertical scale changes. If Rigol wrote their self-cal routines well, this "feature" should cost little more than a single call to the existing function that iteratively zeroes the currently enabled channel(s). They might want to use the previous calibration values as a starting point to skip the part of self-cal where the inputs get railed, though - the current self-cal routines appear to start from scratch every time.
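To make the shape of that quick-cal loop concrete, here is a minimal sketch in Python. Everything here is assumption: `read_dc` and `set_bias` stand in for whatever hooks the real firmware exposes (none of these names come from Rigol's code), and the muxes are assumed to have already been switched to GND before the loop runs.

```python
# Hypothetical quick-cal loop: iteratively trim each enabled channel's
# DC bias toward zero, starting from the previous calibration values
# instead of railing the inputs and starting from scratch.
# read_dc(ch) and set_bias(ch, v) are invented stand-ins for the real
# firmware interface.

def quick_cal(channels, read_dc, set_bias, initial_bias,
              tolerance=0.5, max_iters=20, gain=0.5):
    """channels:     enabled channel ids
    read_dc:      callable(ch) -> measured DC offset (inputs on GND)
    set_bias:     callable(ch, value) -> apply bias DAC value
    initial_bias: dict ch -> starting bias (e.g. last self-cal result)
    Returns the dict of converged bias values."""
    bias = dict(initial_bias)              # reuse previous cal as starting point
    for ch in channels:
        set_bias(ch, bias[ch])
    for _ in range(max_iters):
        all_zeroed = True
        for ch in channels:
            err = read_dc(ch)              # residual DC on this channel
            if abs(err) > tolerance:
                all_zeroed = False
                bias[ch] -= gain * err     # proportional step toward zero
                set_bias(ch, bias[ch])
        if all_zeroed:                     # every active channel within tolerance
            break
    return bias
```

With sane starting values from the previous cal, a loop like this converges in a handful of iterations per channel, which is why it can be cheap enough to run every time a channel is toggled or a vertical scale changes.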