I just found out I can control the DS1054Z via a simple telnet connection. You don't need anything fancier than that.
As a quick test I copied/pasted this into the terminal window. I wanted to see how fast the vertical position would change:
:CHAN1:OFFS -0.8
:CHAN1:OFFS -0.7
:CHAN1:OFFS -0.6
:CHAN1:OFFS -0.5
:CHAN1:OFFS -0.4
:CHAN1:OFFS -0.3
:CHAN1:OFFS -0.2
:CHAN1:OFFS -0.1
:CHAN1:OFFS 0.0
:CHAN1:OFFS 0.1
:CHAN1:OFFS 0.2
:CHAN1:OFFS 0.3
:CHAN1:OFFS 0.4
:CHAN1:OFFS 0.5
:CHAN1:OFFS 0.6
:CHAN1:OFFS 0.7
:CHAN1:OFFS 0.8
:CHAN1:OFFS 0.9
:CHAN1:OFFS 1.0
Result: About 5 steps per second - not terribly fast.
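For anyone who wants to repeat the test with the timing built in, here is a rough Python sketch that sends the same sweep over a raw TCP socket and reports steps per second. Assumptions on my part, not taken from the test above: the port number (5555 is what the DS1000Z series is usually reported to use for raw SCPI), the placeholder IP address, and the use of `*OPC?` after each command to wait until the scope has finished before sending the next one.

```python
import socket
import time


def offset_commands(first_tenths=-8, last_tenths=10, channel=1):
    """Build the :CHANn:OFFS sweep in 0.1 V steps (integer tenths avoid float drift)."""
    return [
        f":CHAN{channel}:OFFS {tenths / 10:.1f}"
        for tenths in range(first_tenths, last_tenths + 1)
    ]


def timed_sweep(host, port=5555, timeout=5.0):
    """Send the sweep and return the measured steps per second.

    Port 5555 is an assumption; check the scope's LAN settings.
    """
    commands = offset_commands()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with sock.makefile("rb") as reader:
            started = time.monotonic()
            for command in commands:
                sock.sendall((command + "\n").encode("ascii"))
                # *OPC? is the standard IEEE 488.2 "operation complete" query;
                # reading its reply paces the loop to the scope's own speed.
                sock.sendall(b"*OPC?\n")
                reader.readline()
            elapsed = time.monotonic() - started
    return len(commands) / elapsed


if __name__ == "__main__":
    # 192.168.1.50 is a placeholder address, not a real default.
    print(f"{timed_sweep('192.168.1.50'):.1f} steps per second")
```

The number it prints should be comparable to the figure above, although the `*OPC?` round trip adds a little overhead of its own.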

I'm fairly sure that's not a hardware limitation but it's possible the internal control interface works via SCPI as well and is seeing the same problem. 5 steps per second is about what I see when I twiddle the vertical position knob expertly.
i.e. there could be a microcontroller reading the front panel and sending SCPI commands to the main chip to control the 'scope.
If so, whoever handles the rotary encoder on the front panel may be fighting the same five-times-per-second limitation that I'm seeing. In that case it will never be completely smooth, because the SCPI command processor is slow.
(NB: this is all just speculation. I have no idea how the thing works internally, but I've seen a few similar arrangements in other devices.)
If it does go over SCPI there's still room for improvement, though. A constant five updates per second would be much better than the "wait for the encoder to stop, then move" system they're using at the moment.
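To show what I mean by a constant update rate, here is a minimal Python sketch of a throttle that adds up encoder clicks and emits at most one offset command every 0.2 seconds. It is purely illustrative: the class name, the 0.1 V per click and the whole structure are my own assumptions, and it says nothing about how the real firmware is written.

```python
class OffsetThrottle:
    """Coalesce encoder clicks and emit at most one offset command per interval."""

    def __init__(self, volts_per_step=0.1, min_interval=0.2, channel=1):
        self.volts_per_step = volts_per_step  # hypothetical step size per click
        self.min_interval = min_interval      # 0.2 s gives five updates per second
        self.channel = channel
        self.position = 0        # net clicks already sent to the scope
        self.pending = 0         # clicks received since the last command
        self.last_sent = None    # timestamp of the last command, if any

    def turn(self, steps):
        """Record encoder movement (positive or negative clicks)."""
        self.pending += steps

    def poll(self, now):
        """Return a SCPI command if one is due at time `now`, else None."""
        if self.pending == 0:
            return None
        if self.last_sent is not None and now - self.last_sent < self.min_interval:
            return None
        self.position += self.pending
        self.pending = 0
        self.last_sent = now
        volts = self.position * self.volts_per_step
        return f":CHAN{self.channel}:OFFS {volts:.1f}"
```

The point is that the trace keeps moving while the knob is still turning: clicks that arrive between updates are not lost, they are folded into the next command.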
And ... it should be perfectly possible to attach an external potentiometer (or rotary encoder) for more direct control.
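As a sketch of the potentiometer idea, the only real work is mapping an ADC reading to an offset command. The 10-bit ADC range and the -0.8 V to 1.0 V span below are assumptions chosen to match my test sweep; reading the ADC and sending the string over the network are left out.

```python
def pot_to_offset_command(adc_count, adc_max=1023, low=-0.8, high=1.0, channel=1):
    """Map a raw ADC reading (0..adc_max) linearly onto an offset command."""
    # Clamp first so a noisy or out-of-range reading cannot overshoot the span.
    adc_count = min(max(adc_count, 0), adc_max)
    volts = low + (high - low) * adc_count / adc_max
    return f":CHAN{channel}:OFFS {volts:.2f}"
```

In practice you would still want to limit how often the result is sent, given the roughly five commands per second the 'scope seems to manage.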