I don't think so. The STlink is only a low-level interface; the intelligence comes from the software driving it. And yes, I agree that Segger discovered at some point that direct writing was faster. Early versions didn't use direct writing either.
Reading doesn't take long. At startup, EBlink first reads the current flash content to fill its cache so that it can skip unmodified sectors. For a 250 KB image (H7) with an STlink V3 this takes less than a second, so verifying isn't that expensive. A flash write takes a certain amount of time, which is specified by the vendor. If your transfer (the time needed to fill the shift register on the target side) takes about as long as, or longer than, the worst-case flash write time, then there is no need to check the flash busy status and no need to use an external flash loader on the target.
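To make that last condition concrete, here is a rough back-of-the-envelope sketch in Python. The function names and all the numbers are my own illustrative assumptions, not EBlink code and not datasheet values; plug in the programming unit size and worst-case programming time from your vendor's datasheet and the throughput you actually measure on your probe.

```python
def transfer_time_us(unit_bytes: int, link_bytes_per_s: float) -> float:
    """Time to push one flash programming unit over the debug link, in microseconds."""
    return unit_bytes / link_bytes_per_s * 1e6


def can_skip_busy_poll(unit_bytes: int,
                       link_bytes_per_s: float,
                       worst_case_program_us: float) -> bool:
    """True if the link is slow enough that the previous flash write has
    always finished before the next unit has been transferred."""
    return transfer_time_us(unit_bytes, link_bytes_per_s) >= worst_case_program_us


# Hypothetical numbers, for illustration only (not taken from any datasheet).
UNIT_BYTES = 32          # assumed size of one flash programming unit
WORST_CASE_US = 100.0    # assumed worst-case programming time per unit

# Slow link: each unit takes about 160 us to transfer, longer than the write.
slow_link = can_skip_busy_poll(UNIT_BYTES, 200_000, WORST_CASE_US)

# Fast link: each unit takes about 32 us, so the flash may still be busy.
fast_link = can_skip_busy_poll(UNIT_BYTES, 1_000_000, WORST_CASE_US)

print("slow link, skip busy poll:", slow_link)
print("fast link, skip busy poll:", fast_link)
```

With these made-up figures the slow link can write directly without polling, while the fast link would have to check the busy flag (or fall back to a loader on the target).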