Making sure of that is very difficult. If you measure a mean time of X, what fudge factor should you apply to get to the worst case?
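Just to put a shape on what I mean, here's a minimal sketch (plain C, assuming POSIX clock_gettime(); work_under_test() is just a hypothetical stand-in for whatever routine you're characterizing). Even over a huge number of runs, the observed maximum is only a lower bound on the true worst case:

/* Minimal sketch (not a real WCET tool!): time a routine many times and
 * compare the mean against the observed maximum.  work_under_test() is a
 * hypothetical stand-in for whatever you're trying to characterize. */
#include <stdio.h>
#include <stdint.h>
#include <time.h>

static volatile unsigned sink;

static void work_under_test(void)        /* hypothetical payload */
{
    unsigned acc = 0;
    for (unsigned i = 0; i < 1000; i++)
        acc += i * i;
    sink = acc;
}

static uint64_t now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000u + (uint64_t)ts.tv_nsec;
}

int main(void)
{
    const int runs = 100000;
    uint64_t sum = 0, worst = 0;

    for (int i = 0; i < runs; i++) {
        uint64_t t0 = now_ns();
        work_under_test();
        uint64_t dt = now_ns() - t0;
        sum += dt;
        if (dt > worst)
            worst = dt;
    }

    /* The observed max is only a lower bound on the true worst case: an
     * interrupt, cache miss, or rare path you never hit while measuring
     * can still blow past it in the field. */
    printf("mean %llu ns, observed max %llu ns\n",
           (unsigned long long)(sum / runs),
           (unsigned long long)worst);
    return 0;
}

In my experience the observed max lands well above the mean just from cache and scheduler noise -- and that's before you ask how far the *real* worst case sits beyond anything you happened to measure.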
As for "if you have enough cpu power", yes that is a solution. But I'm reminded that you can get a brick to fly if you apply enough power. (Or search yootoob for "flying lawnmower"!)
Is this an argument in favor of my post, or in opposition to it?
I like it; it's actually a really good analogy. It highlights the same gross excess, and the same rational economy.
Back in the day, it took crazy defense projects (or a few very dedicated and probably wealthy amateurs) to come up with junk like that (e.g., those ill-fated flying-saucer platforms). Nowadays, anyone with under a thousand bucks knocking around can slap together something like that!
In the same way, what used to require heroic assembler on one platform is now trivial on today's platforms, even on a budget. Don't let some imagined combination of efficiency, elegance and so on be the barrier to "good enough"!
Doing some boring housekeeping tasks? Don't worry about learning 8051 assembler; just slap in the AVR or STM32 you're familiar with. Cost-reduce it later, when you have the time -- and more importantly, the budget -- to!
Using '10s technology to force '90s games to "run" on an '80s console? Don't worry about learning VHDL for the bus interface; just use the rPi you're handy with!
Fluent in Python but the data-cranking problem would really do better on a DSP or FPGA? Toss in the $50 SBC, who cares!
And of course, that's not to say one should take such liberty for granted: there will always be some applications where the harder solution is required, so there is value in learning lower-level things (even assembly). By all means, take the time to investigate them as you can.
I'm reminded of the old joke about someone entering a programming competition. The winning entry was faster, but contained errors. The losing competitor remarked that he could have made his program ten times faster if he didn't have to give the correct result.
Yup. Timing from start of instruction(s) is what I meant, of course; but knowing when they start is another matter (or when their outputs propagate to their targets).
Cache coherency is a killer, both in large systems and in hard real-time systems.
The larger HPC systems appear to be settling on message passing architectures, which can avoid the problems of cache coherency.
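To put the contrast concretely, here's the shape of it (a minimal MPI sketch in C -- illustrative only; assumes an MPI install, compile with mpicc and run with mpirun -np 2). Each rank owns its own buffer, and data only moves by explicit messages, so there are no shared cache lines for the hardware to keep coherent between the processes:

/* Minimal MPI sketch: each rank owns its own buffer, and data moves only
 * by explicit messages, so nothing is shared and nothing needs a
 * coherency protocol between the two processes. */
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank;
    double buf[4] = {0};

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        for (int i = 0; i < 4; i++)
            buf[i] = i * 1.5;                       /* rank 0's private data */
        MPI_Send(buf, 4, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        MPI_Recv(buf, 4, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);                /* copy arrives explicitly */
        printf("rank 1 got %g %g %g %g\n", buf[0], buf[1], buf[2], buf[3]);
    }

    MPI_Finalize();
    return 0;
}

The cost is that every transfer is spelled out by the programmer, but that's also the point: the hardware never has to guess which copy of a line is current.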
Reminds me of this story:
https://randomascii.wordpress.com/2018/01/07/finding-a-cpu-design-bug-in-the-xbox-360/

tl;dr: they added an instruction to perform an incoherent prefetch, bypassing L2. Turns out... just putting that instruction into memory, anywhere executable at all,* introduced an extremely small probability that it would be speculatively executed, tainting coherency and setting up a crash with absolutely no warning or explanation.
*Hmm, doesn't say if it was tested quite
this far. A branch-indirect instruction could potentially be predicted to land on one, even if the target is not in any intended executable code path. (Also assuming memory is r/w/x tagged, so that general data doesn't get spec-ex.'d; that would just about damn it to a respin, I would guess!) Depends when and how such an instruction is decoded; maybe those are really slow on the platform, decoded late, and it's actually safe?
Tim