Key, though, is the determinism, or the lack of it. Not only does this frustrate end users; it also undermines one of the essentials of troubleshooting: creating reproducible scenarios. When the rug's continually being pulled from under your feet because the compiler decides to take a different route depending on a butterfly flapping its wings in Peru, the task becomes another order of magnitude harder. Sometimes it's a case of less is more.
I presume you do turn off caches, because cache misses (either L1/2/3 or TLB) can cause your program to slow down by an order of magnitude. And that can happen because unrelated processes "evict" your data or virtual addresses from the caches.
Absolutely. You make an important point, but that's very much at the micro level; in an enterprise-scale system, cache-level issues are way down in the noise and generally not configurable at the application layer, if at all. These days pretty much everything is on VMs and SANs anyway, and the chance of your average VM admin giving a crap about cache hits is slim, yet cache behaviour is a key reason why things often run slower on VMs than on physical boxes. In the physical world, the CPU is usually only sporadically driven to full capacity by (relatively) well-defined processes, whereas in the VM world it's frequently a race to keep the underlying CPUs as busy as possible, particularly in commodity cloud-based services.
This has been one of my major concerns about the use of SANs and VMs in enterprise systems for the past fifteen or so years: it's very difficult to provide _any_ form of reliable, guaranteed performance metrics because there are just too many unknowns. In general the SAN guys just want to give you JBOD solutions and simply aren't interested in spindle management. Luckily, tiered SANs have alleviated the situation somewhat.
But at the low level, absolutely. I place pinch-point code in zero-wait-state memory and manually tier memory when necessary; that is something a compiler generally knows little about. Less intensive stuff can sit in slower, cached memory. In general, in an embedded real-time situation, you have to be able to guarantee and profile response times, but luckily these systems are usually simple enough that only a well-defined set of processes runs.
Essentially, non-deterministic systems have kept the consultancy side of my business in work for a couple of decades and continue to do so. Most of my experience away from databases has been with distributed systems using .NET rather than JEE in the enterprise, but I do have experience with both. Quite early on in .NET's lifecycle, for example, Microsoft realised that slow start-up was a major problem for anything non-trivial, so they added the option of pre-compiling ahead of time (NGen). However, at the drop of a hat the runtime will still recompile if the stars are not properly aligned. Unfortunately, this also means that after security updates it is not at all uncommon for all that code to be recompiled all over again, which can take tens of minutes, churning your poor CPU cycles - cycles that would not be spent if precompiled binaries had been distributed instead.
A real example of where adaptive compilation doesn't work well is when multiple end users (or processes) with differing requirements use the same code. The way one user exercises the code differs from the next, and in some situations this leads to serialisation: there is only one execution plan, it is repeatedly recompiled, and you end up with massive overhead as the same code is recompiled on every context switch/iteration/execution - think in terms of 1,000x slower. This became very apparent in SQL Server 2000, for example, and remains a problem to this day unless you enforce a per-session execution plan, which is expensive in memory... and, coincidentally, cache of course! I don't know how Java's HotSpot compiler deals with this situation, but it is very real with adaptive run-time optimisers. Indeed, I wrote a paper on it at the time. It is exactly this kind of unexpected and unintended performance behaviour that makes me wary, but equally it's given me the opportunity to earn a few quid diagnosing and fixing it when it goes wrong, too.
Anyway, these are just my experiences, and as I say I can hardly complain; these kinds of additional complexities and their unintended side effects are exactly what end up paying my bills, after all!