I guess this will cause some trouble for the other CPU manufacturers. For example the STM32F439BIT6 costs EUR 10 in higher quantities, but has only 256 kB RAM and runs at 180 MHz. Why would anyone buy such a CPU anymore?
This shows a fundamental misunderstanding.
ST is not a "CPU manufacturer". STM32F439BIT6 is not a CPU. It's an MCU. Nobody buys it for its CPU core. You buy it if it offers the right combination of core, peripherals, documentation and tools to do the job. Especially the peripherals. For a typical MCU job, 256 kB RAM is just huge, and the core frequency is unimportant. Even if it processes a lot of data, the point is to use DMA to do it.
Modern ARM MCUs tend to offer advanced and plentiful peripherals, compared to old 8-bitters. This is a major argument for using them, IME. Not the core. The core "just is" what it is, and I have been happy with ARM - would be happy with many others, as well. It's unimportant.
ARM is acceptable because it hides its "silicon bloat" fairly well. It just works in a simple and understandable manner. But in a typical "modern ARM microcontroller", the ARM core is not the best or most interesting part, no way!
My current project involves STM32H743. I use it because I need all of this:
* Digital camera interface (80MHz synchronous parallel bus input with FIFO and DMA)
* HRTIM module for accurate, fast PWMs for software defined DC/DC
* Two comparators
* Two 3-phase motor drives
* All three ADCs are really needed
* Both DACs - one creates control signals for the DC/DC, the other plays audio
* Five SPIs
* A lot of DMA channels to serve all this.
.... AND a fast core, which can run without caches, in a predictable manner, completely from core-coupled ITCM memory, with most real-time data in core-coupled DTCM memory.
... in a single project.
It's about the big picture. The core itself is fairly unimportant, although in this case I also need the processing power, for which the 64-bit data bus, 64-bit memories and fairly usable set of 64-bit instructions, as well as the SIMD set, are good to have. That said, in many of my projects, this processing power is secondary or completely unimportant.
The cache point is quite moot. Most ARM MCUs that are high-end enough to include caches also include core-coupled memories, typically large enough to fit all timing-critical code and data (at least when you write it sanely and not overbloated), so you can run with caches disabled. In fact, I have never enabled caches in my projects. Non-critical slow parts run off the flash directly, everything else from ITCM. But I guess caches are important when running a lot of bloated stuff which just doesn't fit in core-coupled memories - and when you have so many abstraction layers in between that you can no longer take advantage of the heterogeneous structure of a modern MCU, but need to treat everything as a single big "blob" called "CPU", measured in megabytes and megahertz. I just don't work like that with MCUs. I use as many of the features an MCU can offer, in an efficient way, and then I use an integrated single-board computer with enough resources when I want to run Linux to do "computer tasks" (requiring gigabytes of memory, an Internet connection, a lot of calculation on non-realtime data, etc. - in an abstract way where I don't need to think about the hardware too much).
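To make the "timing-critical code in ITCM, real-time data in DTCM" idea concrete, here is a minimal sketch of how it is commonly done with GCC or Clang section attributes. The section names `.itcm_text` and `.dtcm_data`, and the names `control_step` and `filter_state`, are made up for illustration. This assumes your linker script maps those sections to the TCM regions and your startup code copies the code and initial data there before use. It is not taken from any particular vendor project.

```c
#include <stdint.h>

/* Hypothetical section names: they must match output sections that the
 * linker script places in ITCM and DTCM. On a hosted build (Linux/GCC)
 * they are just extra ELF sections, so the code still compiles and runs. */
#define ITCM_CODE __attribute__((section(".itcm_text"), noinline))
#define DTCM_DATA __attribute__((section(".dtcm_data")))

/* Real-time state kept in core-coupled data memory.
 * On a real target the startup code must initialise this section. */
DTCM_DATA static int32_t filter_state = 0;

/* Timing-critical routine: first-order low-pass, y += (x - y) / 8.
 * Placed in core-coupled instruction memory, so its timing does not
 * depend on flash wait states or cache hits. */
ITCM_CODE int32_t control_step(int32_t sample)
{
    filter_state += (sample - filter_state) / 8;
    return filter_state;
}
```

Everything without such an attribute stays in the default sections and runs from flash, which matches the split described above: slow, non-critical parts from flash, the rest from core-coupled memory.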
Coming back to your "why would anyone" question: people still design in those 8-bitters and run them at 1MHz, with a few kilobytes of memory, in huge volumes, because they are enough for so many tasks. So why would 256K and 180MHz be "too little" for anyone?