I'd like to see the SIMD instruction set that they offer for "ML" or whatever.
They did say: Helium, aka MVE (M-Profile Vector Extension). Eight vector registers of 128 bits each. Kind of a cut-down but different version of SVE, because apparently SVE doesn't scale in that direction.
I note that the RISC-V V extension does scale to this implementation point: for a 32-bit CPU the smallest standard V extension configuration is 32 registers of 32 bits each. Depending on the needs of your algorithm you can dynamically set "LMUL" to 2, 4, or 8 to use it as 16 registers of 64 bits each, 8 registers of 128 bits each, or even 4 registers of 256 bits each. The ALU width is an implementation detail.
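To make the LMUL arithmetic concrete, here is a minimal sketch in Python. It only models the register-count trade-off described above, not real hardware, and the names (VLEN, NUM_REGS, register_groups) are made up for illustration:

```python
VLEN = 32      # bits per vector register (the smallest configuration mentioned above)
NUM_REGS = 32  # architectural vector registers in the V extension

def register_groups(lmul, vlen=VLEN, num_regs=NUM_REGS):
    """Return (number of register groups, bits per group) for an integer LMUL."""
    if lmul not in (1, 2, 4, 8):
        raise ValueError("integer LMUL must be 1, 2, 4, or 8")
    # Grouping LMUL registers together divides the count and multiplies the width.
    return num_regs // lmul, vlen * lmul

for lmul in (1, 2, 4, 8):
    groups, bits = register_groups(lmul)
    print(f"LMUL={lmul}: {groups} registers of {bits} bits each")
```

With LMUL=4 this gives the same shape as Helium: 8 registers of 128 bits each.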
Helium overlays the 8 vector registers over the floating point registers. e.g. Vector register Q0 overlays single precision floating point registers S0, S1, S2, and S3.
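A rough way to picture the overlay, as a Python sketch. It assumes S0 sits in the lowest 32 bits of Q0 and is just a toy model of a shared register file; regfile, s_to_q, write_s and read_q are hypothetical names, not anything from the ARM documentation:

```python
import struct

# Toy model: 8 Q registers of 16 bytes each, sharing storage with S0..S31.
regfile = bytearray(8 * 16)

def s_to_q(s):
    """Map an S register number to (Q register number, 32-bit lane within it)."""
    return divmod(s, 4)

def write_s(s, value):
    """Write a single precision value into S register s."""
    struct.pack_into("<f", regfile, 4 * s, value)

def read_q(q):
    """Read Q register q as four single precision lanes (lowest lane first)."""
    return struct.unpack_from("<4f", regfile, 16 * q)

write_s(1, 1.0)    # writing S1 ...
print(s_to_q(1))   # ... which is lane 1 of Q0 ...
print(read_q(0))   # ... changes what Q0 holds
```

So a scalar write to S1 is visible as the second lane of Q0, which is the practical consequence of the overlay.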
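In RISC-V the vector registers are not overlaid on anything. But the "Zfinx" extension does overlay the floating point registers on the integer registers: floating point values live in the integer registers and there is no separate FP register file.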
This ARM documentation is really tricky. They say a lot without saying anything.
ARMv8-M baseline extends the ARMv6-M architecture, i.e. it's like the Cortex-M0/M0+.
"ARMv8-M architecture with Main extension" aka "ARMv8-M mainline" extends ARMv7-M.
The Cortex-M85 implements "ARMv8.1-M mainline".
So the instruction set is at heart the same 32-bit Thumb-2 ISA as the Cortex-M3/M4/M7.