Author Topic: ARM Announces Cortex-M85 (Read 2962 times)

Sal Ammoniac · « **on:** April 26, 2022, 05:34:10 pm »

https://www.arm.com/company/news/2022/04/arm-expands-total-solutions-for-iot-portfolio?utm_source=marketo&utm_medium=email&utm_campaign=2022_embiot-lowp_mk02_arm&utm_content=newsroom

https://community.arm.com/arm-community-blogs/b/internet-of-things-blog/posts/introducing-cortex-m85

30% faster than Cortex-M7
ARM Helium
TrustZone

SiliconWizard · « **Reply #1 on:** April 26, 2022, 06:16:08 pm »

Sounds impressive, but why do they all have to promote any new silicon through the lens of freaking IoT and machine learning bullcrap?

uer166 · « **Reply #2 on:** April 26, 2022, 07:03:09 pm »

Quote from: SiliconWizard on April 26, 2022, 06:16:08 pm

Sounds impressive, but why do they all have to promote any new silicon through the lens of freaking IoT and machine learning bullcrap?

Probably because it's easier to understand than "XYZ Crypto accelerator with die shield and SIMD instruction set".

Internet of Shit requires some specific security features, while ML requires many parallel multiply-accumulates at low precision. I wonder if this would be a good fit for some less-than-trivial graphics, I'd like to see the SIMD instruction set that they offer for "ML" or whatever.

AndyC_772 · « **Reply #3 on:** April 26, 2022, 07:07:58 pm »

Great. Where and when can I actually buy one?

uer166 · « **Reply #4 on:** April 26, 2022, 07:16:26 pm »

Quote from: AndyC_772 on April 26, 2022, 07:07:58 pm

Great. Where and when can I actually buy one?

Isn't this just a Core? So, 3-5 years from now for a production chip?

Sal Ammoniac · « **Reply #5 on:** April 26, 2022, 09:50:10 pm »

Quote from: AndyC_772 on April 26, 2022, 07:07:58 pm

Great. Where and when can I actually buy one?

2025 most likely

cdev · « **Reply #6 on:** April 26, 2022, 11:19:06 pm »

I'd like to find an Arm M85 SBC! Any recommendations?

brucehoult · « **Reply #7 on:** April 27, 2022, 05:09:43 am »

Quote from: uer166 on April 26, 2022, 07:03:09 pm

I'd like to see the SIMD instruction set that they offer for "ML" or whatever.

They said. Helium. aka MVE. Eight vector registers of 128 bits each. Kind of a cut down but different version of SVE because apparently SVE doesn't scale in that direction.

I note that RISC-V V extension does scale to this implementation point: for a 32 bit CPU the smallest standard V extension configuration is 32 registers of 32 bits each. Depending on the needs of your algorithm you can dynamically adjust "LMUL" to use it as 16 registers of 64 bits, 8 registers of 128 bits each, or even 4 registers of 256 bits each. The ALU width is an implementation detail.
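To make the LMUL arithmetic above concrete, here is a minimal Python sketch. It only models the register-grouping bookkeeping (32 architectural registers, a 32-bit VLEN as in the smallest configuration mentioned above); the function name `lmul_view` is made up for illustration and is not part of any RISC-V toolchain.

```python
def lmul_view(num_regs=32, vlen_bits=32, lmul=1):
    """Return (usable register groups, bits per group) for an integer LMUL.

    Illustrative bookkeeping only: grouping LMUL registers together
    divides the register count and multiplies the width per group.
    """
    if lmul not in (1, 2, 4, 8):
        raise ValueError("integer LMUL must be 1, 2, 4 or 8")
    return num_regs // lmul, vlen_bits * lmul


for lmul in (1, 2, 4, 8):
    groups, bits = lmul_view(lmul=lmul)
    print(f"LMUL={lmul}: {groups} registers of {bits} bits each")
```

With the defaults, LMUL=4 gives the same shape as Helium (8 registers of 128 bits), which is the implementation point being compared here.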

Helium overlays the 8 vector registers over the floating point registers. e.g. Vector register Q0 overlays single precision floating point registers S0, S1, S2, and S3.
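The overlay is a simple index calculation, sketched below in Python. The helper name `s_registers_for_q` is hypothetical; it just encodes the rule that each 128-bit Q register covers four consecutive 32-bit S registers.

```python
def s_registers_for_q(q):
    """List the single-precision registers aliased by Helium vector register Qq.

    Assumes the layout described above: Q0 covers S0..S3, Q1 covers S4..S7,
    and so on up to Q7.
    """
    if not 0 <= q <= 7:
        raise ValueError("Helium has eight vector registers, Q0..Q7")
    return [f"S{4 * q + i}" for i in range(4)]


for q in range(8):
    print(f"Q{q} overlays {', '.join(s_registers_for_q(q))}")
```

The practical consequence is that scalar floating-point code and vector code compete for the same storage, so a value held in S0 is clobbered by a write to Q0.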

In RISC-V the vector registers are not overlaid. But the "Zfinx" extension supports overlaying the floating point and integer registers.

This ARM documentation is really tricky. They say a lot without saying anything.

Plain ARMv8-M (the Baseline profile) extends the ARMv6-M architecture, i.e. it's like Cortex-M0/M0+.

"ARMv8-M architecture with Main extension" aka "ARMv8-M mainline" extends ARMv7-M.

The Cortex-M85 implements "ARMv8.1-M mainline".

So the instruction set is at heart the same 32 bit Thumb2 ISA as Cortex M3/M4/M7.

nfmax · « **Reply #8 on:** April 27, 2022, 10:01:24 am »

Quote from: Sal Ammoniac on April 26, 2022, 09:50:10 pm

Quote from: AndyC_772 on April 26, 2022, 07:07:58 pm
Great. Where and when can I actually buy one?

2025 most likely

Yeah, maybe. Some time after the war...

SiliconWizard · « **Reply #9 on:** April 27, 2022, 05:38:09 pm »

Quote from: uer166 on April 26, 2022, 07:03:09 pm

Quote from: SiliconWizard on April 26, 2022, 06:16:08 pm
Sounds impressive, but why do they all have to promote any new silicon through the lens of freaking IoT and machine learning bullcrap?

Probably because it's easier to understand than "XYZ Crypto accelerator with die shield and SIMD instruction set".

Internet of Shit requires some specific security features, while ML requires many parallel multiply-accumulates at low precision. I wonder if this would be a good fit for some less-than-trivial graphics, I'd like to see the SIMD instruction set that they offer for "ML" or whatever.

We all know it's just marketing BS because ML and IoT are trendy these days. But then they make it all sound as though there just wasn't anything happening in the world except ML and IoT, which is just stupid. I'd like to see what the lives of those people would look like if all that was left containing MCUs/processors in general were just IoT devices and stuff using ML. Beware what you wish for...

Anyway! As though SIMD was only useful for ML.

Freaking current ML is mostly based on NNs that are basically implemented as convolution. Which ultimately just requires efficient MAC operations. Which are useful for almost any DSP work you can think of.
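As a rough illustration of that point, a plain 1-D convolution is nothing more than a loop of multiply-accumulates, and the same loop is an FIR filter. The sketch below is ordinary Python with a made-up function name (`convolve_mac`); a SIMD unit would simply run several of the inner MACs per instruction.

```python
def convolve_mac(samples, taps):
    """'Valid' 1-D convolution written as explicit multiply-accumulates.

    Each output value is one accumulator fed by len(taps) MAC operations,
    which is the same inner loop an FIR filter or an NN conv layer needs.
    """
    n = len(taps)
    out = []
    for i in range(len(samples) - n + 1):
        acc = 0
        for k in range(n):
            # Convolution flips the kernel relative to the input window.
            acc += samples[i + k] * taps[n - 1 - k]
        out.append(acc)
    return out


print(convolve_mac([1, 2, 3, 4], [1, 1]))  # two-tap moving sum
```

Whether the taps come from a filter design tool or from trained weights makes no difference to the hardware.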

cdev · « **Reply #10 on:** April 27, 2022, 05:52:37 pm »

Imagine all the billions of people whose jobs they want to replace. That's a lot of money.
And it's their top priority.

uer166 · « **Reply #11 on:** April 27, 2022, 05:57:35 pm »

Quote from: brucehoult on April 27, 2022, 05:09:43 am

Quote from: uer166 on April 26, 2022, 07:03:09 pm
I'd like to see the SIMD instruction set that they offer for "ML" or whatever.

They said. Helium. aka MVE. Eight vector registers of 128 bits each. Kind of a cut down but different version of SVE because apparently SVE doesn't scale in that direction.

Somehow that does not sound very impressive, seems like a dedicated MAC/NN/ML peripheral with its own RAM is better value than trying to shoehorn it into a core.

brucehoult · « **Reply #12 on:** April 27, 2022, 10:01:15 pm »

Quote from: uer166 on April 27, 2022, 05:57:35 pm

Quote from: brucehoult on April 27, 2022, 05:09:43 am
Quote from: uer166 on April 26, 2022, 07:03:09 pm
I'd like to see the SIMD instruction set that they offer for "ML" or whatever.

They said. Helium. aka MVE. Eight vector registers of 128 bits each. Kind of a cut down but different version of SVE because apparently SVE doesn't scale in that direction.

Somehow that does not sound very impressive, seems like a dedicated MAC/NN/ML peripheral with its own RAM is better value than trying to shoehorn it into a core.

Of course a system designer is free to do that, and it will be better for some things, especially with large amounts of data to process in a simple way.

But then you run into the whole GPU bottleneck with having to move data backwards and forwards, send a program to the special unit, in yet another ISA, and an inability to mix vector and scalar calculations in a fine-grained way -- basically an instance of Amdahl's law.
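For anyone who wants numbers on the Amdahl's law point, here is a small Python sketch. The fraction of work that can be offloaded (0.8) and the unit speedups are invented for illustration, not measurements of any real accelerator, and the model ignores data-transfer time, which would only make things worse.

```python
def amdahl_speedup(accelerated_fraction, unit_speedup):
    """Overall speedup when only part of the work runs on a faster unit.

    accelerated_fraction: share of the original run time that is offloaded (0..1)
    unit_speedup: how much faster the offloaded share runs on the special unit
    """
    if not 0.0 <= accelerated_fraction <= 1.0:
        raise ValueError("accelerated_fraction must be between 0 and 1")
    if unit_speedup <= 0:
        raise ValueError("unit_speedup must be positive")
    serial = 1.0 - accelerated_fraction
    return 1.0 / (serial + accelerated_fraction / unit_speedup)


# Hypothetical case: 80% of the run time can be moved to the accelerator.
for s in (4, 16, 1000):
    print(f"offloaded part {s}x faster -> overall {amdahl_speedup(0.8, s):.2f}x")
```

With 20% of the time left on the scalar core, the overall gain can never reach 5x no matter how fast the special unit is.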

uer166 · « **Reply #13 on:** April 27, 2022, 11:00:30 pm »

Quote from: brucehoult on April 27, 2022, 10:01:15 pm

But then you run into the whole GPU bottleneck with having to move data backwards and forwards, send a program to the special unit, in yet another ISA, and an inability to mix vector and scalar calculations in a fine-grained way -- basically an instance of Amdahl's law.

Does ML have the same kind of bottleneck? I.e. if the output of the ML is very simple (e.g. classified a duck at coordinates X, Y), and all you provide it are camera frames, isn't all major data exchange localized to the accelerator? I suppose it's still the same issue of needing some real fast caches/RAM locally to walk through all the neuron connection weights/inputs/outputs.

brucehoult · « **Reply #14 on:** April 28, 2022, 12:38:19 am »

Quote from: uer166 on April 27, 2022, 11:00:30 pm

Quote from: brucehoult on April 27, 2022, 10:01:15 pm
But then you run into the whole GPU bottleneck with having to move data backwards and forwards, send a program to the special unit, in yet another ISA, and an inability to mix vector and scalar calculations in a fine-grained way -- basically an instance of Amdahl's law.

Does ML have the same kind of bottleneck? I.e. if the output of the ML is very simple (e.g. classified a duck at coordinates X, Y), and all you provide it are camera frames, isn't all major data exchange localized to the accelerator? I suppose it's still the same issue of needing some real fast caches/RAM locally to walk through all the neuron connection weights/inputs/outputs.

I think you're probably right for ML, but then the whole idea of using vectors architecturally limited to 128 bits and only 128 bits in MVE (Helium) is a really bad fit for ML anyway. Sure it's 4x or 8x better than scalar arithmetic, but the longer vectors possible with SVE (128-2048 bits) make much more sense. Or RISC-V V, which scales from 32 bits to 2^16 bits per vector register.
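The 4x or 8x figure is just the number of elements that fit in one 128-bit register, as this small Python sketch shows. The helper `lanes_per_vector` is made up for illustration, the 256- and 512-bit rows are hypothetical longer vectors for comparison, and the result is an upper bound on elements per instruction, not a measured speedup on any core.

```python
def lanes_per_vector(vector_bits, element_bits):
    """Number of elements of a given width that fit in one vector register."""
    if element_bits <= 0 or vector_bits % element_bits:
        raise ValueError("vector width must be a whole multiple of element width")
    return vector_bits // element_bits


for vector_bits in (128, 256, 512):
    counts = ", ".join(
        f"{lanes_per_vector(vector_bits, e)} x {e}-bit" for e in (32, 16, 8)
    )
    print(f"{vector_bits}-bit vector: {counts}")
```

A fixed 128-bit register holds four 32-bit or eight 16-bit values, and that ceiling does not move unless the architecture allows longer vectors.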

vad · « **Reply #15 on:** April 28, 2022, 04:21:56 am »

Nobody in their right mind would train an ML model on a microcontroller. You should, however, be able to train a TensorFlow model somewhere else, convert it into a TensorFlow Lite model, and then run it on a microcontroller.

Running an ML model does not require enormous CPU resources, and with hardware acceleration I guess that a simple image classification or voice recognition task would be feasible on such an MCU at some low frame rate.

ali_asadzadeh · « **Reply #16 on:** April 28, 2022, 07:07:54 am »

They announced the M55 about 3 years ago, and nothing is in production yet.

Also, the highest-speed M7, the NXP i.MX RT1170, was announced more than a year ago, and the stock is still zero.

nimish · « **Reply #17 on:** May 04, 2022, 04:50:51 am »

Quote from: vad on April 28, 2022, 04:21:56 am

Nobody in their right mind would train an ML model on a microcontroller.

Sometimes this is the only way. Plus tweaking a model with actual data and online continuous training is also useful.

NPUs do extend DSP instructions beyond basic stuff, see https://developer.arm.com/documentation/102420/0200/Programmers-model

It's pretty neat. You can express the entire BLAS in very few instructions with limited datatypes. If you can port your algo to a DNN-style model you would get a huge speedup "for free".

Quote from: vad on April 28, 2022, 04:21:56 am

Running an ML model does not require enormous CPU resources, and with hardware acceleration I guess that a simple image classification or voice recognition task would be feasible on such an MCU at some low frame rate.

Can already be done for speech recognition and other relatively data-light tasks.

Every MCU manufacturer is desperate to get you to deploy some kind of ~AI/ML~ on the Edge!!! So it's a good way to get some free stuff. TinyML is a solution looking for a problem in most cases. That said, porting over some traditional filtering tasks to using an NN framework is a good way to spend a few days

Quote from: ali_asadzadeh on April 28, 2022, 07:07:54 am

They announced the M55 about 3 years ago, and nothing is in production yet.
Also, the highest-speed M7, the NXP i.MX RT1170, was announced more than a year ago, and the stock is still zero.

I'd expect 3-5 ish years from ARM publicly announcing a core design to silicon and maybe longer for MCUs. The replacement cycles are long and they are very cost sensitive.

The RT1170 is available from Avnet: https://www.avnet.com/wps/portal/us/products/avnet-boards/avnet-board-families/maaxboard/maaxboard-rt/ and Google has a Coral dev board with one in the works, apparently.

ali_asadzadeh · « **Reply #18 on:** May 04, 2022, 11:01:08 am »

Quote

The RT1170 is available from Avnet: https://www.avnet.com/wps/portal/us/products/avnet-boards/avnet-board-families/maaxboard/maaxboard-rt/ and Google has a Coral dev board with one in the works, apparently.

Personally, I prefer designing my own boards to using an eval board, because that can solve my specific problem, so I need chips, not eval kits.

Simon · « **Reply #19 on:** May 08, 2022, 08:01:18 pm »

Quote from: SiliconWizard on April 26, 2022, 06:16:08 pm

Sounds impressive, but why do they all have to promote any new silicon through the lens of freaking IoT and machine learning bullcrap?

Because apparently actually writing code is old hat, we all know what IoT looks like, some dev board that you configure with some graphical interface in 5 seconds to do wonderful looking things. Machine learning? uh, same thing, oozes a sense of being a grunty thing you can chuck your library at.

