Author Topic: The Raspberry PI PICO 2, now has extra RISC-V cores  (Read 17411 times)

Offline SiliconWizard

  • Super Contributor
  • ***
  • Posts: 15185
  • Country: fr
Re: The Raspberry PI PICO 2, now has extra RISC-V cores
« Reply #275 on: September 10, 2024, 09:51:08 pm »
I consider both issues found so far (the susceptibility of the internal buck converter and this IO problem) too "severe" for any serious use, and I can't see how it can be sold as is as a part for any commercial product.

For hobbyist use, as long as you follow the workarounds, that should be usable.

I would have thought the other way around. Professionals embedding it inside something else, which they are designing, should be capable of designing so as to avoid the problems. It's non-ideal, but then so are transistors!

It's hobbyists who are far more likely to run into problems. Especially as they probably don't read the manual in full.

Very much a matter of perspective. Sure, as a professional you're more likely to understand the errata and implement appropriate workarounds, but (usually) the end result is much more critical than any hobby project.
If you kill a $1 MCU as a hobbyist or some parts of your board because you didn't closely follow the errata, that's not a big deal. Forums will just flood with topics about this and countermeasures will be spread relatively quickly. Sure, it may (or not) give this MCU a bad rep in the hobbyist market, but the silicon bugs are there already anyway and the RPi will have to deal with the hobbyist market appropriately if that matters to them.

As a professional, even though the workarounds seem straightforward, that still doesn't look good and doesn't inspire confidence. I have actually used the RP2040 in one commercial project with no failure so far. It had no real issues other than a meh ADC. Potential latch-up, though, is bad. I haven't seen a single modern, commercial MCU with this kind of issue in a long time (I have certainly seen it on ASICs for student projects, though).

The workaround will often imply an increased power consumption, which may or may not be a problem. (But it may.) It also pretty much prevents using internal pull-ups, so requires additional passives. Finally, while the workaround is straightforward and shouldn't leave much surprise, that's still a problem that may be lurking in ways that haven't yet been completely identified.

The buck converter thing is even more concerning to me, as it's not (at least as far as I've personally read) clearly explained in detail. It looks like an aggravated susceptibility to EMI, which obviously could be caused by external factors other than the orientation of its inductor. That doesn't look good. Of course, you still have the option to use an external rail (if I got it right), but that means having to add an external regulator. That's a pain.

I don't know how sales and uses are all going to work out for this chip at this point. I'm a bit curious about that.
 
The following users thanked this post: MK14

Offline RAPo

  • Frequent Contributor
  • **
  • Posts: 752
  • Country: nl
Re: The Raspberry PI PICO 2, now has extra RISC-V cores
« Reply #276 on: September 11, 2024, 03:17:36 pm »
There is a new video on the (Windows) compiler front:
https://youtu.be/e536gcOmMbc
 
The following users thanked this post: MK14

Online MK14 (Topic starter)

  • Super Contributor
  • ***
  • Posts: 4853
  • Country: gb
Re: The Raspberry PI PICO 2, now has extra RISC-V cores
« Reply #277 on: September 11, 2024, 05:24:36 pm »
There is a new video on the (Windows) compiler front:
https://youtu.be/e536gcOmMbc

I watched it a few hours ago. It is a bit over 10 minutes long (Gary Explains).
It is a good video.
It basically shows one quick and easy way to correctly install the latest, properly working (on Windows) compiler toolchains for both the Arm M33 and RISC-V cores, then repeats the (interesting) benchmarks.

These seem to show that the new RP2350 is generally around 1.5 to 2.5 times faster (on either Arm or RISC-V) than the original RP2040. The exception is floating point: the new hardware floating point (which now includes doubles) can give a 6 to 7 times speed-up over the RP2040 on the Arm cores, but there is no such speed-up on the RISC-V cores, which have neither access to that hardware nor a floating point unit of their own.

Apart from floating point (where the RISC-V cores are a lot slower, maybe 6 or 7 times, until they perhaps get their own floating point hardware in the distant future), the RISC-V cores are almost as fast as the Arm M33 ones, which I think is a very impressive achievement.
« Last Edit: September 11, 2024, 05:27:01 pm by MK14 »
 

Offline SiliconWizard

  • Super Contributor
  • ***
  • Posts: 15185
  • Country: fr
Re: The Raspberry PI PICO 2, now has extra RISC-V cores
« Reply #278 on: September 11, 2024, 08:22:22 pm »
the RISC-V cores, are almost as fast as the Arm M33 ones, which I think is a very impressive achievement.

Their RV core appears to be a RPi project if I got it right? https://github.com/Wren6991/Hazard3 , rather than having used one of the existing open-source cores available.
They claim a Coremark of 3.81/MHz, which is one of the (if not the) best performance among existing small RV32 cores. That's not bad!
They wrote it in pure Verilog. I think the existing open-source RV FPUs are all more or less part of cores written in Chisel or SpinalHDL (don't remember which), so piggy-backing one of these in a pure Verilog project (while possible if you just reuse the Verilog output) may not make a lot of sense. And writing a decent FPU from scratch is not an easy task.
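
For a sense of scale, the claimed 3.81 CoreMark/MHz figure can be turned into an absolute score with a single multiplication. The sketch below is only illustrative arithmetic: the 150 MHz clock is an assumption picked for the example, and the function name is made up here rather than taken from any benchmark tool.

```python
def coremark_score(coremark_per_mhz: float, clock_mhz: float) -> float:
    """Absolute CoreMark score: the per-MHz figure multiplied by the clock in MHz."""
    return coremark_per_mhz * clock_mhz


CLAIMED_PER_MHZ = 3.81    # figure claimed by the Hazard3 project, as quoted above
ASSUMED_CLOCK_MHZ = 150   # assumption for illustration only

if __name__ == "__main__":
    score = coremark_score(CLAIMED_PER_MHZ, ASSUMED_CLOCK_MHZ)
    print(f"Estimated CoreMark at {ASSUMED_CLOCK_MHZ} MHz: {score:.1f}")
```

A real measurement will depend on compiler, flags and memory configuration, so this is an upper-level estimate rather than a measured result.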

Keep in mind you can actually run one RV core and one ARM core in parallel on this chip, so you can have the ARM core do all the heavy FP.
 
The following users thanked this post: MK14

Online MK14 (Topic starter)

  • Super Contributor
  • ***
  • Posts: 4853
  • Country: gb
Re: The Raspberry PI PICO 2, now has extra RISC-V cores
« Reply #279 on: September 11, 2024, 08:40:34 pm »
the RISC-V cores, are almost as fast as the Arm M33 ones, which I think is a very impressive achievement.

Their RV core appears to be a RPi project if I got it right? https://github.com/Wren6991/Hazard3 , rather than having used one of the existing open-source cores available.
They claim a Coremark of 3.81/MHz, which is one of the (if not the) best performance among existing small RV32 cores. That's not bad!
They wrote it in pure Verilog. I think the existing open-source RV FPUs are all more or less part of cores written in Chisel or SpinalHDL (don't remember which), so piggy-backing one of these in a pure Verilog project (while possible if you just reuse the Verilog output) may not make a lot of sense. And writing a decent FPU from scratch is not an easy task.

Keep in mind you can actually run one RV core and one ARM core in parallel on this chip, so you can have the ARM core do all the heavy FP.

Raspberry Pi made (or bought the rights to, I'm not sure) the double floating point unit, so they could have either duplicated it (which may have exceeded their gate budget) or shared it between the cores (possibly; I don't know what technical difficulties that would have created).

But for many uses of the RISC-V cores (given that most people will probably use the Arm M33s anyway), either floating point is not important, or emulating it in software is good and fast enough.

I think the Raspberry Pi team employed/hired/contracted (I'm not sure of the exact details) someone (possibly a researcher, or similar) who had already created RISC-V cores (by copying and/or changing the readily available open ones). So it was not too difficult for him to give them one for their project.

There do seem to be blogs and other information sources with much more detail, including the GitHub pages for those particular RISC-V cores. I've only glanced at such things fairly quickly, rather than reading them extensively.
« Last Edit: September 11, 2024, 08:42:12 pm by MK14 »
 

Offline westfw

  • Super Contributor
  • ***
  • Posts: 4281
  • Country: us
Re: The Raspberry PI PICO 2, now has extra RISC-V cores
« Reply #280 on: September 11, 2024, 11:13:34 pm »
Quote
Raspberry Pi made (or bought the rights to, I'm not sure), the double floating point unit
Note that it's a "double accelerator co-processor", not the full ARM double floating point hardware (which I guess is also a coprocessor.  But significantly different.)
 
The following users thanked this post: abraxalito, MK14

Online brucehoult

  • Super Contributor
  • ***
  • Posts: 4416
  • Country: nz
Re: The Raspberry PI PICO 2, now has extra RISC-V cores
« Reply #281 on: September 12, 2024, 02:16:11 am »
Apart from floating point [which they didn't add to the RISC-V cores] the RISC-V cores, are almost as fast as the Arm M33 ones, which I think is a very impressive achievement.

Yes, I agree that a single Raspberry Pi engineer, putting 20% of his time to making the RISC-V core, and ending up with something very very close to Arm's flagship M33 microcontroller core, is quite impressive.

Note also that Coremark is just one benchmark, and it's close enough that there are quite likely to be other benchmarks or algorithms on which the RISC-V cores are faster than the Arm cores.

I have a Pi Pico 2 sitting right in front of me, but I haven't yet used it.  I should probably check a few other benchmarks, not least of course my own Primes one :-) (which I wrote in 2016 before I knew RISC-V existed, to compare Pi 3 and Odroid XU4&C2 and x86, but which tends to do quite well on RISC-V)
 
The following users thanked this post: MK14

Offline SiliconWizard

  • Super Contributor
  • ***
  • Posts: 15185
  • Country: fr
Re: The Raspberry PI PICO 2, now has extra RISC-V cores
« Reply #282 on: September 12, 2024, 02:28:54 am »
Yep, pretty nice work for the RV32 core. I haven't looked deeply into the HDL, but wasn't under the impression that it was an evolution of something existing. It seems 100% original. Though again, haven't inspected it in much detail.

@MK14: Regarding the FPU, I'm not sure what they implemented for the RP235X, but an FPU is tightly coupled to the core, so sharing it across two cores with completely different ISAs is probably hard enough, or costly enough in terms of performance loss, that they haven't bothered. An FPU is not like a peripheral shared on some peripheral bus. Even if they had chosen to separate and "duplicate" it, an FPU tailored for the ARM FPU instructions requires some non-trivial work to adapt to RISC-V, and yes, we don't know about the licensing: they may have bought it and not even have the rights to do this.
 
The following users thanked this post: MK14

Online radiolistener

  • Super Contributor
  • ***
  • Posts: 3925
  • Country: ua
Re: The Raspberry PI PICO 2, now has extra RISC-V cores
« Reply #283 on: September 12, 2024, 05:23:56 am »
It will be interesting to compare RP2350 board with this STM32H562RGT6 board and see results as a table comparison - bench performance, float/double performance, ADC/DAC performance, power consumption, etc...
 

Online brucehoult

  • Super Contributor
  • ***
  • Posts: 4416
  • Country: nz
Re: The Raspberry PI PICO 2, now has extra RISC-V cores
« Reply #284 on: September 12, 2024, 06:39:34 am »
Yep, pretty nice work for the RV32 core. I haven't looked deeply into the HDL, but wasn't under the impression that it was an evolution of something existing. It seems 100% original. Though again, haven't inspected it in much detail.

I haven't looked at the HDL but I've read the manual for the core, trying to figure out where the performance comes from.

- short 3-stage pipeline with just 1 cycle penalty on branches of course helps minimise stalls.

- there is a tiny 1-entry BTB used only for taken backwards branches. i.e. inner loops. FE310, for example, has much more extensive branch prediction but doesn't perform as well on Coremark

- single-cycle multiplier is in fact quite a huge benefit in Coremark. To an unrealistic extent compared to most code I'd say, and all the other cores which have 3 or 4 cycle multipliers aren't wrong.  Almost all code out there just wants barrel shift, or maybe shift-and-add, not full-on multiply. And single-cycle multiply can limit the clock speed ... but this chip only has to do 150 MHz, and on a fairly small 40nm process. In contrast FE310 with 4 cycle multiply was doing 320 MHz on 180nm, and FU540 1.5 GHz on 28nm.

- it does register access in the instruction fetch stage, as the RISC-V encoding makes easy but Arm doesn't. The two register contents are already available at the start of the X (2nd) pipeline stage, as I believe is the EQ/LT/LTU comparison between them.

- there is an option to have a 2nd ALU that does only EQ/LT/LTU for branches with I think the result available a cycle earlier. It's interesting that RISC-V has *only* register-to-register comparisons (with, as mentioned above, register access started very early), not register-to-constant comparisons (except for the Zero register), so you don't have to wait for decoding of the instruction format and the various immediate formats. SLTI/SLTIU use the immediate from the instruction, but are only in the main ALU, a cycle later. EQ/LT/LTU with a boolean output is also a simpler computation than full subtraction. I assume the RP2350 has that option.
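
The effect of the first two points above (1-cycle taken-branch penalty, 1-entry BTB for taken backward branches) can be illustrated with a toy cycle count for a simple counted loop. This is a deliberately simplified sketch, not a model of the actual Hazard3 pipeline: it assumes one cycle per instruction, a fixed 1-cycle penalty whenever the prediction is wrong, and a BTB that starts empty and is filled by the first taken backward branch. The function and parameter names are made up for the example.

```python
def loop_cycles(body_instrs: int, iterations: int, btb: bool = True,
                penalty: int = 1) -> int:
    """Cycles for a counted loop under a toy in-order model.

    Assumptions (illustrative only): one cycle per instruction, the loop ends
    with a backward branch that is taken on every iteration except the last,
    and a wrong prediction costs `penalty` extra cycles.
    """
    cycles = 0
    btb_valid = False  # the single BTB entry starts out empty
    for i in range(iterations):
        cycles += body_instrs               # loop body, including the branch
        taken = i < iterations - 1          # last iteration falls through
        predicted_taken = btb and btb_valid
        if taken != predicted_taken:
            cycles += penalty               # mispredicted: pay the penalty
        if btb and taken:
            btb_valid = True                # remember the taken backward branch
    return cycles


if __name__ == "__main__":
    print("without BTB:", loop_cycles(4, 100, btb=False))
    print("with BTB:   ", loop_cycles(4, 100, btb=True))
```

In this model a 4-instruction loop run 100 times pays the penalty 99 times without the BTB but only twice with it (once to fill the entry, once on loop exit), which is why even a 1-entry BTB pays off on tight inner loops.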
« Last Edit: September 12, 2024, 10:16:53 am by brucehoult »
 
The following users thanked this post: exe, MK14

Offline SiliconWizard

  • Super Contributor
  • ***
  • Posts: 15185
  • Country: fr
Re: The Raspberry PI PICO 2, now has extra RISC-V cores
« Reply #285 on: September 12, 2024, 09:45:29 am »
Interesting. Yep, the single-cycle multiplier certainly makes a significant difference with Coremark. With my own core (which ended up as a 6-stage with good branch prediction and a comfy BTB), I was getting about 3 Coremark/MHz and that fell down to about 2.5 when I made the multiplier 3-cycle (to increase Fmax).
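
To put numbers on that trade-off: going from about 3 to about 2.5 Coremark/MHz only pays off if the slower multiplier buys enough extra clock speed. A rough sketch of the break-even point, using only the two per-MHz figures above (the 100 MHz baseline Fmax is a hypothetical value for illustration, not a measured one):

```python
def breakeven_fmax_gain(per_mhz_before: float, per_mhz_after: float) -> float:
    """Factor by which Fmax must rise so the absolute score does not drop."""
    return per_mhz_before / per_mhz_after


def absolute_score(per_mhz: float, fmax_mhz: float) -> float:
    """Absolute benchmark score at a given clock."""
    return per_mhz * fmax_mhz


if __name__ == "__main__":
    gain = breakeven_fmax_gain(3.0, 2.5)   # figures quoted in the post
    base_fmax = 100.0                      # hypothetical baseline Fmax in MHz
    print(f"Fmax must rise by at least {100 * (gain - 1):.0f}%")
    print(f"single-cycle mul @ {base_fmax:.0f} MHz:",
          absolute_score(3.0, base_fmax))
    print(f"3-cycle mul @ {base_fmax * gain:.0f} MHz:",
          absolute_score(2.5, base_fmax * gain))
```

So the 3-cycle multiplier has to allow roughly a 20% higher clock before it is a net win on this particular benchmark; on code with fewer multiplies the break-even would be lower.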

Register access at the instruction fetch stage looks interesting.
 

Offline exe

  • Supporter
  • ****
  • Posts: 2608
  • Country: nl
  • self-educated hobbyist
Re: The Raspberry PI PICO 2, now has extra RISC-V cores
« Reply #286 on: September 12, 2024, 10:40:49 am »
With my own core

Did you just make your own CPU? How did you do that?
 

Online brucehoult

  • Super Contributor
  • ***
  • Posts: 4416
  • Country: nz
Re: The Raspberry PI PICO 2, now has extra RISC-V cores
« Reply #287 on: September 12, 2024, 11:08:10 am »
With my own core

Did you just make your own CPU? How did you do that?

In recent years hundreds of people (maybe thousands) have designed their own CPU cores for use in the increasingly cheap and capable FPGAs that have become available.

Some have invented their own instruction set at the same time, but most are now using the RISC-V instruction set as it is (in RV32I form) very simple to implement, legal to share the results, and there is a large and growing body of software to run on them, so you can take ready-made open source assemblers, compilers, libraries, RTOSes etc rather than having to do all that yourself too. Even full Linux is possible on your own CPU.

Getting real ASIC chips made is of course another thing altogether. There is Tiny Tapeout, but what you can do there isn't big enough for a full CPU.
 

Offline coppice

  • Super Contributor
  • ***
  • Posts: 9291
  • Country: gb
Re: The Raspberry PI PICO 2, now has extra RISC-V cores
« Reply #288 on: September 12, 2024, 11:55:38 am »
With my own core
Did you just make your own CPU? How did you do that?
Making a working CPU is pretty easy for a moderately competent engineer. FPGAs can make the assembly and test process pretty fast these days. Making a great CPU - fast, dense, dense code, power efficient, etc. - is what is hard.
 

Online brucehoult

  • Super Contributor
  • ***
  • Posts: 4416
  • Country: nz
Re: The Raspberry PI PICO 2, now has extra RISC-V cores
« Reply #289 on: September 12, 2024, 01:02:08 pm »
With my own core
Did you just make your own CPU? How did you do that?
Making a working CPU is pretty easy for a moderately competent engineer. FPGAs can make the assembly and test process pretty fast these days. Making a great CPU - fast, dense, dense code, power efficient, etc. - is what is hard.

DMIPS/LUT is an interesting metric. (On a given FPGA family)

VexRiscv is pretty good on this measure at 190 DMIPS from 816 LUTs on an Arty (i.e. Artix-7) with "small and productive" settings (RV32I, datapath bypass). That's 0.82 DMIPS/MHz at 232 MHz, but it's best just to take the overall DMIPS, as µarch and MHz are not independent variables.
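
The arithmetic behind that metric, as a small sketch using only the figures quoted above (the function names are made up for the example):

```python
def total_dmips(dmips_per_mhz: float, clock_mhz: float) -> float:
    """Overall DMIPS: per-MHz efficiency multiplied by the achieved clock."""
    return dmips_per_mhz * clock_mhz


def dmips_per_lut(dmips: float, luts: int) -> float:
    """Area efficiency: overall DMIPS divided by the LUTs used."""
    return dmips / luts


if __name__ == "__main__":
    dmips = total_dmips(0.82, 232)       # figures quoted in the post
    density = dmips_per_lut(dmips, 816)  # LUT count quoted in the post
    print(f"{dmips:.0f} DMIPS, {density:.3f} DMIPS/LUT")
```

Dividing the overall DMIPS (rather than DMIPS/MHz) by the LUT count keeps the clock in the picture, so a core that trades a little per-MHz efficiency for a higher Fmax is not penalised.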
 

