Author Topic: RiscV v.s ARM Cortex instructions.  (Read 1460 times)


Offline MT (Topic starter)

  • Super Contributor
  • ***
  • Posts: 1671
  • Country: aq
RiscV v.s ARM Cortex instructions.
« on: February 27, 2023, 05:15:34 pm »
Seems RISC-V doesn't have an equivalent to the ARM Cortex UXTB instruction, so how many instructions would a 32-bit RISC-V need to do the same?

UXTB
Unsigned Extend Byte extracts an 8-bit value from a register,  zero-extends it to 32 bits, and writes the result to the destination register.
The instruction can specify a rotation by 0, 8, 16, or 24 bits before extracting the 8-bit value.
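
For reference, the UXTB behaviour described above can be modelled in a few lines of Python. This is only a sketch of the semantics as quoted (rotate right, then keep the low byte); the helper names `ror32` and `uxtb` are made up for illustration and are not part of any toolchain.

```python
MASK32 = 0xFFFFFFFF


def ror32(value, amount):
    """Rotate a 32-bit value right by `amount` bits."""
    value &= MASK32
    amount %= 32
    return ((value >> amount) | (value << (32 - amount))) & MASK32


def uxtb(value, rotation=0):
    """Model of UXTB: rotate right by 0, 8, 16 or 24, then zero-extend the low byte."""
    if rotation not in (0, 8, 16, 24):
        raise ValueError("rotation must be 0, 8, 16 or 24")
    return ror32(value, rotation) & 0xFF


if __name__ == "__main__":
    word = 0x12345678
    for rot in (0, 8, 16, 24):
        print(f"UXTB(0x{word:08X}, ROR #{rot}) = 0x{uxtb(word, rot):02X}")
```

Because only the low 8 bits survive, the rotation simply selects which of the four bytes of the source word is extracted.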
 

Offline woofy

  • Frequent Contributor
  • **
  • Posts: 363
  • Country: gb
    • Woofys Place
Re: RiscV v.s ARM Cortex instructions.
« Reply #1 on: February 27, 2023, 05:59:37 pm »
Would SRLI followed by ANDI do the job?
 

Online SiliconWizard

  • Super Contributor
  • ***
  • Posts: 15234
  • Country: fr
Re: RiscV v.s ARM Cortex instructions.
« Reply #2 on: February 27, 2023, 08:04:49 pm »
Quote
Would SRLI followed by ANDI do the job?

Yep.
 

Offline brucehoult

  • Super Contributor
  • ***
  • Posts: 4448
  • Country: nz
Re: RiscV v.s ARM Cortex instructions.
« Reply #3 on: February 27, 2023, 08:09:05 pm »
It would indeed. Or just ANDI with 255 if the rotation is zero.

More generally a "SLLI;SRLI" pair can extract any zero-extended field of any bit size or bit position, and "SLLI;SRAI" can extract any sign-extended field of any bit size or bit position.
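
A rough Python model of those sequences, assuming RV32 (32-bit registers). The functions `slli`, `srli`, `srai` and `andi` only mimic the instructions of the same name, and `extract_unsigned` / `extract_signed` are hypothetical helper names for the shift pairs; none of this is generated or verified machine code.

```python
MASK32 = 0xFFFFFFFF


def slli(x, shamt):
    """Shift left logical, truncated to 32 bits."""
    return (x << shamt) & MASK32


def srli(x, shamt):
    """Shift right logical (zero fill)."""
    return (x & MASK32) >> shamt


def srai(x, shamt):
    """Shift right arithmetic (sign fill), result as a 32-bit pattern."""
    x &= MASK32
    if x & 0x80000000:
        x -= 1 << 32  # reinterpret as a negative two's complement value
    return (x >> shamt) & MASK32


def andi(x, imm):
    """Bitwise AND with a small immediate."""
    return x & imm & MASK32


def byte_srli_andi(x, rotation):
    """UXTB equivalent: SRLI by the rotation, then ANDI with 255."""
    return andi(srli(x, rotation), 0xFF)


def extract_unsigned(x, lsb, width):
    """SLLI;SRLI pair: zero-extended field of `width` bits starting at bit `lsb`."""
    return srli(slli(x, 32 - lsb - width), 32 - width)


def extract_signed(x, lsb, width):
    """SLLI;SRAI pair: sign-extended field of `width` bits starting at bit `lsb`."""
    return srai(slli(x, 32 - lsb - width), 32 - width)
```

For a rotation of 24 the ANDI is redundant, since SRLI by 24 already leaves only the top byte. For a rotation of 0 the SRLI is redundant, as noted above.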
 

Online SiliconWizard

  • Super Contributor
  • ***
  • Posts: 15234
  • Country: fr
Re: RiscV v.s ARM Cortex instructions.
« Reply #4 on: February 27, 2023, 08:15:58 pm »
Quote
It would indeed. Or just ANDI with 255 if the rotation is zero.

More generally a "SLLI;SRLI" pair can extract any zero-extended field of any bit size or bit position, and "SLLI;SRAI" can extract any sign-extended field of any bit size or bit position.

I looked in the bitmanip extension, and didn't find anything there that could do it in a single instruction.
Which I found a bit odd. IIRC, such a "byte extraction" instruction was proposed in early drafts. A lot of proposals have been stripped out. I can understand that, as the first draft I read looked like a monster.
And as often in the RISC-V ISA, they probably favored reducing the number of instructions, and considered that this one could be optimized by fusing two existing instructions.

 

Offline woofy

  • Frequent Contributor
  • **
  • Posts: 363
  • Country: gb
    • Woofys Place
Re: RiscV v.s ARM Cortex instructions.
« Reply #5 on: February 27, 2023, 08:47:24 pm »
Quote
I looked in the bitmanip extension, and didn't find anything there that could do it in a single instruction.
Which I found a bit odd. IIRC, such a "byte extraction" instruction was proposed in early drafts. A lot of proposals have been stripped out. I can understand that, as the first draft I read looked like a monster.
And as often in the RISC-V ISA, they probably favored reducing the number of instructions, and considered that this one could be optimized by fusing two existing instructions.

Yeah, I guess there are diminishing returns in adding too many niche instructions. MIPS does have nice bitfield instructions though (EXT and INS in MIPS32 release 2).
 

Offline Nominal Animal

  • Super Contributor
  • ***
  • Posts: 6827
  • Country: fi
    • My home page and email address
Re: RiscV v.s ARM Cortex instructions.
« Reply #6 on: February 27, 2023, 11:48:22 pm »
In general, examining the operation to be done, and dividing it in different ways can yield much better solutions than just mapping machine instructions from one architecture to another.

If you do not consider the capabilities of the target instruction set architecture, and only look for equivalent instructions or equivalent instruction sequences, you won't find the most efficient ways to implement the underlying sequence of operations.

For ARM Cortex-M4 and -M7, you want the ARMv7-M Architecture Reference Manual.   (Both M4 and M7 have the DSP extension built in, e.g. SMLAD and such.)

Even such a simple operation as blending two 15-bit RGB pixel values (r,g,b) and (R,G,B) together, with blend factor p from 0 (0%) to 33 (100%), i.e.
    r' = (r * p + R * (33 - p)) >> 5 = (R*33 + p*(r - R)) >> 5
    g' = (g * p + G * (33 - p)) >> 5 = (G*33 + p*(g - G)) >> 5
    b' = (b * p + B * (33 - p)) >> 5 = (B*33 + p*(b - B)) >> 5
can be implemented in many different ways on 32-bit architectures, depending on exactly what kind of machine instructions are available and efficient.
You definitely do not need to do six multiplications per pixel as the above might suggest.  And yes, 100% blend of rgb, 0% of RGB, is actually p = 2^n+1 for n-bit color components, like n=5 here, not 2^n.  Many disagree, but they're wrong.  It is trivial to verify.  (It is because (2^n-1)*(2^n+1) = 2^(2*n)-1, and the right shift is a truncation/rounding towards zero.)
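
As a hedged illustration only, here is one of the many possible ways to cut down the multiplications, modelled in Python. It assumes an RGB555 layout with red in bits 14..10, green in bits 9..5 and blue in bits 4..0; the names `blend_component`, `spread555`, `pack555` and `blend555` are made up for this sketch. The idea is to spread the three 5-bit components into 10-bit lanes of one 32-bit word, so that each pixel needs one multiplication instead of three. Each lane holds at most 31*33 = 1023, so nothing carries into the neighbouring lane.

```python
def blend_component(r, R, p):
    """Single-multiply form from above: (R*33 + p*(r - R)) >> 5, with p in 0..33."""
    return (R * 33 + p * (r - R)) >> 5


def spread555(pixel):
    """Move the 5-bit fields of an RGB555 pixel into 10-bit lanes at bits 20, 10 and 0."""
    return ((pixel & 0x7C00) << 10) | ((pixel & 0x03E0) << 5) | (pixel & 0x001F)


def pack555(lanes):
    """Inverse of spread555: take the low 5 bits of each lane and repack them."""
    return ((lanes >> 10) & 0x7C00) | ((lanes >> 5) & 0x03E0) | (lanes & 0x001F)


def blend555(fg, bg, p):
    """Blend two RGB555 pixels with weight p (0..33) for fg, using two multiplications."""
    total = spread555(fg) * p + spread555(bg) * (33 - p)
    # After the shift, bits 5..9 of each lane hold leftover fraction bits;
    # pack555 masks them off.
    return pack555(total >> 5)


if __name__ == "__main__":
    print(hex(blend555(0x7FFF, 0x0000, 33)))  # full weight on fg
    print(hex(blend555(0x7FFF, 0x0000, 0)))   # full weight on bg
```

Whether this beats three separate multiply-accumulates depends on the target: on a core with SIMD-style multiply instructions such as SMLAD, a different split may well be better.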
 

Offline westfw

  • Super Contributor
  • ***
  • Posts: 4297
  • Country: us
Re: RiscV v.s ARM Cortex instructions.
« Reply #7 on: February 28, 2023, 01:33:56 am »
Quote
a "SLLI;SRLI" pair can extract any zero-extended field of any bit size or bit position
This feels particularly odd coming from an 8-bit microcontroller world without barrel shifters, where shifts (by more than 1 bit and/or of values wider than 8 bits) are quite expensive.

Similarly, I've seen CM0 compilers implement bit tests with a shift of the desired bit to carry or sign position, followed by a conditional branch.  It essentially burns a register, but there is no "and immediate" instruction on CM0 (non-destructive or otherwise), so it ends up quicker and shorter than other possibilities.
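
A small Python model of that bit-test trick, for illustration only. The mnemonics in the comments (LSLS, BMI) are what I would expect such a sequence to look like on Cortex-M0, not actual compiler output, and both function names are made up.

```python
MASK32 = 0xFFFFFFFF


def bit_is_set_via_shift(value, bit):
    """Move `bit` into bit 31 (the sign position), then test the sign."""
    shifted = (value << (31 - bit)) & MASK32  # e.g. LSLS tmp, value, #(31 - bit)
    return (shifted & 0x80000000) != 0        # e.g. BMI taken when the N flag is set


def bit_is_set_via_mask(value, bit):
    """Conventional test: AND with a single-bit mask (needs the mask in a register on CM0)."""
    return (value & (1 << bit)) != 0
```

The shift version needs a scratch register for the shifted copy, which is the "burns a register" cost mentioned above.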
 

Offline brucehoult

  • Super Contributor
  • ***
  • Posts: 4448
  • Country: nz
Re: RiscV v.s ARM Cortex instructions.
« Reply #8 on: February 28, 2023, 03:32:18 am »
Quote
Yeah, I guess there are diminishing returns in adding too many niche instructions.

Very much diminishing returns. The general rule when we were doing the bitmanip extension was that any new instruction had to replace at least 3 or 4 existing instructions unless it was an incredibly common operation -- and preferably many more than that.

Bitfield extract only replaces two instructions (as above), so at most it saves one clock cycle. Any new bitfield extract instruction would have to be a full 4-byte opcode, which means no code size saving at all over the shift pair sequences, which are also 4 bytes if you don't mind the final extracted value being in the same register as the structure it was extracted from (and that register is one of the 8 "C extension" registers).

Quote
MIPS does have nice bitfield instructions though (EXT and INS in MIPS32 release 2).

I proposed adopting the Motorola 88000 instructions ext, extu, mak, mask and maybe set though the latter doesn't fit RISC-V's 2R 1W pipeline design.
 

Offline westfw

  • Super Contributor
  • ***
  • Posts: 4297
  • Country: us
Re: RiscV v.s ARM Cortex instructions.
« Reply #9 on: February 28, 2023, 07:06:18 am »
Quote
Very much diminishing returns.
That'd be the point of RISC in general, right?  I remember a bunch of "string" instructions added to the PDP-10 that ended up being slower than the regular instructions that they would have replaced.  Just to please the COBOL folks.

 

