Author Topic: Can I increment an integer faster if 'step' is a power of 2?  (Read 4320 times)

0 Members and 1 Guest are viewing this topic.

Offline 741Topic starter

  • Frequent Contributor
  • **
  • Posts: 402
  • Country: gb
    • Circuit & PCB Design (small PCB quantities OK)
This kind of thing - of course the compiler is free to lose any 'help' I am trying to give it with these powers of two, but anyway...

#define INCREMENT 64
uint16_t u;

for(u = 0; u < LIMIT; u += INCREMENT)
{
   //Do whatever
}

Offline tggzzz

  • Super Contributor
  • ***
  • Posts: 20804
  • Country: gb
  • Numbers, not adjectives
    • Having fun doing more, with less
Re: Can I increment an integer faster if 'step' is a power of 2?
« Reply #1 on: July 12, 2021, 12:18:00 pm »
With questions like this, the answer will depend on the what's inside the loop, target processor, the compiler, the compiler version, the compilation flags - and what is/isn't in any cache.

The contents of the loop would have to be small for it to make a significant difference - if any. Don't microoptimise!

If you want to play around, insert your code into https://godbolt.org/
There are lies, damned lies, statistics - and ADC/DAC specs.
Glider pilot's aphorism: "there is no substitute for span". Retort: "There is a substitute: skill+imagination. But you can buy span".
Having fun doing more, with less
 
The following users thanked this post: thm_w, newbrain

Offline T3sl4co1l

  • Super Contributor
  • ***
  • Posts: 22436
  • Country: us
  • Expert, Analog Electronics, PCB Layout, EMC
    • Seven Transistor Labs
Re: Can I increment an integer faster if 'step' is a power of 2?
« Reply #2 on: July 12, 2021, 12:27:30 pm »
I suppose an increment of 1 might be slightly faster on some platforms.  Difference being there's usually an INC instruction, whereas for others it's LDI reg, INCREMENT / ADD accum, reg, or ADD imm, more or longer instructions in either case.

I would not worry about it, at all.  There are far more important things to think about.  Loop optimization is a late, late stage of development.

The loop check may also vary, for example an increasing loop uses CMP accum, LIMIT which may be longer than the TST accum (or none at all when the ADD/SUB instruction returns zero) for decreasing to zero.  Mind that comparisons to zero are only equal when the initial value is a multiple of the increment.

Loops can be on pointers as well.  Instead of

Code: [Select]
for (i = 0; i < size; i++) {
q[i] = p[i];
}

you might have something like,

Code: [Select]
void* qq = q;
void* pp = p;
void* ppend = pp + size;
for (; pp < ppend; ) {
*q++ = *p++;
}

In certain (simple) cases, the compiler may elucidate this for you -- check the output listings.  This tends to be more compact, but may not necessarily be faster, for example on 8-bit platforms the 16-bit pointer compare requires at least two instructions (compare, compare with carry), usually four (load immediate for each byte of the comparison).

And in such cases, you could go even further and think about page-aligned accesses, for example copying a buffer aligned to 256-byte start and length, so that the low-byte compare can be trivial (0x00).  The compiler won't know about this; this only applies once you're deep in the assembler, desperate for CPU cycles, pushing around object allocations for lucky breaks like this.

Tim
Seven Transistor Labs, LLC
Electronic design, from concept to prototype.
Bringing a project to life?  Send me a message!
 

Offline gbaddeley

  • Regular Contributor
  • *
  • Posts: 210
  • Country: au
Re: Can I increment an integer faster if 'step' is a power of 2?
« Reply #3 on: July 12, 2021, 12:34:16 pm »
Integer Increment is basically the same low level logic complexity as add, whether it’s a power of 2 or other integer. Some cpus have a shorter instruction size for increment than add (1), to make a small program size saving.

We’re you hoping to make the C for loop run with lower overhead?
Glenn
 

Offline 741Topic starter

  • Frequent Contributor
  • **
  • Posts: 402
  • Country: gb
    • Circuit & PCB Design (small PCB quantities OK)
Re: Can I increment an integer faster if 'step' is a power of 2?
« Reply #4 on: July 12, 2021, 01:23:17 pm »
Integer Increment is basically the same low level logic complexity as add

Thanks, that is what I suspected. This is on a PIC, and it is ages since I used assembly. The loop speed matters, since I want low overhead to be sure that N fixed delays add up to what I intended  (N * fixed delay). I have to take the ADC sample at the correct time relative to initiating the readings.


TM4 screen dumps show Microchip's RE46C200 smoke detector in a test mode. Yellow is integrator, blue is DAC.

Integrator slope is proportional to illumination. First the chip takes DARK readings, then LIT readings.

Once the fixed-time (here 200us) integration completes, the DAC ramps up at a fixed rate until is change in value matches the preceeding integration (in it's day job it is part on an on-chip ADC).

---> The time taken by the DAC slope varies with illumination.

Note 1: DAC limit is 5.3V.
Note 2:  I can't easily use the datum of the small dip in IR LED voltage (start of LIT integration, IR LED is on) with the hardware I have in place. I'd like to use timings from initial start-test only.


« Last Edit: July 12, 2021, 01:32:07 pm by 741 »
 

Online coppice

  • Super Contributor
  • ***
  • Posts: 9614
  • Country: gb
Re: Can I increment an integer faster if 'step' is a power of 2?
« Reply #5 on: July 12, 2021, 01:35:54 pm »
I suppose an increment of 1 might be slightly faster on some platforms.  Difference being there's usually an INC instruction, whereas for others it's LDI reg, INCREMENT / ADD accum, reg, or ADD imm, more or longer instructions in either case.
A number of machines allow fast incrementing by 1, 2, 4 and perhaps 8, as these are the increments needed for incrementing the addresses of the various sizes of variable supported by the machine.
 
The following users thanked this post: T3sl4co1l, 741

Offline T3sl4co1l

  • Super Contributor
  • ***
  • Posts: 22436
  • Country: us
  • Expert, Analog Electronics, PCB Layout, EMC
    • Seven Transistor Labs
Re: Can I increment an integer faster if 'step' is a power of 2?
« Reply #6 on: July 12, 2021, 02:51:25 pm »
I suppose an increment of 1 might be slightly faster on some platforms.  Difference being there's usually an INC instruction, whereas for others it's LDI reg, INCREMENT / ADD accum, reg, or ADD imm, more or longer instructions in either case.
A number of machines allow fast incrementing by 1, 2, 4 and perhaps 8, as these are the increments needed for incrementing the addresses of the various sizes of variable supported by the machine.

Ah, neat.  I haven't seen any like that (that I know of).  Though I do know of an analogous feature.  On 80386+, the extended address calculation allows addition, immediate offset, and multiplication by several values (depending on whether you use the multiplicand in the addition), all done in separate hardware at nearly no expense.  This is one case where the loop,

Code: [Select]
char *p;
for (i = 0; i < MAX; i++) {
p[i * STEP]++;
}

is one of the better ways to do it (assuming it's followed literally, and not optimized to another form like as mentioned above).  Mind, this multiplication is exactly implicit in non-void pointers: "*(p+1)" aka "p[1]" is the next element in the array, of whatever size p is typed as; it's not the raw address!  I use char above, to force a size of 1 so the array access is obviously spaced out.  I could've written it for any sized object, but it may not be as apparent unless one is aware of this.  Usual caveats, mind alignment and padding, depending on platform, if you need to save memory or optimize for accesses like these.

Tim
Seven Transistor Labs, LLC
Electronic design, from concept to prototype.
Bringing a project to life?  Send me a message!
 

Offline vad

  • Frequent Contributor
  • **
  • Posts: 500
  • Country: us
Re: Can I increment an integer faster if 'step' is a power of 2?
« Reply #7 on: July 12, 2021, 03:44:09 pm »
PIC assembly language has INC and INC2 instructions, but they only increment register value by 1 or 2 correspondingly. Compilers will handle adding of any other constant value to a register using instruction ADDLW or similar.

Check your exact PIC model’s programming manual to see the number of cycles it takes ADDLW instruction to execute. If it is 1 cycle (the most likely answer), then you cannot go any faster than that. INC and INC2 would not be faster.

If the number of cycles is greater than 1, you are still out of luck, unless you are incrementing by 1 or 2…
 

Offline SiliconWizard

  • Super Contributor
  • ***
  • Posts: 15479
  • Country: fr
Re: Can I increment an integer faster if 'step' is a power of 2?
« Reply #8 on: July 12, 2021, 05:22:22 pm »
With questions like this, the answer will depend on the what's inside the loop, target processor, the compiler, the compiler version, the compilation flags - and what is/isn't in any cache.

Of course. One obvious point here - the OP's question sort of assumes that regular integer addition could be more expensive than some other instruction on a given target. This is very uncommon. I don't think I have ever seen such a target myself, or it's so old that I don't remember.

At least... for addition of 'native' width. I haven't seen the OP mentioning anything about the target, so... we can just guess. All they said was "PIC" unless I missed something else, and there are 8-bit to 32-bit MCUs in the PIC line. So...

Obviously, in the OP's example, if you want to increment 16-bit integers on an 8-bit target (with 8-bit addition only), you can optimize the incrementation if the increment is all zeroes in the lower half, because it will make no difference to the lower half. So you can just add the upper half with the increment. That would work for any increment that is all zeroes in its lower half, not just powers of two. Now that's good if the increment is known in advance of course; if it has to be tested at each iteration, the the cost of the test (and branching) will very likely counterbalance any benefit.

This is just a general approach here. I'm not considering possible specific instructions that could exist on a given target. Some mentioned address incrementing, if I got it right, and again that it be more efficient would all depend on the exact architecture. If you're incrementing a 16-bit integer on at least a 16-bit processor, then there is really nothing to gain whatsoever (except maybe on a very crippled 16-bit CPU that would use more cycles to issue a 16-bit addition than other operations, which would be pretty rare.) But for an 8-bit target (except possibly one that would still have 16-bit addition using register pairs, yes that has existed), the only generic optimization I can think of would be what I said above: the lower half being ZERO.

Now if we take that question outside of the software world and consider pure digital implementation, then certainly, incrementing by a power of 2 will be much less expensive in terms of resources than incrementing by a number that has more than 1 bit set.
« Last Edit: July 12, 2021, 05:32:16 pm by SiliconWizard »
 

Offline T3sl4co1l

  • Super Contributor
  • ***
  • Posts: 22436
  • Country: us
  • Expert, Analog Electronics, PCB Layout, EMC
    • Seven Transistor Labs
Re: Can I increment an integer faster if 'step' is a power of 2?
« Reply #9 on: July 12, 2021, 05:31:56 pm »
Now if we take that question outside of the software world and consider pure digital implementation, then certainly, incrementing by a power of 2 will be much less expensive in terms of resources than incrementing by a number that has more than 1 bit set.

Heh, would an FPGA even make that optimization?

I guess so?  Not sure which way exactly it'd be done.  For example, using adder builtins, but right/left aligning them, and strapping the extra (unused or default) bits to zero.

Or if there's no adder fabric (I forget if anyone bothers to implement that actually? multipliers definitely) then it doesn't matter, it's just a handful of gates, hopefully utilizing their carry-ins.

Tim
Seven Transistor Labs, LLC
Electronic design, from concept to prototype.
Bringing a project to life?  Send me a message!
 

Offline SiliconWizard

  • Super Contributor
  • ***
  • Posts: 15479
  • Country: fr
Re: Can I increment an integer faster if 'step' is a power of 2?
« Reply #10 on: July 12, 2021, 05:45:25 pm »
Now if we take that question outside of the software world and consider pure digital implementation, then certainly, incrementing by a power of 2 will be much less expensive in terms of resources than incrementing by a number that has more than 1 bit set.

Heh, would an FPGA even make that optimization?

I guess so?  Not sure which way exactly it'd be done.  For example, using adder builtins, but right/left aligning them, and strapping the extra (unused or default) bits to zero.

Or if there's no adder fabric (I forget if anyone bothers to implement that actually? multipliers definitely) then it doesn't matter, it's just a handful of gates, hopefully utilizing their carry-ins.

Tim

On FPGAs specifically (which I was not addressing in particular), I guess it will all depend on the underlying architecture. But as with my optimization tip above, the minimum optimization it will issue is that it will NOT require adders for all the least significant bits that are zero. That still gives more opportunities for optimization. Now the end result will of course all depend on the width of the number to increment, the exact implementation of built-in adders, and so on. And the increment itself, of course. The more LS bits cleared (of course if the increment is constant), and the less expensive it is, in general.
« Last Edit: July 12, 2021, 05:47:51 pm by SiliconWizard »
 

Offline TheCalligrapher

  • Regular Contributor
  • *
  • Posts: 159
  • Country: us
Re: Can I increment an integer faster if 'step' is a power of 2?
« Reply #11 on: July 12, 2021, 09:03:59 pm »
This kind of thing - of course the compiler is free to lose any 'help' I am trying to give it with these powers of two, but anyway...

#define INCREMENT 64

If an optimization exists on your platform, it is definitely an "easy and obvious" optimization, which will be applied by the compiler itself. Nothing needs to be done on your side, unless you know for sure that your compiler is deficient in this regard.
 

Offline rstofer

  • Super Contributor
  • ***
  • Posts: 9941
  • Country: us
Re: Can I increment an integer faster if 'step' is a power of 2?
« Reply #12 on: July 12, 2021, 09:57:44 pm »

Of course. One obvious point here - the OP's question sort of assumes that regular integer addition could be more expensive than some other instruction on a given target. This is very uncommon. I don't think I have ever seen such a target myself, or it's so old that I don't remember.


Some designs (like ARM) have an 'add immediate' instruction where the increment is encoded with the instruction for small values on increment (0..4095).  Otherwise, the operand is required to be in a register and this may require an indexed fetch from a pool area of constants.  The value 'may' be in a register but I don't think you can assume it is unless you play with assembly coding.

https://developer.arm.com/documentation/dui0802/a/A64-General-Instructions/ADD--immediate-

'Premature optimization is the root of all evil!'  - Sir Tony Hoare

https://ubiquity.acm.org/article.cfm?id=1513451
 

Offline SiliconWizard

  • Super Contributor
  • ***
  • Posts: 15479
  • Country: fr
Re: Can I increment an integer faster if 'step' is a power of 2?
« Reply #13 on: July 12, 2021, 10:23:56 pm »
As to optimizing increments for larger-than-native integers, as I mentioned, a decent compiler will also do it on its own. Again as long as the increment is known statically (thus is a constant).

As an example with GCC and RV32:
Code: [Select]
#include <stdint.h>

uint64_t Inc1(uint64_t n)
{
return n + 1;
}

uint64_t Inc2(uint64_t n)
{
return n + 0x100000000;    // 2^32
}

yields: (parameter in {a1, a0}; return value in {a1, a0})

Code: [Select]
Inc1:
mv a5,a0
addi a0,a0,1
sltu a5,a0,a5
add a1,a5,a1
ret

Inc2:
addi a1,a1,1
ret
« Last Edit: July 12, 2021, 10:26:37 pm by SiliconWizard »
 

Offline TheCalligrapher

  • Regular Contributor
  • *
  • Posts: 159
  • Country: us
Re: Can I increment an integer faster if 'step' is a power of 2?
« Reply #14 on: July 12, 2021, 10:25:49 pm »
Of course. One obvious point here - the OP's question sort of assumes that regular integer addition could be more expensive than some other instruction on a given target. This is very uncommon. I don't think I have ever seen such a target myself, or it's so old that I don't remember.

There is not such thing as "regular old addition". Or, more precisely, there are many kinds of "regular old addition" even on fairly archaic platforms. The obvious example is various flavors of "load address" instructions (`lea` in x86 parlance, for example), which are widely used for efficient constant arithmetic, then "immediate constant operand" additions, then "constant in a register" additions and, finally, "constant in memory" additions.

And even on a platform that has no fancy alternatives to a plain-vanilla `add` instruction, it might still end up being as efficient as it gets if the delta is pre-loaded in a register before the cycle (assuming a register can be reserved for the entire duration of the cycle).
 

Offline SiliconWizard

  • Super Contributor
  • ***
  • Posts: 15479
  • Country: fr
Re: Can I increment an integer faster if 'step' is a power of 2?
« Reply #15 on: July 12, 2021, 10:49:08 pm »

Of course. One obvious point here - the OP's question sort of assumes that regular integer addition could be more expensive than some other instruction on a given target. This is very uncommon. I don't think I have ever seen such a target myself, or it's so old that I don't remember.


Some designs (like ARM) have an 'add immediate' instruction where the increment is encoded with the instruction for small values on increment (0..4095).  Otherwise, the operand is required to be in a register and this may require an indexed fetch from a pool area of constants.  The value 'may' be in a register but I don't think you can assume it is unless you play with assembly coding

Many ISAs have "add immediate" instructions. Obviously if the increment can't fit an immediate, you need to load it first in a register. Depending on surrounding code, this register will hold this constant every time the increment is done (in a loop), or it may have to reload it every time (could be also done 'indirectly' by register save and load if you are calling functions in between, or all registers are used up, or interrupts kick in.) Note that in cases it has to load a register every time with the constant, that usually means a significant number of instructions have been executed in between, and then one instruction more or less won't make a fricking difference. That's not even micro-optimizing, at this level. It's nano-optimizing? ;) Just a thought.

Of course for "add immediate" instructions to be used, the increment must be known at compile-time.
 

Offline mikerj

  • Super Contributor
  • ***
  • Posts: 3339
  • Country: gb
Re: Can I increment an integer faster if 'step' is a power of 2?
« Reply #16 on: July 13, 2021, 10:04:04 pm »
Integer Increment is basically the same low level logic complexity as add

Thanks, that is what I suspected. This is on a PIC, and it is ages since I used assembly. The loop speed matters, since I want low overhead to be sure that N fixed delays add up to what I intended  (N * fixed delay). I have to take the ADC sample at the correct time relative to initiating the readings.

On the lower end PICs at least, decrementing the loop counter can shave a few cycles because the decfsz instruction is pretty much all you need.
 
The following users thanked this post: 741


Share me

Digg  Facebook  SlashDot  Delicious  Technorati  Twitter  Google  Yahoo
Smf