Author Topic: [C] Ow, pointers are making my brain hurt...  (Read 7321 times)


Online SiliconWizard

  • Super Contributor
  • ***
  • Posts: 15439
  • Country: fr
Re: [C] Ow, pointers are making my brain hurt...
« Reply #25 on: November 13, 2021, 09:41:41 pm »
Quote
A better way can be to just pass the pointer, and pass back the updated pointer:
...
This produces much less machine code:

But it's doing less work - something else, which isn't accounted for here, has to update the pointer to achieve the same effect as the first version.

No, it does the exact same thing. Just more elegantly.

The "less work" you're talking about here is just that you need to assign the return value back to the pointer variable. That's a lot of work for sure.
Like: you need to write: "p = foo(p)" instead of "foo(&p)". Okay.

And the added value is that it's more elegant - no double pointer - and most of all, it doesn't exhibit side-effects, which are a major source of bugs. It's close to a functional style, as I said.
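
For illustration, the two call styles side by side might look like the sketch below. The names `advance_inplace`, `advance_value` and `styles_agree` are made up for this example and do not come from the earlier posts.

```c
#include <stdint.h>

/* Double-pointer style: the function updates the caller's pointer
   as a side effect. Called as advance_inplace(&p). */
void advance_inplace(const uint8_t **p)
{
    (*p)++;
}

/* Value style: the function takes the pointer and returns the updated
   one. The caller's variable changes only when it is assigned:
   p = advance_value(p). */
const uint8_t *advance_value(const uint8_t *p)
{
    return p + 1;
}

/* Hypothetical check: returns 1 if both styles leave the caller's
   pointers at the same element. */
int styles_agree(void)
{
    static const uint8_t buf[4] = { 10, 20, 30, 40 };
    const uint8_t *a = buf;
    const uint8_t *b = buf;

    advance_inplace(&a);      /* foo(&p)    */
    b = advance_value(b);     /* p = foo(p) */

    return a == b && *a == 20;
}
```

The end state in the caller is the same either way; the difference is only in whether the update happens inside the callee or in the caller's assignment.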
 
The following users thanked this post: Siwastaja

Online NorthGuy

  • Super Contributor
  • ***
  • Posts: 3249
  • Country: ca
Re: [C] Ow, pointers are making my brain hurt...
« Reply #26 on: November 13, 2021, 09:54:08 pm »
The same function in x86_64 assembly:
Code: [Select]
foo:
        movdqu  (%rdx), %xmm0
        movq    %xmm0, %rdx
        movhlps %xmm0, %xmm1
        paddq   .LC0(%rip), %xmm0
        movq    %rcx, %rax
        movzbl  (%rdx), %ecx
        movq    %xmm1, %rdx
        movups  %xmm0, (%rax)
        movb    %cl, (%rdx)
        ret

Yeah. OK.

Of course x64 ABI doesn't pass the struct in registers, nor does it return the struct in registers. Thus the ABI feature you're catering to doesn't exist. Hence, the "optimization" produces considerable bloat instead.
 

Offline PlainName

  • Super Contributor
  • ***
  • Posts: 7314
  • Country: va
Re: [C] Ow, pointers are making my brain hurt...
« Reply #27 on: November 13, 2021, 10:03:48 pm »
Quote
No, it does the exact same thing. Just more elegantly.

It doesn't. It does part of it. More elegantly perhaps, but there is that little bit missing.

Quote
The "less work" you're talking about here is just that you need to assign the return value back to the pointer variable.

Exactly.  And this bit isn't about elegance but about how many instructions (ignoring 'of what kind'). If you're going to the bother of profiling it (which I don't see why - elegance should win over size unless there is some overriding reason) then you need to profile the same thing. Thus the calling statement and return action (if any) should be in the mix. Otherwise you're just saying this pear is smaller than that orange.
 

Online SiliconWizard

  • Super Contributor
  • ***
  • Posts: 15439
  • Country: fr
Re: [C] Ow, pointers are making my brain hurt...
« Reply #28 on: November 13, 2021, 10:23:12 pm »
Quote
No, it does the exact same thing. Just more elegantly.

It doesn't. It does part of it. More elegantly perhaps, but there is that little bit missing.

Quote
The "less work" you're talking about here is just that you need to assign the return value back to the pointer variable.

Exactly.  And this bit isn't about elegance but about how many instructions (ignoring 'of what kind'). If you're going to the bother of profiling it (which I don't see why - elegance should win over size unless there is some overriding reason) then you need to profile the same thing. Thus the calling statement and return action (if any) should be in the mix. Otherwise you're just saying this pear is smaller than that orange.

We showed that it was usually more efficient in the general case (while being equivalent if the functions are inlined.) But feel free to include both approaches in real context, get the assembly and see for yourself. Maybe we're wrong.
« Last Edit: November 13, 2021, 10:55:19 pm by SiliconWizard »
 

Online SiliconWizard

  • Super Contributor
  • ***
  • Posts: 15439
  • Country: fr
Re: [C] Ow, pointers are making my brain hurt...
« Reply #29 on: November 13, 2021, 10:54:44 pm »
The same function in x86_64 assembly:
Code: [Select]
foo:
        movdqu  (%rdx), %xmm0
        movq    %xmm0, %rdx
        movhlps %xmm0, %xmm1
        paddq   .LC0(%rip), %xmm0
        movq    %rcx, %rax
        movzbl  (%rdx), %ecx
        movq    %xmm1, %rdx
        movups  %xmm0, (%rax)
        movb    %cl, (%rdx)
        ret

Yeah. OK.

Of course x64 ABI doesn't pass the struct in registers, nor does it return the struct in registers. Thus the ABI feature you're catering to doesn't exist. Hence, the "optimization" produces considerable bloat instead.

That's interesting. From what I've seen, the ABI actually allows using up to 6 registers for parameters - if they fit - and two for the return value (rax and rdx).
I didn't find any "hard" rule in the ABI that would prevent a compiler from passing/returning structs in several registers if they fit, or integer parameters wider than one register, for that matter.
I've found quite a few threads about this on StackOverflow and others. Still unsure where the "truth" should lie here.

What I've read:
Code: [Select]
The first six arguments to a function are passed in registers. Any additional arguments are passed
on the stack in the memory-argument area (see Figure 2). The %rax register is used to return the
first result and the %rdx register is used to return a second result.

Sure, you could read this strictly as meaning that each parameter of a function can only be passed in ONE register if it fits, or on the stack otherwise. I'm not sure I see the rationale for forbidding splitting one argument across several registers, or returning values in two registers, since the ABI allows it. I've seen some heated arguments about that, so no need to reproduce them here. Just mentioning it.

But anyway, the point was looking at how this approach would be compiled for different targets with the usual compilers, and the best for this appeared to be RISC-V and Aarch64.
 

Online brucehoult

  • Super Contributor
  • ***
  • Posts: 4538
  • Country: nz
Re: [C] Ow, pointers are making my brain hurt...
« Reply #30 on: November 13, 2021, 10:59:58 pm »
The point was the example assembler for it is doing less because it's not updating the pointer. It may be passing it back, but that does nothing until the caller uses it. To be equivalent, you'd need to show the assembler for the calling function doing the update.

I'm surprised that wasn't clear so perhaps I've missed something?

OK, fine. I'll show the caller too.

The technique of passing the modified pointer(s) back as a function result makes the CALLER simpler as well.

Let's use SiliconWizard's example that copies a byte from one buffer to another to build a memcpy https://godbolt.org/z/ojjzYzT7G :

Code: [Select]
#include <inttypes.h>

typedef struct { uint8_t *out; uint8_t *in; } InOutPtrs_t;

__attribute__((noinline))
InOutPtrs_t copyAchar(InOutPtrs_t InOut)
{
    *InOut.out++ = *InOut.in++;
    return InOut;
}

void mymemcpy(uint8_t *dst, uint8_t *src, long sz){
    InOutPtrs_t InOut = {.in=src, .out=dst};
    while (sz--) InOut = copyAchar(InOut);
}

The RV32 assembly language:

Code: [Select]
copyAchar:                              # @copyAchar
        lb      a3, 0(a1)
        addi    a1, a1, 1
        addi    a2, a0, 1
        sb      a3, 0(a0)
        mv      a0, a2
        ret
mymemcpy:                               # @mymemcpy
        addi    sp, sp, -16
        sw      ra, 12(sp)                      # 4-byte Folded Spill
        sw      s0, 8(sp)                       # 4-byte Folded Spill
        beqz    a2, .LBB1_3
        mv      s0, a2
.LBB1_2:                                # =>This Inner Loop Header: Depth=1
        addi    s0, s0, -1
        call    copyAchar
        bnez    s0, .LBB1_2
.LBB1_3:
        lw      s0, 8(sp)                       # 4-byte Folded Reload
        lw      ra, 12(sp)                      # 4-byte Folded Reload
        addi    sp, sp, 16
        ret

Notice that the loop in mymemcpy() has only three instructions (update the counter, call the function, loop if not zero) and no memory instructions at all. And the copyAchar() function has six instructions and only the necessary two memory instructions to actually load and store the byte being copied.

Now let's try it by passing the addresses of the in and out pointers, as in the OP's code https://godbolt.org/z/W3v5WPWd3 :

Code: [Select]
#include <inttypes.h>

__attribute__((noinline))
void copyAchar(uint8_t **out, uint8_t **in) {
    *(*out)++ = *(*in)++;
}

void mymemcpy(uint8_t *dst, uint8_t *src, long sz){
    while (sz--) copyAchar(&dst, &src);
}

And the generated assembly language...

Code: [Select]
copyAchar:                              # @copyAchar
        lw      a2, 0(a1)
        addi    a3, a2, 1
        sw      a3, 0(a1)
        lw      a1, 0(a0)
        lb      a2, 0(a2)
        addi    a3, a1, 1
        sw      a3, 0(a0)
        sb      a2, 0(a1)
        ret
mymemcpy:                               # @mymemcpy
        addi    sp, sp, -16
        sw      ra, 12(sp)                      # 4-byte Folded Spill
        sw      s0, 8(sp)                       # 4-byte Folded Spill
        sw      a0, 4(sp)
        sw      a1, 0(sp)
        beqz    a2, .LBB1_3
        mv      s0, a2
.LBB1_2:                                # =>This Inner Loop Header: Depth=1
        addi    s0, s0, -1
        addi    a0, sp, 4
        mv      a1, sp
        call    copyAchar
        bnez    s0, .LBB1_2
.LBB1_3:
        lw      s0, 8(sp)                       # 4-byte Folded Reload
        lw      ra, 12(sp)                      # 4-byte Folded Reload
        addi    sp, sp, 16
        ret

Now the calling code has to first store the pointer arguments a0 and a1 into memory at SP+4 and SP, and the loop needs five instructions instead of three because it has to regenerate the function arguments each time.  And the called code now needs two extra memory loads and two extra memory stores.

Passing the pointers by value in a struct, and returning their updated values in a struct, improves the code in BOTH the called and calling functions.  We go from 16 to 12 instructions in the caller (from 5 to 3 in the loop) and from 9 to 6 (could have been 5) instructions in the called function. And most importantly on modern machines, from 3 loads and 3 stores per byte copied to 1 load and 1 store.
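
As a sanity check that the two variants really do the same job once the caller is included, something along these lines could be used. The `_val`/`_ptr` suffixes and the `copies_match` helper are names invented here so that both versions can live in one file; the function bodies are the ones shown above.

```c
#include <stdint.h>
#include <string.h>

typedef struct { uint8_t *out; uint8_t *in; } InOutPtrs_t;

/* Value style: take both pointers, return both updated. */
InOutPtrs_t copyAchar_val(InOutPtrs_t io)
{
    *io.out++ = *io.in++;
    return io;
}

/* Double-pointer style, as in the OP's code. */
void copyAchar_ptr(uint8_t **out, uint8_t **in)
{
    *(*out)++ = *(*in)++;
}

void mymemcpy_val(uint8_t *dst, uint8_t *src, long sz)
{
    InOutPtrs_t io = { .out = dst, .in = src };
    while (sz--) io = copyAchar_val(io);
}

void mymemcpy_ptr(uint8_t *dst, uint8_t *src, long sz)
{
    while (sz--) copyAchar_ptr(&dst, &src);
}

/* Returns 1 if both variants fill the destination with the source bytes. */
int copies_match(void)
{
    uint8_t src[8] = { 1, 2, 3, 4, 5, 6, 7, 8 };
    uint8_t d1[8] = { 0 };
    uint8_t d2[8] = { 0 };

    mymemcpy_val(d1, src, 8);
    mymemcpy_ptr(d2, src, 8);

    return memcmp(d1, src, 8) == 0 && memcmp(d2, src, 8) == 0;
}
```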
 
The following users thanked this post: SiliconWizard

Online Nominal Animal

  • Super Contributor
  • ***
  • Posts: 6967
  • Country: fi
    • My home page and email address
Re: [C] Ow, pointers are making my brain hurt...
« Reply #31 on: November 13, 2021, 11:05:20 pm »
Of course x64 ABI doesn't pass the struct in registers, nor does it return the struct in registers. Thus the ABI feature you're catering to doesn't exist.
Well, actually both SysV AMD64 ABI (as used on Linux) and ARM ABI support returning a structure with two register-sized elements.

That is,
Code: [Select]
struct longpair {
    long  a, b;
};

struct longpair squarepair(const struct longpair p)
{
    const struct longpair r = { p.a * p.a, p.b * p.b };
    return r;
}
which compiles using GCC -O2 on linux x86-64 to
Code: [Select]
squarepair:
        mov     rax, rdi
        mov     rdx, rsi
        imul    rax, rdi
        imul    rdx, rsi
        ret
and armv7-a clang10 to
Code: [Select]
squarepair:
        mul     r3, r2, r2
        mul     r2, r1, r1
        stm     r0, {r2, r3}
        bx      lr

What the ABI does not allow is for the compiler to automagically return a pointer instead of modifying it via the pointer passed as a parameter; you need to use the structure (and enable compiler optimizations, so it doesn't play stupid).
 

Online brucehoult

  • Super Contributor
  • ***
  • Posts: 4538
  • Country: nz
Re: [C] Ow, pointers are making my brain hurt...
« Reply #32 on: November 13, 2021, 11:28:52 pm »
So what we see there is 32 bit ARM storing the results into a struct in memory, pointed to by an invisible extra argument passed in r0.

RISC-V (both 32 bit and 64 bit) and 64 bit ARM do it all in registers:

Code: [Select]
squarepair:                             # @squarepair
        mul     a0, a0, a0
        mul     a1, a1, a1
        ret

Code: [Select]
squarepair:                             // @squarepair
        mul     x0, x0, x0
        mul     x1, x1, x1
        ret

You can only tell which is which by knowing what register names they use :-)
 

Offline PlainName

  • Super Contributor
  • ***
  • Posts: 7314
  • Country: va
Re: [C] Ow, pointers are making my brain hurt...
« Reply #33 on: November 13, 2021, 11:35:31 pm »
Quote
get the assembly and see for yourself. Maybe we're wrong.

You are misrepresenting what I was saying. Nowhere did I say it wouldn't be shorter, smaller, faster, cheaper, more beautiful, higher, or whatever irrelevant measurement you're balancing on a pin head. I don't care, don't know and have no desire to find out any of those.

My SOLE point was that the two examples were not equivalent. One does call-read-add-update-return and the other does call-read-add-return. See? A little bit missing there. That was my sole point and I am rather disappointed that none of you have grasped it but just bang on about how small the damn thing is when that doesn't matter a toss except for the angels on a pin arguments.

Edit:
Quote
OK, fine. I'll show the caller too.

Thank you. Although I really don't care how it turns out, I think it's important to be accurate especially when you're going to call people out for pretty much the same thing elsewhere.
« Last Edit: November 13, 2021, 11:37:33 pm by dunkemhigh »
 

Online Nominal Animal

  • Super Contributor
  • ***
  • Posts: 6967
  • Country: fi
    • My home page and email address
Re: [C] Ow, pointers are making my brain hurt...
« Reply #34 on: November 13, 2021, 11:41:55 pm »
Crap, me fail once again. ARMv8a does compile to
Code: [Select]
squarepair:                             // @squarepair
        mul     x0, x0, x0
        mul     x1, x1, x1
        ret
and RISC-V rv32gc and rv64gc using Clang to
Code: [Select]
squarepair:                             # @squarepair
        mul     a0, a0, a0
        mul     a1, a1, a1
        ret
as expected.

Now, where can I get armeabi-v7a procedure call spec?  Ah yes, IHI 0042J.  Okay,
Code: [Select]
typedef  long  pair __attribute__((vector_size (2 * sizeof (long))));

pair squarepair(const pair p)
{
    return (pair){ p[0]*p[0], p[1]*p[1] };
}
gets passed in r0-r1 on armv7-a, and a0-a1 on rv32gc and rv64gc.

To make that useful, one would need to hide the internal details of the pair in macros, since on x86-64 it would be passed in xmm0 and on armv8-a in v0, in a rather inefficient manner; there a structure would work better.

I am not sure when I would bother, though.  Probably never; it's not like temporary memory access is that costly (compared to the code complexity and ABI dependency generated).
« Last Edit: November 14, 2021, 12:26:42 am by Nominal Animal »
 

Online brucehoult

  • Super Contributor
  • ***
  • Posts: 4538
  • Country: nz
Re: [C] Ow, pointers are making my brain hurt...
« Reply #35 on: November 13, 2021, 11:56:27 pm »
Now, where can I get armeabi-v7a procedure call spec?

Funnily enough, Googling your words returns https://developer.arm.com/documentation/ihi0042/latest

Composite types can be passed in registers (or partially registers, partially stack if the 4 argument registers are exhausted)

A scalar type larger than a register (e.g. long long, or double) is returned in r0 and r1

Composite types larger than a register are returned in memory, with the address of the memory passed in r0. A bit of a shame.

I'd love to see ABIs where composite return values could use all the same registers as arguments can. Why not? Their values are currently undefined after return from a function, so there is nothing to be lost.
 

Online Nominal Animal

  • Super Contributor
  • ***
  • Posts: 6967
  • Country: fi
    • My home page and email address
Re: [C] Ow, pointers are making my brain hurt...
« Reply #36 on: November 14, 2021, 12:36:47 am »
(Sorry, didn't see your answer as I was editing mine.  Did some godbolt.org testing.)

I'd love to see ABIs where composite return values could use all the same registers as arguments can. Why not? Their values are currently undefined after return from a function, so there is nothing to be lost.
Me too.  I really don't like the C errno convention; it would be much nicer to just return more than one scalar, so one of them could be the error code (and 0 for no error).

There are quite a few cases like reading from a file or device, where returning some data and an error would make the API so much more useful.

Which ties in to this thread nicely: when returning both a numeric value and a pointer, it actually makes sense to pass the pointer as a parameter, along with a pointer to the numeric value to be updated, and return the updated pointer.  This is because the code then only does one level of pointer dereferencing.

For example, when parsing (command-line) parameters to a numeric type NUMTYPE, I use either
    int  parse_NUMTYPE(const char *from, NUMTYPE *to);
returning 0 if success, nonzero errno error code otherwise, if the from string should be a complete number without any trailing garbage, or
    const char *parse_NUMTYPE(const char *from, NUMTYPE *to);
returning a pointer to the first unparsed character, or NULL if there wasn't a number to be parsed.

This might look odd at first, but both the parsing (I usually use strtod()/strtol()/strtoul()/strtoll()/strtoull()) and calling the parsing function are simpler and more robust (maintenance-wise, simpler code) than passing a double pointer.
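
A minimal sketch of both signatures for NUMTYPE = long, built on strtol(), could look like this. The names `parse_long` and `parse_long_prefix` are hypothetical, base 10 is an arbitrary choice, and EINVAL is assumed to be available (it is POSIX rather than ISO C).

```c
#include <errno.h>
#include <stdlib.h>

/* Whole-string variant: returns 0 on success, a nonzero errno code
   otherwise. Any trailing character after the number is an error. */
int parse_long(const char *from, long *to)
{
    char *end;
    long  val;

    if (!from || !to)
        return EINVAL;

    errno = 0;
    val = strtol(from, &end, 10);
    if (errno)
        return errno;          /* typically ERANGE */
    if (end == from)
        return EINVAL;         /* nothing was parsed */
    if (*end != '\0')
        return EINVAL;         /* trailing garbage */

    *to = val;
    return 0;
}

/* Prefix variant: returns a pointer to the first unparsed character,
   or NULL if there was no number to parse. */
const char *parse_long_prefix(const char *from, long *to)
{
    char *end;
    long  val;

    if (!from || !to)
        return NULL;

    errno = 0;
    val = strtol(from, &end, 10);
    if (errno || end == from)
        return NULL;

    *to = val;
    return end;
}
```

In both cases the callee only dereferences `to` once, and the caller gets either the error code or the continuation pointer directly as the return value.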
 

Online SiliconWizard

  • Super Contributor
  • ***
  • Posts: 15439
  • Country: fr
Re: [C] Ow, pointers are making my brain hurt...
« Reply #37 on: November 14, 2021, 12:41:25 am »
Of course x64 ABI doesn't pass the struct in registers, nor does it return the struct in registers. Thus the ABI feature you're catering to doesn't exist.
Well, actually both SysV AMD64 ABI (as used on Linux) and ARM ABI support returning a structure with two register-sized elements.

Ah, thanks for pointing this out! The example I gave for x86_64 was compiled on Windows. Just did the same on Linux (also x86_64) and I get this instead:

Code: [Select]
        movzbl  (%rdi), %eax
        leaq    1(%rsi), %rdx
        movb    %al, (%rsi)
        leaq    1(%rdi), %rax
        ret

which is much nicer, and closer to what I was expecting.
So lesson learned about the x86_64 ABI on Windows - at least the way it's implemented with both GCC and LLVM.
 

Online SiliconWizard

  • Super Contributor
  • ***
  • Posts: 15439
  • Country: fr
Re: [C] Ow, pointers are making my brain hurt...
« Reply #38 on: November 14, 2021, 01:14:52 am »
(Sorry, didn't see your answer as I was editing mine.  Did some godbolt.org testing.)

I'd love to see ABIs where composite return values could use all the same registers as arguments can. Why not? Their values are currently undefined after return from a function, so there is nothing to be lost.
Me too.  I really don't like the C errno convention; it would be much nicer to just return more than one scalar, so one of them could be the error code (and 0 for no error).

Oh, I agree. I always do that in my own code.
As we saw, there are a few ways of doing it in C: the most common is to return an error code, and return whatever else the function should return via pointers.
Another, as I suggested here too, is to use structs. Although results vary a bit depending on the target, if you restrict your returned structs to two values which typically fit in registers, it's as efficient, if not more, and more elegant. Drawback is you have to define a struct for each kind of type you want to return beside the error code. Not as elegant as in languages that actively support returning multiple values as tuples, or with some kind of monad mechanism, but it also works.

If you want to return say a double along with an error code:
Code: [Select]
#include <math.h>

enum { NOERROR = 0, INVALID_PARAM = 1 };

typedef struct { int err; double value; } RetDouble_t;

RetDouble_t MySqrt(double x)
{
    RetDouble_t  Ret = { .err = NOERROR };

    if (x < 0.0)
        Ret.err = INVALID_PARAM;
    else
        Ret.value = sqrt(x);

    return Ret;
}
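
In the same spirit as MySqrt(), here is a sketch of the caller side, using an integer division so that it does not depend on the math library. `RetInt_t`, `SafeDiv` and `DivOrDefault` are hypothetical names invented for this example.

```c
#include <limits.h>

enum { NOERROR = 0, INVALID_PARAM = 1 };

typedef struct { int err; int value; } RetInt_t;

/* Hypothetical helper: integer division that reports bad input through
   the returned struct instead of errno. */
RetInt_t SafeDiv(int num, int den)
{
    RetInt_t Ret = { .err = NOERROR };   /* .value is zero-initialized */

    if (den == 0 || (num == INT_MIN && den == -1))
        Ret.err = INVALID_PARAM;
    else
        Ret.value = num / den;

    return Ret;
}

/* Caller side: check .err before using .value. */
int DivOrDefault(int num, int den, int fallback)
{
    RetInt_t r = SafeDiv(num, den);
    return (r.err == NOERROR) ? r.value : fallback;
}
```

Both members are plain ints here, so the struct stays within the "two values that fit in registers" limit suggested above.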
 

Online NorthGuy

  • Super Contributor
  • ***
  • Posts: 3249
  • Country: ca
Re: [C] Ow, pointers are making my brain hurt...
« Reply #39 on: November 14, 2021, 04:10:50 am »
Of course x64 ABI doesn't pass the struct in registers, nor does it return the struct in registers. Thus the ABI feature you're catering to doesn't exist.
Well, actually both SysV AMD64 ABI (as used on Linux) and ARM ABI support returning a structure with two register-sized elements.

Windows x64 ABI is different - different registers are used, different rules.
 

Online Nominal Animal

  • Super Contributor
  • ***
  • Posts: 6967
  • Country: fi
    • My home page and email address
Re: [C] Ow, pointers are making my brain hurt...
« Reply #40 on: November 14, 2021, 08:28:10 am »
Of course x64 ABI doesn't pass the struct in registers, nor does it return the struct in registers. Thus the ABI feature you're catering to doesn't exist.
Well, actually both SysV AMD64 ABI (as used on Linux) and ARM ABI support returning a structure with two register-sized elements.

Windows x64 ABI is different - different registers are used, different rules.
True.  All other OSes on x86-64 (actually, AMD64) use the SysV ABI.
 

Offline PlainName

  • Super Contributor
  • ***
  • Posts: 7314
  • Country: va
Re: [C] Ow, pointers are making my brain hurt...
« Reply #41 on: November 14, 2021, 11:17:44 am »
Quote
If you want to return say a double along with an error code:
Code: [Select]

#include <math.h>

enum { NOERROR = 0, INVALID_PARAM = 1 };

typedef struct { int err; double value; } RetDouble_t;

RetDouble_t MySqrt(double x)
{
    RetDouble_t  Ret = { .err = NOERROR };

    if (x < 0.0)
        Ret.err = INVALID_PARAM;
    else
        Ret.value = sqrt(x);

    return Ret;
}

Isn't that going to cause memory issues when the returned struct, which no longer exists, is sampled? Getting around that would be really messy, I think.
 

Online Nominal Animal

  • Super Contributor
  • ***
  • Posts: 6967
  • Country: fi
    • My home page and email address
Re: [C] Ow, pointers are making my brain hurt...
« Reply #42 on: November 14, 2021, 11:32:40 am »
Isn't that going to cause memory issues when the returned struct, which no longer exists, is sampled? Getting around that would be really messy, I think.
No.

Consider this:
Code: [Select]
typedef struct {
    int  x;
    int  y;
    int  z;
} vec3i;

vec3i Vec3i(const int x, const int y, const int z)
{
    const vec3i result = { .x = x, .y = y, .z = z };
    return result;
}
(Ignore that the code is silly, since C99 and later allow (vec3i){ .x = x, .y = y, .z = z }.  This is just for illustration, a three integer component vector type.)

On architectures and ABIs where the structure is not passed in registers (most of them), the caller reserves memory for the structure, and passes a pointer to it to the function.  (Exactly how, i.e. in which register or where on the stack, depends on the ABI.)

In all cases, the actual structure returned therefore has the caller scope, not the called function scope.  So using the structure in the caller is perfectly okay. For example, in
Code: [Select]
int  vec3i_dot(const int x, const int y, const int z)
{
    vec3i  v = Vec3i(x, y, z);
    return v.x*v.x + v.y*v.y + v.z*v.z;
}
either v is returned in registers, or the compiler passes a pointer to v when calling Vec3i(), depending on the ABI.  In either case its lifetime is the caller scope, and the above is completely safe.

(If you were to pore through the abstract machine model of the C standard, you'd find that regardless of how the structure is passed to the caller, its lifetime must be the caller scope, and not the function scope.  So it's completely different than returning a pointer to a local (non-static) variable, which is a bug.)
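
To make the hidden pointer concrete, here is a sketch that writes out by hand roughly what a memory-returning ABI does for you. `Vec3i_into` and `both_ways_agree` are hypothetical names; the real mechanism is generated by the compiler and differs per ABI.

```c
typedef struct { int x; int y; int z; } vec3i;

/* Return by value: the caller receives its own copy. */
vec3i Vec3i(int x, int y, int z)
{
    vec3i result = { .x = x, .y = y, .z = z };
    return result;
}

/* The explicit equivalent: the caller owns the storage and passes
   its address, and the callee fills it in. */
void Vec3i_into(vec3i *result, int x, int y, int z)
{
    result->x = x;
    result->y = y;
    result->z = z;
}

/* Builds the same vector both ways; returns 1 if they agree. */
int both_ways_agree(int x, int y, int z)
{
    vec3i a = Vec3i(x, y, z);   /* a lives in this function's scope */
    vec3i b;
    Vec3i_into(&b, x, y, z);    /* so does b */

    return a.x == b.x && a.y == b.y && a.z == b.z;
}

/* By contrast, this would be the bug (do not do this):
 *
 *     vec3i *Vec3i_bad(int x, int y, int z)
 *     {
 *         vec3i result = { x, y, z };
 *         return &result;      // storage is gone after return
 *     }
 */
```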
« Last Edit: November 14, 2021, 11:35:01 am by Nominal Animal »
 

Online brucehoult

  • Super Contributor
  • ***
  • Posts: 4538
  • Country: nz
Re: [C] Ow, pointers are making my brain hurt...
« Reply #43 on: November 14, 2021, 11:55:15 am »
Isn't that going to cause memory issues when the returned struct, which no longer exists, is sampled? Getting around that would be really messy, I think.

No.

C function arguments and results are passed by value. That is, (logically) they are copied in the process of passing them.
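
A tiny sketch of that copy semantics, using a made-up `Counter` type (none of these names appear earlier in the thread):

```c
typedef struct { int count; } Counter;

/* Receives a copy of the argument; the caller's struct is untouched,
   and the result is handed back as another copy. */
Counter bump_copy(Counter c)
{
    c.count++;
    return c;
}

/* Receives the address instead; the caller's struct is modified. */
void bump_inplace(Counter *c)
{
    c->count++;
}
```

After `Counter b = bump_copy(a);` the original `a` is unchanged and `b` is an independent object in the caller's scope, so nothing dangles.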
 

Offline PlainName

  • Super Contributor
  • ***
  • Posts: 7314
  • Country: va
Re: [C] Ow, pointers are making my brain hurt...
« Reply #44 on: November 14, 2021, 01:39:55 pm »
Quote
On architectures and ABIs where the structure is not passed in registers (most of them), the caller reserves memory for the structure, and passes a pointer to it to the function.  (Exactly how, i.e. in which register or where on the stack, depends on the ABI.)

Ah! Thank you :)
 

Offline DiTBho

  • Super Contributor
  • ***
  • Posts: 4247
  • Country: gb
Re: [C] Ow, pointers are making my brain hurt...
« Reply #45 on: November 14, 2021, 01:54:27 pm »
as reference, three different approaches
  • 6800 (only one CPU-register, therefore the ram is massively used)
  • 68k (eight registers for data, eight registers for address, compromise between stack and registers)
  • RISCV (31 registers, if the structure is small enough, it's passed via registers)
The opposite of courage is not cowardice, it is conformity. Even a dead fish can go with the flow
 

Online SiliconWizard

  • Super Contributor
  • ***
  • Posts: 15439
  • Country: fr
Re: [C] Ow, pointers are making my brain hurt...
« Reply #46 on: November 14, 2021, 05:57:19 pm »
Isn't that going to cause memory issues when the returned struct, which no longer exists, is sampled? Getting around that would be really messy, I think.

No.

C function arguments and results are passed by value. That is, (logically) they are copied in the process of passing them.

And, a third no. :)
 

Online SiliconWizard

  • Super Contributor
  • ***
  • Posts: 15439
  • Country: fr
Re: [C] Ow, pointers are making my brain hurt...
« Reply #47 on: November 14, 2021, 06:02:47 pm »
Of course x64 ABI doesn't pass the struct in registers, nor does it return the struct in registers. Thus the ABI feature you're catering to doesn't exist.
Well, actually both SysV AMD64 ABI (as used on Linux) and ARM ABI support returning a structure with two register-sized elements.

Windows x64 ABI is different - different registers are used, different rules.
True.  All other OSes on x86-64 (actually, AMD64) use the SysV ABI.

Yep. As we can see, the difference in this example is absolutely mind-boggling. I'd be curious to see actual analysis and "benchmarks" that can evaluate the impact of the ABI in a range of typical applications. If someone happens to have a reference on that. Because as it looks, the Windows ABI seems pretty horrible compared to SysV, but it'd be interesting to see a well conducted analysis on this rather than impressions based on a few examples.
 

Online Nominal Animal

  • Super Contributor
  • ***
  • Posts: 6967
  • Country: fi
    • My home page and email address
Re: [C] Ow, pointers are making my brain hurt...
« Reply #48 on: November 14, 2021, 07:15:09 pm »
I'd be curious to see actual analysis and "benchmarks" that can evaluate the impact of the ABI in a range of typical applications. If someone happens to have a reference on that. Because as it looks, the Windows ABI seems pretty horrible compared to SysV, but it'd be interesting to see a well conducted analysis on this rather than impressions based on a few examples.
I think the hardest part of that would be choosing the representative set of function signatures to compare.

SysV on AMD64/x86-64 passes up to six 64-bit integer or pointer parameters (or up to three aggregates of at most 128 bits each) in general-purpose registers, and up to eight 128-bit xmm vectors in xmm registers (a single double counts as the first component of a 128-bit xmm vector).
Windows x64 passes up to four 64-bit aggregate arguments in standard registers, and up to four floating-point xmm vectors in xmm registers.

No choice of function signatures would be "fair", because each ABI suggests/implies/assumes different approaches to function signatures.  For example, as a Windows x64 programmer, I would prefer passing references (pointers) to structures over 64 bits instead of passing the structures; but on SysV, the limit is 128 bits.  Structures with a 64-bit pointer and a 64-bit size, or a 64-bit pointer and two 32-bit integers, are quite common.  Should one pass those by value or by reference (via a pointer)?
Programmers comfy on one of the ABIs will have pretty strong preferences about stuff like this, because of their experience (which is obviously colored by the performance and quirks of the ABI they use so much).  And because of cache effects, register allocation, and so on, even tiny differences in the function call and return can cause significant performance differences in the surrounding code.
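
As a concrete version of that question, here is a sketch of the two signatures for a pointer-plus-size structure. `span_t`, `sum_by_value` and `sum_by_ref` are hypothetical names; which form generates better code depends on the ABI and compiler, as discussed above, so this only shows the shapes being compared.

```c
#include <stddef.h>

/* A pointer-plus-length aggregate: two register-sized members. */
typedef struct { const unsigned char *data; size_t len; } span_t;

/* By value: small enough for two registers under the SysV limit,
   but over the 64-bit limit mentioned for Windows x64. */
unsigned long sum_by_value(span_t s)
{
    unsigned long total = 0;
    for (size_t i = 0; i < s.len; i++)
        total += s.data[i];
    return total;
}

/* By reference: one pointer argument on every ABI, at the cost of an
   extra level of indirection inside the callee. */
unsigned long sum_by_ref(const span_t *s)
{
    unsigned long total = 0;
    for (size_t i = 0; i < s->len; i++)
        total += s->data[i];
    return total;
}
```

Compiling both with the same compiler and flags for each target, then comparing the generated assembly, is the approach-on-ABI-and-compiler comparison rather than ABI-versus-ABI.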

It would be more fair to examine the code generated and efficiency of different function signature approaches for each ABI and compiler family and version.
That way you wouldn't be trying to compare ABI-versus-ABI, but approach-on-ABI-and-compiler.  If you microbenchmark the approaches on the two different ABIs on the same hardware, you can perhaps make some kind of qualitative statements about the relative efficiencies of the ABIs, but I wouldn't bother; the relative merits and downsides of the function signature approaches would be much more useful.
« Last Edit: November 14, 2021, 07:17:36 pm by Nominal Animal »
 

Online SiliconWizard

  • Super Contributor
  • ***
  • Posts: 15439
  • Country: fr
Re: [C] Ow, pointers are making my brain hurt...
« Reply #49 on: November 14, 2021, 07:34:46 pm »
Having to adapt your code style to the underlying ABI is not a fantastic idea IMHO. You might have to resort to that when performance is critical, but I favor portable approaches otherwise. But I certainly do agree, comparisons are difficult to make relevant. Yeah it's all a matter of compromises too. If more registers are used to pass arguments and return values, then sure, fewer will be available for other uses, and thus more register saving will be needed... So sure, on targets with more general-purpose registers, registers will be favored.

Just a quick summary of the differences: https://sourceforge.net/p/mingw-w64/wiki2/MinGW%20x64%20Software%20convention/

No clue how much the exact differences explain it, but I've certainly noticed that some applications (compilers, for instance) are significantly faster on Linux than on Windows on similar hardware. That could also largely be due to differences in how memory is managed, rather than the ABI itself.
 

