Author Topic: Are data type compiler-dependent or target dependent  (Read 8852 times)


Offline DiTBho

  • Super Contributor
  • ***
  • Posts: 4221
  • Country: gb
Re: Are data type compiler-dependent or target dependent
« Reply #25 on: August 04, 2022, 09:01:05 am »
This is why stdint was introduced in C99.


All this is not "insane", it all comes from the fact that the C standard has always had the goal of supporting a very wide range of targets while making it possible for compilers to produce efficient code.

Why do you always have to defend the mistakes of the C compiler?

A byte must be 8 bits to give us a common and universally accepted reference, just as 1000mm is universally accepted as 1m; otherwise you need to introduce yet another constant, which only clutters the C compiler with more crap and leaves people more confused.

To really support a very wide range of targets and aim for true portability (which is a utopia, anyway), char must always equal a byte, sizeof(type) must be measured in multiples of 8 bits, and data types must be expressed by bit width and sign.

What the frog is the meaning of sizeof(char)=1 if char is 16bit???

It's like saying that 1m measures 2000mm if you are near the north pole, because permafrost cannot be measured in multiples of 1000mm, and you need an "adapting constant" to adjust things

1000mm_northpole = 1000mm_measured_everywhereelse * magic_adapting_constant

printf("The number of bits a 'char' has on my system: %zu\n", sizeof(char) * CHAR_BIT);

(taken from the GNU C Library Reference Manual)

Seriously, what the fuck is that crap? Call things with their names, that is the biggest bullshit ever seen in computer science, in fact it has the only benefit of causing nothing but tons of stupid bugs

Therefore c must ban char, short, int, long and long long, and only accept uint8, uint16, uint32, uint64 and their signed versions

That also must be applied to pointers

All other things made with c is pure garbage
« Last Edit: August 04, 2022, 09:05:45 am by DiTBho »
The opposite of courage is not cowardice, it is conformity. Even a dead fish can go with the flow
 

Offline Nominal Animal

  • Super Contributor
  • ***
  • Posts: 6788
  • Country: fi
    • My home page and email address
Re: Are data type compiler-dependent or target dependent
« Reply #26 on: August 04, 2022, 09:54:29 am »
Why do you always have to defend the mistakes of the C compiler?
Hey, SiliconWizard wasn't defending, only explaining why.
Besides, they weren't mistakes back when C was first designed and later standardized by ISO and ANSI.

Remember, there were both PDP and DECsystems with 18-bit/36-bit words back then, and 8-bit chars weren't universally accepted yet.  That is the reason C calls its smallest integer base type char, and not byte.

(And sizeof expression and sizeof (type) return the size of the expression or type in chars, not in 8-bit bytes, with CHAR_BIT bits per char.  So, sizeof (short) * CHAR_BIT does return the effective number of bits in short.)
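To make the sizeof/CHAR_BIT relationship concrete, here is a minimal sketch. The macro BITS_OF and the two helper functions are hypothetical names made up for this illustration; only sizeof, CHAR_BIT, and <limits.h> come from the standard.

```c
#include <limits.h>   /* CHAR_BIT */
#include <stddef.h>   /* size_t */

/* Storage size in bits: sizeof counts in units of char, and CHAR_BIT is the
   number of bits in one char. Any padding bits are included in the result. */
#define BITS_OF(x) (sizeof(x) * CHAR_BIT)

/* Hypothetical helper: storage bits of plain char (always exactly CHAR_BIT). */
static size_t bits_in_char(void)
{
    return BITS_OF(char);
}

/* Hypothetical helper: storage bits of short (at least 16 on any conforming target). */
static size_t bits_in_short(void)
{
    return BITS_OF(short);
}
```

On a typical desktop target bits_in_short() gives 16, but the only things the standard promises are "at least 16" and "a whole multiple of CHAR_BIT".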

It is annoying now, yes, most definitely; but not a mistake.

In Linux, for example, you'll see a preprocessor macro __BITS_PER_LONG used rather extensively.  Because Linux uses either an ILP32 (with __BITS_PER_LONG=32) or LP64 (with __BITS_PER_LONG=64) data model, unsigned long is the optimum base type for bit arrays.  Each limb is a single unsigned long, and contains __BITS_PER_LONG bits.  This is true on all architectures Linux runs on.
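As a rough userspace sketch of the "one unsigned long per limb" idea (this is not the kernel's own bitmap code; LIMB_BITS, LIMBS_FOR, and the bit_* functions are hypothetical names), assuming unsigned long has no padding bits, which holds on the ILP32 and LP64 targets mentioned above:

```c
#include <limits.h>   /* CHAR_BIT */
#include <stddef.h>   /* size_t */

/* Bits per limb; assumes unsigned long has no padding bits. */
#define LIMB_BITS        (sizeof(unsigned long) * CHAR_BIT)
/* Number of limbs needed to hold nbits bits (rounded up). */
#define LIMBS_FOR(nbits) (((nbits) + LIMB_BITS - 1) / LIMB_BITS)

/* Set bit i in the bit array. */
static void bit_set(unsigned long *limbs, size_t i)
{
    limbs[i / LIMB_BITS] |= 1UL << (i % LIMB_BITS);
}

/* Clear bit i in the bit array. */
static void bit_clear(unsigned long *limbs, size_t i)
{
    limbs[i / LIMB_BITS] &= ~(1UL << (i % LIMB_BITS));
}

/* Return 1 if bit i is set, 0 otherwise. */
static int bit_test(const unsigned long *limbs, size_t i)
{
    return (int)((limbs[i / LIMB_BITS] >> (i % LIMB_BITS)) & 1UL);
}
```

A 200-bit array is then just `unsigned long limbs[LIMBS_FOR(200)] = {0};`, which needs 7 limbs with a 32-bit unsigned long and 4 with a 64-bit one.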

In a mixed freestanding C/C++ environment as is typically used when programming microcontrollers, one can do something like
Code: [Select]
#include <stdint.h>
#include <limits.h>

#if ULONG_MAX == UINT32_MAX
// This architecture has 32-bit 'unsigned long'
#elif ULONG_MAX == UINT64_MAX
// This architecture has 64-bit 'unsigned long'
#else
// This architecture has an odd-sized 'unsigned long'; perhaps #error ?
#endif
and similarly with UCHAR_MAX, USHRT_MAX, UINT_MAX, and ULLONG_MAX.  You can even check CHAR_BIT.

My point is that while the C standard allows some really odd things, the fact that most of those odd things are exceedingly rare nowadays means we can use a different set of practical requirements.  This is also why I always say practice trumps theory; that what the C standard says is useful, but not the "law of the land": that which is practical in real life trumps the C (and POSIX C) standards.

But, instead of just making those assumptions silently, we should codify them in a header file explicitly describing and testing for them like above, and then just include that header in our various projects –– make it requirements.h or base-assumptions.h or similar, so it is clear for other human developers too.  (I do like the way GNU C library requires one to define macros like _GNU_SOURCE, _DEFAULT_SOURCE, _POSIX_C_SOURCE, and so on, to expose the interfaces.)

That way, you get exactly what you demand a sane C implementation should nowadays provide (and I don't really disagree), and if someone tries it on a strange architecture, instead of getting odd results and bugs, they'll get a warning/error at compile time that the target is a strange architecture.
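A minimal sketch of what such a header could look like. The file name and the exact set of checks are only an example; pick the assumptions your own code really relies on.

```c
/* base-assumptions.h (hypothetical name): fail at compile time on odd targets */
#ifndef BASE_ASSUMPTIONS_H
#define BASE_ASSUMPTIONS_H

#include <limits.h>
#include <stdint.h>

/* The smallest addressable unit must be an 8-bit byte. */
#if CHAR_BIT != 8
#error "This code assumes CHAR_BIT == 8"
#endif

/* The exact-width types are optional in the standard; their limit macros are
   defined if and only if the types themselves exist. */
#if !defined(UINT8_MAX) || !defined(UINT16_MAX) || \
    !defined(UINT32_MAX) || !defined(UINT64_MAX)
#error "This code assumes exact-width 8, 16, 32 and 64-bit integer types"
#endif

/* Data model: unsigned long must be exactly 32 or 64 bits wide. */
#if ULONG_MAX != UINT32_MAX && ULONG_MAX != UINT64_MAX
#error "This code assumes a 32-bit or 64-bit unsigned long"
#endif

#endif /* BASE_ASSUMPTIONS_H */
```

Only preprocessor checks are used, so the same header should work from both C and C++ translation units.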

Thing is, on those strange architectures (even on DSPs), there are usually compiler options that can be used to trade generated machine code efficiency for a more typical C environment.  On such, one can use the flags to compile the original code, then refactor the code to this strange architecture (typically a DSP with very specific/unique quirks in C), and verify the new code by comparing unit test results to the original (but possibly horribly slow) code.
In particular, on these architectures, you would usually not use the exact-width intN_t/uintN_t types, but either the base types, custom types provided by that target and C compiler, or int_leastN_t/uint_leastN_t/int_fastN_t/uint_fastN_t types.
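For example, 16-bit wraparound arithmetic does not need uint16_t at all. A sketch under that assumption (add16_wrap is a hypothetical name): store the value in uint_least16_t, compute in uint_fast32_t, and mask explicitly.

```c
#include <stdint.h>

/* Add two 16-bit quantities modulo 2^16 without requiring an exact-width type.
   uint_least16_t may be wider than 16 bits (e.g. on a DSP with 24 or 32-bit
   char), so the result is masked down to 16 bits by hand. */
static uint_least16_t add16_wrap(uint_least16_t a, uint_least16_t b)
{
    uint_fast32_t sum = (uint_fast32_t)a + (uint_fast32_t)b;
    return (uint_least16_t)(sum & UINT32_C(0xFFFF));
}
```

On a mainstream target the mask compiles away to nothing; on an odd one it is what keeps the results identical.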

So, it's really a win-win for us developers, and like SiliconWizard says, it does let C compilers produce more effective code.

Sure, it is a bit annoying, especially in that us developers need to find the way to express ourselves in C efficiently, which is not always logical (because of the concepts like "char" instead of "byte"); and we really do need to explicitly express the target assumptions (like in the <stdint.h>/<limits.h> check above –– remember, they're always available, even in a freestanding environment; you can consider them to be provided by the C compiler and not the standard C library implementation per se).
 

Offline SiliconWizard

  • Super Contributor
  • ***
  • Posts: 15185
  • Country: fr
Re: Are data type compiler-dependent or target dependent
« Reply #27 on: August 04, 2022, 07:15:52 pm »
As Nominal (who kindly took the time for this elaborate answer) said.

If you really want a language more robust and with much fewer quirks, just use Ada. I don't even understand, given the statements you (@DiTBho) make on various programming topics, why you even bother with C. It clearly doesn't match your expectations/requirements.
 

Offline DiTBho

  • Super Contributor
  • ***
  • Posts: 4221
  • Country: gb
Re: Are data type compiler-dependent or target dependent
« Reply #28 on: August 04, 2022, 11:05:27 pm »
If you really want a language more robust and with much fewer quirks, just use Ada.

I do use Ada! What is the point now? I think I am the only person in the world supporting Gnat on HPPA and MIPS4; it took me years to prepare a valid Gnat compiler

Do you know how difficult is preparing a Gnat compiler for MIPS4? That's why I have to use C.

The same applies to GoLang: porting llvm to HPPA is not a piece of cake. llvm doesn't work on HPPA because it has no support and without llvm you can forget Golang.
 

Offline DiTBho

  • Super Contributor
  • ***
  • Posts: 4221
  • Country: gb
Re: Are data type compiler-dependent or target dependent
« Reply #29 on: August 04, 2022, 11:06:15 pm »
I don't even understand, given the statements you (@DiTBho) make on various programming topics, why you even bother with C. It clearly doesn't match your expectations/requirements.

Because things like sizeof(x) are really stupid in the way they have been implemented, and the annoying and frustrating part is that it would be extremely easy for competent people to fix them, yet people argue that it's OK.

Why don't people fix it once and for all in, say, a "C-2022" revision, instead of introducing more stupid workarounds? I cannot believe the answer is "because otherwise it won't compile for the PDP-8".

For the record, I have designed my own C-like compiler targeting MIPS4 only, and

sizeof(type) has been renamed byte_sizeof(type), which always returns things measured in byte
char, short, int, long, long long are banned so you cannot use them
It doesn't need any external header to *redefine* the basic data-types
char8_t (always unsigned, and always 8bit, it's for ASCII stuff)
char16_t (always unsigned, and always 16bit, it's for unicode stuff)
uint8, uint16, uint32, uint64,
sint8, sint16, sint32, sint64,
fp32, fp64, fx1616, fx2408,
cplxfp32, cplxfp64, cplxfx1616, cplxfx2408, (complex numbers)
p_uint8, p_uint16, p_uint32, p_uint64,  (p_ means pointer)
p_sint8, p_sint16, p_sint32, p_sint64,
p_fp32, p_fp64, p_fx1616, p_fx2408,
p_cplxfp32, p_cplxfp64, p_cplxfx1616, p_cplxfx2408,
p_this, p_that (like void*)
p_char8
p_char16
string8 (this is like p_char8, but the first cell stores the length)
string16 (this is like p_char16, but the first cell stores the length)

There is also boolean, which is a true type with its logic operators
(p_boolean is its pointer)

All of these are built-in; you don't need any (perpetually wrong, bugged) header.

I am so frustrated with all the shit done by GNU with their glibc headers, always broken, always with problems because the last hacker decided to change something; supporting them on Gentoo is more frustrating than thinking you can cool hell, and this is because people continuously modify those bloody headers and you end up with stuff that no longer compiles or, worse still, broken stuff.

Is it reasonable? I don't think so! And here, see my simple solution:

My MIPS4 prototype doesn't implement 16-bit operations (some instructions are missing in hardware), and Gcc doesn't fully support a MIPS4 CPU like the R18K (which doesn't officially exist, anyway; the last one was the R16K, and the last supported was the R12K).

So, I applied these nazy rules:

uint16, sint16, p_uint16, p_sint16 are NOT defined, and there is no way to define them, so the user cannot mess up anything!

Do you see how coherent, simple, elegant it is?

You have a piece of C code where you see "uint16_t ", the compiler won't accept it and returns

"sorry, this target doesn't support 16 bit datatype"

Which is super-clear, simple, and not bug-prone.

Applied to TI320, ... "char" is banned, the compiler doesn't accept it, you have to use "char8" or "char16", and when you try to define "char8 something"

"sorry, this target doesn't support 8 bit datatype"

I have these operators

bit_sizeof(type), returns the size of a type in bits, e.g. bit_sizeof(uint8) returns 8
byte_sizeof(type), returns the size of a type in bytes (byte = 8 bits), e.g. byte_sizeof(uint8) returns 1
typeof(type), returns the type (grabbed from the typedef-space), that's super useful for polymorphic code
Code: [Select]
switch (typeof(type))
{
     case fx2408:
          ...
          break;
     case fx1616:
          ...
          break;
     ...
}

case can be of any type, you can compare strings, fp numbers, etc

logicalExOr: ^^ is not defined by C, and again you have to provide a header with this definition
Code: [Select]
#define logicalExOr(A, B)     ((!(A)) != (!(B))) // new; arguments parenthesized so expressions like a|b expand safely
#define logicalExNOr(A, B)    ((!(A)) == (!(B))) // new

My compiler comes with a built-in logicalExOr operator, which ONLY works on boolean stuff.
If you try to apply it to say "uint8" it will trigger an error.
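For comparison, a sketch of how close plain C99 gets without a new operator (logical_xor and logical_xnor are hypothetical names). Unlike the myC built-in described above, this does not reject non-boolean operands; converting to bool merely normalizes any nonzero value to true.

```c
#include <stdbool.h>

/* Logical XOR on truth values. Arguments are converted to bool at the call,
   so any nonzero value counts as true. */
static inline bool logical_xor(bool a, bool b)
{
    return a != b;
}

/* Logical XNOR (equivalence) on truth values. */
static inline bool logical_xnor(bool a, bool b)
{
    return a == b;
}
```

Being real functions, these evaluate each argument exactly once, which the macro versions cannot promise if an argument has side effects and is ever expanded twice.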

Anyway, I cannot call my compiler "C"; it's not C-compliant in several aspects, but who cares? Gcc can compile my compiler and it works on HPPA, and I am able to (cross)compile, on an HPPA workstation, a hacked version of XINU(1) for a MIPS4 prototype.

(the generated machine-code sucks about optimization, I know ... , but) next step, re-targeting for 68K


- in Conclusion -
I hope someone with more skills and attributes than me will one day take all the bullshit out of C and fix it right once and for all without making C too big like C++


(1) Written in C89, rewritten in "myC". It took a while because I also cleaned it up a bit, but it was not a difficult task.
« Last Edit: August 04, 2022, 11:20:05 pm by DiTBho »
 

Online brucehoult

  • Super Contributor
  • ***
  • Posts: 4416
  • Country: nz
Re: Are data type compiler-dependent or target dependent
« Reply #30 on: August 04, 2022, 11:29:32 pm »
This is why stdint was introduced in C99.


All this is not "insane", it all comes from the fact that the C standard has always had the goal of supporting a very wide range of targets while making it possible for compilers to produce efficient code.

Why do you always have to defend the mistakes of the C compiler?

Language definition, not compiler. And why "mistakes"?

Quote
A byte must be 8 bits to give us a common and universally accepted reference, just as 1000mm is universally accepted as 1m; otherwise you need to introduce yet another constant, which only clutters the C compiler with more crap and leaves people more confused.

You may be thinking of "octet".

"Byte" is usually 8 bits these days, but not universally so.

Quote
To really support a very wide range of targets and aim for true portability (which is a utopia, anyway), char must always equal a byte, sizeof(type) must be measured in multiples of 8 bits, and data types must be expressed by bit width and sign.

And yet C is the most portable efficient language we have, and it doesn't do that.

Quote
What the frog is the meaning of sizeof(char)=1 if char is 16bit???

Every object in C is an exact multiple of char in size and every positive integer is a possible object size.

If sizeof(char) was 2 then that would imply you could have something with size 3. Or 1. But you can't.  Char is the smallest addressable unit, and the measure of all things. It is 1.
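A small sketch of what "the measure of all things" means in practice (count_nonzero_units is a hypothetical name): any object can be walked as exactly sizeof(object) units of unsigned char, whatever CHAR_BIT happens to be.

```c
#include <stddef.h>   /* size_t */

/* Walk an object of 'size' units (as reported by sizeof) and count how many
   of its unsigned char units are nonzero. */
static size_t count_nonzero_units(const void *obj, size_t size)
{
    const unsigned char *p = obj;
    size_t n = 0;

    for (size_t i = 0; i < size; i++) {
        if (p[i] != 0) {
            n++;
        }
    }
    return n;
}
```

Calling count_nonzero_units(&x, sizeof x) works for an x of any object type, and there is no object for which the loop would have to stop "halfway through a unit".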

Quote
All other things made with c is pure garbage

Some of us like it and find it fits our purposes. I'm sorry it doesn't meet your needs.
 
The following users thanked this post: newbrain, MK14

Online newbrain

  • Super Contributor
  • ***
  • Posts: 1761
  • Country: se
Re: Are data type compiler-dependent or target dependent
« Reply #31 on: August 05, 2022, 09:13:51 am »
Quote
You have a piece of C code where you see "uint16_t ", the compiler won't accept it and returns

"sorry, this target doesn't support 16 bit datatype"

Which is super-clear, simple, and not bug-prone.
Oh, but the C standard already defines exact sized types as optional - C11, 7.20.1.1 Exact-width integer types, §3 (there since C99).
It only says that:
Quote
These types are optional. However, if an implementation provides integer types with widths of 8, 16, 32, or 64 bits, no padding bits, and (for the signed types) that have a two’s complement representation, it shall define the corresponding typedef names.

C23 slightly amended this, removing the 8-16-32-64 size reference (and the two's complement mention, as it's the mandatory signed integer representation):
Quote
If an implementation provides standard or extended integer types with a particular width and no padding bits, it shall define the corresponding typedef names

to exactly the same effect, but in a more general way.
CHAR_BIT is still at least 8, but can be more.

You can still have (u)int_least16_t (which will still be 32 bits) though.

So you reinvented an already well-thought-out wheel.  :-//
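Since <stdint.h> defines the limit macros only for the types it actually provides, code can detect a missing uint16_t at compile time and pick its own policy. A sketch, where u16_storage, U16_STORAGE_IS_EXACT, and u16_wrap are hypothetical names:

```c
#include <stdint.h>

/* UINT16_MAX is defined if and only if uint16_t exists on this target. */
#ifdef UINT16_MAX
typedef uint16_t u16_storage;        /* exactly 16 bits, no padding */
#define U16_STORAGE_IS_EXACT 1
#else
typedef uint_least16_t u16_storage;  /* at least 16 bits, always available */
#define U16_STORAGE_IS_EXACT 0
#endif

/* Reduce a value modulo 2^16 before storing it, so behaviour is the same
   whether or not the storage type is exactly 16 bits wide. */
static u16_storage u16_wrap(uint_fast32_t v)
{
    return (u16_storage)(v & UINT32_C(0xFFFF));
}
```

Replacing the #else branch with an #error gives the "sorry, this target doesn't support 16 bit datatype" behaviour instead of a fallback.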

Quote
p_uint8, p_uint16, p_uint32, p_uint64,  (p_ means pointer)
Ok, I'll bite: what if I need a pointer to pointer? p_p_uint8?
Are these predefined typedefs or lexing/parsing tricks (i.e. the lexer will blindly consume 'p_' prefixes tokenizing them as '*')?
Nandemo wa shiranai wa yo, shitteru koto dake.
 
The following users thanked this post: Siwastaja, MK14

Offline DiTBho

  • Super Contributor
  • ***
  • Posts: 4221
  • Country: gb
Re: Are data type compiler-dependent or target dependent
« Reply #32 on: August 05, 2022, 10:21:39 am »
what if I need a pointer to pointer? p_p_uint8?

casting: banned
pointer to pointer: banned
pointer to function: allowed
custom typedef: allowed, but pointer to pointer-typedef: banned

Are these predefined typedefs or lexing/parsing tricks

defined at the parsing level, inserted into the "typedef world", so they are really built-in typedef.

i.e. the lexer will blindly consume 'p_' prefixes tokenizing them as '*')?

"*" is only allowed (at the parsing level) for multiplications, and only if the target has a multiply-unit

otherwise, nazy-rule out:

"sorry, this hardware doesn't have any hardware multiply, you have to use softmul"
 

Online newbrain

  • Super Contributor
  • ***
  • Posts: 1761
  • Country: se
Re: Are data type compiler-dependent or target dependent
« Reply #33 on: August 05, 2022, 10:50:32 am »
...(The Horror! The Horror!)...
Ah well, some say there's pleasure in pain - who am I to judge other people's perversions...still, I cannot escape a feeling of morbid fascination.

How do you cope with, say, arrays of strings (which will decay to pointers to pointers in most contexts, first of all as function parameters)?
Do you also forego all the standard library functions that take ** arguments (e.g. strtod())?
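To show what would be lost, here is the usual strtod() idiom as a sketch; parse_doubles is a hypothetical helper, strtod() itself is the standard function. The char ** argument is how the library reports where parsing stopped.

```c
#include <stddef.h>   /* size_t */
#include <stdlib.h>   /* strtod */

/* Parse up to 'max' whitespace-separated doubles from 'text' into 'out'.
   Returns the number of values converted. */
static size_t parse_doubles(const char *text, double *out, size_t max)
{
    size_t n = 0;

    while (n < max) {
        char *end;                      /* strtod() writes the stop position here */
        double v = strtod(text, &end);  /* &end has type char ** */

        if (end == text) {
            break;                      /* nothing converted: stop */
        }
        out[n++] = v;
        text = end;                     /* continue after the parsed number */
    }
    return n;
}
```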
 
The following users thanked this post: MK14

Offline DiTBho

  • Super Contributor
  • ***
  • Posts: 4221
  • Country: gb
Re: Are data type compiler-dependent or target dependent
« Reply #34 on: August 05, 2022, 12:28:44 pm »
How do you cope with, say, arrays of strings

string8 and string16 are not pointer types; they are class0 types (basic types), therefore an array of them is allowed as a class1 type (pointer type).

« Last Edit: August 05, 2022, 01:02:57 pm by DiTBho »
 

Offline DiTBho

  • Super Contributor
  • ***
  • Posts: 4221
  • Country: gb
Re: Are data type compiler-dependent or target dependent
« Reply #35 on: August 05, 2022, 12:40:47 pm »
If sizeof(char) was 2 then that would imply you could have something with size 3. Or 1. But you can't.  Char is the smallest addressable unit, and the measure of all things. It is 1.

Motorola 56000 is a very old and weird DSP. It never really had a C compiler until the architecture evolved into 56300, but there was an attempt to support 56000 during Gcc v2.95 era.

And you guess what? Operand sizes are defined as follows:
  • byte is 8 bits long
  • short word is 16 bits long
  • word is 24 bits long
  • long word is 48 bits long
  • accumulator is 56 bits long
People gave up, and programmed it in assembly.


myC can easily solve the problem

  • byte --> uint8, sint8
  • short word --> uint16, sint16
  • word --> uint24, sint24
  • long word is 48 --> uint48, sint48
  • accumulator is 56 --> uint56, sint56
 

Offline DiTBho

  • Super Contributor
  • ***
  • Posts: 4221
  • Country: gb
Re: Are data type compiler-dependent or target dependent
« Reply #36 on: August 05, 2022, 12:43:59 pm »
Oh, but the C standard

but C90, 5.2.4.2.1 (numbered 2.2.4.2.1 in C89, with identical content) requires CHAR_BIT >= 8 and UCHAR_MAX >= 255.

They treat "char" and "byte" as essentially synonymous

What the fuck? Again  :palm:
 

Offline Nominal Animal

  • Super Contributor
  • ***
  • Posts: 6788
  • Country: fi
    • My home page and email address
Re: Are data type compiler-dependent or target dependent
« Reply #37 on: August 05, 2022, 01:48:48 pm »
casting: banned
pointer to pointer: banned
pointer to function: allowed
custom typedef: allowed, but pointer to pointer-typedef: banned
You forget: You're not really working with C, you are working with an externally defined/dictated subset/superset of C.  You cannot really blame C for not being fit for those, can you?

I mean, if one could not pronounce the consonants r, t, or m, it wouldn't be the fault of the language they're speaking that others would have difficulty understanding them, right?  You could say it is stupid for languages to differentiate between the consonants L and R (as not all do), but it would not be a fair characterisation.  It is just a difficult situation compounded by a number of things, so we muddle through (grumbling as we go) the best we can.
 

Offline DiTBho

  • Super Contributor
  • ***
  • Posts: 4221
  • Country: gb
Re: Are data type compiler-dependent or target dependent
« Reply #38 on: August 05, 2022, 02:18:41 pm »
You forget: You're not really working with C, you are working with an externally defined/dictated subset/superset of C.  You cannot really blame C for not being fit for those, can you?

Yes, I can!  :D

I can, for all the time I wasted on glibc on HPPA and MIPS: C favors bad programming practices that are very prone to errors and misunderstandings(1), and therefore bugs. I can, because C desperately needs headers to correctly address its features, and again this generates errors, misunderstandings, and therefore bugs.

Silly stuff like assuming char is unsigned when you don't explicitly say so. Last time it wasted 45 minutes of my time before I got it.
It was in a header that says

Code: [Select]
typedef char u8;

Fsk shit!!! 45 minutes of my life on that!
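For anyone who hits the same thing: whether plain char is signed is implementation-defined, and <limits.h> tells you which one you got. A sketch with hypothetical helper names (plain_char_is_signed, byte_value):

```c
#include <limits.h>   /* CHAR_MIN */

/* Returns 1 if plain char is a signed type on this implementation, else 0. */
static int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}

/* Read a byte stored in a plain char as a value in 0..UCHAR_MAX, regardless
   of whether plain char is signed here. */
static unsigned byte_value(char c)
{
    return (unsigned char)c;
}
```

A header that wants an unsigned 8-bit type should spell it `unsigned char` (or uint8_t where available), never plain char.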

I'm working with myC not because I am masochist but rather because C sucks at certain things (to the point we had to invent MISRA to contain them) and people don't want to fix them at the language-level, and if nothing has changed until now, nothing will change in the near future.

Therefore I do blame C for all the flaws that people don't fix. I blame C because if those flaws are tolerable with mono cpu and common memory models, they become completely out of control, with machines like the R18200 and its tr-memory, and, worse still, because threads cannot be implemented as a C-library.

Oh, and if you think tr-memory is just something you'll never see in a consumer computer: my boss's POWER10 workstation has tr-memory in every POWER-core.

Now, try to handle that with C and you'll have a lot of trouble: you won't be able to complete a single working program without wasting weeks on a debugger.



edit:
(1) Here the problem is that glibc is regularly fixed only for mainstream hardware { x86, arm }; HPPA, MIPS, and SH are not so lucky, so you have to fix stuff yourself. Usually, a lot of these bugs are in headers: wrong size_t, wrong pointer size, wrong data-type size, a wrong directive giving the compiler the wrong typedef or constant, etc. All silly stuff that wastes a lot of time.
« Last Edit: August 05, 2022, 02:40:11 pm by DiTBho »
 

Online newbrain

  • Super Contributor
  • ***
  • Posts: 1761
  • Country: se
Re: Are data type compiler-dependent or target dependent
« Reply #39 on: August 05, 2022, 06:25:50 pm »
  • byte is 8 bits long
  • short word is16 bits long
  • word is 24 bits long
  • long word is 48 bits long
  • accumulator is 56 bits long
[...]
myC can easily solve the problem
As I said earlier, since C99 this is not a problem in C.
int8_t, int16_t, int24_t, int48_t and int56_t are all perfectly fine fixed-size integers, together with their unsigned siblings.
No need to provide int32_t or int64_t; in fact, it might even be impossible on that architecture (e.g. due to alignment constraints): exact-width types must not have padding bits.

Then you have to provide the mandatory "least" and "fast" types:
int_least8_t -> int8_t
int_least16_t -> int16_t
int_least32_t -> int48_t
int_least64_t -> this needs to be implemented, not being native. Possibly as a 48+48.

yourC does not seem to have a real advantage here.

I'm still curious about how pointers to pointers are handled in library calls (unless you've thrown away parts of the standard library) and in other cases where you need to pass a pointer to a function that needs to change it and give it back to the caller.

Of course one can wrap the target pointer in a structure; then the actual argument becomes a pointer to a struct that only contains a pointer to something else... it does not run afoul of the rules, it just makes them pointless.
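To illustrate in plain C (all names here are hypothetical), these are the two forms side by side: the first takes a pointer to pointer, the second hides the same pointer inside a struct. They do the same job.

```c
#include <ctype.h>   /* isspace */

/* Form 1: pointer to pointer. The caller's cursor is advanced in place. */
static void skip_spaces(const char **cursor)
{
    while (isspace((unsigned char)**cursor)) {
        (*cursor)++;
    }
}

/* Form 2: the same pointer wrapped in a struct, passed by address. */
struct cursor {
    const char *pos;
};

static void skip_spaces_wrapped(struct cursor *c)
{
    while (isspace((unsigned char)*c->pos)) {
        c->pos++;
    }
}
```

The second form passes a "no pointer to pointer" rule to the letter, yet one level of indirection is still there; it has only been given a name.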

Or does yourC also allow passing by reference, à la C++?

Oh, and how is constantness (and volatility etc.)  addressed? Is it possible to declare a const pointer to a const object (and all the other combinations)?

Fascinating. I'll have to recite the 8 translation phases (5.1.1.2) ten times to cleanse my soul after staring into the abyss.
(Just joking. Not going to atone for other people's sins  >:D)
 
The following users thanked this post: MK14

Offline SiliconWizard

  • Super Contributor
  • ***
  • Posts: 15185
  • Country: fr
Re: Are data type compiler-dependent or target dependent
« Reply #40 on: August 05, 2022, 06:58:32 pm »
Designing a better C (I happened to have opened a thread dedicated to this a while ago and it unexpectedly got nowhere) is almost a lost cause. I know it sounds corny, but we now have 50 years to back this up.

Recent attempts that have claimed to be that while keeping the "spirit" of C (so no object orientation or any other fancy mix of paradigms) are actually relatively few. We can name Zig and Odin. Rust is definitely not that. While Zig is interesting, it just has its own set of problems. Odin also has interesting ideas, but it's a bit too opinionated to ever be widespread IMHO.

As I hinted and DiTBho confirmed, he actually likes Ada and would apparently like to use it more, his problem being that there is no Ada compiler for some of the platforms he targets. He doesn't need a "better C", he needs Ada.

So his preferred language has already been designed. No need to try and design another one, for which there wouldn't be any more compilers available anyway.

I personally find C pretty usable. For any advanced use, I definitely recommend reading the standard for whatever revision you're going to use. It's a must. And reading the latest revisions (C11, and even the C23 draft) could also give you a couple of ideas and show you what "modern C" can bring to the table.

As to Ada, this is certainly a language that I would like to use, but not as is, at least not for most projects I work on. (I would have no problem using it for the super-critical stuff it's usually used for.)
So I would like some successor of Ada with only a subset of it (but which subset is the hard task), possibly a slightly more compact syntax, and a very limited or even no runtime required.

But that does not exist, and being pragmatic tells us that making the best of existing and well-established tools is way more productive than chasing after hypothetical ones.
 
The following users thanked this post: newbrain, MK14

Offline DiTBho

  • Super Contributor
  • ***
  • Posts: 4221
  • Country: gb
Re: Are data type compiler-dependent or target dependent
« Reply #41 on: August 05, 2022, 08:06:16 pm »
yourC does not seem to have a real advantage here.

I spent two years on this problem for two reasons
1) hacking Gcc in order to support "MIPS4++" is beyond my skills, and dealing with GNU guys is not ... sane
2) I had to solve tr-memory and concurrency problems with R18200, which is a weird beast

Cray is also odd, sizeof(char) = 1, but char is 32 bit, as well as short, int, and long.
MIPS4++, as is, exposes a similar problem: everything is 64 bit at the hardware level

uint8 comes through several tricks in the machine-level, but at least ... I have a couple of hardware instructions to correctly address a byte during Load/Store
Code: [Select]
size byte3 byte2 byte1 byte0  shift auto_mask    address_mask
08                         x  R(00) 0x0000.00ff  0xffff.fffc
08                   x        R(08) 0x0000.ff00  0xffff.fffc
08             x              R(16) 0x00ff.0000  0xffff.fffc
08       x                    R(24) 0xff00.0000  0xffff.fffc
16                   *** not supported ***
32       x     x     x     x  R(00) 0xffff.ffff  0xffff.ffff

thanks to this, I can still have byte-granularity, and I can also have 4byte granularity :D

tr-memory is even more problematic because it's 35bit (quad port), but you need to address it with 32bit modulo0, hence like a 32-bit-only memory.

Extra bits are "meta". The machine layer copies them into Reg14, and exposes them on demand through the operator ".meta", which doesn't return uint8 (casting is banned in myC) and it's only usable through a bit operator ".bit[]";

For the user it is really simple and intuitive

Code: [Select]
uint32_t addr;
tr_t trmem[4K_cells];
uint32_t data;
booleant_t bit_0;
booleant_t bit_1;
booleant_t bit_2;

addr = 0x8000.0000;
data = trmem[addr].data;
bit_0 = trmem[addr].meta.bit[0];
bit_1 = trmem[addr].meta.bit[1];
bit_2 = trmem[addr].meta.bit[2];

bit_sizeof(trmem); /* myC built-in operator, returns 35 */
bit_sizeof(trmem.data); /* myC built-in operator, returns 32 */
bit_sizeof(trmem.meta); /* myC built-in operator, returns 3 */
n_of_cells(trmem); /* myC built-in operator, return 4000 */

tr_t is built_in and accessible in every detail from the high level!
sizeof(type) makes no sense, and it has been removed.

when you need to pass a pointer to a function that needs to change it and give it back to the caller

Code: [Select]
p_uint32_t do_xxxx
(
     p_uint32_t p_x
)
{
     p_uint32_t  ans;

     ans = p_x;
     return ans;
}

arguments

If you need a function that can modify an argument ... well, it's not possible; it has been deliberately banned since the beginning. You have to create an object and pass its pointer, which means you have to re-think your software design carefully, and that is exactly my purpose, since other people tend to mess up code.

Code: [Select]
public void matrix_cell_is_fx1616
(
    p_matrix_t p_matrix
)
{
    p_matrix->context.method.cmp.isle = cmp_isle;
    p_matrix->context.method.cmp.islt = cmp_islt;
    p_matrix->context.method.cmp.isge = cmp_isge;
    p_matrix->context.method.cmp.isgt = cmp_isgt;
    p_matrix->context.method.cmp.is_0 = cmp_is_0;
    p_matrix->context.method.cmp.iseq = cmp_iseq;
    p_matrix->context.method.let.show = let_show;
    p_matrix->context.method.let.copy = let_copy;
    p_matrix->context.method.let.swap = let_swap;
    p_matrix->context.method.eval.add = eval_add;
    p_matrix->context.method.eval.sub = eval_sub;
    p_matrix->context.method.eval.mul = eval_mul;
    p_matrix->context.method.eval.div = eval_div;
    p_matrix->context.method.eval.mac = eval_mac;
    p_matrix->context.method.eval.msc = eval_msc;
    p_matrix->context.method.eval.rem = eval_rem;
    p_matrix->context.method.eval.abs = eval_abs;
    p_matrix->context.method.eval.clr = eval_clr;

Code: [Select]
    matrix_t        matrix;
    p_matrix_t      p_matrix;

    p_matrix = get_address(matrix);
    matrix_init(p_matrix, 4, 4, matrix0_data, matrix_cell_is_fx1616);

This is a polymorphic linear system solver written in myC: 70% is portable to C with a few modifications (thanks to a "compatibility header")

The code (cross)compiles on myC targeting the MIPS4++ R18200. It works correctly but the machine code is not optimized, because well ... there isn't an optimizer yet; myC only outputs -O0 assembly.

But it's stable and allows me to play with concurrency: I can split the LU decomposition into blocks and perform the evaluations on four cores, with the results exposed on the tr-memory.

The function above modifies a lot of function pointers: this is the ONLY allowed way to modify a pointer in myC.

Quote
Is it possible to declare a const pointer to a const object (and all the other combinations)?

"const" is one of the deceiving words that I immediately banned, as well as "break", "goto", "volatile", "static" and "extern".

myC v1: "Break" is banned when inside a loop, only allowed inside a switch case
myC v2: "Break" is entirely banned, switch case must use {}

Code: [Select]
switch ()
{
     case xxx /* <----- note xxx no longer looks like a label: the symbol ":" is banned */
          {
          }
     default xxx
          {
          }
}

"=", "==", "&", "&&", "!", "|", "||", "~", "^" are banned and replaced with operators.
if(..) and while(..) cannot accept an expression, only a boolean, this helps the ICE.

pointer arithmetic is banned, and when you need to dereference, you need to call the built-in operator dereference(..), which removes the ambiguity with "*" and helps the ICE





Personally, I have to say, myC is less frustrating than C by several orders of magnitude, not because I made it, but because I made it exactly the way that helps keep the code clean and simple, especially during my ICE-debugging sessions.
« Last Edit: August 05, 2022, 08:14:10 pm by DiTBho »
The opposite of courage is not cowardice, it is conformity. Even a dead fish can go with the flow
 
The following users thanked this post: newbrain

Online newbrain

  • Super Contributor
  • ***
  • Posts: 1761
  • Country: se
Re: Are data type compiler-dependent or target dependent
« Reply #42 on: August 05, 2022, 08:44:04 pm »
Quote
I had to solve tr-memory and concurrency problems with R18200, which is a weird beast
Wow.
See attachment.
Nandemo wa shiranai wa yo, shitteru koto dake.
 

Offline TheCalligrapher

  • Regular Contributor
  • *
  • Posts: 151
  • Country: us
Re: Are data type compiler-dependent or target dependent
« Reply #43 on: August 05, 2022, 09:48:12 pm »
Are data type compiler-dependent or target dependent?

Absolutely _everything_ is compiler dependent (or, in more formal terms, implementation dependent). No exceptions.

However, compilers are not created in a vacuum. For the sake of efficiency, they do take into account specific properties of the target: hardware, OS, etc. But these are nothing more than considerations of common sense and efficiency. All of them can be ignored, circumvented and overridden by the compiler, should it become necessary for some reason.
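
One way to see this on any given toolchain is to ask the implementation itself. The sketch below is only an illustration (the function name `print_type_widths` is made up here): it prints a few implementation-defined properties and checks at compile time the minimums that the C standard does guarantee.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>
#include <stdio.h>

/* Guaranteed by the C standard on every conforming implementation */
static_assert(CHAR_BIT >= 8, "a char has at least 8 bits");
static_assert(sizeof(char) == 1, "sizeof is measured in units of char");
static_assert(sizeof(int) * CHAR_BIT >= 16, "int has at least 16 bits");
static_assert(sizeof(long) * CHAR_BIT >= 32, "long has at least 32 bits");

/* Everything printed here is chosen by the implementation
 * (compiler + ABI), usually with the target in mind. */
static void print_type_widths(void)
{
    printf("CHAR_BIT       = %d\n", CHAR_BIT);
    printf("sizeof(short)  = %zu\n", sizeof(short));
    printf("sizeof(int)    = %zu\n", sizeof(int));
    printf("sizeof(long)   = %zu\n", sizeof(long));
    printf("sizeof(void *) = %zu\n", sizeof(void *));
    printf("plain char is %s\n", (CHAR_MIN < 0) ? "signed" : "unsigned");
}
```

Building the same file with two compilers, or with the same compiler and different ABI flags, is a quick way to observe how much of the output changes.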
« Last Edit: August 05, 2022, 09:50:19 pm by TheCalligrapher »
 

Offline westfw

  • Super Contributor
  • ***
  • Posts: 4281
  • Country: us
Re: Are data type compiler-dependent or target dependent
« Reply #44 on: August 05, 2022, 10:56:24 pm »
Quote
You can still have (u)int_least16_t
I think the _least and _fast types are probably vastly under-used, even by programmers that use int8_t and similar religiously.

Which vaguely makes sense, since they're UGLY.  Part of a programming language's features is supposed to be readability.  "char msg[] = "Hello World";" is obvious and easy to read.   The "const unsigned char msg[]" form that C++ wants you to use, considerably less so.  "const uint_least8_t msg[]" is pretty horrible.
 

Online brucehoult

  • Super Contributor
  • ***
  • Posts: 4416
  • Country: nz
Re: Are data type compiler-dependent or target dependent
« Reply #45 on: August 06, 2022, 04:03:12 am »
Quote
You can still have (u)int_least16_t
I think the _least and _fast types are probably vastly under-used, even by programmers that use int8_t and similar religiously.

Which vaguely makes sense, since they're UGLY.  Part of a programming language's features is supposed to be readability.

They are certainly ugly.

The solution (which there are many other reasons to follow) is to not sprinkle them throughout your code, but to typedef meaningful names and use them everywhere instead of built in C types e.g. "typedef uint_least16_t CustID; typedef NativeChar CharArr[]".
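
As a small sketch of that advice: the `CustID` name comes from the example above, while `next_cust_id` and the wrap-at-16-bits policy are assumptions made purely for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* The ugly type name appears once, here, and nowhere else in the code */
typedef uint_least16_t CustID;

/* Hypothetical helper: returns the next customer ID, wrapping at 16 bits.
 * The mask is needed because CustID is only guaranteed to have AT LEAST
 * 16 bits; on some targets it may be wider and would not wrap by itself. */
static CustID next_cust_id(CustID id)
{
    return (CustID)((id + 1u) & 0xFFFFu);
}
```

If a later target needs a different width, only the typedef (and any explicit masks) change; the code that uses `CustID` stays as it is.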

Quote
"char msg[] = "Hello World";" is obvious and easy to read.   The "const unsigned char msg[]" form that C++ wants you to use, considerably less so.  "const uint_least8_t msg[]" is pretty horrible.

All the more so because it's wrong either way on x86 and MIPS (?), where plain char, and therefore string literals, are signed.
 

Offline DiTBho

  • Super Contributor
  • ***
  • Posts: 4221
  • Country: gb
Re: Are data type compiler-dependent or target dependent
« Reply #46 on: August 06, 2022, 08:18:07 am »
Quote
_fast types

Probably only IBM guys love it.

If you program POWER10 or POWER11 machines you have hardware-loop instructions and it would be great if you can tell the compiler to make good use of them.
Code: [Select]
uint_fast32_t i0; /* hope the compiler is smart and understands what I am telling it here */
...
for (i0 = 0; i0 < n ; i0++) /* please, use a special loop register, and use a loop instruction */
{
         /* you also have to use a general purpose register, i0 needs to go from 0 to n-1 */
         /* while the special loop register goes from n to 0 */
}
Why is it better? Well, because a hardware loop instruction never causes a misprediction: it's not an if-then-else branch, it's a down-counter kind of instruction, and when your pipeline is 14 or 20 stages deep, it saves you from flushing those 14 or 20 stages.

You really appreciate it when you do a big LUP decomposition of a matrix, because you basically have three big nested loops :D
 

Online brucehoult

  • Super Contributor
  • ***
  • Posts: 4416
  • Country: nz
Re: Are data type compiler-dependent or target dependent
« Reply #47 on: August 06, 2022, 08:31:52 am »
Quote
_fast types

Probably only IBM guys love it.

If you program POWER10 or POWER11 machines you have hardware-loop instructions and it would be great if you can tell the compiler to make good use of them.
Code: [Select]
uint_fast32_t i0; /* hope the compiler is smart and understands what I am telling it here */
...
for (i0 = 0; i0 < n ; i0++) /* please, use a special loop register, and use a loop instruction */
{
         /* you also have to use a general purpose register, i0 needs to go from 0 to n-1 */
         /* while the special loop register goes from n to 0 */
}
Why is it better? Well, because a hardware loop instruction never causes a misprediction: it's not an if-then-else branch, it's a down-counter kind of instruction, and when your pipeline is 14 or 20 stages deep, it saves you from flushing those 14 or 20 stages.

You really appreciate it when you do a big LUP decomposition of a matrix, because you basically have three big nested loops :D

There is only one CTR register (which lives in the instruction fetch unit, along with the PC and LR), so only the innermost loop can use it. CTR is also used for calling function pointers / C++ virtual functions, so if you're doing any of that in the inner loop (which of course you shouldn't be) then you don't get to use CTR as a loop counter at all.

CTR is the same size as any other integer register, so it's no great trick to get the compiler to use it. You simply need any counted loop that doesn't have another counted loop or indirect function call within it.
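
A loop of the shape described above might look like the following sketch. The function `sum_cells` is a made-up example; whether a particular compiler actually maps it onto CTR (typically an `mtctr` before the loop and a `bdnz` at the bottom) depends on the compiler version and optimization flags, so treat that as an assumption to verify in the generated assembly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Innermost counted loop: the trip count n is known on entry, and the body
 * contains no nested counted loop and no indirect function call, so nothing
 * else competes for the count register. */
static uint32_t sum_cells(const uint32_t *cells, size_t n)
{
    uint32_t sum = 0u;
    for (size_t i = 0; i < n; i++) {
        sum += cells[i];
    }
    return sum;
}
```

Note that the loop is written with a plain up-counting index; no special type or annotation is used to request the hardware loop.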

All this worked fine and as expected with the C/C++ compilers we had on PowerPC Macs 25 years ago.
 
The following users thanked this post: DiTBho

Offline DiTBho

  • Super Contributor
  • ***
  • Posts: 4221
  • Country: gb
Re: Are data type compiler-dependent or target dependent
« Reply #48 on: August 06, 2022, 08:44:14 am »
Quote
Wow.

yes, you will find nothing on Google, that's a big problem for me: no documentation.

MIPS R16000 was a product sold by MIPS for high-end SGI workstations like Fuel and Tezro, but you get nothing when you try to get the cpu datasheet or its user manual.

The last document available is for R10000, a bit less for R12000, and we know about R14000 thanks to all the reverse engineering done to support R12K on Linux.

R16K is pretty much undocumented, and not exactly compatible with R14K.
R18K is very different in several aspects.

R16K documentation exists, but it's not publicly available. You may find it on the underground internet, but it's not legal, and MIPS is a very aggressive company about its intellectual property.

R18200 is a prototype, pretty dead, and nobody will ever use it, but at least I got it with some documentation, even if it's the kind that can't be printed, saved, emailed, uploaded to your Kindle ...

Physically there is a big FPGA soldered on the CPU module for the Atlas MIPS EVB board, which accepts MIPS32 and MIPS64 CPU modules.

I have no HDL code, but I have the ISA documentation, plus a little document about the bus implementation and a second document that tells everything about the tr-memory with 35 bits of data they implemented inside the FPGA. There is nothing more, but hey, it's better than nothing.


So, I know everything about the motherboard, just a fraction about the CPU.



Tr-memory has a similar story in its background, but at least you can find something on Wikipedia. Unfortunately it's an abstract article with no implementation details. I am not sure the tr-memory implemented in my boss's POWER10 is the same as the one I am working with on the R18200 prototype  :-//
 
The following users thanked this post: newbrain

Offline DiTBho

  • Super Contributor
  • ***
  • Posts: 4221
  • Country: gb
Re: Are data type compiler-dependent or target dependent
« Reply #49 on: August 06, 2022, 09:42:15 am »
Quote
it's no great trick to get the compiler to use it

d'oh, that's probably the reason why IBM recommends and supports -mcpu=power10 -funroll-loops

It's set as the default compiler(1) flag in their MMA demos (AI stuff), and - according to the readme.txt - it's the best trick for better performance, even if it duplicates loop bodies more aggressively than the compiler normally would.

(1) GCC v11.2
 

