That's easy and sufficient iff you can prove the worst-case time. How do you prove the worst-case time (a) without caches?
Cortex-M interrupt latency is reliably specified, guaranteed by ARM.
The vector table can be relocated to ITCM or RAM, and the ISR itself can be placed in ITCM. The resulting access patterns are simple.
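As an illustration, a minimal sketch of relocating the vector table into a RAM/ITCM copy with CMSIS; the section name, alignment, vector count and the g_pfnVectors symbol are assumptions that depend on the exact part, linker script and startup file:

    #include <stdint.h>
    #include <string.h>
    #include "stm32h7xx.h"                 /* assumption: CMSIS device header */

    #define VECTOR_COUNT (16 + 150)        /* assumption: core + device vectors */

    /* Destination table; ".itcm_vectors" must exist in the linker script. */
    static uint32_t ram_vectors[VECTOR_COUNT]
        __attribute__((aligned(512), section(".itcm_vectors")));

    extern uint32_t g_pfnVectors[];        /* flash table symbol from the startup file */

    void relocate_vector_table(void)
    {
        memcpy(ram_vectors, g_pfnVectors, sizeof ram_vectors);
        __disable_irq();
        SCB->VTOR = (uint32_t)ram_vectors; /* point the core at the new table */
        __DSB();
        __enable_irq();
    }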
Output an assembly listing and count the clock cycles of the instructions leading up to the important operation. The instruction set section of the technical reference manual gives the range of clock cycles a particular instruction takes.
Additionally, go through all critical sections that run with interrupts disabled and count their clock cycles. This is the worst-case delay added before the interrupt latency even begins.
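For example, a short critical section written so its cycle count stays trivially small and easy to verify from the listing; a sketch using CMSIS intrinsics, with the shared counter only as a placeholder:

    #include <stdint.h>
    #include "stm32h7xx.h"        /* assumption: provides __get_PRIMASK() etc. */

    volatile uint32_t shared_counter;

    void bump_shared_counter(void)
    {
        uint32_t primask = __get_PRIMASK();  /* remember the current mask state */
        __disable_irq();
        shared_counter++;                    /* keep this region to a handful of
                                                instructions: every cycle here adds
                                                to the worst-case interrupt latency */
        __set_PRIMASK(primask);              /* restore; re-enables only if it was enabled */
    }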
If a higher-priority interrupt can pre-empt a less urgent but still timing-critical interrupt, add the maximum duration of the higher-priority interrupt. This strongly motivates keeping the highest-priority ISRs short, because counting cycles from an assembly listing of a complex state horror with a lot of branches will be a nightmare. If the high-priority ISR needs to be complex, you can split it in two by demoting the rest of the work: from the high-priority ISR, generate a software interrupt request at a lower priority. This way you do the absolutely timing-critical thing first, then continue the more complex code at lower priority so it can be pre-empted by the "medium-importance" stuff.
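A sketch of that split, assuming an STM32H7-style part with CMSIS; TIM1 as the urgent source, PA0 as the pin, FLASH_IRQn as the borrowed "spare" interrupt line, and the priority numbers are all placeholder assumptions:

    #include "stm32h7xx.h"   /* assumption: CMSIS device header */

    /* Highest priority: only the cycle-critical work, then hand off. */
    void TIM1_UP_IRQHandler(void)
    {
        TIM1->SR = ~TIM_SR_UIF;          /* clear the update flag */
        GPIOA->BSRR = (1u << 0);         /* e.g. drive the pin that must not be late */
        NVIC_SetPendingIRQ(FLASH_IRQn);  /* demote the rest to lower priority */
    }

    /* Lower-priority "second half": complex follow-up work, freely pre-emptible. */
    void FLASH_IRQHandler(void)
    {
        /* state machine, buffer handling, etc.; off the critical path now */
    }

    void isr_split_init(void)
    {
        NVIC_SetPriority(TIM1_UP_IRQn, 0);   /* most urgent */
        NVIC_SetPriority(FLASH_IRQn, 5);     /* less urgent */
        NVIC_EnableIRQ(FLASH_IRQn);
        NVIC_EnableIRQ(TIM1_UP_IRQn);
    }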
Run all such code from ITCM. Running from flash is OK too, just slower and more jittery; you can still guarantee a worst-case bound by assuming the configured number of wait states is incurred on every read; real-world average performance will only be better.
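For completeness, placing a function in ITCM with GCC might look like the following sketch; the ".itcm_text" section name, and the linker-script plus startup-copy plumbing behind it, are assumptions:

    /* Assumption: the linker script defines an .itcm_text output section in
       ITCM and startup code copies it from flash before main(). */
    __attribute__((section(".itcm_text"), noinline))
    void timing_critical_step(volatile uint32_t *reg, uint32_t value)
    {
        *reg = value;    /* runs from zero-wait-state ITCM: no flash wait states */
    }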
Yes, this is some manual work, but usually you don't need to go this far very often, because in real-world projects not every signal and not every logical operation is this sensitive to timing.
All of this manual work carries a risk of human error. Measurement can be used as a verification step, as you note. Margins are important in any case, to cover small errors. If you write low-level C the way it should be written, don't use random bloated libraries, and finally check the assembly listing, there just is no way the expected 20 cycles becomes 200 cycles.
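As a verification step, the DWT cycle counter gives an exact core-cycle measurement to compare against the hand count; a sketch assuming a Cortex-M7 part (e.g. STM32F7/H7) with CMSIS headers:

    #include <stdint.h>
    #include "stm32h7xx.h"    /* assumption: CMSIS device header */

    /* Enable the DWT cycle counter once at startup. */
    void cyccnt_init(void)
    {
        CoreDebug->DEMCR |= CoreDebug_DEMCR_TRCENA_Msk;  /* enable the trace block */
        DWT->LAR = 0xC5ACCE55;                           /* unlock; needed on Cortex-M7 */
        DWT->CYCCNT = 0;
        DWT->CTRL |= DWT_CTRL_CYCCNTENA_Msk;             /* start counting core cycles */
    }

    /* Measure the code under test and compare against the hand-counted
       worst case plus margin. */
    uint32_t measure_cycles(void (*fn)(void))
    {
        uint32_t start = DWT->CYCCNT;
        fn();
        return DWT->CYCCNT - start;
    }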
Everything in the real world has a margin for error. You don't pilot an airliner to millimeter accuracy.
(b) with caches?
Assume a cache miss for every read. The actual average (and possibly even the actual worst case, but you can't trust that) is way, way better, but hey, you get your worst-case guarantee.
This being said, I don't understand why you are bringing up caches so eagerly. Cycle-accurate, timing-critical microcontroller code just isn't run from a slow memory through a cache; it makes no sense. Indeed, none of my projects on STM32F7 or H7 enable the caches; they are wasted silicon. The fact that these top-of-the-line micros include caches which are "wasted" silicon if unused is a moot point, because such top-tier micros have dozens and dozens of features that go unused. For example, the H7 has two CAN peripherals which take up quite some die area because they have dedicated message RAMs of roughly the size a cache could be, yet you don't always need CAN. The same can be said about the area used by Ethernet, USB, and so on. A custom ASIC for everything would be the technically optimal solution but is obviously impossible.