Key, though, is the determinism, or the lack of it. Not only does this frustrate end users; it also undermines one of the essentials of troubleshooting: creating reproducible scenarios. When the rug's continually being pulled from under your feet because the compiler decides to take a different route depending on a butterfly flapping its wings in Peru, the task becomes another order of magnitude harder. Sometimes it's a case of less is more.
I presume you do turn off caches, because cache misses (either L1/2/3 or TLB) can cause your program to slow down by an order of magnitude. And that can happen because unrelated processes "evict" your data or virtual addresses from the caches.
Absolutely. You make an important point, but that's very much at the micro level; in an enterprise-scale system, cache-level issues are way down in the noise and generally not configurable at the application layer, if at all. These days pretty much everything is on VMs and SANs anyway, and the chance of your average VM admin giving a crap about cache hits is slim, yet cache behaviour is a key reason why things often run slower on VMs than on physical boxes. In the physical world, the CPU is usually only sporadically driven to full capacity by (relatively) well-defined processes, whereas in the VM world it's frequently a race to keep the underlying CPUs as busy as possible, particularly in commodity cloud-based services.
This has been one of my major concerns about the use of SANs and VMs in enterprise systems for the past fifteen or so years: it's very difficult to provide _any_ form of reliable, guaranteed performance metrics because there are just too many unknowns. In general the SAN guys just want to give you JBOD solutions and simply aren't interested in spindle management. Luckily, tiered SANs have alleviated the situation somewhat.
But at the low level, absolutely. I place pinch-point code in zero-wait-state memory and manually tier memory when necessary; that is something a compiler generally knows little about. Less intensive stuff can sit in slower, cached memory. In general, in an embedded real-time situation, you have to be able to guarantee and profile response times, but luckily these systems are usually simple enough that only a well-defined set of processes runs.
Essentially, non-deterministic systems have kept the consultancy side of my business in work for a couple of decades and continue to do so. Most of my experience away from databases has been with distributed systems using .NET rather than JEE in the enterprise, but I do have experience with both. Quite early on in .NET's lifecycle, for example, Microsoft realised that slow start-up was a major problem for anything non-trivial, so they added the option of pre-compiling ahead of time (NGen). However, at the drop of a hat the runtime will still recompile if the stars are not properly aligned. Unfortunately, this also means that after security updates it is not at all uncommon for all that code to be recompiled all over again, which can take tens of minutes, churning your poor CPU cycles - cycles that would not be spent if precompiled binaries had been distributed instead.
A real example of where adaptive compilation doesn't work well is when multiple end users (or processes) with differing requirements use the same code. The way one user exercises the code differs from the next, and in some situations this leads to serialisation: there is only one execution plan, it is repeatedly recompiled, and you end up with massive overhead as the same code is recompiled on every context switch/iteration/execution - think in terms of 1,000x slower. This became very apparent in SQL Server 2000, for example, and remains a problem to this day unless you enforce a per-session execution plan, which is expensive in memory... and, coincidentally, cache of course! I don't know how Java's HotSpot compiler deals with this situation, but it is very real with adaptive run-time optimisers. Indeed, I wrote a paper on it at the time. It is exactly this kind of unexpected and unintended performance behaviour that makes me wary, but equally it's given me the opportunity to earn a few quid diagnosing and fixing it when it goes wrong, too.
Anyway, these are just my experiences, and as I say I can hardly complain; these kinds of additional complexities and their unintended side effects are exactly what end up paying my bills, after all!