I would consider this advice on -O3 obsolete when targeting x86_64 and probably Armv8-A.
Perhaps, but it very much depends on the compiler and especially compiler version.
For example, I do not use anything newer than GCC 9 for ARM targets, because of the unfixed issues in later versions. I'm seriously considering switching to Clang for ARM anyway.
I take a slightly different perspective on optimizations with gcc, it's not all about -Ox.
Very true; I too mentioned specific optimization flags that I end up using. For example, -ffinite-math-only can make a big difference, and is very useful when you have e.g. explicit checks ensuring you never divide by values very close to zero, and so on.
However, I like to keep such things in separate compilation units (files). For optimized routines, I often have alternates with the exact same interface but wildly different implementations, and choose the implementation simply by selecting which C source file (among the alternates) is used: either via Makefile options, or via a common .c source file that #includes the appropriate .c source file based on preprocessor macros. (Note that some people do have an irrational dislike of #include used with source files, though; it seems to jar some people's sensibilities somehow.)
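Just as a sketch of that dispatcher idea (the file and macro names below are invented for illustration, not from any real project):

```c
/* foo.c: selects which actual implementation gets compiled, based on a
   preprocessor macro given e.g. on the compiler command line.
   All names here are made up for illustration. */
#if defined(FOO_IMPL_ASM)
# include "foo-asm.c"       /* hand-optimized variant                     */
#elif defined(FOO_IMPL_FAST)
# include "foo-fast.c"      /* variant built for e.g. -ffinite-math-only  */
#else
# include "foo-generic.c"   /* portable reference implementation          */
#endif
```

The rest of the project only ever refers to foo.c (or its header), and the Makefile just adds -DFOO_IMPL_FAST or similar to pick the variant.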
Can anyone explain what is meant by "debugging information" included in the code?
As ataradov and SiliconWizard already mentioned, there is no debugging information per se in the code.
Some optimization flags do affect the debuggability of the code, though; in particular, -fomit-frame-pointer. On many architectures, the address of the current stack frame is kept in a separate register. This option disables that (so the stack frame becomes implicit, and local variables on the stack are accessed via the stack pointer). On some architectures this can make debugging much harder; according to the documentation, even impossible on some, but I'm not sure which architectures those are. Stack frames can still be described for each function via separate debugging data, for example when using the DWARF formats (for the debugging data).
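If one only cares about a few functions being debuggable at a time, GCC's per-function optimization attribute should be able to keep the frame pointer for just those; this is only a hedged sketch (and the function here is a made-up example):

```c
/* Sketch: keep the frame pointer for this one function even if the rest
   of the build uses -fomit-frame-pointer (e.g. via -O2/-Os), so that a
   debugger can walk its stack frame the usual way. */
__attribute__((optimize("no-omit-frame-pointer")))
void debug_me(volatile unsigned int *counter)
{
    /* __builtin_frame_address(0) is a GCC built-in that returns the
       address of the current stack frame; occasionally handy for
       sanity checks while debugging. */
    void *frame = __builtin_frame_address(0);
    (void)frame;

    (*counter)++;
}
```

(GCC documents the optimize attribute as intended for debugging rather than production use, which fits this purpose anyway.)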
Object files and especially final ELF binaries will contain a lot of extra information when debugging information is enabled. If your build facilities are such that the ELF files are stripped before being uploaded to the target device (say, as in the Arduino environment, or in most environments targeting microcontrollers), it does not matter whether the compilation included debugging information in the object files or not (whether -g was used or not); but the optimization options used do matter.
One nice thing about -Og is that the compiled code in the object files and final binaries should be the same regardless of whether debugging information is included via -g or not. If one uses -O2 or -Os, and then switches to say -O0 -g for debugging, the compiled code is usually different, making debugging problems more difficult than necessary. Also, both -O2 and -Os enable -fomit-frame-pointer, affecting debuggability. I believe it was the programmers' need for an optimization level that generates reasonably optimized code without affecting debuggability that caused GCC to grow support for -Og in GCC 4.8 in 2013; but I'm not sure whether Clang actually implemented it first and GCC users then found out how useful it is, or vice versa.
Since we're using the ELF file format for object files (and for final binaries before converting to hex), knowing the structure of ELF files can be very useful: ELF files can contain all sorts of information, not just "code" and "data".
I often build things with "-nostdlib" flag, making sure no standard library functions are included.
Me too, but with GCC that is not enough to avoid a dependency on memcpy(), memmove(), memset(), and memcmp(), because GCC expects these to be provided even by a freestanding environment; see the second-to-last paragraph in section 2.1, C Language Standards, in the GCC documentation.
I think it is not too common for GCC (across its versions and compiler options) to turn loops into a call to one of the above. A bit of glue logic (even preprocessor macros detecting compile-time type or alignment, so that optimized native-word-sized operations can be used) is usually enough to ensure it does not happen for a particular function implementation. When one does need these four functions anyway, or duplicates of them in the same project, the library-provided ones are weak, and one can override them simply by implementing one's own, using the same function signature (including the name).
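As a hedged sketch of what such an override with the alignment glue can look like (this is not from any actual project of mine):

```c
/* Overrides the (weak) library memset() with the exact same signature.
   Note: at -O2 and above, GCC may recognize the byte loops below and emit
   a call to memset() itself, so this file may also need e.g.
   -fno-tree-loop-distribute-patterns.  The word-sized stores also assume
   the compiler does not exploit type-based aliasing against us here. */
#include <stddef.h>
#include <stdint.h>

void *memset(void *s, int c, size_t n)
{
    unsigned char       *p    = s;
    const unsigned char  byte = (unsigned char)c;

    /* Head: byte stores until the pointer is aligned to an unsigned long. */
    while (n > 0 && ((uintptr_t)p % sizeof (unsigned long)) != 0) {
        *(p++) = byte;
        n--;
    }

    /* Body: native-word-sized stores. */
    if (n >= sizeof (unsigned long)) {
        unsigned long  word = byte;
        for (size_t i = 1; i < sizeof word; i++)
            word = (word << 8) | byte;

        unsigned long *wp = (unsigned long *)p;
        while (n >= sizeof word) {
            *(wp++) = word;
            n -= sizeof word;
        }
        p = (unsigned char *)wp;
    }

    /* Tail: whatever bytes remain. */
    while (n > 0) {
        *(p++) = byte;
        n--;
    }

    return s;
}
```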
If one defines them in the same compilation unit (file or files compiled in the same gcc/clang command), the pattern shown by ttt in post #206 can be used to control the optimization flags; and the scriptlet I showed earlier can be used to verify the compiled object file contains no external dependencies.
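I have not checked whether it is the same pattern as in post #206, but as a sketch of the general idea, GCC can apply optimization flags to just a region of a source file:

```c
/* Sketch only: these pragmas apply the given flags to the functions
   defined between them.  Here -fno-tree-loop-distribute-patterns stops
   the optimizer from recognizing the byte loop and replacing it with a
   call to memcpy() itself, which could otherwise recurse forever. */
#include <stddef.h>

#pragma GCC push_options
#pragma GCC optimize ("no-tree-loop-distribute-patterns")

void *memcpy(void *dst, const void *src, size_t n)
{
    unsigned char       *d = dst;
    const unsigned char *s = src;

    while (n-- > 0)
        *(d++) = *(s++);

    return dst;
}

#pragma GCC pop_options
```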
Sometimes it can be worth the effort to implement these (separate variants for loop direction and access size for memcpy()/memmove(), separate access sizes for memset(), and only a byte-by-byte memcmp()) in extended inline assembly (asm volatile ("code" : outputs : inputs : clobbers); as the function body). If in the same compilation unit, I recommend using #include "memfuncs.c", so that the implementation is easy to change or select at build time, for example based on the hardware architecture. It also makes unit testing them (with a separate program) much easier.
Nominal Animal, I appreciate your input, because of our different views.
I too appreciate different views, especially when people describe the reasons behind them (like you did, and like ataradov, SiliconWizard, and many other members do), because that way I can learn. I know I don't really know that much, but I can and am willing to learn. Never hesitate to correct me if you believe I am in error; I very much appreciate that.

My communication style is far from optimal (verbose, sometimes looking like I'm trying to be more authoritative than I actually am, occasionally failing at English, and so on), but my intent is always to describe my reasons, with my current opinion (based on those reasons) more of a side note than the focus, because my opinions change as I learn. That sort of describing-the-reasons can sometimes come across as The List Of Facts, which it isn't; it is just the stuff I am currently aware of.