how do you identify what are the preprocessor and macros in c program

Posted by Dadu@ on 03 Mar, 2022 08:29
Hi,

I don't understand the difference between the preprocessor and macros in the C programming language, even after reading material on the internet.

I have written the following code, and I think value is a preprocessor directive: the compiler substitutes 1 for the text value.

Code: [Select]
#include <stdio.h>

#define value 1
#define add(y) y + 1

int main ()
{
    int x = value;
    int y = 5;
    printf("x = %d \n", x);
    printf("y = %d ", add(y));
    return 0;
}
x = 1
y = 6

What is add(y) y + 1? Is it a macro?

How do you identify which things are preprocessor directives and which are macros in a C program?

#1 Reply
Posted by sleemanj on 03 Mar, 2022 08:46
It's "preprocessor macro", a macro handled by the preprocessor. They are not different things.

#2 Reply
Posted by ledtester on 03 Mar, 2022 09:49
The short answer is that there is no easy way to tell if a symbol refers to a macro or not.

You basically have to do what the C-preprocessor does... read all of the code that occurs before the use of the symbol and see if there are any #defines for that symbol.

In a lot of cases authors will use all upper case for macro names -- e.g. FOO(...) refers to a macro named FOO and not a function. A lot of constants in the C standard library are implemented this way, e.g. EOF, the SEEK_* constants, ... On the other hand, there are also functions (or things you would expect to be a function) which are implemented as macros on some systems -- e.g. getc().

Macros are a relatively simple mechanism which allowed early C programmers to work around the limitations of the compiler. Nowadays in many instances there are much better ways to accomplish what macros were used for in the past. If you google "why macros are evil" you'll find discussions like this:

https://stackoverflow.com/a/14041847/866915

#3 Reply
Posted by golden_labels on 03 Mar, 2022 10:52
Unless someone used a preprocessor wrong,⁽¹⁾ for practical purposes it doesn’t matter if something is a macro expansion or an actual function call. So you may continue making a call without ever knowing if it’s really a call or text substitution. You should not assume either anyway, because it may change with another release of some library.

One thing you must understand and remember, when writing your own macros, is that the preprocessor is dumb text substitution. It has no idea about C syntax: it just replaces sequences of characters in code.

As for satisfying your curiosity and helping with debugging, both gcc and clang have the -E option. It stops the compilation process after the preprocessing stage and spits out a preprocessed file without doing anything else. The output is usually messy, as it’s not meant to be regularly used by humans, but debugging at this level happens to be ugly.

⁽¹⁾ Which may include using it at all.

#4 Reply
Posted by Dadu@ on 03 Mar, 2022 13:17
Quote from: golden_labels on 03 Mar, 2022 10:52
both gcc and clang have the -E option.

I am using the Windows 10 operating system. What is the command for Windows?

#5 Reply
Posted by grumpydoc on 03 Mar, 2022 16:06
Quote from: Dadu@ on 03 Mar, 2022 08:29
#define add(y) y + 1

Be careful writing function-like macros, and don't do it as in the example above.

Macros are pure text expansion, which means that an unprotected argument can interact with operator precedence rules - eg if someone using your macro writes
Code: [Select]
y = 6; x = add(y)*4;
They might be quite surprised that x ends up as 10, not 28 because it expanded to

Code: [Select]
y = 6; x = y + 1 * 4;
So always put parentheses around macro arguments, and around the whole expansion - i.e. write
#define add(y) ((y) + 1)

There are, however, lots more traps for the unwary when writing macros.

#6 Reply
Posted by westfw on 04 Mar, 2022 03:24
Quote
there is no easy way to tell if a symbol refers to a macro or not.
If a symbol is a macro, it will "disappear" in the -E output that people have been talking about.

#7 Reply
Posted by golden_labels on 04 Mar, 2022 05:52
Quote from: Dadu@ on 03 Mar, 2022 13:17
I am using the Windows 10 operating system. What is the command for Windows?
There is no “command for Windows”. Compilers are not an inherent part of operating systems. The options you need depend on the compiler you use. If you are using gcc or clang under Windows, it’s -E, exactly the same as elsewhere. If it’s some other compiler, consult its documentation. Or tell us which compiler it is; perhaps someone will be able to give some hints.

#8 Reply
Posted by Berni on 04 Mar, 2022 06:19
If you create it with #define then it is a preprocessor macro. Simple as that.

Most decent C IDEs will also support coloring macros a different color (part of syntax highlighting)

#9 Reply
Posted by brucehoult on 04 Mar, 2022 06:26
Quote from: Dadu@ on 03 Mar, 2022 13:17
Quote from: golden_labels on 03 Mar, 2022 10:52
both gcc and clang have the -E option.

I am using windows 10 operating system.

I'm so sorry.

Quote
What is the command for Windows?

If you use gcc or clang, then the same as anywhere else.

I don't know if there are versions of gcc and clang that work in the DOS command line -- probably -- but you can certainly use them if you install WSL or cygwin.

I believe Mingw-w64 is a port of gcc that can create Windows applications. I'm not sure whether it runs under DOS or in cygwin etc.

#10 Reply
Posted by golden_labels on 04 Mar, 2022 07:00
Quote from: Berni on 04 Mar, 2022 06:19
If you create it with #define then it is a preprocessor macro. Simple as that.
Do you seriously think they are asking about detecting macros they themselves have written?

#11 Reply
Posted by Rick Law on 04 Mar, 2022 07:02
Quote from: grumpydoc on 03 Mar, 2022 16:06
Quote from: Dadu@ on 03 Mar, 2022 08:29
#define add(y) y + 1

Be careful writing function-like macros, and don't do it as in the example above.

Macros are pure text expansion, which means that an unprotected argument can interact with operator precedence rules - eg if someone using your macro writes
Code: [Select]
y = 6; x = add(y)*4;
They might be quite surprised that x ends up as 10, not 28 because it expanded to

Code: [Select]
y = 6; x = y + 1 * 4;
So always put parentheses around macro arguments - i.e write
#define add(y) ((y) + 1)

There are, however lots more traps for the unwary when writing macros.

Grumpydoc pointed out a very important point with macros in his reply above. There is another related one caused by the macro string replacement:

For a simple example, let's say we have a macro to triple a value.
#define triple(y) ((y) + (y) + (y))

If you do this, it works fine:
++x; // first increment the count
z = triple(x); // sets z = ((x) + (x) + (x))

But if you increment the count in the macro's argument, the increment appears three times:
z = triple(++x); // expands to z = ((++x) + (++x) + (++x))

So remember that, unlike a function call, a macro is plain text replacement. If your macro uses its argument multiple times and that argument has a side effect, the side effect is repeated each time.

#12 Reply
Posted by Ian.M on 04 Mar, 2022 07:17
@brucehoult,
There's also TDM-GCC that runs GCC natively on Windows, and lets you build native Windows applications using the MinGW runtimes.

#13 Reply
Posted by grumpydoc on 04 Mar, 2022 11:48
Quote from: Rick Law on 04 Mar, 2022 07:02

But if you increment the count as macro's parameter, x is incremented three times:
z = triple(++x) // this set z = ((++x) + (++x) + (++x))

So, remember unlike a function call, macro is a text string replacement. If in your macro you use the argument multiple times and that argument changes itself, the change will be done multiple times also.
Absolutely

Another problem with the example above is that you have no idea what z would end up as - there is no sequence point between the three increments of x so you don't know whether x or x+1 will be added together.

This problem with double evaluation shows up in the "naïve" implementation of "MAX" as a macro - if you write
Code: [Select]
#define MAX(a, b) ((a) > (b) ? (a) : (b))
You might well run into this.

gcc has some extensions which can help avoid this so you can write
Code: [Select]
#define MAX(a,b) \
    ({ \
        typeof(a) _a = (a); \
        typeof(b) _b = (b); \
        (_a > _b) ? (_a) : (_b); \
    })
but I don't think that's portable and it is best to avoid anything which is not idempotent as a macro argument.

#14 Reply
Posted by Nominal Animal on 04 Mar, 2022 12:08
Quote from: Rick Law on 04 Mar, 2022 07:02
But if you increment the count as macro's parameter, x is incremented three times:
z = triple(++x) // this set z = ((++x) + (++x) + (++x))
Worse: it invokes the dreaded Nasal Demons of Undefined Behaviour. You cannot apply pre- or post-increment or -decrement more than once to a variable in a single expression, unless a sequence point separates the modifications.

Many C compilers usually do what you expect without surprises, except when you enable higher optimizations and the results start to differ!
Thus, it is better to avoid that sort of thing, and also to enable compiler warnings (so the compiler will tell you if you accidentally do that), even if a simple test shows that a particular version of a particular compiler seems to handle it right.

#15 Reply
Posted by brucehoult on 04 Mar, 2022 12:17
Quote from: grumpydoc on 04 Mar, 2022 11:48
gcc has some extensions which can help avoid this so you can write
Code: [Select]
#define MAX(a,b) \
    ({ \
        typeof(a) _a = (a); \
        typeof(b) _b = (b); \
        (_a > _b) ? (_a) : (_b); \
    })
but I don't think that's portable and it is best to avoid anything which is not idempotent as a macro argument.

Just confirmed that works in Clang as well as gcc.

How much more portability do you need? :-)

gcc supports all the old historical stuff, and LLVM supports all the new stuff first and best.

Well, except for the 8-bit CPUs covered by cc65 and sdcc. Hmm .. LLVM has a plain C back-end, which I'd presume you can pass the output of to those compilers...

https://github.com/JuliaComputingOSS/llvm-cbe

#16 Reply
Posted by brucehoult on 04 Mar, 2022 12:24
Quote from: Nominal Animal on 04 Mar, 2022 12:08
Quote from: Rick Law on 04 Mar, 2022 07:02
But if you increment the count as macro's parameter, x is incremented three times:
z = triple(++x) // this set z = ((++x) + (++x) + (++x))
Worse: it invokes the dreaded Nasal Demons of Undefined Behaviour. You cannot apply pre- or post-increment or -decrement more than once to a variable in a single expression.

Many C compilers do usually do what you expect without surprises – except when you enable higher optimizations, and the results start to differ!
Thus, it is better to avoid that sort of a thing, and also enable compiler warnings (so the compiler will tell you if you accidentally do that). Even if a simple test shows that a particular version of a particular compiler seems to handle it right.

But what is "handle it right"?

If you do that with x initially 10, my assumption, as a compiler guy, is that "handling it right" would probably be 13 + 13 + 13. But you, on the other hand, might well imagine that "handling it right" would be 11 + 12 + 13.

It's undefined. It could be either -- they are both perfectly consistent and logical. Or it could be something else, though it's harder I think to make a consistent argument for something else. Maybe 12 + 12 + 13 is arguable.

In general, it's just a very very bad way to code.

#17 Reply
Posted by brucehoult on 04 Mar, 2022 12:41
So I tried a couple of compilers on:

Code: [Select]
#define triple(x) x+x+x

int foo(int n){
  return triple(++n);
}
clang on my Mac gives:

Code: [Select]
0000000000000000 <ltmp0>:
   0: 08 04 00 0b   add w8, w0, w0, lsl #1
   4: 00 19 00 11   add w0, w8, #6
   8: c0 03 5f d6   ret
So, that's returning 3*n + 6, i.e. for n=10 it's 11+12+13.

gcc on x86 Linux gives:

Code: [Select]
0000000000000000 <foo>:
   0: f3 0f 1e fa   endbr64
   4: 8d 44 7f 07   lea 0x7(%rdi,%rdi,2),%eax
   8: c3            retq
That's 3*n + 7, so actually for n=10 that's 12+12+13.

gcc for risc-v gives:

Code: [Select]
0000000000000000 <foo>:
   0: 0025079b   addiw a5,a0,2
   4: 0017979b   slliw a5,a5,0x1
   8: 250d       addiw a0,a0,3
   a: 9d3d       addw a0,a0,a5
   c: 8082       ret
So that's 2*(n+2) + (n+3), or 3*n + 7, consistent with x86 gcc.

I'm glad I said 12+12+13 is arguable in my previous post! That was based on the idea of evaluating side effects of each operand to a binary operator (or function) just before applying that operator (or function), and leaving side-effects on arguments of other operators until they were about to be evaluated.

#18 Reply
Posted by DiTBho on 04 Mar, 2022 13:03
Let's put it simply and concisely:
don't use macros. Ever.

edit: bold

#19 Reply
Posted by Ian.M on 04 Mar, 2022 13:17
That's overreacting. If you are writing 'vanilla' ISO C compliant code even MISRA hasn't gone as far as totally banning function-like macros. C++ code is a different matter as it has better ways of achieving a similar result.

Ref:
- MISRA C:2004, 19.7 - A function should be used in preference to a function-like macro.
- MISRA C++:2008, 16-0-4 - Function-like macros shall not be defined.
- MISRA C:2012, Dir. 4.9 - A function should be used in preference to a function-like macro where they are interchangeable

#20 Reply
Posted by grumpydoc on 04 Mar, 2022 13:26
No need to shout.

I wouldn't go that far.

However, it is reasonable to say that inline functions and modern compiler options render a lot of macro use unnecessary, and the rest implementable in safer ways than in times past.

Perhaps better to say - don't use any language feature without having taken time to understand it.

#21 Reply
Posted by DiTBho on 04 Mar, 2022 13:45
Quote from: grumpydoc on 04 Mar, 2022 13:26
Perhaps better to say - don't use any language feature without having taken time to understand it.

Precisely.

#22 Reply
Posted by Siwastaja on 04 Mar, 2022 16:31
Don't use anything you don't know how to use.

C does not have templates, so macros are used for generic programming or metaprogramming. Sometimes this is a lifesaver when it comes to readability, writability and maintainability of code bases. Yes, we all know the preprocessor sucks, and there are a few catches; you just need to know them:
* a function-like interface needs to be wrapped inside do{} while(0), which looks odd when you see it for the first time
* arguments must be in (parentheses).

Replacements are not without catches, either. In the real world (ignoring dreams, hallucinations, etc.), replacing a macro means writing a function plus a non-portable attribute/pragma that forces inlining. Also, real-world experience shows that people who intend to do that do not actually do it.

Yet reality also shows that people often just forget to qualify their "private" functions static. Mistakes happen. This does not mean that you should not use functions.

#23 Reply
Posted by grumpydoc on 04 Mar, 2022 16:58
Quote from: Siwastaja on 04 Mar, 2022 16:31
Don't use anything you don't know how to use.
Sage advice, but obviously you generally need to use a thing to learn how to use it better.

Quote
* for function-like interface, need to wrap inside do{} while(0), which looks odd when you see it for the first time
Only multi-statement macros need this trick.

The problem is that if you define a macro such as
Code: [Select]
#define do_a_and_b(x, y) \
    a(x,y); \
    b(x,y)
you have a problem if you later write
Code: [Select]
if (test) do_a_and_b(x, y);
because this is going to expand to
Code: [Select]
if (test) a(x, y);
b(x,y);
The problem here is that b(x,y) will be called whether test is true or not - which is almost certainly not what the user of the macro expected.

You could make the two statements into a block
Code: [Select]
#define do_a_and_b(x, y) {\
    a(x,y); \
    b(x,y); }
that does fix the "if" problem, but the semi-colon you naturally add after the macro call becomes an extra empty statement, which causes a syntax error as soon as an else follows.

Wrapping the block in do {...} while (0) gives a single block of code (which executes once) *and* allows a trailing semi-colon after the macro call without inducing a syntax error.

#24 Reply
Posted by golden_labels on 04 Mar, 2022 20:24
DiTBho: no need to shout.

As for the advice itself: you don’t have any choice. Not unless you want to stop using any library, including the C standard library.