Author Topic: how do you identify what are the preprocessor and macros in c program (Read 8678 times)

Dadu@ · « **on:** March 03, 2022, 08:29:44 am »

Hi,

I don't understand the difference between the preprocessor and macros in the C programming language, even after reading material on the internet.

I have written the following code, and I think value is a preprocessor directive. The compiler substitutes the text value with 1.

Code: [Select]

#include <stdio.h>

#define value   1
#define add(y) y + 1

int main ()
{ 
	int x = value;
	int y = 5;
	printf("x = %d \n", x);
	printf("y = %d ", add(y));
	
	return 0;
}

x = 1
y = 6

What is add(y) y + 1? Is it a macro?

How do you identify which things are preprocessor directives and which are macros in a C program?

sleemanj · « **Reply #1 on:** March 03, 2022, 08:46:10 am »

It's "preprocessor macro", a macro handled by the preprocessor. They are not different things.

ledtester · « **Reply #2 on:** March 03, 2022, 09:49:22 am »

The short answer is that there is no easy way to tell if a symbol refers to a macro or not.

You basically have to do what the C-preprocessor does... read all of the code that occurs before the use of the symbol and see if there are any #defines for that symbol.

In a lot of cases authors will use all upper case for macro names -- e.g. FOO(...) refers to a macro named FOO and not a function. A lot of constants in the C standard library are implemented this way, e.g. EOF, the SEEK_* constants, ... On the other hand, there are also functions (or things you would expect to be a function) which are implemented as macros on some systems -- e.g. getc().

Macros are a relatively simple mechanism which allowed early C programmers to work around the limitations of the compiler. Nowadays in many instances there are much better ways to accomplish what macros were used for in the past. If you google "why macros are evil" you'll find discussions like this:

https://stackoverflow.com/a/14041847/866915

golden_labels · « **Reply #3 on:** March 03, 2022, 10:52:20 am »

Unless someone used a preprocessor wrong,⁽¹⁾ for practical purposes it doesn’t matter if something is a macro expansion or an actual function call. So you may continue making a call without ever knowing if it’s really a call or text substitution. You should not assume either anyway, because it may change with another release of some library.

One thing you must understand and remember, when writing your own macros, is that the preprocessor is dumb text substitution. It has no idea about C syntax: it just replaces sequences of characters in code.

To satisfy your curiosity and to help with debugging, both gcc and clang have the -E option. It stops the compilation process after the preprocessing stage and spits out a preprocessed file without doing anything else. The output is usually messy, as it’s not meant to be regularly used by humans, but debugging at this level happens to be ugly.

⁽¹⁾ Which may include using it at all.

Dadu@ · « **Reply #4 on:** March 03, 2022, 01:17:13 pm »

Quote from: golden_labels on March 03, 2022, 10:52:20 am

both gcc and clang have the -E option.

I am using the Windows 10 operating system. What is the command for Windows?

grumpydoc · « **Reply #5 on:** March 03, 2022, 04:06:58 pm »

Quote from: Dadu@ on March 03, 2022, 08:29:44 am

#define add(y) y + 1

Be careful writing function-like macros, and don't do it as in the example above.

Macros are pure text expansion, which means that an unprotected argument can interact with operator precedence rules - eg if someone using your macro writes

Code: [Select]

y = 6;
x = add(y)*4;

They might be quite surprised that x ends up as 10, not 28 because it expanded to

Code: [Select]

y = 6;
x = y + 1 * 4;

So always put parentheses around macro arguments, and around the whole expansion - i.e. write
#define add(y) ((y) + 1)

There are, however, lots more traps for the unwary when writing macros.

westfw · « **Reply #6 on:** March 04, 2022, 03:24:58 am »

Quote

there is no easy way to tell if a symbol refers to a macro or not.

If a symbol is a macro, it will "disappear" in the -E output that people have been talking about.

golden_labels · « **Reply #7 on:** March 04, 2022, 05:52:03 am »

Quote from: Dadu@ on March 03, 2022, 01:17:13 pm

I am using windows 10 operating system. What is command for winows

There is no “command for Windows”. Compilers are not an inherent part of operating systems. The options you need depend on the compiler you use. If you are using gcc or clang under Windows, it’s -E, exactly the same as elsewhere. If it’s some other compiler, consult the documentation for it. Or tell us which compiler it is; perhaps someone will be able to give some hints.

Berni · « **Reply #8 on:** March 04, 2022, 06:19:55 am »

If you create it with #define then it is a preprocessor macro. Simple as that.

Most decent C IDEs will also support coloring macros a different color (part of syntax highlighting)

brucehoult · « **Reply #9 on:** March 04, 2022, 06:26:55 am »

Quote from: Dadu@ on March 03, 2022, 01:17:13 pm

Quote from: golden_labels on March 03, 2022, 10:52:20 am
both gcc and clang have the -E option.

I am using windows 10 operating system.

I'm so sorry.

Quote

What is command for winows

If you use gcc or clang, then the same as anywhere else.

I don't know if there are versions of gcc and clang that work in the DOS command line -- probably -- but you can certainly use them if you install WSL or cygwin.

I believe Mingw-w64 is a port of gcc that can create Windows applications. I'm not sure whether it runs under DOS or in cygwin etc.

golden_labels · « **Reply #10 on:** March 04, 2022, 07:00:28 am »

Quote from: Berni on March 04, 2022, 06:19:55 am

If you create it with #define then it is a preprocessor macro. Simple as that.

Do you seriously think they ask about detecting macros they themselves have written?

Rick Law · « **Reply #11 on:** March 04, 2022, 07:02:53 am »

Quote from: grumpydoc on March 03, 2022, 04:06:58 pm

Quote from: Dadu@ on March 03, 2022, 08:29:44 am
#define add(y) y + 1

Be careful writing function-like macros, and don't do it as in the example above.

Macros are pure text expansion, which means that an unprotected argument can interact with operator precedence rules - eg if someone using your macro writes
Code: [Select]
y = 6; x = add(y)*4;
They might be quite surprised that x ends up as 10, not 28 because it expanded to

Code: [Select]
y = 6; x = y + 1 * 4;
So always put parentheses around macro arguments - i.e write
#define add(y) ((y) + 1)

There are, however lots more traps for the unwary when writing macros.

Grumpydoc made a very important point about macros in his reply above. There is another related one caused by the macro's text replacement:

For a simple example, let's say we have a macro to triple a value.
#define triple(y) ((y) + (y) + (y))

If you do this, it works fine:
++x; // first increment the count
z = triple(x); // set z = ((x) + (x) + (x))

But if you increment the count in the macro's argument, x is incremented three times:
z = triple(++x); // this sets z = ((++x) + (++x) + (++x))

So remember: unlike a function call, a macro is text replacement. If your macro uses its argument multiple times and that argument has a side effect, the side effect happens multiple times too.

Ian.M · « **Reply #12 on:** March 04, 2022, 07:17:22 am »

@brucehoult,
There's also TDM-GCC that runs GCC natively on Windows, and lets you build native Windows applications using the MinGW runtimes.

grumpydoc · « **Reply #13 on:** March 04, 2022, 11:48:28 am »

Quote from: Rick Law on March 04, 2022, 07:02:53 am

But if you increment the count as macro's parameter, x is incremented three times:
z = triple(++x) // this set z = ((++x) + (++x) + (++x))

So, remember unlike a function call, macro is a text string replacement. If in your macro you use the argument multiple times and that argument changes itself, the change will be done multiple times also.

Absolutely

Another problem with the example above is that you have no idea what z would end up as - there is no sequence point between the three increments of x so you don't know whether x or x+1 will be added together.

This problem with double evaluation shows up in the "naïve" implementation of "MAX" as a macro - if you write

Code: [Select]

#define MAX(a, b) ((a) > (b) ? (a) : (b))

You might well run into this.

gcc has some extensions which can help avoid this so you can write

Code: [Select]

#define MAX(a,b) \
({ \
    typeof(a) _a = (a); \
    typeof(b) _b = (b); \
    (_a > _b) ? (_a) : (_b); \
})

but I don't think that's portable, and it is best to avoid passing anything with side effects as a macro argument.

Nominal Animal · « **Reply #14 on:** March 04, 2022, 12:08:29 pm »

Quote from: Rick Law on March 04, 2022, 07:02:53 am

But if you increment the count as macro's parameter, x is incremented three times:
z = triple(++x) // this set z = ((++x) + (++x) + (++x))

Worse: it invokes the dreaded Nasal Demons of Undefined Behaviour. You cannot apply pre- or post-increment or -decrement more than once to the same variable in a single expression unless a sequence point separates them.

Many C compilers usually do what you expect without surprises – except when you enable higher optimizations, and the results start to differ!
Thus, it is better to avoid that sort of a thing, and also enable compiler warnings (so the compiler will tell you if you accidentally do that). Even if a simple test shows that a particular version of a particular compiler seems to handle it right.

brucehoult · « **Reply #15 on:** March 04, 2022, 12:17:17 pm »

Quote from: grumpydoc on March 04, 2022, 11:48:28 am

gcc has some extensions which can help avoid this so you can write
Code: [Select]
#define MAX(a,b) \
({ \
    typeof(a) _a = (a); \
    typeof(b) _b = (b); \
    (_a > _b) ? (_a) : (_b); \
})
but I don't think that's portable and it is best to avoid anything which is not idempotent as a macro argument.

Just confirmed that works in Clang as well as gcc.

How much more portability do you need? :-)

gcc supports all the old historical stuff, and LLVM supports all the new stuff first and best.

Well, except for the 8-bit CPUs covered by cc65 and sdcc. Hmm .. LLVM has a plain C back-end, which I'd presume you can pass the output of to those compilers...

https://github.com/JuliaComputingOSS/llvm-cbe

brucehoult · « **Reply #16 on:** March 04, 2022, 12:24:51 pm »

Quote from: Nominal Animal on March 04, 2022, 12:08:29 pm

Quote from: Rick Law on March 04, 2022, 07:02:53 am
But if you increment the count as macro's parameter, x is incremented three times:
z = triple(++x) // this set z = ((++x) + (++x) + (++x))
Worse: it invokes the dreaded Nasal Demons of Undefined Behaviour. You cannot apply pre- or post-increment or -decrement more than once to a variable in a single expression.

Many C compilers do usually do what you expect without surprises – except when you enable higher optimizations, and the results start to differ!
Thus, it is better to avoid that sort of a thing, and also enable compiler warnings (so the compiler will tell you if you accidentally do that). Even if a simple test shows that a particular version of a particular compiler seems to handle it right.

But what is "handle it right"?

If you do that with x initially 10, my assumption, as a compiler guy, is that "handling it right" would probably be 13 + 13 + 13. But you, on the other hand, might well imagine that "handling it right" would be 11 + 12 + 13.

It's undefined. It could be either -- they are both perfectly consistent and logical. Or it could be something else, though it's harder I think to make a consistent argument for something else. Maybe 12 + 12 + 13 is arguable.

In general, it's just a very very bad way to code.

brucehoult · « **Reply #17 on:** March 04, 2022, 12:41:08 pm »

So I tried a couple of compilers on:

Code: [Select]

#define triple(x) x+x+x

int foo(int n){
  return triple(++n);
}

clang on my Mac gives:

Code: [Select]

0000000000000000 <ltmp0>:
       0: 08 04 00 0b   add     w8, w0, w0, lsl #1
       4: 00 19 00 11   add     w0, w8, #6
       8: c0 03 5f d6   ret

So, that's returning 3*n + 6, i.e. for n=10 it's 11+12+13.

gcc on x86 Linux gives:

Code: [Select]

0000000000000000 <foo>:
   0:   f3 0f 1e fa             endbr64 
   4:   8d 44 7f 07             lea    0x7(%rdi,%rdi,2),%eax
   8:   c3                      retq

That's 3*n + 7, so actually for n=10 that's 12+12+13.

gcc for risc-v gives:

Code: [Select]

0000000000000000 <foo>:
   0:   0025079b                addiw   a5,a0,2
   4:   0017979b                slliw   a5,a5,0x1
   8:   250d                    addiw   a0,a0,3
   a:   9d3d                    addw    a0,a0,a5
   c:   8082                    ret

So that's 2*(n+2) + (n+3), or 3*n + 7, consistent with x86 gcc.

I'm glad I said 12+12+13 is arguable in my previous post! That was based on the idea of evaluating side effects of each operand to a binary operator (or function) just before applying that operator (or function), and leaving side-effects on arguments of other operators until they were about to be evaluated.

DiTBho · « **Reply #18 on:** March 04, 2022, 01:03:12 pm »

Let's say it simple and concise:
don't use macros. Never.

edit: bold

Ian.M · « **Reply #19 on:** March 04, 2022, 01:17:03 pm »

That's overreacting. If you are writing 'vanilla' ISO C compliant code even MISRA hasn't gone as far as totally banning function-like macros. C++ code is a different matter as it has better ways of achieving a similar result.

Ref:

MISRA C:2004, 19.7 - A function should be used in preference to a function-like macro.
MISRA C++:2008, 16-0-4 - Function-like macros shall not be defined.
MISRA C:2012, Dir. 4.9 - A function should be used in preference to a function-like macro where they are interchangeable

grumpydoc · « **Reply #20 on:** March 04, 2022, 01:26:49 pm »

No need to shout.

I wouldn't go that far.

However, it is reasonable to say that inline functions and modern compiler options render a lot of macro use unnecessary, and the rest implementable in safer ways than in times past.

Perhaps better to say - don't use any language feature without having taken time to understand it.

DiTBho · « **Reply #21 on:** March 04, 2022, 01:45:25 pm »

Quote from: grumpydoc on March 04, 2022, 01:26:49 pm

Perhaps better to say - don't use any language feature without having taken time to understand it.

Precisely.

Siwastaja · « **Reply #22 on:** March 04, 2022, 04:31:44 pm »

Don't use anything you don't know how to use.

C does not have templates, so macros are used for generic programming or metaprogramming. Sometimes this is a lifesaver when it comes to readability, writability and maintainability of code bases. Yes, we all know the preprocessor sucks, and there are a few catches, you just need to know them:
* for a function-like interface, you need to wrap the body inside do{} while(0), which looks odd when you see it for the first time
* arguments must be in (parentheses).

Replacements are not without catches, either. In the real world (ignoring dreams, hallucinations, etc.), replacing a macro means a non-portable attribute/pragma thing that forces inlining of the function. Also, real-world experience shows that people who intend to do that do not do it.

Yet reality also shows that people often just forget to qualify their "private" functions static. Mistakes happen. This does not mean that you should not use functions.

grumpydoc · « **Reply #23 on:** March 04, 2022, 04:58:39 pm »

Quote from: Siwastaja on March 04, 2022, 04:31:44 pm

Don't use anything you don't know how to use.

Sage advice, but obviously you generally need to use a thing to learn how to use it better.

Quote

* for function-like interface, need to wrap inside do{} while(0), which looks odd when you see it for the first time

Only multi-statement macros need this trick.

The problem is that if you define a macro such as

Code: [Select]

#define do_a_and_b(x, y) \
    a(x, y); \
    b(x, y)

you have a problem if you later write

Code: [Select]

     if (test)
         do_a_and_b(x, y);

because this is going to expand to

Code: [Select]

     if (test)
         a(x, y);
         b(x,y);

The problem here is that b(x,y) will be called whether test is true or not - which is almost certainly not what the user of the macro expected.

You could make the two statements into a block

Code: [Select]

#define do_a_and_b(x, y) { \
    a(x, y); \
    b(x, y); }

that does fix the "if" problem, but the semi-colon you naturally add after the macro call is now a separate empty statement, which causes a syntax error as soon as an "else" follows the "if".

Wrapping the block in do {...} while (0) gives a single block of code (which executes once) *and* allows a trailing semi-colon after the macro call without inducing a syntax error.

golden_labels · « **Reply #24 on:** March 04, 2022, 08:24:30 pm »

DiTBho: no need to shout.

As for the advice itself: you don’t have any choice. Not unless you want to stop using any library, including the C standard library.

