
I discovered that the Microsoft Visual Studio compiler and gcc preprocess the following small snippet differently:

# define M3(x, y, z) x + y + z
# define M2(x, y) M3(x, y)
# define P(x, y) {x, y}
# define M(x, y) M2(x, P(x, y))
M(a, b)

'gcc -E' gives the following:

a + {a + b}

whereas 'cl /E' issues a warning about a missing macro argument and produces the following output:

a + {a, b} +

It seems that cl does not treat commas that came from nested macro expansions as argument separators. Unfortunately, I found no description of the algorithm implemented in the cl preprocessor, so I'm not sure that my guess is correct. Does anyone know how the cl preprocessor works and how its algorithm differs from gcc's? And how can the observed behaviour be explained?

  • What version of gcc and CL? Besides that, I would say it's a bug in the gcc preprocessor as M3 should have three arguments and only gets two. Commented Jul 13, 2012 at 11:30
  • which version of cpp? cpp --version Commented Jul 13, 2012 at 11:42
  • I can't see any reference of the '{a + b}' form in C99? Commented Jul 13, 2012 at 11:45
  • 1
    Just forget about to try to get MSVC and any C99 complying compiler to produce the same results. MSCV isn't C99 (nor C11) they lack two versions of the standard behind. Commented Jul 13, 2012 at 13:15

3 Answers

# define M3(x, y, z) x + y + z
# define M2(x, y) M3(x, y)
# define P(x, y) {x, y}
# define M(x, y) M2(x, P(x, y))
M(a, b)

Let us roll this out manually, step by step:

M(a, b)
--> M2(a, P(a, b))
--> M2(a, {a, b})

The standard says:

The individual arguments within the list are separated by comma preprocessing tokens, but comma preprocessing tokens between matching inner parentheses do not separate arguments.

Only parentheses are mentioned, so ...

--> M3(a, {a, b})
--> a + {a + b}

Important:

M3(a, {a, b})

Here, according to the previous quote from the standard, three "arguments" are passed to M3 (using single-quotes to describe tokens/arguments):

M3('a', '{a', 'b}')

which are expanded to

'a' + '{a' + 'b}'

And this is what cpp (4.6.1) gives verbatim:

# 1 "cpp.cpp"
# 1 "<built-in>"
# 1 "<command-line>"
# 1 "cpp.cpp"




a + {a + b}

cpp (and therefore gcc and g++) is correct; MSVC isn't.

As a nobleman, make sure a bug report exists.


2 Comments

AFAIR these precise rules for the preprocessor came with C99. So for MSVC it isn't a bug; I don't think they claim to conform to C99. Code written for MSVC simply isn't portable nowadays.
@JensGustedt: My take is: a C compiler should implement the current standard. If MSVC is neither C99 nor C11 conforming, then it is not a C compiler, but at most specifically a C89 compiler. Same argumentation for C++ :)

The only logic that explains such behavior looks like this.

CL way:

 M(a,b) 
 M2(a,P(a,b)) 
 M3(a,P(a,b))
 M3(a,{a,b}) -> M3 gets 2 arguments ( 'a' and '{a,b}') instead of 3.
    |  \ /
  arg1  |
      arg2 

Gcc way:

M(a,b) 
M2(a,P(a,b)) 
M3(a,P(a,b))
M3(a,{a,b}) -> Gcc probably thinks there are 3 arguments here ('a', '{a', 'b}').
   |  | |
 arg1 | |
   arg2 |
     arg3

2 Comments

cpp doesn't think there are 3 arguments, it knows there are: per the standard, there are three arguments. Only commas within parentheses are not "preprocessor commas" (sidenote: I won't summon any downvote for this, just in case somebody will) :)
@phresnel, Yeah, probably. Unfortunately I can't check the standard right now, so this was just my assumption based on the OP's data ;)

I think gcc gets it right; what Microsoft does is incorrect.

When macro substitution is done for the line

M2(a, P(a, b))

the standard (section 6.10.3.1) requires that before replacing the second parameter ("y") in the macro's replacement list ("M3(x, y)") with its argument ("P(a, b)"), macro replacement is to be performed for that argument. This means "P(a, b)" is processed to "{a, b}" before it is inserted, resulting in

M3(a, {a, b})

which is then further replaced to

a + {a + b}

4 Comments

Thanks. I agree that MS is incorrect, but the problem is slightly different. The preprocessing algorithm is described in the standard, and the MS preprocessor obviously doesn't follow it. Do you know (or do you have a reasonable assumption) how the MS preprocessor works?
@SergeySyromyatnikov: At least one flaw is that it doesn't recognize commas within {}, but the standard says that only commas within () should be ignored.
@SergeySyromyatnikov I have no idea. MSVC claims to be C90-compliant, with some extensions. As far as I can tell C90 doesn't differ from the current C standard in this regard; here is a draft for the old standard, the relevant section is 3.8.3.
BTW, "{}" are not important and can be omitted or replaced with other symbols.
