3

I'm trying to do a macro like this:

#define STRING(i) \
struct STRING##i \
{ \
    size_t len; \
    char chars[i]; \
}

The problem is that this also works with constexpr arguments, like this:

constexpr int ten = 10;
STRING(ten) mystr;

I don't want that because then STRING(ten) and STRING(10) aren't compatible types and that might confuse users of this macro.
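The mismatch arises because `##` pastes the argument exactly as it is spelled, without evaluating it. A minimal sketch of that effect follows. `TYPE_NAME` and `same_type_name` are hypothetical helpers, not part of the macro above, and stringification is used here only to make the pasted tag visible:

```c
#include <string.h>

/* Hypothetical helper: '#' stringifies its operand as spelled, mirroring
   how STRING##i pastes the unexpanded argument into the struct tag. */
#define TYPE_NAME(i) "STRING" #i

/* Returns 1 when two spellings would name the same struct tag. */
int same_type_name(const char *a, const char *b)
{
    return strcmp(a, b) == 0;
}
```

Under these assumptions, `same_type_name(TYPE_NAME(ten), TYPE_NAME(10))` compares "STRINGten" with "STRING10" and returns 0, even if `ten` has the value 10.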

I tried the following:

#define STRING(i) \
struct STRING##i \
{ \
    _Static_assert( CAT(i, ul) , "must be nonzero literal"); \
    size_t len; \
    char chars[i]; \
}

This appends ul to the argument, which turns a literal into an unsigned long literal but fails for non-literals, since the result is a different (normally undeclared) identifier. The problem is that the check is fooled if the user happens to have another constexpr named tenul, for example.
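A small sketch of that failure mode. The definition of `CAT` is an assumption (it is not shown above), and an enum constant stands in for the user's unrelated `tenul` so that the sketch also compiles before C23:

```c
/* Assumed definition; the question does not show CAT. */
#define CAT(a, b) a##b

/* Stand-in for a user's unrelated constant that happens to be named tenul. */
enum { tenul = 7 };

/* CAT(10, ul) pastes to the integer constant 10ul. */
unsigned long pasted_literal(void)
{
    return CAT(10, ul);
}

/* CAT(ten, ul) pastes to the identifier tenul. It exists here, so a
   _Static_assert on it would pass even though `ten` is not a literal. */
unsigned long pasted_identifier(void)
{
    return CAT(ten, ul);
}
```

Note that `ten` itself does not even need to be declared in this sketch, because only the pasted name `tenul` reaches the compiler.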

I wonder if there's a better way to make this macro fail unless an integer literal is provided.


Edit

To add to the accepted answer, I did:

#define STRING(i) \
struct STRING##i \
{ \
    _Static_assert((1##i##1ul || i##8ul || 1), "argument must be positive decimal integer literal"); \
    size_t len; \
    char chars[i]; \
}

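This rejects octal and hexadecimal constants as well as identifiers, so only decimal literals are accepted: 1##i##1ul is malformed for hexadecimal arguments and for identifiers, and i##8ul is malformed for octal arguments because 8 is not an octal digit.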
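To see what the two pastes produce, the pasted tokens can be stringified instead of compiled. This is only an illustrative sketch: `XSTR`, `FIRST_PASTE`, `SECOND_PASTE` and `print_pastes` are hypothetical names that are not part of the macro above.

```c
#include <stdio.h>

/* Hypothetical helpers: the paste happens first, then XSTR turns the
   resulting token into a string so it can be inspected. */
#define XSTR(x) #x
#define FIRST_PASTE(i)  XSTR(1##i##1ul)
#define SECOND_PASTE(i) XSTR(i##8ul)

/* Prints both pasted spellings for a decimal, an octal, a hexadecimal
   and an identifier argument. */
void print_pastes(void)
{
    printf("10   : %s  %s\n", FIRST_PASTE(10),   SECOND_PASTE(10));
    printf("012  : %s  %s\n", FIRST_PASTE(012),  SECOND_PASTE(012));
    printf("0x1F : %s  %s\n", FIRST_PASTE(0x1F), SECOND_PASTE(0x1F));
    printf("ten  : %s  %s\n", FIRST_PASTE(ten),  SECOND_PASTE(ten));
}
```

For 10 both spellings (1101ul and 108ul) are valid constants. For 012 the second spelling, 0128ul, contains a digit that is invalid in octal. For 0x1F and ten the first spellings, 10x1F1ul and 1ten1ul, are not valid constants at all.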

18
  • Detail: C does not define integer literals. It does define integer constants. Commented Mar 21, 2024 at 5:12
  • 1
    Should STRING(10) and STRING(0xA) map to the same type? If the answer is "Yes", life is going to be hell! You can attempt to make a macro foolproof, but rest assured, a better fool will come along and break it. It may not be worth the mental gymnastics. Commented Mar 21, 2024 at 5:19
  • It wouldn't, because the identifier name is different. But I only want to distinguish ordinary decimal integer literals from identifiers, since no one actually writes code like STRING(0xA). Commented Mar 21, 2024 at 5:21
  • Who writes code like STRING(ten)? Commented Mar 21, 2024 at 5:22
  • 1
    There is no reason to do everything with a compiler. Simply write a script in any scripting language that scans the source code for STRING invoked with an undesired form of argument. Then call the script as part of the build procedure (or even as a commit hook in the source code control system). Commented Mar 21, 2024 at 11:29

3 Answers

2

This is quite intricate, since:

  • A lot of things in C can be integer constant expressions while they are not integer constants. For example 1|1.
  • C has a whole lot of ways to express integer constants, including hex, octal and binary notation (C23), and various suffixes like U or L. As of C23 there are also digit separators '.

I'm assuming it is acceptable for an argument that is not an integer constant to either produce some manner of (likely confusing) compiler error or otherwise make the macro evaluate to zero.

Some things like false or '1' could arguably be considered integer constants, but they are not typically something we would like to pass such a check either.

I came up with a macro that catches most but not all of the various special cases. It consists of 2 expressions:

  • (void)(struct{int ident##x;}){}.ident##x. This creates an anonymous struct inside a compound literal. It then tries to name a struct member ident + whatever was passed to the macro.

    For integer constants this is fine, since an identifier may not begin with a digit but may contain letters and digits after the first character. But if x contains any preprocessing tokens beyond a single run of letters and digits, such as an operator or parentheses, we get a compiler error. This weeds out integer constant expressions that are not integer constants.

  • (#x)[0] >= '0' && (#x)[0] <= '9'. Inspired by the union trick posted by @CPlus, this simply checks whether the first character of the stringified argument is a digit, which is always the case for integer constants (hex, octal and binary constants all start with 0).
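As a rough illustration of the second expression on its own, here is a sketch that works in older C versions too. `STARTS_WITH_DIGIT` and the `check_*` functions are hypothetical names introduced only for this example:

```c
/* Hypothetical name for the first-character check used in isolation. */
#define STARTS_WITH_DIGIT(x) ((#x)[0] >= '0' && (#x)[0] <= '9')

/* "0x1F" starts with '0', so the check passes. */
int check_hex(void)        { return STARTS_WITH_DIGIT(0x1F); }

/* "ten" starts with 't'. The argument is only stringified, never
   evaluated, so `ten` does not need to be declared. */
int check_identifier(void) { return STARTS_WITH_DIGIT(ten); }

/* "1.0" also starts with a digit, which is why this check alone is not
   enough and the struct-member expression is needed as well. */
int check_floating(void)   { return STARTS_WITH_DIGIT(1.0); }
```

Here `check_hex()` and `check_floating()` return 1, while `check_identifier()` returns 0.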

Forming a macro out of these two (in C23), we put the first expression at the left side of a comma operator to discard it (and cast to void to hush up compiler warnings):

#define IS_INTEGER_CONSTANT(x) ( (void)(struct{int ident##x;}){}.ident##x  ,  \
                                 (#x)[0] >= '0' && (#x)[0] <= '9' )

Test cases:

#include <stdio.h>
#include <stdint.h>

#define IS_INTEGER_CONSTANT(x) ( (void)(struct{int ident##x;}){}.ident##x  ,  \
                                 (#x)[0] >= '0' && (#x)[0] <= '9' )
#define TEST(x) printf("%s is %san integer constant.\n", #x, IS_INTEGER_CONSTANT(x)?"":"NOT ")


int main() 
{
  TEST(0);
  TEST(123);
  TEST(0123);
  TEST(0x123);
  TEST(1UL);
  TEST(UINT32_C(1));

  TEST( false ); // it's a bool and keyword in C23, not a macro
  int i = 1; TEST(i);
  typedef int type; TEST(type);
  TEST(nullptr);
  constexpr int ten = 10; TEST(ten);

  //TEST(1'2'3); fails here: C23 integer constant with digit separators
  
  /* The following are not integer constants and give various compiler errors:
  TEST(1|1);
  TEST( (char)1 );
  TEST('1');
  TEST(1.0);
  TEST(NULL);
  TEST("hello");
  */
}

Output:

0 is an integer constant.
123 is an integer constant.
0123 is an integer constant.
0x123 is an integer constant.
1UL is an integer constant.
UINT32_C(1) is an integer constant.
false is NOT an integer constant.
i is NOT an integer constant.
type is NOT an integer constant.
nullptr is NOT an integer constant.
ten is NOT an integer constant.

So this failed to catch the corner case of C23 ' digit separators, but worked with all the other weird corner cases I could come up with.
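For compilers without C23 support, a variant that replaces the empty initializer `{}` with `{0}` may be an option. This is a sketch under that assumption rather than part of the tested macro above. `IS_INTEGER_CONSTANT_PRE_C23` and the `probe_*` functions are hypothetical names:

```c
/* Hypothetical pre-C23 spelling: {0} replaces the empty initializer {},
   which is only standard from C23 on. Everything else is unchanged. */
#define IS_INTEGER_CONSTANT_PRE_C23(x) \
    ( (void)(struct{int ident##x;}){0}.ident##x , \
      (#x)[0] >= '0' && (#x)[0] <= '9' )

/* ident123 and ident1UL are valid member names, and the first
   character of the stringified argument is a digit. */
int probe_decimal(void)  { return IS_INTEGER_CONSTANT_PRE_C23(123); }
int probe_suffixed(void) { return IS_INTEGER_CONSTANT_PRE_C23(1UL); }

/* identten is a valid member name too, but "ten" starts with 't'. */
int probe_name(void)     { return IS_INTEGER_CONSTANT_PRE_C23(ten); }
```

The first two probes return 1 and the last returns 0, matching the behavior in the output above for the corresponding arguments.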


4 Comments

If you are doing a runtime check anyway, isdigit((#x)[0]) works from <ctype.h>.
@CPlus What makes you think the check will be carried out at run time?
You cannot use IS_INTEGER_CONSTANT in a context where a compile-time constant is required.
@CPlus Yes, no, maybe? Depends on context. The macro in itself is not an integer constant expression, if that's what you mean.
1

Valid identifiers start with alphabetic characters or underscores. Valid integer literals start with digits. If only there was a way to check if the first character of #i (the macro argument converted to a string) was numeric at compile time. But as far as I can tell using (#i)[0] only works at runtime.

The question is tagged C23, so here is the only method I could think of, which only works in C23:

In 6.6 Constant Expressions:

Starting from a structure or union constant, the member-access . operator may be used to form a named constant or compound literal constant as described above.

This means in theory you should be able to have code along the lines of:

#define IS_LITERAL(i)\
_Static_assert(((constexpr union {unsigned char f, s[sizeof(#i)];}){.s = #i}).f - '0' <= 9,\
    "Not numeric literal");

This creates a constant anonymous union containing a char field and a char[] field, uses the . operator to read the char field (which overlaps the first character of the char[] field), and then checks whether that character is numeric.

Edit: The above technically invokes undefined behavior/is not required to compile as per the following rule:

If the member-access operator . accesses a member of a union constant, the accessed member shall be the same as the member that is initialized by the union constant's initializer.
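A separate caveat concerns the range check `.f - '0' <= 9` itself: the subtraction is carried out in int after integer promotion, so a character below '0' yields a negative value that still satisfies `<= 9`. A sketch of the effect and of one possible fix follows. Both function names are hypothetical, and the unsigned cast is an assumption, not part of the answer's code:

```c
/* Mirrors the comparison in IS_LITERAL: c is promoted to int, so any
   character below '0' gives a negative difference that passes "<= 9". */
int looks_numeric_signed(unsigned char c)
{
    return c - '0' <= 9;
}

/* Possible fix (an assumption): compare as unsigned, so a negative
   difference wraps around to a large value and fails the test. */
int looks_numeric_unsigned(unsigned char c)
{
    return (unsigned)(c - '0') <= 9u;
}
```

With this, `looks_numeric_signed('+')` returns 1 while `looks_numeric_unsigned('+')` returns 0; both return 1 for '7' and 0 for 't'.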


Alternatively if you want a compile time error for a non literal integer value and do not mind losing the ability to use hexadecimal constants you could use:

#define IS_LITERAL(i) _Static_assert(1##i || 1); // Or int dummy = 1##i; or similar

If i were an identifier, then i preceded by a 1 would be neither a valid identifier nor a valid constant, so the assertion fails to compile. Prepending 1 to an octal or decimal constant, on the other hand, still gives a valid expression.

Note: 1##i##0 can also be used to block identifiers such as l or ul.
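The reason the suffix matters can be sketched as follows. `PREFIX_ONLY`, `PREFIX_SUFFIX` and the functions are hypothetical names used only to expose the pasted values:

```c
#define PREFIX_ONLY(i)   (1##i)     /* the 1##i form of the check     */
#define PREFIX_SUFFIX(i) (1##i##0)  /* the 1##i##0 form from the note */

/* 1##ul pastes to 1ul, a perfectly valid unsigned long constant, so the
   prefix-only check would accept an identifier named ul. */
unsigned long prefix_only_ul(void) { return PREFIX_ONLY(ul); }

/* Decimal arguments stay valid constants in both forms. */
int prefix_only_10(void)   { return PREFIX_ONLY(10); }
int prefix_suffix_10(void) { return PREFIX_SUFFIX(10); }

/* PREFIX_SUFFIX(ul) would paste to 1ul0, which is not a valid constant
   and therefore fails to compile, so it is left out of this sketch. */
```

Here `prefix_only_ul()` returns 1, `prefix_only_10()` returns 110 and `prefix_suffix_10()` returns 1100.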

8 Comments

@CPlus "Valid integer literals start with digits" C does not define integer literals. It does define string and compound literals and integer constants. Perhaps you are answering with C++ terms?
@chux-ReinstateMonica I was answering with the same language OP used.
@CPlus yup this actually works perfectly for me. Thanks. One suggestion is to do 1##i##1 instead because the identifier can be ul or l or another suffix.
As for the "the accessed member shall be the same as the member that is initialized by the union constant's initializer" part, you actually don't need to use constexpr here, since the expression will be evaluated at compile-time regardless. It rather seems that constexpr only brought problems to this example. Plus if you get rid of it, the code is backwards compatible at least back to C11.
Oh btw... any ASCII character lower than '0' passed to the macro means that the check will wrongly pass, due to integer promotion of .f - '0' to type int. You'll get a negative value, which is indeed <= 9.
1

How to check if a macro argument is an integer literal in C

There is no reason to do everything with a compiler. Simply write a script in any scripting language that scans the source code for STRING invoked with an undesired form of argument. Then call the script as part of the build procedure (or even as a commit hook in the source code control system).

The easiest way to do this may be to use the compiler partially, for its preprocessing function, as with Clang or GCC’s -E switch to provide the result of preprocessing without compiling. Then this output may be scanned for struct STRINGtext, and the text can be tested for the desired form.
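The test applied to the preprocessed output could look roughly like this. It is sketched in C only to match the rest of this page; a scripting language would do just as well. `line_is_ok` is a hypothetical helper, and the rule it enforces (decimal digits only, no leading zero) is an assumption about what the "desired form" is:

```c
#include <ctype.h>
#include <string.h>

/* Returns 1 if every "struct STRING<suffix>" in `line` has a suffix made
   only of decimal digits with no leading zero, and 0 otherwise. */
int line_is_ok(const char *line)
{
    static const char prefix[] = "struct STRING";
    const char *p = line;

    while ((p = strstr(p, prefix)) != NULL) {
        p += sizeof prefix - 1;                 /* skip past the prefix */
        const char *start = p;

        /* Consume the pasted suffix: letters, digits and underscores. */
        while (isalnum((unsigned char)*p) || *p == '_')
            p++;

        /* Reject an empty suffix and octal-looking suffixes. */
        if (p == start || *start == '0')
            return 0;

        /* Reject anything that is not a decimal digit. */
        for (const char *q = start; q < p; q++)
            if (!isdigit((unsigned char)*q))
                return 0;
    }
    return 1;
}
```

A driver would read the output of the -E step line by line and fail the build on the first line for which `line_is_ok` returns 0.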
