CPP: Macro Replacement: Page 182, EXAMPLE 3

2023/08/09

This post explains the EXAMPLE 3 given on Page# 182 (page# that would be found printed on the page of a paper-copy, not the one displayed on a pdf reader application) of the working draft, dated April 1, 2023, of the C2X standard.


#define x 3
#define f(a) f(x * (a))
#undef x
#define x 2
#define g f
#define z z[0]
#define h g(\~{ }
#define m(a) a(w)
#define w 0,1
#define t(a) a
#define p() int
#define q(x) x
#define r(x,y) x ## y
#define str(x) # x
f(y+1) + f(f(z)) % t(t(g)(0) + t)(1);
g(x+(3,4)-w) | h 5) & m
(f)^m(m);
p() i[q()] = { q(1), r(2,3), r(4,), r(,5), r(,) };
char c[2][6] = { str(hello), str() };

Expansion of f(y+1)

The CPP pushes on the currently empty active-macro-stack-#0 an entry corresponding to the invocation f(y+1):

    // active-macro-stack-#0

    f(y+1)    <---- bottom of the stack
    -----------------------------------

The argument y+1 is expanded to itself, since the identifier y isn’t the name of any macro, and the symbols + and 1 are not subject to expansion.

After argument-substitution, the state is:

    f(y+1)    (rest is: + f(f(z)) % . . .)
    +-----
    | after arg-substitution
    v-----------
    f(x * (y+1))

The CPP now rescans the token-sequence f(x * (y+1)). The identifier f matches with the name of an active macro, f itself, as evident from the state of the active-macro-stack-#0. Hence, this token for the identifier f is marked for non-replacement. In this post, such marked identifiers are written in CAPITAL. The current state of expansion is:

    f(y+1)    (rest is: + f(f(z)) % . . .)
    +-----
    | after arg-substitution
    v-----------
    f(x * (y+1))
    F(x * (y+1))    <--- mark f for non-replacement

Since F is marked for non-replacement, the token-sequence (x * (y+1)) isn’t considered to be a part of an invocation of the macro f. The CPP moves ahead and reaches x. Although a macro named x is defined, undefined and then redefined, the definition that is active is the latest one lexically before the tokens currently being expanded. Accordingly, x is expanded to 2, as shown below.

The identifier x is the name of an obj-like macro. It also doesn’t match with the name of any active macro, as evident from the state of the active-macro-stack-#0. Hence, this token for the identifier x is allowed replacement. The CPP pushes on to the active-macro-stack-#0 an entry for the macro-invocation x:

    // active-macro-stack-#0

    x         <---- top of the stack
    f(y+1)    <---- bottom of the stack
    -----------------------------------

The replacement-list of the macro x is just 2:

    f(y+1)    (rest is: + f(f(z)) % . . .)
    +-----
    | after arg-substitution
    v-----------
    F(x * (y+1))    <--- mark f for non-replacement
      +
      |
      v
      2

The CPP now rescans the token 2. The token isn’t subject to replacement. Thus, the expansion of the macro-invocation x is 2. The CPP will move out of the boundaries of the replacement-list of the macro-invocation x, with no other processing related to that invocation pending. The corresponding entry within the active-macro-stack-#0 is popped off.

The state is:

    // active-macro-stack-#0

    f(y+1)    <---- bottom of the stack
    -----------------------------------
    f(y+1)    (rest is: + f(f(z)) % . . .)
    +-----
    | after arg-substitution
    v-----------
    f(x * (y+1))
    F(x * (y+1))    <--- mark f for non-replacement
    F(2 * (y+1))    <--- replace x by 2

The token-sequence * (y+1)) remains as it is. With that, the CPP now moves out of the boundaries of the replacement-list of the macro-invocation f(y+1), with no other processing related to that invocation pending. The corresponding entry within the active-macro-stack-#0 is popped off, and the stack is now empty.

The output at this stage is: F(2 * (y+1)).

(Note that the capital letters only declare that the token was marked for non-replacement. The actual output of a CPP implementation will be f(2 * (y+1)).)

The rest begins with +. That symbol is output as it is. Then follows the invocation f(f(z)).


Expansion of f(f(z))

The CPP pushes on to the now empty active-macro-stack-#0 an entry corresponding to the invocation f(f(z)).

    // active-macro-stack-#0

    f(f(z))    <---- bottom of the stack
    ------------------------------------
    f(f(z))    (rest is: % t(t(g) . . .)
    +------

The argument is f(z). It must be fully expanded, over a new active-macro-stack, before proceeding.


Expansion of f(z)

The CPP pushes on to a new active-macro-stack-#1 an entry corresponding to the invocation f(z).

    // active-macro-stack-#1

    f(z)    <---- bottom of the stack
    ---------------------------------
    f(z)    (rest is: % t(t(g) . . .)
    +---

The argument is z, which is the name of an object-like macro. It must be fully expanded, over a new active-macro-stack, before proceeding.


Expansion of z

The CPP pushes on to a new active-macro-stack-#2 an entry corresponding to the invocation z.

    // active-macro-stack-#2

    z    <---- bottom of the stack
    ------------------------------

Now towards expanding z:

    z   (rest is:  empty)
    +
    | after substitution
    v---
    z[0]

The CPP now rescans the token-sequence z[0]. The identifier z matches the name of an active macro, z itself, as evident from the active-macro-stack-#2. Hence, this token of the identifier z is marked for non-replacement. Note that this token will be substituted in the expansions of f(z) and of f(f(z)). These substitutions do not change the fact that this token is marked for non-replacement. The rescanning that will occur once the processing of macros f(z) and f(f(z)) resumes will not expand this token. The mark is persistent.

    z   (rest is:  empty)
    +
    | after substitution
    v---
    z[0]
    Z[0]    <--- mark z for non-replacement

The token-sequence [0] is output as it is, without change. With that, the CPP moves out of the boundaries of the replacement-list of the macro-invocation z, with no other processing for that invocation pending. The corresponding entry within the active-macro-stack-#2 is popped off; the stack is now empty.


Expansion of f(z) continues

    // active-macro-stack-#1

    f(z)    <---- bottom of the stack
    ---------------------------------

The fully expanded argument is substituted into the replacement-list of the macro f:

    f(z)    (rest is:   % t(t(g) . . .)
    +---
    | after arg-substitution
    v--------------
    f(x * (Z[0]+1))

The CPP now rescans the token-sequence f(x * (Z[0]+1)). Since the identifier f matches with the name of an active macro, f itself, as evident from the active-macro-stack-#1, this token for the identifier f is marked for non-replacement. Note that this token will make its way through to the replacement-list of f(f(z)); even there this token will remain non-replaceable.

    f(z)    (rest is:   % t(t(g) . . .)
    +---
    | after arg-substitution
    v--------------
    f(x * (Z[0]+1))
    F(x * (Z[0]+1))    <--- mark f for non-replacement

The token-sequence (x * (Z[0]+1)) therefore doesn’t form a macro-invocation of the macro f.

Since the identifier x doesn’t match with the name of any active macro, as evident from the state of the active-macro-stack-#1, it is allowed to be replaced. Its replacement is just 2.

    f(z)    (rest is:   % t(t(g) . . .)
    +---
    | after arg-substitution
    v--------------
    f(x * (Z[0]+1))
    F(x * (Z[0]+1))    <--- mark f for non-replacement
    F(2 * (Z[0]+1))    <--- replace x by 2

Since the identifier z is marked for non-replacement, it is not considered for replacement.

The CPP now moves out of the boundaries of the replacement-list of the macro-invocation f(z), with no other processing for that invocation pending. The corresponding entry within the active-macro-stack-#1 is popped off; the stack is now empty.

The output at this stage is: F(2 * (Z[0]+1)).


Expansion of f(f(z)) continues

    // active-macro-stack-#0

    f(f(z))    <---- bottom of the stack
    ------------------------------------
    f(f(z))     (rest is:   % t(t(g) . . .)
    +------

The fully expanded argument is substituted into the replacement-list of the macro f:

    f(f(z))     (rest is:   % t(t(g) . . .)
    +------
    | after arg-subsitution
    v-----------------------
    f(x * (F(2 * (Z[0]+1))))

The CPP now rescans the token-sequence f(x * F(2 * (Z[0]+1))). The first identifier f matches the name of an active macro, f itself, as evident from the active-macro-stack-#0. Hence, this token for the identifier f is marked for non-replacement.

    f(f(z))     (rest is:   % t(t(g) . . .)
    +------
    | after arg-subsitution
    v-----------------------
    f(x * (F(2 * (Z[0]+1))))
    F(x * (F(2 * (Z[0]+1))))    <--- mark f for non-replacement

Since the identifier x doesn’t match with the name of any active macro, as evident from the state of the active-macro-stack-#1, it is allowed to be replaced. Its replacement is just 2.

    f(f(z))     (rest is:   % t(t(g) . . .)
    +------
    | after arg-subsitution
    v-----------------------
    f(x * (F(2 * (Z[0]+1))))
    F(x * (F(2 * (Z[0]+1))))    <--- mark f for non-replacement
    F(2 * (F(2 * (Z[0]+1))))    <--- replace x by 2

The CPP now moves out of the boundaries of the replacement-list of the macro-invocation f(f(z)), with no other processing for that macro pending. The corresponding entry within the active-macro-stack-#0 is popped off; the stack is now empty.

The cumulative output until now is:

F(2 * (y+1)) + F(2 * (F(2 * (Z[0]+1)))).

The rest begins with %. That symbol is output as it is. Then follows the invocation t(t(g)(0) + t), with (1); . . . as the rest.


Expansion of t(t(g)(0) + t)

The CPP pushes on to the now empty active-macro-stack-#0 an entry corresponding to the macro-invocation t(t(g)(0) + t)

    // active-macro-stack-#0

    t(t(g)(0) + t)    <---- bottom of the stack
    -------------------------------------------
    t(t(g)(0) + t)      (rest is: (1); . . .)
    +-------------

The argument is t(g)(0) + t. It must first be fully expanded, over a new active-macro-stack, before proceeding.


Expansion of t(g)(0) + t

The CPP pushes on to a new active-macro-stack-#3 an entry corresponding to the invocation t(g), with (0) + t as rest.

    // active-macro-stack-#3

    t(g)    <---- bottom of the stack
    ---------------------------------
    t(g)    (rest is:   (0) + t)
    +---

The argument is g, which is the name of an object-like macro. It must be fully expanded, over a new active-macro-stack, before proceeding.


Expansion of g

The CPP pushes on to a new active-macro-stack-#4 an entry corresponding to the invocation g

    // active-macro-stack-#4

    g    <---- bottom of the stack
    ------------------------------
    g   (rest is:  empty)
    +

Now towards expanding g:

    g   (rest is:  empty)
    +
    | after substitution
    v
    f

The expansion of g is f. The CPP now rescans the token f. The identifier f does not match with the name of any active macro, as evident from the active-macro-stack-#4. Hence, this token for the identifier f is allowed replacement. The identifier f matches with the name of a function-like macro, but the rest here is empty. The CPP now moves out of the boundaries of the replacement-list of the macro-invocation g. The corresponding entry within the active-macro-stack-#0 is popped off;the stack is now empty.


Expansion of t(g)(0) + t continues

    // active-macro-stack-#3

    t(g)    <---- bottom of the stack
    ---------------------------------

The fully expanded argument is substituted into the replacement-list of the macro t:

    t(g)    (rest is: (0) + t)
    +---
    | after arg-substitution.
    v
    f

The CPP now rescans the token f. It arrived from the expansion of the argument g, with its ability to be replaced intact. The identifier f doesn’t match with the name of any active macro, as evident from the state of the active-macro-stack-#3.

The identifier f matches the name of a function-like macro, and looking ahead into the rest, the CPP can construct a complete invocation of that macro in the form f(0). Here the point of ambiguity, as described in the post, arises. Choosing Option 1, the identifier f is donated out of the influence of the t(g) macro-invocation. As a result, the entry corresponding to the invocation t(g) is popped off the active-macro-stack-#3 and an entry corresponding to the invocation f(0) is pushed:

    // active-macro-stack-#3

    f(0)    <---- bottom of the stack
    ---------------------------------
    f(0)    (rest is: + t)
    +---

After argument-substitution:

    f(0)    (rest is: + t)
    +---
    | after arg-substitution
    v---------
    f(x * (0))

The process now follows similar to other expansions of the macro f described above:

    f(0)    (rest is: + t)
    +---
    | after arg-substitution
    v---------
    f(x * (0))
    F(x * (0))    <--- mark f for non-replacement
    F(2 * (0))    <--- replace x by 2

With this, the CPP moves out of the boundaries of the replacement-list of the macro-invocation f(0). The corresponding entry within the active-macro-stack-#3 is popped off; the stack is now empty.

The rest consists of token-sequence + t. That sequence is expanded as is. The identifier t does not match with the name of any active macro, as evident by the active-macro-stack-#3; the stack is empty. It does match with the name of the macro t, but the macro is function-like, and here there’s nothing beyond t in the invocation. Hence, the token for the identifier t is allowed replacement during further rescanning.

The output is: F(2 * (0)) + t.

The cumulative output until now is:

F(2 * (y+1)) + F(2 * (F(2 * (Z[0]+1)))) % F(2 * (0)) + t.


Expansion of t(t(g)(0) + t) continues

    // active-macro-stack-#0

    t(t(g)(0) + t)    <---- bottom of the stack
    -------------------------------------------

The fully expanded argument is substituted into the replacement-list of the macro t:

    t(t(g)(0) + t)  (rest is: (1); . . .)
    +-------------
    | after arg-substitution
    v-------------
    F(2 * (0)) + t

The CPP now rescans the token-sequences F(2 * (0)) + t. The identifier f is already marked for no-replacement. The identifier t matches the name of an active macro, t itself, as evident from the state of the active-macro-stack-#0. Hence this token for the identifier t is marked for non-replacement.

    t(t(g)(0) + t)  (rest is: (1); . . .)
    +-------------
    | after arg-substitution
    v-------------
    F(2 * (0)) + t
    F(2 * (0)) + T    <--- mark t for non-replacement

The CPP now moves out of the boundaries of the replacement-list of the macro-invocation t(t(g)(0) + t). The corresponding entry within the active-macro-stack-#0 is popped off; the stack becomes empty.

The rest is (1);<new-line>g(x+ . . . . The next macro-invocation is for the macro g.

The cumulative output until now is:

F(2 * (y+1)) + F(2 * (F(2 * (Z[0]+1)))) % F(2 * (0)) + T(1);<new-line>.


Expansion of g(x+(3,4)-w)

The CPP pushes on to the now empty active-macro-stack-#0 an entry corresponding to the invocation g(x+(3,4)-w).

    // active-macro-stack-#0

    g(x+(3,4)-w)    <---- bottom of the stack
    -----------------------------------------
    g(x+(3,4)-w)    (rest is:  | h 5 . . .)
    +-----------

After argument-substitution:

    g(x+(3,4)-w)    (rest is:  | h 5 . . .)
    +-----------
    | after arg-substitution
    v-------------
    f(2+(3,4)-0,1)

The CPP now rescans the token-sequence f(2+(3,4)-0,1). The identifier f does not match with the name of any active macro, as evident by the state of the active-macro-stack-#0. Hence, this token for the identifier f is eligible for replacement.

The token-sequence corresponds to a complete invocation of the macro f, without looking ahead into the rest of the source file.

The CPP therefore pushes an entry on to the active-macro-stack-#0 corresponding to the invocation f(2+(3,4)-0,1).

    // active-macro-stack-#0

    f(2+(3,4)-0,1)    <--- top of the stack
    g(x+(3,4)-w)      <--- bottom of the stack
    ------------------------------------------

After argument expansion and substitution:

    f(2+(3,4)-0,1)    (rest is: | h 5 . . .)
    +-------------
    | after arg-substitution
    v-------------------
    f(x * (2+(3,4)-0,1))

After rescanning:

    f(2+(3,4)-0,1)    (rest is: | h 5 . . .)
    +-------------
    | after arg-substitution
    v-------------------
    f(x * (2+(3,4)-0,1))
    F(x * (2+(3,4)-0,1))    <--- mark f for non-replacement
    F(2 * (2+(3,4)-0,1))    <--- replace x by 2

With this, the CPP moves out of the boundaries of the replacement-list of the macro-invocation f(2+(3,4)-0,1). The corresponding entry within the active-macro-stack-#0 is popped off.

    // active-macro-stack-#0

    g(x+(3,4)-w)      <--- bottom of the stack
    ------------------------------------------

But with that, the CPP also moves out of the boundaries of the replacement-list of the macro-invocation g(x+(3,4)-w). The corresponding entry on the active-macro-stack-#0 is popped off; the stack is now empty.

The cumulative output until now is:

F(2 * (y+1)) + F(2 * (F(2 * (Z[0]+1)))) % F(2 * (0)) + T(1);
F(2 * (2+(3,4)-0,1)).

The rest begins with |. That symbol is output as is. The rest is now h 5) & m . . ..


Expansion of h

In short:

    // active-macro-stack-#0

    h    <--- bottom of the stack
    -----------------------------
    h    (rest is: 5) & m . . .)
    +
    | after substitution
    v------
    g(\~{ }

After rescanning, and choosing Option 1:

    // active-macro-stack-#0

    g(\~{ } 5)    <--- bottom of the stack
    --------------------------------------
    g(\~{ } 5)    (rest is: & m (f) . . .)
    +---------
    | after arg-substitution
    v---------
    f(\~{ } 5)

After rescanning,

    // active-macro-stack-#0

    f(\~{ } 5)    <--- top of the stack
    g(\~{ } 5)    <--- bottom of the stack
    --------------------------------------
    g(\~{ } 5)    (rest is: & m (f) . . .)
    +---------
    | after arg-substitution
    v---------
    f(\~{ } 5)
    +---------
    | after arg-substitution
    v---------------
    f(x * (\~{ } 5))

After rescanning,

    g(\~{ } 5)    (rest is: & m (f) . . .)
    +---------
    | after arg-substitution
    v---------
    f(\~{ } 5)
    +---------
    | after arg-substitution
    v---------------
    f(x * (\~{ } 5))
    F(x * (\~{ } 5))    <--- marking f for non-replacement
    F(2 * (\~{ } 5))    <--- replacing x by 2

The CPP moves out of the boundaries of the replacement-lists of both the macro-invocations f(\~{ } 5) and g(\~{ }5). The active-macro-stack-#0 becomes empty.

The cumulative output until now is:

F(2 * (y+1)) + F(2 * (F(2 * (Z[0]+1)))) % F(2 * (0)) + T(1);
F(2 * (2+(3,4)-0,1)) | F(2 * (\~{ } 5)).

The rest consists of & m (f)^m(m); . . ..


Expansion of m(f)

In short:

    // active-macro-stack-#0

    m(f)    <--- bottom of the stack
    --------------------------------
    m(f)    (rest is: ^m(m); . . .)
    +---
    | after arg-substitution
    v---
    f(w)

The identifier f is allowed replacement. The argument w is expanded to 0,1.

    // active-macro-stack-#0

    f(w)    <--- top of the stack
    m(f)    <--- bottom of the stack
    --------------------------------
    m(f)    (rest is: ^m(m); . . .)
    +---
    | after arg-substitution
    v---
    f(w)
    +---
    | after arg-substitution
    v-----------
    f(x * (0,1))

After rescanning,

    m(f)    (rest is: ^m(m); . . .)
    +---
    | after arg-substitution
    v---
    f(w)
    +---
    | after arg-substitution
    v-----------
    f(x * (0,1))
    F(x * (0,1))    <--- mark f for non-replacement
    F(2 * (0,1))    <--- replace x by 2

The CPP now moves out of the boundaries of the replacement-list of both the macro-invocations f(w) and m(f). The active-macro-stack-#0 is now empty.

The cumulative output until now is:

F(2 * (y+1)) + F(2 * (F(2 * (Z[0]+1)))) % F(2 * (0)) + T(1);
F(2 * (2+(3,4)-0,1)) | F(2 * (\~{ } 5)) & F(2 * (0,1)).

The rest consists of ^m(m); . . .


Expansion of m(m)

In short:

    // active-macro-stack-#0

    m(m)    <--- bottom of the stack
    --------------------------------
    m(m)    (rest is: ^m(m); . . .)
    +---
    | after arg-substitution
    v---
    m(w)
    +---
    | after arg-substitution
    v-----
    m(0,1)
    M(0,1)    <--- mark m for non-replacement.

The argument w is expanded to 0,1. The identifier m is marked to prevent replacement.

The CPP now moves out of the boundaries of the replacement-list of the macro-invocation m(m). The active-macro-stack-#0 is now empty.

The cumulative output until now is:

F(2 * (y+1)) + F(2 * (F(2 * (Z[0]+1)))) % F(2 * (0)) + T(1);
F(2 * (2+(3,4)-0,1)) | F(2 * (\~{ } 5)) & F(2 * (0,1))^M(0,1);.

The rest still has two source lines worth of macro invocations, but they are simple enough.


Conclusion

The argument expansions run on separate active-macro-stacks. Any token that is marked for non-replacement within the tokens of the argument expansion, retains its mark, even if substituted into parent-constructs’ replacement-lists and then rescanned.