
4.6.1 Calling Convention for `yylex`

The value that yylex returns must be the positive numeric code for the type of token it has just found; a zero or negative value signifies end-of-input.

When a token is referred to in the grammar rules by a name, that name in the parser file becomes a C macro whose definition is the proper numeric code for that token type. So yylex can use the name to indicate that type. See section Symbols, Terminal and Nonterminal.

When a token is referred to in the grammar rules by a character literal, the numeric code for that character is also the code for the token type. So yylex can simply return that character code, possibly converted to unsigned char to avoid sign-extension. The null character must not be used this way, because its code is zero and that signifies end-of-input.
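The conversion to unsigned char can be sketched as follows. The helper name char_token_code is hypothetical and not part of Bison; it only illustrates why the cast matters on platforms where plain char is signed.

```c
/* Hypothetical helper (not part of Bison): map a character taken from
   the input to the token code of the matching character literal.  */
static int
char_token_code (char ch)
{
  /* Where plain char is signed, a byte such as 0xE9 is negative when
     held in a char.  Returned as-is, it would be read by the parser
     as end-of-input.  Converting through unsigned char gives a
     nonnegative code instead.  */
  return (unsigned char) ch;
}
```

With this helper, an ordinary character such as '+' maps to its own code, and a byte with the high bit set maps to a positive value rather than a negative one. The null character still yields zero, so it remains unusable as a token.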

Here is an example showing these things:

int
yylex (void)
{
  …
  if (c == EOF)    /* Detect end-of-input.  */
    return 0;
  …
  if (c == '+' || c == '-')
    return c;      /* Assume token type for `+' is '+'.  */
  …
  return INT;      /* Return the type of the token.  */
  …
}
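For comparison, here is a minimal self-contained sketch of the same convention with the elided parts filled in by assumption. It reads from a string rather than a stream, so end-of-input is the terminating null instead of EOF. The value 258 for INT and the variable input are assumptions made so that the sketch compiles on its own; in a real parser the token macro comes from the Bison-generated file. Semantic values are deliberately left out.

```c
#include <ctype.h>

/* Normally defined by the Bison-generated parser file; declared by
   hand here (assumed value) so that the sketch stands alone.  */
enum { INT = 258 };

/* Hypothetical input source: a null-terminated string.  */
static const char *input = "";

int
yylex (void)
{
  /* Skip blanks between tokens.  */
  while (*input == ' ' || *input == '\t' || *input == '\n')
    input++;

  /* Detect end-of-input.  */
  if (*input == '\0')
    return 0;

  /* A run of digits is one INT token; return the named token type.  */
  if (isdigit ((unsigned char) *input))
    {
      while (isdigit ((unsigned char) *input))
        input++;
      return INT;
    }

  /* Any other character is returned as its own token type.  */
  return (unsigned char) *input++;
}
```

Pointing input at the text "12 + 7" and calling yylex repeatedly would yield INT, '+', INT, and then zero on every further call.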

This interface has been designed so that the output from the lex utility can be used without change as the definition of yylex.

If the grammar uses literal string tokens, there are two ways that yylex can determine the token type codes for them:

- If the grammar defines symbolic token names as aliases for the literal string tokens, yylex can use these symbolic names like all others. In this case, the use of the literal string tokens in the grammar file has no effect on yylex.

- yylex can find the multicharacter token in the yytname table. The index of the token in the table is the token type's code. The name of a multicharacter token is recorded in yytname with a double-quote, the token's characters, and another double-quote. The token's characters are escaped as necessary to be suitable as input to Bison.

Here's code for looking up a multicharacter token in yytname, assuming that the characters of the token are stored in token_buffer, and assuming that the token does not contain any characters like ‘"’ that require escaping.

for (i = 0; i < YYNTOKENS; i++)
  {
    if (yytname[i] != 0
        && yytname[i][0] == '"'
        && ! strncmp (yytname[i] + 1, token_buffer,
                      strlen (token_buffer))
        && yytname[i][strlen (token_buffer) + 1] == '"'
        && yytname[i][strlen (token_buffer) + 2] == 0)
      break;
  }

The yytname table is generated only if you use the %token-table declaration. See section Bison Declaration Summary.
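The lookup loop shown earlier can be wrapped in a function, as in the sketch below. The helper name find_string_token is hypothetical, and the yytname table and YYNTOKENS value are hand-written stand-ins with invented entries; a real parser would get both from Bison. The sketch keeps the same assumption as the loop: the token contains no characters that require escaping.

```c
#include <string.h>

/* Stand-ins for what Bison would generate with %token-table; the
   entries and their order are invented for illustration.  */
#define YYNTOKENS 5
static const char *const yytname[] =
{
  "$end", "error", "INT", "\"<=\"", "\"==\"", 0
};

/* Hypothetical helper: return the index in yytname of the literal
   string token whose characters are in token_buffer, or -1 if there
   is no such entry.  */
static int
find_string_token (const char *token_buffer)
{
  size_t len = strlen (token_buffer);   /* computed once, not per test */
  int i;

  for (i = 0; i < YYNTOKENS; i++)
    {
      if (yytname[i] != 0
          && yytname[i][0] == '"'                 /* opening quote */
          && strncmp (yytname[i] + 1, token_buffer, len) == 0
          && yytname[i][len + 1] == '"'           /* closing quote */
          && yytname[i][len + 2] == 0)            /* nothing after it */
        return i;
    }
  return -1;
}
```

Returning -1 makes the no-match case explicit, whereas the bare loop signals it by leaving i equal to YYNTOKENS. With the invented table above, a token_buffer holding <= would be found at index 3, and one holding only < would not be found, because the closing double-quote must follow the token's characters immediately.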
