How can I use the flex lexer in C++ and modify a token's yytext value? Let's say I have a rule like this:

"/*"    {
        int c;  /* int, not char, so the EOF comparison is reliable */
        while(true)
            {
            c = yyinput();
            if(c == '\n')
                ++mylineno;

            if (c==EOF){
                yyerror( "EOF occurred while processing comment" );
                break;
            }
            else if(c == '*')
                {
                if((c = yyinput()) == '/'){
                    return(tokens::COMMENT);}
                else
                    unput(c);
                }
            }
        }

I want to get the token tokens::COMMENT with the text of the comment between /* and */ as its value. (The above solution gives "/*" as the value.)

Additionally, tracking the line number is very important, so I'm looking for a solution that supports it.

EDIT: Of course I can modify the yytext and yyleng values (like yytext+=1; yyleng-=1), but that still does not solve the above problem.

  • Do you have a parser taking tokens from this lexer, or is it a lexer only? You can solve this easily in the parser. Commented Apr 5, 2013 at 14:30
  • I really would like to solve it in the lexer - is it possible? Commented Apr 5, 2013 at 14:41
  • Check out this existing answer: stackoverflow.com/a/2130124/1003855 Commented Apr 5, 2013 at 14:45
  • Sorry, but that does not answer my question :( Commented Apr 5, 2013 at 14:49
  • @danilo2 OK. Then how do you handle strings? Do you have some pool to store them in? Or show me how you recognize a string literal. Commented Apr 5, 2013 at 14:50

1 Answer

I still think start conditions are the right answer.

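%{
#include <cstdlib>
#include <cstring>

char *str = NULL;

/* Append data to the comment text collected so far. */
void addToString(const char *data)
{
    if(!str)
    {
        str = strdup(data);
    }
    else
    {
        /* grow the buffer and append the new piece */
        char *tmp = (char *)realloc(str, strlen(str) + strlen(data) + 1);
        if(tmp)
        {
            str = tmp;
            strcat(str, data);
        }
    }
}
%}

%x C_COMMENT

%%

"/*"                       { BEGIN(C_COMMENT); }
<C_COMMENT>([^*\n\r]|(\*+([^*/\n\r])))+    { addToString(yytext); }
<C_COMMENT>"*"             { addToString(yytext); /* a '*' that does not start the closing delimiter */ }
<C_COMMENT>[\n\r]          { /* handle tracking, add to string if desired */ }
<C_COMMENT>"*/"            { BEGIN(INITIAL); }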

I used the following as references:
http://ostermiller.org/findcomment.html
https://stackoverflow.com/a/2130124/1003855

You should be able to use a similar regular expression to handle strings.
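
The question also asks for the comment text and the line number together. As a rough illustration of that bookkeeping, here is a minimal sketch in plain C++ that does not depend on flex. The helper scanCommentBody and its parameters are hypothetical names made up for this example. It reads from a std::string instead of the scanner's input, so inside a real flex action the characters would come from yyinput() (or from yytext in the start-condition rules) and lineno would be the question's mylineno.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper, not part of flex.
// Scans a comment body from `src`, starting at `pos`, which must point just
// past the opening "/*". On success `pos` ends up just past the closing "*/",
// `body` holds the text between the delimiters, and `lineno` has been
// incremented once per newline seen inside the comment.
// Returns false if the input ends before "*/" is found.
bool scanCommentBody(const std::string &src, std::size_t &pos,
                     std::string &body, int &lineno)
{
    assert(pos <= src.size());
    body.clear();
    while (pos < src.size()) {
        char c = src[pos++];
        if (c == '*' && pos < src.size() && src[pos] == '/') {
            ++pos;          // consume the '/' of the closing delimiter
            return true;
        }
        if (c == '\n')
            ++lineno;       // keep the line count in step with the input
        body += c;          // everything else belongs to the comment text
    }
    return false;           // unterminated comment
}
```

Called on "/* a\n b **/ x" with pos set to 2 and lineno set to 1, this leaves " a\n b *" in body and 2 in lineno. In a scanner, the collected body would then be stored wherever the parser expects the token value before returning tokens::COMMENT.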

3 Comments

Your solution does not track the line number, so you cannot get position information from it.
If that's a requirement, then I will modify my code to support it.
Yes, it is (I updated the question), so if you know the answer, I'd be grateful for it.
