
I'm writing a compiler in C and need to get the ASCII value of a character defined in a source code file. For normal letters this is simple, but is there any way to convert the string "\n" to the ASCII number for '\n' in C (it needs to work for all characters)?

Cheers

4 Answers

3

If the string is one character long, you can just index it:

char *s = "\n";
int ascii = s[0];

However, if you are on a system where the character set used is not ASCII, the above will not give you an ASCII value. If you need to make sure your code runs on such rare machines, you can build yourself an ASCII table and use that.
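As a rough sketch of the "build yourself an ASCII table" idea: the printable ASCII characters can be listed in code order in a string, so that a character's position in that string gives its ASCII value regardless of the host character set. The names `ascii_printable` and `to_ascii` are made up for this illustration, and only a handful of control characters are covered.

```c
#include <string.h>

/* Printable ASCII characters in code order, starting at 32 (space). */
static const char ascii_printable[] =
    " !\"#$%&'()*+,-./0123456789:;<=>?@"
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_`"
    "abcdefghijklmnopqrstuvwxyz{|}~";

/* Hypothetical helper: returns the ASCII code of c, or -1 if it is unknown. */
int to_ascii(char c)
{
    const char *p;

    switch (c) {            /* common control characters */
    case '\0': return 0;
    case '\a': return 7;
    case '\b': return 8;
    case '\t': return 9;
    case '\n': return 10;
    case '\v': return 11;
    case '\f': return 12;
    case '\r': return 13;
    }
    p = strchr(ascii_printable, c);
    return p ? 32 + (int)(p - ascii_printable) : -1;
}
```

On an ASCII machine `to_ascii(s[0])` gives the same result as `s[0]`; the table only matters when the host character set is something else.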

If on the other hand, you have two characters, i.e.,

char *s = "\\n";

then you can do something like this:

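/* Returns the character denoted by s, which holds either a plain
   character or a backslash followed by an escape letter. */
int unescape(const char *s)
{
    char c = s[0];
    if (c == '\\') {
        c = s[1]; /* assume s is long enough */
        switch (c) {
            case 'n': return '\n';
            case 't': return '\t';
            /* ... */
            default: return c;
        }
    }
    return c;
}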

The above assumes that your current compiler knows what '\n' means. If it doesn't, then you can still do it. For finding out how to do so, and a fascinating story, see Reflections on Trusting Trust by Ken Thompson.
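Since the question asks for something that works on all characters, here is one possible extension of the switch above. It is only a sketch, and it assumes the language being compiled uses C-like `\xHH` escapes with at most two hex digits, which may not match your language. The names `decode_char` and `hex_value` are hypothetical. The function also reports how many source characters it consumed, so a lexer can advance past the escape.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Value of one hex digit; only call this when isxdigit(c) is true. */
static int hex_value(char c)
{
    const char *digits = "0123456789abcdef";
    const char *p = strchr(digits, tolower((unsigned char)c));
    return (int)(p - digits);
}

/*
 * Hypothetical helper: decode one character at s, store its value in *out,
 * and return the number of source characters consumed.
 */
size_t decode_char(const char *s, int *out)
{
    if (s[0] != '\\' || s[1] == '\0') {  /* plain character or lone backslash */
        *out = (unsigned char)s[0];
        return 1;
    }
    if (s[1] == 'x' && isxdigit((unsigned char)s[2])) {
        size_t i = 2;
        int value = 0;
        while (i < 4 && isxdigit((unsigned char)s[i])) {  /* assumed: max two digits */
            value = value * 16 + hex_value(s[i]);
            i++;
        }
        *out = value;
        return i;
    }
    switch (s[1]) {
    case 'n': *out = '\n'; break;
    case 't': *out = '\t'; break;
    default:  *out = (unsigned char)s[1]; break;  /* e.g. an escaped backslash or quote */
    }
    return 2;
}
```

Returning the consumed length instead of only the value is a design choice: it lets the caller walk a whole string literal by repeatedly calling the function and skipping ahead.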


1 Comment

Yeah, the question wasn't clear, but I have updated my answer to cover that case too. Thanks!
1

I'm writing a compiler in C

It's probably not a good idea to do it all in raw C. It's far better to use something like Bison to handle the initial parsing.

That said, the best way of handling \* escapes is just to have a lookup table of what each escape turns into.
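A lookup table along those lines might look like the following sketch. The struct and function names are invented for illustration, the set of escapes is assumed to be the usual C-style one, and the codes are written as plain ASCII numbers so the table does not depend on what the host compiler thinks '\n' means.

```c
#include <stddef.h>

/* One row per escape: the letter after the backslash and the ASCII code it denotes. */
struct escape {
    char letter;
    int  code;
};

static const struct escape escapes[] = {
    { 'a',  7 },  { 'b',  8 },  { 't',  9 },  { 'n', 10 },
    { 'v', 11 },  { 'f', 12 },  { 'r', 13 },  { '0',  0 },
    { '\\', 92 }, { '\'', 39 }, { '"', 34 },
};

/* Hypothetical helper: ASCII code for the escape "\<letter>", or -1 if it is not in the table. */
int escape_code(char letter)
{
    size_t i;

    for (i = 0; i < sizeof escapes / sizeof escapes[0]; i++) {
        if (escapes[i].letter == letter)
            return escapes[i].code;
    }
    return -1;
}
```

Adding or removing an escape then means editing one table row, and the -1 result gives the compiler a place to report an unknown escape.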


0

You will need to write your own parser/converter. The list of escape sequences can be found online in many places. Parsing C-style syntax is extremely difficult, so you may also wish to check out existing free implementations such as Clang.

1 Comment

Boost.Spirit Qi or Lex might also be a good option for parsing a complex language.
0

You will need to implement this yourself. The reason is that what you are doing is determined by the String literal syntax of the language that you are compiling! (The fact that your compiler is implemented in C is immaterial.)

There are conventional escape sequences for String literals that span multiple languages; e.g. \n typically denotes the ASCII NewLine character. However, that doesn't mean that these conventions are appropriate for the language you are trying to compile.
