What differentiates syntax analysis and semantic analysis?

Question

To my understanding, Parser is composed of three stages of lexical, syntactic and semantic analysis.

Lexical: It will split my input into tokens. Example: 123+100-0 -> 123 + 100 - 0
Syntactic: It will look into the tokens and check if they make sense with each other.

The problem i am having is understanding the final phase "semantic parsing" and how it differentiates to the second stage "syntactic analysis". To my understanding the final stage "semantic parsing" also validates the tokens that has been validated by "syntactic analysis" and then prints out the output.

Welcome to StackOverflow, please read this guide to ask good questions :). What is your request here ? — totok
– totok, Commented Jul 1, 2020 at 13:36
Please read How to Ask - SO is not a discussion forum but lives to collect a Q&A as resource for devs. Your question does not seem to pose any special question. — Patrick Artner
– Patrick Artner, Commented Jul 1, 2020 at 13:37

templatetypedef · Accepted Answer · 2020-07-01 21:03:23Z

Generally speaking, the goal of syntax analysis is to turn a stream of tokens (generated by lexical analysis) into some sort of structured representation of what the program "means." For example, in syntax analysis, you'd expect the tokens a, /, and b to be converted into something that says "the quotient of a and b." Typically, the output of syntax analysis is an abstract syntax tree, which is a tree structure encoding the structure of the program.

The job of semantic analysis is to make sure that the representation generated by the syntax analysis step actually "makes sense." For example, suppose that the syntax analyzer has converted a, /, and b into "the quotient of a and b." At the time the syntax analyzer did this, it probably had no conception of what the types of a and b were; more likely, the syntax analyzer simply saw that this could be interpreted as a mathematical expression and left things at that. However, it's entirely possible that this is not a well-defined operation. For example, if a is a string and b is a boolean, then a / b is a nonsense operation and the compiler should report an error.

For a more elaborate example, suppose you're implementing a Java compiler and you see the tokens public, class, A, extends, B, {, }. You could interpret this to be a class definition during syntax analysis, since it's a syntactically-valid class. But what if B, which is defined later in the file, isn't a class? Or what if B isn't defined at all? In that case, the code is incorrect. It's syntactically valid in the sense that it conceivably could be legitimate code, but it's semantically incorrect in that there isn't a class B to be found. The syntax analyzer would therefore produce a totally valid AST, while the semantic analyzer would then reject the code after it found that B didn't exist.

There's no fundamental reason why we have to keep these two steps separated, and in fact some compilers meld the two together. The main reason for doing this is practical - we have some great tools for building syntax analyzers (LR parsers, LL parsers, etc.), and they work by interpreting sequences of tokens without thinking too much about what they mean. The compiler author can then specify the general syntax of the language, then write a semantic analyzer as a second pass that looks over the AST and determines whether it's valid. This makes the separation between what the parser generators can do automatically and what the programmer needs to write code for cleaner.

Hope this helps!

Matt Timmermans · Accepted Answer · 2020-07-01 21:08:56Z

Theoretically, syntactic analysis determines whether or not an instance of the language is "well formed" and analyzes its grammatical structure, while semantic analysis analyzes its meaning and whether or not it "makes sense".

Basically, syntactic analysis may depend on the types of words, but not their meaning. Semantic analysis depends on the meanings. "I like green tomatoes" and "I fill wiggly thougts" have the same syntactic structure and are both well-formed, but the latter makes no sense semantically.

This theoretical distinction is actually a little blurry, because separating a word's "type" from its "meaning" involves some arbitrary choices.

In the practice of compiler development, however, the distinction is clear: Syntactic analysis is performed by the parser, driven by the grammar, depending on the types of the tokens. Semantic analysis starts with the actions, written in code, attached to the rules in the grammar. Basically anything that needs to look up an actual identifier or evaluate the actual value of a literal value, is semantic analysis.

Some languages require the results of semantic analysis to be fed back into the parser to help with syntactic analysis. This ruins the theoretical distinction between syntactic and semantic analysis, but practically we just recognize this as a bit of a cheat and continue to refer to the parsing code as syntactic.

Collectives™ on Stack Overflow

What differentiates syntax analysis and semantic analysis?

2 Answers 2

Comments

1 Comment

Your Answer

Linked

Hot Network Questions

Collectives™ on Stack Overflow

2 Answers 2

Comments

1 Comment

Your Answer

Sign up or log in

Post as a guest

Linked

Related