3

I have a school project to develop a static analyzer in C for C.

Where should I start? What are some resources which could assist me?

I am assuming I will need to parse C, so what are some good parsers for C or tools for building C parsers?

3
  • What's a statistic analyzer for C? What does it analyze? Commented Feb 7, 2011 at 1:36
  • What level of static analysis are you being asked to tackle? Are we talking building a lint? Or something more robust. Filling out the details to a degree would help us. Commented Feb 7, 2011 at 2:10
  • @Jordan - I was asked to make a general one, e.g. an analyzer for buffer overflows and arithmetic overflows. Commented Feb 7, 2011 at 2:46

4 Answers

2

I would first head over to ANTLR and look at its getting-started guide; it has a wealth of information about parsing. I personally use ANTLR because it offers a choice of code generation targets.

To use ANTLR you need a C or C++ grammar file; pick one of these up and start playing.

Anyway, have fun with it.


1

Probably your best starting point would be Clang (with the proviso that it already has a static analyzer, so unless you want to write one for its own sake, you might be better off using or enhancing the existing one).

3 Comments

It is a school project, OP might need to start from scratch.
@Chris Lutz: It could well be. Without knowing the assignment, it's hard to guess exactly what he needs to do...
Clang is written in C++ so might not be allowed for this project.
1

Are you sure that you want to write the analyzer in C?

If you were using a modern language (e.g. C#, Java, Python), then I would second spgennard's suggestion of ANTLR for the parser.

If writing the analyzer in C is a requirement then you are stuck with lex and yacc (flex and bison) or maybe a hand-crafted parser.
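To give a feel for the hand-crafted route, here is a minimal sketch of a hand-written tokenizer in C. The names `Token`, `TokenKind`, and `next_token` are made up for illustration, and it only recognizes identifiers, decimal integers, and single-character punctuation; a real C lexer also has to handle strings, comments, multi-character operators, floating-point literals, and so on.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical token kinds for a toy subset of C. */
typedef enum { TOK_EOF, TOK_IDENT, TOK_NUMBER, TOK_PUNCT } TokenKind;

typedef struct {
    TokenKind kind;
    const char *start; /* points into the source buffer, not copied */
    size_t len;        /* length of the token text in bytes */
} Token;

/* Scan one token starting at *cursor and advance *cursor past it. */
Token next_token(const char **cursor)
{
    assert(cursor != NULL && *cursor != NULL);
    const char *p = *cursor;

    /* Skip whitespace between tokens. */
    while (isspace((unsigned char)*p))
        p++;

    Token t = { TOK_EOF, p, 0 };
    if (*p == '\0') {
        *cursor = p;
        return t;
    }

    if (isalpha((unsigned char)*p) || *p == '_') {
        t.kind = TOK_IDENT;
        while (isalnum((unsigned char)*p) || *p == '_')
            p++;
    } else if (isdigit((unsigned char)*p)) {
        t.kind = TOK_NUMBER;
        while (isdigit((unsigned char)*p))
            p++;
    } else {
        /* Simplification: every other character is a one-byte token. */
        t.kind = TOK_PUNCT;
        p++;
    }

    t.len = (size_t)(p - t.start);
    *cursor = p;
    return t;
}
```

Calling `next_token` repeatedly on `"buf[10] = x;"` yields an identifier, `[`, a number, `]`, `=`, an identifier, `;`, and then `TOK_EOF`. A parser built on top of this is where most of the work would go.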

Looks like Uno comes close to what you want to do. It uses lex/yacc and includes the grammar files. The analysis part however is written in C++.

Maybe you can get some more ideas about the how and what from tools listed at SpinRoot. Wikipedia also has some good info.

1 Comment

You don't want to write your own parser for C (and you really don't want to do this for C++), unless you want to spend all your time getting the parser right and none of it working on the static analyzer. Get a front end that already does the parsing and the name and type resolution for you.
1

Parsing is the easiest and least important part of a static analyser. ANTLR was already suggested; it should be sufficient for parsing plain C (but not C++). Just a little tip: do not implement your own preprocessor; instead, reuse the output of gcc -E.
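One practical detail when consuming gcc -E output: it contains linemarkers of the form `# 42 "foo.c" 2`, which record the original file and line so that warnings can point at the real source location. Below is a rough sketch of recognizing them; `parse_linemarker` is a hypothetical helper name, and it ignores the trailing flags and does not handle escaped quotes inside file names.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: if `text` is a linemarker such as
 *     # 42 "foo.c" 2
 * store the line number and file name and return 1; otherwise return 0.
 * Simplifications: trailing flags are ignored, file names longer than
 * 255 bytes or containing escaped quotes are not handled. */
int parse_linemarker(const char *text, long *line_no,
                     char *file, size_t file_cap)
{
    assert(text != NULL && line_no != NULL && file != NULL);

    char name[256];
    long n;

    /* Expect: '#', a line number, then a double-quoted file name. */
    if (sscanf(text, "# %ld \"%255[^\"]\"", &n, name) != 2)
        return 0;

    size_t len = strlen(name);
    if (len >= file_cap)
        return 0; /* caller's buffer is too small */

    memcpy(file, name, len + 1);
    *line_no = n;
    return 1;
}
```

The analyzer would call this on each line of the preprocessed text, update its notion of "current file and line" whenever it returns 1, and pass every other line on to the lexer.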

As for the rest, you can take a look at the sources of some existing analysers, namely Clang and CIL, and read about SSA representation and abstract interpretation. Choosing the right intermediate representation for your code is key.

I doubt it can be an easy task in plain C, so you'd probably end up implementing some sort of DSL on top of it to handle ASTs and transforms. Sounds like something much bigger than a typical school project.
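To make the abstract interpretation idea a bit more concrete, here is a minimal sketch of an interval domain in C, one of the simplest abstract domains: each integer variable is approximated by a range [lo, hi]. All names (`Interval`, `interval_join`, `interval_add`, `index_always_in_bounds`) are invented for this illustration, and a usable analysis would also need widening, a representation for "unknown", and a walk over the control-flow graph.

```c
#include <assert.h>
#include <limits.h>

/* Abstract value: all integers v with lo <= v <= hi. */
typedef struct { long lo, hi; } Interval;

Interval interval_make(long lo, long hi)
{
    assert(lo <= hi);
    return (Interval){ lo, hi };
}

/* Join: smallest interval covering both inputs.
 * Used where two control-flow paths merge. */
Interval interval_join(Interval a, Interval b)
{
    return (Interval){ a.lo < b.lo ? a.lo : b.lo,
                       a.hi > b.hi ? a.hi : b.hi };
}

/* Saturating add so the analyzer itself never overflows. */
static long sat_add(long a, long b)
{
    if (b > 0 && a > LONG_MAX - b) return LONG_MAX;
    if (b < 0 && a < LONG_MIN - b) return LONG_MIN;
    return a + b;
}

/* Abstract counterpart of `a + b`. */
Interval interval_add(Interval a, Interval b)
{
    return (Interval){ sat_add(a.lo, b.lo), sat_add(a.hi, b.hi) };
}

/* Returns 1 if every value in idx is a valid index into an array
 * of `len` elements, 0 if some value may be out of bounds. */
int index_always_in_bounds(Interval idx, long len)
{
    return idx.lo >= 0 && idx.hi < len;
}
```

For a loop such as `for (i = 0; i < 10; i++) buf[i + 1] = 0;` with a 10-element `buf`, `i` lies in [0, 9], so `i + 1` lies in [1, 10] and `index_always_in_bounds` returns 0. That is the kind of finding a buffer overflow check would report.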
