
I would like to parse a list of Haskell statements. For instance, suppose I have the following code:

let a = b
    c = e
out <- return 3

I'd like a function, for instance parseStmts, which can return this in some parsed format.

I've looked into haskell-src-exts and found parseStmt, which works for a single statement. It has type parseStmt :: String -> ParseResult Stmt, and parseStmt "let a = 3" returns a successful ParseOk. However, if you pass it a string containing more than one statement, the parse fails.

How do I parse multiple statements, without wrapping them in a do block? Alternatively, how can I find the places in a string which are separations of Haskell statements, so I can separate them and then use parseStmt from haskell-src-exts?

Thanks!

2 Answers


You can use parseExp if you wrap the statements in a do block, although the output is a bit large:

> :m + Language.Haskell.Exts.Parser
> parseExp "do\n  let a = b\n      c = e\n  out <- return 3\n  return $ a + c + out"
ParseOk (Do [LetStmt (BDecls [PatBind (SrcLoc {srcFilename = "<unknown>.hs", srcLine = 2, srcColumn = 7}) (PVar (Ident "a")) Nothing (UnGuardedRhs (Var (UnQual (Ident "b")))) (BDecls []),PatBind (SrcLoc {srcFilename = "<unknown>.hs", srcLine = 3, srcColumn = 7}) (PVar (Ident "c")) Nothing (UnGuardedRhs (Var (UnQual (Ident "e")))) (BDecls [])]),Generator (SrcLoc {srcFilename = "<unknown>.hs", srcLine = 4, srcColumn = 3}) (PVar (Ident "out")) (App (Var (UnQual (Ident "return"))) (Lit (Int 3))),Qualifier (InfixApp (Var (UnQual (Ident "return"))) (QVarOp (UnQual (Symbol "$"))) (InfixApp (InfixApp (Var (UnQual (Ident "a"))) (QVarOp (UnQual (Symbol "+"))) (Var (UnQual (Ident "c")))) (QVarOp (UnQual (Symbol "+"))) (Var (UnQual (Ident "out")))))])

I had to add the return $ a + c + out to the end, or else the parse fails: the last statement of a do block must be an expression, so the block would not be valid without it.
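The wrapping itself can be automated with plain string manipulation. The following is only a minimal sketch: wrapInDo is a hypothetical helper, not part of haskell-src-exts, and it assumes the statements start in column 0 and are indented with spaces rather than tabs. It appends return () as a dummy final expression.

```haskell
-- Hypothetical helper (not part of haskell-src-exts): wrap bare statements
-- in a synthetic do block so that the text forms a single expression.
-- Assumes the input is indented with spaces, not tabs.
wrapInDo :: String -> String
wrapInDo src = unlines ("do" : map indent (lines src) ++ ["  return ()"])
  where
    -- Shift every non-blank line right by two spaces, which preserves
    -- the relative layout of continuation lines.
    indent line
      | all (== ' ') line = line
      | otherwise         = "  " ++ line

main :: IO ()
main = putStr (wrapInDo "let a = b\n    c = e\nout <- return 3\n")
```

The resulting string can then be passed to parseExp. The trailing return () shows up as an extra final statement in the parsed Do node, so it has to be dropped again afterwards.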


4 Comments

I considered this as a solution, but it requires adding a do, adding a return at the end, and indenting everything for the do. It's doable (get it? do-able?), but kinda ugly, and was hoping for a nicer solution.
@AndrewGibiansky You want to parse correct Haskell source code, right? If you don't indent properly and add the return statement, then you can't construct an AST from it. Or do you instead want to parse the code into a less formal structure than the AST?
@AndrewGibiansky I'm not too familiar with this library, but it looks like you could use the lexTokenStream function from the Language.Haskell.Exts.Lexer module. It will turn the code into tokens, though those may still be unwieldy to work with. This looks like a job for lenses.
I think I'm going to go ahead and use this hack. I need this to parse things that I'm feeding to GHCi. Given a block of text, I need to extract the statements, separate them, and then feed each one to GHCi to evaluate it - and to do this I need to be able to separate the statements. Thanks!
0

I don't think haskell-src-exts offers a ready-made function doing what you want, so one way or another you're going to have to write some of your own parsing code. That said, not all is lost. You may have to hack on haskell-src-exts itself to expose a few of its internals, but it should not be unduly difficult to throw together -- a few hours of work to get something decent if you're already familiar with whatever parsing technology it uses (alex/happy, I think?), or double it if you have to learn the parsing technology, too.

I'm sure some patches to the package to make this kind of thing easier would be welcomed with open arms, as well.
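As a rough starting point for such hand-written code, and without touching the library, statement boundaries can be approximated from layout alone. This is a heuristic sketch, not a real parser: splitStatements is a hypothetical name, and it assumes every statement starts in column 0, continuation lines are indented, and no explicit braces or semicolons are used.

```haskell
-- Hypothetical helper: split a snippet into statements using layout only.
-- A non-blank line starting in column 0 opens a new statement; indented
-- lines are attached to the statement before them. Blank lines are dropped.
splitStatements :: String -> [String]
splitStatements src = go (filter (not . all (== ' ')) (lines src))
  where
    go [] = []
    go (l : ls) =
      let (continuation, rest) = span isIndented ls
      in unlines (l : continuation) : go rest

    -- A line is a continuation if it starts with a space.
    isIndented (c : _) = c == ' '
    isIndented []      = False

main :: IO ()
main = mapM_ print (splitStatements "let a = b\n    c = e\nout <- return 3\n")
```

Each chunk can then be handed to parseStmt individually to confirm that it really is a single statement. The heuristic will mis-split input such as an unindented else branch or a comment in column 0, so it should be treated as an approximation.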
