
Questions tagged [lexer]

A lexer is a program that performs lexical analysis: it converts a sequence of characters into a sequence of tokens.
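
As a rough illustration of that definition, here is a minimal sketch of a regex-driven lexer in Python. The token names and patterns (`NUMBER`, `ID`, `OP`, and so on) are assumptions chosen for the example, not part of any particular language or tool.

```python
import re
from typing import Iterator, NamedTuple


class Token(NamedTuple):
    kind: str  # token category, e.g. "NUMBER"
    text: str  # the exact characters that were matched


# Hypothetical token rules; alternatives are tried in this order.
TOKEN_SPEC = [
    ("NUMBER", r"[0-9]+"),
    ("ID", r"[A-Za-z_][A-Za-z0-9_]*"),
    ("LPAREN", r"\("),
    ("RPAREN", r"\)"),
    ("OP", r"[+\-*/=]"),
    ("SKIP", r"\s+"),  # whitespace is matched but never emitted
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(source: str) -> Iterator[Token]:
    """Turn a string of characters into a stream of tokens."""
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at offset {pos}")
        pos = match.end()
        if match.lastgroup != "SKIP":
            yield Token(match.lastgroup, match.group())
```

Calling `list(tokenize("x = (42 + y)"))` yields seven tokens, starting with `Token("ID", "x")` and ending with `Token("RPAREN", ")")`. Real lexers usually also record line and column positions on each token.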

2 votes
5 answers
2k views

We are using a Software as a Service platform that allows creating custom code that integrates with the platform and all its features (dialogues for common objects like Account, Customer, Address, and ...
surfmuggle's user avatar
0 votes
1 answer
112 views

<Definition> ::= <Name> <LeftPar> <param> <RightPar> <Name> ::= <Letter><LetterTail> <LetterTail> ::= <Letter><LetterTail> | ‘’ A ...
User's user avatar
  • 11
0 votes
2 answers
679 views

Aside from modes, ANTLR grammars can use "actions", which have to be written in the target language and are sometimes used to conditionally push and pop from the mode stack. If I were to make a ...
SeriousBusiness100's user avatar
-1 votes
1 answer
188 views

For example, keywords have a special prefix. Objective-C has @interface, @implementation, but that's for compatibility with C. It inherits all the C keywords of course, with no @. How about a language ...
Eugene's user avatar
  • 117
0 votes
1 answer
766 views

I created a simple parser in Rust and defined the AST like this: enum Expression { Number(i32), BinaryOperator(Box<Expression>, Operator, Box<Expression>), Identifier(String), }...
Iter Ator's user avatar
  • 111
-1 votes
1 answer
726 views

So lexers are supposed to emit tokens for key structures like INDENT and DEDENT for indentation stuff, or these: NUMBER ::= [0-9]+ ID ::= [a-Z]+, except for keywords IF ::= 'if' LPAREN ::= '(' ...
Lance Pollard's user avatar
0 votes
2 answers
407 views

So I'm trying to write an interpreter with a lexer. Currently, it adds a token line by line and does some more processing later on. But when I look at sources online, they all seem to go word by word ...
StandingPad Animations's user avatar
15 votes
3 answers
4k views

I'm wondering how to effectively test a lexer (tokenizer). The number of combinations of tokens in a source file can be huge, and the only way I've found is to make a batch of representative source ...
SuperJMN's user avatar
  • 453
0 votes
0 answers
488 views

I'm working on a feature where users can get data based on the if statement they write. The if statement looks something like Excel's conditionals. Basic syntax: IF ( lhs == rhs, ifTrue, ifFalse)...
Shahlin Ibrahim's user avatar
1 vote
4 answers
3k views

This is kinda like a concrete version of the question Coming up with tokens for a lexer. I'm writing a lexer for a small subset of HTML. I'm wondering what I should do when the input stream ends and ...
Lazar Ljubenović's user avatar
0 votes
1 answer
420 views

When writing a lexer/parser, why/when would an advised developer choose to define the tokens' types through an enumeration field/type hierarchy? The closest question I've found here so far was Lexing: ...
0 votes
2 answers
447 views

This may be a stupid question, and it would certainly take one Hell of a lexer, but do any extant programming languages allow you to do something like: c# (1.2) { // c# code } Perl (5) { # ...
CoryG's user avatar
  • 125
-2 votes
1 answer
214 views

Lately I've been playing with writing my own programming language, following the excellent Crafting Interpreters book, but I've hit something of a snag. I'd like to extend the parser to accept ...
FreeMemory's user avatar
0 votes
1 answer
442 views

For a couple of months now I've been writing an interpreter/compiler for a programming language in C#. I have encountered some issues recently which make the code feel incorrectly written. Classes change a ...
downrep_nation's user avatar
3 votes
3 answers
591 views

I started developing a simulator for distributed algorithms in C. My work consists of creating a simple language for algorithm description and a simulator which takes the described ...
M.Puk's user avatar
  • 133
125 votes
4 answers
26k views

I've taken a deep dive into the world of parsers recently, wanting to create my own programming language. However, I found out that there exist two somewhat different approaches to writing parsers: ...
Qqwy's user avatar
  • 4,947
16 votes
1 answer
3k views

While reading through an answer to the question Clarification about Grammars, Lexers and Parsers, the answer stated that: [...] a BNF grammar contains all the rules you need for lexical analysis ...
Chris's user avatar
  • 2,860
4 votes
1 answer
256 views

Wikipedia says that the lexical process is often divided into two phases: the scanning process and the evaluation process. Wikipedia defines the scanning process as: The first stage, the scanner, ...
Chris's user avatar
  • 2,860
5 votes
1 answer
677 views

Could you explain how parsers search for token patterns like in markdown? I probably could come up with something matching only the braces pattern []() but as soon as nested patterns are involved it ...
t3chb0t's user avatar
  • 2,601
0 votes
3 answers
164 views

I want to read data in a format like the following using Java. [scenario] id=my_first_scenario next_scenario=null name=_"My First Scenario." map_data="{~add-ons/my_first_campaign/...
Can't Tell's user avatar
  • 1,191
5 votes
1 answer
2k views

tl;dr-ers: How does a lexer normally deal with non-inline statements, i.e. statements that do not end with a specified statement delimiter, such as control flow statements? I believe that I have a fairly ...
Chris's user avatar
  • 2,860
4 votes
2 answers
7k views

I have recently caught the 'Toy Language' bug, and have been experimenting with various simple tokenizer configurations. The most recent one makes use of the Boost.Regex library to identify and get ...
Chris's user avatar
  • 2,860
23 votes
3 answers
10k views

As said in the title, which data type should a lexer return/give the parser? When reading the lexical analysis article that Wikipedia has, it stated that: In computer science, lexical analysis is ...
Chris's user avatar
  • 2,860
0 votes
1 answer
737 views

I'm playing around with creating a parser in PHP for my own flavor of BNF, to match strings against grammar in this BNF variant. It's still a work in progress and subject to change (I may even end up ...
Decent Dabbler's user avatar
-2 votes
1 answer
777 views

I'm really interested in writing my own converter. I know C++/Python/Objective-C/Swift and a little Haskell. There are websites like objectivec2swift and iswift.org, which can convert OC to Swift ...
Tiper's user avatar
  • 15
3 votes
6 answers
2k views

I'm really interested in writing my own general-purpose high-level programming language, but I'm somewhat confused. I know that Python and Ruby were written in C, which makes me wonder: if I want ...
Ericson Willians's user avatar
1 vote
2 answers
671 views

I have various objects inside my AST, such as IfBlock, FunctionBlock, LogicExpression, etc. All of those objects share a context, which is basically a hashmap with some variables. It's a very simple ...
vinnylinux's user avatar
8 votes
4 answers
7k views

I've always wanted to learn how to write a compiler. I've decided to use ANTLR, and am currently reading through the book (it's very good, by the way). I'm pretty new to this, so go easy, but the gist ...
phatmanace's user avatar
  • 2,485
26 votes
6 answers
7k views

I'm slowly working to finish my degree, and this semester is Compilers 101. We're using the Dragon Book. Shortly into the course, we're talking about lexical analysis and how it can be implemented ...
Telastyn's user avatar
  • 110k
4 votes
2 answers
750 views

How does a lexer/parser work in a 2D programming language like Funciton in order to transform such unusual source code into the correct AST?
53777A's user avatar
  • 1,718
6 votes
1 answer
2k views

Is it a lexer's job to undo any escaping done to a string literal? For example: "Me: \"Hello World!\"" Becomes: Me: "Hello World!" Should this conversion be done inside the lexer? I am guessing it ...
Jeroen's user avatar
  • 613
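
One possible approach to the escaping question above, sketched in Python under stated assumptions: the lexer decodes escape sequences while it scans the literal, so the token carries the unescaped value. The function name `lex_string` and the small escape table are hypothetical, and whether decoding belongs in the lexer at all is a design choice rather than a settled rule.

```python
from typing import Tuple

# Assumed escape table for this sketch; a real language defines its own set.
ESCAPES = {'"': '"', "\\": "\\", "n": "\n", "t": "\t"}


def lex_string(source: str, start: int = 0) -> Tuple[str, int]:
    """Scan a double-quoted literal that begins at source[start].

    Returns the decoded value and the index just past the closing quote.
    """
    if start >= len(source) or source[start] != '"':
        raise SyntaxError("expected opening quote")
    chars = []
    pos = start + 1
    while pos < len(source):
        ch = source[pos]
        if ch == '"':  # an unescaped quote ends the literal
            return "".join(chars), pos + 1
        if ch == "\\":  # a backslash introduces an escape sequence
            pos += 1
            if pos >= len(source) or source[pos] not in ESCAPES:
                raise SyntaxError(f"bad escape at offset {pos}")
            chars.append(ESCAPES[source[pos]])
        else:
            chars.append(ch)
        pos += 1
    raise SyntaxError("unterminated string literal")
```

With the input from the question, `lex_string(r'"Me: \"Hello World!\""')` returns the decoded text `Me: "Hello World!"` together with the position after the closing quote. Keeping the raw lexeme alongside the decoded value can help with error messages and source-to-source tools.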
5 votes
4 answers
2k views

When lexing, what would be the best way to tokenize operators? Would one just create a BinaryOperator token, or a separate token for every single binary operator? Examples: PlusOperator, MinusOperator,...
Jeroen's user avatar
  • 613
0 votes
2 answers
801 views

I am currently implementing a lexer that breaks XML files up into tokens. I'm considering ways of passing the tokens on to a parser to create a more useful data structure out of said tokens - my ...
The_Neo's user avatar
  • 201
0 votes
0 answers
975 views

I want to build a template engine (ITT not another template engine...) based on Razor. I've been at it for quite a long time not getting anywhere and quite frankly I'm at my limit. I've tried rolling ...
Daryl Teo's user avatar
  • 111
7 votes
2 answers
2k views

I'm trying to understand the theory behind a lexer with the purpose of building one (just for my own fun and experience and to compensate for not taking proper CS courses :)). What I have yet to ...
JohnDoDo's user avatar
  • 2,319
9 votes
3 answers
2k views

Background info (May Skip): I am working on a task we have been set at uni in which we have to design a grammar for a DSL we have been provided with. The grammar must be in BNF or EBNF. As well as ...
The_Neo's user avatar
  • 201
6 votes
3 answers
3k views

I'm aware that most modern languages use reserved words to prevent things like keywords from being used as identifiers. Reserved words aside, let's assume a language that allows keywords to be used ...
jhewlett's user avatar
  • 2,314
4 votes
3 answers
731 views

I'm writing a lexer in JavaScript. It's pretty typical - rules are specified with regular expressions and produce a token. I am unsure of the best way to handle when multiple rules are matched. The ...
jhewlett's user avatar
  • 2,314
7 votes
1 answer
1k views

I'm in the planning stage of making a code beautifier (similar to AStyle or Uncrustify) - originally I was going to just contribute to one of those projects, but reviewing their source led me to the ...
Matt Kline's user avatar
1 vote
4 answers
2k views

This extends off this other Q&A thread, but is going into details that are out of scope from the original question. I am generating a parser that is to parse a context-sensitive grammar which can ...
Adrian's user avatar
  • 157
-3 votes
2 answers
368 views

I was creating a language and discovered that my language tokenizer would have to change depending on where in the parse it is. I.e. abc[1] would be parsed as 4 tokens (abc, [, 1, ]), whereas { abc[1] }...
Adrian's user avatar
  • 157
3 votes
1 answer
1k views

Are there any practical references (with actual examples) for getting started implementing a small, lazy functional programming language with graph reduction? A reference that included the lexing and ...
3 votes
1 answer
830 views

In http://www.json.org Douglas Crockford shows the specs of the JSON format in two interesting ways: In the right side column he lists a text spec that looks like a YACC or LEX listing. In the main ...
Sebastián Grignoli's user avatar
2 votes
1 answer
463 views

I'm working on a toy compiler (for some simple language like PL/0) and I have my lexer up and running. At this point I should start working on building the parse tree, but before I start I was ...
adrianton3's user avatar
5 votes
5 answers
7k views

I am going to make a compiler for C, and reading about how compilers work on Wikipedia has told me a lot. However, reading up on lexers has confused me. The Wikipedia page states that: the GNU ...
Cole Tobin's user avatar
  • 1,533
16 votes
5 answers
7k views

I'm writing a parser for a markup language that I have created (writing in Python, but that's not really relevant to this question -- in fact if this seems like a bad idea, I'd love a suggestion for a ...
Explosion Pills's user avatar
9 votes
5 answers
3k views

I've been looking at a few lexers in various higher level languages (Python, PHP, JavaScript among others) and they all seem to use regular expressions in one form or another. While I'm sure regexes ...
Blank's user avatar
  • 253
24 votes
5 answers
9k views

When I began to use parser combinators my first reaction was a sense of liberation from what felt like an artificial distinction between parsing and lexing. All of a sudden everything was just ...
Eli Frey's user avatar
  • 341
20 votes
4 answers
41k views

What are good resources on how to write a lexer in C++ (books, tutorials, documents)? What are some good techniques and practices? I have looked on the internet and everyone says to use a lexer ...