I am writing a tokenizer for files that have an ignored preamble. The files are written in Markdown, and a list of keywords in H1 titles changes the state of the parser. When an EOF is encountered, the state machine goes back to the ignore state and looks for keyword headings again.
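To make the intended behavior concrete, here is a rough sketch of that state machine, separate from the tokenizer itself. The keyword set, the `split_sections` name, and the H1 regex are all assumptions for illustration. The sketch also assumes that a non-keyword H1 inside a section is ordinary content.

```python
import re

# Hypothetical keyword set; the real list of H1 keywords is not shown here.
KEYWORDS = {"CONFIG", "DATA"}

# Matches a Markdown H1 such as "# CONFIG" (but not "## CONFIG").
H1 = re.compile(r"^#\s+(\S.*?)\s*$")


def split_sections(files):
    """Yield (keyword, line) pairs, skipping everything in the ignore state.

    `files` is an iterable of strings, each holding one file's full text.
    """
    for text in files:
        # None means "ignore": still looking for a keyword heading.
        # Resetting it per file is how EOF returns to the ignore state.
        state = None
        for line in text.splitlines():
            match = H1.match(line)
            if match and match.group(1) in KEYWORDS:
                state = match.group(1)  # a keyword H1 switches the state
                continue
            if state is not None:
                yield state, line
```

The preamble of each file is dropped because no line is yielded until the first keyword heading has been seen.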
I have decided to make a generalized package for my future use, so I have written the following peek code:
def peek(self):
    while True:
        if (self.skip_white_space and self.expect_type('WHITE_SPACE') or \
            self.skip_EOF and self.expect_type('EOF')) and \
            not self.is_end():
            self.consume()
            continue
        break
    return None if self.is_end() else self.lines[self.line][self.column]
Is there a better way of doing this? The infinite loop is ugly. The huge if condition is also poo.
I tried to learn Python and apparently failed. I was expecting something that isn't poo.
self.is_end() returns True when you also exit.
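A possible restructuring, sketched under assumptions: the skip test moves into a helper so the loop condition replaces `while True`, `continue`, and `break`. Only `peek` mirrors the code above. The `Reader` class, the `EOF` sentinel, `_normalize`, `_should_skip`, and the bodies of `is_end`, `expect_type`, and `consume` are hypothetical stand-ins so the example runs on its own. It also assumes `expect_type` has no side effects, since `is_end` is now checked first.

```python
EOF = object()  # hypothetical sentinel marking the end of one file


class Reader:
    """Minimal stand-in for the class around peek(); all helpers are assumed."""

    def __init__(self, lines, skip_white_space=True, skip_EOF=True):
        self.lines = lines  # list of lines, each a list of items
        self.line = 0
        self.column = 0
        self.skip_white_space = skip_white_space
        self.skip_EOF = skip_EOF
        self._normalize()

    def _normalize(self):
        # Step over exhausted or empty lines so the position is valid or at the end.
        while (self.line < len(self.lines)
               and self.column >= len(self.lines[self.line])):
            self.line += 1
            self.column = 0

    def is_end(self):
        return self.line >= len(self.lines)

    def expect_type(self, kind):
        if self.is_end():
            return False
        item = self.lines[self.line][self.column]
        if kind == 'EOF':
            return item is EOF
        if kind == 'WHITE_SPACE':
            return isinstance(item, str) and item.isspace()
        return False

    def consume(self):
        self.column += 1
        self._normalize()

    def _should_skip(self):
        # Same condition as the original `if`, with the end check first.
        if self.is_end():
            return False
        return (self.skip_white_space and self.expect_type('WHITE_SPACE')
                or self.skip_EOF and self.expect_type('EOF'))

    def peek(self):
        while self._should_skip():
            self.consume()
        return None if self.is_end() else self.lines[self.line][self.column]
```

Used like this, `peek` skips white space and EOF markers but leaves the current item in place:

```python
r = Reader([[' ', 'a'], [EOF], ['\t', 'b']])
first = r.peek()   # 'a'
r.consume()
second = r.peek()  # 'b'
```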