I am writing a tokenizer for files that have an ignored preamble. The files are written in Markdown, and a list of keywords in H1 titles changes the state of the parser. When an EOF is encountered, the state machine goes back to the ignore state and looks for keyword headings.

I have decided to make a generalized package for my future use, so I have written the following peek code:

def peek(self):
    while True:
        if (self.skip_white_space and self.expect_type('WHITE_SPACE') or \
                self.skip_EOF and self.expect_type('EOF')) and \
                not self.is_end():
            self.consume()
            continue
        break
    return None if self.is_end() else self.lines[self.line][self.column]

Is there a better way of doing this? The infinite loop is ugly. The huge if condition is also poo.

I tried to learn Python and apparently failed. I was expecting something that isn't poo.

  • What exactly is the definition of the methods you call? Can you include the code? More importantly, what is your definition of "poo" in this context? Is there something not working? If it is just a matter of you not liking how it looks, then this question is off topic here. Commented Jun 29, 2024 at 22:37
  • Hi @trincot. Thanks for the quick reply. To some extent I don't like how the code looks. More importantly, I am worried that the infinite loop may actually become infinite in some edge case. I am not sure which edge case to look for, and I am hoping to refactor the code to avoid an infinite loop. My most pressing concern, then, is to figure out how to perform the same task without the infinite loop. Commented Jun 29, 2024 at 23:08
  • The loop is not infinite. On each iteration you either consume a token or exit. If you keep consuming tokens, eventually self.is_end() returns True, at which point you also exit. Commented Jun 29, 2024 at 23:10
  • @trincot - Thanks for the confirmation. I pushed the code to github. Commented Jun 29, 2024 at 23:31

1 Answer

To make your loop more Pythonic, you can rewrite the method this way:

def peek(self):
    def skip_next() -> bool:
        conf = self.configuration
        return (
            (conf['skip_white_space'] and self.expect_type('WHITE_SPACE'))
            or (conf['skip_EOF'] and self.expect_type('EOF'))
        )
    
    while not self.out_of_tokens() and skip_next():
        self.consume()
    
    return (
        None if self.out_of_tokens()
        else self.tokens[self.line][self.column]
    )

Here the while condition does all the work, so continue and break are no longer needed. I also moved your long if condition into a separate helper function to make it easier to read.
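If you want to try the pattern in isolation, here is a minimal, self-contained sketch. It is not your actual class: `Token`, the flat `tokens` list, the `pos` index, and the `_should_skip` helper are all assumptions I made up for illustration, and I kept the `skip_white_space` / `skip_EOF` attribute names from your question. The comment in `peek` also spells out why the loop has to terminate.

```python
from dataclasses import dataclass


@dataclass
class Token:
    type: str        # e.g. 'WORD', 'WHITE_SPACE', 'EOF'
    text: str = ''


@dataclass
class Tokenizer:
    """Hypothetical stand-in for the real class; tokens are a flat list here."""
    tokens: list
    skip_white_space: bool = True
    skip_EOF: bool = False
    pos: int = 0

    def is_end(self):
        return self.pos >= len(self.tokens)

    def expect_type(self, token_type):
        # Returns False at the end of input, so we never index out of range.
        return not self.is_end() and self.tokens[self.pos].type == token_type

    def consume(self):
        self.pos += 1

    def _should_skip(self):
        return (
            (self.skip_white_space and self.expect_type('WHITE_SPACE'))
            or (self.skip_EOF and self.expect_type('EOF'))
        )

    def peek(self):
        # Every iteration consumes exactly one token, so the loop body runs
        # at most len(self.tokens) times before is_end() becomes True.
        while not self.is_end() and self._should_skip():
            self.consume()
        return None if self.is_end() else self.tokens[self.pos]


tokens = [Token('WHITE_SPACE', ' '), Token('WORD', 'hello'), Token('EOF')]
tk = Tokenizer(tokens)
print(tk.peek().text)  # prints "hello"
```

Because the end-of-input check comes first in the while condition, short-circuit evaluation guarantees the skip test is never run once the tokens are exhausted.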

Try it out and let me know if it helps.

1 Comment

Yes. This helped a great deal. I keep forgetting that you can define helper functions within a function. Thank you.
