I'm using the guess_lexer() function from the Pygments library to identify the language of a source code snippet.

This is how I'm using it right now:

from pygments.lexers import guess_lexer
text = "string containing source code"
# guess_lexer() returns an instance of the best-matching lexer class
lexer_subclass = guess_lexer(text)
print(str(lexer_subclass))

Depending on the language detected in the text variable, this prints something like:

<pygments.lexers.PythonLexer>

What I want is only the PythonLexer part. I'm aware that I can get it using string manipulation, but it feels hacky. I want to do it in the correct way.

So I looked at what Pygments does internally and found this method, which is responsible for producing the lexer's string representation:

def __repr__(self):
    if self.options:
        return '<pygments.lexers.%s with %r>' % (self.__class__.__name__,
                                                 self.options)
    else:
        return '<pygments.lexers.%s>' % self.__class__.__name__

Sure enough, if I modify it to return only self.__class__.__name__, I'll get what I want, but that doesn't feel right.

How can I get what I want? Maybe inheriting the class and then overriding the function or something? Any ideas will be appreciated.

1 Answer

The solution turned out to be simple. Every lexer exposes a name attribute, so I only had to use the following:

guess_lexer(text).name
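
One caveat worth noting: as far as I can tell, name holds the lexer's human-readable language name (for PythonLexer that is "Python"), not the class name. If you need the literal PythonLexer string from the question, the class name can be read from the instance without touching __repr__. The sketch below shows both side by side; lexer_names and the sample string are hypothetical names made up for illustration, not part of Pygments.

```python
from pygments.lexers import PythonLexer, guess_lexer


def lexer_names(lexer):
    """Return (class name, human-readable name) for a Pygments lexer instance."""
    # type(lexer).__name__ is the same value __repr__ interpolates
    # via self.__class__.__name__; lexer.name is the display name.
    return type(lexer).__name__, lexer.name


# Hypothetical sample input; the shebang line is only there to help the guesser.
sample = "#!/usr/bin/env python\nimport os\nprint(os.getcwd())\n"
print(lexer_names(guess_lexer(sample)))

# With a lexer constructed directly, the result does not depend on guessing.
print(lexer_names(PythonLexer()))
```

For a directly constructed PythonLexer this gives ('PythonLexer', 'Python'). What guess_lexer returns depends on the input and on the installed Pygments version, so treat the first print as illustrative.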