I'm currently working on a Python project that translates LaTeX mathematical markup into standard Python expressions, for example \frac{a}{b} to a/b.

I went about this in a way that I felt would be the most friendly towards nested equations: recursion. Every equation is broken up into objects and operators, and objects, such as parenthetical statements and LaTeX terms, are evaluated again, until maximum depth is attained.

However, I've hit somewhat of a roadblock with regex when it comes to dismantling certain LaTeX terms with multiple nested parameters, like the one I mentioned above. After fiddling around and googling for an eternity, I ended up with this:

http://regex101.com/r/oO5oG9

Only problem is, I encounter this error when trying to evaluate the exact same term in Python:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python34\lib\re.py", line 206, in findall
    return _compile(pattern, flags).findall(string)
  File "C:\Python34\lib\re.py", line 288, in _compile
    p = sre_compile.compile(pattern, flags)
  File "C:\Python34\lib\sre_compile.py", line 465, in compile
    p = sre_parse.parse(p, flags)
  File "C:\Python34\lib\sre_parse.py", line 746, in parse
    p = _parse_sub(source, pattern, 0)
  File "C:\Python34\lib\sre_parse.py", line 358, in _parse_sub
    itemsappend(_parse(source, state))
  File "C:\Python34\lib\sre_parse.py", line 694, in _parse
    p = _parse_sub(source, state)
  File "C:\Python34\lib\sre_parse.py", line 358, in _parse_sub
    itemsappend(_parse(source, state))
  File "C:\Python34\lib\sre_parse.py", line 694, in _parse
    p = _parse_sub(source, state)
  File "C:\Python34\lib\sre_parse.py", line 358, in _parse_sub
    itemsappend(_parse(source, state))
  File "C:\Python34\lib\sre_parse.py", line 681, in _parse
    raise error("unexpected end of pattern")
sre_constants.error: unexpected end of pattern

I'm not quite sure what the problem is in my regex, and have been changing little things for a while trying to get it to work, to no avail...

3
  • seems you want something like this regex101.com/r/oO5oG9/2 Commented Oct 13, 2014 at 6:39
  • A simpler approach is to replace the innermost expressions first. To do that, you only need to forbid nested curly brackets. Commented Oct 13, 2014 at 6:59
  • Could you add a link to the project? Commented Oct 13, 2014 at 7:41
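The innermost-first idea from the comment above can be sketched with the standard re module alone. This is only an illustration, not the asker's code: the name frac_to_division is made up here, it handles \frac only, and it wraps both arguments in parentheses (an assumption added to keep operator precedence intact, where the question writes plain a/b).

```python
import re

# Matches a \frac whose two arguments contain no braces at all,
# i.e. an innermost \frac. Nested ones are skipped until a later pass.
INNERMOST_FRAC = re.compile(r'\\frac\{([^{}]*)\}\{([^{}]*)\}')

def frac_to_division(latex):
    r"""Rewrite \frac{a}{b} as (a)/(b), innermost occurrences first."""
    previous = None
    # Repeat until a pass changes nothing: each pass removes one nesting level.
    while previous != latex:
        previous = latex
        latex = INNERMOST_FRAC.sub(r'(\1)/(\2)', latex)
    return latex

print(frac_to_division(r'\frac{1+\frac{a}{b}}{c}'))  # (1+(a)/(b))/(c)
```

Because [^{}] forbids braces inside the arguments, no recursion is needed in the pattern itself; the loop does the work instead.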

2 Answers

1

You could solve this with pyparsing. It is available via pip (see PyPI). An example of how to use it is at https://stackoverflow.com/a/20846900/562769.

Pyparsing uses formal grammars to parse strings. It is not regex, but it might be better suited to your problem.


0

Python's built-in re module does not support recursive patterns such as (?R) or (?0) (both of which recurse into the entire pattern). The third-party regex module does.

>>> import regex
>>> s = "\\test{5-\\tan{66}} {8+\\frac{\\cos{2}}{1}} {\\acoth{}}"
>>> regex.findall(r'(\{(?:[^{}]|(?0))*\})', s)
['{5-\\tan{66}}', '{8+\\frac{\\cos{2}}{1}}', '{\\acoth{}}']

Source: http://www.regular-expressions.info/recurse.html
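If installing a third-party package is not an option, the same outermost groups can be collected without any regex by counting brace depth. The following is a minimal sketch under that assumption; top_level_groups is a hypothetical helper name, and on unbalanced input it may behave differently from the recursive pattern.

```python
def top_level_groups(text):
    """Return every outermost {...} group in text, braces included."""
    groups = []
    depth = 0      # how many '{' are currently open
    start = None   # index of the '{' that opened the current outermost group
    for i, ch in enumerate(text):
        if ch == '{':
            if depth == 0:
                start = i
            depth += 1
        elif ch == '}' and depth > 0:  # a stray '}' at depth 0 is ignored
            depth -= 1
            if depth == 0:
                groups.append(text[start:i + 1])
    return groups

s = "\\test{5-\\tan{66}} {8+\\frac{\\cos{2}}{1}} {\\acoth{}}"
print(top_level_groups(s))
```

Each returned group can then be stripped of its outer braces and fed back into the same function, which matches the recursive evaluation described in the question.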

1 Comment

This worked! I hope Python adds support for recursion in the future. For now, importing this package (pypi.python.org/pypi/regex) as re works just fine.
