REGEX parsing commands from latex lines - Python

Question

I'm trying to parse and remove any \command (\textit, etc.) from each line loaded from a .tex file, or similar commands from LilyPond files (such as \clef, \key, \time).

How could I do that?

What I've tried

import re
f = open('example.tex')
lines = f.readlines()
f.close()

pattern = '^\\*([a-z]|[0-9])' # this is the wrong regex!!
clean = []
for line in lines:
    remove = re.match(pattern, line)
    if remove:
        clean.append(remove.group())

print(clean)

Example

Input

#!/usr/bin/latex

\item More things
\subitem Anything

Expected output

More things
Anything

Caio Oliveira · Accepted Answer · 2014-05-05 22:51:59Z

2

You could use a simple regex substitution with the pattern ^\\[^\s]*, which matches a backslash at the start of a line followed by any run of non-whitespace characters.

Sample code in Python:

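import re

# MULTILINE lets ^ match at the start of every line, not only the string
p = re.compile(r"^\\[^\s]*", re.MULTILINE)

# Raw string, so the backslashes are kept exactly as typed
text = r'''
\item More things
\subitem Anything
'''

subst = ""

print(re.sub(p, subst, text))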

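The result would be the following. Each line keeps the space that followed the command; append [ \t]* to the pattern to remove it as well:

 More things
 Anything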
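
To apply the same substitution to lines read from a file, as in the question, a minimal sketch might look like the following. The helper name strip_commands and the trailing [ \t]* (which also consumes the blanks after the command) are assumptions added for illustration and are not part of the answer above. It only handles a command at the start of a line.

```python
import re

# Leading backslash command plus any spaces or tabs after it.
# The trailing [ \t]* is an addition to the answer's pattern.
COMMAND = re.compile(r"^\\\S*[ \t]*")


def strip_commands(lines):
    # Hypothetical helper: remove a leading command from each line
    # and drop lines that are empty afterwards.
    cleaned = []
    for line in lines:
        text = COMMAND.sub("", line.rstrip("\n"))
        if text:
            cleaned.append(text)
    return cleaned


sample = ["\\item More things\n", "\n", "\\subitem Anything\n"]
print(strip_commands(sample))  # ['More things', 'Anything']
```

With a real file, the list returned by f.readlines() could be passed in place of sample.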

edited May 5, 2014 at 22:51

answered May 5, 2014 at 22:24

Caio Oliveira


el3ien · Answer · 2014-05-05 22:15 (edited by Peter Mortensen, 2016-01-19 11:45:30Z)

0

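This will work, provided it is written as a raw string:

r'\\\w+\s'

It matches a backslash, then one or more word characters (letters, digits, or underscore), then a single whitespace character. Without the r prefix, Python collapses \\ into one backslash before the regex engine sees it, and the pattern then looks for a backslash followed by literal w characters.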
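
A short check of this pattern with re.sub, assuming it is written as a raw string. The variable names and sample lines are illustrative only:

```python
import re

# Backslash, one or more word characters, then one whitespace character.
pattern = re.compile(r"\\\w+\s")

lines = [r"\item More things", r"\subitem Anything", r"\time 4/4"]
cleaned = [pattern.sub("", line) for line in lines]
print(cleaned)  # ['More things', 'Anything', '4/4']
```

Because the pattern is not anchored with ^, it also removes commands that appear in the middle of a line, but a command with no whitespace after it (for example at the very end of a line) is left in place.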

edited Jan 19, 2016 at 11:45

Peter Mortensen


answered May 5, 2014 at 22:15

el3ien


1 Comment

arnaldo Over a year ago

Hi, thanks for the answer. I also tried '^\\\w+\s', but it didn't work either.
