
Is it possible to refer to another regular expression inside a regular expression? When I try the following code:

element = re.compile(
    r"H|He|Li|Be|B|C|N|O|F|Ne|Na|Mg|Al|Si|P|S|Cl|Ar|K|Ca|Sc|Ti|V|Cr|Mn|Fe"
    r"|Co|Ni|Cu|Zn|Ga|Ge|As|Se|Br|Kr|Rb|Sr|Y|Zr|Nb|Mo|Tc|Ru|Rh|Pd|Ag|Cd|In|Sn|Sb|Te|I|Xe|Cs|Ba|La"
    r"|Ce|Pr|Nd|Pm|Sm|Eu|Gd|Tb|Dy|Ho|Er|Tm|Yb|Lu|Hf|Ta|W|Re|Os|Ir|Pt|Au|Hg|Tl|Pb|Bi|Po|At|Rn|Fr"
    r"|Ra|Ac|Th|Pa|U|Np|Pu|Am|Cm|Bk|Cf|Es|Fm|Md|No|Lr|Rf|Db|Sg|Bh|Hs|Mt|Ds")

regex_name01 = (r'(\b)' + element + r'-' + element)
regex_name02 = (r'(\b)' + element + r'-' + element + r'-' + element)
regex_name03 = (r'(\b)' + element + r'-' + element + r'-' + element + r'-' + element)
regex_name04 = (r'(\b)' + element + r'-' + element + r'-' + element + r'-' + element + r'-'
 + element)
regex_name05 = (r'(\b)' + element + r'-' + element + r'-' + element + r'-' + element + r'-' 
+ element + r'-' + element)

I get the following error:

TypeError: cannot concatenate 'str' and '_sre.SRE_Pattern' objects

How can I solve this without having to repeat the long expression every time 'element' occurs?

1 Answer


element is an already compiled pattern object, not a string, so it cannot be concatenated with str. Why not keep element as a plain string, concatenate, and then compile:

import re

# Keep the alternation as a plain string. The non-capturing group (?:...) confines
# the | alternatives to a single element once the pieces are concatenated.
element = (r"(?:H|He|Li|Be|B|C|N|O|F|Ne|Na|Mg|Al|Si|P|S|Cl|Ar|K|Ca|Sc|Ti|V|Cr|Mn|Fe"
           r"|Co|Ni|Cu|Zn|Ga|Ge|As|Se|Br|Kr|Rb|Sr|Y|Zr|Nb|Mo|Tc|Ru|Rh|Pd|Ag|Cd|In|Sn|Sb|Te|I|Xe|Cs|Ba|La"
           r"|Ce|Pr|Nd|Pm|Sm|Eu|Gd|Tb|Dy|Ho|Er|Tm|Yb|Lu|Hf|Ta|W|Re|Os|Ir|Pt|Au|Hg|Tl|Pb|Bi|Po|At|Rn|Fr"
           r"|Ra|Ac|Th|Pa|U|Np|Pu|Am|Cm|Bk|Cf|Es|Fm|Md|No|Lr|Rf|Db|Sg|Bh|Hs|Mt|Ds)")

regex_name01 = re.compile(r'(\b)' + element + r'-' + element)
regex_name02 = re.compile(r'(\b)' + element + r'-' + element + r'-' + element)
regex_name03 = re.compile(r'(\b)' + element + r'-' + element + r'-' + element + r'-' + element)
regex_name04 = re.compile(r'(\b)' + element + r'-' + element + r'-' + element + r'-' + element + r'-'
                          + element)
regex_name05 = re.compile(r'(\b)' + element + r'-' + element + r'-' + element + r'-' + element + r'-'
                          + element + r'-' + element)
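
The repeated concatenation can also be generated rather than typed out. The following is only a sketch under a few assumptions: the symbol list is shortened for illustration, chain_pattern is a hypothetical helper name, and the trailing \b is an addition that stops a match from ending in the middle of a longer word.

```python
import re

# Shortened, hypothetical symbol list for illustration; use the full list in practice.
symbols = ["H", "He", "C", "N", "O", "Na", "Cl"]

# The non-capturing group keeps the alternation confined to one element.
element = r"(?:" + "|".join(symbols) + r")"

def chain_pattern(n):
    """Compile a pattern matching n element symbols joined by hyphens."""
    return re.compile(r"\b" + "-".join([element] * n) + r"\b")

pair = chain_pattern(2)
triple = chain_pattern(3)

print(pair.search("bond Na-Cl found").group())  # Na-Cl
print(triple.search("H-C-N").group())           # H-C-N
```

Without the (?:...) group, the | in the element string would split the whole concatenated expression into unrelated alternatives instead of choosing one symbol per position.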

Or, you can get the pattern as a string from a compiled regular expression via .pattern:

>>> import re
>>> element = re.compile(r".*")
>>> element.pattern
'.*'
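
Applied to the question, this could look roughly like the sketch below. The element list is shortened here for illustration, and the name part is a hypothetical helper variable, not something from the original code.

```python
import re

# Stand-in for the compiled element pattern (shortened, hypothetical list).
element = re.compile(r"H|He|C|N|O")

# .pattern returns the source string; wrap it in a non-capturing group
# so the alternation does not leak into the surrounding expression.
part = "(?:" + element.pattern + ")"
regex_name01 = re.compile(r"\b" + part + "-" + part)

print(regex_name01.search("He-O").group())  # He-O
```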

As a side note, you can avoid typing the symbols by hand by using the third-party periodictable package:

>>> from periodictable.core import PUBLIC_TABLE
>>> elements = [element.symbol for element in PUBLIC_TABLE]
>>> elements
['n', 'H', 'He', 'Li', 'Be', 'B', 'C', 'N', 'O', 'F', 'Ne', 'Na', 'Mg', 'Al', 'Si', 'P', 'S', 'Cl', 'Ar', 'K', 'Ca', 'Sc', 'Ti', 'V', 'Cr', 'Mn', 'Fe', 'Co', 'Ni', 'Cu', 'Zn', 'Ga', 'Ge', 'As', 'Se', 'Br', 'Kr', 'Rb', 'Sr', 'Y', 'Zr', 'Nb', 'Mo', 'Tc', 'Ru', 'Rh', 'Pd', 'Ag', 'Cd', 'In', 'Sn', 'Sb', 'Te', 'I', 'Xe', 'Cs', 'Ba', 'La', 'Ce', 'Pr', 'Nd', 'Pm', 'Sm', 'Eu', 'Gd', 'Tb', 'Dy', 'Ho', 'Er', 'Tm', 'Yb', 'Lu', 'Hf', 'Ta', 'W', 'Re', 'Os', 'Ir', 'Pt', 'Au', 'Hg', 'Tl', 'Pb', 'Bi', 'Po', 'At', 'Rn', 'Fr', 'Ra', 'Ac', 'Th', 'Pa', 'U', 'Np', 'Pu', 'Am', 'Cm', 'Bk', 'Cf', 'Es', 'Fm', 'Md', 'No', 'Lr', 'Rf', 'Db', 'Sg', 'Bh', 'Hs', 'Mt', 'Ds', 'Rg', 'Cn', 'Uuq', 'Uuh']
>>> element = r"|".join(elements)
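
Note that the list printed above starts with 'n' (the neutron), which you probably do not want in an element pattern. A minimal sketch of turning such a list into a reusable pattern string follows. It uses a short hand-written excerpt of the list rather than the package itself. Sorting longer symbols first is optional with backtracking, but it makes an unanchored match prefer "He" over "H".

```python
import re

# Hypothetical excerpt of the symbol list shown above, including the neutron 'n'.
elements = ["n", "H", "He", "Li", "Be", "B", "C"]

# Drop the neutron entry, put longer symbols first, and escape each one.
symbols = sorted((s for s in elements if s != "n"), key=len, reverse=True)
element = "(?:" + "|".join(re.escape(s) for s in symbols) + ")"

print(element)  # (?:He|Li|Be|H|B|C)
```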

2 Comments

Oh wow, I've been trying millions of different options and have never seen this stupid tiny error. Thanks!
Thanks a lot for the side notes as well!
