You can do it like this:
import re
s = r'''\begin{theorem}[Weierstrass Approximation] \label{wapprox}
but not match
\begin{theorem}[Weierstrass Approximation]
\label{wapprox}'''
p = re.compile(r'(\\(?:begin|end)(?=((?:{[^}]*}|\[[^]]*])*))\2)[^\S\n]*(?=\S)')
print(p.sub(r'\1\n', s))
Pattern details:
( # capture group 1
\\
(?:begin|end)
# trick to emulate an atomic group
(?=( # the subpattern is enclosed in a lookahead and a capture group (2)
(?:{[^}]*}|\[[^]]*])*
)) # the lookahead is naturally atomic
\2 # backreference to the capture group 2
)
[^\S\n]* # optional horizontal whitespace (any whitespace except a newline)
(?=\S) # followed by a non-whitespace character
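The commented breakdown above can also be compiled directly with the re.VERBOSE flag, which ignores whitespace and # comments outside character classes. The following is a minimal sketch under that assumption; the names pattern, joined and already_split are only illustrative.

```python
import re

# Same pattern as above, laid out with re.VERBOSE so that the comments
# live inside the regex itself.
pattern = re.compile(r'''
    (                           # capture group 1
        \\ (?:begin|end)
        (?=(                    # lookahead + capture group 2: atomic group emulation
            (?: {[^}]*} | \[[^]]*] )*
        ))
        \2                      # consume exactly what group 2 captured
    )
    [^\S\n]*                    # optional horizontal whitespace
    (?=\S)                      # next character must not be whitespace
''', re.VERBOSE)

joined = '\\begin{theorem}[Weierstrass Approximation] \\label{wapprox}'
already_split = '\\begin{theorem}[Weierstrass Approximation]\n\\label{wapprox}'

# A newline replaces the space after the optional argument.
print(pattern.sub(r'\1\n', joined))
# Text that already has the newline is left untouched.
print(pattern.sub(r'\1\n', already_split))
```

Inside a character class such as [^\S\n], verbose mode keeps every character literal, so the class behaves exactly as in the one-line version.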
Explanation: if you write a simpler pattern like (\\(?:begin|end)(?:{[^}]*}|\[[^]]*])*)[^\S\n]*(?=\S), it can still produce a match when the arguments are already followed by a newline, because the quantified group is allowed to backtrack. See the following scenario:
The quantified group (?:{[^}]*}|\[[^]]*])* first consumes both {theorem} and [Weierstrass Approximation] in:
\begin{theorem}[Weierstrass Approximation]
\label{wapprox}
But since (?=\S) fails at that position (the next character is a newline), the regex engine backtracks: the quantified group gives up its last repetition, so group 1 now contains only \begin{theorem} in:
\begin{theorem}[Weierstrass Approximation]
\label{wapprox}
and (?=\S) now succeeds, because the next character is [. The substitution then wrongly inserts a newline between \begin{theorem} and [Weierstrass Approximation].
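The backtracking scenario described here can be reproduced with a short script. This is only a sketch: the variable names naive, atomic, already_split, broken and kept are illustrative, and the input is the second sample from the question.

```python
import re

# Pattern without atomic-group emulation: the quantified group may backtrack.
naive = re.compile(r'(\\(?:begin|end)(?:{[^}]*}|\[[^]]*])*)[^\S\n]*(?=\S)')
# Pattern with the lookahead + backreference trick.
atomic = re.compile(r'(\\(?:begin|end)(?=((?:{[^}]*}|\[[^]]*])*))\2)[^\S\n]*(?=\S)')

# Input that already has a newline after the optional argument.
already_split = '\\begin{theorem}[Weierstrass Approximation]\n\\label{wapprox}'

broken = naive.sub(r'\1\n', already_split)
kept = atomic.sub(r'\1\n', already_split)

print(repr(broken))  # a newline is inserted before the optional argument
print(repr(kept))    # the input is returned unchanged
```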
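An atomic group is a non-capturing group that forbids backtracking into the subpattern it encloses once that subpattern has matched. The notation is (?>subpattern). The re module only supports this syntax from Python 3.11 onwards; in earlier versions you can emulate it with the trick (?=(subpattern))\1, where the backreference number is that of the capture group inside the lookahead (\2 in the pattern above).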
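To see the emulation in isolation, here is a toy comparison. The pattern a+a and the string 'aaa' are not part of the original problem; they are only a hypothetical illustration of the difference between a backtracking quantifier and the lookahead trick.

```python
import re

text = 'aaa'

# Plain greedy quantifier: a+ first takes 'aaa', then gives back one
# character so that the trailing 'a' can match.
plain = re.search(r'a+a', text)

# Emulated atomic group: the lookahead captures the longest run of 'a' in
# group 1, the backreference consumes it, and nothing can be given back.
emulated = re.search(r'(?=(a+))\1a', text)

print(plain.group())  # the whole string is matched thanks to backtracking
print(emulated)       # no match: the trailing 'a' has nothing left to consume
```

This works because the engine never re-enters a lookahead to try a shorter capture once the lookahead has succeeded.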
Note that you can also use the third-party regex module, which supports atomic groups, instead of re:
import regex
p = regex.compile(r'(\\(?:begin|end)(?>(?:{[^}]*}|\[[^]]*])*))[^\S\n]*(?=\S)')
or, with possessive quantifiers:
p = regex.compile(r'(\\(?:begin|end)(?:{[^}]*}|\[[^]]*])*+)[^\S\n]*+(?=\S)')