
Hivemind!

I am trying to edit Memsource *.mxliff files with Notepad++.
When I create a search task, both source and target lines mix up in a huge list.
But I need to make amendments only for lines inside <target>...</target> tags.

For example:

1. <target>от 15 до 35 °C</target>
2. <target>Допустимый диапазон температур воздуха от -40 °C до +70 °C {1&gt;1)&lt;1}</target>  

Inside these lines I need to replace all instances of degrees with a non-breaking space version:

Find: (\d) (°C)
Replace with: $1 $2 (with a non-breaking space between $1 and $2)

What is the most optimal way to do so?
Any hints would be much appreciated. Thanks a lot!

  • (\d)\s*(°C) -> \1 \2 Commented Dec 2, 2019 at 0:29
  • (\d)\s*°C</target> -> \1 °C</target>, likely with &nbsp; instead of the space, and maybe a more lenient match of the tag that allows extra \s. Commented Dec 2, 2019 at 1:40

3 Answers

3
  • Ctrl+H
  • Find what: (?:<target>|\G)(?:(?!target)[\s\S])*?\K(\d+)\s*(°C)(?=.*</target>)
  • Replace with: $1&nbsp;$2
  • CHECK Match case
  • CHECK Wrap around
  • CHECK Regular expression
  • UNCHECK . matches newline
  • Replace all

Explanation:

(?:                         # non capture group
    <target>                # opening tag
  |                         # OR
    \G                      # restart from last match position
)                           # end group
(?:(?!target)[\s\S])*?      # 0 or more of any character (as few as possible), never stepping onto the word "target"
\K                          # forget all we have seen until this position
(\d+)                       # group 1, 1 or more digits
\s*                         # 0 or more spaces
(°C)                        # group 2
(?=.*</target>)             # must be followed by </target>

Replacement:

$1              # content of group 1 (i.e. the temperature)
&nbsp;          # non breaking space
$2              # content of group 2 (i.e. °C)
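For anyone who would rather script this than run it in Notepad++, here is a minimal Python sketch of the same idea. Python's `re` module supports neither `\G` nor `\K`, so this swaps in a different technique: first match each `<target>` element, then replace inside its body only. The function name `fix_targets` is made up for illustration, and the sketch assumes `<target>` tags carry no attributes and are not nested.

```python
import re

NBSP = "\u00a0"  # literal non-breaking space (the character Alt+0160 types)

# Assumption: <target> has no attributes and is never nested.
TARGET_RE = re.compile(r"(<target>)(.*?)(</target>)", re.DOTALL)
# One or more digits, optional whitespace, then the unit.
DEGREE_RE = re.compile(r"(\d+)\s*(°C)")


def fix_targets(text):
    """Put a non-breaking space before °C, but only inside <target> elements."""

    def fix_one(match):
        opening, body, closing = match.groups()
        # Every temperature in this one element is handled by a single sub().
        return opening + DEGREE_RE.sub(rf"\1{NBSP}\2", body) + closing

    return TARGET_RE.sub(fix_one, text)


sample = (
    "<source>от 15 до 35 °C</source>\n"
    "<target>от -40 °C до +70 °C</target>"
)
print(fix_targets(sample))
```

The `<source>` line is left untouched, and both temperatures inside `<target>` get the non-breaking space. To write `&nbsp;` instead of the literal character, change `NBSP` accordingly.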



4 Comments

The updated version worked like a charm, Toto! Thank you so much, you've made my day. The only change I made was replacing &nbsp; with a literal non-breaking space typed with Alt+0160.
@Toto Why is the |\G (OR restart from last match position) necessary?
@John: It lets the regex match every temperature, starting either from the opening tag or from the end of the previous match.
@Toto Tyvm! I really appreciate reading your answers to so many regex questions. In fact you are the only one I can remember who answers regex questions in a sensible way. - I assume I even have understood lookaheads and behinds now, whereas I often see just links to the PCRE manual in other answers.
1

Assuming that we'd have one °C in a target tag, maybe some expression similar to:

(<\s*target\s*>[\s\S]*?\d)\s+(°C[\s\S]*?<\s*\/target\s*>)

and replaced with,

$1&nbsp;$2

might be OK to look into.
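As a rough check of the "one °C per target tag" assumption, the same expression can be tried in Python. This is only a sketch: the helper name `replace_first_per_target` is hypothetical, and the replacement uses a literal non-breaking space (U+00A0), which is an assumption about the intended output.

```python
import re

NBSP = "\u00a0"  # assumed replacement character: literal non-breaking space

# Same expression as above; "/" needs no escaping in Python.
ONE_PER_TAG = re.compile(
    r"(<\s*target\s*>[\s\S]*?\d)\s+(°C[\s\S]*?<\s*/target\s*>)"
)


def replace_first_per_target(text):
    """Return (new_text, number_of_replacements) after a single pass."""
    return ONE_PER_TAG.subn(rf"\1{NBSP}\2", text)


single = "<target>от 15 до 35 °C</target>"
double = "<target>от -40 °C до +70 °C</target>"

print(replace_first_per_target(single))
print(replace_first_per_target(double))
```

With one temperature per tag, the pass fixes it. With two, each match consumes the whole element up to `</target>`, so only the first temperature is changed in a single pass.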

RegEx Demo


If you wish to simplify, update, or explore the expression, it is explained in the top-right panel of regex101.com. You can watch the matching steps or modify them in the debugger link; the debugger shows how a regex engine might consume sample input strings step by step while performing the match.



0

Short alternative as always ....
Find what: (\d+) (?=°C)
Replace all with: $1NBSP (where NBSP stands for a literal non-breaking space, e.g. typed with Alt+0160)
Done!
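A quick Python sketch of this lookahead variant, for comparison. The function name `nbsp_before_degrees` is made up for illustration. Since the pattern never looks at tags, it touches every "digits, space, °C" sequence in the file, `<source>` segments included.

```python
import re

NBSP = "\u00a0"  # literal non-breaking space


def nbsp_before_degrees(text):
    """Swap the single space between digits and °C for a non-breaking space."""
    # (?=°C) is a lookahead: the unit is checked but not consumed,
    # so only the digits and the space are replaced.
    return re.sub(r"(\d+) (?=°C)", rf"\1{NBSP}", text)


sample = "<source>15 °C</source><target>15 °C</target>"
print(nbsp_before_degrees(sample))
```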

