0

I have a basic question. I have a string like the one below:

on one off abcd on two off

I want to find all the strings between 'on' and 'off'. The result I expect here is 'one' and 'two'.

I believe this is possible with sed.

I tried sed 's/on\(.*\)off/\1/g', but this returns one off abcd on two

3 Answers

2

With sed, I think the easiest way is to use two sed processes (this relies on GNU sed extensions: the \< and \> word boundaries and \n in the replacement):

echo 'on one off abcd on two off' | sed 's/\<on\>[[:space:]]*/\non\n/g; s/[[:space:]]*\<off\>/\noff\n/g' | sed -n '/^on$/,/^off$/ { //!p; }'
one
two

This falls into two parts:

sed 's/\<on\>[[:space:]]*/\non\n/g; s/[[:space:]]*\<off\>/\noff\n/g'

puts each on and off on a line of its own, where it is easy to recognize, and

sed -n '/^on$/,/^off$/ { //!p; }'

prints just the stuff between them.
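To see what the first stage hands to the second, it can help to run it on its own. The sketch below assumes GNU sed (for \<, \> and \n in the replacement); the function names mark_delimiters and print_between are only illustrative and not part of the original answer.

```shell
# Stage 1: put each delimiter on a line of its own (GNU sed assumed).
mark_delimiters() {
  sed 's/\<on\>[[:space:]]*/\non\n/g; s/[[:space:]]*\<off\>/\noff\n/g'
}

# Stage 2: print the lines inside each on..off range, minus the delimiter lines.
print_between() {
  sed -n '/^on$/,/^off$/ { //!p; }'
}

# Intermediate result: "on" and "off" now sit alone on their lines,
# with the wanted text between them and the unwanted text outside.
echo 'on one off abcd on two off' | mark_delimiters

# Full pipeline, equivalent to the one-liner above.
echo 'on one off abcd on two off' | mark_delimiters | print_between
```

In the second stage, the empty regex // reuses the last regex that was applied, so //!p skips exactly the on and off lines that open and close each range.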

Alternatively, you could do it with Perl (which supports non-greedy matching and lookarounds):

$ echo 'on one off abcd on two off' | perl -pe 's/.*?\bon\b\s*(.*?)\s*\boff\b.*?((?=\bon\b)|$)/\1\n/g; s/\n$//'
one
two

Where the

s/.*?\bon\b\s*(.*?)\s*\boff\b.*?((?=\bon\b)|$)/\1\n/g

puts everything between \bon\b and \boff\b (where \b matches a word boundary) on a line of its own. The main trick is that .*? matches non-greedily, which is to say it matches the shortest string that still lets the full regex match. The (?=\bon\b) is a zero-length lookahead, so the final .*? stops either just before the next on delimiter or at the end of the line; this discards the data between off and on.

The

s/\n$//

just removes the last newline that we don't need or want.


4 Comments

Thanks for the answer, but this is not printing anything at all. Can you please revisit it?
Do you, by any chance, use Mac OS X?
The Perl option seems to be working. Now to the actual scenario I will be testing: <replayqueue>abdc</replayqueue>ccc<replayqueue>lmn</replayqueue>dddd<replayqueue>xyz</replayqueue>. I would like to get abcd, ccc, xyz.
Honestly, if you wanted to extract data from XML, why didn't you ask a question about extracting data from XML? There are tools that make this much easier and more reliable. For example, with xmlstarlet you could just write xmlstarlet sel -t -v '//replayqueue/node()' -n.
0

Here is an awk version (the \< and \> word-boundary operators are GNU awk extensions):

awk -v RS=" " '/\<off\>/ {f=0} f; /\<on\>/ {f=1}' file
one
two
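The one-liner works as a small state machine: f is cleared at an off record, set after an on record, and the bare f pattern prints records while it is set. The sketch below spells that out in a variant that avoids \< and \> by first splitting the input into one word per line with tr and then comparing whole words; the function name between_words is only illustrative.

```shell
between_words() {
  # One word per line, so each delimiter can be compared as a whole record.
  tr -s '[:space:]' '\n' |
  awk '
    $0 == "off" { f = 0 }   # closing delimiter: checked first, so "off" itself is not printed
    f                       # flag set: print the current word
    $0 == "on"  { f = 1 }   # opening delimiter: start printing from the next word
  '
}

echo 'on one off abcd on two off' | between_words
```

As with the original one-liner, text of several words between the delimiters comes out one word per line.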

0
sed 's/\(.*\) off.*/ \1³/;s/ off /³/g;s/ on /²/g;s/³[^²]*²/³²/g;s/^[^²]*²/²/;s/²/\
/g;s/.//;s/³//g'
  • ² and ³ stand in for on and off as single-character delimiters, because POSIX sed cannot negate a group, only a character class such as [^²]. Any other characters that do not occur in the string could be used instead (but avoid metacharacters such as &).
  • The remaining commands remove the content outside the delimiters and reformat the result to one match per line.
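The same idea can be sketched with ASCII markers, which sidesteps any locale concerns about multi-byte characters. The choice of < and > is an assumption made here for illustration (it only works if neither character occurs in the input), and the function name between_markers is hypothetical.

```shell
# Steps, in order:
#   1. drop everything after the last " off" and mark that point with ">";
#      a leading space is added so an initial "on " also reads as " on "
#   2. turn every remaining " off " into ">" and every " on " into "<"
#   3. delete the text between a ">" and the next "<", and before the first "<"
#   4. turn each "<" into a newline, drop the leading newline, drop each ">"
between_markers() {
  sed 's/\(.*\) off.*/ \1>/;s/ off />/g;s/ on /</g;s/>[^<]*</></g;s/^[^<]*</</;s/</\
/g;s/.//;s/>//g'
}

echo 'on one off abcd on two off' | between_markers
```

Because the delimiters are matched with their surrounding spaces rather than with word boundaries, this version expects on and off to be separated from the neighbouring text by single spaces.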
