1

Below is the text that I have:

Etiam porta sem malesuada magna mollis euismod. Praesent commodo cursus magna,
vel scelerisque nisl consectetur et. Nulla vitae elit libero, a pharetra augue. 
Donec sed odio dui. Donec id elit non mi porta gravida at eget metus.
|------|------| 
|6 | TEXT | 
|7 | TEXT | 
|8,9 | TEXT | 
|------|------|
Etiam porta sem malesuada magna mollis euismod. Praesent commodo cursus magna,
vel scelerisque nisl consectetur et. Nulla vitae elit libero, a pharetra 

I want to match this part. How would I do that with a regular expression?

|6 | TEXT | 
|7 | TEXT | 
|8,9 | TEXT |

Here is what I have so far

How can I achieve this?

2
  • @ne1410s Nice edit, but there are two more. Commented Feb 4, 2016 at 12:32
  • 1
    You need to be more explicit about the rules for determining the match. Do you want to match anything that is between two lines |------|------|, or must the lines of text to be matched have the specific form shown in your example (i.e., | followed by one positive integer, or two or more positive integers, followed by...)? In the latter case, must the text to be matched be bracketed by the line |------|------|? Is it important that a regular expression be used, or are you just assuming that's the only way of doing it? Commented Feb 4, 2016 at 12:39

4 Answers

1

The following matches what you need

\|\d(,\d)* \| .+ \|

It matches a |, then a single digit, then zero or more groups of a comma followed by a digit, then a space and a |, then a space, any text, a space, and a final |.

As shown here: https://regex101.com/r/eB0vI3/2
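
As a rough illustration of how a row pattern like this could be applied in Ruby, here is a minimal sketch using String#scan. The sample text and the variable names are made up for the example. Two things differ from the pattern above, both of them assumptions: \d+ is used instead of \d so that multi-digit numbers also match, and the group is non-capturing, because scan returns only the captured groups when the pattern contains a capturing group.

```ruby
# Sample input modelled on the question (shortened; hypothetical).
text = <<~TEXT
  Donec sed odio dui.
  |------|------|
  |6 | TEXT |
  |7 | TEXT |
  |8,9 | TEXT |
  |------|------|
  Etiam porta sem.
TEXT

# ^ anchors at the start of each line in Ruby.
# \d+(?:,\d+)* allows "6", "8,9", "10,11,12" (assumption: numbers may have several digits).
row_rx = /^\|\d+(?:,\d+)* \| .+ \|/

# With no capturing groups, scan returns the full text of every match.
rows = text.scan(row_rx)
puts rows
# => ["|6 | TEXT |", "|7 | TEXT |", "|8,9 | TEXT |"]
```

The delimiter lines are skipped because a digit must follow the first |.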


3 Comments

Not sure why that was down voted. Looks fine to me.
I think it is because of [A-z] (that does not match only letters) and in general we do not know what is in between | and |. And no idea how big the integer part can be.
Regular expression by itself is just an object. It does not do anything.
1

Must you use a regular expression? If your string is str, you can write:

puts str.split('|------|------|')[1]
  # |6 | TEXT | 
  # |7 | TEXT | 
  # |8,9 | TEXT | 
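
A slightly fuller sketch of the same idea, in case the rows are needed individually. The sample string is made up for the example, and the clean-up steps are an assumption about what is wanted: the piece returned by split still carries the whitespace that followed the first delimiter, so each line is stripped and empty lines are dropped.

```ruby
# Hypothetical input with the same layout as in the question.
str = "intro text\n|------|------| \n|6 | TEXT | \n|7 | TEXT | \n|8,9 | TEXT | \n|------|------|\noutro text"

# The piece between the first and second delimiter; to_s guards against
# a missing delimiter, where [1] would be nil.
block = str.split('|------|------|')[1].to_s

# Split into lines, trim each one, and drop the blank leftover.
rows = block.lines.map(&:strip).reject(&:empty?)
puts rows
# => ["|6 | TEXT |", "|7 | TEXT |", "|8,9 | TEXT |"]
```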

1 Comment

Oh, you used a fixed form.
0
string.split(/^[-|]+\s*\n/)[1]

............
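
For readers who want the one-liner above spelled out, here is a hedged sketch of how it appears to work; the sample string is invented for the example. In Ruby, ^ matches at the start of every line, [-|]+ matches a run made up only of dashes and pipes, and \s*\n takes any trailing whitespace plus the line break. A data row such as |6 | TEXT | is not treated as a delimiter because the run of pipes and dashes is not followed directly by whitespace and a newline.

```ruby
# Hypothetical input with the same layout as in the question.
string = "intro text\n|------|------| \n|6 | TEXT | \n|7 | TEXT | \n|8,9 | TEXT | \n|------|------|\noutro text"

# Split on whole delimiter lines, including their line breaks.
parts = string.split(/^[-|]+\s*\n/)

# parts[0] is the text before the table, parts[1] the rows, parts[2] the rest.
block = parts[1]
puts block
```

Unlike the fixed-string split, this does not depend on the exact number of dashes in the delimiter line.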

1 Comment

Downvoted for "....." in place of explanatory text. I know you're a good writer, @sawa. You can do so much better.
0

I would use your current pattern as a delimiter and use the lazy dot matching pattern to match the subtexts you need (using a capturing group in the pattern around that subtext with String#scan):

/^\|-+\|-+\|\p{Zs}*\s*(.*?)(?=\s*^\|-+\|-+\|)/m

See regex demo. I added a few subpatterns to trim the output on the fly. The /m modifier makes . match any character, including a newline. ^\|-+\|-+\|\p{Zs}*\s* matches the leading delimiter line and any whitespace after it, (.*?) matches and captures the shortest string up to the next delimiter, and the lookahead (?=\s*^\|-+\|-+\|) checks for that closing delimiter without consuming it, so it can also serve as the opening delimiter of a following match (in case you want overlapping matches). Remove (?= and the last ) to consume the closing delimiter and avoid overlapping matches.

rx = /^\|-+\|-+\|\p{Zs}*\s*(.*?)(?=\s*^\|-+\|-+\|)/m
s = "Donec sed odio dui. Donec id elit non mi porta gravida at eget metus.\n|------|------| \n|6 | TEXT | \n|7 | TEXT | \n|8,9 | TEXT | \n|------|------|\nEtiam porta sem malesuada magna mollis euismod. Praesent commodo cursus magna,\nvel scelerisque nisl consectetur et. Nulla vitae elit libero, a pharetra "
puts s.scan(rx)

IDEONE demo

2 Comments

\h doesn't exist in Ruby.
Okey-dokey, still, \p{Zs} is there.
