Regex delete lines after match

Question

I'm trying to match the domain example.com, and I would like to delete all of the IPs beneath it.

Input:

[example.com]
10.100.251.1
10.100.251.2
10.100.251.3
[example.net]
10.100.251.22
10.100.251.33

Desired output:

[example.net]
10.100.251.22
10.100.251.33

Here is what I have tried so far:

\[example.com\](\s+^(?:[0-9]{1,3}\.){3}[0-9]{1,3}$)*

It works, but I'm not sure if that's efficient.

I'm doing my regex testing with Rubular; here is a sample:

http://rubular.com/r/cavVHWPvT2

This doesn't seem like a job for a regex. What do you mean by delete?
– Ryan, Commented Oct 30, 2016 at 6:40
Why not put the second part into an array, then loop over it and check whether each entry is contained in the first part? If it matches, delete it.
– Tân, Commented Oct 30, 2016 at 6:50

Community · Accepted Answer · 2017-05-23 12:16:35Z

I wouldn't bother with a complex regex, I'd do it using Ruby's slice_before:

data = '[example.com]
10.100.251.1
10.100.251.2
10.100.251.3
[example.net]
10.100.251.22
10.100.251.33
'

data.lines.slice_before(/\A\[/).select { |ary| ary.first[/example\.net/] }.join
# => "[example.net]\n10.100.251.22\n10.100.251.33\n"

Breaking it down:

data
  .lines # => ["[example.com]\n", "10.100.251.1\n", "10.100.251.2\n", "10.100.251.3\n", "[example.net]\n", "10.100.251.22\n", "10.100.251.33\n"]
  .slice_before(/\A\[/) # => #<Enumerator: #<Enumerator::Generator:0x007f987b8b4528>:each>
  .select { |ary| ary.first[/example\.net/] } # => [["[example.net]\n", "10.100.251.22\n", "10.100.251.33\n"]]
  .join # => "[example.net]\n10.100.251.22\n10.100.251.33\n"
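
The select above keeps only the [example.net] block, which works for the sample but would also drop any other sections in a longer file. A minimal sketch of the inverse, rejecting the unwanted block instead, might look like the following. The method name drop_section and its parameters are hypothetical, not part of the original answer, and the sketch assumes every section starts with a line beginning with "[".

```ruby
# Hypothetical helper: remove one "[name]" section and the lines beneath it.
# Assumes each section starts with a line beginning with "[".
def drop_section(text, name)
  header = "[#{name}]"
  text.lines
      .slice_before { |line| line.start_with?('[') }  # group lines into sections
      .reject { |block| block.first.strip == header } # drop the unwanted section
      .join
end

data = "[example.com]\n10.100.251.1\n10.100.251.2\n10.100.251.3\n" \
       "[example.net]\n10.100.251.22\n10.100.251.33\n"

puts drop_section(data, 'example.com')
```

Because the block is rejected by its header rather than selected by a neighbour's name, any other sections in the input pass through unchanged.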

Regular expressions are great, and I use them when necessary, but they're not always the best tool for a task. They can be fragile and treacherous, and they greatly increase the burden of maintaining code, especially as they get more complex.

This could also be accomplished using a flip-flop, but explaining that is left to a different question: "What is a flip-flop operator?".

Tim Biegeleisen · Accepted Answer · 2016-10-30 06:53:05Z

Try this:

Find:

\[example\.com\].*?(\[(?:(?!example\.com).)*?\])

Replace:

$1

Regex101
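
The find/replace pair above is written for an editor. As a rough Ruby equivalent (an assumption, since this answer does not name a language), the same pattern can be passed to String#sub with the /m flag so that "." also matches newlines. The variable names are illustrative. Note that the pattern needs a following [...] header in order to match, so it would leave the input untouched if [example.com] were the last section.

```ruby
input = "[example.com]\n10.100.251.1\n10.100.251.2\n10.100.251.3\n" \
        "[example.net]\n10.100.251.22\n10.100.251.33\n"

# /m lets "." match newlines in Ruby, so ".*?" can span the IP lines.
pattern = /\[example\.com\].*?(\[(?:(?!example\.com).)*?\])/m

# '\1' re-inserts the captured header of the next section.
result = input.sub(pattern, '\1')
puts result
```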

edited Oct 30, 2016 at 6:53

answered Oct 30, 2016 at 6:43


3 Comments

Tim Biegeleisen Over a year ago

First of all, update your question with the tool you are using. My regex would work in a tool such as Notepad++, but perhaps not yours.

Tim Biegeleisen Over a year ago

.* means match any character, zero or more times. .*? means match any character zero or more times, but it is a non greedy match.

Tim Biegeleisen Over a year ago

Explore this regex using the link provided.

Cary Swoveland · Accepted Answer · 2016-10-30 07:56:44Z

We are given

str =<<-END
[example.com]
10.100.251.1
10.100.251.2
10.100.251.3
[example.net]
10.100.251.22
10.100.251.33
END
  #=> "[example.com]\n10.100.251.1\n10.100.251.2\n10.100.251.3\n[example.net]\n10.100..."

The question is a bit confusing in that the desired output is said to be

[example.net]
10.100.251.22
10.100.251.33

but that is also what is to be deleted. What follows returns the lines that are not deleted, but it would be a simple matter to change it to return the deleted bits. Also, the question doesn't make clear if the string "[example.net]" is known or if it's just an example of what might follow the "[example.com]" "block". Nor is it clear if there are exactly two "blocks", as in the example, or there could be one or more than two blocks.

If you know "[example.net]" immediately follows the "[example.com]" block, you could write

r = /
    \[example\.com\]     # match string
    .*?                  # match any number of characters, lazily
    (?=\[example\.net\]) # match string in positive lookahead
    /mx                  # multiline and free-spacing modes

puts str[r]
[example.com]
10.100.251.1
10.100.251.2
10.100.251.3

If you don't know what follows the "[example.com]" "block", except that the first line of the following block, if there is one, contains at least one character other than a digit or period, you could write

r = /
    \[example\.com\]\n  # match string
    .*?                 # match any number of any characters, lazily
    (?:[\d.]*\n)        # match a string containing >= 0 digits and periods,
                        # followed by a newline, in a non-capture group
    +                   # match the above non-capture group > 0 times
    /x                  # free-spacing mode

puts str[r]
[example.com]
10.100.251.1
10.100.251.2
10.100.251.3
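
Both patterns above extract the "[example.com]" block. If the goal is instead the question's desired output, with that block removed, one possible sketch is to delete the match with String#sub. The pattern below is a variation rather than this answer's own: it stops at the next line that starts with "[" or at the end of the string, so it assumes nothing about what follows. The constant name BLOCK is illustrative.

```ruby
str = <<~END
  [example.com]
  10.100.251.1
  10.100.251.2
  10.100.251.3
  [example.net]
  10.100.251.22
  10.100.251.33
END

# Match the header and everything up to (but not including) the next line
# that starts with "[", or the end of the string if this is the last block.
BLOCK = /^\[example\.com\]\n.*?(?=^\[|\z)/m

remaining = str.sub(BLOCK, '')
puts remaining
```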

edited Oct 30, 2016 at 7:56

answered Oct 30, 2016 at 7:41


6 Comments

Tim Biegeleisen Over a year ago

Nice regex...looks like mine ;-)

Cary Swoveland Over a year ago

@TimBiegeleisen, there certainly are similarities, but differences too, as I'm returning the keepers and you're returning the removals.

Deano Over a year ago

Thanks @CarySwoveland and Tim. I'm having trouble running the example in my terminal. Do you think you could help me with a sample on rubular.com? Thanks

Cary Swoveland Over a year ago

Sure, Dean, but for me it will have to wait until morning.

Tim Biegeleisen Over a year ago

@CarySwoveland No, I am returning the keepers. Try it in Notepad++ and you will see.

Wiktor Stribiżew · Accepted Answer · 2016-10-30 08:48:44Z

Your regex is very close. What you miss is a bit of grouping and a linebreak construct at the right place:

/^\[example\.com\]\R*(?:(?:\d{1,3}\.){3}\d{1,3}\R*)*/

See the Rubular demo

Details:

^ - start of line
\[example\.com\] - [example.com] literal substring
\R* - zero or more linebreaks (for older Ruby versions, use (?:\r?\n|\r)*)
(?:(?:\d{1,3}\.){3}\d{1,3}\R*)* - zero or more sequences of
- (?:\d{1,3}\.){3} - 3 sequences of 1 to 3 digits and a dot
- \d{1,3} - 1 to 3 digits
- \R* - 0+ linebreaks

And a Ruby demo:

str =<<DATA
[example.com]
10.100.251.1
10.100.251.2
10.100.251.3
[example.net]
10.100.251.22
10.100.251.33
DATA
rx = /^\[example\.com\]\R*(?:(?:\d{1,3}\.){3}\d{1,3}\R*)*/
puts str[rx]
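
The breakdown above maps directly onto Ruby's free-spacing (/x) mode. As a sketch, the same pattern can be written with inline comments, and String#sub can then remove the matched block to produce the question's desired output. The variable names are illustrative, and \R assumes Ruby 2.0 or later.

```ruby
str = "[example.com]\n10.100.251.1\n10.100.251.2\n10.100.251.3\n" \
      "[example.net]\n10.100.251.22\n10.100.251.33\n"

rx = /
  ^\[example\.com\]      # the literal header at the start of a line
  \R*                    # zero or more linebreaks
  (?:                    # zero or more IP lines:
    (?:\d{1,3}\.){3}     #   three groups of 1 to 3 digits, each followed by a dot
    \d{1,3}              #   a final group of 1 to 3 digits
    \R*                  #   zero or more linebreaks
  )*
/x

# Deleting the match leaves every other section in place.
cleaned = str.sub(rx, '')
puts cleaned
```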

answered Oct 30, 2016 at 8:48


2 Comments

Aleksei Matiushkin Over a year ago

We ended up with almost the same regular expression, but I still think \s* is better than \R*. Either one insists on the explicit, precise format, in which case there should be no * quantifiers, or let's allow spaces after the IPs :)

Wiktor Stribiżew Over a year ago

\s matches horizontal whitespace, so [example.com]78.78.89.67556.87.87.87 can also be matched. I understand they must be on the subsequent lines.

Todd A. Jacobs · Accepted Answer · 2016-10-31 03:11:17Z

Treat Your Data Like an INI File: Scan for Sections

One way to deal with your data is to treat it like an INI file. A regex with the multi-line option enabled can break a string representation of your INI file into an array of sections as follows:

ini = <<~'EOF'
  [example.com]
  10.100.251.1
  10.100.251.2
  10.100.251.3
  [example.net]
  10.100.251.22
  10.100.251.33
EOF

# Scan for INI section headers.
sections = ini.scan /^\[.*?\]$[^\[]*/m

You can then extract just the sections you want using Enumerable#grep. For example, to extract the example.net section:

section_title = 'example.net'
sections.grep /\A\[#{Regexp.escape section_title}\]\s*$/
#=> ["[example.net]\n10.100.251.22\n10.100.251.33\n"]
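
Since the question asks to remove a section rather than extract one, the same sections array can be filtered the other way round. A minimal sketch, assuming Ruby 2.3 or later for Enumerable#grep_v and reusing the scan pattern from above; unwanted_title and kept are illustrative names:

```ruby
ini = <<~'EOF'
  [example.com]
  10.100.251.1
  10.100.251.2
  10.100.251.3
  [example.net]
  10.100.251.22
  10.100.251.33
EOF

# Split the text into sections, one string per header plus its lines.
sections = ini.scan(/^\[.*?\]$[^\[]*/m)

unwanted_title = 'example.com'
# grep_v keeps every element that does NOT match the pattern.
kept = sections.grep_v(/\A\[#{Regexp.escape(unwanted_title)}\]\s*$/).join
puts kept
```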

Caveats

The multi-line regex above assumes you have the entire file loaded as a single String object. If you're doing something else, you may need a different approach.
Note the importance of Regexp#escape, which ensures that your string is properly converted for use in a regex pattern. Otherwise, characters like [, ., and ] would not match as you might expect.
INI files can be more complex than your sample data. You might consider writing a real INI parser, or using a gem like inifile, rather than trying to handle all the possible edge cases in one regular expression.
