
I have some extraneous HTML table rows I'd like to remove using sed. I want to match and delete these two lines:

[tr]
[/tr]

I've tried sed -i '/\[tr\](\r|\n|\r\n|\n\r)\[\/tr\]/d' ./file which matches on a regex testing site, but sed doesn't do anything.

    Something like sed '/^\[tr]$/{N;/\n\[\/tr]$/d}', maybe. Commented Mar 25 at 9:23

4 Answers

4

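By default sed processes its input one line at a time: each cycle reads a single line into the pattern space with the trailing newline removed, so a pattern containing \n (or \r\n) can never match unless you explicitly pull more lines in. In addition, without -E sed uses basic regular expressions, in which (, ) and | are literal characters, so the alternation group in your pattern is not interpreted the way the regex testing site interprets it.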
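A quick way to see the line-by-line behaviour is the minimal sketch below. The sample input and the variable names are made up for illustration, and the command is a simplified form of the one in the question, reading from stdin instead of editing a file in place.

```shell
# Hypothetical sample input: an empty [tr]/[/tr] pair between two content lines.
input=$(printf 'line 1\n[tr]\n[/tr]\nline 2\n')

# The pattern asks for "[tr]", a newline, then "[/tr]" within ONE pattern space.
# Each cycle holds a single line without its newline, so no line ever matches
# and nothing is deleted.
unchanged=$(printf '%s\n' "$input" | sed '/\[tr\]\n\[\/tr\]/d')

printf '%s\n' "$unchanged"
```

The output should be identical to the input, which is the "sed doesn't do anything" symptom described in the question.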

Try using sed with N for multi-line matching.

sed -i '/\[tr\]/ {N; /\[tr\]\n\[\/tr\]/d; }' file


1 Comment

What if there were these three lines [tr]\n[tr]\n[/tr]?
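To make the concern in this comment concrete, the sketch below runs the command from the answer on such a three-line input. The sample input and variable names are illustrative only, and the command reads stdin rather than using -i.

```shell
# Hypothetical input: a stray [tr] directly before an empty [tr]/[/tr] pair.
input=$(printf 'line 1\n[tr]\n[tr]\n[/tr]\nline 2\n')

# The first [tr] triggers N, which appends the second [tr]. The pair test
# fails on "[tr]\n[tr]", both lines are printed, and the following [/tr]
# is then processed on its own, so the real pair is never seen together.
result=$(printf '%s\n' "$input" | sed '/\[tr\]/ {N; /\[tr\]\n\[\/tr\]/d; }')

printf '%s\n' "$result"
```

On this input the result should come back unchanged, which is why a sliding two-line window is more robust here.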
2

This might work for you (GNU sed):

sed 'N;/^\[tr\]\n\[\/tr\]$/d;P;D' file

Open a two line window using the N command.

If the first line of the window is [tr] and second line [/tr], delete both lines.

Otherwise, print and delete the first of the two lines and repeat.

N.B. D deletes up to and including the first newline in the pattern space. As a side effect, if the pattern space still contains text afterwards, sed restarts the cycle without the implicit fetch of the next input line. Because the first command of the program is N, the two-line window is preserved: the next input line is appended to what was the second line of the window.
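The window mechanics can be checked with a small sketch like the following. The sample input and variable names are made up, and $!N is swapped in for the plain N above as an assumption for portability: GNU sed prints the pattern space when N finds no next line, but other implementations may exit without printing it.

```shell
# Hypothetical input: a plain empty pair, then a stray [tr] before another pair.
input=$(printf 'line 1\n[tr]\n[/tr]\nline 2\n[tr]\n[tr]\n[/tr]\nline 3\n')

# $!N : append the next line to the window, except on the last line.
# /^\[tr\]\n\[\/tr\]$/d : drop the window when it is exactly the empty pair.
# P : print the first line of the window.
# D : delete that first line and restart without reading new input.
cleaned=$(printf '%s\n' "$input" | sed '$!N;/^\[tr\]\n\[\/tr\]$/d;P;D')

expected=$(printf 'line 1\nline 2\n[tr]\nline 3\n')
printf '%s\n' "$cleaned"
```

Both pairs are removed and only the stray [tr] survives, because the window slides forward one line at a time instead of jumping two lines.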

1 Comment

You don't need to escape the closing square brackets. But yes, this is the way.
1

Use this Perl one-liner:

perl -i.bak -0777 -pe 's{\[tr\]\s*\[/tr\]\n}{}g' infile

Example input file:

line 1
[tr]
[/tr]
line 2
[tr]
[/tr]
line 3

Example output file:

line 1
line 2
line 3

s{PATTERN}{REPLACEMENT} : Replace regex PATTERN with REPLACEMENT.

In the regex:
\[ : Literal [ (needs to be escaped).
\] : Literal ] (needs to be escaped).
\s* : Zero or more whitespace (including newline).

The regex uses the modifier:
g : Match the pattern repeatedly.

The Perl one-liner uses these command line flags:
-e : Tells Perl to look for code in-line, instead of in a file.
-p : Loop over the input one line at a time, assigning it to $_ by default. Add print $_ after each loop iteration. Together with -0777, assigns the entire file contents (rather than 1 line) to $_.
-i.bak : Edit input files in-place (overwrite the input file). Before overwriting, save a backup copy of the original file by appending to its name the extension .bak. If you want to skip writing a backup file, just use -i and skip the extension.
-0777 : Slurp files whole.


0

If you mean “<tr>” followed by “</tr>”, you can use Raku/Sparrow for that:

begin:
<tr>
</tr>
end:

code: <<RAKU
!raku

for captures-full() -> $i {
   my $ln = $i<index>;
   # remove <tr> line
   replace("/path/to/file.txt",$ln-1,"");
   # remove </tr> line 
   replace("/path/to/file.txt",$ln,"");  
}
RAKU

Removing the pair when there is an arbitrary number of empty lines between <tr> and </tr> is also possible (with just a few more lines of code).

For this example to work, please make sure you provide file.txt as input to the Sparrow task.check, for example with a task.bash like this:

cat /path/to/file.txt 
