I need a shell script that strips the unknown junk text out of a text file. I am stuck because I don't know in advance what the junk text will say. Basically, I need to remove everything before, after, and in between the pieces, and keep only the text that is inside the pieces.

--Begin file


random unknown junk text

----Begin Piece one ---- 
random important text
----End Piece one ----

random unknown junk text

----Begin Piece two ---- 
random important text
----End Piece two ----

random unknown junk text

----Begin Piece two ---- 
random important text
----End Piece two ----

random unknown junk text


end of file

2 Answers

sed -n '/^\(--Begin file\|end of file\)/{p;b}; /^----Begin Piece/{p;:a;n;/^----End Piece/{p;b};p;ba}' inputfile

Explanation:

  • /^\(--Begin file\|end of file\)/{p;b} - Print the file beginning/ending lines (matches literal text; the \| alternation is a GNU sed extension)
  • /^----Begin Piece/{ - If the line matches the block begin marker
    • p - Print it
    • :a - label a
    • n - Read the next line
    • /^----End Piece/{ - If it's the block end marker
      • p - Print it
      • b - Branch to the end of the script, which starts the next cycle with the next line of input
    • } - end if
    • p - Print a line that's within the block
    • ba - Branch to label a to see if there are more lines in the block
  • } - end if
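
As a shorter alternative to the explicit loop above (a sketch, not part of the original answer), sed's address range `/start/,/end/` selects the same blocks. The sample contents and the variable names `sample`, `with_markers`, and `inner_only` are made up for illustration. The second command also drops the marker lines, in case only the text inside the pieces is wanted.

```shell
#!/bin/sh
# Build a small sample input in a temporary file (contents are illustrative).
sample=$(mktemp)
cat > "$sample" <<'EOF'
junk before
----Begin Piece one ----
important text 1
----End Piece one ----
junk between
----Begin Piece two ----
important text 2
----End Piece two ----
junk after
EOF

# Range form: print every line from a begin marker through the next end marker.
with_markers=$(sed -n '/^----Begin Piece/,/^----End Piece/p' "$sample")

# Same range, but delete the marker lines so only the inner text is printed.
inner_only=$(sed -n '/^----Begin Piece/,/^----End Piece/{/^----Begin Piece/d;/^----End Piece/d;p;}' "$sample")

printf '%s\n' "$with_markers"
printf '%s\n' '---'
printf '%s\n' "$inner_only"
rm -f "$sample"
```

Unlike the one-liner in the answer, this sketch does not print the "--Begin file" and "end of file" lines.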

2 Comments

@Matt: To change the file in place, use sed -n -i ... (with some versions of sed, the backup extension argument to -i is mandatory: sed -n -i .bak ...). Alternatively, write to a temporary file and move it over the original: sed ... inputfile > temp && mv temp inputfile.
Awesome, that works. Thanks for your help. This is the first time I have ever had to use sed.
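
The temporary-file approach from the first comment can be sketched as follows. This is a minimal illustration under assumptions: the file contents are invented, the `inputfile` and `tmp` paths come from `mktemp`, and a range-based sed filter stands in for the answer's full command.

```shell
#!/bin/sh
# Hypothetical working file; in practice this would be the real input file.
inputfile=$(mktemp)
printf '%s\n' 'junk' '----Begin Piece one ----' 'keep me' '----End Piece one ----' 'junk' > "$inputfile"

# Write the filtered output to a temporary file first, then replace the
# original only if sed exited successfully (that is what && guarantees).
tmp=$(mktemp)
sed -n '/^----Begin Piece/,/^----End Piece/p' "$inputfile" > "$tmp" && mv "$tmp" "$inputfile"

result=$(cat "$inputfile")
printf '%s\n' "$result"
rm -f "$inputfile"
```

Because the original is overwritten only after sed succeeds, a failed run leaves the input file unchanged.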
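#!/bin/bash
# Print only the lines between the Begin/End Piece markers (markers excluded).
exec 3< file.txt                # open the input file on file descriptor 3
fl=0                            # 1 while inside a piece, 0 otherwise
regex='----Begin Piece.+'
regexe='----End Piece.+'
while IFS= read -r <&3          # no variable name given, so the line lands in $REPLY
do
    if [ "$fl" -eq 1 ] && [[ ! "$REPLY" =~ $regexe ]]; then
        printf '%s\n' "$REPLY"
    fi
    if [[ "$REPLY" =~ $regex ]]; then fl=1; fi
    if [[ "$REPLY" =~ $regexe ]]; then fl=0; fi
done
exec 3<&-                       # close file descriptor 3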

1 Comment

If you quote the pattern on the right-hand side of =~, it ceases to be a regex and is taken literally.
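
A small sketch of the quoting behavior described in this comment, assuming bash 3.2 or later with default compatibility settings. The variable names `line`, `unquoted`, and `quoted` are illustrative.

```bash
#!/bin/bash
line='----Begin Piece one ----'
regex='----Begin Piece.+'

# Unquoted right-hand side: treated as a regular expression,
# so ".+" matches the rest of the line.
if [[ "$line" =~ $regex ]]; then unquoted=match; else unquoted=nomatch; fi

# Quoted right-hand side: treated as a literal string, so the line
# would have to contain the characters ".+" themselves.
if [[ "$line" =~ "$regex" ]]; then quoted=match; else quoted=nomatch; fi

echo "unquoted: $unquoted"
echo "quoted: $quoted"
```

This is why the script above stores each pattern in a variable and expands it without quotes.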
