I would like to combine two regular expressions that clean up textarea input. I wonder whether this is even possible, or whether I should keep them as two separate replacements (which work fine but don't look as clean).

I have adjusted both so that they use the global and multiline flags (/gm) and replace matches with an empty string (''). I tried grouping with brackets and alternation (|) in every position, but it never gives the expected result, so I can only assume that there is a way I have overlooked or that I should keep it as is.

Regex 1: /^\s+[\r\n]/gm

Regex 2: /^\s+| +(?= )|\s+$/gm

Currently in JavaScript: string.replace(/^\s+[\r\n]/gm,'').replace(/^\s+| +(?= )|\s+$/gm,'')

The goal is to remove:

  • Whitespace at the beginning and end of each line
  • Empty lines (including any at the very beginning and end)
  • Repeated spaces between words

All of this without collapsing the text onto a single line: the single line breaks (\r\n) between lines should still be there in the end.
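
For reference, the goal above can also be written without a combined regex, which gives a baseline to compare candidate patterns against. This is only a sketch and not taken from the post: cleanLines is a hypothetical helper, and it assumes that normalizing line breaks to \n and trimming tabs as well as spaces is acceptable.

```javascript
// Hypothetical helper (not from the original post): a plain baseline
// for the three rules listed above.
function cleanLines(text) {
  return text
    .split(/\r?\n/)                              // break the input into lines
    .map((line) => line.trim())                  // trim both ends of each line
    .map((line) => line.replace(/ {2,}/g, ' '))  // collapse runs of spaces
    .filter((line) => line !== '')               // drop empty lines
    .join('\n');                                 // assumption: output uses \n only
}

const sample = ['', "   Let's  ", 'make   this', ' look', '', ' a    little', ''].join('\n');
console.log(JSON.stringify(cleanLines(sample)));
// "Let's\nmake this\nlook\na little"
```

Any combined regex can then be checked against the output of this helper on the same input.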

Regex 1 removes empty lines (^\s+[\r\n]). Regex 2 trims whitespace at the beginning (^\s+) and end (\s+$) of each line, and removes double (and triple, quadruple, etc.) spaces in between ( +(?= )).

Input:


   Let's  
make   this
 look

 a    little


    nicer   
  and 
more   

readible


Output:

Let's
make this
look
a little
nicer
and
more
readible

Edit: Many thanks to Wiktor Stribiżew, whose comment provided this complete solution (to be used with '$1' as the replacement, as in the comment):

/^\s*$[\r\n]*|^[^\S\r\n]+|[^\S\r\n]+$|([^\S\r\n]){2,}|\s+$(?![^])/gm
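
A quick way to check the combined pattern is to run it on a shortened version of the input above. The sketch below assumes the replacement string is '$1', as in the comment the pattern comes from (an empty replacement would delete the runs of spaces between words entirely). The names combined, input and cleaned are only illustrative.

```javascript
// Pattern quoted from the edit above; the variable names are illustrative.
const combined = /^\s*$[\r\n]*|^[^\S\r\n]+|[^\S\r\n]+$|([^\S\r\n]){2,}|\s+$(?![^])/gm;

// Shortened sample: blank lines, padded lines, repeated spaces.
const input = [
  '',
  "   Let's  ",
  'make   this',
  ' look',
  '',
  ' a    little',
  '',
  '',
  '    nicer   ',
  '',
  '',
].join('\n');

// '$1' keeps one space from each run matched by the capturing alternative;
// for every other alternative the group is unset and expands to ''.
const cleaned = input.replace(combined, '$1');

console.log(JSON.stringify(cleaned));
// "Let's\nmake this\nlook\na little\nnicer"
```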

4 Comments

  • Try s.replace(/^\s*$[\r\n]*|^[^\S\r\n]+|[^\S\r\n]+$|([^\S\r\n]){2,}/gm, '$1'). To also remove the trailing line breaks, add |\s+$(?![^]) to the end of the pattern. Commented Jan 28, 2020 at 21:11
  • Start with /^\s*\n|^\s+/gm to remove empty lines and empty spaces in the beginning and end of lines. It doesn't cover the double spaces between words. Commented Jan 28, 2020 at 21:47
  • @BojanBedrač it definitely does most of it, especially when adding  +(?= ) for the double spaces and Wiktor's |\s+$(?![^]) for the trailing line break. Unfortunately, spaces at the end of the lines are still there. Commented Jan 29, 2020 at 8:28
  • @BojanBedrač This hackery, however, would work: /^\s*[\r\n]|^\s+| +(?= )| +$|\s+$(?![^])/gm Commented Jan 29, 2020 at 8:58

1 Answer

I'd suggest the following expression, used with the replacement string "$1$2":

/^\s*|\s*$|\s*(\r?\n)\s*|(\s)\s+/g

Explanation:

  • ^\s* - matches whitespace at the very beginning of the text (there is no m flag, so ^ anchors to the start of the whole string)
  • \s*$ - matches whitespace at the very end of the text
  • \s*(\r?\n)\s* - matches a line break together with all whitespace around it (including any blank lines that follow) and captures a single line break in group $1
  • (\s)\s+ - matches any remaining run of 2+ whitespace characters and captures the first one in group $2
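
The four alternatives can be tried together on a shortened version of the question's input. This is a sketch under the assumption that the expression is used with String.prototype.replace and the template "$1$2" as suggested above; pattern, input, cleaned and crlf are illustrative names. In JavaScript, a group that did not take part in a match expands to an empty string, which is why one template can serve all four alternatives.

```javascript
// Expression quoted from the answer; variable names are illustrative.
const pattern = /^\s*|\s*$|\s*(\r?\n)\s*|(\s)\s+/g;

// Shortened sample: blank lines, padded lines, repeated spaces.
const input = [
  '',
  "   Let's  ",
  'make   this',
  ' look',
  '',
  ' a    little',
  '',
  '',
  '    nicer   ',
  '',
  '',
].join('\n');

// $1 restores one line break, $2 restores one whitespace character;
// whichever group did not participate contributes nothing.
const cleaned = input.replace(pattern, '$1$2');
console.log(JSON.stringify(cleaned));
// "Let's\nmake this\nlook\na little\nnicer"

// With \r\n input, the greedy \s* in front of the group consumes the \r,
// so the captured line break is a bare \n.
const crlf = 'a  \r\nb'.replace(pattern, '$1$2');
console.log(JSON.stringify(crlf));
// "a\nb"
```

One consequence worth knowing: because \s also matches \r, input with Windows line endings comes out with \n line breaks only.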

1 Comment

Whoa, that's clean! Thank you so much!
