I have a rich text area where the user can type something. I am trying to prevent JavaScript injection with the following chain of regex replacements:
    return input == null ? null
            : input.replaceAll("(?i)<script.*?>.*?</script.*?>", "")   // case 1
                   .replaceAll("(?i)<.*?javascript:.*?>.*?</.*?>", "") // case 2
                   .replaceAll("(?i)<.*?\\s+on.*?>.*?</.*?>", "");     // case 3
Here, input is the text from the rich text area.
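The problem is case 3. It is meant to catch event-handler attributes, but if the user's text contains the word "on" after whitespace (for example "turn on the light"), the pattern can match from an earlier "<" onward, so all the text before "on" gets removed as well.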
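A minimal sketch that reproduces the problem with case 3. Only the three patterns come from the code above; the class name SanitizerDemo, the method name sanitize, and the sample strings are hypothetical and chosen purely for illustration.

```java
// Hypothetical demo class; only the three replaceAll patterns come from the question.
class SanitizerDemo {

    static String sanitize(String input) {
        return input == null ? null
                : input.replaceAll("(?i)<script.*?>.*?</script.*?>", "")   // case 1
                       .replaceAll("(?i)<.*?javascript:.*?>.*?</.*?>", "") // case 2
                       .replaceAll("(?i)<.*?\\s+on.*?>.*?</.*?>", "");     // case 3
    }

    public static void main(String[] args) {
        // Ordinary rich text: no script, no event handler, just the word "on".
        String harmless = "<p>Turn on the light</p><p>Second paragraph</p>";
        // Markup that case 3 is presumably meant to catch (assumed example).
        String handler = "<img src=\"x\" onerror=\"alert(1)\"></img>";

        // Brackets make an empty result visible in the output.
        System.out.println("[" + sanitize(harmless) + "]");
        System.out.println("[" + sanitize(handler) + "]");
    }
}
```

Both lines print "[]". The harmless input is wiped out because "<.*?" is not confined to a single tag: it runs from "<p>" through "Turn", then "\\s+on" matches " on", and the rest of the pattern consumes everything up to the last closing tag.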
How can I make the last pattern stricter so that it only matches event-handler attributes inside a tag and avoids the above problem?