I am trying to find a component with a specific attribute across all files. I tried the regex pattern <Button[^>]*[\n\s]+className[^>]*>. It works in about 95% of cases.

Regex Example

As the example above shows, a Button component with a condition attribute is not matched, even though it also has a className attribute and should match. The match fails because of the greater-than character in the => on the condition attribute line: the pattern stops there, before the closing > of the component's opening tag.

How do I make this regex pattern skip a greater-than character (>) that appears in between, inside an attribute?
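
The screenshot is not reproduced here, so the failure can only be illustrated with made-up input. The sketch below uses Python's re module rather than the editor's own engine, and the two JSX strings (plain, with_arrow) are hypothetical samples, not the asker's files. It shows that [^>]* cannot cross the > in =>, so the pattern never reaches className.

```python
import re

# The pattern from the question, unchanged.
ORIGINAL = re.compile(r"<Button[^>]*[\n\s]+className[^>]*>")

# Hypothetical JSX samples (assumptions for illustration only).
plain = '<Button type="submit"\n  className={style.btn}>'
with_arrow = '<Button onClick={() => save()}\n  className={style.btn}>'

print(bool(ORIGINAL.search(plain)))       # True
print(bool(ORIGINAL.search(with_arrow)))  # False: [^>]* stops at the > of =>
```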

  • You shouldn't use regex to parse HTML: stackoverflow.com/questions/1732348/… Commented Oct 5, 2020 at 7:29
  • Or, in fact, anything other than the simplest of cases. In my opinion, regex produces hard-to-read and hard-to-debug code, and it may not even be possible to parse HTML with regex. Commented Oct 5, 2020 at 7:41
  • Sorry, but what do you want to get as a result? All entries between button tags (including all attributes)? Commented Oct 5, 2020 at 8:06
  • <Button(?:\w+={[^{}]*}|[^>])*\sclassName=(?:\w+={[^{}]*}|[^>])*>. See regex101.com/r/SGB0aN/4 Commented Oct 5, 2020 at 9:07
  • @ChristianBaumann Please don't use that link for explaining why you shouldn't use regexes for HTML parsing. OP will not understand it. Here's a more illustrative page I put together: htmlparsing.com/regexes.html Commented Oct 5, 2020 at 21:01

1 Answer

You need to match, zero or more times, either an attribute (a chunk of word chars followed by = and then a substring between curly braces) or any char other than >. The construct for that is (?:\w+=\{[^{}]*\}|[^>])*.

Also, keep in mind that the Visual Studio Code regex engine requires { and } to be escaped when they appear outside of a character class.

The pattern will look like

<Button(?:\w+=\{[^{}]*\}|[^>])*\sclassName=(?:\w+=\{[^{}]*\}|[^>])*>

See the regex demo.

Details

  • <Button - a literal string
  • (?:\w+=\{[^{}]*\}|[^>])* - zero or more repetitions of
    • \w+=\{[^{}]*\} - one or more letters, digits or underscores, ={, zero or more chars other than { and } and then a }
    • | - or
    • [^>] - any char other than >
  • \s - a whitespace
  • className= - a literal text
  • (?:\w+=\{[^{}]*\}|[^>])* - see above
  • > - a > char.
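
As a rough check of the breakdown above, here is a minimal Python sketch that runs the pattern with the re module. This is an assumption for illustration: the answer targets the Visual Studio Code search box, and the sample strings (with_arrow, no_class) are hypothetical.

```python
import re

# The pattern from the answer, unchanged.
ANSWER = re.compile(
    r"<Button(?:\w+=\{[^{}]*\}|[^>])*\sclassName=(?:\w+=\{[^{}]*\}|[^>])*>"
)

# Hypothetical JSX samples (assumptions for illustration only).
with_arrow = '<Button onClick={() => save()}\n  className={style.btn}>'
no_class = '<Button onClick={() => save()}>'

# The {...} attribute value is consumed as one unit, so the > in => is skipped.
print(bool(ANSWER.search(with_arrow)))  # True
# Without a className attribute there is nothing to match.
print(bool(ANSWER.search(no_class)))    # False
```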

9 Comments

Sorry to ask you through comment. This one doesn't match <Button onChange={(value) => {this.state;}} className={style.textArea}></Button>
@KaruppiahRK Yes, because of nested curly braces. Whether it is possible for you to solve or not, you should name the regex library you are using.
What does "regex library name" mean? Sorry, I don't get it. I tried to solve it but couldn't. I don't have enough knowledge of regex patterns.
@KaruppiahRK What is the programming language?
@KaruppiahRK If you only have one level of nested curly braces, you may use <Button(?:\w+=\{[^{}]*(?:\{[^{}]*\}[^{}]*)*\}|[^>])*\sclassName=(?:\w+=\{[^{}]*(?:\{[^{}]*\}[^{}]*)*\}|[^>])*>, see this regex demo.
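
To illustrate the one-level limit mentioned in the last comment, here is a small Python sketch (again using the re module as a stand-in for the editor's engine). The names ATTR and NESTED and the two_levels sample are hypothetical; one_level is the line quoted in the comments above.

```python
import re

# Attribute value with at most one level of nested braces,
# taken from the pattern in the comment above.
ATTR = r"\w+=\{[^{}]*(?:\{[^{}]*\}[^{}]*)*\}"
NESTED = re.compile(rf"<Button(?:{ATTR}|[^>])*\sclassName=(?:{ATTR}|[^>])*>")

one_level = ('<Button onChange={(value) => {this.state;}} '
             'className={style.textArea}></Button>')
# Hypothetical sample with braces nested two levels deep.
two_levels = '<Button onChange={() => {if (a) {b();}}} className={s.x}></Button>'

print(bool(NESTED.search(one_level)))   # True
print(bool(NESTED.search(two_levels)))  # False: deeper nesting is not covered
```

Each extra nesting level needs another nested group in the pattern, which is why the earlier comments suggest a real parser for anything beyond simple cases.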