I have a regex to parse all hash URLs in HTML content.

/(\#)([^\s]+")/g

The HTML content looks like this:

Some text <a href="#some-hash1">some link</a>some content <a href="#some-hash2">some link1</a>

The expected result is:

#some-hash1, #some-hash2

But the current regex returns this (the closing double quote comes along with the hash):

#some-hash1", #some-hash2"

I don't understand why the double quotes are included. Any suggestions would be very helpful.

4 Answers

2

I wouldn't use regex for this because it's overkill: you can simply loop through the anchors and pull the value of each href.

// Collect the href attribute of every anchor in the document.
var anchors = document.querySelectorAll('a');
var hrefs = [];

anchors.forEach(function (anchor) {
	hrefs.push(anchor.getAttribute('href'));
});

console.log(hrefs);
<a href="link 1">link 1</a>
<a href="link 2">link 2</a>

1

Use a positive lookahead, which requires the double quote to follow the match without including it in the result:

/(\#)([^\s]+(?="))/g

DEMO

var z = 'Some text <a href="#some-hash1">some link</a>some content <a href="#some-hash2">some link1</a>';
console.log(z.match(/(\#)([^\s]+(?="))/g));

0

Just move the double quote outside the parentheses, so that it is still matched but no longer captured:

(\#)([^\s]+)"

See how it works: https://regex101.com/r/fmrDyu/1
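
One caveat may be worth illustrating: with the quote outside the group, the full match still ends in the quote, so the result depends on whether you read the whole match or the capture groups. The following is only a minimal sketch using the sample HTML from the question; the variable names are made up for illustration.

```javascript
const html = 'Some text <a href="#some-hash1">some link</a>some content <a href="#some-hash2">some link1</a>';
const pattern = /(\#)([^\s]+)"/g;

// With the g flag, match() returns the full matches, so the quote is still there.
const fullMatches = html.match(pattern);

// matchAll() exposes the capture groups of each match; group 1 is "#",
// group 2 is the hash name without the trailing quote.
const hashes = Array.from(html.matchAll(pattern), (m) => m[1] + m[2]);

console.log(fullMatches); // [ '#some-hash1"', '#some-hash2"' ]
console.log(hashes);      // [ '#some-hash1', '#some-hash2' ]
```

If you only call match() with the g flag, the lookahead version is the one that drops the quote from the output.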

0

I am assuming that you are looking at the content of $2 for your result.

If so, the problem is the " inside the second capture group. Changing /(\#)([^\s]+")/g to /(\#)([^\s]+)"/g gives the correct result.

I suggest joining the capture groups. Then /(\#[^\s]+)"/g will return $1 => #some-hash1, #some-hash2.

Since $1 in the original regex always contains just #, I suppose you trim it off elsewhere in your program, so perhaps you should use /\#([^\s]+)"/g, which returns some-hash1, some-hash2 without the #.
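
As a rough sketch of how the two single-group patterns above could be read in JavaScript, assuming String.prototype.matchAll is available (the extractGroup helper and the sample variable are hypothetical names, not part of the question):

```javascript
const sample = 'Some text <a href="#some-hash1">some link</a>some content <a href="#some-hash2">some link1</a>';

// Hypothetical helper: returns capture group 1 of every match of a global regex.
function extractGroup(text, pattern) {
  return Array.from(text.matchAll(pattern), (m) => m[1]);
}

// Joined capture group: the # is part of $1, the quote is outside it.
const withHash = extractGroup(sample, /(\#[^\s]+)"/g);

// The # moved outside the group: $1 holds only the name.
const withoutHash = extractGroup(sample, /\#([^\s]+)"/g);

console.log(withHash);    // [ '#some-hash1', '#some-hash2' ]
console.log(withoutHash); // [ 'some-hash1', 'some-hash2' ]
```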
