
Here is some HTML:

<h2>Relative URLs</h2>
<p><a href="html_images.asp">HTML Images</a></p>
<p><a href="/css/default.asp">CSS Tutorial</a></p>

How can I replace text, change its case, or otherwise transform it in Go without affecting any HTML tags? For example, the desired result is:

<h2>RELATIVE URLS</h2>
<p><a href="html_images.asp">HTML IMAGES</a></p>
<p><a href="/css/default.asp">CSS TUTORIAL</a></p>
  • Please include what you tried. Commented Oct 8, 2022 at 8:01
  • Use golang.org/x/net/html. Commented Oct 8, 2022 at 8:57
  • I am new to Go. Can you please show me in more detail? Thanks a lot! Commented Oct 8, 2022 at 13:46

1 Answer

You can try an XPath-based parser such as htmlquery (github.com/antchfx/htmlquery):

package main

import (
	"fmt"
	"strings"

	"github.com/antchfx/htmlquery"
	"golang.org/x/net/html"
)

func main() {
	s := `<html><head></head><body><h2>Relative URLs</h2>
<p><a href="html_images.asp">HTML Images</a></p></body></html>`

	doc, err := htmlquery.Parse(strings.NewReader(s))
	if err != nil {
		panic(err)
	}
	fmt.Printf("Before update \n%s\n", htmlquery.OutputHTML(doc, true))

	// Visit every element in the body and upper-case its direct text children.
	for _, node := range htmlquery.Find(doc, "/html/body//*") {
		for c := node.FirstChild; c != nil; c = c.NextSibling {
			if c.Type == html.TextNode { // skip elements, comments, etc.
				c.Data = strings.ToUpper(c.Data)
			}
		}
	}
	fmt.Printf("After update \n%s\n", htmlquery.OutputHTML(doc, true))
}

Output

Before update 
<html><head></head><body><h2>Relative URLs</h2>
<p><a href="html_images.asp">HTML Images</a></p></body></html>
After update 
<html><head></head><body><h2>RELATIVE URLS</h2>
<p><a href="html_images.asp">HTML IMAGES</a></p></body></html>

7 Comments

Thank you very much for your example! But as you can see, "Relative URLs" is still not in upper case. How can I make ALL of the text upper case while leaving the tags untouched?
This code: node := htmlquery.FindOne(doc, "/html/body/*") changes only "Relative URLs" ...
@Seomat You need to use "/html/body//*" to match all nodes in the body, together with htmlquery.Find, which returns a list of matched nodes (all of them in this case). The update can then be done based on the DataAtom of each node.
@Seomat, if this answer solved your problem, please accept it (click the ✔️ near the top left of the answer).
Check node.FirstChild for nil in the loop.