4

Let's say I have a page as follows, saved at c:\temp\html_page.html:

<html>
   <head>
      <link rel="stylesheet" href="styles.css">
   </head>
   <body>
      <div id="xxx1">
         <img src="test.png">
      </div>
   </body>
</html>

I'd like to programatically adjust the src attribute of the img, based on Excel data & VBA. Basically a way to find the div with Xpath, and adjust the (single) img tag that is contained in it.

I found an example for manipulating XML with VBA through the XML library here, but I've been crunching my head around making this work with the HTML object library; can't find any examples and/or documentation.

Dim XDoc As Object, root As Object

Set XDoc = CreateObject("MSXML2.DOMDocument")
XDoc.async = False: XDoc.validateOnParse = False

If XDoc.Load(html_path) Then
    Debug.Print "Document loaded"
Else
    Dim strErrText As String
    Dim xPE As MSXML2.IXMLDOMParseError
    ' Obtain the ParseError object
    Set xPE = XDoc.parseError
    With xPE
       strErrText = "Your XML Document failed to load" & _
         "due the following error." & vbCrLf & _
         "Error #: " & .ErrorCode & ": " & xPE.reason & _
         "Line #: " & .Line & vbCrLf & _
         "Line Position: " & .linepos & vbCrLf & _
         "Position In File: " & .filepos & vbCrLf & _
         "Source Text: " & .srcText & vbCrLf & _
         "Document URL: " & .URL
    End With
    MsgBox strErrText, vbExclamation

All I want to do is:

'...
Set outer_div = XDoc.SelectFirstNode("//div[id='xxx1'")
... edit the img attribute

But I can't load the HTML page, because it's not proper XML (img tag isn't closed).

Any help is greatly appreciated. Oh, and I can't use other languages such as Python, bummer.

2 Answers 2

4

This isn't quite what you want, but it may be close enough. Rather than using the XML library, use the HTML library:

Sub changeImg()

    Dim dom As Object
    Dim img As Object
    Dim src As String

    Set dom = CreateObject("htmlFile")

    Open "C:\temp\test.html" For Input As #1
        src = Input$(LOF(1), 1)
    Close #1

    dom.body.innerHTML = src

    Set img = dom.getelementsbytagname("img")(0)

    img.src = "..."

    Open "C:\temp\test.html" For Output As #1
        Print #1, dom.DocumentElement.outerHTML
    Close #1


End Sub

The issue is that the resulting file will have Head nodes added and the tag names will be in upper case. If you can live with this, the solution will work for you.

As an aside if you want to do things a bit more in depth, with better selectors consider early binding. The HTML interface exposed is different than the interface when late bound and supports more properties - you'll want to add a reference to the HTML Object Library:

Sub changeImg()

    Dim dom As HTMLDocument
    Dim img As Object
    Dim src As String

    Set dom = CreateObject("htmlFile")

    Open "C:\temp\test.html" For Input As #1
        src = Input$(LOF(1), 1)
    Close #1

    dom.body.innerHTML = src

    Set img = dom.getelementsbytagname("img")(0)

    img.src = "..."

    Open "C:\temp\test.html" For Output As #1
        Print #1, dom.DocumentElement.outerHTML
    Close #1


End Sub
Sign up to request clarification or add additional context in comments.

2 Comments

Thanks a bunch! It seems like I'm almost there: the question wasn't 100% accurate. I'm looking for a solution that works for a multiline HTML file. I'm trying to find how to adjust the code, but haven't been succesful yet. Would you mind adding that to the answer?
@MattV, sorry I must be missing something, why doesn't this work for multi-line files? Let me know and I'll update
1

For this purpose you can use doc.querySelector("div[id='xxx1'] img"). To change src attribute use img.setAttribute "src", "new.png". HTH

Option Explicit

' Add reference to Microsoft Internet Controls (SHDocVw)
' Add reference to Microsoft HTML Object Library

Sub Demo()
    Dim ie As SHDocVw.InternetExplorer
    Dim doc As MSHTML.HTMLDocument
    Dim url As String

    url = "file:///C:/Temp/StackOverflow/html/html_page.html"
    Set ie = New SHDocVw.InternetExplorer
    ie.Visible = True
    ie.navigate url
    While ie.Busy Or ie.readyState <> READYSTATE_COMPLETE: DoEvents: Wend
    Set doc = ie.document

    Dim img As HTMLImg
    Set img = doc.querySelector("div[id='xxx1'] img")
    If Not img Is Nothing Then
        img.setAttribute "src", "new.png"
    End If
    ie.Quit
End Sub

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.