Disclosure: I'm just an office clerk and very new to VBA and HTML. I hope you will be patient with me; I would really appreciate any guidance and help. Hopefully I'm formatting this correctly.

I spent the whole day yesterday trying to import information from an intranet web page to automate routine copying and pasting. This will really help in the long run.

Since Power Query doesn't seem to see the table I need, I figured the only option is to use VBA. The MsServer tool grabs the page perfectly, but to my disappointment the page came back with an error, since it requires authorization before it can be accessed.

I figured it should work with IE, since IE has the login information in its cookies.

Here is where I've got so far.

 Sub ExtractFromEndeca()
     Dim ie As InternetExplorer
     Dim html As IHTMLDocument
     Set ie = CreateObject("InternetExplorer.Application")
     ie.Visible = False
     ie.Navigate "intranet address"
     While ie.Busy
         DoEvents
     Wend
     While ie.ReadyState < 4
         DoEvents
     Wend
     Set Doc = CreateObject("htmlfile")
     Set Doc = ie.document
     Set Data = Doc.getElementById("findSimilarOptions2")
     Sheet1.Cells(1, 1) = Data
     ie.Quit
     Set ie = Nothing

     ThisWorkbook.Sheets(1).Cells(1, 1) = Data
 End Sub

The result is [object] in cell A1 and that's it, and I can't tell whether I got past the login or not.

Here is a page fragment I'm trying to grab. Ideally this data will be output as a table.

   <td valign="top" id="findSimilarOptions2">
<div class="subtitle">Part Attributes</div>
    <input type="checkbox" id="n_200012" value="-19192896" NAME="n_200012">
    <b>
    ASSY TYPE</b>&nbsp;>
    Component<br>

    <input type="checkbox" id="n_200013" value="-18148519" NAME="n_200013">
    <b>
    PARAMETER I NEED(1)</b>&nbsp;>
    VALUE I NEED(1)<br>

    <input type="checkbox" id="n_200006" value="-20823731" NAME="n_200006">
    <b>
    PARAMETER I NEED(2)</b>&nbsp;>
    VALUE I NEED(2)<br>

    <input type="checkbox" id="n_200006" value="-20823618" NAME="n_200006">
    <b>
    PARAMETER I NEED(3)</b>&nbsp;>
    VALUE I NEED(3)<br>

    <input type="checkbox" id="n_200006" value="-20823586" NAME="n_200006">
    <b>
    PARAMETER I NEED(4)</b>&nbsp;>
    VALUE I NEED(4)<br>
    ...
  • Welcome to SO. What happens if you do Sheet1.Cells(1, 1) = Data.Value instead of Sheet1.Cells(1, 1) = Data? Commented Jan 9, 2020 at 9:16
  • Ahh, I think I know why. <td valign="top" id="findSimilarOptions2"> doesn't have any value, so it cannot return one to your Excel file. However, other elements of your HTML code should work, e.g. <input type="checkbox" id="n_200012" value="-19192896" NAME="n_200012">. Try testing Set Data = Doc.getElementById("n_200012") and then Sheet1.Cells(1, 1) = Data.Value. Commented Jan 9, 2020 at 9:39
  • @JustynaMK Yes, it grabs the value -19192896! Thank you. But it is not what I require. I need the inner text of id "findSimilarOptions2". Commented Jan 9, 2020 at 10:03
  • @JustynaMK UPD: I just tested and it grabs it with data.innerText! I'm so excited to finally be getting somewhere that I forgot to go to lunch. I will try to incorporate the answer below to help structure this data after lunch. Thanks a lot again. Commented Jan 9, 2020 at 10:11
  • Very positive news! Glad you are progressing nicely. I know exactly how you feel, but please do not forget to eat :-) Take care. Commented Jan 9, 2020 at 10:26

1 Answer

Please read my comments in the following code:

'Use the following line in every module head
'It forces you to define all variables
Option Explicit

'Early binding needs references to "Microsoft Internet Controls"
'and "Microsoft HTML Object Library" (Tools > References)
Sub ExtractFromEndeca()

Dim ie As InternetExplorer
Dim doc As HTMLDocument 'You don't use html in your code, but doc
Dim data As IHTMLElementCollection 'getElementsByTagName() returns a collection, not a single element
Dim singleData As Object 'New variable, holds one input element per loop pass
Dim row As Long 'New variable

  row = 1 'First row for output in Excel table

  'Set ie = CreateObject("InternetExplorer.Application") 'This could be problematic on the intranet due to security guidelines
  Set ie = GetObject("new:{D5E8041D-920F-45e9-B8FB-B1DEB82C6E5E}") 'Try this instead to initialize the IE
  ie.Visible = True 'This property should be True during development
  ie.Navigate "intranet address"
  'While ie.Busy: DoEvents: Wend 'You don't need this line
  While ie.ReadyState <> 4: DoEvents: Wend
  'Set Doc = CreateObject("htmlfile") 'You don't need this line
  Set doc = ie.document
  Set data = doc.getElementById("findSimilarOptions2").getElementsByTagName("input")

  'data is only a reference to a collection of objects
  'This loop writes the value attribute of each input tag to the sheet
  For Each singleData In data
    Sheet1.Cells(row, 1) = singleData.Value
    row = row + 1
  Next singleData

  'Clean up
  '(Automatic after development has finished)
  'ie.Quit
  'Set ie = Nothing
End Sub
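
If it is unclear whether the page got past the login, one possible check is to look at where IE actually ended up and whether the target element exists before using it. The following is only a sketch: `ElementIsPresent` is a hypothetical helper name, and it assumes that a failed login shows up as a different URL or title, or as a page without the `findSimilarOptions2` element.

```vba
Option Explicit

' Hypothetical helper, not part of the original code.
' Prints the current URL and page title to the Immediate window
' and reports whether an element with the given id exists.
Function ElementIsPresent(ByVal ie As Object, ByVal elementId As String) As Boolean
    Dim el As Object

    ' A login page usually has a different address and title
    ' than the page that was requested (assumption)
    Debug.Print "URL:   " & ie.LocationURL
    Debug.Print "Title: " & ie.document.Title

    ' getElementById returns Nothing when no element has that id
    Set el = ie.document.getElementById(elementId)
    ElementIsPresent = Not (el Is Nothing)
End Function
```

It could be called right after the `ReadyState` loop, for example `If Not ElementIsPresent(ie, "findSimilarOptions2") Then Exit Sub`, so the macro stops cleanly instead of failing on a missing object.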
3 Comments

Hi and thanks a lot. This became very clear. It opens a new IE window, but it produces an error on this line: Set data = doc.getElementById("findSimilarOptions2")(0).getElementsByTagName("input"). The error is 424: Object required. Does it mean it hasn't gotten through the login?
@AndreyRassanov Sorry, my fault. getElementById() doesn't need the (0) to pick a specific element out of a node collection, because an id should occur only once in an HTML document. That's why getElementById() doesn't return a node collection. I edited the line.
It produces error '13', Type mismatch, on the same line. Am I understanding correctly that you are trying to grab the <input> tag? What I need is the text in between those tags. If I understand correctly, the input tag is self-contained and doesn't include this text in the first place.
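
Since the text between the tags is what is needed, and `innerText` was reported in the comments on the question to return it, one way to turn that text into a two-column table is to split it line by line. This is a minimal sketch under assumptions: `WritePairsToSheet` is a hypothetical name, and it assumes `innerText` yields one `NAME > VALUE` pair per line, with the `&nbsp;` arriving as a non-breaking space. The real text should be inspected first, because the line layout may differ.

```vba
Option Explicit

' Hypothetical helper, not part of the original code.
' Writes "NAME > VALUE" lines from rawText into columns A and B of target.
Sub WritePairsToSheet(ByVal rawText As String, ByVal target As Worksheet)
    Dim textLines() As String
    Dim oneLine As String
    Dim i As Long
    Dim pos As Long
    Dim outRow As Long

    ' Normalize line breaks and turn non-breaking spaces into plain spaces
    rawText = Replace(rawText, vbCrLf, vbLf)
    rawText = Replace(rawText, vbCr, vbLf)
    rawText = Replace(rawText, ChrW(160), " ")

    textLines = Split(rawText, vbLf)
    outRow = 1

    For i = LBound(textLines) To UBound(textLines)
        oneLine = Trim$(textLines(i))
        pos = InStr(oneLine, ">")
        ' Lines without a separator (e.g. the "Part Attributes" subtitle) are skipped
        If pos > 1 Then
            target.Cells(outRow, 1).Value = Trim$(Left$(oneLine, pos - 1))
            target.Cells(outRow, 2).Value = Trim$(Mid$(oneLine, pos + 1))
            outRow = outRow + 1
        End If
    Next i
End Sub
```

A call could look like `WritePairsToSheet doc.getElementById("findSimilarOptions2").innerText, Sheet1`. Splitting on the first `>` keeps any later `>` characters inside the value.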
