I use the code below to fetch the HTML source of a web page. It works fine for English text, but text in any other language turns into gibberish when it is imported.

How can I change the code below so that text in other languages is imported as written?

Sub test()
    Dim FILENAME As String
    Dim FileNum As Long

    FILENAME = "C:\Temp\Source.txt"
    FileNum = FreeFile

    Open FILENAME For Output As FileNum
    Print #FileNum, GetSource("https://www.pleasehelp.com/thankyou.html")
    Close FileNum

    With ActiveSheet.QueryTables.Add(Connection:="TEXT;C:\TEMP\Source.txt", Destination:=Range("A1"))
        .Name = "Source"
        .AdjustColumnWidth = True
        .TextFileParseType = xlFixedWidth
        .TextFileTextQualifier = xlTextQualifierDoubleQuote
        .TextFileColumnDataTypes = Array(2)
        .Refresh BackgroundQuery:=False
    End With
End Sub

Function GetSource(sURL As String) As String
    Dim oXHTTP As Object
    Set oXHTTP = CreateObject("MSXML2.XMLHTTP")
    oXHTTP.Open "GET", sURL, False
    oXHTTP.send
    GetSource = oXHTTP.responsetext
    Set oXHTTP = Nothing
End Function

1 Answer

Try using the ADODB Stream object instead of the Open statement to write the file. A Stream lets you set the character set used to encode the text when it is saved. The following example uses UTF-8.

Note that the code uses early binding, so you'll need to set a reference (Visual Basic Editor >> Tools >> References) to the Microsoft ActiveX Data Objects x.x Library. GetSource is the function from the question.

Option Explicit

Sub test()

    Dim outFile As String
    outFile = "C:\Temp\Source.txt"

    Dim stream As ADODB.Stream
    Set stream = New ADODB.Stream
    With stream
        .Charset = "UTF-8"
        .Mode = adModeReadWrite
        .Type = adTypeText
        .Open
        .WriteText GetSource("https://www.pleasehelp.com/thankyou.html")
        .SaveToFile outFile, adSaveCreateOverWrite 'overwrites any already existing file
        .Close
    End With

    With ActiveSheet.QueryTables.Add(Connection:="TEXT;" & outFile, Destination:=Range("A1"))
        .Name = "Source"
        .AdjustColumnWidth = True
        .TextFileParseType = xlFixedWidth
        .TextFileTextQualifier = xlTextQualifierDoubleQuote
        .TextFileColumnDataTypes = Array(2)
        .Refresh BackgroundQuery:=False
    End With

End Sub
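
If setting the reference is a problem, the same write can probably be done with late binding. The following is only a sketch under that assumption: SaveTextAsUtf8 is a hypothetical helper name, and the constants are declared locally because the ADODB enum names are not available without the reference.

```vba
Option Explicit

' Late-bound sketch: no reference to Microsoft ActiveX Data Objects is needed.
' SaveTextAsUtf8 is a hypothetical helper name, not part of the code above.
Sub SaveTextAsUtf8(ByVal content As String, ByVal outFile As String)

    ' Numeric values of the ADODB enum members used in the early-bound version
    Const adTypeText As Long = 2
    Const adModeReadWrite As Long = 3
    Const adSaveCreateOverWrite As Long = 2

    Dim stream As Object
    Set stream = CreateObject("ADODB.Stream")
    With stream
        .Charset = "UTF-8"
        .Mode = adModeReadWrite
        .Type = adTypeText
        .Open
        .WriteText content
        .SaveToFile outFile, adSaveCreateOverWrite 'overwrites any already existing file
        .Close
    End With

End Sub
```

It would be called as SaveTextAsUtf8 GetSource("https://www.pleasehelp.com/thankyou.html"), "C:\Temp\Source.txt", with the QueryTables part left unchanged.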
1 Comment

Thanks for your help, Domenic. I tried your code, but since I don't know much about VBA I was still getting an error; it could be that I chose the wrong ActiveX library. In the end I just added .TextFilePlatform = 65001 to the query table and, in the function, replaced GetSource = oXHTTP.responsetext with GetSource = StrConv(oXHTTP.responseBody, vbUnicode), and it worked.
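
For anyone following along, the two changes described in this comment would look roughly like the sketch below, applied to the code from the question. It assumes the page is served as UTF-8 and that the raw bytes survive the round trip through the system ANSI code page when written with Print #; that seems to hold on common single-byte code pages but is not guaranteed everywhere.

```vba
Sub test()
    Dim FILENAME As String
    Dim FileNum As Long

    FILENAME = "C:\Temp\Source.txt"
    FileNum = FreeFile

    Open FILENAME For Output As FileNum
    Print #FileNum, GetSource("https://www.pleasehelp.com/thankyou.html")
    Close FileNum

    With ActiveSheet.QueryTables.Add(Connection:="TEXT;" & FILENAME, Destination:=Range("A1"))
        .Name = "Source"
        .AdjustColumnWidth = True
        .TextFilePlatform = 65001 ' change 1: read the file as UTF-8 (code page 65001)
        .TextFileParseType = xlFixedWidth
        .TextFileTextQualifier = xlTextQualifierDoubleQuote
        .TextFileColumnDataTypes = Array(2)
        .Refresh BackgroundQuery:=False
    End With
End Sub

Function GetSource(sURL As String) As String
    Dim oXHTTP As Object
    Set oXHTTP = CreateObject("MSXML2.XMLHTTP")
    oXHTTP.Open "GET", sURL, False
    oXHTTP.send
    ' change 2: take the raw response bytes instead of responseText;
    ' StrConv widens each byte to a character using the system code page
    GetSource = StrConv(oXHTTP.responseBody, vbUnicode)
    Set oXHTTP = Nothing
End Function
```

The idea, as far as I can tell, is that the file ends up holding the page's original UTF-8 bytes, and TextFilePlatform then tells Excel how to decode them on import.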
