I have XML text in an Excel sheet that I want to encode with Base64. I've found some code which actually does a good job. The only problem is that I cannot encode special characters like the German umlaut with VBA in Excel. So does anybody know how to do this? I think it should be possible since the PHP base64 function can handle such characters too.
1 Answer
Base64 encodes bytes, not characters, so a Base64-encoded string represents the original text as stored in some charset, and that charset must be able to represent the special characters. So first of all you should choose which charset to use. Usually UTF-8 or UTF-16 will do. I guess the issue you encountered is caused by ASCII.
In the example below, the TextBase64Encode function encodes text to Base64 and TextBase64Decode decodes it back to text:
Function TextBase64Encode(strText, strCharset)
    Dim arrBytes
    ' Convert the text to bytes in the requested charset
    With CreateObject("ADODB.Stream")
        .Type = 2 ' adTypeText
        .Open
        .Charset = strCharset
        .WriteText strText
        .Position = 0
        .Type = 1 ' adTypeBinary
        arrBytes = .Read
        .Close
    End With
    ' Convert the bytes to Base64 and strip the line breaks from the result
    With CreateObject("MSXML2.DOMDocument").createElement("tmp")
        .DataType = "bin.base64"
        .nodeTypedValue = arrBytes
        TextBase64Encode = Replace(Replace(.Text, vbCr, ""), vbLf, "")
    End With
End Function

Function TextBase64Decode(strBase64, strCharset)
    Dim arrBinary
    ' Convert the Base64 string back to bytes
    With CreateObject("MSXML2.DOMDocument").createElement("tmp")
        .DataType = "bin.base64"
        .Text = strBase64
        arrBinary = .nodeTypedValue
    End With
    ' Interpret the bytes as text in the requested charset
    With CreateObject("ADODB.Stream")
        .Type = 1 ' adTypeBinary
        .Open
        .Write arrBinary
        .Position = 0
        .Type = 2 ' adTypeText
        .Charset = strCharset
        TextBase64Decode = .ReadText
        .Close
    End With
End Function
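The two functions can also be called directly from VBA rather than from a worksheet formula. The following is a minimal sketch of a round trip, assuming both functions above are in the same module; the procedure name DemoTextBase64RoundTrip and the sample text are made up for illustration.

```vba
' Hypothetical demo procedure, not part of the original answer.
Sub DemoTextBase64RoundTrip()
    Dim strSample, strEncoded, strDecoded
    ' Build the sample with ChrW so it does not depend on the editor's code page
    ' (252 = u-umlaut, 246 = o-umlaut, 223 = sharp s)
    strSample = "M" & ChrW(252) & "ller, K" & ChrW(246) & "ln, Stra" & ChrW(223) & "e"
    strEncoded = TextBase64Encode(strSample, "UTF-8")
    strDecoded = TextBase64Decode(strEncoded, "UTF-8")
    ' Show the Base64 text in the Immediate window
    Debug.Print strEncoded
    ' Show whether the decoded text equals the original sample
    Debug.Print (strDecoded = strSample)
End Sub
```

If the second line printed is False, the chosen charset could not represent every character in the sample.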
To make it clearer, I entered a sample containing special characters such as German umlauts in cell A2, the charset ASCII in B2, and the formula =TextBase64Encode($A$2;B2) in C2. The semicolon is the argument separator in my locale; use a comma if your Excel expects one. The ASCII representation of the string, encoded to Base64, appeared in cell C2.
I entered the formula =TextBase64Decode(C2;B2) in D2 to decode the Base64 back to text.
I then added more charsets, filled the formulas down, and added a header row.
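The same comparison can be sketched in VBA without the worksheet. This assumes the TextBase64Encode and TextBase64Decode functions above; the procedure name DemoCompareCharsets is made up, and the charset names are taken from this answer, so each of them must be registered on your system or ADODB.Stream will raise an error.

```vba
' Hypothetical demo procedure, not part of the original answer.
Sub DemoCompareCharsets()
    Dim strSample, varCharset, strEncoded, strDecoded
    ' a-umlaut, o-umlaut, u-umlaut, sharp s
    strSample = ChrW(228) & ChrW(246) & ChrW(252) & ChrW(223)
    ' Charset names as used in the answer; adjust to what your system knows
    For Each varCharset In Array("ASCII", "UTF-8", "UTF-16", "Windows-1250")
        strEncoded = TextBase64Encode(strSample, varCharset)
        strDecoded = TextBase64Decode(strEncoded, varCharset)
        ' Charset, Base64 text, and whether the round trip preserved the sample
        Debug.Print varCharset, strEncoded, (strDecoded = strSample)
    Next
End Sub
```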
Now you can see that UTF-8, UTF-16, UTF-7, Windows-1250 and latin1 preserve the umlauts of the sample, while ASCII corrupts them. For a list of the character set names known to a system, see the subkeys of HKEY_CLASSES_ROOT\MIME\Database\Charset in the Windows Registry. Note that a UTF-16 LE encoded string often starts with the bytes 0xFF, 0xFE, which are the Unicode byte order mark (BOM), and that the UTF-8 representation of the BOM is the bytes 0xEF, 0xBB, 0xBF at the start. BTW, just replace bin.base64 with bin.hex to work with hexadecimal values instead of Base64.
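Because of the BOM, the Base64 text can differ from what PHP's base64_encode returns for the same UTF-8 string, since PHP encodes only the bytes it is given. If that matters, one possible approach is to skip the BOM bytes before reading the binary data. The sketch below is a variant of TextBase64Encode; the name TextBase64EncodeNoBom and the lngBomLength parameter are my own assumptions, and the caller has to pass the right length (3 for UTF-8, 2 for UTF-16, 0 for charsets without a BOM).

```vba
' Hypothetical variant of TextBase64Encode that skips a leading BOM.
' lngBomLength is supplied by the caller: 3 for UTF-8, 2 for UTF-16, 0 for none.
Function TextBase64EncodeNoBom(strText, strCharset, lngBomLength)
    Dim arrBytes
    With CreateObject("ADODB.Stream")
        .Type = 2 ' adTypeText
        .Open
        .Charset = strCharset
        .WriteText strText
        .Position = 0
        .Type = 1 ' adTypeBinary
        If .Size <= lngBomLength Then
            ' Nothing beyond the BOM was written (for example, empty input)
            .Close
            TextBase64EncodeNoBom = ""
            Exit Function
        End If
        .Position = lngBomLength ' skip the BOM bytes
        arrBytes = .Read
        .Close
    End With
    With CreateObject("MSXML2.DOMDocument").createElement("tmp")
        .DataType = "bin.base64"
        .nodeTypedValue = arrBytes
        TextBase64EncodeNoBom = Replace(Replace(.Text, vbCr, ""), vbLf, "")
    End With
End Function
```

For example, =TextBase64EncodeNoBom($A$2;"UTF-8";3) would encode only the UTF-8 bytes of the text itself. TextBase64Decode can still be used on the result, since a BOM is not required for decoding when the charset is given explicitly.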



In short: convert the text to binary with ADODB.Stream, then convert that binary to Base64 via MSXML2.DOMDocument.