I have an XML file in my blob storage. It contains words like this: Družstevní. When I download the XML using the Azure portal, the word is still correct.
But when I download it with DownloadToStreamAsync, the result is Dru�stevn�.

How do I fix this?

I found that DownloadTextAsync works because I can set the encoding: Encoding.GetEncoding(1252).
But then I end up with a string, and the rest of my code expects a stream. Should I read the string back into a stream, or is there a more elegant option?

Here's my code:

public Task<string> DownloadAsTextAsync(string code, Encoding encoding)
{
    var blockBlob = _container.GetBlockBlobReference(code);
    var blobRequestOptions = new BlobRequestOptions
    {
        MaximumExecutionTime = TimeSpan.FromMinutes(15),
        ServerTimeout = TimeSpan.FromHours(1)
    };

    return blockBlob.DownloadTextAsync(encoding, null, blobRequestOptions, null);
}

public async Task<Stream> DownloadAsStreamAsync(string code)
{
    var blockBlob = _container.GetBlockBlobReference(code);
    var blobRequestOptions = new BlobRequestOptions
    {
        MaximumExecutionTime = TimeSpan.FromMinutes(15),
        ServerTimeout = TimeSpan.FromHours(1)
    };
    var output = new MemoryStream();
    await blockBlob.DownloadToStreamAsync(output, null, blobRequestOptions, null);
    return output;
}

Edit, after Zhaoxing Lu's comment:
I changed my unit test to pass the encoding to the StreamReader, and now the test passes:

using (var streamReader = new StreamReader(stream, Encoding.GetEncoding(1252)))
{
    string line;
    while ((line = streamReader.ReadLine()) != null)
    {
        if (!line.StartsWith("            <Str>Dru")) continue;

        Debug.WriteLine(line);
        var street = line.Trim().Replace("<Str>", "").Replace("</Str>", "");
        Assert.AreEqual("Družstevní", street);
    }
}

But in my 'real' code I'm sending the stream to load as XML:

fileStream.Position = 0;
var xmlDocument = XDocument.Load(fileStream);

The resulting xmlDocument has the wrong encoding, and I can't find a way to set the encoding when loading it.

  • Could you share the code that consumes the Stream returned from DownloadAsStreamAsync? I think this is the place you need to change. A Stream is just a binary stream; it is the reader that decides the encoding when consuming it. Commented Jul 30, 2019 at 1:50
  • Thanks @ZhaoxingLu-Microsoft, I've updated my post. You are right, the stream is OK. The problem seems to be when reading the stream as an XDocument. Commented Jul 30, 2019 at 6:56
  • Glad to hear that the stream is good. Regarding the XDocument issue, I'd suggest creating a new question, since the question above was about Azure Blob Storage. Commented Jul 30, 2019 at 9:02

1 Answer

The problem seems to be when reading the stream as an XDocument

You can wrap the stream in a StreamReader created with Encoding.GetEncoding("Windows-1252") and load the XDocument from that reader, so the bytes are decoded as Windows-1252 before the XML is parsed:

XDocument xmlDoc = null;

using (StreamReader oReader = new StreamReader(stream, Encoding.GetEncoding("Windows-1252")))
{
    xmlDoc = XDocument.Load(oReader);
}
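
To check this without touching blob storage, here is a minimal, self-contained sketch. The class Windows1252XmlDemo, the helper LoadXml, and the sample <Root><Str> document are hypothetical stand-ins, not part of the original code. It assumes .NET Core or .NET 5+, where code page 1252 has to be registered through CodePagesEncodingProvider first; on .NET Framework that encoding is available out of the box.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml.Linq;

public static class Windows1252XmlDemo
{
    // Hypothetical helper: loads XML from a byte stream with an explicit encoding.
    public static XDocument LoadXml(Stream stream, Encoding encoding)
    {
        // The reader, not the stream, decides how bytes become characters.
        using (var reader = new StreamReader(stream, encoding))
        {
            return XDocument.Load(reader);
        }
    }

    public static string ReadStreet()
    {
        // Assumption: running on .NET Core / .NET 5+, where legacy code pages
        // must be registered before Encoding.GetEncoding(1252) can be used.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        var windows1252 = Encoding.GetEncoding(1252);

        // Stand-in for the downloaded blob: Windows-1252 bytes, no XML declaration.
        byte[] bytes = windows1252.GetBytes("<Root><Str>Družstevní</Str></Root>");

        using (var stream = new MemoryStream(bytes))
        {
            return LoadXml(stream, windows1252).Root.Element("Str").Value;
        }
    }
}
```

Calling Windows1252XmlDemo.ReadStreet() should give back the street name with its diacritics intact. If the file itself started with an XML declaration naming its real encoding, XDocument.Load(stream) would likely be able to pick the encoding up on its own, but that depends on the file's contents, which are not shown in the question.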
