0

I’m developing a .NET 5, C# console application to create a CSV file from a list of custom objects, gzip it, and upload it to an Azure Storage container with this code:

var blobServiceClient = new BlobServiceClient("My connection string");
var containerClient = GetBlobContainerClient("My container name");
var config = new CsvConfiguration(CultureInfo.CurrentCulture) { Delimiter = ";", Encoding = Encoding.UTF8 };

var list = new List<FakeModel>
{
   new FakeModel { Field1 = "A", Field2 = "B" },
   new FakeModel { Field1 = "C", Field2 = "D" }
};

await using var memoryStream = new MemoryStream();
await using var streamWriter = new StreamWriter(memoryStream);
await using var csvWriter = new CsvWriter(streamWriter, config);

await csvWriter.WriteRecordsAsync(list);

await using var zip = new GZipStream(memoryStream, CompressionMode.Compress, true);
await memoryStream.CopyToAsync(zip);

memoryStream.Seek(0, SeekOrigin.Begin);
var blockBlob = containerClient.GetBlockBlobClient("test.csv.gz");
await blockBlob.UploadAsync(memoryStream);

It appears to work, but when I download the gzip from the cloud to check it, I get the following error when trying to decompress it:

Error message from 7-Zip saying the archive is invalid

Inspection of the file shows it has a length of 0.

Can you help me understand why?

5
  • The CopyToAsync method may not be completing before the Upload is complete. Commented Jun 11, 2021 at 10:23
  • If you open the .gz file in a hex editor (e.g. Can I hex edit a file in Visual Studio?), does it look like a .gz file? Commented Jun 11, 2021 at 10:24
  • @jdweng I’m awaiting it. Commented Jun 11, 2021 at 10:25
  • @AndrewMorton if I open it like that, I get 00000000 as content. Commented Jun 11, 2021 at 10:31
  • 1
    I guess GZipStream should be in your writer-stack between MemoryStream and StreamWriter. So, CsvWriter -> StreamWriter -> GZipStream -> MemoryStream Commented Jun 11, 2021 at 10:38

1 Answer 1

2

The stream parameter of a new GZipStream is the destination stream. To process the input to the output, you need to write to the instance of GZipStream somehow.

When I experimented with it, I found that a call to csvWriter.FlushAsync() was necessary.

Like this:

class Program
{
    public class FakeModel
    {
        public string Field1 { get; set; }
        public string Field2 { get; set; }
    }

    static async Task Main(string[] args)
    {
        var config = new CsvConfiguration(CultureInfo.CurrentCulture) { Delimiter = ";", Encoding = Encoding.UTF8 };

        var list = new List<FakeModel>
                        {
                           new FakeModel { Field1 = "A", Field2 = "B" },
                           new FakeModel { Field1 = "C", Field2 = "D" }
                        };

        await using var memoryStream = new MemoryStream();
        await using var streamWriter = new StreamWriter(memoryStream);
        await using var csvWriter = new CsvWriter(streamWriter, config);

        await csvWriter.WriteRecordsAsync(list);
        await csvWriter.FlushAsync();

        memoryStream.Position = 0;

        await using var fs = new FileStream(@"C:\temp\SO67935249.csv.gz", FileMode.Create);
        await using var zip = new GZipStream(fs, CompressionMode.Compress, true);
        await memoryStream.CopyToAsync(zip);

    }
}
Sign up to request clarification or add additional context in comments.

1 Comment

I think the example could be simplified by writing to gzipStream directly, i.e. new Streamwriter(new GzipStream(...)). Also, using var ... will do the cleanup when the method ends, in this case it a scoped using statement might be better, I think that should avoid the need for explicit Flushing.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.