
I am building a simple proxy server in .NET 8.

The process is as follows:

  • A client POSTs a request (possibly a large one)
  • My proxy server gets that request and needs to process the incoming stream to actually change it (required for our application)
  • The proxy server then builds a new HttpRequestMessage with a StreamContent
  • The incoming request's stream is read in chunks of 64K (set in appsettings)
  • Each chunk is transformed with our specific process
  • The transformed chunks then need to be written to the StreamContent
  • Once the entire incoming request is processed, the transformed payload is POSTed upstream with HttpClient

The problem is that, the way I am using StreamContent, I have to write all of the transformed data to its underlying stream and then set Position back to zero before I can POST it. If I understand correctly, this means the entire "new request" is held in memory on my proxy server.

If I use the obsolete HttpWebRequest for my new request, I can get the RequestStream and process my incoming message in chunks which I write directly to the RequestStream. It seems like this is a much better approach as it should induce less memory pressure.

Am I missing something here?

Following is the code I have for using StreamContent:

/// <summary>
/// Uses the HttpRequestMessage / HttpResponseMessage to communicate with
/// the proxied API. This requires the entire Request stream to be assembled
/// before it can be sent to the API.
/// </summary>
/// <param name="clientHttpContext">The HttpContext for this request.</param>
/// <returns>An HttpResponseMessage which exposes its stream for processing.</returns>
private async Task<HttpResponseMessage> SendToProxiedAPIWithStreamContent(HttpContext clientHttpContext)
{
    byte[] incomingRequestBuffer = new byte[_settings.ChunkSize];
    HttpResponseMessage? response = null;

    //
    // Create a MemoryStream to write chunks to for sending to the upstream server.
    //
    using (MemoryStream upstreamRequestStream = new MemoryStream(_settings.ChunkSize))
    {
        //
        // Create a StreamContent that wraps the MemoryStream (upstreamRequestStream).
        //
        StreamContent upstreamContent = new StreamContent(upstreamRequestStream);

        //
        // Make a Request to send to the proxied API.
        //
        HttpRequestMessage upstreamRequest = new HttpRequestMessage();
        upstreamRequest.Method = new HttpMethod(clientHttpContext.Request.Method);
        upstreamRequest.RequestUri = new Uri($"{_settings.UpstreamUrl}/api/postdata");
        upstreamRequest.Content = upstreamContent;
        upstreamRequest.Content.Headers.ContentType = System.Net.Http.Headers.MediaTypeHeaderValue.Parse(clientHttpContext.Request.Headers.ContentType.First());

        //
        // Loop through the incoming request body, sending each chunk to the Transform method
        // and then writing the result to the upstream data stream.
        //
        int incomingBufferBytesRead = await clientHttpContext.Request.Body.ReadAsync(incomingRequestBuffer, 0, incomingRequestBuffer.Length);

        while (incomingBufferBytesRead > 0)
        {
            //
            // Process (transform) a single chunk prior to its going to the proxied API.
            //
            byte[] transformedBuffer = TransformTheIncomingBuffer(incomingRequestBuffer, incomingBufferBytesRead);

            //
            // Write the transformed data to the upstreamRequestStream that is wrapped in the StreamContent.
            //
            await upstreamRequestStream.WriteAsync(transformedBuffer, 0, transformedBuffer.Length);

            //
            // Clear my transformed buffer and get the next chunk from the incoming request body.
            //
            Array.Clear(transformedBuffer);
            incomingBufferBytesRead = await clientHttpContext.Request.Body.ReadAsync(incomingRequestBuffer, 0, incomingRequestBuffer.Length);
        }

        //
        // Reset the position of the outgoing upstreamRequestStream.
        // This is a problem - it means the entire object is in memory
        // so large objects will overwhelm this.
        // How can we feed chunks to the upstream request's StreamContent?
        //
        upstreamRequestStream.Position = 0;

        //
        // Send this request on to the httpClient that is bound to the proxied API.
        // But, by now we have read and transformed the entire incoming request body
        // which may be huge. How do we send this using chunks as we transform it?
        //
        response = await _httpClient.SendAsync(upstreamRequest);
    }

    return response;
}

And here is my code using the obsolete HttpWebRequest:

    /// <summary>
    /// Uses the obsolete WebRequest / WebResponse to communicate with
    /// the proxied API.  This allows us access to the upstream request stream.
    /// </summary>
    /// <param name="clientHttpContext">The HttpContext for this request.</param>
    /// <returns>An HttpWebResponse which exposes its stream for processing.</returns>
    private async Task<HttpWebResponse> SendToProxiedAPIWithWebRequest(HttpContext clientHttpContext)
    {
        byte[] incomingRequestBuffer = new byte[_settings.ChunkSize];
        HttpWebResponse? response = null;
        HttpWebRequest webRequest = (HttpWebRequest)WebRequest.CreateHttp($"{_settings.UpstreamUrl}/api/postdata");
        webRequest.Method = "POST";
        webRequest.ContentType = "application/json";

        using (var upstreamRequestStream = webRequest.GetRequestStream())
        {
            int incomingBufferBytesRead = await clientHttpContext.Request.Body.ReadAsync(incomingRequestBuffer, 0, incomingRequestBuffer.Length);
            long contentLength = 0;

            while (incomingBufferBytesRead > 0)
            {
                //
                // Process (transform) a single chunk prior to its going to the proxied API.
                //
                byte[] transformedBuffer = TransformTheIncomingBuffer(incomingRequestBuffer, incomingBufferBytesRead);
                contentLength += transformedBuffer.LongLength;
                //
                // Write the transformed data directly to the outgoing request upstreamRequestStream.
                // (Note: there is no way to do this using the HttpClient)
                //
                upstreamRequestStream.Write(transformedBuffer, 0, transformedBuffer.Length);
                //
                // Clear my transformed buffer and get the next chunk from the input stream.
                //
                Array.Clear(transformedBuffer);
                incomingBufferBytesRead = await clientHttpContext.Request.Body.ReadAsync(incomingRequestBuffer, 0, incomingRequestBuffer.Length);
            }

            webRequest.ContentLength = contentLength;
        }
        //
        // Send the request to the proxied API and get the httpResponseMessage.
        //
        response = (HttpWebResponse)await webRequest.GetResponseAsync();
        return response;
    }

Is there any way to use the HttpRequestMessage and still process my data in chunks?

I would like to use the newer method since from what I understand, it does a much better job of reusing connections and in a high volume proxy server this would definitely be an advantage.

Thanks in advance for any guidance.

  • Is this a learning-exercise? If not, you're basically reinventing Microsoft YARP: microsoft.github.io/reverse-proxy - do you have a reason for doing this? Commented Nov 14, 2024 at 22:47
  • I looked at YARP and started down that path but could not figure out how to include my "Transform" without exhausting too much memory for large requests. However, I will go back to that with the use of the "Custom HttpContent" as was suggested below. I would prefer to use YARP as it saves me a lot of headaches... Commented Nov 15, 2024 at 13:28

1 Answer


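You can't do this with the standard HttpContent classes, as none of them hands you a stream to write to: they all expect the data (or a readable stream over it) to be supplied up front. What you need is a class that can pull the data from somewhere else at the moment the request is sent.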

Here is one possible solution. It takes a Func which can be used to stream the data at the exact point that HttpClient demands it, and also optionally accepts a Func to supply a length if available.

public class PullingStreamContent(Func<Stream, CancellationToken, Task> streamWriter, Func<long?>? getLength = null)
    : HttpContent
{
    private readonly Func<Stream, CancellationToken, Task> _streamWriter = streamWriter;
    private readonly Func<long?>? _getLength = getLength;
    
    protected override Task SerializeToStreamAsync(Stream stream, TransportContext? context) =>
        _streamWriter(stream, default);

    protected override Task SerializeToStreamAsync(Stream stream, TransportContext? context, CancellationToken cancellationToken) =>
        _streamWriter(stream, cancellationToken);

    protected override bool TryComputeLength(out long length)
    {
        var l = _getLength?.Invoke();
        length = l.GetValueOrDefault();
        return l.HasValue;
    }
}

The length lambda is optional. If you don't provide it, no Content-Length header is sent and the client uses chunked transfer encoding instead.
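As a small illustration of the length behaviour described above, the sketch below checks what Content-Length the content would advertise with and without the length delegate. The class is a trimmed copy of PullingStreamContent so the block compiles on its own; LengthDemo and AdvertisedLength are hypothetical names used only for this example, and the writer lambda deliberately writes nothing.

```csharp
using System;
using System.IO;
using System.Net;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

// Trimmed copy of the PullingStreamContent class from the answer (C# 12 primary constructor).
public class PullingStreamContent(Func<Stream, CancellationToken, Task> streamWriter, Func<long?>? getLength = null)
    : HttpContent
{
    protected override Task SerializeToStreamAsync(Stream stream, TransportContext? context) =>
        streamWriter(stream, default);

    protected override Task SerializeToStreamAsync(Stream stream, TransportContext? context, CancellationToken cancellationToken) =>
        streamWriter(stream, cancellationToken);

    protected override bool TryComputeLength(out long length)
    {
        var l = getLength?.Invoke();
        length = l.GetValueOrDefault();
        return l.HasValue; // false means "length unknown"
    }
}

// Hypothetical helper, not part of the answer: reports the length the content would advertise.
public static class LengthDemo
{
    public static long? AdvertisedLength(Func<long?>? getLength)
    {
        // The writer does nothing; only the header computation is of interest here.
        using var content = new PullingStreamContent((_, _) => Task.CompletedTask, getLength);

        // Reading Headers.ContentLength asks the content to compute its length via TryComputeLength.
        return content.Headers.ContentLength;
    }
}
```

Calling LengthDemo.AdvertisedLength(null) returns null (no Content-Length, so the body would be sent chunked over HTTP/1.1), while LengthDemo.AdvertisedLength(() => 1024) returns 1024. If a length is supplied, it has to match the number of bytes the writer actually produces.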

Then you can pass in a lambda that pulls the data from one side and sends it directly to the other.

private async Task<HttpResponseMessage> SendToProxiedAPIWithStreamContent(HttpContext clientHttpContext)
{
    using var upstreamContent = new PullingStreamContent(async (outputStream, ct) =>
    {
        var incomingRequestBuffer = new byte[_settings.ChunkSize];
        var body = clientHttpContext.Request.Body;

        int incomingBufferBytesRead;
        while ((incomingBufferBytesRead = await body.ReadAsync(incomingRequestBuffer, ct)) > 0)
        {
            // perhaps a reusable transform buffer as well??
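            // AsMemory(0, count) selects only the bytes just read; AsMemory(count) would skip them instead.
            var transformedBuffer = TransformTheIncomingBuffer(incomingRequestBuffer.AsMemory(0, incomingBufferBytesRead));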
            await outputStream.WriteAsync(transformedBuffer, ct);
        }
    });
    upstreamContent.Headers.ContentType = MediaTypeHeaderValue.Parse(clientHttpContext.Request.Headers.ContentType.First());

    using var upstreamRequest = new HttpRequestMessage(HttpMethod.Parse(clientHttpContext.Request.Method), $"{_settings.UpstreamUrl}/api/postdata");
    upstreamRequest.Content = upstreamContent;
    var response = await _httpClient.SendAsync(upstreamRequest);
    return response;
}
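To check that the lambda really runs during SendAsync rather than when the content is constructed, here is a rough, network-free sketch. Everything in it is an assumption made for illustration: CallbackContent is a cut-down stand-in for PullingStreamContent, DrainingHandler is a hypothetical test handler that plays the role of the real connection by draining the request body, and the 200,000-byte MemoryStream stands in for the incoming request body.

```csharp
using System;
using System.IO;
using System.Net;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

// Cut-down stand-in for PullingStreamContent: the body is produced by a callback, length unknown.
public sealed class CallbackContent(Func<Stream, CancellationToken, Task> writer) : HttpContent
{
    protected override Task SerializeToStreamAsync(Stream stream, TransportContext? context) =>
        writer(stream, CancellationToken.None);

    protected override Task SerializeToStreamAsync(Stream stream, TransportContext? context, CancellationToken ct) =>
        writer(stream, ct);

    protected override bool TryComputeLength(out long length)
    {
        length = 0;
        return false;
    }
}

// Hypothetical test handler: pulls the request body the way a real connection would.
public sealed class DrainingHandler : HttpMessageHandler
{
    public long BytesReceived { get; private set; }

    protected override async Task<HttpResponseMessage> SendAsync(HttpRequestMessage request, CancellationToken ct)
    {
        // A MemoryStream is used only so the test can count bytes; a real handler writes to the socket.
        using var sink = new MemoryStream();
        await request.Content!.CopyToAsync(sink, ct);
        BytesReceived = sink.Length;
        return new HttpResponseMessage(HttpStatusCode.OK);
    }
}

public static class DeferredDemo
{
    public static async Task<(bool ranBeforeSend, bool ranAfterSend, long bytes)> RunAsync()
    {
        var ran = false;
        var source = new MemoryStream(new byte[200_000]); // stands in for Request.Body

        var content = new CallbackContent(async (output, ct) =>
        {
            ran = true;
            var buffer = new byte[64 * 1024];
            int read;
            while ((read = await source.ReadAsync(buffer, ct)) > 0)
                await output.WriteAsync(buffer.AsMemory(0, read), ct);
        });

        var handler = new DrainingHandler();
        using var client = new HttpClient(handler);
        using var request = new HttpRequestMessage(HttpMethod.Post, "http://localhost/") { Content = content };

        var ranBeforeSend = ran; // still false: nothing has asked for the body yet
        using var response = await client.SendAsync(request);
        return (ranBeforeSend, ran, handler.BytesReceived);
    }
}
```

In this sketch RunAsync reports that the callback had not run before SendAsync, had run afterwards, and that the handler received all 200,000 bytes in 64 KB reads. With a real handler the same callback writes to the outgoing connection instead of a MemoryStream.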

Other points to note

  • Handing around an HttpResponseMessage is a bit of a code smell. This function should really deal with and dispose it immediately.
  • I've used AsMemory to pass around segments of arrays.
  • A standard read loop has only one place that does the read, of the form:
    while ((bytesRead = DoRead()) > 0) {
    
    This removes the need to repeat the read statement.
  • HttpMethod.Parse returns singletons, rather than doing new HttpMethod on every run.
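The read-loop and AsMemory points above, together with the earlier remark about a reusable transform buffer, could be combined roughly as follows. This is a sketch under stated assumptions: the real TransformTheIncomingBuffer is not shown in the question, so Transform here is a hypothetical placeholder that upper-cases ASCII and never produces more bytes than it consumes; ChunkPump and PumpAsync are likewise invented names.

```csharp
using System;
using System.Buffers;
using System.IO;
using System.Text;
using System.Threading;
using System.Threading.Tasks;

public static class ChunkPump
{
    // Hypothetical transform: upper-cases ASCII letters into destination and returns the bytes written.
    // A transform that can grow the data would need a larger destination buffer.
    public static int Transform(ReadOnlySpan<byte> source, Span<byte> destination)
    {
        for (var i = 0; i < source.Length; i++)
        {
            var b = source[i];
            destination[i] = b is >= (byte)'a' and <= (byte)'z' ? (byte)(b - 32) : b;
        }
        return source.Length;
    }

    public static async Task PumpAsync(Stream input, Stream output, int chunkSize, CancellationToken ct = default)
    {
        // Rent both buffers once and reuse them for every chunk instead of allocating per iteration.
        var readBuffer = ArrayPool<byte>.Shared.Rent(chunkSize);
        var writeBuffer = ArrayPool<byte>.Shared.Rent(chunkSize);
        try
        {
            int bytesRead;
            // Single read site: the assignment lives inside the loop condition.
            while ((bytesRead = await input.ReadAsync(readBuffer.AsMemory(0, chunkSize), ct)) > 0)
            {
                var written = Transform(readBuffer.AsSpan(0, bytesRead), writeBuffer);
                await output.WriteAsync(writeBuffer.AsMemory(0, written), ct);
            }
        }
        finally
        {
            ArrayPool<byte>.Shared.Return(readBuffer);
            ArrayPool<byte>.Shared.Return(writeBuffer);
        }
    }
}
```

Inside the content lambda this would be called as ChunkPump.PumpAsync(clientHttpContext.Request.Body, outputStream, _settings.ChunkSize, ct). Rented arrays may be larger than requested, which is why every slice is taken with an explicit length.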

6 Comments

Thank you - I was trying to create a "CustomContent" like you showed, but could not figure out if it needed a callback and where to put that.... I will implement your suggestion today and after it works I will accept this answer. Thank You!
One slight change -- the argument in the line that has the lambda needs to be named upstreamRequestStream rather than outputStream. Otherwise this works well! Thank you, I am marking this as the answer.
I do have a problem with this solution. When I actually call SendAsync, it calls my lambda just fine and works as expected. However, the Content-Length header is null. I see that the constructor for the CustomContent has a delegate parameter to get the length; however, I can't figure out where and how to call that Func. Any guidance here?
You can provide whatever you want there. I don't know how to calculate it, because I don't know what your transformation is. If you want it you'd need something like () => clientHttpContext.Request.ContentLength * 2 if you wanted to exactly double the length (and the incoming request provided it). It's not required in many cases to actually supply the Content-Length, because the handler will just do Chunked Transfer encoding instead, see stackoverflow.com/a/15995101/14868997
No it's not possible as far as I'm aware. In theory Chunked Transfer allows more headers to be sent, but I don't think HttpClient allows it, and in any case there's little point sending it when doing Chunked. Note that the while loop happens during the SendAsync, as you are passing it as a lambda which only gets called later.