gRPC with .NET Performance
Created: 26 Mar 2026 · Updated: 26 Mar 2026

gRPC Performance: When Streaming Is Better Than Individual (Unary) Calls

1. Recap: The Four Types of gRPC Calls

Before we dive in, let's quickly recall the four types of gRPC calls:

Type                     | Client sends            | Server sends back
-------------------------|-------------------------|------------------------
Unary                    | One message             | One message
Server streaming         | One message             | Many messages (stream)
Client streaming         | Many messages (stream)  | One message
Bi-directional streaming | Many messages (stream)  | Many messages (stream)
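
In .proto syntax, the four shapes differ only in where the stream keyword appears. A minimal illustration (the service and RPC names here are hypothetical, chosen just to show the four signatures):

```protobuf
service Example {
  // Unary: one request, one response
  rpc UnaryCall (Request) returns (Response);

  // Server streaming: one request, a stream of responses
  rpc ServerStream (Request) returns (stream Response);

  // Client streaming: a stream of requests, one response
  rpc ClientStream (stream Request) returns (Response);

  // Bi-directional streaming: streams in both directions
  rpc BidiStream (stream Request) returns (stream Response);
}
```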

This article focuses on the difference between making many separate unary calls versus making one bi-directional streaming call that processes all the messages at once.

2. The Problem with Repeated Unary Calls

A unary call works like a normal function call: you send one request and get one response. This is perfectly fine when you only need to call it occasionally.

But imagine you need to process 1,000 items. With unary calls, you would make 1,000 separate gRPC calls — one per item:

// Making 1,000 separate gRPC calls — expensive!
for (int i = 0; i < 1000; i++)
{
    var response = await client.GetPerformanceAsync(new PerformanceStatusRequest
    {
        ClientName = $"client {i + 1}"
    });
    ProcessResponse(response);
}

Each of those 1,000 calls involves:

  1. Starting a new HTTP/2 stream on the connection
  2. Sending the request headers and request body
  3. Waiting for the server to respond
  4. Receiving the response headers and response body
  5. Closing the stream

All of that overhead is repeated 1,000 times. Even though each individual call is fast, the overhead accumulates: in our tests, 1,000 unary calls over a channel that reused the same underlying connection took approximately 15 seconds.
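
Note that even in this slow unary scenario, the channel itself is created once and shared; only the per-call stream overhead repeats. A minimal sketch of the shared channel setup (the address is a placeholder assumed for this sketch, not from the original project):

```csharp
using Grpc.Net.Client;
using Performance;

// Create the channel ONCE; it manages the underlying HTTP/2 connection.
// "https://localhost:5001" is a placeholder address for this sketch.
using var channel = GrpcChannel.ForAddress("https://localhost:5001");
var client = new Monitor.MonitorClient(channel);

// Every unary call made through this client reuses the same connection,
// but each call still opens and closes its own HTTP/2 stream; that is
// the per-call overhead that repeats 1,000 times.
```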

Real-world analogy: Imagine a post office. Unary calls are like writing 1,000 individual letters, going to the post office for each one, waiting in line, handing over the letter, waiting to receive a reply, and then going back home — 1,000 times. Streaming is like going to the post office once with a bag containing all 1,000 letters, handing them all over at once, and waiting at the counter while the clerk processes each letter and hands you back 1,000 replies one by one.

3. What Is Bi-Directional Streaming?

Bi-directional streaming opens a single long-lived connection between client and server through which both sides can send as many messages as they want.

Instead of 1,000 separate request-response cycles, you:

  1. Open one streaming call
  2. Send all 1,000 request messages through the client stream
  3. Receive all 1,000 response messages from the server stream
  4. Close the call

The key benefit is that the per-call HTTP/2 overhead (opening a stream, sending headers, closing a stream) happens only once instead of 1,000 times. The messages themselves can flow continuously without any interruption.

4. Step 1: Define the Streaming RPC in the .proto File

Let's start from the beginning with a real example. We have a Monitor service in a performance.proto file that already has a unary RPC:

syntax = "proto3";

package performance;

service Monitor {
  // Unary RPC — one request, one response
  rpc GetPerformance (PerformanceStatusRequest) returns (PerformanceStatusResponse);
}

message PerformanceStatusRequest {
  string client_name = 1;
}

message PerformanceStatusResponse {
  double cpu_percentage_usage = 1;
  double memory_usage = 2;
  int32 processes_running = 3;
  int32 active_connections = 4;
}

To add a bi-directional streaming RPC, we add a new entry to the Monitor service. The stream keyword before both the request and response types marks them as streams:

service Monitor {
  rpc GetPerformance (PerformanceStatusRequest) returns (PerformanceStatusResponse);

  // Bi-directional streaming RPC — stream of requests, stream of responses
  rpc GetManyPerformanceStats (stream PerformanceStatusRequest)
      returns (stream PerformanceStatusResponse);
}

The .proto file uses the same request and response message types as before. All we have changed is that both sides are now streams instead of single messages. After saving this file, the gRPC tooling regenerates the C# classes automatically.
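
That regeneration is driven by the Grpc.Tools package, which runs the protobuf compiler on build. In a typical ASP.NET Core gRPC project the .proto file is referenced from the .csproj roughly like this (the exact file path is an assumption for this sketch):

```xml
<ItemGroup>
  <!-- Grpc.Tools regenerates the C# message classes and the
       Monitor.MonitorBase / Monitor.MonitorClient types on each build -->
  <Protobuf Include="Protos\performance.proto" GrpcServices="Both" />
</ItemGroup>
```

On the server project `GrpcServices="Server"` is enough; on the client, `GrpcServices="Client"`.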

5. Step 2: Implement the Server Side

On the server side, you need to override the new RPC method. The signature is different from a unary call:

  1. Instead of receiving a single PerformanceStatusRequest, you receive an IAsyncStreamReader<PerformanceStatusRequest> — an asynchronous reader you loop through to read incoming request messages one at a time.
  2. Instead of returning a single PerformanceStatusResponse, you write to an IServerStreamWriter<PerformanceStatusResponse> — you call WriteAsync on it to push response messages to the client.

// PerformanceMonitor.cs — server implementation
using System;
using System.Threading.Tasks;
using Grpc.Core;
using Performance;

public class PerformanceMonitor : Monitor.MonitorBase
{
    // Existing unary method unchanged
    public override Task<PerformanceStatusResponse> GetPerformance(
        PerformanceStatusRequest request, ServerCallContext context)
    {
        return Task.FromResult(GenerateResponse());
    }

    // New bi-directional streaming method
    public override async Task GetManyPerformanceStats(
        IAsyncStreamReader<PerformanceStatusRequest> requestStream,
        IServerStreamWriter<PerformanceStatusResponse> responseStream,
        ServerCallContext context)
    {
        // Read every request message from the client as it arrives
        while (await requestStream.MoveNext())
        {
            // For each incoming request, write one response back to the client
            await responseStream.WriteAsync(GenerateResponse());
        }
        // When requestStream is exhausted, the loop ends and the call is complete
    }

    private PerformanceStatusResponse GenerateResponse()
    {
        var rng = new Random();
        return new PerformanceStatusResponse
        {
            CpuPercentageUsage = rng.NextDouble() * 100,
            MemoryUsage = rng.NextDouble() * 100,
            ProcessesRunning = rng.Next(),
            ActiveConnections = rng.Next()
        };
    }
}

The server sends exactly one response per request, maintaining a strict 1:1 mapping. But crucially, all of this happens over a single open call — there is no per-message overhead of opening and closing HTTP/2 streams.

6. Step 3: Implement the Client Side

The client side is slightly more complex because the sending and receiving happen concurrently. You need to:

  1. Open the streaming call with client.GetManyPerformanceStats().
  2. Start a background task that reads from the response stream.
  3. Write all request messages to the request stream.
  4. Close the request stream (signal that you are done sending).
  5. Wait for the background reading task to finish (all responses received).

// GrpcPerformanceClient.cs — streaming method
using System.Collections.Generic;
using System.Threading.Tasks;
using Grpc.Core;
using Performance;

public async Task<IEnumerable<ResponseModel.PerformanceStatusModel>>
    GetPerformanceStatuses(IEnumerable<string> clientNames)
{
    var client = new Monitor.MonitorClient(channel);

    // Open ONE streaming call for all messages
    using var call = client.GetManyPerformanceStats();

    var responses = new List<ResponseModel.PerformanceStatusModel>();

    // Step 1: Start reading responses in a background task
    // (we run this concurrently because the server sends responses
    // as it receives each request — we must be ready to receive them)
    var readTask = Task.Run(async () =>
    {
        await foreach (var response in call.ResponseStream.ReadAllAsync())
        {
            responses.Add(new ResponseModel.PerformanceStatusModel
            {
                CpuPercentageUsage = response.CpuPercentageUsage,
                MemoryUsage = response.MemoryUsage,
                ProcessesRunning = response.ProcessesRunning,
                ActiveConnections = response.ActiveConnections
            });
        }
    });

    // Step 2: Write all request messages to the server
    foreach (var clientName in clientNames)
    {
        await call.RequestStream.WriteAsync(new PerformanceStatusRequest
        {
            ClientName = clientName
        });
    }

    // Step 3: Signal we are done sending requests
    await call.RequestStream.CompleteAsync();

    // Step 4: Wait until all responses have been read
    await readTask;

    return responses;
}

The important insight here is that the client writes to the request stream and the server writes to the response stream simultaneously. While the client is still sending request messages, the server is already processing the first few and sending back responses. This pipeline effect is what makes streaming so much faster.

7. Step 4: Expose an API Endpoint That Uses Streaming

Now let's wire everything up in an ASP.NET Core controller to compare streaming and unary side by side:

// PerformanceController.cs
[HttpGet("streaming-call/{count}")]
public async Task<ResponseModel> GetPerformanceFromStreamingCall(int count)
{
    var stopWatch = Stopwatch.StartNew();
    var response = new ResponseModel();

    // Build the list of client names to send as request messages
    var clientNames = new List<string>();
    for (var i = 0; i < count; i++)
    {
        clientNames.Add($"client {i + 1}");
    }

    // ONE streaming call processes all count messages
    response.PerformanceStatuses.AddRange(
        await clientWrapper.GetPerformanceStatuses(clientNames));

    response.RequestProcessingTime = stopWatch.ElapsedMilliseconds;
    return response;
}

Compare this to the unary version, which makes count separate calls. The calling code looks almost identical, but the underlying mechanism is completely different. The streaming version sends all messages over a single open stream, while the unary version opens and closes a new stream for each message.
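
The original article does not show the unary endpoint, but it might look roughly like the sketch below. The `GetPerformanceStatus` wrapper method (one unary call per name) is an assumption introduced for illustration:

```csharp
// Hypothetical unary counterpart, for comparison only.
// clientWrapper.GetPerformanceStatus is an assumed method that
// makes a single unary GetPerformance call for one client name.
[HttpGet("unary-calls/{count}")]
public async Task<ResponseModel> GetPerformanceFromUnaryCalls(int count)
{
    var stopWatch = Stopwatch.StartNew();
    var response = new ResponseModel();

    // count separate gRPC calls: each opens and closes its own HTTP/2 stream
    for (var i = 0; i < count; i++)
    {
        response.PerformanceStatuses.Add(
            await clientWrapper.GetPerformanceStatus($"client {i + 1}"));
    }

    response.RequestProcessingTime = stopWatch.ElapsedMilliseconds;
    return response;
}
```

Hitting both endpoints with the same count makes the timing difference directly visible in `RequestProcessingTime`.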

8. Performance Comparison: Unary vs. Streaming

Here is the measured performance for processing 1,000 items in both approaches:

Approach                              | gRPC calls made | Time for 1,000 items
--------------------------------------|-----------------|---------------------
Repeated unary calls (reused channel) | 1,000           | ~15 seconds
Single bi-directional streaming call  | 1               | ~3 seconds

The bi-directional streaming call is 5 times faster despite processing exactly the same number of messages and returning exactly the same number of results. The only difference is the mechanism used to transmit them.

The performance gain comes entirely from eliminating the per-call HTTP/2 overhead. With 1,000 unary calls, you pay that overhead 1,000 times. With one streaming call, you pay it once.

9. When Should You Use Streaming vs. Unary?

Streaming is not always the right choice. Here is a practical guide:

Situation                                                   | Recommended approach
------------------------------------------------------------|---------------------
Occasional single calls (e.g., user login, one-time lookup) | Unary — simple and sufficient
Batch processing of many items in one operation             | Bi-directional streaming
Real-time data feed from server (e.g., stock ticker)        | Server streaming
Uploading many files or large chunks to the server          | Client streaming
Chat applications, real-time collaborative editing          | Bi-directional streaming
Simple CRUD-style operations triggered by user actions      | Unary — easier to implement and reason about

Rule of thumb: If you find yourself calling the same unary RPC many times in a loop to process a collection, that is a strong signal that a streaming RPC would serve you better.

10. Summary

  1. A unary call is one request and one response. It is the simplest form of gRPC communication and fine for occasional single calls.
  2. When you need to process many items, making repeated unary calls accumulates per-call HTTP/2 overhead that adds up significantly.
  3. Bi-directional streaming opens one long-lived call through which both client and server send as many messages as needed. The per-call overhead is paid only once.
  4. To define a streaming RPC in .proto, add the stream keyword before the request and/or response type: rpc GetManyPerformanceStats (stream PerformanceStatusRequest) returns (stream PerformanceStatusResponse);
  5. On the server, override the method with IAsyncStreamReader and IServerStreamWriter parameters. Read from one, write to the other.
  6. On the client, open the streaming call, start a background reading task, write all requests, close the request stream, and await the reading task.
  7. Measured result: 5x faster than repeated unary calls for 1,000 messages (~3 seconds vs. ~15 seconds).