Streaming vs. Non-Streaming with Microsoft Agent Framework in .NET
The Microsoft Agent Framework provides two ways to run an agent: non-streaming and streaming. Non-streaming waits for the entire response before returning it, while streaming delivers the response in chunks as the model generates them.
The Setup (Shared)
Both approaches use the same agent configuration: a ChatClientAgent whose system instructions are supplied through ChatOptions.Instructions.
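A minimal sketch of that shared setup might look like the following. It assumes the Microsoft.Agents.AI and Microsoft.Extensions.AI.OpenAI NuGet packages, an OpenAI-backed IChatClient, an API key in an OPENAI_API_KEY environment variable, and a placeholder model name and agent name; none of these choices come from the article, and any other IChatClient implementation should work the same way.

```csharp
using System;
using Microsoft.Agents.AI;
using Microsoft.Extensions.AI;
using OpenAI;

// Assumption: the API key lives in an environment variable named OPENAI_API_KEY.
string apiKey = Environment.GetEnvironmentVariable("OPENAI_API_KEY")
    ?? throw new InvalidOperationException("Set OPENAI_API_KEY first.");

// Assumption: "gpt-4o-mini" is a placeholder model name; substitute your own.
IChatClient chatClient = new OpenAIClient(apiKey)
    .GetChatClient("gpt-4o-mini")
    .AsIChatClient();

// The system instructions are passed through ChatOptions.Instructions.
AIAgent agent = new ChatClientAgent(chatClient, new ChatClientAgentOptions
{
    Name = "Assistant", // hypothetical agent name, for illustration only
    ChatOptions = new ChatOptions
    {
        Instructions = "You are a concise, helpful assistant."
    }
});

Console.WriteLine("Agent created.");
```

The agent is typed as AIAgent, the base class that exposes both RunAsync and RunStreamingAsync, so the same instance can be used for either style.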
Non-Streaming: RunAsync
The RunAsync method sends the user message to the model and waits for the complete response before returning. The result is an AgentResponse object whose .Text property contains the full output.
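As a rough illustration, a non-streaming call could be wrapped in a small helper like the one below. AskAsync is a hypothetical name, the agent is assumed to have been created elsewhere, and var is used for the result so the sketch does not depend on the exact response type name in your package version.

```csharp
using System.Threading.Tasks;
using Microsoft.Agents.AI;

// Hypothetical helper: sends one prompt and returns the complete reply text.
static async Task<string> AskAsync(AIAgent agent, string prompt)
{
    // Nothing is available until the model has finished generating.
    var response = await agent.RunAsync(prompt);

    // .Text holds the full output as a single string.
    return response.Text;
}
```

Because the whole reply comes back as one string, it can be parsed, stored, or handed to another agent without any buffering logic.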
When to use: When you need the full response at once — for example, to parse it, store it, or pass it to another agent.
Streaming: RunStreamingAsync
The RunStreamingAsync method returns an IAsyncEnumerable<AgentResponseUpdate>. Each update typically carries a small chunk of text that you can write to the console (or send to a client) as soon as it arrives.
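The streaming loop can be sketched as follows. StreamToConsoleAsync is a hypothetical helper name and the agent is assumed to be configured elsewhere; the StringBuilder is an optional addition, included only to show that the full text can still be reassembled after streaming.

```csharp
using System;
using System.Text;
using System.Threading.Tasks;
using Microsoft.Agents.AI;

// Hypothetical helper: prints each chunk as it arrives and returns the full text.
static async Task<string> StreamToConsoleAsync(AIAgent agent, string prompt)
{
    var fullText = new StringBuilder();

    // await foreach pulls updates one at a time as the model produces them.
    await foreach (var update in agent.RunStreamingAsync(prompt))
    {
        Console.Write(update.Text);   // no newline, so chunks join into one reply
        fullText.Append(update.Text); // keep a copy for later use
    }

    Console.WriteLine(); // end the line once the stream completes
    return fullText.ToString();
}
```

Console.Write is used instead of Console.WriteLine so the chunks run together on screen as a single continuous response.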
When to use: When you want a responsive user experience. The user sees text appear incrementally instead of waiting for the full generation to finish, the same behavior you see in ChatGPT's web interface.
Key Differences
| Aspect | Non-Streaming (RunAsync) | Streaming (RunStreamingAsync) |
| --- | --- | --- |
| Return type | AgentResponse | IAsyncEnumerable<AgentResponseUpdate> |
| First output | After full generation completes | As soon as the first token arrives |
| Best for | Processing, storing, or chaining results | Real-time UI, chat interfaces, long responses |
| Print method | Console.WriteLine(response.Text) | Console.Write(update) inside await foreach |
Both methods use the same agent, the same instructions, and the same underlying model — the only difference is how the response is delivered to your code.