Generative AI for Beginners: AI Patterns and Applications in .NET
Created: 04 Apr 2026 · Updated: 04 Apr 2026

Embeddings and Semantic Search with Microsoft.Extensions.AI

In this lesson you will learn how AI represents text meaning as numbers, how to measure the similarity between those numbers, and how to build a semantic search system that finds results by intent rather than exact keyword matching — all using Microsoft.Extensions.AI and Microsoft.Extensions.VectorData.

The running example is a software consulting firm's internal knowledge base. The firm's consultants need to find relevant technical articles quickly. A keyword search for "separating reads from writes" would miss an article titled "Introduction to CQRS Pattern" even though they mean the same thing. Semantic search solves this.
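The gap is easy to reproduce with plain string matching. In this minimal sketch (the three titles are drawn from the knowledge base used throughout the lesson), no word of the query appears in any title, so a keyword search returns nothing:

```csharp
using System;
using System.Linq;

// Naive keyword search: a title matches if it contains any query word.
string query = "separating reads from writes";
string[] titles =
[
    "Introduction to CQRS Pattern",
    "Event Sourcing in Distributed Systems",
    "Designing RESTful HTTP APIs",
];

var hits = titles
    .Where(t => query.Split(' ')
        .Any(word => t.Contains(word, StringComparison.OrdinalIgnoreCase)))
    .ToList();

// No title shares a single word with the query, so keyword search finds
// nothing -- even though the CQRS article is exactly what the consultant needs.
Console.WriteLine(hits.Count); // 0
```

Semantic search closes this gap by comparing meanings instead of characters, which is what the rest of the lesson builds.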

Part 1 — What Is an Embedding?

An embedding is a fixed-length array of floating-point numbers that represents the meaning of a piece of text. An AI model produces this array by projecting the text into a high-dimensional space where similar meanings end up geometrically close to each other.

"CQRS separates read and write models" → [0.12, -0.45, 0.89, 0.23, -0.67, ...]
"segregating queries from commands" → [0.11, -0.43, 0.91, 0.25, -0.66, ...] ← close!
"quarterly report for finance team" → [-0.34, 0.67, -0.12, 0.88, 0.02, ...] ← far away

Text-embedding models such as text-embedding-3-small output 1536-dimensional vectors. You do not need to interpret the individual numbers; the model has learned that texts with related meanings produce vectors that are numerically similar.

The IEmbeddingGenerator Interface

Microsoft.Extensions.AI defines IEmbeddingGenerator<TInput, TEmbedding> as the vendor-neutral abstraction for any embedding model. The OpenAI implementation is provided by the Microsoft.Extensions.AI.OpenAI package:

IEmbeddingGenerator<string, Embedding<float>> embeddingGenerator =
    new OpenAIClient(apiKey)
        .GetEmbeddingClient("text-embedding-3-small")
        .AsIEmbeddingGenerator();
Call | Purpose
new OpenAIClient(apiKey) | Authenticates with the OpenAI API
.GetEmbeddingClient("text-embedding-3-small") | Selects the embedding model
.AsIEmbeddingGenerator() | Wraps it behind the standard interface

Generating a Single Embedding

var embeddings = await embeddingGenerator.GenerateAsync(["CQRS separates read and write models"]);
Console.WriteLine(embeddings[0].Vector.Length); // 1536

Generating Embeddings in Bulk

Send a batch of strings in one call to minimize round trips to the API:

string[] summaries =
[
    "CQRS separates read and write models ...",
    "Event sourcing stores immutable events ...",
    "REST API design covers resource naming ..."
];

var embeddings = await embeddingGenerator.GenerateAsync(summaries);

The method returns a list where embeddings[i] corresponds to summaries[i]. Each Embedding<float> exposes its numerical data through the Vector property (ReadOnlyMemory<float>).

Part 2 — Measuring Similarity with Cosine Similarity

Two embedding vectors are compared using cosine similarity, which measures the angle between them:

  cosine(a, b) = (a · b) / (‖a‖ × ‖b‖)

The result is a value between −1 and 1:

   1.0 — identical meaning
   0.0 — unrelated
  −1.0 — opposite meaning

The implementation in C# works directly on the Span<float> to avoid allocations:

static float CosineSimilarity(ReadOnlyMemory<float> a, ReadOnlyMemory<float> b)
{
    var spanA = a.Span;
    var spanB = b.Span;

    float dot = 0, normA = 0, normB = 0;
    for (int i = 0; i < spanA.Length; i++)
    {
        dot += spanA[i] * spanB[i];
        normA += spanA[i] * spanA[i];
        normB += spanB[i] * spanB[i];
    }

    return dot / (MathF.Sqrt(normA) * MathF.Sqrt(normB));
}
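A quick sanity check with toy three-dimensional vectors confirms the three reference values. The snippet repeats the helper so it is self-contained:

```csharp
using System;

float[] a = { 1f, 0f, 0f };

Console.WriteLine(CosineSimilarity(a, new float[] { 1f, 0f, 0f }));  // 1  (same direction)
Console.WriteLine(CosineSimilarity(a, new float[] { 0f, 1f, 0f }));  // 0  (orthogonal)
Console.WriteLine(CosineSimilarity(a, new float[] { -1f, 0f, 0f })); // -1 (opposite direction)

// Same implementation as in the lesson; float[] converts implicitly
// to ReadOnlyMemory<float>.
static float CosineSimilarity(ReadOnlyMemory<float> a, ReadOnlyMemory<float> b)
{
    var spanA = a.Span;
    var spanB = b.Span;

    float dot = 0, normA = 0, normB = 0;
    for (int i = 0; i < spanA.Length; i++)
    {
        dot += spanA[i] * spanB[i];
        normA += spanA[i] * spanA[i];
        normB += spanB[i] * spanB[i];
    }

    return dot / (MathF.Sqrt(normA) * MathF.Sqrt(normB));
}
```

Real embedding vectors rarely span the full range: scores for pairs of natural-language texts typically land between 0 and 1, with higher values meaning closer in meaning.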

Manual Ranked Search

With embeddings and the cosine similarity function in hand, you can rank every article in the knowledge base against a consultant's query:

const string consultantQuery =
    "how to separate reading data from writing data in large applications";

// GenerateVectorAsync embeds a single string and returns its vector directly
var queryVector = await embeddingGenerator.GenerateVectorAsync(consultantQuery);

var ranked = ArticleData
    .Zip(embeddings)
    .Select(pair => (Article: pair.First, Score: CosineSimilarity(queryVector, pair.Second.Vector)))
    .OrderByDescending(r => r.Score)
    .ToList();

Expected output:

Consultant query: "how to separate reading data from writing data in large applications"

Top 3 matches:
[0.891] Introduction to CQRS Pattern (Architecture)
[0.823] Event Sourcing in Distributed Systems (Architecture)
[0.761] Domain-Driven Design Aggregates (Architecture)

Notice that none of the query words appear in the article titles, yet the correct articles surface at the top because the model has encoded their shared conceptual territory.

Part 3 — Storing Embeddings in an InMemoryVectorStore

Recomputing embeddings for every search is wasteful. A vector store persists embeddings alongside the original data and uses optimised algorithms to retrieve the nearest neighbours quickly.

Microsoft.SemanticKernel.Connectors.InMemory provides InMemoryVectorStore, which is ideal for development and learning before switching to a production store such as Azure AI Search or Qdrant.

Defining the Data Model

Annotate a plain C# class with attributes from Microsoft.Extensions.VectorData:

public class KnowledgeArticle
{
    [VectorStoreKey]
    public int Id { get; set; }

    [VectorStoreData]
    public string Title { get; set; } = string.Empty;

    [VectorStoreData]
    public string Summary { get; set; } = string.Empty;

    [VectorStoreData]
    public string Category { get; set; } = string.Empty;

    [VectorStoreVector(1536)]
    public ReadOnlyMemory<float> Embedding { get; set; }
}
Attribute | Purpose
[VectorStoreKey] | Unique identifier for the record
[VectorStoreData] | Searchable / filterable text field
[VectorStoreVector(1536)] | The embedding vector (1536 dimensions)

Creating and Populating the Collection

var vectorStore = new InMemoryVectorStore();
var collection = vectorStore.GetCollection<int, KnowledgeArticle>("articles");
await collection.EnsureCollectionExistsAsync();

for (int i = 0; i < ArticleData.Length; i++)
{
    var (id, title, summary, category) = ArticleData[i];
    await collection.UpsertAsync(new KnowledgeArticle
    {
        Id = id,
        Title = title,
        Summary = summary,
        Category = category,
        Embedding = embeddings[i].Vector, // pre-computed above
    });
}

Searching the Collection

string query = "automating software delivery and reducing manual deployments";

var qVector = await embeddingGenerator.GenerateVectorAsync(query);
var results = collection.SearchAsync(qVector, top: 2);

await foreach (var hit in results)
{
    Console.WriteLine($"[{hit.Score:F3}] {hit.Record.Title} ({hit.Record.Category})");
}

Expected output:

Query: "automating software delivery and reducing manual deployments"
[0.872] GitHub Actions CI/CD Pipelines (DevOps)
[0.798] Kubernetes Horizontal Pod Autoscaling (DevOps)

The query mentions neither "GitHub" nor "Kubernetes", yet both DevOps articles rank highly because the embedding model associates automated delivery with those technologies.
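When a consultant already knows the category, the vector search can be narrowed before ranking. This is a sketch only, assuming the Filter property on VectorSearchOptions<TRecord> from Microsoft.Extensions.VectorData; check your package version for the exact shape:

```csharp
// Sketch: vector search restricted to one category (assumes
// VectorSearchOptions<TRecord>.Filter from Microsoft.Extensions.VectorData).
var options = new VectorSearchOptions<KnowledgeArticle>
{
    Filter = article => article.Category == "DevOps",
};

var qVector = await embeddingGenerator.GenerateVectorAsync(
    "automating software delivery and reducing manual deployments");

await foreach (var hit in collection.SearchAsync(qVector, top: 2, options))
{
    Console.WriteLine($"[{hit.Score:F3}] {hit.Record.Title} ({hit.Record.Category})");
}
```

Filtering first shrinks the candidate set, so the similarity ranking only runs over records that already satisfy the structured constraint.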

Part 4 — Production Vector Stores

InMemoryVectorStore is ephemeral — data is lost when the process stops. For production use, swap it for a persistent provider. The client code stays identical because every provider plugs in behind the same VectorStore abstraction.

Provider | Package | Best For
Azure AI Search | Microsoft.SemanticKernel.Connectors.AzureAISearch | Enterprise full-text + vector search
Qdrant | Microsoft.SemanticKernel.Connectors.Qdrant | High-performance dedicated vector DB
PostgreSQL + pgvector | Microsoft.SemanticKernel.Connectors.Postgres | SQL workflows with vector capabilities
Redis | Microsoft.SemanticKernel.Connectors.Redis | Low-latency caching scenarios
// Switch from in-memory to Qdrant by changing one line
// (QdrantClient defaults to the gRPC port, 6334):
// var vectorStore = new InMemoryVectorStore();
var vectorStore = new QdrantVectorStore(new QdrantClient("localhost"), ownsClient: true);

// Everything below stays the same
var collection = vectorStore.GetCollection<int, KnowledgeArticle>("articles");

Review — What You Learned

Concept | Description
Embedding | A vector of floats that encodes the meaning of text
IEmbeddingGenerator | Vendor-neutral .NET interface for embedding models
Cosine similarity | Measure of the angle between two vectors; 1.0 = identical meaning
VectorStoreVector | Attribute that marks the embedding property on a data model
InMemoryVectorStore | Development-friendly vector store; swap for a persistent provider in production
Semantic search | Find by intent and meaning, not by matching keywords

Full Example

The complete source code for the demo — a consulting firm knowledge base with three parts (bulk embedding generation, manual cosine similarity ranking, and InMemoryVectorStore semantic search):

using Microsoft.Extensions.AI;
using Microsoft.Extensions.VectorData;
using Microsoft.SemanticKernel.Connectors.InMemory;
using OpenAI;

namespace MicrosoftAgentFrameworkLesson.ConsoleApp;

// -- Data model stored in the vector store ----------------

public class KnowledgeArticle
{
    [VectorStoreKey]
    public int Id { get; set; }

    [VectorStoreData]
    public string Title { get; set; } = string.Empty;

    [VectorStoreData]
    public string Summary { get; set; } = string.Empty;

    [VectorStoreData]
    public string Category { get; set; } = string.Empty;

    [VectorStoreVector(1536)]
    public ReadOnlyMemory<float> Embedding { get; set; }
}

// -- Demo class -------------------------------------------

public static class EmbeddingsSemanticSearchDemo
{
    // Consulting firm knowledge base -- articles without embeddings yet
    private static readonly (int Id, string Title, string Summary, string Category)[] ArticleData =
    [
        (1,
         "Introduction to CQRS Pattern",
         "Command Query Responsibility Segregation separates read and write operations into distinct models, improving scalability and maintainability in complex business domains.",
         "Architecture"),
        (2,
         "Event Sourcing in Distributed Systems",
         "Event sourcing stores every change as an immutable event rather than overwriting state, enabling complete audit trails, temporal queries, and full event replay.",
         "Architecture"),
        (3,
         "Designing RESTful HTTP APIs",
         "Effective REST API design covers resource naming conventions, HTTP verb semantics, status code usage, pagination strategies, and versioning for long-lived interfaces.",
         "API Design"),
        (4,
         "PostgreSQL Index Optimization",
         "Choosing the right index type -- B-tree, GiST, GIN, or BRIN -- and reading query plans can dramatically cut execution time and improve database throughput.",
         "Database"),
        (5,
         "GitHub Actions CI/CD Pipelines",
         "Automating build, test, and deployment pipelines with GitHub Actions reduces human error and increases delivery frequency, enabling trunk-based development at scale.",
         "DevOps"),
        (6,
         "Domain-Driven Design Aggregates",
         "Aggregates are transactional consistency boundaries that group related entities under a single aggregate root, enforcing business invariants and isolating domain logic.",
         "Architecture"),
        (7,
         "Kubernetes Horizontal Pod Autoscaling",
         "HPA automatically adjusts pod replica counts based on real-time CPU utilization or custom Prometheus metrics so services scale to meet unpredictable traffic demand.",
         "DevOps"),
        (8,
         "OAuth 2.0 and Token-Based Authorization",
         "OAuth 2.0 delegates limited resource access through short-lived access tokens and refresh tokens, decoupling identity from API security without sharing passwords.",
         "Security"),
    ];

    public static async Task RunAsync()
    {
        var apiKey = Environment.GetEnvironmentVariable("OPEN_AI_KEY");
        if (string.IsNullOrWhiteSpace(apiKey))
        {
            Console.WriteLine("Please set the OPEN_AI_KEY environment variable.");
            return;
        }

        // Create the embedding generator using the OpenAI text-embedding-3-small model
        IEmbeddingGenerator<string, Embedding<float>> embeddingGenerator =
            new OpenAIClient(apiKey)
                .GetEmbeddingClient("text-embedding-3-small")
                .AsIEmbeddingGenerator();

        // =============================================
        // PART 1 -- Generate Embeddings for All Articles
        // =============================================
        Console.WriteLine("=== Part 1: Generating Embeddings ===\n");

        var summaries = ArticleData.Select(a => a.Summary).ToArray();
        var embeddings = await embeddingGenerator.GenerateAsync(summaries);

        Console.WriteLine($"Generated {embeddings.Count} embeddings from article summaries.");
        Console.WriteLine($"Each embedding vector has {embeddings[0].Vector.Length} dimensions.\n");

        // =============================================
        // PART 2 -- Manual Cosine Similarity Search
        // =============================================
        Console.WriteLine("=== Part 2: Cosine Similarity Search ===\n");

        const string consultantQuery =
            "how to separate reading data from writing data in large applications";

        Console.WriteLine($"Consultant query: \"{consultantQuery}\"\n");

        var queryVector = await embeddingGenerator.GenerateVectorAsync(consultantQuery);

        var ranked = ArticleData
            .Zip(embeddings)
            .Select(pair => (Article: pair.First, Score: CosineSimilarity(queryVector, pair.Second.Vector)))
            .OrderByDescending(r => r.Score)
            .ToList();

        Console.WriteLine("Top 3 matches:");
        foreach (var (article, score) in ranked.Take(3))
        {
            Console.WriteLine($" [{score:F3}] {article.Title} ({article.Category})");
        }

        // =============================================
        // PART 3 -- InMemoryVectorStore Semantic Search
        // =============================================
        Console.WriteLine("\n=== Part 3: InMemoryVectorStore Search ===\n");

        var vectorStore = new InMemoryVectorStore();
        var collection = vectorStore.GetCollection<int, KnowledgeArticle>("articles");
        await collection.EnsureCollectionExistsAsync();

        // Populate the store with articles + their pre-computed embeddings
        for (int i = 0; i < ArticleData.Length; i++)
        {
            var (id, title, summary, category) = ArticleData[i];
            await collection.UpsertAsync(new KnowledgeArticle
            {
                Id = id,
                Title = title,
                Summary = summary,
                Category = category,
                Embedding = embeddings[i].Vector,
            });
        }

        Console.WriteLine("Knowledge base indexed. Running consultant queries...\n");

        string[] queries =
        [
            "automating software delivery and reducing manual deployments",
            "protecting APIs from unauthorized access using tokens",
            "speeding up slow database queries with better indexing",
        ];

        foreach (var query in queries)
        {
            var qVector = await embeddingGenerator.GenerateVectorAsync(query);
            var results = collection.SearchAsync(qVector, top: 2);

            Console.WriteLine($"Query: \"{query}\"");
            await foreach (var hit in results)
            {
                Console.WriteLine($" [{hit.Score:F3}] {hit.Record.Title} ({hit.Record.Category})");
            }
            Console.WriteLine();
        }
    }

    // -- Cosine similarity helper ----------------------

    private static float CosineSimilarity(ReadOnlyMemory<float> a, ReadOnlyMemory<float> b)
    {
        var spanA = a.Span;
        var spanB = b.Span;

        float dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < spanA.Length; i++)
        {
            dot += spanA[i] * spanB[i];
            normA += spanA[i] * spanA[i];
            normB += spanB[i] * spanB[i];
        }

        return dot / (MathF.Sqrt(normA) * MathF.Sqrt(normB));
    }
}

Entry point in Program.cs:

using MicrosoftAgentFrameworkLesson.ConsoleApp;

await EmbeddingsSemanticSearchDemo.RunAsync();