The most common use of embeddings is semantic similarity search: given a query, find the most relevant documents in a collection based on meaning, not just keyword matching. Texts with similar meaning produce embedding vectors that point in similar directions, and cosine similarity measures how closely two vectors align.
Key Concepts
1. Cosine Similarity
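Cosine similarity measures the angle between two vectors. A value of 1.0 means identical direction (maximum similarity), 0 means orthogonal (unrelated), and -1 means opposite direction. It is the dot product of the two vectors divided by the product of their magnitudes. For embeddings that are already normalized to unit length the denominator is 1, so the result is just the dot product. The general form, which works for any pair of equal-length vectors, is computed as: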
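static float CosineSimilarity(ReadOnlyMemory<float> a, ReadOnlyMemory<float> b)
{
    var aSpan = a.Span;
    var bSpan = b.Span;
    if (aSpan.Length != bSpan.Length)
        throw new ArgumentException("Vectors must have the same number of dimensions.");

    float dot = 0f, normA = 0f, normB = 0f;
    for (int i = 0; i < aSpan.Length; i++)
    {
        dot += aSpan[i] * bSpan[i];   // numerator: dot product of a and b
        normA += aSpan[i] * aSpan[i]; // squared length of a
        normB += bSpan[i] * bSpan[i]; // squared length of b
    }

    // Divide by the product of the magnitudes; an all-zero vector yields NaN.
    return dot / (MathF.Sqrt(normA) * MathF.Sqrt(normB));
}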
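To make the three reference values concrete, here is a minimal sketch that runs the same formula on hand-picked 2-dimensional vectors. The vectors and their names (`east`, `north`, and so on) are made up for illustration and are not real embeddings; the function takes spans instead of `ReadOnlyMemory<float>` only so that plain arrays can be passed directly.

```csharp
using System;

float[] east = { 1f, 0f };
float[] alsoEast = { 3f, 0f };  // same direction as east, three times as long
float[] north = { 0f, 2f };     // perpendicular to east
float[] west = { -5f, 0f };     // opposite direction to east

Console.WriteLine(CosineSimilarity(east, alsoEast)); // 1
Console.WriteLine(CosineSimilarity(east, north));    // 0
Console.WriteLine(CosineSimilarity(east, west));     // -1

// Dot product divided by the product of the two magnitudes.
static float CosineSimilarity(ReadOnlySpan<float> a, ReadOnlySpan<float> b)
{
    if (a.Length != b.Length)
        throw new ArgumentException("Vectors must have the same number of dimensions.");

    float dot = 0f, normA = 0f, normB = 0f;
    for (int i = 0; i < a.Length; i++)
    {
        dot += a[i] * b[i];
        normA += a[i] * a[i];
        normB += b[i] * b[i];
    }
    return dot / (MathF.Sqrt(normA) * MathF.Sqrt(normB));
}
```

Note that `alsoEast` scores 1 even though it is longer than `east`: cosine similarity ignores magnitude and compares direction only.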
2. Index at Startup, Query at Runtime
A typical pattern is to embed all documents once at startup (or on ingest), then embed each user query at query time and rank documents by similarity:
// 1. Embed all articles once
GeneratedEmbeddings<Embedding<float>> articleEmbeddings =
    await generator.GenerateAsync(articles);

// 2. Embed the user's query
ReadOnlyMemory<float> queryVector =
    await generator.GenerateVectorAsync("I forgot my password");

// 3. Rank articles by cosine similarity
var ranked = articles
    .Select((article, i) => (article, score: CosineSimilarity(queryVector, articleEmbeddings[i].Vector)))
    .OrderByDescending(x => x.score)
    .ToList();

Console.WriteLine($"Best match: {ranked[0].article}");
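The index-once, query-many pattern can also be sketched offline, without calling an embedding service. The example below is an illustration under stated assumptions: the 3-dimensional vectors are made-up stand-ins for real embeddings, and `Normalize`, `Dot`, and `TopK` are hypothetical helper names, not part of Microsoft.Extensions.AI. It normalizes every vector once at index time, so each query only needs a dot product per document.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Index step (done once). Toy vectors stand in for real embeddings; a real
// index would store the vectors returned by the embedding generator.
var index = new List<(string Title, float[] Vector)>
{
    ("Reset your password",       Normalize(new[] { 0.9f, 0.1f, 0.0f })),
    ("Upgrade your subscription", Normalize(new[] { 0.1f, 0.9f, 0.1f })),
    ("Fix a slow connection",     Normalize(new[] { 0.0f, 0.2f, 0.9f })),
};

// Query step (done per request). Also a made-up stand-in vector.
float[] query = Normalize(new[] { 0.8f, 0.2f, 0.1f });

foreach (var (title, score) in TopK(index, query, 2))
    Console.WriteLine($"{score:F4}  {title}");

// Scale a vector to unit length so cosine similarity reduces to a dot product.
static float[] Normalize(float[] v)
{
    float norm = MathF.Sqrt(v.Sum(x => x * x));
    return norm == 0f ? v : v.Select(x => x / norm).ToArray();
}

static float Dot(float[] a, float[] b)
{
    float sum = 0f;
    for (int i = 0; i < a.Length; i++) sum += a[i] * b[i];
    return sum;
}

// Score every entry against the query and keep the k highest-scoring ones.
static IEnumerable<(string Title, float Score)> TopK(
    IEnumerable<(string Title, float[] Vector)> entries, float[] queryVector, int k) =>
    entries.Select(e => (e.Title, Score: Dot(e.Vector, queryVector)))
           .OrderByDescending(r => r.Score)
           .Take(k);
```

With these toy vectors the password article ranks first. A linear scan like this is reasonable for a small knowledge base; larger collections usually move the same idea into a vector store.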
3. Why Not Keyword Search?
A keyword search for "forgot password" would miss articles that say "reset account credentials" or "trouble signing in", because they share no words with the query. Semantic search finds them because their embeddings lie close to the query's embedding in vector space.
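The keyword side of that comparison can be illustrated with a deliberately naive matcher. This is a sketch, not how production keyword search works (real engines add stemming, ranking, and more). `KeywordMatch` is a hypothetical helper, and the third article title is invented here to show a case that does match.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

string query = "forgot password";
string[] articles =
{
    "How to reset account credentials",
    "Trouble signing in? Start here",
    "Forgot your password? Use the email link", // invented title for contrast
};

foreach (var article in articles)
    Console.WriteLine($"{KeywordMatch(query, article),-5}  {article}");

// Naive keyword search: true only if every query word appears in the text.
static bool KeywordMatch(string searchTerms, string text)
{
    var words = new HashSet<string>(
        text.ToLowerInvariant()
            .Split(new[] { ' ', '?', '.', ',' }, StringSplitOptions.RemoveEmptyEntries));

    return searchTerms.ToLowerInvariant()
        .Split(' ', StringSplitOptions.RemoveEmptyEntries)
        .All(term => words.Contains(term));
}
```

Only the third title matches, even though the first two describe the same problem. Those two are the cases an embedding-based ranking is meant to recover.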
Full Example
using Microsoft.Extensions.AI;
using OpenAI;
namespace MicrosoftAgentFrameworkLesson.ConsoleApp.Embeddings;
/// <summary>
/// Demonstrates semantic similarity search using cosine similarity on embedding vectors.
/// Scenario: Support knowledge base — find the most relevant article for a user's question.
/// </summary>
public static class SimilaritySearchDemo
{
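    private static float CosineSimilarity(ReadOnlyMemory<float> a, ReadOnlyMemory<float> b)
    {
        var aSpan = a.Span;
        var bSpan = b.Span;
        if (aSpan.Length != bSpan.Length)
            throw new ArgumentException("Vectors must have the same number of dimensions.");

        float dot = 0f, normA = 0f, normB = 0f;
        for (int i = 0; i < aSpan.Length; i++)
        {
            dot += aSpan[i] * bSpan[i];
            normA += aSpan[i] * aSpan[i];
            normB += bSpan[i] * bSpan[i];
        }

        // Divide by the product of the magnitudes; an all-zero vector yields NaN.
        return dot / (MathF.Sqrt(normA) * MathF.Sqrt(normB));
    }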

    public static async Task RunAsync()
    {
        var apiKey = Environment.GetEnvironmentVariable("OPEN_AI_KEY")
            ?? throw new InvalidOperationException("Set OPEN_AI_KEY environment variable.");

        IEmbeddingGenerator<string, Embedding<float>> generator =
            new OpenAIClient(apiKey)
                .GetEmbeddingClient("text-embedding-3-small")
                .AsIEmbeddingGenerator();

        Console.WriteLine("====== IEmbeddingGenerator — Semantic Similarity Search ======\n");

        string[] articles =
        [
            "How to reset your account password via email verification.",
            "Steps to upgrade your subscription plan and billing information.",
            "Troubleshooting slow internet connection and router settings.",
            "How to export your data and download account backups.",
            "Setting up two-factor authentication for extra security."
        ];

        Console.WriteLine("Indexing knowledge base...");
        GeneratedEmbeddings<Embedding<float>> articleEmbeddings =
            await generator.GenerateAsync(articles);
        Console.WriteLine($" {articles.Length} articles indexed.\n");

        string[] queries =
        [
            "I forgot my password and cannot log in.",
            "My WiFi is very slow, what should I do?",
            "I want to enable 2FA on my account."
        ];

        foreach (var query in queries)
        {
            Console.WriteLine($"Query: \"{query}\"");
            ReadOnlyMemory<float> queryVector = await generator.GenerateVectorAsync(query);

            var ranked = articles
                .Select((article, i) => (article, score: CosineSimilarity(queryVector, articleEmbeddings[i].Vector)))
                .OrderByDescending(x => x.score)
                .ToList();

            Console.WriteLine($" Best match : \"{ranked[0].article}\" (score: {ranked[0].score:F4})");
            Console.WriteLine($" Second best : \"{ranked[1].article}\" (score: {ranked[1].score:F4})");
            Console.WriteLine();
        }
    }
}