Embeddings: The Language of Meaning for LLMs
Large language models don't understand raw words, images, or audio — they only understand numbers. An embedding is the bridge between the two worlds: it is a numeric representation of non-numeric data (text, images, audio) that preserves semantic meaning. Two pieces of content with similar meaning end up with similar embedding vectors, even if they don't share any exact words.
Embeddings are the mechanism that lets an LLM compare concepts, find related documents, summarize text, translate between languages, and ground answers in your own private data. They are also what makes vector databases and retrieval-augmented generation (RAG) possible.
1. What Are Embeddings?
An embedding is a long array of floating-point numbers — a vector — produced by an embedding model. The model reads raw data such as a sentence or a paragraph and outputs a fixed-length list of numbers that encodes its meaning in a high-dimensional space.
For example, the OpenAI model text-embedding-3-small outputs a vector with 1,536 dimensions. Each dimension is a single float, so each piece of text is represented by exactly 1,536 numbers.
You cannot read meaning from individual numbers — meaning only emerges from the relationships between whole vectors.
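To make this concrete, here is a minimal C# sketch using the Microsoft.Extensions.AI abstraction covered later in this article. The OpenAI backend, model name, and environment-variable key are assumptions for illustration; any IEmbeddingGenerator implementation behaves the same way:

```csharp
using System;
using Microsoft.Extensions.AI;
using OpenAI;

// Assumed backend: OpenAI's text-embedding-3-small, API key in OPENAI_API_KEY.
IEmbeddingGenerator<string, Embedding<float>> generator =
    new OpenAIClient(Environment.GetEnvironmentVariable("OPENAI_API_KEY")!)
        .GetEmbeddingClient("text-embedding-3-small")
        .AsIEmbeddingGenerator();

// One string in, one fixed-length vector of floats out.
ReadOnlyMemory<float> vector =
    await generator.GenerateVectorAsync("A snow leopard stalks its prey at dawn.");

Console.WriteLine(vector.Length);   // 1536 dimensions for this model
Console.WriteLine(vector.Span[0]);  // a single dimension: an opaque float, meaningless alone
```

This requires the Microsoft.Extensions.AI and Microsoft.Extensions.AI.OpenAI packages; the exact extension-method name for wrapping the OpenAI client has varied across preview releases.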
2. Capturing Semantic Relationships
Embedding models are trained on huge amounts of text, so concepts that appear in similar contexts end up close together in vector space. Think of the space as a map where every idea has a location.
- "Snow Leopard" and "Arctic Fox" land near each other — both are cold-climate predators.
- "Mango Tree" and "Pineapple Plant" cluster together — both are tropical fruit plants.
- "Snow Leopard" and "Mango Tree" land far apart — they share almost no semantic context.
The distance between two vectors (typically measured with cosine similarity) is what "similar meaning" looks like mathematically. Cosine similarity returns a value between -1 and 1, where 1 means identical direction (very similar meaning), 0 means unrelated (orthogonal), and -1 means opposite directions, which real text embeddings rarely approach.
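Cosine similarity itself is a few lines of arithmetic: the dot product of the two vectors divided by the product of their magnitudes. A minimal sketch:

```csharp
using System;

static class VectorMath
{
    // Cosine similarity: dot(a, b) / (|a| * |b|), always in [-1, 1].
    public static double CosineSimilarity(ReadOnlySpan<float> a, ReadOnlySpan<float> b)
    {
        if (a.Length != b.Length)
            throw new ArgumentException("Vectors must have the same dimensionality.");

        double dot = 0, magA = 0, magB = 0;
        for (int i = 0; i < a.Length; i++)
        {
            dot  += a[i] * b[i];   // how much the vectors point the same way
            magA += a[i] * a[i];
            magB += b[i] * b[i];
        }
        return dot / (Math.Sqrt(magA) * Math.Sqrt(magB));
    }
}
```

Because it only compares direction, not length, two vectors pointing the same way score 1 regardless of their magnitudes, which is why it is the usual choice for embeddings.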
3. Practical Applications
| Use Case | How Embeddings Help |
|---|---|
| Semantic Search | Find documents by meaning, not by matching keywords. |
| Text Summarization | Identify which chunks of a long document carry the most core meaning, and keep only those. |
| Classification | Group texts (spam vs ham, positive vs negative, topic A vs topic B) by their vector positions. |
| Text-to-Image | Translate a text description into a visual concept that an image model can render. |
| Recommendations | Suggest similar products, songs, or articles based on vector closeness. |
| Retrieval-Augmented Generation (RAG) | Fetch the most relevant chunks of your own data and inject them into an LLM prompt. |
4. Semantic Memory and Vector Databases
Embeddings are usually pre-computed once and stored. A vector database (also called a vector store) is a database that is optimized to index these high-dimensional vectors and run nearest-neighbor searches quickly across millions or billions of entries.
A typical vector-search workflow looks like this:
- Generate an embedding for every record in your data set using an embedding model.
- Store the vectors — together with the original records — inside a vector database.
- When a user asks a question, convert that question into an embedding using the same model.
- Ask the vector database for the records whose vectors are closest to the query vector.
- Feed those records back into an LLM as grounded context to produce the final answer.
This is precisely what gives an LLM "long-term memory" about things it was never trained on — your internal docs, your product catalog, your support tickets.
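The last step of that workflow, injecting retrieved records into the prompt, is ordinary string assembly. A minimal sketch; the prompt wording and chunk numbering here are illustrative assumptions, not a fixed API:

```csharp
using System;
using System.Collections.Generic;
using System.Text;

static class RagPrompt
{
    // Builds a grounded prompt: retrieved chunks become the only context
    // the LLM is allowed to answer from.
    public static string Build(string question, IEnumerable<string> retrievedChunks)
    {
        var sb = new StringBuilder();
        sb.AppendLine("Answer the question using ONLY the context below.");
        sb.AppendLine("Context:");
        int i = 1;
        foreach (var chunk in retrievedChunks)
            sb.AppendLine($"[{i++}] {chunk}");
        sb.AppendLine($"Question: {question}");
        return sb.ToString();
    }
}
```

The numbered chunks also make it easy to ask the model to cite which record each part of its answer came from.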
5. Embeddings in .NET
The Microsoft.Extensions.AI library provides an abstraction called IEmbeddingGenerator<string, Embedding<float>> that represents any embedding model. Combined with Microsoft.Extensions.VectorData and a connector like Microsoft.SemanticKernel.Connectors.InMemory, you can build a working semantic search pipeline with a handful of lines.
Install the packages:
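The exact package names below are an assumption based on the types this article uses; the vector-data abstractions and the in-memory connector have shipped as prerelease packages, so the --prerelease flag may be required:

```shell
dotnet add package Microsoft.Extensions.AI
dotnet add package Microsoft.Extensions.AI.OpenAI
dotnet add package Microsoft.Extensions.VectorData.Abstractions
dotnet add package Microsoft.SemanticKernel.Connectors.InMemory --prerelease
```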
Key Building Blocks
| Type / Method | Purpose |
|---|---|
| IEmbeddingGenerator<string, Embedding<float>> | Abstraction over any embedding model (OpenAI, Azure OpenAI, local, etc.). |
| GenerateAsync(IEnumerable<string>) | Produces embedding vectors for a batch of strings in one call. |
| GenerateVectorAsync(string) | Convenience helper for embedding a single string and getting the raw vector. |
| [VectorStoreKey] / [VectorStoreData] / [VectorStoreVector(n)] | Attributes that describe how a record maps into a vector store. |
| InMemoryVectorStore | A lightweight in-process vector store — perfect for demos and tests. |
| collection.SearchAsync(vector, top: N) | Returns the top-N records whose vectors are closest to the query vector. |
Full Example
The following demo builds a tiny wildlife sanctuary field guide. It takes eight entries — four animals and four fruit plants — turns every field note into a numeric vector, shows that animals cluster away from fruits in vector space, stores everything in an in-memory vector database, and runs natural-language semantic searches against it.
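A minimal sketch of that demo follows. The OpenAI backend is an assumption (any IEmbeddingGenerator works), the entry wording beyond the four names mentioned above is invented for illustration, and the vector-store method names have varied across preview releases of Microsoft.Extensions.VectorData:

```csharp
using System;
using System.Linq;
using Microsoft.Extensions.AI;
using Microsoft.Extensions.VectorData;
using Microsoft.SemanticKernel.Connectors.InMemory;
using OpenAI;

// Assumed backend: OpenAI's text-embedding-3-small, API key in OPENAI_API_KEY.
IEmbeddingGenerator<string, Embedding<float>> generator =
    new OpenAIClient(Environment.GetEnvironmentVariable("OPENAI_API_KEY")!)
        .GetEmbeddingClient("text-embedding-3-small")
        .AsIEmbeddingGenerator();

GuideEntry[] entries =
{
    new() { Id = 1, Name = "Snow Leopard",    Note = "Solitary predator of cold mountain slopes." },
    new() { Id = 2, Name = "Arctic Fox",      Note = "Small predator with thick fur for polar winters." },
    new() { Id = 3, Name = "Caracal",         Note = "Agile desert cat that snatches birds mid-leap." },
    new() { Id = 4, Name = "Harpy Eagle",     Note = "Powerful raptor of the rainforest canopy." },
    new() { Id = 5, Name = "Mango Tree",      Note = "Tropical tree bearing sweet orange fruit." },
    new() { Id = 6, Name = "Pineapple Plant", Note = "Spiky tropical plant with a sweet yellow fruit." },
    new() { Id = 7, Name = "Banana Palm",     Note = "Fast-growing tropical plant with hanging fruit bunches." },
    new() { Id = 8, Name = "Papaya Tree",     Note = "Soft-fruited tropical tree common in sanctuaries." },
};

// 1. Embed every field note in a single batch call.
var embeddings = await generator.GenerateAsync(entries.Select(e => e.Note));
foreach (var (entry, embedding) in entries.Zip(embeddings))
    entry.Vector = embedding.Vector;

// 2. Store the records, vectors included, in an in-memory vector database.
var store = new InMemoryVectorStore();
var collection = store.GetCollection<int, GuideEntry>("field-guide");
await collection.EnsureCollectionExistsAsync();
await collection.UpsertAsync(entries);

// 3. Semantic search: embed the question with the SAME model, then ask for
//    the nearest records. Note that "freezing" appears in none of the notes.
ReadOnlyMemory<float> query =
    await generator.GenerateVectorAsync("Which animals survive freezing weather?");

await foreach (var result in collection.SearchAsync(query, top: 3))
    Console.WriteLine($"{result.Score:F3}  {result.Record.Name}");

// One field-guide entry. The attributes tell the vector store which property
// is the key, which are payload data, and which holds the embedding.
public class GuideEntry
{
    [VectorStoreKey] public int Id { get; set; }
    [VectorStoreData] public string Name { get; set; } = "";
    [VectorStoreData] public string Note { get; set; } = "";
    [VectorStoreVector(1536)] public ReadOnlyMemory<float> Vector { get; set; }
}
```

Running a query like the one above should rank Snow Leopard and Arctic Fox at the top, while the fruit-plant entries score far lower: the clustering the article describes, observed directly.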
Key Takeaways
- Embeddings are numeric vectors that encode the meaning of non-numeric data.
- Similar concepts produce similar vectors — similarity is measured with cosine similarity.
- They power semantic search, classification, summarization, recommendations, text-to-image, and RAG.
- Vectors are stored in vector databases to give LLMs long-term semantic memory.
- In .NET, IEmbeddingGenerator<string, Embedding<float>> plus a vector-store connector is enough to build a full pipeline.
Reference
Embeddings in .NET — Microsoft Learn