Semantic Kernel Prompt Created: 23 Jan 2026 Updated: 23 Jan 2026

Prompt Execution Settings in Semantic Kernel

What Are Prompt Execution Settings?

Prompt Execution Settings are parameters passed to the LLM alongside your prompt. They don't change what you are asking the AI, but rather how the AI should go about generating the answer.

In Semantic Kernel architecture, Execution Settings are a core component. While there is a base PromptExecutionSettings class, developers working with OpenAI or Azure OpenAI models will primarily use their specific derivatives:

  1. OpenAIPromptExecutionSettings
  2. AzureOpenAIPromptExecutionSettings

These classes expose properties that directly map to the API parameters of the underlying models (like GPT-4 or GPT-3.5 Turbo).
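As a minimal sketch (assuming the Microsoft.SemanticKernel.Connectors.OpenAI package is referenced), the OpenAI-specific derivative can be configured while still being typed as the base class:

```csharp
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.Connectors.OpenAI;

// The derived class exposes OpenAI-specific knobs, but it can be passed
// anywhere a base PromptExecutionSettings is expected.
PromptExecutionSettings settings = new OpenAIPromptExecutionSettings
{
    Temperature = 0.2, // maps directly to the OpenAI "temperature" API parameter
    MaxTokens = 256    // maps to the "max_tokens" API parameter
};
```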

Key Execution Setting Properties

Let's examine the most popular properties used to configure AI behavior, understanding their accepted values and impact.

1. Temperature: Controlling Randomness

  1. Type: double
  2. Default Value: Usually 1.0 (depending on the specific model)
  3. Range: typically 0.0 to 2.0

Temperature is perhaps the most frequently tweaked setting. It controls the randomness and creativity of the model's completion. It influences the internal algorithms that determine predicted token probabilities.

  1. Lower Values (e.g., 0.0 - 0.4): The model becomes more deterministic, focused, and conservative. It will almost always choose the most probable next token. Use this for tasks requiring factual accuracy, code generation, or classification.
  2. Higher Values (e.g., 1.0 - 1.5+): The model takes more risks, choosing less probable tokens. This leads to more creative, diverse, and sometimes unexpected outputs. Use this for creative writing, brainstorming, or generating novel ideas.

Analogy: A temperature of 0.0 is like an accountant; a temperature of 1.5 is like a poet.
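As a sketch (property names per OpenAIPromptExecutionSettings; kernel setup is assumed), the two extremes might be configured like this:

```csharp
using Microsoft.SemanticKernel.Connectors.OpenAI;

// Deterministic profile: classification, code generation, extraction.
var factual = new OpenAIPromptExecutionSettings
{
    Temperature = 0.0 // almost always pick the most probable next token
};

// Creative profile: brainstorming, story writing.
var creative = new OpenAIPromptExecutionSettings
{
    Temperature = 1.3 // sample lower-probability tokens more often
};
```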

2. MaxTokens: Controlling Length

  1. Type: int / int? (nullable integer)
  2. Value: A positive integer representing the limit.

MaxTokens sets a hard limit on the maximum number of tokens the model generates in its completion.

This is crucial for two reasons:

  1. Cost Control: Preventing the model from generating exceptionally long, expensive responses.
  2. Output Management: Ensuring responses fit within UI or downstream constraints.

Example: If set to 50, the model stops generating immediately after producing the 50th token, even if that cuts it off mid-sentence.
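A short sketch of capping the output and detecting truncation, assuming a configured kernel (the "FinishReason" metadata key is the one the OpenAI connector uses, as shown in the full example later in this lesson):

```csharp
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.Connectors.OpenAI;

var settings = new OpenAIPromptExecutionSettings { MaxTokens = 50 };

var result = await kernel.InvokePromptAsync(
    "Summarize the history of Rome.",
    new KernelArguments(settings));

// When the cap is hit, the connector reports a "Length" finish reason
// in the response metadata instead of the usual "Stop".
if (result.Metadata != null &&
    result.Metadata.TryGetValue("FinishReason", out var reason))
{
    Console.WriteLine($"Finish reason: {reason}");
}
```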

3. ChatSystemPrompt: Defining Persona

  1. Type: string
  2. Default Value: Usually generic, like "Assistant is a large language model."

When using chat completion models (like gpt-3.5-turbo or gpt-4), the ChatSystemPrompt is essential. It sets the "system message," which defines the AI's persona, role, and fundamental rules of engagement.

Unlike the user prompt, which carries the specific task, the system prompt defines who the AI is while performing that task.

Example uses:

  1. Standard: "You are a helpful AI assistant."
  2. Specific: "You are an expert historian specializing in the Roman Empire. Provide detailed, academically rigorous answers."
  3. Behavioral: "You are a JSON converter. Output only raw JSON without any markdown formatting or explanations."
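For example, a sketch pinning the JSON-converter persona while the user prompt carries only the task (kernel setup assumed):

```csharp
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.Connectors.OpenAI;

var settings = new OpenAIPromptExecutionSettings
{
    // The system prompt defines who the AI is...
    ChatSystemPrompt = "You are a JSON converter. Output only raw JSON " +
                       "without any markdown formatting or explanations.",
    Temperature = 0.0 // formatting tasks benefit from determinism
};

// ...while the user prompt carries the specific task.
var json = await kernel.InvokePromptAsync(
    "Convert to JSON: name Anna, age 31, city Oslo",
    new KernelArguments(settings));
```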

4. User: Tracking End-Users

  1. Type: string
  2. Value: A unique identifier string.

The User property allows you to pass a unique identifier representing your end-user to OpenAI. This does not change the AI's output directly, but it is vital for enterprise applications. It helps OpenAI monitor usage patterns, detect abuse, and assist with debugging issues linked to specific user sessions.
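A minimal sketch, assuming you already have a stable per-user identifier (a hashed account ID, for example, rather than raw personal data):

```csharp
using Microsoft.SemanticKernel.Connectors.OpenAI;

var settings = new OpenAIPromptExecutionSettings
{
    // Stable, non-identifying string per end-user; OpenAI uses it for
    // abuse monitoring, and it helps correlate issues to sessions.
    User = "user-7f3a9c" // hypothetical hashed identifier
};
```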

5. Logprobs and TopLogprobs: Inspecting Confidence

AI generation is inherently non-deterministic. Even with an identical prompt, you might get different answers because the model samples tokens based on probabilities.

These two settings are designed for debugging and analyzing the model's "confidence" in its choices.

Logprobs

  1. Type: bool / bool?
  2. Default Value: false

Setting Logprobs = true instructs the API to return metadata about the log probabilities of the tokens it generated. It essentially asks the model, "Show me the math behind why you chose these words."

When enabled, the response metadata will contain ContentTokenLogProbabilities, revealing how strongly the model felt about its chosen token versus alternatives.

TopLogprobs

  1. Type: int / int?
  2. Value: An integer specifying how many alternatives to show.

If Logprobs is true, TopLogprobs defines how many alternative token choices you want to see for each generated token.

  1. Example: If set to 5, for every word the AI generates, it will report the top 5 words it considered choosing at that specific step, along with their respective probabilities.
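Putting the two together (these properties are experimental in the OpenAI connector, hence the SKEXP0010 pragma; converting a log probability back to a percentage is plain Math.Exp):

```csharp
using Microsoft.SemanticKernel.Connectors.OpenAI;

#pragma warning disable SKEXP0010
var settings = new OpenAIPromptExecutionSettings
{
    Logprobs = true, // return log-probability metadata with the response
    TopLogprobs = 5  // also list the 5 runner-up tokens per position
};
#pragma warning restore SKEXP0010

// A log probability is ln(p); recover the percentage with Math.Exp:
double logProbability = -0.105;
Console.WriteLine($"{Math.Exp(logProbability) * 100:F1}%"); // ≈ 90.0%
```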


Complete Example

The following console application combines these settings and then inspects the resulting metadata.

using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.Connectors.OpenAI;
using OpenAI.Chat;

var apiKey = Environment.GetEnvironmentVariable("OPEN_AI_KEY");

if (string.IsNullOrEmpty(apiKey))
{
    Console.WriteLine("Please set the OPEN_AI_KEY environment variable.");
    return;
}

var kernel = Kernel.CreateBuilder()
    .AddOpenAIChatCompletion(
        modelId: "gpt-4o",
        apiKey: apiKey)
    .Build();


Console.WriteLine("=== Execution Settings Demo ===\n");

#pragma warning disable SKEXP0010
var executionSettings = new OpenAIPromptExecutionSettings
{
    // Controls randomness. Higher = more creative/random.
    Temperature = 0.9,

    // Limits the output length to save costs and maintain focus.
    MaxTokens = 100,

    // Enables return of probability data for debugging confidence.
    Logprobs = true,

    // Asks for the top 10 alternative choices for each token generated.
    TopLogprobs = 10,

    // Identifies the end-user for telemetry and abuse detection.
    User = "HealthTracker",

    // Defines the crucial persona and operational constraints of the AI.
    ChatSystemPrompt = """
        You are a professional health and fitness assistant specializing in nutrition advice.
        Your task is to provide quick, actionable health tips.
        Keep responses concise and focused on practical nutrition guidance.
        """
};
#pragma warning restore SKEXP0010

// Pass settings into KernelArguments
var kernelArguments = new KernelArguments(executionSettings);

// Define the user prompt
var userPrompt = """
Suggest one healthy breakfast option that is high in protein.
Respond with only the meal name and protein content.
""";

// Invoke the prompt using the kernel and the arguments
var response = await kernel.InvokePromptAsync(userPrompt, kernelArguments);

// Output the result with detailed information
Console.WriteLine("=== RESPONSE DETAILS ===\n");

Console.WriteLine($"Content: {response}\n");

// Access metadata
if (response.Metadata != null)
{
    Console.WriteLine("--- Metadata Summary ---");

    // Display basic metadata
    if (response.Metadata.TryGetValue("Id", out var id))
        Console.WriteLine($"Request ID: {id}");

    if (response.Metadata.TryGetValue("CreatedAt", out var createdAt))
        Console.WriteLine($"Created At: {createdAt}");

    if (response.Metadata.TryGetValue("SystemFingerprint", out var fingerprint))
        Console.WriteLine($"System Fingerprint: {fingerprint}");

    if (response.Metadata.TryGetValue("FinishReason", out var finishReason))
        Console.WriteLine($"Finish Reason: {finishReason}");

    // Display token usage details
    if (response.Metadata.TryGetValue("Usage", out var usage))
    {
        Console.WriteLine("\n--- Token Usage ---");
        if (usage is ChatTokenUsage tokenUsage)
        {
            Console.WriteLine($"Prompt Tokens: {tokenUsage.InputTokenCount}");
            Console.WriteLine($"Completion Tokens: {tokenUsage.OutputTokenCount}");
            Console.WriteLine($"Total Tokens: {tokenUsage.TotalTokenCount}");
        }
    }

    // Display log probabilities if available
    if (response.Metadata.TryGetValue("ContentTokenLogProbabilities", out var logProbs))
    {
        Console.WriteLine("\n--- Token Log Probabilities ---");

        if (logProbs is IReadOnlyList<ChatTokenLogProbabilityDetails> tokenLogProbs
            && tokenLogProbs.Count > 0)
        {
            Console.WriteLine($"Total tokens generated: {tokenLogProbs.Count}");
            Console.WriteLine("\nShowing first 5 tokens:\n");

            for (var i = 0; i < tokenLogProbs.Count && i < 5; i++)
            {
                var tokenInfo = tokenLogProbs[i];
                Console.WriteLine($"Token {i + 1}: '{tokenInfo.Token}'");
                Console.WriteLine($"  Log Probability: {tokenInfo.LogProbability:F4}");
                Console.WriteLine($"  Probability: {Math.Exp(tokenInfo.LogProbability) * 100:F2}%");

                if (tokenInfo.TopLogProbabilities != null && tokenInfo.TopLogProbabilities.Count > 0)
                {
                    var altCount = Math.Min(3, tokenInfo.TopLogProbabilities.Count);
                    Console.WriteLine($"  Top {altCount} Alternatives:");
                    for (var j = 0; j < altCount; j++)
                    {
                        var alt = tokenInfo.TopLogProbabilities[j];
                        Console.WriteLine($"    {j + 1}. '{alt.Token}' - {Math.Exp(alt.LogProbability) * 100:F2}%");
                    }
                }

                Console.WriteLine();
            }

            if (tokenLogProbs.Count > 5)
                Console.WriteLine($"... and {tokenLogProbs.Count - 5} more tokens\n");
        }
        else
        {
            Console.WriteLine("No token log probabilities available.");
        }
    }
}
else
{
    Console.WriteLine("No metadata available.");
}

Console.WriteLine("=== Demo Complete ===");

