Table of Contents
- Introduction
- Why Rate Limiting Matters
- The Four Pillars
- Core Concepts
- Seven Strategies Overview
- HTTP 429 Response
- Next Steps
Introduction
In today's API-driven world, rate limiting is not optional; it's essential. Whether you're building a public REST API, a microservices architecture, or a SaaS platform, protecting your infrastructure while ensuring fair resource distribution is critical.
What is Rate Limiting?
Rate limiting controls the number of requests a client can make to your API within a specific time window. Think of it as a bouncer at a nightclub: only a certain number of people can enter at a time.
Simple Example:
Without Rate Limiting:
Client → 10,000 requests/second → Server 💥 Crash
With Rate Limiting:
Client → 10,000 requests/second → Rate Limiter (100 allowed) → Server ✅ Stable
This Article Series
This is Part 1 of 10 in our comprehensive rate limiting series:
- Overview (This article) - Why, what, and how
- Fixed Window Strategy
- Sliding Window Strategy
- Token Bucket Strategy
- Concurrency Strategy
- Per-User Strategy
- Chained Strategy
- Tiered Strategy
- Comparison & Selection Guide
- Testing & Production Guide
Why Rate Limiting Matters
Real-World Problem
Scenario: E-commerce Flash Sale
9:59 AM: Normal traffic (100 req/s)
10:00 AM: Flash sale starts
10:00 AM: Traffic spikes to 50,000 req/s
10:00:30 AM: Database overwhelmed
10:00:45 AM: Complete service outage
10:15 AM: Customers angry, revenue lost
With Rate Limiting:
9:59 AM: Normal traffic (100 req/s)
10:00 AM: Flash sale starts
10:00 AM: Rate limiter allows 1,000 req/s
10:00 AM: Queue system handles overflow
10:00-10:15 AM: All customers served fairly
Result: Happy customers, no outage, maximum revenue
The Cost of NOT Having Rate Limiting
Example: Startup API (Real numbers)
Incident Report - January 15, 2025
Without Rate Limiting:
- 14:23: Buggy mobile app released
- 14:24: App makes 10 req/s per user instead of 1 req/s
- 14:25: 10,000 active users = 100,000 req/s
- 14:26: API Gateway crashes
- 14:27-16:30: Complete service outage (2 hours)
Impact:
- Lost revenue: $50,000
- AWS overage charges: $8,500
- Customer churn: 15%
- Engineer overtime: $2,000
- Reputation damage: Priceless
Total Cost: $60,500+ for a 2-hour outage
With Rate Limiting (2 req/s per user, double the normal 1 req/s):
- 14:23: Buggy app released
- 14:24: Rate limiter kicks in
- 14:25: App gets 429 responses
- 14:26: Monitoring alerts triggered
- 14:30: Buggy version pulled
Impact:
- Lost revenue: $500
- AWS costs: Normal
- Customer churn: 0%
- Total Cost: $500
ROI: $60,000 saved with rate limiting!
The Four Pillars
1. 🛡️ Security
Protection Against Attacks:
| Attack Type | Without Rate Limit | With Rate Limit |
| --- | --- | --- |
| DDoS | Server down in seconds | Attacker gets 429, others unaffected |
| Brute Force | 1M password attempts in 10 min | 10 attempts, then blocked for 1 hour |
| Credential Stuffing | 100K accounts compromised | Only 5 login attempts per IP |
| Web Scraping | Entire database downloaded | Slow drip, scraper gives up |
Real Example: GitHub
GitHub Rate Limits:
- Authenticated: 5,000 requests/hour
- Unauthenticated: 60 requests/hour
Result:
- Legitimate users: Never hit limit
- Scrapers: Frustrated and blocked
- API stays stable
2. 💰 Cost Control
Cloud Cost Reduction:
Scenario: Weather API
Without Rate Limiting:
Monthly Traffic:
- Expected: 10M requests
- Actual: 500M requests (bot traffic)
AWS Costs:
- API Gateway: 500M × $3.50/M = $1,750
- Lambda: 500M × $0.20/M = $100
- DynamoDB: 500M reads × $0.25/M = $125
- Data Transfer: $300
Total: $2,275/month
With Rate Limiting (1K req/hour per IP):
Monthly Traffic:
- Expected: 10M requests
- Actual: 12M requests (2M legitimate bursts)
AWS Costs:
- API Gateway: 12M × $3.50/M = $42
- Lambda: 12M × $0.20/M = $2.40
- DynamoDB: 12M reads × $0.25/M = $3
- Data Transfer: $10
Total: $57.40/month
Savings: $2,217.60/month (97% reduction!)
Annual Savings: $26,611
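The comparison above is simple arithmetic, so it can be checked directly. A quick sketch (Python, purely for the calculation; the per-million unit prices and flat data-transfer figures are the ones quoted above):

```python
# Recompute the AWS cost comparison above. Unit prices are the article's
# quoted per-million-request rates; data transfer is a flat monthly figure.
def monthly_cost(millions_of_requests: float, data_transfer: float) -> float:
    api_gateway = millions_of_requests * 3.50
    lambda_fn = millions_of_requests * 0.20
    dynamodb = millions_of_requests * 0.25
    return api_gateway + lambda_fn + dynamodb + data_transfer

without_limit = monthly_cost(500, 300)   # 500M requests, mostly bot traffic
with_limit = monthly_cost(12, 10)        # 12M requests after rate limiting

print(round(without_limit, 2))               # 2275.0
print(round(with_limit, 2))                  # 57.4
print(round(without_limit - with_limit, 2))  # 2217.6
```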
3. ⚖️ Fair Usage
The Noisy Neighbor Problem:
Shared API (100 requests/second capacity)
Without Rate Limiting:
┌───────────────────────────────────────────────────┐
│ User A (Power User): ████████████████████  90%    │
│ User B:              █                      5%    │
│ User C:              █                      5%    │
└───────────────────────────────────────────────────┘
Result: 99% of users have poor experience
With Rate Limiting (10 req/s per user):
┌───────────────────────────────────────────────────┐
│ User A: ██████████  10 req/s                      │
│ User B: ██████████  10 req/s                      │
│ User C: ██████████  10 req/s                      │
└───────────────────────────────────────────────────┘
Result: Fair access for everyone
4. 📊 Predictability
Capacity Planning Benefits:
Known Rate Limits = Predictable Infrastructure
Example:
- API Limit: 1,000 req/s
- Average Response Time: 50ms
- Required Capacity:
* Concurrent Requests: 1,000 × 0.05 = 50
* CPU Cores: 50 / 10 = 5 cores
* Memory: 5 × 2GB = 10GB RAM
* Instances: ceil(10GB / 4GB) = 3 instances
Budget:
- 3 instances × $100/month = $300/month
- Predictable, no surprises!
Without Rate Limiting:
- Peak load unknown (could be 1K or 100K)
- Must overprovision (10x capacity)
- Cost: $3,000/month or risk outages
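The capacity estimate above follows Little's law (concurrent requests = arrival rate × response time). A sketch of the same calculation (Python for illustration; the per-core figures are the planning assumptions used above, not universal constants):

```python
import math

# Reproduce the capacity estimate above. Planning assumptions from the
# example: 10 concurrent requests per CPU core, 2 GB RAM per core,
# 4 GB usable RAM per instance.
rate_limit = 1_000            # req/s, the advertised API limit
avg_response_s = 0.05         # 50 ms average response time

concurrent = rate_limit * avg_response_s      # Little's law: L = lambda * W
cores = concurrent / 10                       # 10 concurrent requests per core
ram_gb = cores * 2                            # 2 GB per core
instances = math.ceil(ram_gb / 4)             # 4 GB per instance

print(concurrent, cores, ram_gb, instances)   # 50.0 5.0 10.0 3
```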
Core Concepts
Request Rate
Formula:
Rate = Number of Requests / Time Window
Examples:
Web API: 100 requests/minute = 1.67 requests/second
Payment API: 5 requests/minute = 0.08 requests/second
CDN: 10K requests/second = 600K requests/minute
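The formula is a straight division, which makes the conversions above easy to verify (Python for illustration; the helper name is ours):

```python
def to_per_second(requests: float, window_seconds: float) -> float:
    """Rate = number of requests / time window, normalized to req/s."""
    return requests / window_seconds

# The examples above:
print(round(to_per_second(100, 60), 2))    # Web API: 1.67
print(round(to_per_second(5, 60), 2))      # Payment API: 0.08
print(int(to_per_second(10_000, 1) * 60))  # CDN: 600000 req/min
```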
Time Windows
Common Window Sizes:
| Window Size | Use Case | Example |
| --- | --- | --- |
| 1 second | Real-time APIs | Chat, streaming |
| 1 minute | Standard APIs | REST endpoints |
| 1 hour | Bulk operations | Report generation |
| 1 day | Freemium tiers | Free API access |
Burst Handling
What is a Burst?
Normal Traffic:
00:00 → 10 req/s
00:10 → 10 req/s
00:20 → 10 req/s
Bursty Traffic:
00:00 → 5 req/s
00:10 → 5 req/s
00:20 → 100 req/s (BURST!)
00:30 → 5 req/s
Burst Strategies:
- Reject Immediately (Strict)
Burst detected → All requests > limit rejected
Use: Payment processing, critical operations
- Queue (Tolerant)
Burst detected → Extra requests queued
Use: File uploads, batch processing
- Allow with Token Bucket (Flexible)
Burst detected → Use accumulated tokens
Use: User-facing APIs, bursty workloads
Queue Management
Queue States:
Request arrives:
  ↓
Is rate limit exceeded?
  ├─ No  → Process immediately ✅
  └─ Yes → Is queue full?
           ├─ No  → Add to queue 🟡 (wait)
           └─ Yes → Reject request ❌ (429)
Queue Processing:
Request completes → Dequeue next request → Process
Queue Configuration:
options.QueueLimit = 10; // Max 10 requests waiting
options.QueueProcessingOrder = QueueProcessingOrder.OldestFirst; // FIFO
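The snippet above configures ASP.NET's `System.Threading.RateLimiting` options; the decision flow itself is language-agnostic. A minimal sketch of that flow (Python, illustrative only; the class name and string statuses are ours):

```python
from collections import deque

class QueueingLimiter:
    """Sketch of the flow above: process while under the limit,
    queue while there is room, otherwise reject with 429."""
    def __init__(self, limit: int, queue_limit: int):
        self.limit = limit              # max requests processed at once
        self.queue_limit = queue_limit  # max requests allowed to wait
        self.in_flight = 0
        self.queue = deque()            # FIFO = OldestFirst processing order

    def try_acquire(self, request_id: str) -> str:
        if self.in_flight < self.limit:
            self.in_flight += 1
            return "processed"
        if len(self.queue) < self.queue_limit:
            self.queue.append(request_id)
            return "queued"
        return "rejected (429)"

    def release(self) -> None:
        """A request completed: the oldest waiter takes the freed slot."""
        if self.queue:
            self.queue.popleft()        # promoted waiter keeps the slot busy
        else:
            self.in_flight -= 1
```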
Seven Strategies Overview
Quick Comparison Matrix
| Strategy | Complexity | Burst Handling | Memory | Best For |
| --- | --- | --- | --- | --- |
| Fixed Window | ⭐ Low | ❌ Poor | ⭐ Low | Internal APIs |
| Sliding Window | ⭐⭐ Medium | ✅ Good | ⭐⭐ Medium | Production APIs |
| Token Bucket | ⭐⭐⭐ High | ✅ Excellent | ⭐⭐ Medium | Bursty traffic |
| Concurrency | ⭐ Low | N/A | ⭐ Low | Long operations |
| Per-User | ⭐⭐ Medium | Varies | ⭐⭐⭐ High | Multi-tenant |
| Chained | ⭐⭐⭐ High | ✅ Strict | ⭐⭐⭐ High | Critical resources |
| Tiered | ⭐⭐ Medium | Varies | ⭐⭐ Medium | SaaS/Monetization |
1. Fixed Window
Concept: Fixed time buckets with counters
[00:00-00:59] → 10 requests allowed
[01:00-01:59] → Counter resets, 10 more allowed
Pros: Simple, low memory
Cons: Boundary burst problem
Read More: Fixed Window Strategy
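A minimal sketch of the counter-reset behavior (Python for illustration; the class name and the injected `now` clock are ours, not from a specific library):

```python
class FixedWindowLimiter:
    """Fixed window sketch: one counter per time bucket, reset at each
    boundary. Simple and cheap, but up to 2x the limit can slip through
    around a boundary (the burst problem noted above)."""
    def __init__(self, limit: int, window_s: float):
        self.limit, self.window_s = limit, window_s
        self.window_start = 0.0
        self.count = 0

    def allow(self, now: float) -> bool:
        if now - self.window_start >= self.window_s:
            self.window_start = now    # new bucket: counter resets
            self.count = 0
        if self.count < self.limit:
            self.count += 1
            return True
        return False
```

Passing `now` explicitly (instead of reading the clock inside) keeps the sketch deterministic and easy to test.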
2. Sliding Window
Concept: Weighted sliding time window
Uses previous window data to smooth traffic
No boundary burst problem
Pros: Smooth, production-ready
Cons: More complex, higher memory
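The weighting idea can be sketched in a few lines: the previous window's count is discounted by how much of it still overlaps the sliding window (Python for illustration; names are ours, and a real implementation would prune old entries):

```python
class SlidingWindowLimiter:
    """Weighted sliding window (sketch): estimated count =
    current window count + previous window count * remaining overlap."""
    def __init__(self, limit: int, window_s: float):
        self.limit, self.window_s = limit, window_s
        self.windows = {}        # window index -> request count (unpruned)

    def allow(self, now: float) -> bool:
        idx = int(now // self.window_s)
        elapsed = (now % self.window_s) / self.window_s   # fraction into window
        prev = self.windows.get(idx - 1, 0)
        cur = self.windows.get(idx, 0)
        estimated = cur + prev * (1 - elapsed)            # smoothed count
        if estimated < self.limit:
            self.windows[idx] = cur + 1
            return True
        return False
```

With a limit of 10/minute, filling the first window means that 30 seconds into the next window only about 5 more requests pass, instead of a fresh 10 at the boundary.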
3. Token Bucket
Concept: Bucket of tokens, refills at constant rate
Start: 20 tokens
Request: -1 token
Refill: +5 tokens every 10 seconds
Pros: Excellent burst handling
Cons: Complex configuration
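Using the numbers above (20-token bucket, +5 tokens every 10 seconds, i.e. 0.5 tokens/s), the refill logic can be sketched as (Python for illustration; continuous refill is our simplification of the stepped "+5 every 10s" schedule):

```python
class TokenBucket:
    """Token bucket sketch with the example's numbers: a 20-token bucket
    refilled at 0.5 tokens/s (5 tokens per 10 seconds), capped at capacity."""
    def __init__(self, capacity: float = 20, refill_per_s: float = 0.5):
        self.capacity = capacity
        self.refill_per_s = refill_per_s
        self.tokens = capacity    # bucket starts full: bursts are allowed
        self.last = 0.0

    def allow(self, now: float) -> bool:
        # Refill for the elapsed time, never exceeding the bucket capacity.
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_s)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1      # each request costs one token
            return True
        return False
```

A full bucket absorbs a 20-request burst instantly; once drained, throughput settles to the steady refill rate.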
4. Concurrency
Concept: Limits simultaneous requests
Max 5 concurrent requests
Request 6-15: Queued
Request 16+: Rejected
Pros: Protects resources
Cons: Doesn't limit rate
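The bookkeeping behind "5 running, 10 queued, rest rejected" is just two counters (Python sketch; a real server would use a semaphore and an actual queue rather than counts):

```python
class ConcurrencyLimiter:
    """Sketch of the example above: 5 concurrent slots, 10 queue slots.
    Counts stand in for real request objects."""
    def __init__(self, max_concurrent: int = 5, max_queued: int = 10):
        self.max_concurrent = max_concurrent
        self.max_queued = max_queued
        self.running = 0
        self.queued = 0

    def on_request(self) -> str:
        if self.running < self.max_concurrent:
            self.running += 1
            return "running"
        if self.queued < self.max_queued:
            self.queued += 1
            return "queued"
        return "rejected (429)"

    def on_complete(self) -> None:
        if self.queued:            # promote the oldest waiter into the slot
            self.queued -= 1
        else:
            self.running -= 1
```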
5. Per-User Partitioned
Concept: Independent limits per user/IP
User A: 30 req/min
User B: 30 req/min
Independent buckets
Pros: Fair allocation
Cons: Higher memory
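Partitioning is just "one limiter state per key". A sketch combining it with a fixed-window counter (Python for illustration; the key could be a user ID or client IP, and the per-key dict is exactly where the "higher memory" cost comes from):

```python
from collections import defaultdict

class PerUserLimiter:
    """Independent fixed-window counter per user key (sketch).
    Memory grows with the number of active keys."""
    def __init__(self, limit: int = 30, window_s: float = 60):
        self.limit, self.window_s = limit, window_s
        # key -> [window_start, count]; one bucket per user/IP
        self.counters = defaultdict(lambda: [0.0, 0])

    def allow(self, user: str, now: float) -> bool:
        state = self.counters[user]
        if now - state[0] >= self.window_s:
            state[0], state[1] = now, 0    # this user's window resets
        if state[1] < self.limit:
            state[1] += 1
            return True
        return False
```

One user exhausting their 30 req/min has no effect on anyone else's bucket.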
6. Chained
Concept: Multiple limiters in sequence
Request → Concurrency Check → Rate Check → Token Check
All must pass
Pros: Comprehensive protection
Cons: Complex, restrictive
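The "all must pass" rule can be sketched as a pipeline of independent checks (Python for illustration; the helper names are ours, and the caveat in the comment is a real design concern for chained limiters):

```python
def make_counter_limiter(limit: int):
    """One link in the chain: a plain counting limiter (sketch)."""
    state = {"count": 0}
    def allow() -> bool:
        if state["count"] < limit:
            state["count"] += 1
            return True
        return False
    return allow

def chained_allow(limiters) -> bool:
    # Every link must admit the request. Caveat: a production chain should
    # refund earlier limiters when a later one rejects, otherwise requests
    # that fail late still consume early quota.
    return all(allow() for allow in limiters)

# The strictest link caps the chain: with limits of 10 and 5,
# only 5 requests pass end to end.
chain = [make_counter_limiter(10), make_counter_limiter(5)]
```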
7. Tiered
Concept: Different limits by subscription
Free: 10 req/min
Basic: 50 req/min
Premium: 200 req/min
Enterprise: 1000 req/min
Pros: Monetization-ready
Cons: Requires user management
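At its core, tiering is a lookup from the caller's subscription to a limit, which then feeds whichever limiter algorithm you use. A sketch with the tiers above (Python for illustration; in practice the tier comes from the authenticated user's subscription record, a hypothetical lookup here):

```python
# Per-tier limits (req/min) from the tiers listed above.
TIER_LIMITS = {
    "free": 10,
    "basic": 50,
    "premium": 200,
    "enterprise": 1000,
}

def limit_for(user_tier: str) -> int:
    """Resolve a caller's rate limit, defaulting unknown tiers to free."""
    return TIER_LIMITS.get(user_tier, TIER_LIMITS["free"])
```

Defaulting unknown tiers to the free limit fails safe: a misconfigured account gets throttled, not unlimited access.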
HTTP 429 Response
Standard Response Format
HTTP/1.1 429 Too Many Requests
Content-Type: application/json
Retry-After: 60
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1642521600
{
"error": "rate_limit_exceeded",
"message": "API rate limit exceeded. Please retry after 60 seconds.",
"documentation_url": "https://api.example.com/docs/rate-limits",
"limit": 100,
"remaining": 0,
"reset": "2025-01-15T10:35:00Z"
}
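A well-behaved client reads these headers and backs off accordingly. A parsing sketch (Python for illustration; the function name and returned dict shape are ours, the header names are the ones shown above):

```python
def parse_rate_limit_headers(status: int, headers: dict) -> dict:
    """Extract the standard rate-limit fields from a response (sketch)."""
    info = {
        "limited": status == 429,
        "limit": int(headers.get("X-RateLimit-Limit", 0)),
        "remaining": int(headers.get("X-RateLimit-Remaining", 0)),
        "reset": int(headers.get("X-RateLimit-Reset", 0)),   # Unix timestamp
    }
    if info["limited"]:
        # Honor Retry-After; fall back to a short default backoff.
        info["retry_after_s"] = int(headers.get("Retry-After", 1))
    return info
```

A retry loop would sleep for `retry_after_s` seconds before reissuing the request, rather than hammering the API and staying limited.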
Response Headers
Standard Headers:
| Header | Description | Example |
| --- | --- | --- |
| Retry-After | Seconds to wait | 60 |
| X-RateLimit-Limit | Max requests allowed | 100 |
| X-RateLimit-Remaining | Requests left | 0 |
| X-RateLimit-Reset | Unix timestamp of reset | 1642521600 |