ASP.NET Core Security: Rate Limiting
Created: 24 Jan 2026 | Updated: 24 Jan 2026

API Rate Limiting: A Comprehensive Overview

Table of Contents

  1. Introduction
  2. Why Rate Limiting Matters
  3. The Four Pillars
  4. Core Concepts
  5. Seven Strategies Overview
  6. HTTP 429 Response
  7. Next Steps

Introduction

In today's API-driven world, rate limiting is not optional; it's essential. Whether you're building a public REST API, a microservices architecture, or a SaaS platform, protecting your infrastructure while ensuring fair resource distribution is critical.

What is Rate Limiting?

Rate limiting controls the number of requests a client can make to your API within a specific time window. Think of it as a bouncer at a nightclub: only a certain number of people can enter at a time.

Simple Example:

Without Rate Limiting:
Client → 10,000 requests/second → Server 💥 Crash

With Rate Limiting:
Client → 10,000 requests/second → Rate Limiter (100 allowed) → Server ✅ Stable

This Article Series

This is Part 1 of 10 in our comprehensive rate limiting series:

  1. Overview (This article) - Why, what, and how
  2. Fixed Window Strategy
  3. Sliding Window Strategy
  4. Token Bucket Strategy
  5. Concurrency Strategy
  6. Per-User Strategy
  7. Chained Strategy
  8. Tiered Strategy
  9. Comparison & Selection Guide
  10. Testing & Production Guide

Why Rate Limiting Matters

Real-World Problem

Scenario: E-commerce Flash Sale

9:59 AM: Normal traffic (100 req/s)
10:00 AM: Flash sale starts
10:00 AM: Traffic spikes to 50,000 req/s
10:00:30 AM: Database overwhelmed
10:00:45 AM: Complete service outage
10:15 AM: Customers angry, revenue lost

With Rate Limiting:

9:59 AM: Normal traffic (100 req/s)
10:00 AM: Flash sale starts
10:00 AM: Rate limiter allows 1,000 req/s
10:00 AM: Queue system handles overflow
10:00-10:15 AM: All customers served fairly
Result: Happy customers, no outage, maximum revenue

The Cost of NOT Having Rate Limiting

Example: Startup API (Real numbers)

Incident Report - January 15, 2025

Without Rate Limiting:
- 14:23: Buggy mobile app released
- 14:24: App makes 10 req/s per user instead of 1 req/s
- 14:25: 10,000 active users = 100,000 req/s
- 14:26: API Gateway crashes
- 14:27-16:30: Complete service outage (2 hours)

Impact:
- Lost revenue: $50,000
- AWS overage charges: $8,500
- Customer churn: 15%
- Engineer overtime: $2,000
- Reputation damage: Priceless

Total Cost: $60,500+ for a 2-hour outage

With Rate Limiting (2 req/s per user):

- 14:23: Buggy app released
- 14:24: Rate limiter kicks in
- 14:25: App gets 429 responses
- 14:26: Monitoring alerts triggered
- 14:30: Buggy version pulled

Impact:
- Lost revenue: $500
- AWS costs: Normal
- Customer churn: 0%
- Total Cost: $500

ROI: $60,000 saved with rate limiting!

The Four Pillars

1. 🛡️ Security

Protection Against Attacks:

| Attack Type | Without Rate Limit | With Rate Limit |
|---|---|---|
| DDoS | Server down in seconds | Attacker gets 429, others unaffected |
| Brute Force | 1M password attempts in 10 min | 10 attempts, then blocked for 1 hour |
| Credential Stuffing | 100K accounts compromised | Only 5 login attempts per IP |
| Web Scraping | Entire database downloaded | Slow drip, scraper gives up |

Real Example: GitHub

GitHub Rate Limits:
- Authenticated: 5,000 requests/hour
- Unauthenticated: 60 requests/hour

Result:
- Legitimate users: Never hit limit
- Scrapers: Frustrated and blocked
- API stays stable

2. 💰 Cost Control

Cloud Cost Reduction:

Scenario: Weather API

Without Rate Limiting:
Monthly Traffic:
- Expected: 10M requests
- Actual: 500M requests (bot traffic)

AWS Costs:
- API Gateway: 500M × $3.50/M = $1,750
- Lambda: 500M × $0.20/M = $100
- DynamoDB: 500M reads × $0.25/M = $125
- Data Transfer: $300
Total: $2,275/month

With Rate Limiting (1K req/hour per IP):
Monthly Traffic:
- Expected: 10M requests
- Actual: 12M requests (2M legitimate bursts)

AWS Costs:
- API Gateway: 12M × $3.50/M = $42
- Lambda: 12M × $0.20/M = $2.40
- DynamoDB: 12M reads × $0.25/M = $3
- Data Transfer: $10
Total: $57.40/month

Savings: $2,217.60/month (97% reduction!)
Annual Savings: $26,611

3. ⚖️ Fair Usage

The Noisy Neighbor Problem:

Shared API (100 requests/second capacity)

Without Rate Limiting:
  User A (Power User): ████████████████████ 90%
  User B:              █ 5%
  User C:              █ 5%
Result: everyone except the power user gets a poor experience

With Rate Limiting (10 req/s per user):
  User A: ██████████ 10 req/s
  User B: ██████████ 10 req/s
  User C: ██████████ 10 req/s
Result: Fair access for everyone

4. 📊 Predictability

Capacity Planning Benefits:

Known Rate Limits = Predictable Infrastructure

Example:
- API Limit: 1,000 req/s
- Average Response Time: 50ms
- Required Capacity:
  * Concurrent Requests: 1,000 × 0.05 = 50
  * CPU Cores: 50 / 10 = 5 cores
  * Memory: 5 × 2GB = 10GB RAM
  * Instances: ceil(10GB / 4GB) = 3 instances

Budget:
- 3 instances × $100/month = $300/month
- Predictable, no surprises!

Without Rate Limiting:
- Peak load unknown (could be 1K or 100K)
- Must overprovision (10x capacity)
- Cost: $3,000/month or risk outages

Core Concepts

Request Rate

Formula:

Rate = Number of Requests / Time Window

Examples:

Web API: 100 requests/minute = 1.67 requests/second
Payment API: 5 requests/minute = 0.08 requests/second
CDN: 10K requests/second = 600K requests/minute

Time Windows

Common Window Sizes:

| Window Size | Use Case | Example |
|---|---|---|
| 1 second | Real-time APIs | Chat, streaming |
| 1 minute | Standard APIs | REST endpoints |
| 1 hour | Bulk operations | Report generation |
| 1 day | Freemium tiers | Free API access |

Burst Handling

What is a Burst?

Normal Traffic:
00:00 → 10 req/s
00:10 → 10 req/s
00:20 → 10 req/s

Bursty Traffic:
00:00 → 5 req/s
00:10 → 5 req/s
00:20 → 100 req/s (BURST!)
00:30 → 5 req/s

Burst Strategies:

  1. Reject Immediately (Strict)
     Burst detected → All requests over the limit rejected
     Use: Payment processing, critical operations
  2. Queue (Tolerant)
     Burst detected → Extra requests queued
     Use: File uploads, batch processing
  3. Allow with Token Bucket (Flexible)
     Burst detected → Use accumulated tokens
     Use: User-facing APIs, bursty workloads

Queue Management

Queue States:

Request arrives:
  ↓
Is rate limit exceeded?
├─ No  → Process immediately ✅
└─ Yes → Is queue full?
         ├─ No  → Add to queue 🟡 (Wait)
         └─ Yes → Reject request ❌ (429)

Queue Processing:
Request completes → Dequeue next request → Process

Queue Configuration:

options.QueueLimit = 10; // Max 10 requests waiting
options.QueueProcessingOrder = QueueProcessingOrder.OldestFirst; // FIFO
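
The two options above are fragments of a larger policy registration. A minimal Program.cs sketch showing where they fit, assuming the built-in Microsoft.AspNetCore.RateLimiting middleware (.NET 7+); the policy name "queued", the endpoint, and the limits are illustrative:

```csharp
using System.Threading.RateLimiting;
using Microsoft.AspNetCore.RateLimiting;

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddRateLimiter(limiter =>
{
    limiter.AddFixedWindowLimiter("queued", options =>
    {
        options.PermitLimit = 100;                // 100 requests...
        options.Window = TimeSpan.FromMinutes(1); // ...per minute
        options.QueueLimit = 10;                  // max 10 requests waiting
        options.QueueProcessingOrder = QueueProcessingOrder.OldestFirst; // FIFO
    });
});

var app = builder.Build();
app.UseRateLimiter();

// The policy applies only to endpoints that opt in.
app.MapGet("/orders", () => "ok").RequireRateLimiting("queued");

app.Run();
```

Requests beyond the 100-per-minute limit wait in the queue (up to 10); anything past the queue limit is rejected immediately.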

Seven Strategies Overview

Quick Comparison Matrix

| Strategy | Complexity | Burst Handling | Memory | Best For |
|---|---|---|---|---|
| Fixed Window | ⭐ Low | ❌ Poor | ⭐ Low | Internal APIs |
| Sliding Window | ⭐⭐ Medium | ✅ Good | ⭐⭐ Medium | Production APIs |
| Token Bucket | ⭐⭐⭐ High | ✅ Excellent | ⭐⭐ Medium | Bursty traffic |
| Concurrency | ⭐ Low | N/A | ⭐ Low | Long operations |
| Per-User | ⭐⭐ Medium | Varies | ⭐⭐⭐ High | Multi-tenant |
| Chained | ⭐⭐⭐ High | ✅ Strict | ⭐⭐⭐ High | Critical resources |
| Tiered | ⭐⭐ Medium | Varies | ⭐⭐ Medium | SaaS/Monetization |

1. Fixed Window

Concept: Fixed time buckets with counters

[00:00-00:59] → 10 requests allowed
[01:00-01:59] → Counter resets, 10 more allowed

Pros: Simple, low memory

Cons: Boundary burst problem
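
The same behavior is also available as a standalone limiter class in System.Threading.RateLimiting, which is handy outside the HTTP pipeline. A minimal sketch (limits illustrative):

```csharp
using System.Threading.RateLimiting;

var limiter = new FixedWindowRateLimiter(new FixedWindowRateLimiterOptions
{
    PermitLimit = 10,                 // 10 requests per window
    Window = TimeSpan.FromMinutes(1), // counter resets on a fixed schedule
    AutoReplenishment = true,
    QueueLimit = 0                    // reject instead of queueing
});

using RateLimitLease lease = limiter.AttemptAcquire();
if (!lease.IsAcquired)
{
    // Over the limit for this window; the caller should back off.
}
```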

Read More: Fixed Window Strategy

2. Sliding Window

Concept: Weighted sliding time window

Uses previous window data to smooth traffic
No boundary burst problem

Pros: Smooth, production-ready

Cons: More complex, higher memory
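
A minimal registration sketch with the built-in middleware, assuming the policy name and numbers are illustrative; SegmentsPerWindow splits the window into segments whose counts age out gradually as the window slides:

```csharp
using Microsoft.AspNetCore.RateLimiting;

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddRateLimiter(limiter =>
    limiter.AddSlidingWindowLimiter("sliding", options =>
    {
        options.PermitLimit = 100;                // 100 requests...
        options.Window = TimeSpan.FromMinutes(1); // ...per sliding minute
        options.SegmentsPerWindow = 6;            // tracked as six 10-second segments
    }));
```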

3. Token Bucket

Concept: Bucket of tokens, refills at constant rate

Start: 20 tokens
Request: -1 token
Refill: +5 tokens every 10 seconds

Pros: Excellent burst handling

Cons: Complex configuration
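
A minimal registration sketch matching the numbers above (20-token bucket, +5 tokens every 10 seconds), assuming the built-in middleware; the policy name is illustrative:

```csharp
using Microsoft.AspNetCore.RateLimiting;

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddRateLimiter(limiter =>
    limiter.AddTokenBucketLimiter("bucket", options =>
    {
        options.TokenLimit = 20;                                // bucket capacity = max burst
        options.TokensPerPeriod = 5;                            // +5 tokens...
        options.ReplenishmentPeriod = TimeSpan.FromSeconds(10); // ...every 10 seconds
        options.AutoReplenishment = true;
    }));
```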

4. Concurrency

Concept: Limits simultaneous requests

Max 5 concurrent requests
Request 6-15: Queued
Request 16+: Rejected

Pros: Protects resources

Cons: Doesn't limit rate
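
A minimal registration sketch matching the numbers above (5 in flight, 10 queued, rest rejected), assuming the built-in middleware; the policy name is illustrative:

```csharp
using System.Threading.RateLimiting;
using Microsoft.AspNetCore.RateLimiting;

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddRateLimiter(limiter =>
    limiter.AddConcurrencyLimiter("concurrent", options =>
    {
        options.PermitLimit = 5;  // at most 5 requests in flight
        options.QueueLimit = 10;  // requests 6-15 wait their turn; 16+ rejected
        options.QueueProcessingOrder = QueueProcessingOrder.OldestFirst;
    }));
```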

5. Per-User Partitioned

Concept: Independent limits per user/IP

User A: 30 req/min
User B: 30 req/min
Independent buckets

Pros: Fair allocation

Cons: Higher memory
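
A minimal sketch of per-caller partitioning with PartitionedRateLimiter, assuming the built-in middleware; the partition key logic (user name, falling back to client IP) and the limits are illustrative:

```csharp
using System.Threading.RateLimiting;

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddRateLimiter(limiter =>
{
    limiter.GlobalLimiter = PartitionedRateLimiter.Create<HttpContext, string>(ctx =>
    {
        // Each distinct key gets its own independent bucket.
        string key = ctx.User.Identity?.Name
                     ?? ctx.Connection.RemoteIpAddress?.ToString()
                     ?? "anonymous";

        return RateLimitPartition.GetFixedWindowLimiter(key, _ =>
            new FixedWindowRateLimiterOptions
            {
                PermitLimit = 30,                 // 30 req/min per user
                Window = TimeSpan.FromMinutes(1)
            });
    });
});
```

Memory grows with the number of active partitions, which is the "higher memory" trade-off noted above.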

6. Chained

Concept: Multiple limiters in sequence

Request → Concurrency Check → Rate Check → Token Check
All must pass

Pros: Comprehensive protection

Cons: Complex, restrictive
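
The sequence above can be sketched with PartitionedRateLimiter.CreateChained, which runs the limiters in order and admits a request only if every link grants a permit; the specific limits are illustrative:

```csharp
using System.Threading.RateLimiting;

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddRateLimiter(limiter =>
{
    limiter.GlobalLimiter = PartitionedRateLimiter.CreateChained(
        // Check 1: one shared concurrency cap for the whole API.
        PartitionedRateLimiter.Create<HttpContext, string>(_ =>
            RateLimitPartition.GetConcurrencyLimiter("global", _ =>
                new ConcurrencyLimiterOptions { PermitLimit = 100, QueueLimit = 0 })),
        // Check 2: a per-IP fixed-window rate cap. Both must pass.
        PartitionedRateLimiter.Create<HttpContext, string>(ctx =>
            RateLimitPartition.GetFixedWindowLimiter(
                ctx.Connection.RemoteIpAddress?.ToString() ?? "unknown", _ =>
                new FixedWindowRateLimiterOptions
                {
                    PermitLimit = 60,
                    Window = TimeSpan.FromMinutes(1)
                })));
});
```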

7. Tiered

Concept: Different limits by subscription

Free: 10 req/min
Basic: 50 req/min
Premium: 200 req/min
Enterprise: 1000 req/min

Pros: Monetization-ready

Cons: Requires user management
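
One way to sketch tiered limits is to partition per user and size each bucket by subscription level. This assumes the app stores the tier in a claim named "tier" (a hypothetical convention); the limits mirror the table above:

```csharp
using System.Threading.RateLimiting;

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddRateLimiter(limiter =>
{
    limiter.GlobalLimiter = PartitionedRateLimiter.Create<HttpContext, string>(ctx =>
    {
        string user = ctx.User.Identity?.Name ?? "anonymous";
        string tier = ctx.User.FindFirst("tier")?.Value ?? "free"; // hypothetical claim

        int permitLimit = tier switch
        {
            "enterprise" => 1000,
            "premium"    => 200,
            "basic"      => 50,
            _            => 10   // free
        };

        // Key by user so each caller gets an independent bucket sized by tier.
        return RateLimitPartition.GetFixedWindowLimiter(user, _ =>
            new FixedWindowRateLimiterOptions
            {
                PermitLimit = permitLimit,
                Window = TimeSpan.FromMinutes(1)
            });
    });
});
```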

HTTP 429 Response

Standard Response Format

HTTP/1.1 429 Too Many Requests
Content-Type: application/json
Retry-After: 60
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1736937300

{
  "error": "rate_limit_exceeded",
  "message": "API rate limit exceeded. Please retry after 60 seconds.",
  "documentation_url": "https://api.example.com/docs/rate-limits",
  "limit": 100,
  "remaining": 0,
  "reset": "2025-01-15T10:35:00Z"
}

Response Headers

Standard Headers:

| Header | Description | Example |
|---|---|---|
| Retry-After | Seconds to wait before retrying | 60 |
| X-RateLimit-Limit | Max requests allowed in the window | 100 |
| X-RateLimit-Remaining | Requests left in the window | 0 |
| X-RateLimit-Reset | Unix timestamp when the window resets | 1736937300 |
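
In ASP.NET Core's built-in middleware, the rejection response can be shaped with RejectionStatusCode and OnRejected. A minimal sketch; the JSON body is illustrative, and the X-RateLimit-* headers are a convention the middleware does not emit on its own (they would be added the same way as Retry-After below):

```csharp
using System.Threading.RateLimiting;

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddRateLimiter(limiter =>
{
    // The middleware rejects with 503 by default; 429 must be set explicitly.
    limiter.RejectionStatusCode = StatusCodes.Status429TooManyRequests;

    limiter.OnRejected = async (context, cancellationToken) =>
    {
        // Emit Retry-After when the limiter can compute it (window-based limiters).
        if (context.Lease.TryGetMetadata(MetadataName.RetryAfter, out TimeSpan retryAfter))
        {
            context.HttpContext.Response.Headers.RetryAfter =
                ((int)retryAfter.TotalSeconds).ToString();
        }

        context.HttpContext.Response.ContentType = "application/json";
        await context.HttpContext.Response.WriteAsync(
            """{"error":"rate_limit_exceeded","message":"API rate limit exceeded."}""",
            cancellationToken);
    };
});
```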