Rate Limiting

Rate limiting stops clients from sending too many requests to a server in any given timeframe. Servers cap requests and slow down traffic to prevent damage from overload or abuse. By keeping tabs on API thresholds and request quotas, they spot and block bots that act nothing like real users.

/reɪt ˈlɪmɪtɪŋ/ noun

Quick Facts

Also known as: API throttling, request throttling, bandwidth limiting
IP source: Residential IPs from a 2.5M+ pool across 195+ countries help distribute requests below detection thresholds
Detection risk: High when sending repeated requests from a single IP; low when rotating across a large residential pool
Typical use: Web scraping, data collection, price monitoring, ad verification
Price range: $0.27–$0.79/GB, down to $0.27/GB at scale

How rate limiting works

Here's what happens: as requests come in, the server counts each client's hits over a fixed period, which could be seconds, minutes, or hours. Cross the request limit? Expect an HTTP 429 Too Many Requests, and further requests get rejected until the counter resets. All of this usually happens at the edge or API gateway, with state kept in a fast store like Redis so every instance agrees on who's asking and how much they've already asked. The common algorithms trade simplicity against burst handling in different ways: token bucket (a fixed refill rate with room for short bursts), leaky bucket (a steady outflow no matter how requests arrive), and window counters (simple, but bursts can slip through at window boundaries).
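
To make the token bucket idea concrete, here is a minimal in-memory sketch. It assumes a single process and one bucket per client; the class name TokenBucket and its parameters are illustrative, and a production limiter would keep this state in a shared store such as Redis.

```python
import time


class TokenBucket:
    """Illustrative token bucket: refills at `rate` tokens/sec up to `capacity`."""

    def __init__(self, rate, capacity, now=None):
        self.rate = rate                  # tokens added per second
        self.capacity = capacity          # maximum burst size
        self.tokens = float(capacity)     # start with a full bucket
        self.updated = time.monotonic() if now is None else now

    def allow(self, now=None):
        """Return True if the request may proceed, False if it should get a 429."""
        now = time.monotonic() if now is None else now
        elapsed = max(0.0, now - self.updated)
        # Refill for the time that has passed, never beyond capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

A caller would typically keep one bucket per API key or IP and answer with HTTP 429 whenever allow() returns False. The optional now argument is only there so the behavior can be checked without waiting on a real clock.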

Rate Limiting vs. IP Banning

Rate limiting acts like a pause button based on usage quotas. When time's up, traffic resumes. It slows legit requests, but doesn't stop them for good. IP banning, on the other hand, shuts the door completely on a specific address, and it's long-term. You need to switch out that IP to get past it; pacing won't help.

Why this is different

Advantages

It keeps your APIs from drowning in traffic surges. Nail a good rule, and a flood won't overload your backend.
You make sure everyone gets their share. One heavy user can't hog all the resources on a shared endpoint.
It stops runaway request loops from eating up your compute. After fine-tuning, teams often hold burst traffic under 10% of total quota.
No more crazy p99 latency under pressure. You keep the request queue manageable, so lag doesn't spiral out of control.

Tradeoffs

Your real users might still hit these limits unexpectedly. It gets tricky during big launches or batch jobs.
Those fancy algorithms? They need more work. Think sliding window, Redis Lua scripts—it's a hassle.
In distributed systems, shared rate states get messy. Clock skew or network hiccups mean enforcement varies across nodes.
Set thresholds too tight, and developers find hacks. Or they might just skip using your API altogether.

Examples in practice

Real-world deployments of rate limiting: where it works and where alternatives win.

Twitter API Rate Limits

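Twitter locks down its v2 API with 15-minute windows. You get 1,500 tweets per 15 minutes; hit that, and it's HTTP 429 time until the window rolls over. Pull 5,000 tweets as fast as you can, and you'll hit the wall in each of the first three windows (1,500 apiece), with the last 500 landing in the fourth. Pace the same 5,000 evenly across an hour and you stay at 1,250 per window, under the cap. Your retries better have exponential backoff: start with a 1-second delay, double it each time you hit 429, and cap it at 64 seconds so you don't keep slamming into the same wall.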
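
The backoff schedule described above (1 second, doubling, capped at 64) can be sketched as follows. The send callable is a hypothetical stand-in for whatever HTTP call you make; it is assumed to return a (status, body) pair and is not part of any Twitter SDK.

```python
import time


def backoff_delays(max_retries, base=1.0, cap=64.0):
    """Delays in seconds: base, 2*base, 4*base, ... never above cap."""
    return [min(cap, base * 2 ** n) for n in range(max_retries)]


def call_with_backoff(send, max_retries=8, sleep=time.sleep):
    """Call `send` until it stops returning 429 or the retries run out.

    `send` is a hypothetical callable returning (status_code, body).
    """
    for delay in backoff_delays(max_retries):
        status, body = send()
        if status != 429:
            return status, body
        sleep(delay)  # wait longer after each consecutive 429
    return send()     # final attempt after the last delay
```

Many clients also add random jitter to each delay so that several workers don't retry in lockstep; that is left out here to keep the schedule easy to read.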

Shopify REST API Throttling

Shopify throttles REST API calls per app per store with a leaky bucket that holds 40 requests. The bucket drains at 2 requests per second, no matter how fast you throw requests at it. Bursting up to 40? Fine for short spikes like order syncs. But stay above 2 requests per second for long, and the bucket fills up and throttling kicks in.
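
A leaky bucket with these numbers (40 slots, draining 2 per second) can be modeled in a few lines. This is an illustrative simulation, not Shopify's implementation; the LeakyBucket name and the explicit now argument are assumptions made so the example runs without a real clock.

```python
class LeakyBucket:
    """Illustrative leaky bucket used as a meter: each request adds one unit,
    and the level drains at `leak_rate` units per second."""

    def __init__(self, capacity=40, leak_rate=2.0):
        self.capacity = capacity
        self.leak_rate = leak_rate
        self.level = 0.0      # how full the bucket currently is
        self.updated = 0.0    # timestamp of the last update, in seconds

    def allow(self, now):
        """Return True if a request arriving at time `now` fits in the bucket."""
        drained = (now - self.updated) * self.leak_rate
        self.level = max(0.0, self.level - drained)
        self.updated = now
        if self.level + 1 > self.capacity:
            return False      # bucket is full: the caller would see a 429
        self.level += 1
        return True
```

As a rough check on the arithmetic: at a steady 4 requests per second the bucket gains a net 2 requests each second, so an empty bucket fills in about 40 / (4 - 2) = 20 seconds, after which requests start getting throttled.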

Cloudflare DDoS Rate Rules

Cloudflare steps in with rate limits as low as 1 request per second to choke back volumetric attacks on particular URL paths. Set a rule to block any IP that goes over 100 requests per minute on something like /api/checkout, and Cloudflare stops it at the edge. Your backend won't even see the spike.
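
The 100-requests-per-minute rule above is essentially a fixed window counter keyed on IP and path. The sketch below shows that logic in isolation; it is not how Cloudflare implements its rules, and FixedWindowLimiter is a hypothetical name.

```python
from collections import defaultdict


class FixedWindowLimiter:
    """Illustrative fixed-window counter keyed on (ip, path, window number)."""

    def __init__(self, limit=100, window=60):
        self.limit = limit      # max requests per window
        self.window = window    # window length in seconds
        self.counts = defaultdict(int)

    def allow(self, ip, path, now):
        """Count the request and return False once the window's limit is exceeded."""
        window_id = int(now // self.window)
        key = (ip, path, window_id)
        self.counts[key] += 1
        return self.counts[key] <= self.limit
```

This version never expires old keys, so memory grows with traffic; a real deployment would store counters with a time-to-live. It also shows the boundary weakness of fixed windows: a client can send 100 requests at the end of one minute and 100 more at the start of the next.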

GitHub REST API Limits

Unauthenticated requests? GitHub gives you 60 an hour per IP. With auth, it bumps up to 5,000. Why? To make scraping without creds costly. Any real workload will burn through that unauthenticated limit in under a minute. Headers like X-RateLimit-Remaining and X-RateLimit-Reset let you back off before hitting zero.
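
Here is one way a client might use those two headers to decide how long to pause. The function name is illustrative, and it assumes the headers arrive as a plain dict with the exact casing shown; X-RateLimit-Reset is a UTC epoch timestamp in seconds.

```python
def seconds_until_reset(headers, now):
    """Return how long to wait before the next request, in seconds.

    `headers` is assumed to be a dict of response headers and `now`
    the current time as a UTC epoch timestamp.
    """
    remaining = int(headers.get("X-RateLimit-Remaining", 1))
    reset_at = int(headers.get("X-RateLimit-Reset", 0))
    if remaining > 0:
        return 0.0                       # quota left, no need to wait
    return max(0.0, reset_at - now)      # sleep until the window resets
```

In practice you would pass time.time() as now and sleep for the returned duration before sending the next request.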

Stripe Payment API Controls

Stripe caps at 100 read and 100 write requests per second in live mode. Hit the cap? You get an HTTP 429 with a Retry-After value in seconds. They suggest using idempotency keys on writes, so a retry doesn't charge folks twice when the rate limit bites.
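
The point of an idempotency key is that every retry of the same write carries the same key. The sketch below shows that pattern with a hypothetical send callable, assumed to return a (status, retry_after_seconds) pair; it does not use the Stripe SDK, and only the Idempotency-Key header name comes from Stripe's API.

```python
import time
import uuid


def post_with_idempotency(send, payload, max_attempts=3, sleep=time.sleep):
    """Retry a write on 429, reusing one idempotency key for all attempts.

    `send(payload, headers)` is a hypothetical callable that returns
    (status_code, retry_after_seconds).
    """
    key = str(uuid.uuid4())                # generated once, before the first try
    headers = {"Idempotency-Key": key}
    status = None
    for attempt in range(max_attempts):
        status, retry_after = send(payload, headers)
        if status != 429:
            break
        if attempt < max_attempts - 1:
            sleep(retry_after)             # honor the server's suggested wait
    return status, key
```

Generating the key outside the loop is the important design choice: a fresh key per attempt would make each retry look like a new charge.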

Google Maps Platform Quotas

Google Maps enforces both short-interval and per-day limits. Go past Geocoding's 50 requests a second, and you get OVER_QUERY_LIMIT. Daily quotas set in the Cloud Console keep batch runs from tanking your month's budget overnight. Once a daily quota is hit, requests stay blocked until it resets at midnight Pacific Time.

Common misconceptions

Common myths about rate limiting, and what is actually true.

Myth	Reality
"Rate limiting and throttling are the same thing"	Rate limiting sets a hard cap: once you hit the quota, the server rejects requests outright with a 429. Throttling slows the response rate rather than blocking it entirely, often by adding artificial delay. Some systems use both: throttle moderate overuse and hard-limit severe overuse.

Rate Limiting FAQ

Rate limiting is how servers keep request volume in check over time windows. They cap how many requests each client can send per window and throttle the excess to guard infrastructure from overload. By setting quotas, servers spot and stop automated traffic that doesn't look like regular human web browsing.