Reverse Proxy Load Balancing Tutorial: Queues and Health Check Lab

Reading info

Level: Professional
Reading time: 13 min
Tags: Reverse Proxy, Load Balancing, Health Checks, Python

Reverse Proxy Load Balancing: Queues, Health Checks, and a Reproducible Scheduler

A reverse proxy is often the entry point of a modern web architecture, handling TLS termination, request routing, and load balancing. Alternating requests across backends (Round Robin) is simple, but it degrades badly under heterogeneous workloads: when service times vary, equalizing request counts does not equalize work, and requests stuck behind a slow one on the same backend suffer Head-of-Line (HoL) blocking that inflates tail latency. In this deep-dive, we analyze load balancing with Queuing Theory, walk through the Smooth Weighted Round Robin logic in Nginx's C source, and look at how virtual nodes reduce load variance in Consistent Hashing, using MurmurHash3 as the hash function.

1. The Queuing Theory of Load Balancing

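Why does Round Robin fail? Let's model our proxy-to-backend system as an $M/M/c$ queue (Poisson arrivals at rate $\lambda$, exponential service times with rate $\mu$ per server, $c$ backend servers). Under Round Robin, the proxy dispatches requests without looking at backend state, which splits the system into $c$ separate single-server queues. Treating each one as an independent $M/M/1$ queue with arrival rate $\lambda/c$ is an approximation (strictly, taking every $c$-th arrival of a Poisson stream gives Erlang-distributed interarrival times), but it captures the key point: a request can wait behind a busy backend while another backend sits idle.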

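The sensitivity to service-time variability is made explicit by the Pollaczek-Khinchine formula for a single-server queue with general service times ($M/G/1$): the mean wait in queue is $W_q = \frac{\lambda_1 (\sigma^2 + 1/\mu^2)}{2(1-\rho)}$, where $\lambda_1$ is the arrival rate into that one queue ($\lambda/c$ under an even split), $\sigma^2$ is the variance of the service time, and $\rho = \lambda_1/\mu$ is the queue's utilization. Little's Law ($L_q = \lambda_1 W_q$) converts this wait into a mean queue length. For exponential service, $\sigma^2 = 1/\mu^2$ and the formula reduces to the $M/M/1$ result $W_q = \frac{\rho}{\mu(1-\rho)}$. When the workload mixes cheap requests with heavy ones such as a large database aggregation, $\sigma^2$ grows and so does the wait: the queue holding the heavy request blocks every subsequent request assigned to that backend, even if other backends are idle.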

In contrast, a Least-Connections algorithm reacts to backend state, approximating a single shared $M/M/c$ queue in which each request goes to the first available worker. With per-server utilization $\rho = \lambda/(c\mu)$, the probability that an arriving request has to wait, $P(W > 0)$, in an $M/M/c$ queue is given by the Erlang C formula:

$$ C(c, \lambda/\mu) = \frac{\frac{(c\rho)^c}{c!} \, \frac{1}{1-\rho}}{\sum_{k=0}^{c-1} \frac{(c\rho)^k}{k!} + \frac{(c\rho)^c}{c!} \, \frac{1}{1-\rho}} $$

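For the same total load, the shared $M/M/c$ queue gives a lower mean wait and a tighter waiting-time distribution than $c$ disjoint $M/M/1$ queues, because no server idles while a request is waiting elsewhere. This is the queueing argument for why load-aware routing generally outperforms static Round Robin when service times vary. Least-Connections only approximates the shared queue, since the proxy sees open connection counts rather than the true remaining work.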
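To put numbers on the comparison, here is a minimal Python sketch that evaluates the Erlang C formula and the resulting mean queueing delay for one shared queue versus an even split into single-server queues. The function names and the example rates (three backends, 2.4 requests per second in total, one request per second per backend) are illustrative assumptions, not values from the lab.

```python
from math import factorial


def erlang_c(c: int, offered_load: float) -> float:
    """Probability that an arrival must wait in an M/M/c queue.

    offered_load is a = lambda / mu, so rho = a / c must be below 1.
    """
    rho = offered_load / c
    if not 0 <= rho < 1:
        raise ValueError("unstable queue: need lambda < c * mu")
    tail = offered_load ** c / factorial(c) / (1 - rho)
    head = sum(offered_load ** k / factorial(k) for k in range(c))
    return tail / (head + tail)


def mean_wait_shared(c: int, lam: float, mu: float) -> float:
    """Mean queueing delay W_q for one shared M/M/c queue."""
    return erlang_c(c, lam / mu) / (c * mu - lam)


def mean_wait_split(c: int, lam: float, mu: float) -> float:
    """Mean queueing delay when traffic is split evenly over c M/M/1 queues."""
    rho = lam / (c * mu)
    return rho / (mu * (1 - rho))


# Hypothetical load: 3 backends, 80% utilization.
print("P(wait), shared queue:", round(erlang_c(3, 2.4), 3))
print("W_q shared (s):", round(mean_wait_shared(3, 2.4, 1.0), 3))
print("W_q split  (s):", round(mean_wait_split(3, 2.4, 1.0), 3))
```

At this utilization the split queues wait several times longer on average than the shared queue, and the gap widens as utilization approaches 1.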


graph TD
    Proxy["Reverse Proxy / eBPF XDP"]

    subgraph MMC["M/M/c Dynamic Queueing (Least Connections)"]
        Queue(("Global Virtual Queue"))
        B1["Backend Node 1"]
        B2["Backend Node 2"]
        B3["Backend Node 3"]

        Proxy ==>|"O(1) Dispatch"| Queue
        Queue -->|"Idle Worker Pull"| B1
        Queue -->|"Idle Worker Pull"| B2
        Queue -->|"Idle Worker Pull"| B3
    end

    style Queue fill:#e6f3ff,stroke:#0066cc

2. Source Code Analysis: Nginx Smooth Weighted Round Robin (SWRR)

When weights are introduced (e.g., node A is 3x faster than node B, or three nodes weighted 5, 1, 1), Nginx does not naively send each peer its full share in one burst (A, A, A, B, or A, A, A, A, A, B, C in the three-node case). That would cause bursty micro-saturations on the heavy node. Instead, Nginx implements the Smooth Weighted Round Robin (SWRR) algorithm, written in C in src/http/ngx_http_upstream_round_robin.c.


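// Simplified Nginx SWRR core logic (health checks reduced to one condition)
// src/http/ngx_http_upstream_round_robin.c
ngx_http_upstream_rr_peer_t *peer, *best = NULL;
ngx_int_t total = 0;  // signed: it is subtracted from the signed current_weight

for (peer = peers->peer; peer; peer = peer->next) {
    // Skip peers marked down or over their failure limit.
    // max_fails == 0 disables the limit, so it must not exclude the peer.
    if (peer->down || (peer->max_fails && peer->fails >= peer->max_fails)) {
        continue;
    }

    // Every eligible peer gains its weight on each selection round.
    peer->current_weight += peer->effective_weight;
    total += peer->effective_weight;

    // Strict '>' keeps the earliest peer in the list on ties.
    if (best == NULL || peer->current_weight > best->current_weight) {
        best = peer;
    }
}

if (best == NULL) { return NULL; }

// The winner pays back the total, so it yields to the others in later rounds.
best->current_weight -= total;
return best;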

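By accumulating effective_weight on every peer and subtracting the total weight from the chosen peer, Nginx spreads a heavy peer's picks across the cycle instead of bunching them together. With only two peers weighted 3 and 1, every possible cycle is a rotation of A, A, A, B (this loop yields A, A, B, A when A is listed first), so the effect is easier to see with three peers weighted 5, 1, 1: the picks come out a, a, b, a, c, a, a rather than a, a, a, a, a, b, c, which limits transient queue buildup on the heavy node.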
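The selection loop is easy to replay outside Nginx. The following Python sketch mirrors the same accumulate, pick-max, subtract-total steps under simplifying assumptions: weights are static, no peer ever fails, and the function name `swrr_sequence` is made up for this illustration.

```python
def swrr_sequence(weights: dict[str, int], picks: int) -> list[str]:
    """Replay smooth weighted round robin for `picks` selections."""
    current = {name: 0 for name in weights}
    total = sum(weights.values())
    order = []
    for _ in range(picks):
        best = None
        for name, weight in weights.items():
            # Every peer gains its weight each round.
            current[name] += weight
            # Strict '>' keeps the earliest peer on ties.
            if best is None or current[name] > current[best]:
                best = name
        # The winner pays back the total weight.
        current[best] -= total
        order.append(best)
    return order


print(swrr_sequence({"a": 5, "b": 1, "c": 1}, 7))
# ['a', 'a', 'b', 'a', 'c', 'a', 'a']
print(swrr_sequence({"A": 3, "B": 1}, 4))
# ['A', 'A', 'B', 'A']
```

After one full cycle every `current` value is back at zero, so the pattern repeats exactly.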

3. Consistent Hashing and Virtual Node Mathematical Distribution

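When caching statefully, requests must route to the same backend based on a key (e.g., User ID). Standard modulo hashing ($H(k) \bmod n$) breaks down when a node dies: going from $n$ to $n-1$ nodes remaps roughly $(n-1)/n$ of all keys, which approaches $100\%$ as $n$ grows and can trigger a cache stampede. Consistent Hashing instead maps nodes and keys onto a ring of hash values $[0, 2^{32}-1]$ and assigns each key to the next node clockwise, so removing a node only moves the keys that node owned (about $1/n$ of them).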
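A quick way to see how disruptive modulo placement is: count how many keys change backend when the pool shrinks by one. This is a small sketch in which consecutive integers stand in for hash values; `remapped_fraction` is a hypothetical helper, not part of the lab.

```python
def remapped_fraction(keys, n_before: int, n_after: int) -> float:
    """Share of keys whose backend index changes under modulo placement."""
    keys = list(keys)
    moved = sum(1 for k in keys if k % n_before != k % n_after)
    return moved / len(keys)


# Ten backends shrink to nine: about nine keys in ten move.
print(remapped_fraction(range(90_000), 10, 9))
```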

However, plain consistent hashing suffers from skewed load. If we place only $N$ physical nodes on the ring, the arcs between them differ widely in length, so the load variance across nodes is high. To solve this, we introduce $V$ virtual nodes per physical node. Using MurmurHash3 (a fast non-cryptographic hash with good avalanche behavior), we map $N \times V$ virtual nodes onto the ring.

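The relative standard deviation of load across nodes, $\sigma_{load}$ (standard deviation divided by mean load), falls roughly with the inverse square root of the number of virtual nodes per physical node:

$$ \sigma_{load} \approx \frac{1}{\sqrt{V}} $$

In production, Ketama-style ring hashing typically places on the order of $100$ to $256$ points per node, which by this estimate brings the spread down to roughly $6\%$ to $10\%$ of the mean; reaching a $1\%$ margin would take on the order of $V = 10^4$. Virtual nodes therefore reduce hot-spots rather than eliminate them. Maglev hashing, which Envoy also offers, pursues the same goal with a fixed-size lookup table instead of a sorted ring.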
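The $1/\sqrt{V}$ trend can be checked empirically with a small ring. The sketch below is illustrative only: it uses MD5 from the standard library as a stand-in for MurmurHash3 (which would need a third-party package), and the backend names, key names, and the `HashRing` class are all hypothetical.

```python
import bisect
import hashlib
from collections import Counter
from statistics import mean, pstdev


def h32(text: str) -> int:
    """32-bit ring position (assumption: MD5 stands in for MurmurHash3)."""
    return int.from_bytes(hashlib.md5(text.encode()).digest()[:4], "big")


class HashRing:
    def __init__(self, nodes, vnodes: int):
        # Each physical node contributes `vnodes` points on the ring.
        self._points = sorted(
            (h32(f"{node}#{i}"), node) for node in nodes for i in range(vnodes)
        )
        self._hashes = [p for p, _ in self._points]

    def owner(self, key: str) -> str:
        # First point clockwise from the key's hash, wrapping at the end.
        idx = bisect.bisect_right(self._hashes, h32(key)) % len(self._points)
        return self._points[idx][1]


def relative_spread(nodes, vnodes: int, n_keys: int = 20_000) -> float:
    """Standard deviation of per-node key counts divided by their mean."""
    ring = HashRing(nodes, vnodes)
    load = Counter(ring.owner(f"user-{k}") for k in range(n_keys))
    counts = [load[n] for n in nodes]
    return pstdev(counts) / mean(counts)


nodes = [f"backend-{i}" for i in range(8)]
for v in (1, 16, 128):
    print(v, round(relative_spread(nodes, v), 3))
```

The same ring also shows the stability property: building a second ring without one node leaves every key that belonged to the surviving nodes on the same owner.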

4. eBPF: Preemptive Health Checks at the Kernel Level

Layer 7 HTTP health checks are slow to react. If a backend is only marked down after 3 consecutive probes each time out at 5 seconds, it stays "healthy" for at least 15 seconds while it drops thousands of requests. To shorten that window, high-performance proxies can use eBPF to monitor kernel TCP metrics directly.

By tracing the tcp_drop kernel function (where the running kernel exposes it) or monitoring the TCP listen backlog queue, an eBPF daemon can detect a struggling backend with microsecond-level precision. When the backlog queue depth $Q_d \ge Q_{max}$, the eBPF program updates a BPF map. The load balancer reads this map, directly if it runs on the same host or through an agent that exports it, and drains traffic from the node right away, without waiting for further probe timeouts and ideally before an HTTP 502 Bad Gateway is returned to a client.
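On the user-space side, the decision reduces to a threshold check over whatever the map exposes. The sketch below uses a plain dictionary as a stand-in for the BPF map; the backend names, the threshold, and the fail-open branch (keep routing to everyone if every backend is over the limit) are assumptions made for illustration, not behavior described in the lab.

```python
def eligible_backends(backlog_depth: dict[str, int], q_max: int) -> list[str]:
    """Backends whose listen backlog depth is still below q_max."""
    healthy = [name for name, depth in backlog_depth.items() if depth < q_max]
    # Assumed policy: fail open rather than drain the whole pool at once.
    return healthy if healthy else list(backlog_depth)


# Hypothetical snapshot of the map: backend name -> current backlog depth.
snapshot = {"backend-1": 3, "backend-2": 128, "backend-3": 0}
print(eligible_backends(snapshot, q_max=128))
# ['backend-1', 'backend-3']
```

A production version would usually add hysteresis (a lower threshold for re-admission) so that a backend hovering around $Q_{max}$ does not flap in and out of rotation.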

5. Engineer's Perspective: Real-World Catastrophes

The Active-Passive Split-Brain via VRRP: High-availability load balancers (HAProxy/Keepalived) use VRRP for Virtual IP failover. In a 10G mesh, a 50 ms BGP reconvergence interrupted VRRP advertisements between the pair, so the backup stopped hearing the active node and promoted itself. Both proxies now held the Active role and answered for the same VIP. We witnessed violent MAC address flapping in the Arista switches, dropping 50% of packets. The fix? BFD (Bidirectional Forwarding Detection) tied to BGP, and an external Raft-based distributed lock (etcd) for VIP fencing.

FAQ

Should a proxy retry every 5xx response?

Absolutely not. Automatic retries of non-idempotent methods (POST) can result in double-billing incidents. Even for GET requests, uncontrolled retries amplify traffic: if each failed request is attempted up to $R$ times, a struggling upstream sees up to $R$ times its normal load, and the factor multiplies when several layers each retry. If an upstream service is struggling due to database locks, that extra traffic can turn a partial slowdown into a cascading failure (Retry Storm). Retry only idempotent requests, and always use exponential backoff, jitter, and a strict retry budget.
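The three safeguards fit in a few lines. The following Python sketch is one possible shape, not a reference implementation: the retryable status set (502, 503, 504), the base delay, the cap, and the budget ratio are all assumed values, and `RetryBudget` is a hypothetical class.

```python
import random
from typing import Optional

# Methods defined as idempotent by HTTP semantics; POST is deliberately absent.
IDEMPOTENT_METHODS = {"GET", "HEAD", "OPTIONS", "TRACE", "PUT", "DELETE"}


def backoff_delays(max_retries: int, base: float = 0.1, cap: float = 2.0,
                   rng: Optional[random.Random] = None) -> list:
    """Full-jitter delays: uniform in [0, min(cap, base * 2**attempt)]."""
    rng = rng or random.Random()
    return [rng.uniform(0, min(cap, base * 2 ** attempt))
            for attempt in range(max_retries)]


class RetryBudget:
    """Allow retries only while they stay within `ratio` of seen requests."""

    def __init__(self, ratio: float = 0.1):
        self.ratio = ratio
        self.requests = 0
        self.retries = 0

    def record_request(self) -> None:
        self.requests += 1

    def try_spend(self) -> bool:
        if self.retries + 1 > self.ratio * self.requests:
            return False
        self.retries += 1
        return True


def should_retry(method: str, status: int, budget: RetryBudget) -> bool:
    if method.upper() not in IDEMPOTENT_METHODS:
        return False  # never replay a request that may have side effects
    if status not in (502, 503, 504):
        return False  # assumed retryable set
    return budget.try_spend()
```

The budget is what prevents a retry storm: once retries exceed the configured share of traffic, further failures are returned to the client instead of being replayed against the struggling upstream.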

Run notes

Environment: Python 3 + Matplotlib; in-memory queue simulation

Install

cd network-fundamentals-lab
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

Run

python src/reverse_proxy_balancing.py

Input: Eight fixed arrivals and two identical backend queues
Expected output: Produces round-robin total wait of 240 ms and least-queue total wait of 180 ms.
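For readers without the lab checkout, the comparison can be sketched in a few lines. This is not the lab's script: the eight arrivals below are hypothetical (alternating 40 ms and 10 ms requests, one every 10 ms), so the totals differ from the 240 ms and 180 ms quoted above, and "least queue" is assumed to mean the backend with the fewest requests in the system at arrival time.

```python
def simulate(arrivals, n_backends, choose):
    """Total waiting time in ms for FIFO backends.

    arrivals: list of (arrival_ms, service_ms), sorted by arrival time.
    choose:   function(index, in_system_counts) -> backend index.
    """
    finishes = [[] for _ in range(n_backends)]  # finish times per backend
    total_wait = 0
    for index, (t, service) in enumerate(arrivals):
        # Requests still queued or in service on each backend at time t.
        in_system = [sum(1 for f in fs if f > t) for fs in finishes]
        b = choose(index, in_system)
        free_at = finishes[b][-1] if finishes[b] else 0
        start = max(t, free_at)
        total_wait += start - t
        finishes[b].append(start + service)
    return total_wait


def round_robin(index, in_system):
    return index % len(in_system)


def least_queue(index, in_system):
    return in_system.index(min(in_system))  # ties go to the lowest index


# Hypothetical workload: heavy and light requests alternate.
arrivals = [(0, 40), (10, 10), (20, 40), (30, 10),
            (40, 40), (50, 10), (60, 40), (70, 10)]
print("round robin total wait:", simulate(arrivals, 2, round_robin))  # 120
print("least queue total wait:", simulate(arrivals, 2, least_queue))  # 90
```

Round robin sends every heavy request to the same backend here, which is exactly the count-equal but work-unequal split described in the queuing section.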
