<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>veld vs nginx: Performance and Resource Comparison Report</title>
<style>
:root{
--rust:#C2410C; --rust-soft:#FB923C; --nginx:#059669; --nginx-soft:#6EE7B7;
--ink:#1A202C; --muted:#64748B; --line:#E2E8F0; --bg-soft:#F8FAFC;
--win:#047857; --loss:#B91C1C; --tie:#B45309;
}
*{box-sizing:border-box;}
html{-webkit-print-color-adjust:exact;print-color-adjust:exact;}
body{
font-family:"Segoe UI",-apple-system,"Helvetica Neue","PingFang SC","Microsoft YaHei",sans-serif;
color:var(--ink); margin:0; font-size:12.5px; line-height:1.55;
}
.num{font-family:"SF Mono",Consolas,"Roboto Mono",monospace; font-variant-numeric:tabular-nums;}
@page{ size:A4; margin:14mm 13mm; }
.page{ padding:0; }
section{ break-inside:avoid; margin:0 0 18px; }
h1{font-size:25px; margin:0 0 2px; letter-spacing:-.3px;}
h2{font-size:15px; margin:22px 0 10px; padding-left:9px; border-left:4px solid var(--rust); letter-spacing:.2px;}
h3{font-size:12.5px; margin:14px 0 6px; color:var(--muted); font-weight:600; text-transform:uppercase; letter-spacing:.6px;}
p{margin:6px 0;}
.muted{color:var(--muted);}
small{color:var(--muted);}
/* cover header */
.cover{ border-bottom:3px solid var(--rust); padding-bottom:14px; margin-bottom:8px; }
.cover .sub{ color:var(--muted); font-size:13px; margin-top:4px; }
.badges{ margin-top:10px; display:flex; gap:8px; flex-wrap:wrap; }
.chip{ font-size:11px; padding:3px 9px; border-radius:999px; background:var(--bg-soft); border:1px solid var(--line); color:#334155; }
.meta{ margin-top:12px; display:grid; grid-template-columns:repeat(4,1fr); gap:8px; }
.meta div{ background:var(--bg-soft); border:1px solid var(--line); border-radius:8px; padding:8px 10px; }
.meta .k{ font-size:10px; color:var(--muted); text-transform:uppercase; letter-spacing:.5px; }
.meta .v{ font-size:13px; font-weight:600; margin-top:2px; }
/* verdict */
.verdict{ background:linear-gradient(180deg,#FFF7ED,#FFFFFF); border:1px solid #FED7AA; border-radius:12px; padding:14px 16px; }
.verdict b{color:var(--rust);}
.kpis{ display:grid; grid-template-columns:repeat(4,1fr); gap:10px; margin-top:12px; }
.kpi{ text-align:center; background:#fff; border:1px solid var(--line); border-radius:10px; padding:12px 6px; }
.kpi .big{ font-size:22px; font-weight:800; color:var(--rust); }
.kpi .lab{ font-size:10.5px; color:var(--muted); margin-top:3px; line-height:1.3; }
/* legend */
.legend{ display:flex; gap:16px; align-items:center; font-size:11px; color:var(--muted); margin:4px 0 10px; }
.legend i{ display:inline-block; width:11px; height:11px; border-radius:3px; margin-right:5px; vertical-align:-1px; }
.sw-rust{ background:var(--rust); } .sw-nginx{ background:var(--nginx); }
/* bar chart */
.chart{ width:100%; }
.row{ display:grid; grid-template-columns:120px 1fr 64px; align-items:center; gap:10px; padding:5px 0; break-inside:avoid; }
.row .name{ font-size:11.5px; font-weight:600; }
.row .name small{ display:block; font-weight:400; font-size:10px; }
.bars{ display:flex; flex-direction:column; gap:4px; }
.bar{ position:relative; height:15px; background:var(--bg-soft); border-radius:4px; overflow:hidden; }
.bar .fill{ height:100%; border-radius:4px; display:flex; align-items:center; }
.bar .fill.rust{ background:linear-gradient(90deg,#EA580C,#C2410C); }
.bar .fill.nginx{ background:linear-gradient(90deg,#10B981,#059669); }
.bar .val{ position:absolute; right:7px; top:0; height:15px; line-height:15px; font-size:10px; font-weight:700; color:#0f172a; }
.delta{ text-align:right; font-weight:800; font-size:12px; }
.delta.win{ color:var(--win); } .delta.loss{ color:var(--loss); } .delta.tie{ color:var(--tie); }
/* tables */
table{ width:100%; border-collapse:collapse; font-size:11.5px; margin-top:6px; }
th,td{ padding:6px 8px; text-align:left; border-bottom:1px solid var(--line); }
th{ background:var(--bg-soft); font-size:10px; text-transform:uppercase; letter-spacing:.4px; color:var(--muted); }
td.r,th.r{ text-align:right; }
tr:last-child td{ border-bottom:none; }
.tag{ display:inline-block; font-size:10px; font-weight:700; padding:1px 7px; border-radius:999px; }
.tag.win{ background:#ECFDF5; color:var(--win); } .tag.loss{ background:#FEF2F2; color:var(--loss); } .tag.tie{ background:#FFFBEB; color:var(--tie); }
.note{ background:var(--bg-soft); border:1px solid var(--line); border-left:3px solid var(--muted); border-radius:6px; padding:9px 12px; font-size:11px; }
.fail{ border-left-color:var(--loss); background:#FEF6F6; }
footer{ margin-top:18px; padding-top:8px; border-top:1px solid var(--line); font-size:10px; color:var(--muted); display:flex; justify-content:space-between; }
.two{ display:grid; grid-template-columns:1fr 1fr; gap:18px; }
ul{ margin:6px 0; padding-left:18px; } li{ margin:3px 0; }
</style>
</head>
<body>
<div class="page">
<!-- ============ COVER ============ -->
<section class="cover">
<h1>veld <span style="color:var(--muted);font-weight:400;">vs</span> nginx</h1>
<div class="sub">High-performance HTTP server · performance and resource usage comparison report</div>
<div class="badges">
<span class="chip">Test date 2026-06-13</span>
<span class="chip">Host 172.1.3.74</span>
<span class="chip">Ubuntu 22.04.5 · 8 cores · 16 GB</span>
<span class="chip">nginx/1.18.0</span>
<span class="chip">veld 0.1.0</span>
</div>
<div class="meta">
<div><div class="k">Load generator</div><div class="v">wrk · -t4</div></div>
<div><div class="k">Method</div><div class="v">Interleaved, peak of 5 runs</div></div>
<div><div class="k">Workers</div><div class="v">4 on each side</div></div>
<div><div class="k">Kernel TCP</div><div class="v">Distribution defaults</div></div>
</div>
</section>
<!-- ============ VERDICT ============ -->
<section class="verdict">
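<p style="margin-top:0;font-size:13px;"><b>Verdict:</b> veld beats nginx in <b>6 of the 8 test scenarios</b>: small files lead by 54%–126% and medium files by 3%–11%. The only deficit is the bandwidth-bound 1.4 MB file, where peak throughput trails by 9%–16% and run-to-run results range from −16% to +3%, so the two are <b>close to parity</b>. <b>p99 latency is lower in every scenario</b>, and <b>memory usage is only about 1/4 of nginx's</b> (~6 MB vs 25.0 MB RSS).</p>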
<div class="kpis">
<div class="kpi"><div class="big">+126%</div><div class="lab">Peak small-file throughput gain<br>(index, c=100)</div></div>
<div class="kpi"><div class="big">2.2×</div><div class="lab">Small-file CPU efficiency<br>(per 1,000 requests)</div></div>
<div class="kpi"><div class="big">≈1/4</div><div class="lab">Memory usage<br>(RSS)</div></div>
<div class="kpi"><div class="big">8/8</div><div class="lab">Lower p99 latency<br>(all scenarios)</div></div>
</div>
</section>
<!-- ============ THROUGHPUT ============ -->
<section>
<h2>1 · Throughput (requests/sec, higher is better)</h2>
<div class="legend"><span><i class="sw-nginx"></i>nginx</span><span><i class="sw-rust"></i>veld</span><span>Bars are normalized to the peak of each pair; the right-hand column is veld's difference relative to nginx</span></div>
<div class="chart">
<!-- rows: name, bars(nginx,rust), delta -->
<div class="row"><div class="name">index 47B<small>concurrency 10</small></div><div class="bars">
<div class="bar"><div class="fill nginx" style="width:65%"></div><span class="val num">6,390</span></div>
<div class="bar"><div class="fill rust" style="width:100%"></div><span class="val num">9,831</span></div></div><div class="delta win">+54%</div></div>
<div class="row"><div class="name">index 47B<small>concurrency 100</small></div><div class="bars">
<div class="bar"><div class="fill nginx" style="width:44%"></div><span class="val num">11,967</span></div>
<div class="bar"><div class="fill rust" style="width:100%"></div><span class="val num">27,061</span></div></div><div class="delta win">+126%</div></div>
<div class="row"><div class="name">index 47B<small>concurrency 500</small></div><div class="bars">
<div class="bar"><div class="fill nginx" style="width:55%"></div><span class="val num">15,218</span></div>
<div class="bar"><div class="fill rust" style="width:100%"></div><span class="val num">27,502</span></div></div><div class="delta win">+81%</div></div>
<div class="row"><div class="name">1 KB<small>concurrency 100</small></div><div class="bars">
<div class="bar"><div class="fill nginx" style="width:53%"></div><span class="val num">12,589</span></div>
<div class="bar"><div class="fill rust" style="width:100%"></div><span class="val num">23,977</span></div></div><div class="delta win">+90%</div></div>
<div class="row"><div class="name">10 KB<small>concurrency 100</small></div><div class="bars">
<div class="bar"><div class="fill nginx" style="width:97%"></div><span class="val num">13,467</span></div>
<div class="bar"><div class="fill rust" style="width:100%"></div><span class="val num">13,936</span></div></div><div class="delta win">+3%</div></div>
<div class="row"><div class="name">100 KB<small>concurrency 100</small></div><div class="bars">
<div class="bar"><div class="fill nginx" style="width:90%"></div><span class="val num">8,446</span></div>
<div class="bar"><div class="fill rust" style="width:100%"></div><span class="val num">9,361</span></div></div><div class="delta win">+11%</div></div>
<div class="row"><div class="name">1.4 MB<small>concurrency 50</small></div><div class="bars">
<div class="bar"><div class="fill nginx" style="width:100%"></div><span class="val num">1,336</span></div>
<div class="bar"><div class="fill rust" style="width:84%"></div><span class="val num">1,118</span></div></div><div class="delta loss">−16%</div></div>
<div class="row"><div class="name">1.4 MB<small>concurrency 100</small></div><div class="bars">
<div class="bar"><div class="fill nginx" style="width:100%"></div><span class="val num">1,240</span></div>
<div class="bar"><div class="fill rust" style="width:91%"></div><span class="val num">1,127</span></div></div><div class="delta loss">−9%</div></div>
</div>
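<p class="muted" style="font-size:10.5px;">Small files, the most common web/API workload, lead by a wide margin thanks to a near-zero-allocation hot path built from an open-file cache, prebuilt responses, and an allocation-free HeaderMap. For the 1.4 MB file, results vary between −16% and +3% from run to run, so overall veld is close to parity with nginx.</p>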
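<p class="muted" style="font-size:10.5px;">As a cross-check, the deltas and bar widths in the chart above can be recomputed from the peak requests/sec values. The sketch below is illustrative only: the helper names are hypothetical and rounding to whole percent is an assumption about how the chart was produced.</p>
<pre class="num" style="font-size:10.5px; background:var(--bg-soft); border:1px solid var(--line); border-radius:6px; padding:9px 12px; white-space:pre-wrap;">
```python
# Peak requests/sec copied from the throughput chart: (nginx, veld).
PEAK_RPS = {
    "index 47B c10": (6390, 9831),
    "index 47B c100": (11967, 27061),
    "index 47B c500": (15218, 27502),
    "1 KB c100": (12589, 23977),
    "10 KB c100": (13467, 13936),
    "100 KB c100": (8446, 9361),
    "1.4 MB c50": (1336, 1118),
    "1.4 MB c100": (1240, 1127),
}


def relative_delta_pct(nginx_rps, veld_rps):
    """Percent difference of veld relative to nginx, rounded to a whole percent."""
    return round((veld_rps / nginx_rps - 1) * 100)


def bar_widths_pct(nginx_rps, veld_rps):
    """Bar widths normalized to the larger value of the pair (the row's peak)."""
    peak = max(nginx_rps, veld_rps)
    return round(nginx_rps / peak * 100), round(veld_rps / peak * 100)


for name, (nginx_rps, veld_rps) in PEAK_RPS.items():
    delta = relative_delta_pct(nginx_rps, veld_rps)
    widths = bar_widths_pct(nginx_rps, veld_rps)
    print(f"{name}: delta {delta:+d}%, bar widths {widths}")
```
</pre>
<p class="muted" style="font-size:10.5px;">Under these assumptions the index c=100 row gives +126% with a 44% nginx bar, and the 1.4 MB c=50 row gives −16% with an 84% veld bar, matching the chart.</p>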
</section>
<!-- ============ LATENCY ============ -->
<section>
<h2>2 · Average latency (ms, lower is better)</h2>
<div class="legend"><span>veld is lower in every row; the 1.4 MB row reports p99 rather than the average</span></div>
<table>
<thead><tr><th>Scenario</th><th class="r">nginx</th><th class="r">veld</th><th class="r">Change</th></tr></thead>
<tbody>
<tr><td>index 47B · c10</td><td class="r num">2.81</td><td class="r num">2.20</td><td class="r"><span class="tag win">−22%</span></td></tr>
<tr><td>index 47B · c100</td><td class="r num">9.76</td><td class="r num">4.72</td><td class="r"><span class="tag win">−52%</span></td></tr>
<tr><td>index 47B · c500</td><td class="r num">34.20</td><td class="r num">16.46</td><td class="r"><span class="tag win">−52%</span></td></tr>
<tr><td>1 KB · c100</td><td class="r num">9.65</td><td class="r num">4.88</td><td class="r"><span class="tag win">−49%</span></td></tr>
<tr><td>10 KB · c100</td><td class="r num">8.69</td><td class="r num">7.86</td><td class="r"><span class="tag win">−10%</span></td></tr>
<tr><td>100 KB · c100</td><td class="r num">11.90</td><td class="r num">9.72</td><td class="r"><span class="tag win">−18%</span></td></tr>
<tr><td>1.4 MB · c100 (p99)</td><td class="r num">290.9</td><td class="r num">229.8</td><td class="r"><span class="tag win">−21%</span></td></tr>
</tbody>
</table>
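<p class="muted" style="font-size:10.5px;">Even in the 1.4 MB scenario, where throughput is slightly lower, veld's p99 latency is still lower (229.8 ms vs 290.9 ms), which suggests that the async task model yields a smoother latency tail.</p>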
</section>
<!-- ============ RESOURCES ============ -->
<section>
<h2>3 · Resource usage (same c=100 load)</h2>
<div class="two">
<div>
<h3>Resident memory, RSS (lower is better)</h3>
<div class="chart">
<div class="row" style="grid-template-columns:96px 1fr 64px;"><div class="name">nginx</div><div class="bars"><div class="bar"><div class="fill nginx" style="width:100%"></div><span class="val num">25.0 MB</span></div></div><div class="delta"></div></div>
<div class="row" style="grid-template-columns:96px 1fr 64px;"><div class="name">veld</div><div class="bars"><div class="bar"><div class="fill rust" style="width:23%"></div><span class="val num">~6 MB</span></div></div><div class="delta win">≈1/4</div></div>
</div>
<h3 style="margin-top:14px;">CPU per request · small files (lower is better)</h3>
<div class="chart">
<div class="row" style="grid-template-columns:96px 1fr 64px;"><div class="name">nginx</div><div class="bars"><div class="bar"><div class="fill nginx" style="width:100%"></div><span class="val num">25.8</span></div></div><div class="delta"></div></div>
<div class="row" style="grid-template-columns:96px 1fr 64px;"><div class="name">veld</div><div class="bars"><div class="bar"><div class="fill rust" style="width:45%"></div><span class="val num">11.7</span></div></div><div class="delta win">2.2×</div></div>
</div>
</div>
<div>
<h3>Key resource metrics</h3>
<table>
<thead><tr><th>Metric</th><th class="r">nginx</th><th class="r">veld</th></tr></thead>
<tbody>
<tr><td>RSS (small-file load)</td><td class="r num">25.0 MB</td><td class="r num">~6 MB</td></tr>
<tr><td>RSS (large-file load)</td><td class="r num">25.0 MB</td><td class="r num">5.8 MB</td></tr>
<tr><td>CPU% (small files)</td><td class="r num">263%</td><td class="r num">187%</td></tr>
<tr><td>CPU% (large files)</td><td class="r num">180%</td><td class="r num">213%</td></tr>
<tr><td>CPU per 1,000 requests (small)</td><td class="r num">25.8</td><td class="r num">11.7</td></tr>
<tr><td>CPU per 1,000 requests (large)</td><td class="r num">164</td><td class="r num">217</td></tr>
<tr><td>p99 (small files)</td><td class="r num">60.1 ms</td><td class="r num">40.5 ms</td></tr>
<tr><td>Binary size</td><td class="r num">1.24 MB</td><td class="r num">2.34 MB</td></tr>
<tr><td>Process model</td><td class="r">Multi-process</td><td class="r">Single process, multi-threaded</td></tr>
</tbody>
</table>
</div>
</div>
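<p class="muted" style="font-size:10.5px;">Memory is about 1/4 of nginx's (5.8–6 MB vs 25.0 MB), and small-file CPU efficiency is about 2.2× better. Only pure large-file transfer is less CPU-efficient, at 217 vs 164 per 1,000 requests (about 32% more), due to the reactor wakeup overhead of zero-copy sendfile on the loopback interface. The binary is larger because veld statically links more code, including rustls.</p>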
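<p class="muted" style="font-size:10.5px;">The ratios quoted for this section follow directly from the table values. The sketch below only recomputes them; the dictionary and function names are hypothetical, and the unit of "CPU per 1,000 requests" is taken as reported, since only the ratios matter here.</p>
<pre class="num" style="font-size:10.5px; background:var(--bg-soft); border:1px solid var(--line); border-radius:6px; padding:9px 12px; white-space:pre-wrap;">
```python
# Values copied from the resource table; the veld RSS figure is approximate (~6 MB).
RSS_MB = {"nginx": 25.0, "veld": 6.0}
CPU_PER_KREQ_SMALL = {"nginx": 25.8, "veld": 11.7}
CPU_PER_KREQ_LARGE = {"nginx": 164, "veld": 217}


def ratio(numerator, denominator):
    """Plain ratio rounded to two decimals."""
    return round(numerator / denominator, 2)


# Memory: veld RSS as a fraction of nginx RSS.
memory_fraction = ratio(RSS_MB["veld"], RSS_MB["nginx"])

# Small files: how many times more CPU nginx spends per 1,000 requests.
small_file_gain = ratio(CPU_PER_KREQ_SMALL["nginx"], CPU_PER_KREQ_SMALL["veld"])

# Large files: how many times more CPU veld spends per 1,000 requests.
large_file_cost = ratio(CPU_PER_KREQ_LARGE["veld"], CPU_PER_KREQ_LARGE["nginx"])

print(memory_fraction, small_file_gain, large_file_cost)
```
</pre>
<p class="muted" style="font-size:10.5px;">With the table values this gives roughly 0.24 for memory, 2.21 for small-file CPU, and 1.32 for large-file CPU.</p>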
</section>
<!-- ============ OPTIMIZATIONS ============ -->
<section>
<h2>4 · Key optimizations</h2>
<table>
<thead><tr><th style="width:20px">#</th><th>Optimization</th><th>Original problem</th><th>Effect</th></tr></thead>
<tbody>
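<tr><td class="num">1</td><td><b>TCP_NODELAY</b></td><td>Missing, so Nagle's algorithm combined with delayed ACK stalled each exchange for ~40 ms</td><td><b>Decisive</b>: throughput ↑10–50×</td></tr>
<tr><td class="num">2</td><td><b>True zero-copy sendfile</b></td><td>The original "sendfile" read the file into user space and then wrote it out</td><td>No user-space copy for large files</td></tr>
<tr><td class="num">3</td><td><b>Open-file cache</b></td><td>Every request did open + stat + MIME guess + ETag generation</td><td>No filesystem calls on a cache hit</td></tr>
<tr><td class="num">4</td><td><b>Prebuilt responses</b></td><td>Each request made about 30 heap allocations via HeaderMap</td><td>A small file goes out whole in a single write</td></tr>
<tr><td class="num">5</td><td><b>Allocation-free HeaderMap</b></td><td>Each get made 2 heap allocations for the key, about 5 times per request</td><td><b>Small-file throughput up a further 1.5–2×</b></td></tr>
<tr><td class="num">6</td><td><b>SO_REUSEPORT multi-listener</b></td><td>Multiple tasks contended on accept for a single listener</td><td>No accept lock contention under high concurrency</td></tr>
<tr><td class="num">7</td><td><b>Fixed send buffer + coalesced header/body write</b></td><td>Loopback RTT≈0 left the send buffer too small</td><td>Fewer reactor round trips for large files</td></tr>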
</tbody>
</table>
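<p class="muted" style="font-size:10.5px;">Items 1 and 6 come down to two socket options. The sketch below shows them with Python's standard <span class="num">socket</span> module purely as an illustration; it is not veld's code, and the function names, default address, and backlog are assumptions.</p>
<pre class="num" style="font-size:10.5px; background:var(--bg-soft); border:1px solid var(--line); border-radius:6px; padding:9px 12px; white-space:pre-wrap;">
```python
import socket


def make_listener(host="127.0.0.1", port=0, backlog=1024):
    """Create one listening socket; several may share a port via SO_REUSEPORT."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    # SO_REUSEPORT is platform-specific; skip it where the constant is missing.
    if hasattr(socket, "SO_REUSEPORT"):
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    sock.bind((host, port))
    sock.listen(backlog)
    return sock


def accept_nodelay(listener):
    """Accept one connection and disable Nagle's algorithm on it."""
    conn, peer = listener.accept()
    # Without TCP_NODELAY, a small write can wait for the peer's delayed ACK.
    conn.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return conn, peer
```
</pre>
<p class="muted" style="font-size:10.5px;">In a multi-listener layout, each worker would call <span class="num">make_listener</span> with the same port and run its own accept loop, so the kernel spreads incoming connections across the listeners.</p>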
</section>
<!-- ============ EXPERIMENTS ============ -->
<section>
<h2>5 · Failed experiments (reverted)</h2>
<div class="note fail">
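<p style="margin-top:0;"><b>Both attempts to speed up large files failed, with the same conclusion:</b> large-file I/O stops scaling as soon as it is handed off from the connection's own async task.</p>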
<ul>
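<li><b>Dedicated blocking threads running a poll/sendfile loop:</b> under high concurrency, hundreds of blocking threads fight for scheduling; 1m@c100 <b>−64%</b>.</li>
<li><b>io_uring SEND_ZC + MSG_WAITALL (single reactor thread):</b> a cross-thread round trip per request plus serialization on a single ring; large files <b>−76% to −84%</b>, with latency spiking to 200–370 ms.</li>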
</ul>
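<p style="margin-bottom:0;">The right approach is to run <b>async sendfile inline in the connection task</b> (a drain loop that returns to the reactor only on EAGAIN), which is the current implementation.</p>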
</div>
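<p class="muted" style="font-size:10.5px;">The drain-loop idea can be sketched as follows. This is a minimal Python illustration on Linux, not veld's implementation: <span class="num">select</span> stands in for the async runtime's readiness wait, and the function name and parameters are hypothetical.</p>
<pre class="num" style="font-size:10.5px; background:var(--bg-soft); border:1px solid var(--line); border-radius:6px; padding:9px 12px; white-space:pre-wrap;">
```python
import os
import select


def sendfile_drain(sock, file_fd, size, wait_writable=None):
    """Send `size` bytes of file_fd over a non-blocking socket with os.sendfile.

    The loop keeps calling sendfile until the kernel reports EAGAIN
    (BlockingIOError) and only then waits for writability, which mirrors
    "return to the reactor only on EAGAIN". Returns the number of waits.
    """
    if wait_writable is None:
        def wait_writable(s):
            # Stand-in for an async runtime's readiness wait (an assumption).
            select.select([], [s], [])

    offset = 0
    waits = 0
    while offset != size:
        try:
            sent = os.sendfile(sock.fileno(), file_fd, offset, size - offset)
        except BlockingIOError:
            # Send buffer is full: yield until the socket is writable again.
            waits += 1
            wait_writable(sock)
            continue
        if sent == 0:
            raise ConnectionError("file ended before the expected size was sent")
        offset += sent
    return waits
```
</pre>
<p class="muted" style="font-size:10.5px;">The returned wait count is a rough proxy for reactor round trips: a larger send buffer lets each sendfile call move more data before EAGAIN, so fewer waits are needed per file.</p>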
</section>
<!-- ============ CONCLUSION ============ -->
<section>
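<h2>6 · Conclusion and next steps</h2>
<p><b>Goal largely met:</b> veld leads clearly on small files (54%–126%) and modestly on medium files (3%–11%), is close to parity on large files, and has lower latency in every scenario with memory at only about 1/4 of nginx's. The decisive optimizations were adding <span class="num">TCP_NODELAY</span>, true zero-copy <span class="num">sendfile</span>, the open-file cache, prebuilt responses, and the allocation-free HeaderMap.</p>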
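<p><b>The remaining gap</b> is confined to pure bandwidth transfer of the 1.4 MB file: on loopback (RTT≈0, with the kernel send buffer squeezed small), the tokio reactor's EAGAIN wakeup cost per sendfile is slightly higher than that of nginx's native event loop. <b>On a real network, TCP autotuning grows the send buffer toward the bandwidth-delay product (several MB), so this gap will most likely disappear on its own.</b> Beating nginx outright would require migrating to a thread-per-core tokio-uring model, which is a major refactor.</p>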
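<p class="muted" style="font-size:10.5px;">For reference, the bandwidth-delay product is the link bandwidth multiplied by the round-trip time. The sketch below uses hypothetical link figures that were not measured in this report; it only shows why a near-zero loopback RTT implies a tiny target buffer while a real network implies megabytes.</p>
<pre class="num" style="font-size:10.5px; background:var(--bg-soft); border:1px solid var(--line); border-radius:6px; padding:9px 12px; white-space:pre-wrap;">
```python
def bdp_bytes(bandwidth_bits_per_s, rtt_ms):
    """Bandwidth-delay product in bytes: data in flight needed to fill the link."""
    return bandwidth_bits_per_s * rtt_ms / 1000 / 8


# Hypothetical links, chosen only for illustration (not measured in this report).
wan_bdp = bdp_bytes(1_000_000_000, 20)       # 1 Gbit/s with a 20 ms RTT
loopback_bdp = bdp_bytes(1_000_000_000, 0)   # same rate with RTT of zero

print(wan_bdp, loopback_bdp)
```
</pre>
<p class="muted" style="font-size:10.5px;">Under these assumed figures the first link needs about 2.5 MB in flight, while an RTT of zero gives a product of zero, consistent with the small send buffer observed on loopback.</p>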
<p class="muted" style="font-size:10.5px;">Correctness checks: 5 files are byte-identical under cmp; response headers include Server/Date/ETag/Last-Modified; HEAD, 404, conditional 304, and Range 206 all behave correctly.</p>
</section>
<footer>
<span>veld performance report · data from 4 workers / interleaved peak-of-5</span>
<span>Generated 2026-06-13 · host 172.1.3.74</span>
</footer>
</div>
</body>
</html>