Glossary

tok/s (throughput)

Tokens generated per second after the first token, so the figure reflects decode speed and excludes the time to first token. Single-stream numbers (one user) differ a lot from batched numbers (many concurrent users): modern serving stacks like vLLM can achieve 5–10× higher aggregate throughput with continuous batching.
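
To make the single-stream versus aggregate distinction concrete, here is a minimal sketch of how both figures could be computed from per-token arrival timestamps. The function names (`decode_tok_per_s`, `aggregate_tok_per_s`) and the timestamp layout are assumptions made for illustration; they are not part of vLLM or any other serving stack.

```python
from typing import Sequence


def decode_tok_per_s(token_times: Sequence[float]) -> float:
    """Tokens per second after the first token, for one stream.

    token_times: arrival time in seconds of each generated token, in order.
    (Hypothetical input format, chosen for this example.)
    """
    if len(token_times) < 2:
        raise ValueError("need at least two tokens to measure decode speed")
    elapsed = token_times[-1] - token_times[0]
    if elapsed <= 0:
        raise ValueError("timestamps must be increasing")
    # The first token is excluded: n tokens span n - 1 intervals.
    return (len(token_times) - 1) / elapsed


def aggregate_tok_per_s(streams: Sequence[Sequence[float]]) -> float:
    """Aggregate tokens per second across concurrent streams.

    Counts every token after each stream's first one and divides by the
    shared wall-clock window from the earliest first token to the latest
    last token.
    """
    if not streams or any(len(s) < 2 for s in streams):
        raise ValueError("every stream needs at least two tokens")
    start = min(s[0] for s in streams)
    end = max(s[-1] for s in streams)
    if end <= start:
        raise ValueError("timestamps must be increasing")
    total_tokens = sum(len(s) - 1 for s in streams)
    return total_tokens / (end - start)
```

With made-up timestamps `[0.0, 0.5, 1.0, 1.5, 2.0]`, `decode_tok_per_s` returns 2.0 for one stream, while four such streams running concurrently give 8.0 from `aggregate_tok_per_s`. Each user still sees 2.0 tok/s, which is why the two numbers should not be compared directly.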
