Mixture-of-Experts (MoE)
An architecture in which each token activates only a subset of the model's parameters: a learned router picks a few "expert" feed-forward networks per token. Mixtral 8×7B, for example, has ~47B total parameters but only ~13B active per token, so it runs at roughly the compute cost of a 13B dense model while delivering quality somewhere between a 13B and a 47B dense model.
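
A minimal sketch of the idea in PyTorch, assuming a top-k router over independent feed-forward experts (the class name, sizes, and the 8-expert / top-2 configuration here are illustrative, not Mixtral's actual implementation):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoELayer(nn.Module):
    """Sparse MoE feed-forward layer with top-k routing (illustrative)."""
    def __init__(self, d_model=512, d_ff=2048, n_experts=8, top_k=2):
        super().__init__()
        # Each expert is an independent feed-forward network.
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )
        # The router scores every expert for every token.
        self.router = nn.Linear(d_model, n_experts)
        self.top_k = top_k

    def forward(self, x):                      # x: (tokens, d_model)
        scores = self.router(x)                # (tokens, n_experts)
        weights, idx = scores.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)   # normalize over the chosen experts
        out = torch.zeros_like(x)
        # Only the top-k experts run for each token; the others stay idle,
        # which is why "active" parameters are a fraction of total parameters.
        for k in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, k] == e
                if mask.any():
                    out[mask] += weights[mask, k].unsqueeze(-1) * expert(x[mask])
        return out

x = torch.randn(4, 512)                        # 4 tokens
print(MoELayer()(x).shape)                     # torch.Size([4, 512])
```

With top_k=2 out of 8 experts, each token touches only the router plus 2 of the 8 expert networks per layer, which is the source of the total-vs-active parameter gap described above.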