Definition
Top-K sampling limits the model's next-token choices to the K most-probable tokens, regardless of how much probability mass those tokens cover. With top-K = 50, the model samples from only the 50 most likely tokens at each step. It is less commonly tuned than top-p in modern APIs, but both bounding methods serve the same purpose: preventing the model from picking very-low-probability tokens that produce off-topic output.
Example
Top-K = 1 is equivalent to greedy decoding (always pick the single most-likely token); top-K = 100 gives a much wider sampling pool.
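The mechanism above can be sketched in a few lines. This is a minimal illustration, not any particular API's implementation: keep the k highest-logit tokens, renormalize with a softmax, and sample from the survivors. The function name and signature are invented for this example.

```python
import math
import random

def top_k_sample(logits, k, rng=random):
    """Sample a token index from the k highest-logit candidates.

    logits: raw model scores, one per vocabulary token.
    k: number of top candidates to keep (k=1 reduces to greedy decoding).
    """
    # Keep only the indices of the k highest-scoring tokens.
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # Softmax over the surviving logits (shifted by the max for stability).
    max_logit = max(logits[i] for i in top)
    weights = [math.exp(logits[i] - max_logit) for i in top]
    # Sample one index in proportion to its renormalized probability.
    return rng.choices(top, weights=weights, k=1)[0]
```

With k=1 only the single highest-logit token survives, so the call is deterministic and reproduces greedy decoding; larger k widens the pool without ever admitting tokens outside the top K.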
When to use
Rarely tuned in practice. Default values work for most use cases.