株式会社オブライト
AI · 2026-05-17

Top-k Sampling

Also known as: Top-k / Top-k sampling (Japanese: 上位Kサンプリング)

A decoding method that restricts next-token sampling to the k highest-probability tokens. Simple to implement but produces uneven sampling width because k is fixed regardless of the probability distribution.


Overview

Top-k sampling restricts the candidate pool to the k highest-probability tokens, then samples proportionally. k=1 is equivalent to greedy decoding. Its simplicity and low compute cost make it common in local model inference.
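The procedure above can be sketched in a few lines. This is a minimal illustrative implementation, not any particular library's API; the function name `top_k_sample` and the toy logit vector are assumptions for demonstration.

```python
import math
import random

def top_k_sample(logits, k, rng=random):
    """Sample a token index from the k highest-probability tokens.

    Illustrative sketch: real decoders apply this to a full
    vocabulary-sized logit vector at each generation step.
    """
    # Keep only the indices of the k largest logits.
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # Softmax over the surviving candidates only (renormalization).
    m = max(logits[i] for i in top)
    weights = [math.exp(logits[i] - m) for i in top]
    total = sum(weights)
    probs = [w / total for w in weights]
    # Sample proportionally to the renormalized probabilities.
    return rng.choices(top, weights=probs, k=1)[0]

logits = [2.0, 1.0, 0.5, -1.0, -3.0]
print(top_k_sample(logits, k=1))  # k=1 keeps only the argmax, i.e. greedy: 0
```

With k=1 the candidate set collapses to the single highest-probability token, which is exactly greedy decoding; larger k trades determinism for diversity.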

Comparison with Top-p

Because k is fixed, Top-k is too broad when probabilities are spread evenly and too narrow when concentrated on a few tokens. Top-p addresses this dynamically. In practice, Top-p has become the preferred default in most production APIs.
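The contrast can be made concrete by counting candidates. The sketch below is an assumption-laden illustration (helper names `softmax` and `top_p_candidates` are invented for this example): a fixed k=3 keeps three tokens for both a peaked and a flat distribution, while Top-p's nucleus shrinks or grows with the distribution's shape.

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def top_p_candidates(probs, p):
    """Smallest set of tokens whose cumulative probability reaches p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cum = [], 0.0
    for i in order:
        kept.append(i)
        cum += probs[i]
        if cum >= p:
            break
    return kept

peaked = softmax([8.0, 1.0, 1.0, 1.0, 1.0])  # mass concentrated on one token
flat   = softmax([1.0, 1.0, 1.0, 1.0, 1.0])  # mass spread evenly

# A fixed k=3 always keeps 3 candidates; Top-p (p=0.9) adapts:
print(len(top_p_candidates(peaked, 0.9)))  # 1 -- the peak alone covers 90%
print(len(top_p_candidates(flat, 0.9)))    # 5 -- all tokens needed to reach 90%
```

On the peaked distribution, a fixed k=3 would admit two near-zero-probability tokens that Top-p excludes; on the flat one, it would cut off two tokens that are just as likely as the ones it keeps.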

