Turning Up the Heat: Min-p Sampling for Creative and Coherent LLM Outputs, by Minh Nguyen and 5 other authors
Abstract: Large Language Models (LLMs) generate text by sampling the next token from a probability distribution over the vocabulary at each decoding step. However, popular sampling methods like top-p (nucleus sampling) often struggle to balance quality and diversity, especially at higher temperatures, leading to incoherent or repetitive outputs. To address this challenge, we propose min-p sampling, a dynamic truncation method that adjusts the sampling threshold based on the model's confidence by scaling according to the top token's probability. We conduct extensive experiments on benchmarks including GPQA, GSM8K, and AlpacaEval Creative Writing, demonstrating that min-p sampling improves both the quality and diversity of generated text, particularly at high temperatures. Moreover, human evaluations reveal a clear preference for min-p sampling in terms of both text quality and diversity. Min-p sampling has been adopted by multiple open-source LLM implementations, highlighting its practical utility and potential impact.
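To illustrate the truncation rule the abstract describes, here is a minimal NumPy sketch: the threshold is scaled by the top token's probability, tokens below it are discarded, and the remainder is renormalized before sampling. The function name, signature, and the parameter name `min_p` are illustrative assumptions, not the authors' reference implementation.

```python
import numpy as np

def min_p_sample(logits, min_p=0.1, temperature=1.0, rng=None):
    """Sketch of min-p sampling: drop tokens whose probability
    falls below min_p * max(prob), then sample from the rest."""
    rng = rng or np.random.default_rng()
    # Temperature-scaled softmax over the vocabulary.
    z = logits / temperature
    z = z - z.max()  # subtract max for numerical stability
    probs = np.exp(z)
    probs /= probs.sum()
    # Dynamic threshold: scaled by the top token's probability,
    # so the cutoff tightens when the model is confident and
    # relaxes when the distribution is flat.
    threshold = min_p * probs.max()
    keep = probs >= threshold
    probs = np.where(keep, probs, 0.0)
    probs /= probs.sum()
    return rng.choice(len(probs), p=probs)
```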
Submission history
From: Minh Nguyen
[v1] Mon, 1 Jul 2024 08:37:25 UTC (8,670 KB)
[v2] Sun, 13 Oct 2024 11:21:55 UTC (8,634 KB)
[v3] Sun, 16 Mar 2025 17:12:44 UTC (8,662 KB)