
==== Temperature is a parameter that affects the randomness of token selection during generation. ====
* The model produces a probability distribution over all possible next tokens.
** Example: the next-token probabilities might be:
<syntaxhighlight lang="text">
token1 = 0.7
token2 = 0.2
token3 = 0.1
</syntaxhighlight>
* Temperature changes how “peaked” or “flat” this probability distribution is:
** Low temperature (e.g., 0.1) → sharpens the distribution → the model almost always picks the highest-probability token → near-deterministic, conservative outputs.
** Temperature of 1.0 → leaves the model's distribution unchanged.
** High temperature (e.g., 1.5) → flattens the distribution → the model is more likely to pick lower-probability tokens → more varied, creative, and sometimes hallucinated outputs.

==== Important: temperature does not change weights, layers, or neuron connections. ====
* The internal computation of the network is identical; for a given input, the model still produces the same activation outputs for each neuron.
* Temperature is applied to the final logits, before they are converted into probabilities for the next token.
** Specifically: the model produces logits (raw numbers for each token) → divide logits by temperature → apply softmax → sample next token.

Mathematically:

<syntaxhighlight lang="text">
P(token_i) = exp(logit_i / temperature) / sum_j(exp(logit_j / temperature))
</syntaxhighlight>
* Lower temperature (below 1) → scaled logits become more extreme → the top token dominates
* Higher temperature (above 1) → scaled logits become less extreme → more chance for alternatives
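
The pipeline described above (logits → divide by temperature → softmax → sample) can be sketched in a few lines of Python. This is a minimal illustration, not how any particular model or API implements it. The function names <code>softmax_with_temperature</code> and <code>sample_token</code> are made up for this example, and the logits are hypothetical values chosen so that a temperature of 1.0 reproduces the 0.7 / 0.2 / 0.1 example distribution.

```python
import math
import random


def softmax_with_temperature(logits, temperature):
    """Return P(token_i) = exp(logit_i / T) / sum_j exp(logit_j / T)."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    # Subtract the max before exponentiating for numerical stability;
    # this does not change the resulting probabilities.
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_token(logits, temperature, rng=random):
    """Sample one token index from the temperature-scaled distribution."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Hypothetical logits (an assumption for illustration): at temperature 1.0
# they give back the example distribution 0.7 / 0.2 / 0.1.
logits = [math.log(0.7), math.log(0.2), math.log(0.1)]

for t in (0.1, 1.0, 1.5):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

Running the loop shows the three regimes: at 0.1 almost all probability mass sits on the first token, at 1.0 the distribution is unchanged, and at 1.5 the first token loses mass to the other two. Note that the logits themselves are never modified; only the scaled copy fed to softmax changes.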