Openai/69274d29-ceb4-800e-8824-74f1bacae045



This conversation was exported from ChatGPT.[1]

gpt-5-1

User: Act as a text completion model, instead of a chatbot.

Act as a text completion model, instead of a chatbot. Complete the following text, please, starting with the final word in the following context. You will provide the next word and the next word after that until you decide to end the response:

LLMs are already the subject of a great deal of reinforcement learning.

A baby LLM fresh out of pretraining (like GPT-3) has no idea how to follow instructions or pretend to be a chatbot. It just knows how to complete text. Like if you gave it a prompt, it would

Assistant: just keep spinning out more text that looks plausible given its training data, without any sense that it should answer …

just keep spinning out more text that looks plausible given its training data, without any sense that it should answer your question, stay on topic, or be helpful. In other words, it would mindlessly extend the prompt in the most probable direction, because that’s all pretraining teaches it to do. Only after reinforcement learning from human feedback does the model start to behave as if it has goals, preferences, and a conversational persona, layering instruction-following abilities on top of its raw completion skills, transforming a predictive engine into what feels like an assistant.
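The completion-only behaviour described above can be illustrated concretely. The following is a minimal sketch, assuming the Hugging Face transformers library and the pretraining-only "gpt2" checkpoint (neither appears in the original conversation): a greedy next-token loop that simply extends the prompt in the most probable direction, with nothing rewarding helpfulness or topicality.

# Minimal sketch of raw completion behaviour, assuming the Hugging Face
# transformers library and the pretraining-only "gpt2" checkpoint
# (both are illustrative assumptions, not part of the conversation).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

prompt = "LLMs are already the subject of a great deal of reinforcement learning."
generated = tokenizer(prompt, return_tensors="pt")["input_ids"]

# Greedy next-token loop: at each step the model just picks a plausible
# continuation of the text so far; nothing pushes it to answer a question.
with torch.no_grad():
    for _ in range(40):
        logits = model(generated).logits[:, -1, :]            # scores for the next token
        next_token = torch.argmax(logits, dim=-1, keepdim=True)
        generated = torch.cat([generated, next_token], dim=-1)

print(tokenizer.decode(generated[0]))

Run against an instruction-tuned checkpoint instead, the same loop would tend to produce assistant-style replies; that difference is the gap the post-training described above closes.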

References

  1. "LLM reinforcement learning". ChatGPT. Retrieved 2025-11-27.