Set max response tokens
Set a cut-off limit for your responses, measured in tokens. If a response is longer than this limit, it is truncated. This helps control cost and speed.
| Term | Definition |
|---|---|
| Token | Tokens can be thought of as pieces of words. During processing, the language model breaks both the input (prompt) and the output (completion) into smaller units called tokens. A token generally corresponds to ~4 characters of common English text, so 100 tokens amount to roughly 75 words. Learn more with our token guide. |
| Token limit | The maximum total number of tokens that can be used across the input (prompt) and the response (completion) when interacting with a language model. |
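To make the ~4 characters/token rule of thumb concrete, here is a minimal Python sketch. It is only a heuristic estimate, not GPT for Docs code; the function name and sample string are illustrative:

```python
# Rough estimate based on the ~4 characters/token rule above.
# This is only an approximation; a real tokenizer gives exact counts.

def approx_token_count(text: str) -> int:
    """Estimate token count assuming ~4 characters per English token."""
    return max(1, round(len(text) / 4))

sample = "Tokens can be thought of as pieces of words."
print(approx_token_count(sample))  # ~11 tokens for this 44-character string
```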
Prerequisites
1. In the GPT for Docs sidebar, click Model settings.
2. Enter a value for Max response tokens.
Rule: max response tokens + input tokens ≤ token limit.
This means that when you set Max response tokens, you must leave enough room for your input. Your input includes your prompt, your custom instructions, any context, and the elements GPT for Docs sends along with it (about 100 extra tokens). You can use OpenAI’s official tokenizer to estimate how many tokens your input uses, and therefore how much room is left for the response.
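If you prefer to estimate programmatically, here is a sketch of the budgeting rule above using OpenAI’s tiktoken tokenizer. The TOKEN_LIMIT and ADDON_OVERHEAD values and the helper name are illustrative assumptions, not values exposed by GPT for Docs; substitute your model’s actual token limit:

```python
# Sketch of the rule: max response tokens + input tokens <= token limit.
# Requires: pip install tiktoken
import tiktoken

TOKEN_LIMIT = 8192      # assumed model token limit (check your model's docs)
ADDON_OVERHEAD = 100    # approximate extra tokens GPT for Docs sends with your input

def max_response_budget(prompt: str, instructions: str = "", context: str = "") -> int:
    """Return the largest safe Max response tokens value for this input."""
    enc = tiktoken.get_encoding("cl100k_base")
    input_tokens = sum(len(enc.encode(part)) for part in (prompt, instructions, context))
    return TOKEN_LIMIT - input_tokens - ADDON_OVERHEAD

print(max_response_budget("Summarize this report in three bullet points."))
```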
You can now submit a prompt in the current document with the new maximum response size. Once a prompt is submitted, the maximum response size is saved along with all other Model settings values and is used for every prompt you run from Google Docs.
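GPT for Docs does not document its internals, but a maximum response size of this kind typically corresponds to the max tokens parameter of the underlying model API. For intuition only, here is a sketch of the equivalent cut-off in a direct OpenAI API call; the model name and prompt are placeholders, and this is not GPT for Docs code:

```python
# Illustration only: the equivalent cut-off in a direct OpenAI API call.
# GPT for Docs manages this for you; this sketch just shows the concept.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name
    messages=[{"role": "user", "content": "Summarize tokenization in two sentences."}],
    max_tokens=150,       # the response is truncated beyond this point
)
print(response.choices[0].message.content)
```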
Select other settings to customize how the language model operates.