What are OpenAI GPT tokens?
Tokens are pieces of text
Tokens are the basic unit that OpenAI GPT models use to measure the length of a text and to split it into pieces for efficient processing. They are groups of characters that sometimes align with words, but not always. Short, common English words usually fit into a single token, while longer or rarer words are split across several tokens. Punctuation marks and emojis are also represented by tokens. This is why the token count is usually different from, and larger than, the word count.
How to count tokens in my text?
To know exactly how many tokens are in your text, you need to tokenize it, which means running a tokenizer that splits the text into tokens so they can be counted. Sounds difficult? It’s actually as easy as copy/pasting. You can use our free tokenizer.
There are several token encodings (cl100k_base and o200k_base are the most popular at the moment), each with a different vocabulary size (the total number of distinct tokens). Vocabulary sizes usually range between 50k and 200k.
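For readers who prefer to count tokens in code, here is a minimal sketch. It assumes the third-party tiktoken package is installed; the helper names count_tokens and tiktoken_encoder are made up for this illustration. The encoder is passed in as a function, so the counting logic itself does not depend on any particular library.

```python
from typing import Callable, Sequence


def count_tokens(text: str, encode: Callable[[str], Sequence]) -> int:
    """Return the number of tokens that `encode` produces for `text`."""
    return len(encode(text))


def tiktoken_encoder(encoding_name: str = "cl100k_base") -> Callable[[str], Sequence]:
    """Build an encode function from tiktoken (assumed to be installed).

    The import is done lazily, so count_tokens stays usable without tiktoken.
    """
    import tiktoken  # third-party package: pip install tiktoken

    return tiktoken.get_encoding(encoding_name).encode
```

A call such as count_tokens("A scoop of happiness in every cone!", tiktoken_encoder("o200k_base")) returns the token count for that encoding. The same text can give a different count under cl100k_base, because the vocabularies differ.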
Rule of thumb for word to token ratio
Word-to-token ratios vary by language because text is split into tokens differently from one language to another.
On average (figures from this study):
- English: 1 word ≈ 1.3 tokens
- French: 1 word ≈ 2 tokens
- German: 1 word ≈ 2.1 tokens
- Spanish: 1 word ≈ 2.1 tokens
- Chinese: 1 word ≈ 2.5 tokens
- Russian: 1 word ≈ 3.3 tokens
- Vietnamese: 1 word ≈ 3.3 tokens
- Arabic: 1 word ≈ 4 tokens
- Hindi: 1 word ≈ 6.4 tokens
Refer to the study for other languages and models.
These figures are presented for estimation purposes only and are not guaranteed.
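The ratios above can be turned into a quick estimator. The sketch below copies the averages from the list. The function name estimate_tokens is hypothetical, and the result is only as reliable as the averages themselves.

```python
# Average tokens per word, copied from the list above (estimates only).
TOKENS_PER_WORD = {
    "English": 1.3,
    "French": 2.0,
    "German": 2.1,
    "Spanish": 2.1,
    "Chinese": 2.5,
    "Russian": 3.3,
    "Vietnamese": 3.3,
    "Arabic": 4.0,
    "Hindi": 6.4,
}


def estimate_tokens(word_count: int, language: str = "English") -> int:
    """Estimate a token count from a word count using the average ratio."""
    if word_count < 0:
        raise ValueError("word_count must be non-negative")
    # Raises KeyError for languages that are not in the table above.
    return round(word_count * TOKENS_PER_WORD[language])
```

For example, estimate_tokens(100, "English") gives 130. For an exact count, run a real tokenizer on the text.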
How many tokens for punctuation marks, special characters and emojis?
- Punctuation marks (,:;?!) = 1 token
- Special characters (∝√∅°¬) = 1 to 3 tokens
- Emojis (😁🙂🤩) = 2 to 3 tokens
Word and token counts examples
Write a tagline for an ice cream shop
A scoop of happiness in every cone!
⮑ 15 words · 19 tokens
Generate 3 subject lines for a cold email about Instagram growth
1. "9 Ways to Boost Your Instagram Reach & Engagement"
2. "Explode Your Instagram Following - Log in to Find Out How"
3. "Maximize Your Instagram Presence - Learn How Now"
⮑ 40 words · 58 tokens
Write a thank you email to my grandmother
Dear Grandma,
I hope this letter finds you well and happy. I wanted to thank you for the thoughtful and generous gift that you sent my way. It was so thoughtful of you and I truly appreciate it. Your gift was so lovely and will certainly be used and enjoyed. It was a perfect way to brighten up my day. Thank you for all the love and support you continue to give me. It means more to me than anything else.
With love and gratitude,
⮑ 93 words · 107 tokens
What is the price of a token?
The price of a token depends on the model. The price is per 1M tokens. Find the pricing table below:
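As a rough illustration of how per-1M-token pricing turns into a cost, here is a small sketch. The function name token_cost is hypothetical, and any price passed to it is a placeholder that should be replaced with the real rate for your model.

```python
def token_cost(token_count: int, price_per_million: float) -> float:
    """Return the cost of `token_count` tokens at a price quoted per 1M tokens."""
    if token_count < 0 or price_per_million < 0:
        raise ValueError("token_count and price_per_million must be non-negative")
    return token_count / 1_000_000 * price_per_million
```

With a placeholder price of 2.0 per 1M tokens, token_cost(500_000, 2.0) gives 1.0. Models that price prompt tokens and generated tokens differently need one call for each and a sum of the two results.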
What is the max_tokens parameter?
max_tokens is the maximum number of new tokens that can be generated in a single request to the OpenAI GPT APIs. This also applies to requests made through GPT for Sheets and Docs. It should always obey the following constraint:
prompt_tokens + max_tokens ≤ model capacity
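The constraint can be checked before sending a request. The sketch below uses hypothetical helper names, and the model capacity is a value you supply yourself because it depends on the model.

```python
def fits_model_capacity(prompt_tokens: int, max_tokens: int, model_capacity: int) -> bool:
    """Check the constraint prompt_tokens + max_tokens <= model capacity."""
    return prompt_tokens + max_tokens <= model_capacity


def largest_max_tokens(prompt_tokens: int, model_capacity: int) -> int:
    """Return the largest max_tokens value that still satisfies the constraint.

    Returns 0 when the prompt alone already fills or exceeds the capacity.
    """
    return max(model_capacity - prompt_tokens, 0)
```

For example, with an assumed capacity of 8,192 tokens and a prompt of 8,000 tokens, largest_max_tokens(8000, 8192) gives 192, so a max_tokens of 500 would break the constraint.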
What to do if the response from GPT is cut?
If your response is cut off, the max_tokens value was too small for the full answer. Increase the max_tokens parameter, while keeping prompt_tokens + max_tokens within the model capacity.
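When calling the API directly, a cut response can also be detected in code. The sketch below assumes a Chat Completions-style response already parsed into a plain dictionary, where a finish_reason of "length" means generation stopped at the token limit. The function name was_cut_by_token_limit is made up for this example.

```python
def was_cut_by_token_limit(response: dict) -> bool:
    """Return True if any choice stopped because it hit the token limit.

    Assumes the Chat Completions JSON shape:
    {"choices": [{"finish_reason": "..."}, ...]}
    """
    choices = response.get("choices", [])
    return any(choice.get("finish_reason") == "length" for choice in choices)
```

If this returns True, retrying the request with a larger max_tokens value is the usual fix.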