GPT tokenizer playground
Tokens are the basic units that generative AI models use to measure the length of a text. They are groups of characters that sometimes align with words, but not always: a long word may be split into several tokens, and punctuation marks and emojis are tokenized too. This is why the token count usually differs from the word count. Use the tool below to explore how a specific piece of text is tokenized and to see its overall word, character, and token counts.
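To make the idea concrete, here is a toy byte-pair-encoding (BPE) sketch in Python. It is not the real gpt-4o tokenizer (o200k_base works on bytes with a vocabulary of ~200k merges learned from data), but it illustrates the same principle: start from single characters and repeatedly merge the most frequent adjacent pair, so tokens become groups of characters that need not align with words.

```python
from collections import Counter

def toy_bpe(text, num_merges=10):
    """Greedy byte-pair encoding: repeatedly merge the most
    frequent adjacent token pair into a single new token.
    A simplified illustration, not the o200k_base algorithm."""
    tokens = list(text)  # start with one token per character
    for _ in range(num_merges):
        pairs = Counter(zip(tokens, tokens[1:]))
        if not pairs:
            break
        (a, b), count = pairs.most_common(1)[0]
        if count < 2:
            break  # no pair repeats; nothing left to merge
        merged, i = [], 0
        while i < len(tokens):
            if i + 1 < len(tokens) and tokens[i] == a and tokens[i + 1] == b:
                merged.append(a + b)  # fuse the pair into one token
                i += 2
            else:
                merged.append(tokens[i])
                i += 1
        tokens = merged
    return tokens

text = "low lower lowest"
tokens = toy_bpe(text)
print(tokens)
print(f"characters: {len(text)}, words: {len(text.split())}, tokens: {len(tokens)}")
```

Note how the token count lands between the character count and the word count, and how shared fragments like "low" end up as single tokens while rarer endings stay split.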
Model:
gpt-4o (o200k_base)
Text:
Tokenized text:
Text
Token IDs
Search tokens
Each row in the table below contains a token ID and its corresponding token.
0 tokens
Token IDs
Token