GPT tokenizer playground

Tokens are the basic units generative AI models use to measure the length of a text. They are groups of characters that sometimes align with words, but not always: where a token's boundaries fall depends on the characters involved, and punctuation marks or emojis may form tokens of their own. This is why the token count usually differs from the word count.
Use the tool below to explore how a specific piece of text is tokenized, along with its word, character, and token counts.
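To make the idea concrete, here is a toy greedy longest-match tokenizer over a small, made-up vocabulary. It is only an illustrative sketch: real GPT tokenizers use byte-pair encoding (BPE) with vocabularies of tens of thousands of entries, and the `VOCAB` list and `tokenize` function below are invented for this example. It shows how token, word, and character counts can all differ for the same text.

```python
# Toy vocabulary invented for illustration; real BPE vocabularies are learned
# from data and far larger.
VOCAB = ["playground", "token", "izer", " ", "!"]

def tokenize(text, vocab=VOCAB):
    # Sort candidates longest-first so the greedy match prefers longer pieces.
    candidates = sorted(vocab, key=len, reverse=True)
    tokens = []
    i = 0
    while i < len(text):
        for cand in candidates:
            if text.startswith(cand, i):
                tokens.append(cand)
                i += len(cand)
                break
        else:
            # Unknown character: fall back to a single-character token.
            tokens.append(text[i])
            i += 1
    return tokens

text = "tokenizer playground!"
tokens = tokenize(text)
print(tokens)             # ['token', 'izer', ' ', 'playground', '!']
print(len(text.split()))  # 2 words
print(len(text))          # 21 characters
print(len(tokens))        # 5 tokens
```

Note how "tokenizer" splits into two tokens while "playground" stays whole, and the punctuation mark becomes a token of its own, so a 2-word text yields 5 tokens.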
Tokenized text:
Token IDs

Each row in the table below pairs a token ID with its corresponding token.