
Tokens

Units of text (sub-word fragments) that LLMs process; 1 token ≈ 0.75 English words.

Definition

A token is the unit of text a language model processes. Tokens are typically sub-word fragments: common words map to a single token, while rare words split into several. In English, 1 token ≈ 0.75 words on average. Token counts matter because LLM context windows are measured in tokens, API pricing is per token, and very long inputs can exceed the context limit. Tools like OpenAI's tiktoken compute exact token counts for a given input.

Example

'The cat sat on the mat' = 6 tokens (each common word is a single token). 'antidisestablishmentarianism' = ~6 tokens (split into sub-word fragments).
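The 1 token ≈ 0.75 words rule gives a quick budget estimate without a tokenizer. A minimal sketch of that heuristic is below; note it is only an approximation (the sentence above actually tokenizes to 6 tokens, not 8), and the function name is illustrative. For exact counts, use a real tokenizer such as OpenAI's tiktoken.

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate via the 1 token ≈ 0.75 English words rule.

    tokens ≈ words / 0.75 (i.e. about 1.33 tokens per word).
    This is a budgeting heuristic only; exact counts depend on the
    model's tokenizer (e.g. tiktoken for OpenAI models).
    """
    words = len(text.split())
    return round(words / 0.75)

print(estimate_tokens("The cat sat on the mat"))  # 6 words -> estimate of 8 tokens
```

The heuristic overestimates for short sentences of common words (which tokenize one-to-one) and underestimates for text with many rare or long words, so treat it as a rough upper-level budget check, not a billing calculation.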

When to use

Always when budgeting context window, estimating cost, or processing very long documents.
