Periagoge

Token Counting and Cognitive Load: When to Start Fresh Conversations

Tokens are the units of text that AI systems count and charge for. Longer conversations accumulate more tokens until the system hits its limit and loses the ability to keep earlier context in view. Knowing when to start a fresh conversation helps you avoid diminishing returns: older information becomes harder for the AI to reference, and responses may become less coherent or miss important context you established many messages ago.

Why It Matters

A token is a small unit of text that an AI model processes. Roughly, one token equals three-quarters of an English word, so 100 words come to about 130 tokens, though technical terms, punctuation, and special characters shift this ratio. Token counting matters because every AI model has a context limit, the maximum number of tokens it can hold in a single conversation. When you approach that limit, the AI's performance can degrade: responses become less coherent, earlier context gets 'forgotten,' and precision drops.

For neurodivergent learners, especially those managing ADHD, understanding when you're approaching token saturation is critical because it determines whether to push forward or strategically start fresh. Pushing too hard into saturation is like working on task management when your working memory is already maxed—you'll create more problems than you solve.

Token Limits Across Common Tools

Different AI tools have different limits. Claude 3.5 Sonnet and Claude 3 Opus each handle 200,000 tokens (roughly 150,000 words). GPT-4o offers 128,000 tokens. Gemini's limit depends heavily on the variant, ranging from about 32,000 tokens in earlier models to 1 million or more in Gemini 1.5 Pro. These figures change as models are updated, so confirm the current limit for the version you use. Knowing your tool's limit matters because it determines how long you can sustain a single conversation before you hit the wall.

Not every tool shows a running token count in its chat interface, and what is displayed varies by product and version. Where a counter or usage indicator is available, use it; where it is not, you can estimate from word count. Either way, a rough number lets you make strategic decisions about when to save your work and start fresh.

Strategic Token Management for Focus States

Here's where this intersects with neurodiversity: when you're in a hyperfocus or flow state, interrupting that state to start a new conversation can shatter your momentum. But continuing in a conversation where token saturation is degrading the AI's quality is equally disruptive because you'll get progressively worse responses, which will frustrate you and fracture your focus.

The optimal strategy is preemptive planning. If you know you're entering a multi-hour hyperfocus session, estimate how many tokens you'll likely consume: multiply your expected word count by roughly 1.3, since a token is on average a bit shorter than a word and formatting and special characters add more. Count both what you type and what the AI writes back, because both occupy the context. If your tool's limit is 128,000 tokens and you estimate needing 80,000 tokens for content, you have breathing room. But if you estimate needing 110,000 tokens and your limit is 128,000, you're walking a precarious line, because one complex back-and-forth could push you over.
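The estimate above can be sketched in a few lines of Python. This is a minimal illustration, not a real tokenizer: the function names are hypothetical, and the 1.3 multiplier is only the rule of thumb from the text, so actual counts will differ by model and by content.

```python
def estimate_tokens(expected_words: int, tokens_per_word: float = 1.3) -> int:
    """Rough token estimate from a word count (rule of thumb, not a tokenizer)."""
    return round(expected_words * tokens_per_word)


def headroom(estimated_tokens: int, context_limit: int) -> int:
    """Tokens left over once the estimated session content is accounted for."""
    return context_limit - estimated_tokens


if __name__ == "__main__":
    limit = 128_000  # example limit used in the text
    # The two scenarios from the text: a comfortable session and a tight one.
    for estimate in (80_000, 110_000):
        spare = headroom(estimate, limit)
        print(f"{estimate:,} of {limit:,} tokens planned, {spare:,} spare")
```

The second scenario leaves 18,000 tokens of slack, which a single long exchange with a pasted document could plausibly consume.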

In that scenario, architect the session differently: prepare your prompt template, output format, and critical context in a document outside the AI. Work in the conversation for 1.5 hours or 70,000 tokens, then take a break, save your learnings to a document or Notion, and start a fresh conversation with a new context summary. This 'reset' actually protects your focus because you're not managing degrading AI performance.

The Cognitive Load Trade-off

Starting fresh requires a context-reloading cost—you have to re-establish your setup, remind the AI of your preferences, re-upload relevant documents. For someone with ADHD, this is cognitively expensive. However, continuing in a saturated conversation is also cognitively expensive because you're trying to interpret increasingly muddled AI responses. The trade-off is between upfront context cost and ongoing frustration cost.

Generally, if you're more than 85% through your tool's context limit, start fresh. If you're between 60% and 85%, evaluate the next 30 minutes of work: if it's high-complexity reasoning or requires detailed reference to earlier conversation, start fresh. If it's straightforward output generation, you can probably finish in the current conversation.
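The rule of thumb above can be written as a small decision function. This is a sketch under stated assumptions: the function name and return strings are hypothetical, and the "continue" result below 60% is an assumption, since the text does not say what to do in that range.

```python
def reset_advice(tokens_used: int, context_limit: int,
                 needs_earlier_context: bool) -> str:
    """Apply the 60% / 85% rule of thumb to a running token count.

    needs_earlier_context: True if the next stretch of work is
    high-complexity reasoning or leans on details from early in the chat.
    """
    share = tokens_used / context_limit
    if share > 0.85:
        return "start fresh"
    if share >= 0.60:
        # Middle band: the decision depends on the kind of work coming next.
        return "start fresh" if needs_earlier_context else "finish here"
    # Assumption: below 60% there is no pressure to reset.
    return "continue"


if __name__ == "__main__":
    print(reset_advice(110_000, 128_000, needs_earlier_context=False))
    print(reset_advice(90_000, 128_000, needs_earlier_context=True))
    print(reset_advice(90_000, 128_000, needs_earlier_context=False))
```

Keeping the thresholds as plain numbers in one place makes it easy to shift them for a tool that degrades earlier or later than expected.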

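Tools do not all degrade at the same rate. As a working rule, Claude tends to handle saturation more gracefully than GPT-4o, so you can push Claude closer to its limit (90%+) before degradation becomes noticeable, while GPT-4o degrades more sharply, so plan to reset closer to 80%. Treat these figures as starting points and adjust them to what you observe in your own sessions.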

Try this: If your tool offers a token counter, enable it (in Claude, look under Settings > Beta Features > Token Counter; the option may not be present in every version). Start a conversation and add your session contract and initial prompt. Check the token count. Now add a long document or detailed context. Check the count again. This gives you a concrete sense of how much 'space' you're using. For a typical 2-hour work session, type a few sample prompts and check the running token total. You'll develop intuition for when to reset before saturation hits.
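If no counter is available, a rough running tally can stand in for one. The sketch below is an illustration only: the SessionTally class is hypothetical, it counts whitespace-separated words and applies the 1.3 rule of thumb, and a real tokenizer would give different numbers. Add every message to the tally, including the AI's replies and any pasted documents, since all of them occupy the context.

```python
class SessionTally:
    """Running, approximate token total for one conversation."""

    def __init__(self, context_limit: int, tokens_per_word: float = 1.3):
        self.context_limit = context_limit
        self.tokens_per_word = tokens_per_word
        self.total = 0

    def add(self, text: str) -> int:
        """Add one message (yours or the AI's) and return the new total."""
        words = len(text.split())
        self.total += round(words * self.tokens_per_word)
        return self.total

    def share_used(self) -> float:
        """Fraction of the context limit consumed so far (estimate)."""
        return self.total / self.context_limit


if __name__ == "__main__":
    tally = SessionTally(context_limit=128_000)
    tally.add("Session contract: keep answers short and flag any assumptions.")
    tally.add("word " * 5_000)  # stand-in for a long pasted document
    print(f"about {tally.total:,} tokens, {tally.share_used():.1%} of the limit")
```

Comparing the tally before and after pasting a document gives the same "before and after" check as the exercise above.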
