Tokens are the unit AI systems charge by—roughly four characters per token—and every API call costs money, so optimizing what you send and receive directly impacts your margins. Counting tokens before you generate means you know exactly what each proposal, email, or workflow step will cost, letting you price work profitably instead of discovering hidden expenses after the fact.
Tokens are the currency of AI APIs. Every word you input, and every word the AI outputs, consumes tokens. For freelancers operating on thin margins, misjudging tokens can turn a profitable project into a loss.
A token isn't a word; it's a chunk of text. In English, roughly 1 token ≈ 0.75 words (about four characters), but that varies. The phrase "hello world" is 2 tokens. "don't" is 1 or 2 tokens depending on the tokenizer. Emojis and code can be more or less efficient depending on the model and tokenizer.
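As a rough illustration of those two rules of thumb, here is a minimal sketch that estimates token counts without a real tokenizer. The function names are made up for this example, and the results are only ballpark figures; an actual tokenizer will give different numbers for the same text.

```python
import math

# Two rule-of-thumb estimators. Neither is a real tokenizer; both only
# approximate what a model's tokenizer would report for English text.

def estimate_tokens_from_chars(text: str) -> int:
    """Estimate tokens assuming roughly 4 characters per token."""
    return math.ceil(len(text) / 4)

def estimate_tokens_from_words(text: str) -> int:
    """Estimate tokens assuming roughly 0.75 words per token."""
    return math.ceil(len(text.split()) / 0.75)

brief = "Please draft a proposal for a three-page marketing website redesign."
print("By characters:", estimate_tokens_from_chars(brief))
print("By words:     ", estimate_tokens_from_words(brief))
```

If the two estimates disagree by a lot (as they often do for code, URLs, or emoji-heavy text), that is a sign to count with a real tokenizer before budgeting.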
Different models have different pricing per token. GPT-4 Turbo costs roughly $0.01 per 1,000 input tokens and $0.03 per 1,000 output tokens. Claude 3.5 Sonnet costs $0.003 per 1,000 input tokens and $0.015 per 1,000 output tokens. This roughly 2-3x difference matters when you're running hundreds of requests.
Let's say you're writing client proposals at $500-1000 per proposal. Your workflow: client brief (500 tokens in) → AI generates proposal (2,000 tokens out) → you edit (manual, free). With GPT-4 Turbo at the rates above, that's ($0.005 + $0.06) = $0.065 per proposal. Across 20 proposals, that's $1.30 in API costs, which is negligible.
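The proposal arithmetic can be sketched as a small cost function. The prices are the per-1,000-token figures quoted in this article, so treat them as a snapshot rather than current pricing; the model labels and the `request_cost` name are just identifiers chosen for this example.

```python
# Prices in USD per 1,000 tokens, copied from the figures quoted in the text.
# Assumption: these are a snapshot; check each provider's pricing page.
PRICES_PER_1K = {
    "gpt-4-turbo": {"input": 0.01, "output": 0.03},
    "claude-3.5-sonnet": {"input": 0.003, "output": 0.015},
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of a single request."""
    price = PRICES_PER_1K[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1000

# One proposal: 500 tokens in, 2,000 tokens out.
per_proposal = request_cost("gpt-4-turbo", input_tokens=500, output_tokens=2000)
print(f"Per proposal: ${per_proposal:.3f}")       # Per proposal: $0.065
print(f"20 proposals: ${per_proposal * 20:.2f}")  # 20 proposals: $1.30
```

Swapping the model label shows how much the same workflow would cost on a different price tier.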
But if you're micro-tasking, such as writing individual email outreach messages, the picture changes. Each email might be 100 tokens in and 150 out on Claude 3.5 Sonnet. That's ($0.0003 + $0.00225) = $0.00255 per email. If you send 1,000 personalized emails monthly, that's $2.55. That seems small until you remember that profit margins on mass outreach might be only 5-10%, so even small per-message costs eat into them.
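To see why a tiny per-email cost can still matter, it helps to compare it with the profit each email earns. The sketch below does that; the revenue per email ($0.05) is a made-up figure for illustration only, and the margin is assumed to be measured before API costs.

```python
def api_share_of_profit(cost_per_item: float, items_per_month: int,
                        revenue_per_item: float, profit_margin: float):
    """Return (monthly API cost, fraction of monthly profit it consumes).

    profit_margin is the margin before API costs, e.g. 0.05 for 5%.
    """
    monthly_cost = cost_per_item * items_per_month
    monthly_profit = revenue_per_item * items_per_month * profit_margin
    if monthly_profit <= 0:
        raise ValueError("monthly profit must be positive")
    return monthly_cost, monthly_cost / monthly_profit

# 100 tokens in, 150 out at $0.003 / $0.015 per 1,000 tokens.
cost_per_email = (100 * 0.003 + 150 * 0.015) / 1000

# Hypothetical business figures: $0.05 revenue per email, 5% margin.
cost, share = api_share_of_profit(cost_per_email, 1000,
                                  revenue_per_item=0.05, profit_margin=0.05)
print(f"Monthly API cost: ${cost:.2f}")
print(f"Share of profit consumed: {share:.0%}")
```

With those assumed numbers the API bill is about as large as the profit itself, which is the "eating margin" problem in concrete form. At higher revenue per email the share shrinks quickly.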
OpenAI provides a token counter: tiktoken (free Python library). You can test how many tokens your prompts consume before executing them. Workflow: Write prompt → count tokens → calculate cost → decide if it's worth running.
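That count-then-decide workflow might look like the sketch below. To keep it dependency-free it uses a crude four-characters-per-token estimate by default; the `preflight` function, its parameters, and the budget threshold are all hypothetical names for this example. If you have tiktoken installed, you could pass in a real counter instead, as noted in the comment.

```python
from typing import Callable

def rough_token_count(text: str) -> int:
    """Crude estimate: about 4 characters per token (not a real tokenizer)."""
    return -(-len(text) // 4)  # ceiling division

def preflight(prompt: str, expected_output_tokens: int,
              input_price_per_1k: float, output_price_per_1k: float,
              max_cost: float,
              count_tokens: Callable[[str], int] = rough_token_count) -> dict:
    """Count tokens, estimate cost, and decide whether the call fits the budget."""
    input_tokens = count_tokens(prompt)
    cost = (input_tokens * input_price_per_1k
            + expected_output_tokens * output_price_per_1k) / 1000
    return {"input_tokens": input_tokens,
            "estimated_cost": cost,
            "run": cost <= max_cost}

# With tiktoken installed, a real counter could be swapped in, for example:
#   enc = tiktoken.get_encoding("cl100k_base")
#   count_tokens = lambda s: len(enc.encode(s))

prompt = "Summarize the attached client brief and draft a one-page proposal."
print(preflight(prompt, expected_output_tokens=2000,
                input_price_per_1k=0.01, output_price_per_1k=0.03,
                max_cost=0.10))
```

Output tokens have to be guessed in advance, so the estimate is only as good as your guess at the response length.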
The ChatGPT app itself doesn't show token counts. If you call OpenAI models through the API, you can monitor consumption in the API usage dashboard after the calls. For Claude, use Anthropic's token-counting API endpoint, or read the input and output token usage returned with each API response.
The hidden cost: system prompts and context windows. If you embed a 500-token system prompt in every request, that compounds: across a 10-step chain it adds 5,000 input tokens. When the system prompt is as large as or larger than each step's user prompt, the chain's input cost can be 2-3x what it would be without it. Sometimes a small system prompt + good user prompt beats a large system prompt + lazy user prompt.
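Here is a minimal sketch of how a repeated system prompt inflates a chain's input tokens. The 250-token user prompt per step is an assumed figure for illustration, and the model ignores any earlier output that gets fed back into later steps, which would add further input tokens.

```python
def chain_input_tokens(steps: int, user_tokens_per_step: int,
                       system_tokens: int = 0) -> int:
    """Total input tokens when the system prompt is resent on every step."""
    return steps * (user_tokens_per_step + system_tokens)

# Assumption: each step sends a 250-token user prompt.
without_system = chain_input_tokens(steps=10, user_tokens_per_step=250)
with_system = chain_input_tokens(steps=10, user_tokens_per_step=250,
                                 system_tokens=500)

print("Without system prompt:", without_system)  # 2500
print("With system prompt:   ", with_system)     # 7500
print(f"Input multiplier: {with_system / without_system:.1f}x")  # 3.0x
```

The multiplier applies to input tokens only. Output cost is unchanged unless the system prompt also changes how long the responses are.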
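Not every task needs the most expensive model. A workable heuristic: start with the cheapest model that does the job acceptably, and move up a tier only when quality demands it.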
A freelancer might use Claude 3 Haiku for initial client research (about $0.00025/1k input tokens), then Claude 3.5 Sonnet for proposal drafting ($0.003/1k input tokens), then manually review the final draft before client delivery. That tiering can cut API costs 50-70% compared to running everything on the best model, depending on how much of the work the cheaper model absorbs.
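To make the tiering effect concrete, the sketch below compares an all-Sonnet pipeline with a Haiku-then-Sonnet one. The token counts per step are invented for illustration, and the prices (including the output prices, which the text does not quote for Haiku) are assumed list prices that should be checked against current pricing.

```python
# Assumed USD prices per 1,000 tokens (input, output). Verify before relying on them.
PRICES = {
    "haiku": (0.00025, 0.00125),
    "sonnet": (0.003, 0.015),
}

def step_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """USD cost of one pipeline step on the given model."""
    in_price, out_price = PRICES[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1000

# Hypothetical workload: research reads a lot, drafting writes a lot.
research = {"input_tokens": 8000, "output_tokens": 2000}
drafting = {"input_tokens": 500, "output_tokens": 2000}

all_sonnet = step_cost("sonnet", **research) + step_cost("sonnet", **drafting)
tiered = step_cost("haiku", **research) + step_cost("sonnet", **drafting)
savings = 1 - tiered / all_sonnet

print(f"All Sonnet: ${all_sonnet:.4f}")
print(f"Tiered:     ${tiered:.4f}")
print(f"Savings:    {savings:.0%}")
```

The savings depend entirely on the mix: the more tokens the cheap research step handles relative to drafting, the bigger the cut.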
If you're running many requests that don't need an immediate answer, batch processing (available on both the Claude and OpenAI APIs) costs 50% less, but results can take up to 24 hours to come back. For bulk proposal generation or email list processing, batching is ideal. For real-time client interactions, that latency kills the deal.
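The batch-versus-real-time choice can be sketched as a simple rule. The 50% discount is the figure from the text; the function name and the example job are hypothetical.

```python
BATCH_DISCOUNT = 0.5  # 50% off, per the figure quoted in the text

def job_cost(standard_cost: float, needs_realtime: bool):
    """Return (mode, cost): batch when the job can wait, real-time otherwise."""
    if needs_realtime:
        return "realtime", standard_cost
    return "batch", standard_cost * (1 - BATCH_DISCOUNT)

# Example: 1,000 outreach emails at $0.00255 each can wait overnight.
emails_standard = 1000 * 0.00255
print(job_cost(emails_standard, needs_realtime=False))

# A live client chat cannot wait, so it pays the standard rate.
print(job_cost(0.065, needs_realtime=True))
```

In practice the question to ask for each recurring task is simply whether anyone is waiting on the answer.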
Most freelancers overprompt. A prompt padded with irrelevant instructions, example context, or safety disclaimers wastes tokens. A 300-token prompt that could be 100 tokens costs 3x as much in input tokens on every single request. Audit your templates monthly. Remove context that doesn't improve output quality.
Try this: Pick your top 3 recurring tasks (proposal, email, research). For each, write the prompt and use OpenAI's tokenizer (the web tool at platform.openai.com/tokenizer or the tiktoken Python library) to count input and output tokens. Multiply by your daily/monthly volume. That's your true API cost baseline. Now remove 20% of the words from each prompt without losing quality. Recount. That's your optimization target. If you cut 20% from prompts across 100 monthly tasks, you've trimmed the input side of your API bill by about the same share, and that budget is free for other work.
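The exercise can be run as a small calculation once you have your counts. In the sketch below, the proposal and email figures come from the earlier examples, while the research task and all monthly volumes are placeholder numbers; the prices are the Claude 3.5 Sonnet rates quoted above. Substitute your own measurements.

```python
# USD per 1,000 tokens (Claude 3.5 Sonnet rates quoted in the text).
INPUT_PRICE, OUTPUT_PRICE = 0.003, 0.015

# Placeholder task list: replace with your own measured token counts and volumes.
TASKS = [
    {"name": "proposal", "input_tokens": 500, "output_tokens": 2000, "runs_per_month": 20},
    {"name": "email", "input_tokens": 100, "output_tokens": 150, "runs_per_month": 1000},
    {"name": "research", "input_tokens": 3000, "output_tokens": 1000, "runs_per_month": 30},
]

def monthly_cost(tasks, prompt_reduction: float = 0.0) -> float:
    """Monthly USD cost; prompt_reduction trims input tokens only (0.2 = 20%)."""
    total = 0.0
    for task in tasks:
        input_tokens = task["input_tokens"] * (1 - prompt_reduction)
        per_run = (input_tokens * INPUT_PRICE
                   + task["output_tokens"] * OUTPUT_PRICE) / 1000
        total += per_run * task["runs_per_month"]
    return total

baseline = monthly_cost(TASKS)
target = monthly_cost(TASKS, prompt_reduction=0.2)
print(f"Baseline: ${baseline:.2f}/month")
print(f"Target:   ${target:.2f}/month")
print(f"Saved:    ${baseline - target:.2f}/month")
```

Note that trimming prompts only reduces the input side of the bill. If output tokens dominate your costs, as they do in this placeholder data, shortening the requested output or moving to a cheaper model will matter more.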