The token economy
A token is thola's unit of AI usage. Every reply the Planner sends, every diagnosis the Diagnostics agent runs, every file you upload — they all consume tokens from your monthly pool.
Why we meter
Honestly: because the large language models behind thola cost money to run, and flat-rate "unlimited" pricing gives you the worst version of the product. We'd rather charge you in proportion to what you ask for than throttle the model when our costs go up.
Tokens make our cost-of-goods visible. We pass it through transparently.
What each action costs
Approximate token costs by action. Real costs vary slightly based on context size; these are typical.
| Action | Tokens (typical) |
|---|---|
| Short reply ("what's our burn rate?") | 0.1 |
| Long reply with a chart | 0.3 |
| File parse (1k row CSV) | 0.5 |
| File parse (10k row CSV) | 2 |
| OCR an invoice PDF | 1 |
| OCR a handwritten cash book page | 2 |
| Red Flag run (one round across all modules) | 0.2 |
| Playbook step (per agent-action step) | 0.2 |
| Presentation deck generation (8 slides) | 2 |
| Forecast (next quarter, with chart) | 1.5 |
| Email draft | 0.2 |
| Long analytical report (multi-page) | 5 |
The pricing maps roughly to "complexity × output length × model tier". A short factual answer is cheap. A multi-paragraph analysis with a chart is more expensive.
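If you want a rough budget before the month starts, you can add up the typical costs from the table. The sketch below does that in Python. The action keys and the example counts are made up for illustration, and real costs will vary with context size.

```python
# Typical per-action costs copied from the table above (in thola tokens).
# The dictionary keys are illustrative labels, not identifiers from thola.
TYPICAL_COST = {
    "short_reply": 0.1,
    "long_reply_with_chart": 0.3,
    "csv_parse_1k_rows": 0.5,
    "csv_parse_10k_rows": 2.0,
    "ocr_invoice_pdf": 1.0,
    "ocr_handwritten_page": 2.0,
    "red_flag_run": 0.2,
    "playbook_step": 0.2,
    "deck_8_slides": 2.0,
    "forecast_next_quarter": 1.5,
    "email_draft": 0.2,
    "long_report": 5.0,
}


def estimate_monthly_tokens(planned_actions):
    """Sum typical costs for a dict of {action_key: expected_count}."""
    return sum(TYPICAL_COST[action] * count for action, count in planned_actions.items())


# Hypothetical month: these counts are assumptions, not real usage data.
plan = {
    "short_reply": 200,
    "forecast_next_quarter": 4,
    "ocr_invoice_pdf": 30,
    "long_report": 2,
}
print(f"Estimated usage: {estimate_monthly_tokens(plan):.1f} tokens")
```

Treat the result as a ballpark. The ledger under Profile → Usage remains the source of truth for what each action actually cost.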
Where to see your usage
Profile → Usage shows:
- Current month's tokens used vs. quota
- A running ledger of every action and its token cost
- Top consumers of tokens this month (which agents, which features)
- A burn-rate forecast for the rest of the month
The ledger is searchable and exportable. If a month surprises you, you'll see why in 30 seconds.
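To sanity-check the burn-rate forecast against an exported ledger, a naive linear projection is often enough. This is only a sketch: it assumes your daily average so far continues unchanged, and it is not a description of how thola computes its own forecast.

```python
import calendar
from datetime import date


def project_month_end_usage(tokens_used, today):
    """Naive linear projection: assume the daily average so far continues."""
    days_in_month = calendar.monthrange(today.year, today.month)[1]
    daily_average = tokens_used / today.day
    return daily_average * days_in_month


# Hypothetical figures: 120 tokens used by 10 June.
projected = project_month_end_usage(tokens_used=120.0, today=date(2024, 6, 10))
print(f"Projected month-end usage: {projected:.0f} tokens")
```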
When you run out
As you approach and then hit your monthly quota:
- At 80% — thola tells you in chat: "You've used 80% of this month's tokens."
- At 100% — non-essential agent actions pause. The chat still works, but with shorter replies and no fresh forecasts. Red Flags still fire. Tasks still execute. Imports still work but skip auto-analysis.
- Top-up available — you can buy a one-time top-up to continue full-fat usage without changing plan.
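The two thresholds above can be read as a simple state check. The sketch below is illustrative only: the state names are invented here, and it ignores any top-up balance, which in practice keeps you on full usage past 100% of the plan quota.

```python
def quota_state(tokens_used, monthly_quota):
    """Map plan usage to one of three illustrative states.

    "normal"  : below 80% of the quota
    "warning" : at or above 80% (thola warns you in chat)
    "paused"  : at or above 100% (non-essential agent actions pause)
    """
    if monthly_quota <= 0:
        raise ValueError("monthly_quota must be positive")
    fraction = tokens_used / monthly_quota
    if fraction >= 1.0:
        return "paused"
    if fraction >= 0.8:
        return "warning"
    return "normal"


for used in (50, 80, 100):
    print(used, quota_state(used, monthly_quota=100))
```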
Top-ups
When you need more tokens this month but don't want to upgrade your plan permanently:
- Profile → Plan → Buy top-up
- Pick a pack: 100, 500, 2000, or 10,000 tokens
- Confirm
Top-up tokens roll over for 90 days (unlike plan tokens, which expire monthly). They are used after your plan tokens — so you don't waste a top-up if you don't end up needing it.
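The draw-down order is easy to picture in code. This is a minimal sketch of "plan tokens first, then top-up tokens"; the function name and the error on insufficient balance are assumptions for illustration, and the 90-day expiry is left out.

```python
def spend(cost, plan_balance, topup_balance):
    """Deduct cost from plan tokens first, then from top-up tokens.

    Returns the new (plan_balance, topup_balance) pair.
    """
    from_plan = min(cost, plan_balance)
    from_topup = cost - from_plan
    if from_topup > topup_balance:
        raise ValueError("not enough tokens for this action")
    return plan_balance - from_plan, topup_balance - from_topup


# Plan tokens cover the whole action: the top-up is untouched.
print(spend(2.0, plan_balance=10.0, topup_balance=100.0))
# Plan tokens run out partway: the remainder comes from the top-up.
print(spend(5.0, plan_balance=3.0, topup_balance=100.0))
```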
Tokens for teams
Tokens are pooled across the entire workspace, not per-member. If your Sales lead is running 200 prompts a day and your CFO is running 5, the lead is using most of the pool — and that's usually fine.
If you want to cap usage, set limits under Settings → Workspace → Token policy:
- Per-role daily cap (e.g. Staff = 10 tokens/day)
- Per-member daily cap (e.g. Priya = 50 tokens/day)
- Workspace-wide hard cap (e.g. never exceed 1,000 tokens/day no matter what)
The hard cap is most useful for the kind of admin who wants no surprises.
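Here is one way to think about how the three caps combine. The sketch makes two assumptions that this page does not confirm: a per-role cap applies to each member holding that role, and when several caps apply the strictest one wins. The class and function names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class TokenPolicy:
    """Hypothetical model of the Token policy settings (tokens per day)."""
    role_daily_caps: Dict[str, float] = field(default_factory=dict)
    member_daily_caps: Dict[str, float] = field(default_factory=dict)
    workspace_daily_cap: Optional[float] = None


def is_allowed(policy, member, role, cost, member_used_today, workspace_used_today):
    """Return True if spending `cost` keeps usage within every applicable cap."""
    # Assumption: role and member caps both limit the individual member.
    member_caps = [
        cap
        for cap in (policy.role_daily_caps.get(role), policy.member_daily_caps.get(member))
        if cap is not None
    ]
    if any(member_used_today + cost > cap for cap in member_caps):
        return False
    # The workspace-wide hard cap applies regardless of who is asking.
    if policy.workspace_daily_cap is not None:
        if workspace_used_today + cost > policy.workspace_daily_cap:
            return False
    return True


policy = TokenPolicy(
    role_daily_caps={"Staff": 10},
    member_daily_caps={"Priya": 50},
    workspace_daily_cap=1000,
)
print(is_allowed(policy, "Priya", "Manager", cost=2, member_used_today=40, workspace_used_today=300))
```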
Free vs. paid tokens
There are three kinds of token, and the rules differ:
- Plan tokens — included in your monthly subscription. Reset on the 1st of each month. Don't roll over.
- Top-up tokens — bought separately. Roll over 90 days. Used after plan tokens.
- Promo tokens — granted by support or via a referral. Same rules as top-up.
The Usage page colour-codes the three types so you can see where the burn is coming from.
The math, transparently
Behind a token cost is roughly:

    token_cost ≈ (input_tokens × input_rate) + (output_tokens × output_rate)
               + (tool_calls × tool_rate)
               + (cache_misses × cache_penalty)

This is a pass-through with a small margin. We do not artificially inflate token costs to upsell.
If you ever want the exact computation for a specific action, expand it in the ledger — every action shows the input tokens, output tokens, model used, and tools invoked.
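To check a ledger entry by hand, the formula in this section translates directly into a few lines of Python. Every rate below is a made-up placeholder chosen for illustration: thola's real rates are not published on this page, and they vary by model tier.

```python
def token_cost(input_tokens, output_tokens, tool_calls, cache_misses,
               *, input_rate, output_rate, tool_rate, cache_penalty):
    """Approximate thola token cost of one action, following the formula above."""
    return (
        input_tokens * input_rate
        + output_tokens * output_rate
        + tool_calls * tool_rate
        + cache_misses * cache_penalty
    )


# Hypothetical rates, NOT thola's actual pricing.
EXAMPLE_RATES = {
    "input_rate": 0.00002,   # thola tokens per model input token
    "output_rate": 0.0001,   # thola tokens per model output token
    "tool_rate": 0.02,       # thola tokens per tool call
    "cache_penalty": 0.01,   # thola tokens per cache miss
}

cost = token_cost(
    input_tokens=1200, output_tokens=400, tool_calls=1, cache_misses=0,
    **EXAMPLE_RATES,
)
print(f"Approximate cost: {cost:.3f} tokens")
```

Output is priced higher than input in this example because that is a common pattern for hosted models. Whether it holds for a given action depends on the model shown in the ledger.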
Common questions
Can I disable AI features and just use thola as a dashboard? Yes. Settings → Workspace → AI has a master toggle. Forecasts, diagnostics, and chat replies pause; manual data entry and dashboard reads continue. Token consumption drops to ~0.
Why did my token consumption spike? Usually a bulk import: large CSVs and scanned documents cost real tokens to parse, OCR, and analyse. The ledger will show you which action caused the spike.
Do tokens count when the agent says "I don't know"? Yes, but minimally, because the classifier still ran. These refused replies typically cost about 0.05 tokens each. If you're seeing significant token burn from refused replies, that's a Planner-tuning issue, so let us know.