Usage Tracking
Monitor your API usage, token consumption, and credit spending in real time.
Dashboard
The easiest way to view your usage is through the dashboard. You'll see:
📊 Real-time Metrics
Live view of requests, tokens, and credits used today and this month.
📈 Usage Graphs
Visualize your usage patterns over time with interactive charts.
🏷️ Model Breakdown
See which models consume the most tokens and credits.
🔑 Per-Key Usage
Track usage by API key to identify which applications use the most.
Usage API
Query your usage programmatically for custom dashboards or alerts:
// Get usage summary for current billing period
const response = await fetch('https://api.llmhub.dev/v1/usage', {
headers: {
'Authorization': 'Bearer your-api-key'
}
});
const usage = await response.json();
console.log(usage);
// {
// "period_start": "2024-01-01T00:00:00Z",
// "period_end": "2024-01-31T23:59:59Z",
// "total_requests": 15420,
// "total_tokens": 2847500,
// "total_credits_used": 142375,
// "credits_remaining": 857625,
// "daily_breakdown": [
// { "date": "2024-01-15", "requests": 520, "tokens": 98400, "credits": 4920 },
// { "date": "2024-01-14", "requests": 480, "tokens": 91200, "credits": 4560 },
// // ...
// ]
// }

Usage by Model
Get a breakdown of usage per model to optimize your model selection:
// Get usage breakdown by model
const response = await fetch('https://api.llmhub.dev/v1/usage/models', {
headers: {
'Authorization': 'Bearer your-api-key'
}
});
const modelUsage = await response.json();
// {
// "data": [
// {
// "model": "gpt-4o",
// "requests": 5200,
// "prompt_tokens": 520000,
// "completion_tokens": 260000,
// "total_tokens": 780000,
// "credits_used": 78000
// },
// {
// "model": "gpt-4o-mini",
// "requests": 10220,
// "prompt_tokens": 1534000,
// "completion_tokens": 533500,
// "total_tokens": 2067500,
// "credits_used": 64375
// }
// ]
// }

Per-Request Usage
Every API response includes token usage for that specific request:
// Make a request and read usage from the response body
const response = await client.chat.completions.create({
model: 'gpt-4o',
messages: [{ role: 'user', content: 'Hello!' }]
});
// Access usage from the response
console.log(response.usage);
// {
// "prompt_tokens": 8,
// "completion_tokens": 12,
// "total_tokens": 20
// }
// Credits remaining is in the header
// x-credits-remaining: 857605

| Field | Description |
|---|---|
| prompt_tokens | Tokens in your input messages |
| completion_tokens | Tokens in the model's response |
| total_tokens | Sum of prompt + completion tokens |
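Because every response carries a `usage` object with these fields, you can keep a running tally in your application without calling the usage API at all. A minimal sketch, assuming the field names shown in the table above:

```javascript
// Accumulate per-request usage reported in each response's `usage` object.
class UsageTracker {
  constructor() {
    this.promptTokens = 0;
    this.completionTokens = 0;
    this.requests = 0;
  }
  record(usage) {
    this.promptTokens += usage.prompt_tokens;
    this.completionTokens += usage.completion_tokens;
    this.requests += 1;
  }
  get totalTokens() {
    return this.promptTokens + this.completionTokens;
  }
}

const tracker = new UsageTracker();
tracker.record({ prompt_tokens: 8, completion_tokens: 12, total_tokens: 20 });
tracker.record({ prompt_tokens: 40, completion_tokens: 90, total_tokens: 130 });
console.log(tracker.totalTokens); // 150
```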
Usage Alerts
Set up webhooks to receive alerts when usage reaches thresholds:
// Set up usage threshold alerts via webhooks
await fetch('https://api.llmhub.dev/v1/webhooks', {
method: 'POST',
headers: {
'Authorization': 'Bearer your-api-key',
'Content-Type': 'application/json'
},
body: JSON.stringify({
url: 'https://your-app.com/api/usage-alert',
events: ['usage.threshold', 'credits.low'],
settings: {
usage_threshold_percent: 80, // Alert at 80% usage
credits_low_threshold: 100000 // Alert when < 100K credits
}
})
});

See the Webhooks documentation for full details on available events.
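On the receiving side, your endpoint would dispatch on the event name. This is only a sketch: the payload shape (`event`, `data.credits_remaining`, `data.threshold_percent`) is an assumption for illustration, so check the Webhooks documentation for the actual schema.

```javascript
// Hypothetical handler for the alert events configured above.
// Payload field names are assumed, not confirmed by the API reference.
function handleUsageAlert(payload) {
  switch (payload.event) {
    case 'credits.low':
      return `Low credits: ${payload.data.credits_remaining} remaining`;
    case 'usage.threshold':
      return `Usage reached ${payload.data.threshold_percent}% of plan`;
    default:
      return 'ignored';
  }
}

console.log(
  handleUsageAlert({ event: 'credits.low', data: { credits_remaining: 95000 } })
);
```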
Understanding Tokens
Tokens are the unit of text that models process. In English, one token is approximately 4 characters or ¾ of a word.
Token Examples
- "Hello" = 1 token
- "artificial intelligence" = 2 tokens
- "supercalifragilisticexpialidocious" = 6 tokens
- 1,000 words ≈ 750 tokens
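The ~4-characters-per-token rule of thumb above is enough for a rough client-side estimate. Real tokenizers differ per model (as the examples show), so treat this sketch as a sanity check, not an exact count:

```javascript
// Rough token estimate using the ~4 characters per token heuristic.
// Actual tokenizer counts will differ, especially for short or unusual words.
function estimateTokens(text) {
  return Math.ceil(text.length / 4);
}

console.log(estimateTokens('Hello, world!')); // 4
```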
What Counts as Input Tokens
- System message
- User messages
- Assistant messages (conversation history)
- Function/tool definitions
- Image tokens (for vision models)
Cost Calculation
Credits are calculated from the number of tokens used and the model's per-token pricing.
Output tokens typically cost 2-4x more than input tokens because they require more compute. See model pricing for specific rates.
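The calculation itself is a simple weighted sum. The per-token rates below are made-up placeholders (output priced at 3x input, within the typical 2-4x range mentioned above); see the model pricing page for real rates:

```javascript
// Hypothetical credit rates per token -- NOT actual LLMHub pricing.
const ASSUMED_RATES = {
  'gpt-4o':      { input: 0.05, output: 0.15 },
  'gpt-4o-mini': { input: 0.01, output: 0.03 },
};

// Credits = prompt tokens * input rate + completion tokens * output rate.
function creditsFor(model, promptTokens, completionTokens) {
  const r = ASSUMED_RATES[model];
  return promptTokens * r.input + completionTokens * r.output;
}

console.log(creditsFor('gpt-4o', 1000, 500)); // 1000*0.05 + 500*0.15 = 125
```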
Export Usage Data
CSV Export
Download detailed usage logs as CSV from the dashboard for accounting or analysis.
API Export
Use the /v1/usage/export endpoint to programmatically fetch usage data in JSON or CSV format.
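A small helper can build the export request URL. The `format`, `start`, and `end` query parameters here are assumptions about the endpoint's interface; check the API reference for the exact parameter names before relying on them:

```javascript
// Build a /v1/usage/export request URL.
// Query parameter names (format, start, end) are assumed for illustration.
function buildExportUrl({ format = 'csv', start, end }) {
  const url = new URL('https://api.llmhub.dev/v1/usage/export');
  url.searchParams.set('format', format);
  if (start) url.searchParams.set('start', start);
  if (end) url.searchParams.set('end', end);
  return url.toString();
}

console.log(
  buildExportUrl({ format: 'json', start: '2024-01-01', end: '2024-01-31' })
);
```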
Optimization Tips
Monitor Model Mix
Check if you're using expensive models (GPT-4o) for tasks that cheaper models (GPT-4o-mini) could handle.
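One way to act on this tip is to rank models by average credits per request, using data shaped like the `/v1/usage/models` response. A sketch with the sample figures from that response:

```javascript
// Rank models by credits per request to spot expensive workloads.
function costPerRequest(modelUsage) {
  return modelUsage.data
    .map((m) => ({
      model: m.model,
      creditsPerRequest: m.credits_used / m.requests,
    }))
    .sort((a, b) => b.creditsPerRequest - a.creditsPerRequest);
}

// Sample values from the usage-by-model response shown earlier.
const sample = {
  data: [
    { model: 'gpt-4o', requests: 5200, credits_used: 78000 },
    { model: 'gpt-4o-mini', requests: 10220, credits_used: 64375 },
  ],
};
console.log(costPerRequest(sample));
// gpt-4o averages 15 credits/request vs roughly 6.3 for gpt-4o-mini
```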
Audit Conversation Length
Long conversation histories consume tokens. Implement summarization or truncation for extended conversations.
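A simple truncation strategy keeps system messages and drops the oldest other messages once a token budget is exceeded. The token counts here use the rough 4-characters-per-token estimate from earlier; swap in a real tokenizer for accuracy:

```javascript
// Trim conversation history to a token budget, preserving system messages
// and preferring the most recent turns. Token counts are rough estimates.
function truncateHistory(messages, maxTokens) {
  const estimate = (m) => Math.ceil(m.content.length / 4);
  const system = messages.filter((m) => m.role === 'system');
  const rest = messages.filter((m) => m.role !== 'system');

  let budget = maxTokens - system.reduce((sum, m) => sum + estimate(m), 0);
  const kept = [];
  // Walk newest-to-oldest, keeping turns while budget remains.
  for (let i = rest.length - 1; i >= 0; i--) {
    const cost = estimate(rest[i]);
    if (cost > budget) break;
    budget -= cost;
    kept.unshift(rest[i]);
  }
  return [...system, ...kept];
}

const history = [
  { role: 'system', content: 'Be brief.' },
  { role: 'user', content: 'x'.repeat(40) },
  { role: 'assistant', content: 'y'.repeat(40) },
  { role: 'user', content: 'z'.repeat(40) },
];
console.log(truncateHistory(history, 25).length); // 3: system + two newest turns
```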
Identify Waste
Look for retry loops, duplicate requests, or unused responses that inflate your token count.
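For duplicate requests specifically, caching identical in-flight calls prevents paying for the same completion twice. A sketch, where `callModel` is a placeholder for your actual API call and the cache key is a naive serialization of the request:

```javascript
// Deduplicate identical completion requests so retries and double-submits
// share a single underlying API call. `callModel` is a hypothetical stand-in
// for your real client call.
const completionCache = new Map();

function cachedCompletion(model, messages, callModel) {
  const key = JSON.stringify([model, messages]);
  if (!completionCache.has(key)) {
    // Store the promise itself so concurrent duplicates also share one call.
    completionCache.set(key, callModel(model, messages));
  }
  return completionCache.get(key);
}
```

In production you would also bound the cache size and expire entries, since identical prompts may legitimately want fresh answers later.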