Usage Tracking

Monitor your API usage, token consumption, and credit spending in real-time.

Dashboard

The easiest way to view your usage is through the dashboard. You'll see:

📊 Real-time Metrics

Live view of requests, tokens, and credits used today and this month.

📈 Usage Graphs

Visualize your usage patterns over time with interactive charts.

🏷️ Model Breakdown

See which models consume the most tokens and credits.

🔑 Per-Key Usage

Track usage by API key to identify which applications consume the most credits.

Usage API

Query your usage programmatically for custom dashboards or alerts:

TypeScript
// Get usage summary for current billing period
const response = await fetch('https://api.llmhub.dev/v1/usage', {
  headers: {
    'Authorization': 'Bearer your-api-key'
  }
});

const usage = await response.json();
console.log(usage);
// {
//   "period_start": "2024-01-01T00:00:00Z",
//   "period_end": "2024-01-31T23:59:59Z",
//   "total_requests": 15420,
//   "total_tokens": 2847500,
//   "total_credits_used": 142375,
//   "credits_remaining": 857625,
//   "daily_breakdown": [
//     { "date": "2024-01-15", "requests": 520, "tokens": 98400, "credits": 4920 },
//     { "date": "2024-01-14", "requests": 480, "tokens": 91200, "credits": 4560 },
//     // ...
//   ]
// }
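The daily_breakdown array lends itself to simple local analysis. A minimal sketch, using the field names from the sample response above, that finds the day with the heaviest credit spend:

```typescript
interface DailyUsage {
  date: string;
  requests: number;
  tokens: number;
  credits: number;
}

// Return the day that consumed the most credits, or undefined for an empty period
function busiestDay(days: DailyUsage[]): DailyUsage | undefined {
  return days.reduce<DailyUsage | undefined>(
    (max, day) => (max === undefined || day.credits > max.credits ? day : max),
    undefined
  );
}

const breakdown: DailyUsage[] = [
  { date: '2024-01-15', requests: 520, tokens: 98400, credits: 4920 },
  { date: '2024-01-14', requests: 480, tokens: 91200, credits: 4560 },
];
console.log(busiestDay(breakdown)?.date); // "2024-01-15"
```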

Usage by Model

Get a breakdown of usage per model to optimize your model selection:

TypeScript
// Get usage breakdown by model
const response = await fetch('https://api.llmhub.dev/v1/usage/models', {
  headers: {
    'Authorization': 'Bearer your-api-key'
  }
});

const modelUsage = await response.json();
// {
//   "data": [
//     {
//       "model": "gpt-4o",
//       "requests": 5200,
//       "prompt_tokens": 520000,
//       "completion_tokens": 260000,
//       "total_tokens": 780000,
//       "credits_used": 78000
//     },
//     {
//       "model": "gpt-4o-mini",
//       "requests": 10220,
//       "prompt_tokens": 1534000,
//       "completion_tokens": 533500,
//       "total_tokens": 2067500,
//       "credits_used": 64375
//     }
//   ]
// }
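One quick way to compare models from this response is credits per 1,000 tokens. A sketch using the shape of the sample above (the helper name is ours, not part of the API):

```typescript
interface ModelUsage {
  model: string;
  requests: number;
  prompt_tokens: number;
  completion_tokens: number;
  total_tokens: number;
  credits_used: number;
}

// Credits spent per 1,000 tokens: a rough cost-efficiency signal per model
function creditsPerThousandTokens(m: ModelUsage): number {
  return (m.credits_used / m.total_tokens) * 1000;
}

const gpt4o: ModelUsage = {
  model: 'gpt-4o',
  requests: 5200,
  prompt_tokens: 520000,
  completion_tokens: 260000,
  total_tokens: 780000,
  credits_used: 78000,
};
console.log(creditsPerThousandTokens(gpt4o)); // 100
```

Running this over every entry in `data` makes it easy to spot which models are worth moving traffic away from.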

Per-Request Usage

Every API response includes token usage for that specific request:

TypeScript
// Usage appears in the response body; remaining credits in a response header
const response = await client.chat.completions.create({
  model: 'gpt-4o',
  messages: [{ role: 'user', content: 'Hello!' }]
});

// Access usage from the response
console.log(response.usage);
// {
//   "prompt_tokens": 8,
//   "completion_tokens": 12,
//   "total_tokens": 20
// }

// Credits remaining is in the header
// x-credits-remaining: 857605
Field                Description
prompt_tokens        Tokens in your input messages
completion_tokens    Tokens in the model's response
total_tokens         Sum of prompt + completion tokens
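When you call the API with fetch directly, the x-credits-remaining header can be read off the Response. A small parsing sketch (the header name comes from the example above; the helper is ours):

```typescript
// Parse the remaining-credits header from a Response's headers.
// Returns undefined if the header is absent or not numeric.
function creditsRemaining(headers: Headers): number | undefined {
  const raw = headers.get('x-credits-remaining');
  if (raw === null) return undefined;
  const value = Number(raw);
  return Number.isFinite(value) ? value : undefined;
}

// Example with a constructed Headers object (available in Node 18+ and browsers)
const headers = new Headers({ 'x-credits-remaining': '857605' });
console.log(creditsRemaining(headers)); // 857605
```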

Usage Alerts

Set up webhooks to receive alerts when usage reaches thresholds:

TypeScript
// Set up usage threshold alerts via webhooks
await fetch('https://api.llmhub.dev/v1/webhooks', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer your-api-key',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    url: 'https://your-app.com/api/usage-alert',
    events: ['usage.threshold', 'credits.low'],
    settings: {
      usage_threshold_percent: 80,  // Alert at 80% usage
      credits_low_threshold: 100000  // Alert when < 100K credits
    }
  })
});

See the Webhooks documentation for full details on available events.
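On the receiving side, your endpoint needs to branch on the event type. The payload shape below is hypothetical, sketched for illustration only; the authoritative field names are in the Webhooks documentation:

```typescript
// Hypothetical event payload -- see the Webhooks documentation for the real schema
interface UsageAlertEvent {
  event: 'usage.threshold' | 'credits.low';
  usage_percent?: number;
  credits_remaining?: number;
}

// Decide how to react to an incoming alert event
function handleUsageAlert(evt: UsageAlertEvent): string {
  switch (evt.event) {
    case 'usage.threshold':
      return `Usage reached ${evt.usage_percent}% of the configured threshold`;
    case 'credits.low':
      return `Credits running low: ${evt.credits_remaining} remaining`;
  }
}

console.log(handleUsageAlert({ event: 'credits.low', credits_remaining: 95000 }));
// "Credits running low: 95000 remaining"
```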

Understanding Tokens

Tokens are the unit of text that models process. In English, one token is approximately 4 characters or ¾ of a word.

Token Examples

  • "Hello" = 1 token
  • "artificial intelligence" = 2 tokens
  • "supercalifragilisticexpialidocious" = 6 tokens
  • 1,000 words ≈ 750 tokens
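For ballpark budgeting, the 4-characters-per-token rule of thumb can be turned into a one-line estimator. It is only an approximation; accurate counts come from the model's own tokenizer, as the long-word example shows:

```typescript
// Rough token estimate from the ~4 characters per token rule of thumb.
// Real counts depend on the model's tokenizer; use this only for rough budgeting.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

console.log(estimateTokens('supercalifragilisticexpialidocious'));
// 9 (34 chars / 4, rounded up; the actual tokenizer count is 6)
```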

What Counts as Input Tokens

  • System message
  • User messages
  • Assistant messages (conversation history)
  • Function/tool definitions
  • Image tokens (for vision models)

Cost Calculation

Credits are calculated based on tokens used and the model's pricing:

credits = (prompt_tokens × input_rate) + (completion_tokens × output_rate)

Output tokens typically cost 2-4x more than input tokens because they require more compute. See model pricing for specific rates.
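The formula translates directly into code. The rates below are hypothetical, chosen only so the result matches the gpt-4o sample from the model breakdown above (with output priced at 4x input); real rates are on the model pricing page:

```typescript
// credits = (prompt_tokens × input_rate) + (completion_tokens × output_rate)
// Rates here are expressed per 1,000 tokens.
function calculateCredits(
  promptTokens: number,
  completionTokens: number,
  inputRatePer1K: number,   // hypothetical rate, see model pricing
  outputRatePer1K: number   // typically 2-4x the input rate
): number {
  return (promptTokens / 1000) * inputRatePer1K
       + (completionTokens / 1000) * outputRatePer1K;
}

// 520K prompt + 260K completion tokens at placeholder rates of 50 / 200
console.log(calculateCredits(520000, 260000, 50, 200)); // 78000
```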

Export Usage Data

CSV Export

Download detailed usage logs as CSV from the dashboard for accounting or analysis.

API Export

Use the /v1/usage/export endpoint to programmatically fetch usage data in JSON or CSV format.
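Building the export request is mostly query-string assembly. The /v1/usage/export path comes from the docs; the format and date parameter names below are assumptions for illustration:

```typescript
// Build an export request URL. The `format`, `start`, and `end` query
// parameters are hypothetical; check the API reference for the real ones.
function buildExportUrl(format: 'json' | 'csv', start: string, end: string): string {
  const url = new URL('https://api.llmhub.dev/v1/usage/export');
  url.searchParams.set('format', format);
  url.searchParams.set('start', start);
  url.searchParams.set('end', end);
  return url.toString();
}

console.log(buildExportUrl('csv', '2024-01-01', '2024-01-31'));
// https://api.llmhub.dev/v1/usage/export?format=csv&start=2024-01-01&end=2024-01-31
```

Fetch the resulting URL with your usual Authorization header to retrieve the data.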

Optimization Tips

Monitor Model Mix

Check if you're using expensive models (GPT-4o) for tasks that cheaper models (GPT-4o-mini) could handle.

Audit Conversation Length

Long conversation histories consume tokens. Implement summarization or truncation for extended conversations.
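A minimal truncation sketch: keep the system message and only the most recent turns, dropping the middle of the conversation. A token budget (e.g. from the estimator above) could drive maxTurns; here it is a fixed count for simplicity:

```typescript
interface Message {
  role: 'system' | 'user' | 'assistant';
  content: string;
}

// Keep system messages plus the last `maxTurns` non-system messages
function truncateHistory(messages: Message[], maxTurns: number): Message[] {
  const system = messages.filter(m => m.role === 'system');
  const rest = messages.filter(m => m.role !== 'system');
  return [...system, ...rest.slice(-maxTurns)];
}

const history: Message[] = [
  { role: 'system', content: 'You are helpful.' },
  { role: 'user', content: 'First question' },
  { role: 'assistant', content: 'First answer' },
  { role: 'user', content: 'Second question' },
];
console.log(truncateHistory(history, 2).length); // 3 (system + last 2 messages)
```

Summarization is the gentler alternative: replace the dropped middle with a short assistant-generated summary so context survives at a fraction of the tokens.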

Identify Waste

Look for retry loops, duplicate requests, or unused responses that inflate your token count.

Next Steps