# AI Costs & Usage

Analytics for LLM gateway, direct API, reserved capacity, and GPU/TPU infrastructure costs.
Track and analyze your organization's AI spending across four sections: LLM Gateway (LiteLLM), Direct PAYG (pay-as-you-go), Reserved/Commitment (PTU/GSU/PT), and Infrastructure (GPU/TPU).
## Supported Providers
| Provider | Services Tracked |
|---|---|
| OpenAI | GPT-4.1, GPT-4.1-mini, DALL-E, Whisper, Embeddings |
| Anthropic | Claude Opus 4.6, Sonnet 4.6, Haiku |
| Google AI | Gemini Pro, Gemini Ultra |
## Cost Metrics

### Usage Breakdown
| Metric | Description |
|---|---|
| Total Tokens | Input + output tokens consumed |
| Input Tokens | Tokens sent to the API |
| Output Tokens | Tokens returned from the API |
| API Calls | Number of API requests made |
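The metrics above can be rolled up from individual request records. A minimal sketch, assuming each record carries `input_tokens` and `output_tokens` fields (the field names are illustrative, not a fixed schema):

```python
def summarize_usage(records):
    """Return the usage-breakdown metrics for a batch of API call records."""
    input_tokens = sum(r["input_tokens"] for r in records)
    output_tokens = sum(r["output_tokens"] for r in records)
    return {
        "total_tokens": input_tokens + output_tokens,  # input + output
        "input_tokens": input_tokens,                  # tokens sent to the API
        "output_tokens": output_tokens,                # tokens returned
        "api_calls": len(records),                     # number of requests
    }

calls = [
    {"input_tokens": 1200, "output_tokens": 300},
    {"input_tokens": 800, "output_tokens": 450},
]
summary = summarize_usage(calls)
```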
### Cost Calculation
AI costs are calculated based on:
- Token usage (input and output rates differ)
- Model tier (GPT-4.1 vs GPT-4.1-mini, Claude Opus 4.6 vs Haiku)
- Special features (vision, embeddings, fine-tuning)
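Because input and output rates differ per model, the per-request cost is a weighted sum of the two token counts. A sketch of that calculation; the dollar rates below are placeholders, not the providers' actual pricing:

```python
# Illustrative rates in USD per 1M tokens -- check your provider's price
# sheet for real values; they vary by model tier and feature.
RATES = {
    "gpt-4.1":      {"input": 2.00, "output": 8.00},
    "gpt-4.1-mini": {"input": 0.40, "output": 1.60},
}

def request_cost(model, input_tokens, output_tokens):
    """Cost of one request: input and output tokens are priced separately."""
    r = RATES[model]
    return (input_tokens * r["input"] + output_tokens * r["output"]) / 1_000_000
```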
## Analytics Views

### By Model
See costs broken down by specific model:
- gpt-4.1
- gpt-4.1-mini
- claude-opus-4-6
- claude-sonnet-4-6
- gemini-pro
### By Time Period
- Hourly breakdown (last 24 hours)
- Daily breakdown (last 30 days)
- Weekly breakdown (last 12 weeks)
- Monthly breakdown (last 12 months)
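Each of these views is the same aggregation with a different time bucket. A sketch of the daily breakdown, assuming ISO-8601 timestamps on each record; swapping the `strftime` format (`%Y-%m-%d %H:00` for hourly, `%Y-%m` for monthly) yields the other views:

```python
from datetime import datetime

def daily_costs(records):
    """Bucket timestamped cost records into per-day totals."""
    buckets = {}
    for r in records:
        day = datetime.fromisoformat(r["timestamp"]).strftime("%Y-%m-%d")
        buckets[day] = buckets.get(day, 0.0) + r["cost_usd"]
    return buckets

usage = [
    {"timestamp": "2025-01-15T09:30:00", "cost_usd": 0.25},
    {"timestamp": "2025-01-15T18:05:00", "cost_usd": 0.10},
    {"timestamp": "2025-01-16T11:00:00", "cost_usd": 0.40},
]
by_day = daily_costs(usage)
```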
### By Application
If you've tagged your API calls, view costs by:
- Application name
- Environment (production, staging, development)
- Team or project
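These views work by filtering cost records on the tags attached to each API call. A minimal sketch; the tag keys (`app`, `env`, `team`) are illustrative examples, not required names:

```python
def filter_by_tag(records, key, value):
    """Keep only the cost records whose tag `key` equals `value`."""
    return [r for r in records if r.get("tags", {}).get(key) == value]

tagged_calls = [
    {"cost_usd": 0.05, "tags": {"app": "chatbot", "env": "production"}},
    {"cost_usd": 0.02, "tags": {"app": "chatbot", "env": "staging"}},
    {"cost_usd": 0.10, "tags": {"app": "search", "env": "production"}},
]
prod = filter_by_tag(tagged_calls, "env", "production")
```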
## Cost Optimization Tips
Reduce costs by:
- Using smaller models where appropriate (GPT-4.1-mini vs GPT-4.1)
- Implementing caching for repeated queries
- Optimizing prompts to reduce token usage
- Setting usage limits per application
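The caching tip above can be as simple as an in-memory map keyed by a hash of the prompt, so a repeated identical query never hits the API twice. A sketch; `call_llm` stands in for whatever API client you use:

```python
import hashlib

_cache = {}  # prompt hash -> cached completion (in-memory; use Redis etc. in production)

def cached_completion(prompt, call_llm):
    """Return a cached reply for repeated prompts; only the first call is billed."""
    key = hashlib.sha256(prompt.encode()).hexdigest()
    if key not in _cache:
        _cache[key] = call_llm(prompt)  # cache miss: pay for this call once
    return _cache[key]

# Demo with a stub client that counts how many billable calls were made.
billed = {"n": 0}
def fake_llm(prompt):
    billed["n"] += 1
    return f"reply to: {prompt}"

first = cached_completion("What is PTU?", fake_llm)
second = cached_completion("What is PTU?", fake_llm)
```

Note this only helps for exact repeats; semantic caching (matching similar prompts) needs an embedding lookup instead of a hash.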
## Setting Alerts
Configure alerts for:
- Daily spending threshold
- Monthly budget limit
- Unusual usage patterns
Navigate to Notifications to set up cost alerts.
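The threshold checks behind the first two alert types can be sketched as below (anomaly detection for unusual usage patterns needs historical baselines and is out of scope here). The limit values are example defaults, not product settings:

```python
def check_alerts(daily_spend, monthly_spend, daily_limit=50.0, monthly_budget=1000.0):
    """Return the list of cost alerts triggered by current spend levels."""
    alerts = []
    if daily_spend > daily_limit:
        alerts.append("daily threshold exceeded")
    if monthly_spend > monthly_budget:
        alerts.append("monthly budget exceeded")
    return alerts
```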