Model usage and scheduled jobs, with 7-day and 30-day views.
Generated 2026-04-28
Figures and costs on this page are best-effort estimates from the machine that built it.
Per-day session token totals (from agent JSONL logs) and cron job run token totals (from the runs store), aligned on UTC calendar days. The chart is inline SVG (no network requests); the table follows.
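The UTC-day alignment described above can be sketched as follows. This is a minimal illustration, not the report generator itself; the field names `timestamp` and `total_tokens` are hypothetical, since the real agent log schema is not shown here.

```python
# Bucket session token counts by UTC calendar day, skipping rows
# without a parseable timestamp (as the footer note describes).
import json
from collections import defaultdict
from datetime import datetime, timezone

def tokens_per_utc_day(jsonl_lines):
    """Sum token counts per UTC calendar day from JSONL records."""
    totals = defaultdict(int)
    for line in jsonl_lines:
        try:
            rec = json.loads(line)
            ts = datetime.fromisoformat(rec["timestamp"])
        except (ValueError, KeyError):
            continue  # unparseable rows are excluded from the window
        day = ts.astimezone(timezone.utc).date().isoformat()
        totals[day] += int(rec.get("total_tokens", 0))
    return dict(totals)

lines = [
    '{"timestamp": "2026-04-27T23:30:00+00:00", "total_tokens": 100}',
    '{"timestamp": "2026-04-28T01:00:00+02:00", "total_tokens": 50}',
    'not json',
]
print(tokens_per_utc_day(lines))  # both records fall on 2026-04-27 UTC
```

Note that the second record, local time 01:00 on 2026-04-28 at UTC+2, lands on 2026-04-27 once converted, which is exactly why alignment happens on UTC days rather than local timestamps.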
Source: [redacted path] · Non-xAI token columns are derived from session logs in the window (entries with a parseable timestamp). xAI: token counts and $/M rates come from the invoice preview (billing cycle); cost comes from POST …/usage. Configure credentials (local): [REDACTED_AUTH] / XAI_TEAM_ID.
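For rows priced from $/M rates, the Est. Cost arithmetic is simply tokens divided by one million times the rate, summed over prompt and completion. This is a sketch of that arithmetic only; the xAI cost column comes from the usage endpoint instead, and table figures can be lower than this naive estimate where cache-hit pricing applies (e.g. DeepSeek's $0.07/M cache-hit rate).

```python
# Naive per-row cost estimate from $/M rates; real invoiced costs
# may differ (cache-hit discounts, billing-cycle windows).
def est_cost(prompt_tokens, completion_tokens, rate_in_per_m, rate_out_per_m):
    return (prompt_tokens / 1e6) * rate_in_per_m \
         + (completion_tokens / 1e6) * rate_out_per_m

# deepseek-reasoner rates from the table: $0.55/M in, $2.19/M out.
print(round(est_cost(518_026, 5_865, 0.55, 2.19), 4))  # 0.2978
```

Applied to the 7-day deepseek-reasoner row this yields $0.2978, noticeably above the table's $0.0316, which is consistent with most of those prompt tokens being billed at the discounted cache-hit rate.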
Last 7 days:

| Model | Provider | $/M In | $/M Out | Requests | Prompt Tokens | Compl. Tokens | Total Tokens | Est. Cost |
|---|---|---|---|---|---|---|---|---|
| xai/grok-4-1-fast-reasoning Grok 4.1 Fast (reasoning + non-reasoning). 2M token context. Cost: xAI Management API (POST /usage). xAI $/M from invoice (2026-04). xAI tokens from invoice (2026-04, billing cycle). | xAI | $0.12 | $0.48 | 55 | 31,343,243 | 797,782 | 32,141,025 | $1.1641 |
| deepseek/deepseek-chat deepseek-chat (V3.2). Cache hit: $0.07/M. Max 8K output. | DeepSeek | $0.27 | $1.10 | 123 | 3,279,412 | 33,457 | 3,312,869 | $0.1864 |
| deepseek/deepseek-reasoner deepseek-reasoner (R1). Thinking tokens count toward output cost. | DeepSeek | $0.55 | $2.19 | 12 | 518,026 | 5,865 | 523,891 | $0.0316 |
| google/gemini-2.5-flash Standard context (<200k). Thinking-mode output is $3.50/M. | Google AI Studio | – | – | 66 | 105,203 | 531 | 105,734 | $0.0000 |
| nvidia/minimax-m2.7 MiniMax official pay-as-you-go pricing for M2.7 standard. | MiniMax via NVIDIA NIM | – | – | 12 | 0 | 0 | 0 | $0.0000 |
| nvidia/deepseek-v4-pro DeepSeek V4-Pro: 1.6T params, 49B active. NVIDIA NIM endpoint. | NVIDIA NIM (DeepSeek V4 Pro) | – | – | 0 | 0 | 0 | 0 | $0.0000 |
| nvidia/deepseek-v4-flash DeepSeek published pricing for V4-Flash. NVIDIA NIM endpoint. | NVIDIA NIM (DeepSeek V4 Flash) | – | – | 12 | 0 | 0 | 0 | $0.0000 |
| groq-main/llama-3.3-70b-versatile Groq LPU inference. 128K context, up to 33K output. | Groq | – | – | 24 | 0 | 0 | 0 | $0.0000 |
| cerebras-8b/llama-3.1-8b Cerebras Llama 3.1 8B. Very fast inference on Cerebras hardware. | Cerebras | – | – | 18 | 0 | 0 | 0 | $0.0000 |
| TOTAL | | | | 322 | 35,245,884 | 837,635 | 36,083,519 | $1.3821 |
Last 30 days:

| Model | Provider | $/M In | $/M Out | Requests | Prompt Tokens | Compl. Tokens | Total Tokens | Est. Cost |
|---|---|---|---|---|---|---|---|---|
| xai/grok-4-1-fast-reasoning Grok 4.1 Fast (reasoning + non-reasoning). 2M token context. Cost: xAI Management API (POST /usage). xAI $/M from invoice (2026-04). xAI tokens from invoice (2026-04, billing cycle). | xAI | $0.12 | $0.48 | 133 | 31,343,243 | 797,782 | 32,141,025 | $7.0066 |
| deepseek/deepseek-chat deepseek-chat (V3.2). Cache hit: $0.07/M. Max 8K output. | DeepSeek | $0.27 | $1.10 | 1,036 | 63,651,126 | 248,845 | 63,899,971 | $3.1492 |
| deepseek/deepseek-reasoner deepseek-reasoner (R1). Thinking tokens count toward output cost. | DeepSeek | $0.55 | $2.19 | 66 | 2,586,642 | 24,661 | 2,611,303 | $0.1416 |
| google/gemini-2.5-flash Standard context (<200k). Thinking-mode output is $3.50/M. | Google AI Studio | – | – | 115 | 829,987 | 2,693 | 832,680 | $0.0000 |
| nvidia/minimax-m2.7 MiniMax official pay-as-you-go pricing for M2.7 standard. | MiniMax via NVIDIA NIM | – | – | 12 | 0 | 0 | 0 | $0.0000 |
| nvidia/deepseek-v4-pro DeepSeek V4-Pro: 1.6T params, 49B active. NVIDIA NIM endpoint. | NVIDIA NIM (DeepSeek V4 Pro) | – | – | 0 | 0 | 0 | 0 | $0.0000 |
| nvidia/deepseek-v4-flash DeepSeek published pricing for V4-Flash. NVIDIA NIM endpoint. | NVIDIA NIM (DeepSeek V4 Flash) | – | – | 12 | 0 | 0 | 0 | $0.0000 |
| groq-main/llama-3.3-70b-versatile Groq LPU inference. 128K context, up to 33K output. | Groq | – | – | 31 | 0 | 0 | 0 | $0.0000 |
| cerebras-8b/llama-3.1-8b Cerebras Llama 3.1 8B. Very fast inference on Cerebras hardware. | Cerebras | – | – | 18 | 0 | 0 | 0 | $0.0000 |
| TOTAL | | | | 1,423 | 98,410,998 | 1,073,981 | 99,484,979 | $10.2974 |
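Since these tables are best-effort estimates stitched from several sources, a quick row-level sanity check (prompt + completion = total) catches mismatches before publishing. A minimal sketch, using illustrative tuples rather than the report's real data structures:

```python
# Flag rows where the Total Tokens column disagrees with
# Prompt Tokens + Compl. Tokens.
def check_rows(rows):
    """Return indices of (prompt, completion, total) rows that don't add up."""
    return [i for i, (p, c, t) in enumerate(rows) if p + c != t]

rows = [
    (31_343_243, 797_782, 32_141_025),  # xai/grok-4-1-fast-reasoning
    (3_279_412, 33_457, 3_312_869),     # deepseek/deepseek-chat
]
print(check_rows(rows))  # [] -- both rows are internally consistent
```

The same check applied to each TOTAL row (column sums versus the printed totals) guards against the generator summing different source fields into different columns.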