utahgpu.com
Instant, Private LLM API — Hosted in Utah
Spin up a dedicated Mistral-7B endpoint in seconds.
Enjoy ≤ 65 ms latency to the Western US, zero rate limits, and pricing that's significantly lower than OpenAI's.
🔒 Private by Default
Your prompts stay on a single-tenant RTX 3090 inside a secured Utah colocation facility.
⚡️ Blazingly Fast
Sub-75 ms median latency and ~135 tokens/sec throughput.
🧠 Cost Smart
After the free credit, pay just $0.20 / 100k tokens or $0.28 / GPU-hour.
curl https://api.utahgpu.com/v1/chat/completions \
  -H "Authorization: Bearer <YOUR_KEY>" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "mistral",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Tell me a Utah fun fact:"}
    ],
    "max_tokens": 64
  }'
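
Prefer calling the API from code? Here is a minimal Python sketch using the requests library. It assumes the endpoint accepts the same OpenAI-style chat payload and returns the usual choices[0].message.content response shape shown implicitly by the curl example above; replace <YOUR_KEY> with your API key.

# Minimal Python sketch of the same request (assumes an OpenAI-style
# /v1/chat/completions payload and response, as in the curl example above).
import requests

API_KEY = "<YOUR_KEY>"  # your utahgpu.com API key

response = requests.post(
    "https://api.utahgpu.com/v1/chat/completions",
    headers={
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    },
    json={
        "model": "mistral",
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Tell me a Utah fun fact:"},
        ],
        "max_tokens": 64,
    },
    timeout=30,
)
response.raise_for_status()

# Assumes an OpenAI-style response: choices[0].message.content
print(response.json()["choices"][0]["message"]["content"])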
| Plan | What You Get | Who It's For | Price |
|---|---|---|---|
| Starter | 1M tokens, 1 model, email support | Kick-the-tires devs | Free |
| Builders | 10M tokens, 2 models, chat & webhook support | Indie apps / side projects | $20 one-time |
| Dedicated GPU | Full GPU access (SSH or REST, 24/7 uptime SLA) | Production or fine-tunes | $0.28 / hr |
Usage beyond included credits is billed at $0.20 per additional 100k tokens. Hourly rental is billed per second.
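
As a quick back-of-the-envelope check, here's a short Python sketch of how the published rates translate into a bill. The token and hour counts below are hypothetical examples, not plan limits.

# Back-of-the-envelope cost estimate using the published rates.
# The traffic figures below are hypothetical examples.
TOKEN_RATE = 0.20 / 100_000   # $0.20 per 100k tokens
GPU_HOUR_RATE = 0.28          # $0.28 per GPU-hour

extra_tokens = 5_000_000      # tokens used beyond included credits
gpu_hours = 72                # hours of dedicated GPU rental

token_cost = extra_tokens * TOKEN_RATE   # 5M tokens -> $10.00
gpu_cost = gpu_hours * GPU_HOUR_RATE     # 72 hours  -> $20.16

print(f"Token overage: ${token_cost:.2f}")
print(f"GPU rental:    ${gpu_cost:.2f}")
print(f"Total:         ${token_cost + gpu_cost:.2f}")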
Thank you!
We've added you to our waitlist. You'll be notified when we're ready to onboard you!