utahgpu.com

Instant, Private LLM API — Hosted in Utah

Spin up a dedicated Mistral-7B endpoint in seconds.
Enjoy ≤ 65 ms latency to the Western US, zero rate limits, and pricing that’s significantly lower than OpenAI’s.


🔒 Private by Default

Your prompts stay on a single-tenant RTX 3090 inside a secured Utah colocation.

⚡️ Blazingly Fast

Sub-75 ms median latency, ~135 tokens/sec throughput.

🧠 Cost Smart

After the free credit, pay just $0.20 per 100k tokens or $0.28 per GPU-hour.


curl https://api.utahgpu.com/v1/chat/completions \
  -H "Authorization: Bearer <YOUR_KEY>" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "mistral",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant."},
      {"role": "user", "content": "Tell me a Utah fun fact:"}
    ],
    "max_tokens": 64
  }'
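
If you'd rather call the endpoint from code, here is a minimal Python sketch of the same request using the requests library. It assumes the endpoint returns OpenAI-style chat-completion JSON (a "choices" list containing a "message"); only the request shape is confirmed by the curl example above.

import requests

API_KEY = "<YOUR_KEY>"  # replace with your utahgpu.com key

# Same request body as the curl example above.
resp = requests.post(
    "https://api.utahgpu.com/v1/chat/completions",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "model": "mistral",
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Tell me a Utah fun fact:"},
        ],
        "max_tokens": 64,
    },
    timeout=30,
)
resp.raise_for_status()

# Assumes an OpenAI-style response: {"choices": [{"message": {"content": ...}}]}
print(resp.json()["choices"][0]["message"]["content"])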


Plan          | What You Get                                   | Who It's For               | Price
Starter       | 1M tokens; 1 model; Email support              | Kick-the-tires devs        | Free
Builders      | 10M tokens; 2 models; Chat & webhook support   | Indie apps / side projects | $20 one-time
Dedicated GPU | Full GPU access (SSH or REST, 24/7 uptime SLA) | Production or fine-tunes   | $0.28 / hr

Usage beyond included credits is billed at $0.20 per additional 100k tokens. Hourly GPU rental is billed per second.
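For example, running 500k tokens past your included credits costs 5 × $0.20 = $1.00 in overage, and 90 minutes of dedicated GPU time bills as 1.5 × $0.28 = $0.42.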
