Pricing Plans (2026)
Open Source / Self-Host Free
Free
- Free model weights download (Llama 4 Scout, Maverick, Llama 3.3 70B, and more)
- Commercial use permitted under Meta's license
- Full control over data and infrastructure
- Requires own GPU hardware to run
Llama API (Limited Preview) Free
Free
- Free access during limited preview period
- Llama 4 Scout, Llama 4 Maverick, Llama 3.3 70B & 8B
- One-click API key creation
- Interactive model playgrounds
- Python & TypeScript SDKs
- OpenAI SDK compatibility
- Fine-tuning & evaluation tools (Llama 3.3 8B)
- Ultra-fast inference via Cerebras & Groq
- Inputs/outputs never used to train Meta models
Third-Party API (e.g., DeepInfra)
$0.08/per 1M input tokens (Scout)
- Llama 4 Scout from ~$0.08/1M input tokens
- Llama 4 Maverick from ~$0.50/1M tokens
- Llama 3.3 70B available free on OpenRouter
- Pay-as-you-go, no upfront costs
- Available on AWS Bedrock, Azure, Together AI, and more
Enterprise / Custom Enterprise
Custom
- Negotiated volume discounts
- Dedicated infrastructure options
- Available via AWS Bedrock, Azure AI Foundry, Google Vertex AI
- Provisioned throughput options on major clouds
- Enhanced support packages
Special Perks & Discounts
- Open-source weights freely downloadable
- No vendor lock-in — port weights anywhere
- Data privacy: API inputs/outputs not used for training or ads
- Compatible with OpenAI SDK for easy migration
- Partnered with Cerebras for world's fastest Llama inference
Verdict
Unmatched open-source value. Llama is free to download and self-host, and the official Llama API is currently in free limited preview — making it the most cost-effective frontier AI model available for developers and enterprises alike.
Related AI Assistants
Track every deal, free tier & perk automatically
uragent monitors 63+ AI tools daily — so you never miss a free tier or deal.
Get Early Access Free →