mnfst/awesome-free-llm-apis: Permanent Free LLM API List (API Keys) 😎🔑

Permanent Free LLM API List (API Keys) 😎🔑. Contribute to mnfst/awesome-free-llm-apis development by creating an account on GitHub.

Github

直达网站

站点详情介绍

LLM APIs with permanent free tiers for text inference.

Provider APIs

APIs run by the companies that train or fine-tune the models themselves.

Cohere 🇺🇸 - Command A, Command R+, Aya Expanse 32B +9 more. 20 RPM, 1K/mo.
Google Gemini 🇺🇸 - Gemini 2.5 Pro, Flash, Flash-Lite +4 more. 5-15 RPM, 100-1K RPD. ¹
Mistral AI 🇪🇺 - Mistral Large 3, Small 3.1, Ministral 8B +3 more. 1 req/s, 1B tok/mo.
Zhipu AI 🇨🇳 - GLM-4.7-Flash, GLM-4.5-Flash, GLM-4.6V-Flash. Limits undocumented.

Inference providers

Third-party platforms that host open-weight models from various sources.

Cerebras 🇺🇸 - Llama 3.3 70B, Qwen3 235B, GPT-OSS-120B +3 more. 30 RPM, 14,400 RPD.
Cloudflare Workers AI 🇺🇸 - Llama 3.3 70B, Qwen QwQ 32B +47 more. 10K neurons/day.
GitHub Models 🇺🇸 - GPT-4o, Llama 3.3 70B, DeepSeek-R1 +more. 10-15 RPM, 50-150 RPD.
Groq 🇺🇸 - Llama 3.3 70B, Llama 4 Scout, Kimi K2 +17 more. 30 RPM, 1K RPD (14,400 for Llama 3.1 8B). ²
Hugging Face 🇺🇸 - Llama 3.3 70B, Qwen2.5 72B, Mistral 7B +many more. $0.10/mo in free credits.
Kluster AI 🇺🇸 - DeepSeek-R1, Llama 4 Maverick, Qwen3-235B +2 more. Limits undocumented.
LLM7.io 🇬🇧 - DeepSeek R1, Flash-Lite, Qwen2.5 Coder +27 more. 30 RPM (120 with token).
NVIDIA NIM 🇺🇸 - Llama 3.3 70B, Mistral Large, Qwen3 235B +more. 40 RPM.
Ollama Cloud 🇺🇸 - DeepSeek-V3.2, Qwen3.5, Kimi-K2.5 +17 more. 1 concurrent model, light usage. ³
OpenRouter 🇺🇸 - DeepSeek R1, Llama 3.3 70B, GPT-OSS-120B +29 more. 20 RPM, 50 RPD (1K with $10+ in purchased credits). ⁴

Contributing

Know a free tier that’s missing? Open a PR. Include the provider, endpoint, rate limits (link to their docs), and a few notable models. Trial credits and time-limited promos don’t count.

Footnotes

RPM – requests per minute. RPD – requests per day.
“Limits undocumented” means the provider doesn’t publish their rate limits.
All endpoints are OpenAI SDK-compatible unless noted.
Each link points to the provider’s API key page.

Free tier not available in the EU, UK, or Switzerland (available regions). ↩︎
14,400 RPD only applies to Llama 3.1 8B Instant. Most other models (Llama 3.3 70B, Llama 4 Scout, Kimi K2, etc.) are limited to 1,000 RPD (rate limits). ↩︎
Ollama Cloud measures usage by GPU time, not tokens or requests. Free tier described as “light usage” with session limits resetting every 5 hours and weekly limits every 7 days. Pro (50x more) and Max (250x more) plans available. Not OpenAI SDK-compatible; uses Ollama API ↩︎
Free models default to 50 RPD. A one-time purchase of $10+ in credits unlocks 1,000 RPD for free models. OpenRouter also offers a Free Models Router (openrouter/free) and model fallbacks for chaining models in priority order. ↩︎

相关导航推荐

swufe-xiongmin/free_premium_ai: 收录可以免费使用能力较强的AI的合集，尽量筛选较为稳定的平台，但希望合理预期，免费，稳定，高性能是不可能三角A list of highly capable AI models you can use for free. We’ve tried to select the most stable platforms, but keep in mind the "impossible triangle": you can’t always have free, stable, and high performance all at once.

收录可以免费使用能力较强的AI的合集，尽量筛选较为稳定的平台，但希望合理预期，免费，稳定，高性能是不可能三角A list of highly capable AI models you can use for free. We’ve tried to select the most stable platforms, but keep in mind the "impossible triangle": you can’t always have free, stable, and high performance all at once. - swufe-xiongmin/free_premium_ai