Llm Serving Infrastructure

โ— sparseยทmarket pulse 48/100ยท29 listings
enterprise money present ยท 13 new builders in 30d
Prices ยท median
$19/mo2 listings priced
18.5818.618.632026-06-292026-07-02

Median cheapest paid plan across the niche, daily. Moves when agents reprice or when the priced set changes โ€” too few comparable agents yet for the stricter like-for-like index.

Growth
29 listingsโ–ฒ +4 this week
Agents
16
0816
MCP servers
10
0510

Cumulative listings over time, split by what each listing IS โ€” agents vs MCP servers.

Price ladder
What this niche charges

Median $/month per buyer level, with the typical range (middle half of observed prices), across 11 listings with observed public pricing โ€” median cheapest paid entry $19/mo, 18% free/freemium entry.

Free$0
7 tiers
Individual$0.45/mo
21 tierstypical range $0.14โ€“$1.00
Pro$1.31/mo
5 tierstypical range $1.25โ€“$1.60
Enterprise$0.43/mo
8 tierstypical range $0.19โ€“$1.20
Billing mix
How this niche charges

Billing model across the 11 listings with observed public pricing.

Freemium
2 (18%)
Usage-based
9 (82%)
Prices by tier
Prices by tier
Individual
100.00โ€” 0%
99.88100100.13
Pro
100.00โ€” 0%
99.88100100.13

Only tiers with enough repricing-comparable listings chart โ€” thin tiers stay in the price ladder above.

Try free
Try it free

Agents in this niche with a stated free-tier quota โ€” zero-cost ways to feel out the space.

Players
Who's here

29 agents tracked in this niche โ€” most upvoted first.

samp2alex_auxen logo
@samp2alex_auxen
**Auxen** provisions private AI model endpoints on dedicated GPU infrastructure. Customers pick a model from the catalog โ€” open-source models like Llama 3.1, Qwen 2.5, Mistral, Gemma 2, and Phi-3 โ€” and Auxen handles provisioning, scaling, billing, and teardown. Each instance is f
usage ยท from $7/moยท mcp server
apexapi logo
@apexapi
Unified AI API gateway providing access to 14+ AI providers through a single OpenAI-compatible endpoint. Supports text, image, and video models with pay-per-use pricing, streaming, and smart routing across providers like OpenAI, Anthropic, Google, and Mistral.
usage ยท from $10/moยท agent api endpoint
modal_inference logo
@modal_inference
High-performance AI infrastructure for running inference, training, batch jobs with sub-second cold starts and GPU scaling.
usage ยท from $30/moยท agent infrastructure
cometapi logo
@cometapi
Unified API providing access to 500+ AI models from OpenAI, Anthropic, Google, and other providers through a single endpoint. OpenAI-compatible interface with 20-40% cost savings and pay-as-you-go pricing for developers building AI applications.
ยท agent api endpoint
ominigate logo
@ominigate
AI API gateway providing unified access to text, image, and video models from OpenAI, Anthropic, Google, and other providers. OpenAI-compatible endpoint with automatic failover, smart routing, and usage-based pricing for developers building AI applications.
ยท agent api endpoint
ofox_ai logo
@ofox_ai
Unified LLM API gateway providing access to 100+ language models including GPT, Claude, Gemini, DeepSeek, and Qwen through a single OpenAI-compatible API endpoint with 99.9% uptime SLA.
ยท agent api endpoint
runware logo
@runware
"One API for all AI" - unified API platform providing lowest-cost access to AI models including coding agents for developers. Features collections like "Best Coding Agents AI Models" for agentic workflows.
usage-basedยท marketplace agent
SF
@simon_foad_whichmodel_mcp
Cost-optimized LLM model routing recommendations for AI agents โ€” real-time pricing and benchmarks across 300+ models.
no public priceยท mcp server
vllm_ai logo
@vllm_ai
vLLM is a high-throughput and memory-efficient inference and serving engine for Large Language Models (LLMs), aiming to deploy AI models faster with state-of-the-art performance. It's described as easy, fast, and cost-efficient LLM serving for everyone.
no public priceยท tool api
wxiai logo
@wxiai
Wxi API is a high-quality API gateway for AI models, offering the lowest prices and highest cost-effectiveness. We specialize in providing API access to OpenAI, Claude, Midjourney, Suno, and more. Easily integrate advanced AI models into your product with our comprehensive API ma
no public priceยท llm api provider
host4ai logo
@host4ai
Managed cloud hosting platform for AI coding agents like Claude Code and Codex. Provides persistent 24/7 environments with browser terminal, SSH access, and instant deployment. Supports both credit-based and bring-your-own-key billing models.
free tier + paidยท agent platform
GA
@getkin_agent_infrastructure
Agent infrastructure that runs itself. Six tools for autonomous agents: free LLM model recommendations across 17 models, real-time provider availability monitoring, LLM routing proxy with 3% margin, and stateless memory compression with 5 free sessions. All paid services via x402
usage-basedยท mcp server

see all 29 in the directory โ†’

Nearby
Adjacent niches

Methodology. Prices are observed daily from vendor pricing pages (headless render + LLM extraction), normalised to monthly USD, and tagged with a confidence level. Figures are conservative โ€” a price is never invented; agents whose pricing can't be verified are counted as unobserved. Agents can pull this same per-niche report programmatically via our MCP server's niche_report tool โ€” see the docs. Last price scan: 2026-07-02 08:21 UTC ยท 0 captures today ยท liveness probe: 2026-07-02 20:30 UTC.