Building an AI Agent Marketplace: Architecture for Agent Discovery and Deployment
Design a production-grade AI agent marketplace with catalog management, semantic search, automated provisioning, and usage-based billing. Learn the core data models and API patterns that power agent distribution at scale.
Why Agent Marketplaces Matter
As organizations build dozens or hundreds of specialized AI agents, discovery becomes a bottleneck. Teams duplicate effort because they cannot find existing agents that already solve their problem. An agent marketplace solves this by providing a centralized catalog where publishers list agents and consumers discover, evaluate, and deploy them.
The architecture of an agent marketplace shares DNA with app stores and package registries, but agents introduce unique requirements: they need runtime provisioning, tool access management, credential isolation, and usage metering that traditional software catalogs do not handle.
Core Data Model
The foundation of any marketplace is the catalog. Each listing represents a published agent with its metadata, pricing, and deployment configuration:
flowchart LR
IN(["Input text"])
TOK["Tokenizer<br/>BPE or SentencePiece"]
EMB["Token plus position<br/>embeddings"]
subgraph BLOCK["Transformer block (xN)"]
ATTN["Multi head<br/>self attention"]
NORM1["Layer norm"]
FF["Feed forward<br/>MLP"]
NORM2["Layer norm"]
end
HEAD["LM head plus<br/>softmax"]
SAMP["Sampling<br/>top-p, temperature"]
OUT(["Next token"])
IN --> TOK --> EMB --> ATTN --> NORM1 --> FF --> NORM2 --> HEAD --> SAMP --> OUT
SAMP -.->|Append| EMB
style BLOCK fill:#ede9fe,stroke:#7c3aed,color:#1e1b4b
style ATTN fill:#4f46e5,stroke:#4338ca,color:#fff
style OUT fill:#059669,stroke:#047857,color:#fff
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional
import uuid
from datetime import datetime
class PricingModel(Enum):
FREE = "free"
USAGE_BASED = "usage_based"
SUBSCRIPTION = "subscription"
ONE_TIME = "one_time"
class AgentStatus(Enum):
DRAFT = "draft"
IN_REVIEW = "in_review"
PUBLISHED = "published"
SUSPENDED = "suspended"
DEPRECATED = "deprecated"
@dataclass
class AgentListing:
id: str = field(default_factory=lambda: str(uuid.uuid4()))
publisher_id: str = ""
name: str = ""
slug: str = ""
description: str = ""
long_description: str = ""
version: str = "1.0.0"
category: str = ""
tags: list[str] = field(default_factory=list)
status: AgentStatus = AgentStatus.DRAFT
pricing_model: PricingModel = PricingModel.FREE
price_per_invocation: Optional[float] = None
monthly_price: Optional[float] = None
required_tools: list[str] = field(default_factory=list)
required_credentials: list[str] = field(default_factory=list)
deployment_config: dict = field(default_factory=dict)
install_count: int = 0
avg_rating: float = 0.0
created_at: datetime = field(default_factory=datetime.utcnow)
updated_at: datetime = field(default_factory=datetime.utcnow)
This model captures everything a consumer needs to evaluate an agent: what it does, what it costs, what tools it requires, and how it gets deployed.
Hear it before you finish reading
Talk to a live CallSphere AI voice agent in your browser — 60 seconds, no signup.
Search and Discovery
Simple keyword search is insufficient for agent discovery. Consumers describe problems, not implementation details. Semantic search powered by embeddings lets users search by intent:
import numpy as np
from typing import Any
class AgentSearchService:
def __init__(self, embedding_client, vector_store):
self.embedding_client = embedding_client
self.vector_store = vector_store
async def index_listing(self, listing: AgentListing):
searchable_text = (
f"{listing.name} {listing.description} "
f"{listing.long_description} {' '.join(listing.tags)}"
)
embedding = await self.embedding_client.embed(searchable_text)
await self.vector_store.upsert(
id=listing.id,
vector=embedding,
metadata={
"name": listing.name,
"category": listing.category,
"pricing_model": listing.pricing_model.value,
"avg_rating": listing.avg_rating,
"install_count": listing.install_count,
},
)
async def search(
self,
query: str,
category: str | None = None,
pricing_model: str | None = None,
min_rating: float = 0.0,
limit: int = 20,
) -> list[dict[str, Any]]:
query_embedding = await self.embedding_client.embed(query)
filters = {}
if category:
filters["category"] = category
if pricing_model:
filters["pricing_model"] = pricing_model
if min_rating > 0:
filters["avg_rating"] = {"$gte": min_rating}
results = await self.vector_store.query(
vector=query_embedding,
filter=filters,
top_k=limit,
)
return results
A consumer searching for "handle customer refund requests" finds the right agent even if its listing never uses the word "refund."
Provisioning Pipeline
When a consumer installs an agent, the marketplace must provision it — allocating resources, injecting credentials, and configuring tool access. This provisioning pipeline is the most complex component:
class ProvisioningService:
def __init__(self, secret_manager, runtime_manager, billing_service):
self.secret_manager = secret_manager
self.runtime_manager = runtime_manager
self.billing_service = billing_service
async def provision_agent(
self, listing: AgentListing, tenant_id: str
) -> dict:
# Step 1: Validate tenant has required credentials
missing = await self._check_credentials(
tenant_id, listing.required_credentials
)
if missing:
raise ValueError(
f"Missing credentials: {', '.join(missing)}"
)
# Step 2: Create isolated runtime environment
runtime = await self.runtime_manager.create_runtime(
agent_id=listing.id,
tenant_id=tenant_id,
config=listing.deployment_config,
)
# Step 3: Inject tenant credentials into runtime
for cred_name in listing.required_credentials:
cred_value = await self.secret_manager.get_secret(
tenant_id, cred_name
)
await self.runtime_manager.inject_secret(
runtime.id, cred_name, cred_value
)
# Step 4: Set up billing metering
await self.billing_service.create_meter(
tenant_id=tenant_id,
agent_id=listing.id,
pricing_model=listing.pricing_model,
)
return {
"runtime_id": runtime.id,
"endpoint": runtime.endpoint,
"status": "provisioned",
}
async def _check_credentials(
self, tenant_id: str, required: list[str]
) -> list[str]:
missing = []
for cred_name in required:
exists = await self.secret_manager.has_secret(
tenant_id, cred_name
)
if not exists:
missing.append(cred_name)
return missing
Each tenant gets an isolated runtime with its own credentials. The marketplace never shares secrets between tenants.
Still reading? Stop comparing — try CallSphere live.
CallSphere ships complete AI voice agents per industry — 14 tools for healthcare, 10 agents for real estate, 4 specialists for salons. See how it actually handles a call before you book a demo.
Billing Integration
Usage-based billing requires metering every agent invocation. A lightweight metering layer records events and aggregates them for the billing system:
from collections import defaultdict
class UsageMeter:
def __init__(self, event_store):
self.event_store = event_store
async def record_invocation(
self, tenant_id: str, agent_id: str, tokens_used: int,
duration_ms: int
):
event = {
"tenant_id": tenant_id,
"agent_id": agent_id,
"tokens_used": tokens_used,
"duration_ms": duration_ms,
"timestamp": datetime.utcnow().isoformat(),
}
await self.event_store.append(event)
async def get_usage_summary(
self, tenant_id: str, agent_id: str, period_start: str
) -> dict:
events = await self.event_store.query(
tenant_id=tenant_id,
agent_id=agent_id,
since=period_start,
)
total_invocations = len(events)
total_tokens = sum(e["tokens_used"] for e in events)
total_duration = sum(e["duration_ms"] for e in events)
return {
"invocations": total_invocations,
"total_tokens": total_tokens,
"avg_duration_ms": (
total_duration / total_invocations
if total_invocations > 0
else 0
),
}
FAQ
How do you handle agent versioning in a marketplace?
Treat each version as an immutable artifact. Published versions cannot be modified — only new versions can be released. Consumers pin to a specific version and receive upgrade notifications. The marketplace maintains compatibility metadata so consumers can assess upgrade risk.
What is the biggest architectural challenge in agent marketplaces?
Credential isolation. Every tenant must have their own secrets injected into agent runtimes without any cross-tenant leakage. Use a dedicated secret manager with tenant-scoped namespaces and audit every credential access.
Should marketplace agents run in the publisher's infrastructure or the consumer's?
Both models exist. Publisher-hosted simplifies deployment but raises data privacy concerns. Consumer-hosted gives full data control but increases deployment complexity. Most production marketplaces offer both options and let the consumer choose based on their compliance requirements.
#AgentMarketplace #AgentDiscovery #AgentDeployment #PlatformArchitecture #AgenticAI #LearnAI #AIEngineering
Try CallSphere AI Voice Agents
See how AI voice agents work for your industry. Live demo available -- no signup required.