Knowledge Graphs for AI Agents: Structured Memory with Entities and Relations
Learn how to build knowledge graph memory for AI agents — extracting entities and relationships from text, storing them in graph structures, and querying connected information for richer agent reasoning.
Why Knowledge Graphs for Agent Memory?
Vector stores excel at finding semantically similar text, but they lose structural relationships. If an agent stores "Alice manages the engineering team" and "The engineering team is building Project X," a vector search for "Who is responsible for Project X?" might not connect these two facts. A knowledge graph stores explicit relationships between entities — Alice MANAGES Engineering, Engineering BUILDS Project X — enabling multi-hop reasoning that flat memory cannot.
Knowledge graphs represent information as a network of entities (nodes) connected by typed relationships (edges). This structure mirrors how knowledge naturally relates: people belong to teams, teams own projects, projects have deadlines.
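As a toy illustration of the Alice example above (all names invented), the facts become triples, and multi-hop reasoning is just following edges:

```python
# Knowledge stored as (subject, RELATION, object) triples
triples = [
    ("Alice", "MANAGES", "Engineering"),
    ("Engineering", "BUILDS", "Project X"),
]

def reachable(start: str, end: str, triples) -> bool:
    # Follow outgoing edges transitively from `start`
    frontier, seen = {start}, set()
    while frontier:
        node = frontier.pop()
        seen.add(node)
        for s, _, t in triples:
            if s == node and t not in seen:
                frontier.add(t)
    return end in seen

print(reachable("Alice", "Project X", triples))  # True: MANAGES -> BUILDS
```

A flat vector store holds the same two sentences but has no cheap way to chain them; here the connection is a two-step graph walk.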
Defining the Graph Structure
Start with a simple in-memory graph that stores entities and their relationships.
flowchart TD
    MSG(["New message"])
    WORKING["Working memory<br/>rolling window"]
    EPISODIC[("Episodic memory<br/>past sessions")]
    SEMANTIC[("Semantic memory<br/>facts and preferences")]
    SUM["Summarizer<br/>compresses old turns"]
    ROUTER{"Retrieve<br/>needed memories"}
    PROMPT["Assembled context"]
    LLM["LLM"]
    UPD["Memory updater<br/>writes new facts"]
    MSG --> WORKING --> ROUTER
    ROUTER -->|Past sessions| EPISODIC
    ROUTER -->|User facts| SEMANTIC
    EPISODIC --> SUM --> PROMPT
    SEMANTIC --> PROMPT
    WORKING --> PROMPT --> LLM --> UPD
    UPD --> EPISODIC
    UPD --> SEMANTIC
    style ROUTER fill:#4f46e5,stroke:#4338ca,color:#fff
    style LLM fill:#f59e0b,stroke:#d97706,color:#1f2937
    style EPISODIC fill:#ede9fe,stroke:#7c3aed,color:#1e1b4b
    style SEMANTIC fill:#ede9fe,stroke:#7c3aed,color:#1e1b4b
from collections import deque
from dataclasses import dataclass, field
from typing import Dict, List, Set, Optional, Tuple
from datetime import datetime

@dataclass
class Entity:
    name: str
    entity_type: str  # "person", "team", "project", "concept"
    properties: Dict[str, str] = field(default_factory=dict)
    created_at: datetime = field(default_factory=datetime.utcnow)

@dataclass
class Relation:
    source: str  # entity name
    relation_type: str  # "MANAGES", "WORKS_ON", "DEPENDS_ON"
    target: str  # entity name
    properties: Dict[str, str] = field(default_factory=dict)
    confidence: float = 1.0
    created_at: datetime = field(default_factory=datetime.utcnow)

class KnowledgeGraph:
    def __init__(self):
        self.entities: Dict[str, Entity] = {}
        self.relations: List[Relation] = []

    def add_entity(self, name: str, entity_type: str, **properties) -> Entity:
        if name in self.entities:
            # Update existing entity with new properties
            self.entities[name].properties.update(properties)
            return self.entities[name]
        entity = Entity(name=name, entity_type=entity_type, properties=properties)
        self.entities[name] = entity
        return entity

    def add_relation(
        self, source: str, relation_type: str, target: str,
        confidence: float = 1.0, **properties
    ) -> Relation:
        # Ensure both entities exist
        if source not in self.entities:
            self.add_entity(source, "unknown")
        if target not in self.entities:
            self.add_entity(target, "unknown")
        # Avoid duplicate relations: merge into the existing one instead
        for r in self.relations:
            if r.source == source and r.relation_type == relation_type and r.target == target:
                r.confidence = max(r.confidence, confidence)
                r.properties.update(properties)
                return r
        relation = Relation(
            source=source, relation_type=relation_type, target=target,
            confidence=confidence, properties=properties,
        )
        self.relations.append(relation)
        return relation

    def get_neighbors(self, entity_name: str, direction: str = "both") -> List[Relation]:
        """Get all relations involving an entity."""
        results = []
        for r in self.relations:
            if direction in ("out", "both") and r.source == entity_name:
                results.append(r)
            if direction in ("in", "both") and r.target == entity_name:
                results.append(r)
        return results

    def find_path(
        self, start: str, end: str, max_depth: int = 4
    ) -> Optional[List[Relation]]:
        """BFS to find the shortest path between two entities."""
        if start not in self.entities or end not in self.entities:
            return None
        queue: deque = deque([(start, [])])
        visited: Set[str] = {start}
        while queue:
            current, path = queue.popleft()
            if current == end:
                return path
            if len(path) >= max_depth:
                continue
            for rel in self.get_neighbors(current, direction="out"):
                next_node = rel.target
                if next_node not in visited:
                    visited.add(next_node)
                    queue.append((next_node, path + [rel]))
        return None
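To see what find_path returns, here is the same breadth-first search over a plain adjacency dict, standalone so it runs without the class (entity names follow the later payment-service example):

```python
from collections import deque

def bfs_path(edges, start, end, max_depth=4):
    """edges: {source: [(relation, target), ...]}. Returns a list of triples or None."""
    queue = deque([(start, [])])
    visited = {start}
    while queue:
        current, path = queue.popleft()
        if current == end:
            return path
        if len(path) >= max_depth:
            continue
        for rel, nxt in edges.get(current, []):
            if nxt not in visited:
                visited.add(nxt)
                queue.append((nxt, path + [(current, rel, nxt)]))
    return None

edges = {
    "Alice": [("MANAGES", "backend team")],
    "backend team": [("BUILDS", "payment service")],
    "payment service": [("DEPENDS_ON", "Stripe API")],
}
path = bfs_path(edges, "Alice", "Stripe API")
# Three hops: Alice MANAGES backend team, backend team BUILDS payment service,
# payment service DEPENDS_ON Stripe API
```

Because BFS explores hop by hop, the first path found is also the shortest one, which is usually the most useful chain to show the model.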
Extracting Entities and Relations from Text
Use an LLM to parse unstructured text into graph triples.
import openai
import json

client = openai.OpenAI()

def extract_graph_triples(text: str) -> Dict:
    """Extract entities and relationships from text using an LLM."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {
                "role": "system",
                "content": (
                    "Extract entities and relationships from the text. "
                    "Return JSON with 'entities' (list of {name, type}) "
                    "and 'relations' (list of {source, relation, target}). "
                    "Use UPPER_CASE for relation types."
                ),
            },
            {"role": "user", "content": text},
        ],
        response_format={"type": "json_object"},
        max_tokens=500,
    )
    return json.loads(response.choices[0].message.content)

def update_graph_from_text(graph: KnowledgeGraph, text: str):
    """Parse text and add extracted knowledge to the graph."""
    extracted = extract_graph_triples(text)
    for ent in extracted.get("entities", []):
        graph.add_entity(ent["name"], ent.get("type", "unknown"))
    for rel in extracted.get("relations", []):
        graph.add_relation(rel["source"], rel["relation"], rel["target"])

# Example
graph = KnowledgeGraph()
update_graph_from_text(graph, (
    "Alice manages the backend team. The backend team is building "
    "the payment service. The payment service depends on Stripe API."
))

# Now we can query: who is connected to the payment service?
neighbors = graph.get_neighbors("payment service")
# e.g. backend team BUILDS payment service, payment service DEPENDS_ON Stripe API
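Even with JSON mode, the extracted structure is not guaranteed to be well-formed, so it is worth validating triples before inserting them. A minimal sketch (the field names match the extraction prompt above; your schema may differ):

```python
def validate_relations(extracted: dict) -> list:
    """Keep only relations whose fields are non-empty strings; normalize relation types."""
    valid = []
    for rel in extracted.get("relations", []):
        if not isinstance(rel, dict):
            continue
        s, r, t = rel.get("source"), rel.get("relation"), rel.get("target")
        if all(isinstance(x, str) and x.strip() for x in (s, r, t)):
            valid.append({
                "source": s.strip(),
                # Enforce UPPER_CASE relation types regardless of what the model returned
                "relation": r.strip().upper().replace(" ", "_"),
                "target": t.strip(),
            })
    return valid

sample = {"relations": [
    {"source": "Alice", "relation": "manages", "target": "backend team"},
    {"source": "", "relation": "BUILDS", "target": "payment service"},  # dropped: empty source
]}
print(validate_relations(sample))
# [{'source': 'Alice', 'relation': 'MANAGES', 'target': 'backend team'}]
```

Dropping malformed triples quietly is a reasonable default here: a missed fact can be re-extracted later, but a garbage node pollutes every traversal that touches it.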
Querying the Graph for Agent Context
When the agent receives a question, extract the relevant entities and pull their neighborhood from the graph.
def build_graph_context(graph: KnowledgeGraph, query: str, depth: int = 2) -> str:
    """Build a context string from graph data relevant to a query."""
    # Extract entities mentioned in the query
    extracted = extract_graph_triples(query)
    entity_names = [e["name"] for e in extracted.get("entities", [])]

    relevant_relations = []
    seen_relations = set()
    visited_entities = set()

    def explore(entity_name: str, current_depth: int):
        if current_depth > depth or entity_name in visited_entities:
            return
        visited_entities.add(entity_name)
        for rel in graph.get_neighbors(entity_name):
            key = (rel.source, rel.relation_type, rel.target)
            if key not in seen_relations:  # don't list the same relation twice
                seen_relations.add(key)
                relevant_relations.append(rel)
            next_entity = (
                rel.target if rel.source == entity_name else rel.source
            )
            explore(next_entity, current_depth + 1)

    for name in entity_names:
        if name in graph.entities:
            explore(name, 0)

    if not relevant_relations:
        return ""
    lines = ["Relevant knowledge from memory:"]
    for rel in relevant_relations:
        lines.append(f"- {rel.source} {rel.relation_type} {rel.target}")
    return "\n".join(lines)
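One way to use the result is to prepend it as a system message before calling the model. A small sketch (the message shape follows the Chat Completions convention used earlier; the helper name is ours):

```python
def with_graph_context(messages: list, graph_context: str) -> list:
    """Prepend graph-derived facts as a system message, if any were found."""
    if not graph_context:
        return messages  # nothing relevant in the graph: leave the prompt untouched
    return [{"role": "system", "content": graph_context}] + messages

msgs = [{"role": "user", "content": "Who is responsible for the Stripe integration?"}]
ctx = "Relevant knowledge from memory:\n- Alice MANAGES backend team"
augmented = with_graph_context(msgs, ctx)
print(augmented[0]["role"])  # system
```

Keeping graph facts in their own system message, rather than splicing them into the user turn, makes it easy to log or ablate the memory contribution separately.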
This approach lets the agent answer complex questions like "Who is ultimately responsible for the Stripe integration?" by traversing the chain: Alice MANAGES backend team, backend team BUILDS payment service, payment service DEPENDS_ON Stripe API.
FAQ
When should I use a knowledge graph instead of a vector store?
Use a knowledge graph when your agent needs to reason about relationships between entities — organizational structures, dependency chains, causal links. Use a vector store when you need to find semantically similar text passages. Many production systems use both: a vector store for retrieval and a knowledge graph for reasoning.
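A hybrid setup can be as simple as querying both stores and merging the results into one context block. A hedged sketch where the two retrievers are stand-in callables (in practice they would wrap your vector store and knowledge graph):

```python
def hybrid_context(query: str, vector_search, graph_search) -> str:
    """Combine graph facts and vector-retrieved passages into one context string.

    vector_search and graph_search are callables returning lists of strings;
    they are placeholders for real retriever clients.
    """
    facts = graph_search(query)
    passages = vector_search(query)
    sections = []
    if facts:
        sections.append("Known facts:\n" + "\n".join(f"- {f}" for f in facts))
    if passages:
        sections.append("Relevant passages:\n" + "\n".join(f"- {p}" for p in passages))
    return "\n\n".join(sections)

ctx = hybrid_context(
    "Who owns the payment service?",
    vector_search=lambda q: ["Meeting notes: payments launch slipped to Q3."],
    graph_search=lambda q: ["backend team BUILDS payment service"],
)
print(ctx)
```

Putting structured facts before free-text passages gives the model the unambiguous relationships first, with the passages as supporting evidence.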
Do I need a dedicated graph database like Neo4j?
For prototypes and agents with fewer than 10,000 entities, the in-memory Python implementation above works well. For production systems with large graphs, use Neo4j, Amazon Neptune, or even PostgreSQL with recursive CTEs. The query performance and built-in graph algorithms justify the infrastructure cost at scale.
How do I handle conflicting information in the graph?
Add a confidence score and timestamp to each relation. When conflicting facts arrive, keep both but mark the older one with lower confidence. When querying, prefer high-confidence and recent relations. You can also add a "CONTRADICTS" relation type to explicitly model conflicts.
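A minimal version of that policy, with confidence first and recency as the tiebreak (the exact ordering is a design choice, not the only option):

```python
from datetime import datetime

def pick_current_fact(candidates):
    """candidates: list of (value, confidence, created_at) tuples that all answer
    the same (source, relation_type) question. Returns the preferred one."""
    # max() compares tuples lexicographically: confidence first, then timestamp
    return max(candidates, key=lambda c: (c[1], c[2]))

facts = [
    ("Team A", 0.9, datetime(2024, 1, 1)),
    ("Team B", 0.9, datetime(2024, 6, 1)),  # same confidence, newer wins
    ("Team C", 0.5, datetime(2024, 7, 1)),  # newest, but low confidence
]
print(pick_current_fact(facts)[0])  # Team B
```

Note that the losing candidates stay in the graph; resolution happens at query time, so a later confidence update can change the answer without rewriting history.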
#KnowledgeGraphs #EntityExtraction #GraphDatabases #StructuredMemory #AgenticAI #LearnAI #AIEngineering
Try CallSphere AI Voice Agents
See how AI voice agents work for your industry. Live demo available -- no signup required.