Chroma DB Tutorial: Local-First Vector Database for Prototyping and Development
Get started with Chroma DB for local vector search — learn to create collections, add documents with auto-embedding, query by similarity, and persist data to disk for rapid AI prototyping.
Why Chroma DB for Local Development
When you are building a RAG pipeline or semantic search feature, the last thing you want is to configure cloud credentials and manage remote infrastructure just to test your embedding logic. Chroma DB solves this by running entirely in-process — import it, create a collection, and start querying. No server, no API keys, no network calls.
Chroma is an open-source embedding database that prioritizes developer experience. It handles embedding generation automatically (using built-in embedding functions), stores vectors alongside your documents and metadata, and persists everything to a local directory. When you are ready for production, Chroma also offers a client-server mode.
Installation
Install Chroma with pip:

pip install chromadb

That is all you need. No Docker containers, no external services.

For context, this is the retrieval pipeline Chroma slots into: documents are chunked, embedded, tagged with metadata, and indexed; at query time the query is embedded, matched against the index, and narrowed by metadata filters:

flowchart TD
DOC(["Document"])
CHUNK["Chunker<br/>recursive plus overlap"]
EMB["Embedding model"]
META["Attach metadata<br/>source, page, tenant"]
INDEX[("HNSW or IVF index<br/>in vector store")]
Q(["Query"])
QEMB["Embed query"]
SEARCH["ANN search<br/>cosine similarity"]
FILTER["Metadata filter<br/>tenant or date"]
HITS(["Top-k chunks"])
DOC --> CHUNK --> EMB --> META --> INDEX
Q --> QEMB --> SEARCH
INDEX --> SEARCH --> FILTER --> HITS
style INDEX fill:#4f46e5,stroke:#4338ca,color:#fff
style HITS fill:#059669,stroke:#047857,color:#fff
Creating a Client and Collection
A collection in Chroma is analogous to a table. It holds documents, their embeddings, metadata, and IDs:
import chromadb
# In-memory client (data lost when process exits)
client = chromadb.Client()
# Persistent client (data saved to disk)
client = chromadb.PersistentClient(path="./chroma_data")
# Create or get a collection
collection = client.get_or_create_collection(
    name="articles",
    metadata={"hnsw:space": "cosine"}
)
The hnsw:space metadata key sets the distance metric. Options are cosine, l2 (squared Euclidean, the default), and ip (inner product).
Adding Documents
Chroma can automatically generate embeddings for your documents using its default embedding function (a local sentence-transformer model):
collection.add(
    documents=[
        "Vector databases store high-dimensional embeddings.",
        "PostgreSQL is a relational database management system.",
        "Transformers use attention mechanisms for sequence modeling.",
    ],
    metadatas=[
        {"source": "tutorial", "topic": "databases"},
        {"source": "docs", "topic": "databases"},
        {"source": "paper", "topic": "ml"},
    ],
    ids=["doc-1", "doc-2", "doc-3"]
)
No separate embedding API call needed. Chroma downloads and runs a lightweight model (all-MiniLM-L6-v2 by default) the first time you add documents.
Querying by Similarity
Pass a query string and Chroma handles embedding and similarity search:
results = collection.query(
    query_texts=["How do vector databases work?"],
    n_results=2
)

for i, doc in enumerate(results["documents"][0]):
    distance = results["distances"][0][i]
    metadata = results["metadatas"][0][i]
    print(f"Result {i+1} (distance: {distance:.4f}): {doc}")
    print(f"  Metadata: {metadata}")
You can also query with pre-computed embeddings using query_embeddings instead of query_texts.
Using Custom Embedding Functions
Swap the default model for OpenAI, Cohere, or any custom function:
from chromadb.utils.embedding_functions import OpenAIEmbeddingFunction
openai_ef = OpenAIEmbeddingFunction(
    api_key="sk-...",
    model_name="text-embedding-3-small"
)

collection = client.get_or_create_collection(
    name="articles_openai",
    embedding_function=openai_ef
)
Or create your own embedding function:
from chromadb import Documents, EmbeddingFunction, Embeddings
class MyEmbeddingFunction(EmbeddingFunction):
    def __call__(self, input: Documents) -> Embeddings:
        # Your custom embedding logic here
        return [compute_embedding(doc) for doc in input]
Filtering with Where Clauses
Combine semantic search with metadata filters:
results = collection.query(
    query_texts=["database technology"],
    n_results=5,
    where={"topic": {"$eq": "databases"}},
    where_document={"$contains": "vector"}
)
The where clause filters on metadata fields. The where_document clause filters on document text content. Both narrow results before ranking by similarity.
Updating and Deleting
Update existing documents by ID:
collection.update(
    ids=["doc-1"],
    documents=["Updated content about vector databases and embeddings."],
    metadatas=[{"source": "tutorial", "topic": "databases", "version": 2}]
)
Delete by ID or by filter:
collection.delete(ids=["doc-3"])
collection.delete(where={"topic": "ml"})
Persistence and Data Management
With PersistentClient, data is automatically saved to the specified directory. Inspect your collections:
# List all collections
print(client.list_collections())
# Get collection count
print(collection.count())
# Peek at stored data
print(collection.peek(limit=3))
FAQ
Does Chroma DB work without an internet connection?
Yes. The default embedding model downloads once and is cached locally. After that initial download, Chroma runs completely offline. If you use OpenAI or another cloud embedding function, those API calls require internet access, but the database itself is fully local.
How does Chroma DB compare to Pinecone for production use?
Chroma excels at prototyping, local development, and small-to-medium workloads. Pinecone is better suited for large-scale production with managed infrastructure, SLAs, and automatic scaling. Many teams prototype with Chroma locally and migrate to a managed solution when they need cloud-scale performance.
Can I use Chroma DB with LangChain or LlamaIndex?
Yes. Both frameworks have built-in Chroma integrations. In LangChain, use Chroma.from_documents() to create a vectorstore. In LlamaIndex, use ChromaVectorStore as your storage backend. The integration handles collection management and querying automatically.
#ChromaDB #VectorDatabase #Prototyping #Embeddings #Python #AgenticAI #LearnAI #AIEngineering