Chroma DB Tutorial: Local-First Vector Database for Prototyping and Development
Get started with Chroma DB for local vector search — learn to create collections, add documents with auto-embedding, query by similarity, and persist data to disk for rapid AI prototyping.
Why Chroma DB for Local Development
When you are building a RAG pipeline or semantic search feature, the last thing you want is to configure cloud credentials and manage remote infrastructure just to test your embedding logic. Chroma DB solves this by running entirely in-process — import it, create a collection, and start querying. No server, no API keys, no network calls.
Chroma is an open-source embedding database that prioritizes developer experience. It handles embedding generation automatically (using built-in embedding functions), stores vectors alongside your documents and metadata, and persists everything to a local directory. When you are ready for production, Chroma also offers a client-server mode.
Installation
Install Chroma with pip:

pip install chromadb

That is all you need. No Docker containers, no external services.

For context, this is the retrieval pipeline Chroma slots into: documents are chunked, embedded, tagged with metadata, and indexed; at query time the query is embedded, matched against the index, and narrowed by metadata filters:

flowchart TD
DOC(["Document"])
CHUNK["Chunker<br/>recursive plus overlap"]
EMB["Embedding model"]
META["Attach metadata<br/>source, page, tenant"]
INDEX[("HNSW or IVF index<br/>in vector store")]
Q(["Query"])
QEMB["Embed query"]
SEARCH["ANN search<br/>cosine similarity"]
FILTER["Metadata filter<br/>tenant or date"]
HITS(["Top-k chunks"])
DOC --> CHUNK --> EMB --> META --> INDEX
Q --> QEMB --> SEARCH
INDEX --> SEARCH --> FILTER --> HITS
style INDEX fill:#4f46e5,stroke:#4338ca,color:#fff
style HITS fill:#059669,stroke:#047857,color:#fff
Creating a Client and Collection
A collection in Chroma is analogous to a table. It holds documents, their embeddings, metadata, and IDs:
import chromadb
# In-memory client (data lost when process exits)
client = chromadb.Client()
# Persistent client (data saved to disk)
client = chromadb.PersistentClient(path="./chroma_data")
# Create or get a collection
collection = client.get_or_create_collection(
    name="articles",
    metadata={"hnsw:space": "cosine"}
)
The hnsw:space metadata key sets the distance metric. Options are cosine, l2 (squared Euclidean, the default), and ip (inner product).
Adding Documents
Chroma can automatically generate embeddings for your documents using its default embedding function (a local sentence-transformer model):
collection.add(
    documents=[
        "Vector databases store high-dimensional embeddings.",
        "PostgreSQL is a relational database management system.",
        "Transformers use attention mechanisms for sequence modeling.",
    ],
    metadatas=[
        {"source": "tutorial", "topic": "databases"},
        {"source": "docs", "topic": "databases"},
        {"source": "paper", "topic": "ml"},
    ],
    ids=["doc-1", "doc-2", "doc-3"]
)
No separate embedding API call needed. Chroma downloads and runs a lightweight model (all-MiniLM-L6-v2 by default) the first time you add documents.
Querying by Similarity
Pass a query string and Chroma handles embedding and similarity search:
results = collection.query(
    query_texts=["How do vector databases work?"],
    n_results=2
)

for i, doc in enumerate(results["documents"][0]):
    distance = results["distances"][0][i]
    metadata = results["metadatas"][0][i]
    print(f"Result {i+1} (distance: {distance:.4f}): {doc}")
    print(f"  Metadata: {metadata}")
You can also query with pre-computed embeddings using query_embeddings instead of query_texts.
Using Custom Embedding Functions
Swap the default model for OpenAI, Cohere, or any custom function:
from chromadb.utils.embedding_functions import OpenAIEmbeddingFunction
openai_ef = OpenAIEmbeddingFunction(
    api_key="sk-...",
    model_name="text-embedding-3-small"
)

collection = client.get_or_create_collection(
    name="articles_openai",
    embedding_function=openai_ef
)
Or create your own embedding function:
from chromadb import Documents, EmbeddingFunction, Embeddings
class MyEmbeddingFunction(EmbeddingFunction):
    def __call__(self, input: Documents) -> Embeddings:
        # Your custom embedding logic here
        return [compute_embedding(doc) for doc in input]
Filtering with Where Clauses
Combine semantic search with metadata filters:
results = collection.query(
    query_texts=["database technology"],
    n_results=5,
    where={"topic": {"$eq": "databases"}},
    where_document={"$contains": "vector"}
)
The where clause filters on metadata fields. The where_document clause filters on document text content. Both narrow results before ranking by similarity.
Updating and Deleting
Update existing documents by ID:
collection.update(
    ids=["doc-1"],
    documents=["Updated content about vector databases and embeddings."],
    metadatas=[{"source": "tutorial", "topic": "databases", "version": 2}]
)
Delete by ID or by filter:
collection.delete(ids=["doc-3"])
collection.delete(where={"topic": "ml"})
Persistence and Data Management
With PersistentClient, data is automatically saved to the specified directory. Inspect your collections:
# List all collections
print(client.list_collections())
# Get collection count
print(collection.count())
# Peek at stored data
print(collection.peek(limit=3))
FAQ
Does Chroma DB work without an internet connection?
Yes. The default embedding model downloads once and is cached locally. After that initial download, Chroma runs completely offline. If you use OpenAI or another cloud embedding function, those API calls require internet access, but the database itself is fully local.
How does Chroma DB compare to Pinecone for production use?
Chroma excels at prototyping, local development, and small-to-medium workloads. Pinecone is better suited for large-scale production with managed infrastructure, SLAs, and automatic scaling. Many teams prototype with Chroma locally and migrate to a managed solution when they need cloud-scale performance.
Can I use Chroma DB with LangChain or LlamaIndex?
Yes. Both frameworks have built-in Chroma integrations. In LangChain, use Chroma.from_documents() to create a vectorstore. In LlamaIndex, use ChromaVectorStore as your storage backend. The integration handles collection management and querying automatically.
#ChromaDB #VectorDatabase #Prototyping #Embeddings #Python #AgenticAI #LearnAI #AIEngineering