Build a Personal Finance Agent in Python: Budget Tracking, Categorization, and Advice

Learn how to build a complete personal finance AI agent that connects to bank data, auto-categorizes transactions, analyzes spending patterns, and generates actionable budget advice using Python and the OpenAI Agents SDK.

Why Build a Personal Finance Agent

Managing personal finances typically involves logging into multiple bank portals, manually categorizing transactions in spreadsheets, and guessing where your money actually goes. A personal finance agent automates this entire workflow. It ingests transaction data, classifies spending into categories, detects anomalies, and provides tailored budget advice — all through a conversational interface.

In this tutorial you will build a working finance agent that mocks bank API responses, categorizes transactions with a rule-based engine (leaving unmatched merchants for an LLM fallback), summarizes spending against budgets, flags unusual charges, and generates personalized advice.

Project Architecture

The system has four layers:

flowchart LR
    INPUT(["User intent"])
    PARSE["Parse plus<br/>classify"]
    PLAN["Plan and tool<br/>selection"]
    AGENT["Agent loop<br/>LLM plus tools"]
    GUARD{"Guardrails<br/>and policy"}
    EXEC["Execute and<br/>verify result"]
    OBS[("Trace and metrics")]
    OUT(["Outcome plus<br/>next action"])
    INPUT --> PARSE --> PLAN --> AGENT --> GUARD
    GUARD -->|Pass| EXEC --> OUT
    GUARD -->|Fail| AGENT
    AGENT --> OBS
    style AGENT fill:#4f46e5,stroke:#4338ca,color:#fff
    style GUARD fill:#f59e0b,stroke:#d97706,color:#1f2937
    style OBS fill:#ede9fe,stroke:#7c3aed,color:#1e1b4b
    style OUT fill:#059669,stroke:#047857,color:#fff
  1. Data Layer: a mock bank API that returns realistic transaction data
  2. Categorization Engine: rule-based matching, with unmatched merchants left for an LLM fallback
  3. Analysis Module: spending summaries, budget comparison, and anomaly detection
  4. Agent Layer: an OpenAI Agents SDK agent with tools wired to the analysis functions

Step 1: Set Up the Project

Create the project structure and install dependencies:

mkdir finance-agent && cd finance-agent
python -m venv venv && source venv/bin/activate
pip install openai-agents pydantic

Create the directory layout:

mkdir -p src
touch src/__init__.py src/bank_api.py src/categorizer.py src/analyzer.py src/agent.py

Step 2: Build the Mock Bank API

The mock API generates realistic transaction data that simulates what you would receive from a real banking integration like Plaid or Yodlee.

# src/bank_api.py
import random
from datetime import datetime, timedelta
from pydantic import BaseModel

class Transaction(BaseModel):
    id: str
    date: str
    merchant: str
    amount: float
    raw_category: str

MERCHANTS = {
    "groceries": [
        ("Whole Foods Market", 45.0, 120.0),
        ("Trader Joe's", 30.0, 85.0),
        ("Costco Wholesale", 80.0, 250.0),
    ],
    "dining": [
        ("Chipotle Mexican Grill", 10.0, 18.0),
        ("Starbucks Coffee", 4.0, 8.0),
        ("DoorDash Delivery", 15.0, 45.0),
    ],
    "transport": [
        ("Uber Trip", 8.0, 35.0),
        ("Shell Gas Station", 30.0, 60.0),
        ("City Parking", 5.0, 20.0),
    ],
    "utilities": [
        ("Electric Company", 80.0, 150.0),
        ("Internet Provider", 59.99, 59.99),
        ("Water Utility", 30.0, 55.0),
    ],
    "entertainment": [
        ("Netflix Subscription", 15.49, 15.49),
        ("Spotify Premium", 10.99, 10.99),
        ("AMC Theatres", 12.0, 25.0),
    ],
    "shopping": [
        ("Amazon.com", 15.0, 200.0),
        ("Target Store", 20.0, 100.0),
        ("Best Buy Electronics", 50.0, 500.0),
    ],
}

def fetch_transactions(days: int = 30) -> list[Transaction]:
    transactions = []
    start_date = datetime.now() - timedelta(days=days)

    for i in range(random.randint(40, 70)):
        category = random.choice(list(MERCHANTS.keys()))
        merchant_name, min_amt, max_amt = random.choice(
            MERCHANTS[category]
        )
        txn_date = start_date + timedelta(
            days=random.randint(0, days)
        )
        transactions.append(Transaction(
            id=f"txn_{i:04d}",
            date=txn_date.strftime("%Y-%m-%d"),
            merchant=merchant_name,
            amount=round(random.uniform(min_amt, max_amt), 2),
            raw_category=category,
        ))

    return sorted(transactions, key=lambda t: t.date)
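
Because the mock draws from the global random module, every call returns a different data set. For repeatable tests, one option is a private random.Random instance seeded by the caller. The sketch below is a stripped-down illustration, not a drop-in replacement: sample_amounts and its seed parameter are hypothetical names, and it generates only amounts rather than full Transaction objects.

```python
import random
from typing import Optional

def sample_amounts(
    count: int, low: float, high: float, seed: Optional[int] = None
) -> list[float]:
    """Draw `count` amounts in [low, high]; a fixed seed makes the draw repeatable."""
    # A private generator leaves the global random state untouched.
    rng = random.Random(seed)
    return [round(rng.uniform(low, high), 2) for _ in range(count)]

first = sample_amounts(5, 4.0, 8.0, seed=42)
second = sample_amounts(5, 4.0, 8.0, seed=42)
print(first == second)  # same seed, same amounts
```

The same pattern could be applied to fetch_transactions() by passing an optional seed and replacing the random.* calls with calls on the private generator.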

Step 3: Build the Transaction Categorizer

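The categorizer tries keyword matching first. In the version below, a merchant that matches no keyword is returned as "Uncategorized"; that branch is where an LLM fallback would plug in, so the model is called only for unrecognized merchants and API costs stay low.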

# src/categorizer.py
from src.bank_api import Transaction

CATEGORY_RULES: dict[str, list[str]] = {
    "Groceries": ["whole foods", "trader joe", "costco", "kroger", "safeway"],
    "Dining": ["chipotle", "starbucks", "doordash", "grubhub", "mcdonald"],
    "Transport": ["uber", "lyft", "shell", "chevron", "parking"],
    "Utilities": ["electric", "internet", "water", "gas company", "power"],
    "Entertainment": ["netflix", "spotify", "hulu", "amc", "disney"],
    "Shopping": ["amazon", "target", "best buy", "walmart", "ebay"],
}

def categorize_transaction(txn: Transaction) -> str:
    merchant_lower = txn.merchant.lower()
    for category, keywords in CATEGORY_RULES.items():
        if any(kw in merchant_lower for kw in keywords):
            return category
    return "Uncategorized"

def categorize_all(
    transactions: list[Transaction],
) -> dict[str, list[Transaction]]:
    categorized: dict[str, list[Transaction]] = {}
    for txn in transactions:
        cat = categorize_transaction(txn)
        categorized.setdefault(cat, []).append(txn)
    return categorized
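
The "Uncategorized" branch above is the natural place for the LLM fallback. A minimal sketch of that hook follows, assuming the fallback is passed in as a plain callable; categorize_merchant, the fallback parameter, and fake_llm are hypothetical names, and the rule table is shortened for the example. fake_llm stands in for a real model call so the example runs offline.

```python
from typing import Callable, Optional

CATEGORY_RULES = {
    "Dining": ["chipotle", "starbucks"],
    "Transport": ["uber", "lyft"],
}

def categorize_merchant(
    merchant: str,
    fallback: Optional[Callable[[str], str]] = None,
) -> str:
    """Try keyword rules first; call `fallback` only when no rule matches."""
    merchant_lower = merchant.lower()
    for category, keywords in CATEGORY_RULES.items():
        if any(kw in merchant_lower for kw in keywords):
            return category
    if fallback is None:
        return "Uncategorized"
    guess = fallback(merchant)
    # Accept the fallback's answer only if it names a known category.
    return guess if guess in CATEGORY_RULES else "Uncategorized"

def fake_llm(merchant: str) -> str:
    # Stand-in for a real model call (hypothetical behavior).
    return "Dining" if "cafe" in merchant.lower() else "Something Else"

print(categorize_merchant("Uber Trip", fake_llm))       # Transport
print(categorize_merchant("Blue Door Cafe", fake_llm))  # Dining
print(categorize_merchant("ACME Holdings", fake_llm))   # Uncategorized
```

Validating the fallback's answer against the known categories keeps a free-text model reply from creating categories that have no budget.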

Step 4: Build the Spending Analyzer

# src/analyzer.py
from src.bank_api import Transaction
from src.categorizer import categorize_all

DEFAULT_BUDGETS = {
    "Groceries": 500.0,
    "Dining": 300.0,
    "Transport": 200.0,
    "Utilities": 300.0,
    "Entertainment": 100.0,
    "Shopping": 400.0,
}

def spending_summary(
    transactions: list[Transaction],
) -> dict[str, dict]:
    categorized = categorize_all(transactions)
    summary = {}
    for cat, txns in categorized.items():
        total = sum(t.amount for t in txns)
        budget = DEFAULT_BUDGETS.get(cat, 0)
        summary[cat] = {
            "total_spent": round(total, 2),
            "transaction_count": len(txns),
            "budget": budget,
            "remaining": round(budget - total, 2),
            "pct_used": round((total / budget) * 100, 1) if budget > 0 else 0,
        }
    return summary

def detect_anomalies(
    transactions: list[Transaction],
) -> list[str]:
    from collections import defaultdict
    by_merchant: dict[str, list[float]] = defaultdict(list)
    for txn in transactions:
        by_merchant[txn.merchant].append(txn.amount)

    alerts = []
    for merchant, amounts in by_merchant.items():
        if len(amounts) < 2:
            continue
        avg = sum(amounts) / len(amounts)
        for amt in amounts:
            if amt > avg * 2.5:
                alerts.append(
                    f"Unusual charge of {amt:.2f} dollars at "
                    f"{merchant} (avg is {avg:.2f})"
                )
    return alerts
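
One limitation of the mean-based check: the suspect charge is part of its own baseline, so with only a few charges per merchant a large outlier raises the average and can slip under the 2.5x threshold. A possible variant, sketched here as an alternative rather than the tutorial's method, uses the median as the baseline. find_outliers is a hypothetical helper, and the 2.5 factor is carried over from the code above.

```python
from statistics import median

def find_outliers(amounts: list[float], factor: float = 2.5) -> list[float]:
    """Return amounts greater than `factor` times the median of the list."""
    if len(amounts) < 3:
        return []  # too little history to call anything unusual
    baseline = median(amounts)
    return [amt for amt in amounts if amt > baseline * factor]

charges = [5.0, 6.0, 40.0]
mean = sum(charges) / len(charges)  # 17.0, pulled up by the 40.0 charge
mean_flags = [c for c in charges if c > mean * 2.5]

print(mean_flags)              # [] because 40.0 is below 17.0 * 2.5
print(find_outliers(charges))  # [40.0] because the median is 6.0
```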

Step 5: Wire Everything Into the Agent

# src/agent.py
import asyncio
import json
from agents import Agent, Runner, function_tool
from src.bank_api import fetch_transactions
from src.analyzer import spending_summary, detect_anomalies

@function_tool
def get_spending_report(days: int = 30) -> str:
    """Fetch transactions and return a spending summary."""
    txns = fetch_transactions(days)
    summary = spending_summary(txns)
    return json.dumps(summary, indent=2)

@function_tool
def get_anomaly_alerts(days: int = 30) -> str:
    """Detect unusual transactions in recent history."""
    txns = fetch_transactions(days)
    alerts = detect_anomalies(txns)
    if not alerts:
        return "No anomalies detected."
    return "\n".join(alerts)

finance_agent = Agent(
    name="Personal Finance Advisor",
    instructions="""You are a personal finance advisor agent.
Use the available tools to analyze the user's spending.
Provide specific, actionable advice based on their data.
Always reference actual numbers from the reports.
If spending exceeds budget in a category, suggest concrete
ways to reduce it.""",
    tools=[get_spending_report, get_anomaly_alerts],
)

async def main():
    result = await Runner.run(
        finance_agent,
        "Show me my spending for the last 30 days and flag "
        "anything unusual. Then give me budget advice.",
    )
    print(result.final_output)

if __name__ == "__main__":
    asyncio.run(main())

Set the OPENAI_API_KEY environment variable, then run the agent from the project root:

python -m src.agent

Given this prompt, the agent typically calls both tools, combines the spending report with the anomaly alerts, and produces a financial summary with tailored advice.
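
One caveat with the mock: each tool calls fetch_transactions() on its own, and the mock is random, so the spending report and the anomaly alerts are computed from different data sets. A possible fix is to cache one snapshot per days value. The sketch below shows the idea with functools.lru_cache; fetch_amounts and cached_fetch are hypothetical stand-ins, not part of the tutorial code.

```python
import random
from functools import lru_cache

def fetch_amounts(days: int) -> tuple[float, ...]:
    # Stand-in for fetch_transactions(): different values on every call.
    return tuple(round(random.uniform(5.0, 100.0), 2) for _ in range(days))

@lru_cache(maxsize=8)
def cached_fetch(days: int) -> tuple[float, ...]:
    """Return the same snapshot for repeated calls with the same `days`."""
    return fetch_amounts(days)

report_data = cached_fetch(30)
alert_data = cached_fetch(30)
print(report_data is alert_data)  # True: both callers share one snapshot
```

Calling cached_fetch.cache_clear() discards the snapshot so that the next call fetches fresh data.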

Key Design Decisions

Rule-based categorization first. Calling the LLM for every transaction is wasteful. The keyword matcher covers every merchant in the mock data, so an LLM fallback is only needed for merchants the rules do not recognize. This keeps latency and cost under control.

Structured tool outputs. The spending report tool returns JSON so the agent can read numbers precisely rather than guessing from free text, and the anomaly tool returns one alert per line with the exact amounts. This makes the advice data-driven rather than generic.

Configurable budgets. The DEFAULT_BUDGETS dictionary is the starting point. In a production system you would store these per-user in a database and let the agent update them via an additional tool.
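
As a rough sketch of that idea, the functions below keep per-user overrides on top of the defaults. The in-memory dictionary stands in for a database table, and get_budgets, set_budget, and the user_id parameter are hypothetical names; in the agent, set_budget could be exposed as one more tool.

```python
DEFAULT_BUDGETS = {"Groceries": 500.0, "Dining": 300.0}

# In-memory stand-in for a per-user table in a real database.
_user_budgets: dict[str, dict[str, float]] = {}

def get_budgets(user_id: str) -> dict[str, float]:
    """Return the user's budgets, falling back to the defaults."""
    return {**DEFAULT_BUDGETS, **_user_budgets.get(user_id, {})}

def set_budget(user_id: str, category: str, amount: float) -> str:
    """Store a new monthly budget for one category and confirm it."""
    if amount < 0:
        raise ValueError("budget must be non-negative")
    _user_budgets.setdefault(user_id, {})[category] = amount
    return f"Budget for {category} set to {amount:.2f}"

print(set_budget("user_1", "Dining", 250.0))
print(get_budgets("user_1"))  # Dining overridden for user_1
print(get_budgets("user_2"))  # defaults only
```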

FAQ

How would I connect this to a real bank API instead of mock data?

Replace fetch_transactions() with a client library for Plaid, Yodlee, or MX. Each of these services returns transaction objects with merchant names, amounts, and dates in a similar shape to our mock. The categorizer and analyzer code remains unchanged because they depend only on the Transaction model, not on the data source.
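
A thin adapter is usually enough to bridge the two shapes. The sketch below maps one provider record to the tutorial's fields. The record keys (transaction_id, posted_date, merchant_name, description) are invented for illustration and do not describe any specific provider's response format, and the dataclass is a stand-in for the pydantic Transaction model.

```python
from dataclasses import dataclass

@dataclass
class Transaction:
    # Stand-in for the pydantic Transaction model in src/bank_api.py.
    id: str
    date: str
    merchant: str
    amount: float
    raw_category: str

def to_transaction(record: dict) -> Transaction:
    """Map one provider record (hypothetical field names) to a Transaction."""
    return Transaction(
        id=str(record["transaction_id"]),
        date=record["posted_date"],
        # Fall back to the raw description when no clean merchant name exists.
        merchant=record.get("merchant_name") or record.get("description", "Unknown"),
        amount=float(record["amount"]),
        raw_category=record.get("category", ""),
    )

sample = {
    "transaction_id": "abc123",
    "posted_date": "2025-01-15",
    "merchant_name": None,
    "description": "SQ *BLUE DOOR CAFE",
    "amount": "12.50",
}
print(to_transaction(sample))
```

Check the provider's documentation for its sign convention on amounts before feeding the results to the analyzer, since the analyzer treats every amount as money spent.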

Can the agent learn my spending patterns over time?

Yes. Add a persistence layer — a SQLite database or JSON file — that stores categorized transactions and monthly summaries. Create an additional tool that retrieves historical trends, allowing the agent to compare current month spending against your three-month or six-month average and give progressively more personalized advice.
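
A minimal sketch of such a persistence layer, using the standard-library sqlite3 module, might look like this. The monthly_totals table and the save_total and average_before helpers are hypothetical, and months are assumed to be stored as "YYYY-MM" strings so that text comparison matches chronological order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # use a file path to persist between runs
conn.execute(
    "CREATE TABLE monthly_totals ("
    "month TEXT, category TEXT, total REAL, "
    "PRIMARY KEY (month, category))"
)

def save_total(month: str, category: str, total: float) -> None:
    """Insert or overwrite the total for one month and category."""
    conn.execute(
        "INSERT OR REPLACE INTO monthly_totals VALUES (?, ?, ?)",
        (month, category, total),
    )
    conn.commit()

def average_before(category: str, month: str, months: int = 3) -> float:
    """Average the `months` most recent totals earlier than `month`."""
    rows = conn.execute(
        "SELECT total FROM monthly_totals "
        "WHERE category = ? AND month < ? "
        "ORDER BY month DESC LIMIT ?",
        (category, month, months),
    ).fetchall()
    return sum(r[0] for r in rows) / len(rows) if rows else 0.0

for m, total in [("2025-01", 280.0), ("2025-02", 310.0), ("2025-03", 295.0)]:
    save_total(m, "Dining", total)
save_total("2025-04", "Dining", 360.0)

print(average_before("Dining", "2025-04"))  # 295.0
```

A history tool could return both the current total and this average, letting the agent say how far the current month sits above or below the recent norm.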

How do I handle multiple bank accounts?

Extend fetch_transactions() to accept an account_id parameter and merge results from multiple sources. Add a get_accounts tool so the agent can list available accounts and let the user specify which ones to analyze. The analyzer already works on any list of transactions regardless of source.
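
The merge step can be sketched as follows, assuming each account has its own fetch function. fetch_all_accounts, the fetchers mapping, and the account IDs are hypothetical, and plain dictionaries stand in for Transaction objects to keep the example self-contained.

```python
from typing import Callable

# Each fetcher stands in for one account's transaction source.
Fetcher = Callable[[int], list[dict]]

def fetch_all_accounts(
    fetchers: dict[str, Fetcher],
    account_ids: list[str],
    days: int = 30,
) -> list[dict]:
    """Fetch from the selected accounts, tag each row, and sort by date."""
    merged = []
    for account_id in account_ids:
        for txn in fetchers[account_id](days):
            merged.append({**txn, "account_id": account_id})
    return sorted(merged, key=lambda t: t["date"])

fetchers = {
    "checking": lambda days: [
        {"date": "2025-03-02", "merchant": "Uber Trip", "amount": 14.0}
    ],
    "credit": lambda days: [
        {"date": "2025-03-01", "merchant": "Netflix Subscription", "amount": 15.49}
    ],
}
for row in fetch_all_accounts(fetchers, ["checking", "credit"]):
    print(row["date"], row["account_id"], row["merchant"])
```

Tagging each row with its account ID lets the agent answer per-account questions later without refetching.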

