
Building a Medical Image Analysis Agent: X-Ray, Scan, and Lab Report Reading

Learn how to build an AI agent for medical image analysis that preprocesses X-rays and scans, detects findings, generates structured reports, and includes appropriate clinical disclaimers for responsible deployment.

Critical Disclaimer

This article is for educational purposes only. Medical image analysis AI must go through rigorous clinical validation, regulatory approval (FDA 510(k) or equivalent), and institutional review before any use in clinical decision-making. The code examples here demonstrate technical concepts and must never be used for actual medical diagnosis. Always consult qualified healthcare professionals for medical decisions.

Why Medical Image Analysis Matters

Radiologists in the United States read an average of one image every 3-4 seconds during a typical workday. AI assistants can help by flagging potential findings for human review, prioritizing urgent cases in the reading queue, and reducing the chance that subtle abnormalities are missed during high-volume shifts.

flowchart LR
    IMG(["DICOM Study"])
    subgraph PRE["Preprocessing"]
        LOAD["DICOM Load<br/>pydicom"]
        WIN["Windowing and<br/>Normalization"]
    end
    subgraph BRAIN["Analysis Agent"]
        ROI["Region-of-Interest<br/>Detection"]
        CLS{"Finding<br/>Classification"}
        REP["Structured Report<br/>Generation"]
    end
    subgraph OUT["Routing"]
        O1(["STAT queue"])
        O2(["Priority human review"])
        O3(["Routine queue"])
    end
    IMG --> LOAD --> WIN --> ROI --> CLS --> REP
    CLS -->|Urgent finding| O1
    CLS -->|Low confidence| O2
    CLS -->|Routine| O3
    style IMG fill:#f1f5f9,stroke:#64748b,color:#0f172a
    style CLS fill:#4f46e5,stroke:#4338ca,color:#fff
    style O1 fill:#f59e0b,stroke:#d97706,color:#1f2937
    style O2 fill:#0ea5e9,stroke:#0369a1,color:#fff
    style O3 fill:#059669,stroke:#047857,color:#fff

The technical pipeline for medical image analysis includes DICOM image loading and preprocessing, region-of-interest detection, finding classification, structured report generation, and confidence-based routing for human review.

Working with Medical Images (DICOM)

Medical images use the DICOM format, which contains both pixel data and rich metadata:

import pydicom
import numpy as np
from dataclasses import dataclass

@dataclass
class MedicalImage:
    pixel_array: np.ndarray
    modality: str          # "CR", "CT", "MR", etc.
    body_part: str
    patient_id: str
    study_date: str
    window_center: float
    window_width: float
    metadata: dict

def load_dicom(file_path: str) -> MedicalImage:
    """Load a DICOM file and extract image with metadata."""
    ds = pydicom.dcmread(file_path)

    pixels = ds.pixel_array.astype(np.float32)

    # Apply rescale slope and intercept (e.g. to Hounsfield units for CT);
    # check for both tags before using either
    if hasattr(ds, "RescaleSlope") and hasattr(ds, "RescaleIntercept"):
        pixels = pixels * float(ds.RescaleSlope) + float(ds.RescaleIntercept)

    def first_value(name: str, default: float) -> float:
        # WindowCenter/WindowWidth may be multi-valued; use the first entry
        value = getattr(ds, name, default)
        try:
            return float(value[0])
        except TypeError:
            return float(value)

    return MedicalImage(
        pixel_array=pixels,
        modality=getattr(ds, "Modality", "Unknown"),
        body_part=getattr(ds, "BodyPartExamined", "Unknown"),
        patient_id=getattr(ds, "PatientID", "Anonymous"),
        study_date=getattr(ds, "StudyDate", "Unknown"),
        window_center=first_value("WindowCenter", 0.0),
        window_width=first_value("WindowWidth", 1.0),
        metadata={
            "rows": ds.Rows,
            "columns": ds.Columns,
            "bits_stored": ds.BitsStored,
            "photometric": getattr(ds, "PhotometricInterpretation", ""),
        },
    )
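The rescale step maps stored pixel values to Hounsfield units: HU = stored value × RescaleSlope + RescaleIntercept. With the common slope of 1 and intercept of -1024, water lands at 0 HU and air near -1000 HU:

```python
def to_hounsfield(stored: float, slope: float, intercept: float) -> float:
    # HU = stored_value * RescaleSlope + RescaleIntercept
    return stored * slope + intercept

# With the common slope=1, intercept=-1024:
print(to_hounsfield(1024, 1.0, -1024.0))  # 0.0 -> water
print(to_hounsfield(24, 1.0, -1024.0))    # -1000.0 -> air
```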

Image Preprocessing for Analysis

Medical images need windowing (adjusting contrast to highlight specific tissue types) and normalization:

def apply_windowing(
    image: MedicalImage,
    window_center: float | None = None,
    window_width: float | None = None,
) -> np.ndarray:
    """Apply windowing to enhance specific tissue visibility."""
    # Use "is not None" so an explicit value of 0 is not silently discarded
    wc = window_center if window_center is not None else image.window_center
    ww = window_width if window_width is not None else image.window_width
    if ww <= 0:
        ww = 1.0  # guard against a degenerate window width

    pixels = image.pixel_array.copy()
    lower = wc - ww / 2
    upper = wc + ww / 2

    pixels = np.clip(pixels, lower, upper)
    pixels = ((pixels - lower) / (upper - lower) * 255).astype(np.uint8)

    return pixels

# Common CT window presets (center and width in Hounsfield units)
WINDOW_PRESETS = {
    "lung": {"center": -600, "width": 1500},
    "mediastinum": {"center": 40, "width": 400},
    "bone": {"center": 400, "width": 1800},
    "soft_tissue": {"center": 50, "width": 350},
}

def preprocess_for_analysis(
    image: MedicalImage,
    preset: str = "soft_tissue"
) -> np.ndarray:
    """Preprocess medical image with appropriate windowing."""
    params = WINDOW_PRESETS.get(preset, WINDOW_PRESETS["soft_tissue"])

    windowed = apply_windowing(
        image,
        window_center=params["center"],
        window_width=params["width"],
    )

    # Normalize to 0-1 range
    normalized = windowed.astype(np.float32) / 255.0

    return normalized
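To see what windowing does, here is a self-contained sketch on a tiny synthetic slice (the pixel values are made up for illustration): the lung window keeps air and lung tissue distinguishable, while the bone window clips both to black.

```python
import numpy as np

# Synthetic slice in Hounsfield units: mostly air (-1000),
# a patch of lung tissue (-700), one bone pixel (+700)
slice_hu = np.full((4, 4), -1000.0)
slice_hu[1:3, 1:3] = -700.0
slice_hu[0, 0] = 700.0

def window(pixels: np.ndarray, center: float, width: float) -> np.ndarray:
    # Clip to the window, then scale the window range onto 0-255
    lower, upper = center - width / 2, center + width / 2
    clipped = np.clip(pixels, lower, upper)
    return ((clipped - lower) / (upper - lower) * 255).astype(np.uint8)

lung = window(slice_hu, center=-600, width=1500)  # lung preset
bone = window(slice_hu, center=400, width=1800)   # bone preset

print(lung[1, 1], bone[1, 1])  # lung tissue: 110 (visible) vs 0 (clipped)
```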

Finding Detection with Region Proposals

Use a region proposal approach to identify areas of interest for further analysis:

import cv2

@dataclass
class Finding:
    region: tuple         # (x, y, w, h)
    finding_type: str     # "opacity", "nodule", "fracture", etc.
    confidence: float
    description: str
    severity: str         # "normal", "mild", "moderate", "severe"
    requires_review: bool

def detect_regions_of_interest(
    image: np.ndarray,
    sensitivity: float = 0.5,
) -> list[dict]:
    """Detect regions that may contain findings."""
    img_uint8 = (image * 255).astype(np.uint8)

    # Bilateral filter preserves edges while smoothing
    filtered = cv2.bilateralFilter(img_uint8, 9, 75, 75)

    # Detect potential abnormalities via intensity analysis
    mean_intensity = np.mean(filtered)
    std_intensity = np.std(filtered)

    # Threshold for unusual intensity regions
    threshold = mean_intensity + sensitivity * std_intensity
    _, binary = cv2.threshold(filtered, int(threshold), 255, cv2.THRESH_BINARY)

    contours, _ = cv2.findContours(
        binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE
    )

    regions = []
    for contour in contours:
        area = cv2.contourArea(contour)
        if area < 100:  # skip tiny regions (pixel-area threshold), likely noise
            continue

        x, y, w, h = cv2.boundingRect(contour)
        region_pixels = image[y:y+h, x:x+w]

        regions.append({
            "bbox": (x, y, w, h),
            "area": area,
            "mean_intensity": float(np.mean(region_pixels)),
            "std_intensity": float(np.std(region_pixels)),
        })

    return regions
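Region areas come back in pixels. The DICOM PixelSpacing attribute (mm per pixel, row spacing then column spacing) converts them to physical units, which is how size thresholds such as nodule diameter are usually expressed clinically. A minimal conversion helper:

```python
def region_area_mm2(area_px: float, pixel_spacing: tuple[float, float]) -> float:
    """Convert a pixel-count area to mm^2 using DICOM PixelSpacing."""
    # PixelSpacing is (row spacing, column spacing) in mm/pixel
    row_mm, col_mm = pixel_spacing
    return area_px * row_mm * col_mm

# 400 px at 0.5 mm/px in each direction -> 100 mm^2
print(region_area_mm2(400, (0.5, 0.5)))  # 100.0
```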

LLM-Powered Finding Classification

Send detected regions and their features to an LLM for clinical interpretation. This is where the disclaimers matter most:

from openai import OpenAI
from pydantic import BaseModel

class FindingReport(BaseModel):
    findings: list[Finding]
    overall_impression: str
    recommendation: str
    confidence_level: str
    disclaimer: str

def classify_findings(
    regions: list[dict],
    image_metadata: dict,
    modality: str,
    body_part: str,
) -> FindingReport:
    """Classify detected regions using an LLM."""
    client = OpenAI()

    region_desc = "\n".join(
        f"Region {i+1}: bbox={r['bbox']}, area={r['area']:.0f}, "
        f"mean_intensity={r['mean_intensity']:.3f}"
        for i, r in enumerate(regions)
    )

    response = client.beta.chat.completions.parse(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": (
                "You are a medical image analysis assistant. Analyze the "
                "detected regions from a medical image and provide "
                "findings. ALWAYS include the disclaimer that this is an "
                "AI-assisted analysis that requires review by a qualified "
                "radiologist. NEVER provide a definitive diagnosis. "
                "Use language like 'suggestive of', 'consistent with', "
                "'cannot exclude'. Set requires_review=true for any "
                "finding with confidence below 0.8."
            )},
            {"role": "user", "content": (
                f"Modality: {modality}\n"
                f"Body part: {body_part}\n"
                f"Image size: {image_metadata.get('rows')}x"
                f"{image_metadata.get('columns')}\n"
                f"Detected regions:\n{region_desc}"
            )},
        ],
        response_format=FindingReport,
    )

    return response.choices[0].message.parsed
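Prompt instructions alone are soft constraints, so it is worth enforcing the review rule deterministically after parsing as well. A sketch (the 0.8 threshold mirrors the system prompt; the severity override is an added assumption):

```python
REVIEW_THRESHOLD = 0.8  # mirrors the cutoff stated in the system prompt

def needs_review(confidence: float, severity: str) -> bool:
    # Low confidence always routes to a human; severe findings do regardless
    return confidence < REVIEW_THRESHOLD or severity == "severe"

print(needs_review(0.75, "mild"))    # True  (below threshold)
print(needs_review(0.90, "severe"))  # True  (severity override)
print(needs_review(0.90, "mild"))    # False
```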

Structured Report Generation

Generate a standardized radiology-style report:


from datetime import datetime

def generate_structured_report(
    finding_report: FindingReport,
    image: MedicalImage,
) -> str:
    """Generate a structured clinical report."""
    report = f"""
MEDICAL IMAGE ANALYSIS REPORT
{'=' * 50}

DISCLAIMER: {finding_report.disclaimer}

PATIENT ID: {image.patient_id}
STUDY DATE: {image.study_date}
MODALITY: {image.modality}
BODY PART: {image.body_part}
ANALYSIS DATE: {datetime.utcnow().strftime("%Y-%m-%d %H:%M UTC")}

FINDINGS:
"""

    for i, finding in enumerate(finding_report.findings, 1):
        review_flag = " [REQUIRES HUMAN REVIEW]" if finding.requires_review else ""
        report += f"""
  {i}. {finding.finding_type.upper()}{review_flag}
     Location: {finding.region}
     Severity: {finding.severity}
     Confidence: {finding.confidence:.0%}
     Description: {finding.description}
"""

    report += f"""
IMPRESSION:
  {finding_report.overall_impression}

RECOMMENDATION:
  {finding_report.recommendation}

CONFIDENCE LEVEL: {finding_report.confidence_level}

{'=' * 50}
AI-ASSISTED ANALYSIS — NOT A CLINICAL DIAGNOSIS
This report must be reviewed by a qualified radiologist.
"""

    return report

Confidence-Based Routing

Route findings based on confidence to appropriate review queues:

def route_for_review(finding_report: FindingReport) -> dict:
    """Route findings to appropriate review queues."""
    urgent = [f for f in finding_report.findings
              if f.severity in ("moderate", "severe") and f.confidence > 0.6]
    review = [f for f in finding_report.findings if f.requires_review]
    routine = [f for f in finding_report.findings
               if not f.requires_review and f.severity in ("normal", "mild")]

    return {
        "urgent_queue": len(urgent) > 0,
        "urgent_findings": len(urgent),
        "review_findings": len(review),
        "routine_findings": len(routine),
        "recommended_priority": (
            "STAT" if urgent else "PRIORITY" if review else "ROUTINE"
        ),
    }
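The priority logic can be exercised with stand-in findings. In this self-contained sketch, MiniFinding is a pared-down stand-in for the Finding dataclass defined earlier:

```python
from dataclasses import dataclass

@dataclass
class MiniFinding:  # pared-down stand-in for the Finding dataclass above
    severity: str
    confidence: float
    requires_review: bool

def priority(findings: list[MiniFinding]) -> str:
    # Same rules as route_for_review's recommended_priority field
    urgent = any(f.severity in ("moderate", "severe") and f.confidence > 0.6
                 for f in findings)
    review = any(f.requires_review for f in findings)
    return "STAT" if urgent else "PRIORITY" if review else "ROUTINE"

print(priority([MiniFinding("severe", 0.9, True)]))    # STAT
print(priority([MiniFinding("mild", 0.5, True)]))      # PRIORITY
print(priority([MiniFinding("normal", 0.95, False)]))  # ROUTINE
```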

FAQ

What regulatory approvals are needed for medical AI?

In the United States, medical AI software typically requires FDA 510(k) clearance or De Novo classification. The EU requires CE marking under the Medical Device Regulation (MDR). These processes involve clinical validation studies, risk analysis, quality management systems, and post-market surveillance plans. The regulatory path can take 6-24 months and significant investment.

How do I handle patient data privacy?

All medical image processing must comply with HIPAA (US), GDPR (EU), or equivalent regulations. De-identify DICOM images by removing patient name, ID, and other PHI from metadata before processing. Never send identifiable patient data to external APIs. Use on-premise or private cloud deployments with encryption at rest and in transit.

Can general-purpose vision models replace specialized medical AI models?

General models like GPT-4o can describe what they see in medical images, but they lack the clinical training data and validation needed for reliable diagnosis. Specialized models trained on curated medical datasets with radiologist annotations significantly outperform general models. The best approach combines specialized detection models with LLMs for report generation.


#MedicalAI #XRayAnalysis #HealthcareAI #ClinicalAI #DICOM #Radiology #AgenticAI #Python
