Playwright Selectors Deep Dive: CSS, XPath, Text, and Role-Based Element Finding

Explore every Playwright selector engine in depth — CSS, XPath, text, role-based, and custom selectors — with best practices for building resilient AI agent locators that survive page changes.

Selectors Are the Eyes of Your AI Agent

The most common reason browser automation scripts break is fragile selectors. A class name changes, a div gets restructured, and suddenly your AI agent cannot find the button it needs to click. Playwright addresses this with multiple selector engines and a locator API designed for resilience.

This post covers every selector strategy available in Playwright, with guidance on which to use for AI agents that need to work reliably across page updates.

CSS Selectors

CSS selectors are the most familiar and widely used. Playwright supports the full CSS selector specification:

from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.chromium.launch()
    page = browser.new_page()
    page.goto("https://example.com")

    # By ID
    page.locator("#main-content").text_content()

    # By class
    page.locator(".article-title").text_content()

    # By tag and class
    page.locator("div.container").text_content()

    # By attribute
    page.locator('[data-testid="submit-btn"]').click()
    page.locator('input[type="email"]').fill("user@example.com")

    # Descendant selector
    page.locator("nav ul li a").first.click()

    # Direct child
    page.locator("ul > li:first-child").text_content()

    # Nth child
    page.locator("table tr:nth-child(3) td:nth-child(2)").text_content()

    # Attribute contains
    page.locator('[class*="btn-primary"]').click()

    # Attribute starts with
    page.locator('[href^="/products"]').click()

    browser.close()

CSS selectors are fast and well-understood, but they are tightly coupled to the DOM structure. When class names or element nesting change, selectors that depend on them stop matching.
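
One way to reduce that coupling is to anchor on attributes that exist for automation, such as data-testid, instead of layout classes. The sketch below is a small helper that builds an attribute selector and escapes the value so that quotes in it cannot break the selector string. The name attr_selector is hypothetical and not part of Playwright; it handles only quotes and backslashes.

```python
def attr_selector(name: str, value: str, tag: str = "") -> str:
    """Build a CSS attribute selector such as [data-testid="submit-btn"].

    Hypothetical helper for illustration; not part of Playwright.
    """
    # Inside a double-quoted CSS string, backslashes and double quotes
    # must each be escaped with a backslash.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'{tag}[{name}="{escaped}"]'


# Usage with a Playwright page (assumed to exist):
#   page.locator(attr_selector("data-testid", "submit-btn")).click()
#   page.locator(attr_selector("type", "email", tag="input")).fill("...")
```

Building the selector in one place also means an agent never concatenates untrusted text into a selector by hand.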

XPath Selectors

XPath provides more expressive querying power, especially for navigating up the DOM tree to a parent or ancestor (something ordinary CSS descendant and child selectors cannot do):

# Basic XPath
page.locator("xpath=//h1").text_content()

# XPath with attribute
page.locator('xpath=//input[@name="email"]').fill("user@example.com")

# XPath with text content
page.locator('xpath=//button[contains(text(), "Submit")]').click()

# Navigate to parent
page.locator('xpath=//span[@class="price"]/parent::div').text_content()

# Navigate to sibling
page.locator(
    'xpath=//label[text()="Email"]/following-sibling::input'
).fill("user@example.com")

# XPath with multiple conditions
page.locator(
    'xpath=//div[@class="product" and @data-available="true"]'
).all()

# XPath with position
page.locator("xpath=(//table//tr)[3]").text_content()

XPath is powerful for complex DOM traversal, but it is verbose and even more fragile than CSS when the page structure changes. Use it as a last resort when other selector strategies cannot reach the element.
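
When an agent does fall back to XPath with text it read from the page, quoting becomes a practical problem: XPath 1.0 string literals have no escape character, so text containing both quote types has to be assembled with concat(). The helpers below are a minimal sketch of that technique. The names xpath_literal and button_with_text are hypothetical, not Playwright functions.

```python
def xpath_literal(value: str) -> str:
    """Return value as an XPath 1.0 string literal."""
    if '"' not in value:
        return f'"{value}"'
    if "'" not in value:
        return f"'{value}'"
    # Both quote types are present. XPath 1.0 has no escape character, so
    # split on double quotes and glue the pieces together with concat().
    pieces = []
    for index, part in enumerate(value.split('"')):
        if index > 0:
            pieces.append("'\"'")  # a single-quoted double-quote character
        if part:
            pieces.append(f'"{part}"')
    return "concat(" + ", ".join(pieces) + ")"


def button_with_text(text: str) -> str:
    """Build a Playwright XPath selector for a button containing text."""
    return f"xpath=//button[contains(normalize-space(.), {xpath_literal(text)})]"


# Usage with a Playwright page (assumed to exist):
#   page.locator(button_with_text('Add "Pro" plan')).click()
```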

Text Selectors

Text selectors find elements by their visible text content. This is one of the most resilient strategies because button labels and link text change less frequently than class names or DOM structure:

# Default behavior: case-insensitive substring match
page.get_by_text("Sign In").click()

# The default also matches longer strings that contain the text
page.get_by_text("Learn More").click()

# Exact match: case-sensitive and whole-string
page.get_by_text("Submit", exact=True).click()

# Using the locator API with text= prefix
page.locator("text=Contact Us").click()

# Text with regex
# (raw string so that \$ and \d reach the regex engine unchanged)
page.locator(r"text=/total:.*\$\d+/i").text_content()

Text selectors are excellent for AI agents because they match what a human sees on the page. If the button says "Submit Order," the text selector get_by_text("Submit Order") will find it regardless of the underlying HTML structure.
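
Labels that an AI agent extracts from a screenshot or a model response often differ from the page in case or spacing. In Playwright for Python, get_by_text also accepts a compiled regular expression, so one option is to turn the label into a tolerant pattern first. The helper below is a sketch under that assumption. The name label_pattern is hypothetical.

```python
import re


def label_pattern(label: str) -> re.Pattern:
    """Compile a case-insensitive, whitespace-tolerant pattern for a label."""
    # Escape each word so characters like "?" or "(" are matched literally,
    # then allow any run of whitespace between words.
    words = [re.escape(word) for word in label.split()]
    return re.compile(r"^\s*" + r"\s+".join(words) + r"\s*$", re.IGNORECASE)


# Usage with a Playwright page (assumed to exist):
#   page.get_by_text(label_pattern("Submit Order")).click()
```

The pattern is anchored at both ends, so "Submit Order" does not match a longer string such as "Submit Order Now".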

Role-Based Selectors

Role-based selectors use ARIA roles and accessible names to find elements. This is the most resilient selector strategy because it mirrors how assistive technologies and humans identify elements:

# Buttons
page.get_by_role("button", name="Submit")
page.get_by_role("button", name="Cancel")

# Links
page.get_by_role("link", name="Documentation")

# Headings
page.get_by_role("heading", name="Welcome", level=1)

# Form inputs by label
page.get_by_role("textbox", name="Email")
page.get_by_role("checkbox", name="I agree")
page.get_by_role("combobox", name="Country")

# Navigation landmarks
page.get_by_role("navigation").get_by_role("link", name="Home")

# Table cells
page.get_by_role("row", name="Alice").get_by_role("cell").nth(2)

# Tabs
page.get_by_role("tab", name="Settings").click()
page.get_by_role("tabpanel").text_content()

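Role-based selectors are the best default choice for AI agents. They are semantic, resilient to styling changes, and align with the accessibility standards that well-built websites follow. On pages with poor accessibility markup, elements may lack a usable role or accessible name, so keep text or test-ID locators as a fallback.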
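
To call get_by_role, an agent has to know which role an element exposes. Most roles are implicit: they come from the HTML tag rather than from a role attribute. The table below is a partial sketch of that mapping, based on the HTML accessibility mappings as I understand them. The names TAG_ROLES, INPUT_TYPE_ROLES, and implicit_role are hypothetical, and an explicit role attribute on the page overrides any entry here.

```python
from typing import Optional

# Partial map of HTML tags to implicit ARIA roles (illustrative only).
TAG_ROLES = {
    "button": "button",
    "a": "link",           # only when the anchor has an href
    "select": "combobox",  # single-select dropdowns
    "textarea": "textbox",
    "nav": "navigation",
    "li": "listitem",      # inside ul or ol
    "tr": "row",
    "td": "cell",          # inside a table
}

# For <input>, the role depends on the type attribute.
INPUT_TYPE_ROLES = {
    "text": "textbox", "email": "textbox", "tel": "textbox", "url": "textbox",
    "search": "searchbox",
    "checkbox": "checkbox", "radio": "radio",
    "number": "spinbutton", "range": "slider",
    "submit": "button", "button": "button", "reset": "button",
}

HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


def implicit_role(tag: str, input_type: Optional[str] = None) -> Optional[str]:
    """Return the implicit ARIA role for a tag, or None if this map has none."""
    tag = tag.lower()
    if tag in HEADING_TAGS:
        return "heading"
    if tag == "input":
        # A missing type attribute defaults to "text".
        return INPUT_TYPE_ROLES.get((input_type or "text").lower())
    return TAG_ROLES.get(tag)
```

Password inputs return None here because they have no corresponding role, which is one reason get_by_label is often the better choice for login forms.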

Label, Placeholder, and Alt Text Selectors

These selectors target form elements and images by their human-readable attributes:

# Form fields by label
page.get_by_label("Email address").fill("user@example.com")
page.get_by_label("Password").fill("secret")

# By placeholder
page.get_by_placeholder("Search products...").fill("laptop")

# Images by alt text
page.get_by_alt_text("Company Logo").click()

# By title attribute
page.get_by_title("Close dialog").click()

Chaining and Filtering Locators

For AI agents dealing with complex pages, chaining locators narrows down to the right element:

# Chain locators to narrow scope
nav = page.get_by_role("navigation")
nav.get_by_role("link", name="Products").click()

# Filter by text
page.get_by_role("listitem").filter(has_text="Python").click()

# Filter by child element
page.get_by_role("listitem").filter(
    has=page.get_by_role("button", name="Buy")
).first.click()

# Combine CSS with role-based
page.locator(".product-card").filter(
    has_text="Premium Plan"
).get_by_role("button", name="Select").click()

# Nth element when multiple match
page.get_by_role("listitem").nth(0).text_content()
page.get_by_role("listitem").first.text_content()
page.get_by_role("listitem").last.text_content()

Building a Selector Strategy for AI Agents

When building AI agents, follow this priority order for selectors:

def find_element(page, description: str):
    """
    AI agent element finder — tries selectors in order of resilience.
    """
    strategies = [
        # 1. Test IDs — most stable (if available)
        lambda: page.get_by_test_id(description),
        # 2. Role-based: semantic and resilient (this sketch only tries the button role)
        lambda: page.get_by_role("button", name=description),
        # 3. Label — great for form fields
        lambda: page.get_by_label(description),
        # 4. Text — matches visual content
        lambda: page.get_by_text(description, exact=True),
        # 5. Placeholder
        lambda: page.get_by_placeholder(description),
    ]

    for strategy in strategies:
        try:
            locator = strategy()
            if locator.count() > 0:
                return locator.first
        except Exception:
            continue

    # LookupError is more specific than a bare Exception and easier to catch
    raise LookupError(f"Could not find element: {description}")
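
The role-based step above checks only the button role and accepts any number of matches. A possible refinement, sketched here, tries several roles and accepts a locator only when it matches exactly one element, so the agent never clicks an arbitrary first match. The function name find_by_any_role and the role list are assumptions. The page argument is any object exposing Playwright's get_by_role and count methods.

```python
from typing import Any, Iterable

# Roles an agent commonly needs to act on; this list is an assumption.
INTERACTIVE_ROLES = ("button", "link", "textbox", "checkbox", "combobox", "tab")


def find_by_any_role(page: Any, name: str,
                     roles: Iterable[str] = INTERACTIVE_ROLES):
    """Return the first role-based locator that matches exactly one element."""
    for role in roles:
        locator = page.get_by_role(role, name=name, exact=True)
        # Require a unique match: zero means not found, and more than one
        # means the name is ambiguous for this role.
        if locator.count() == 1:
            return locator
    raise LookupError(f"No unique element with accessible name {name!r}")


# Usage with a Playwright page (assumed to exist):
#   find_by_any_role(page, "Checkout").click()
```

Requiring a unique match trades some recall for safety. An agent that gets a LookupError can then ask for a more specific description instead of acting on the wrong element.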

FAQ

Which selector type is best for AI agents that interact with unknown websites?

Role-based selectors (get_by_role) combined with text selectors (get_by_text) provide the best coverage for unknown pages. Role selectors work because they align with how browsers and screen readers interpret the page, which sites that aim for accessibility compliance have to keep intact. Text selectors work because they match what a human sees. Together, they can locate most interactive elements without prior knowledge of the DOM structure.

How do I handle pages where elements have dynamic class names?

CSS-in-JS libraries and similar build tooling, common in React and Vue projects, generate hashed class names like css-1a2b3c that can change from one build to the next. Avoid using these as selectors entirely. Instead, prefer data-testid attributes, role-based locators, or text-based locators. If you control the application, add stable data-testid attributes to key interactive elements.
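
An agent that builds selectors from observed markup can filter out generated class names before using them. The patterns below are heuristics based on output I have commonly seen from such tooling. They are assumptions, not an official or exhaustive list, and the helper names are hypothetical.

```python
import re
from typing import List

# Heuristic patterns for machine-generated class names (assumptions).
GENERATED_CLASS_PATTERNS = [
    # "css-" followed by a hash that contains at least one digit
    re.compile(r"^css-(?=[a-z0-9]*\d)[a-z0-9]{4,}$"),  # e.g. css-1a2b3c
    re.compile(r"^sc-[A-Za-z0-9]{4,}$"),               # e.g. sc-bdVaJa
    re.compile(r"^jsx-\d+$"),                          # e.g. jsx-123456
]


def is_generated_class(name: str) -> bool:
    """Return True if the class name looks machine-generated."""
    return any(pattern.match(name) for pattern in GENERATED_CLASS_PATTERNS)


def stable_classes(class_attr: str) -> List[str]:
    """Keep only the class names that look hand-written."""
    return [name for name in class_attr.split() if not is_generated_class(name)]
```

If stable_classes returns an empty list, that is a signal to skip class-based CSS for the element and fall back to role, label, or text locators.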

Can Playwright selectors find elements inside shadow DOM?

Yes, with two exceptions. Playwright's CSS, text, and role-based locators pierce open shadow DOM boundaries by default. If you use page.locator("button"), it will find buttons inside open shadow roots without any special syntax. XPath selectors do not pierce shadow roots, and closed shadow roots are not supported. This is a significant advantage over Selenium, which requires explicit shadow DOM traversal.
