Playwright Page Interactions: Clicking, Typing, and Navigating with Python

Master Playwright's interaction API for AI agents — learn how to click buttons, fill forms, select dropdowns, use keyboard and mouse actions, and implement reliable waiting strategies.

Beyond Navigation: Interacting with Pages

Once your AI agent can navigate to a page and locate elements, the next step is interacting with those elements: clicking buttons, filling forms, selecting options from dropdowns, and handling keyboard shortcuts. Playwright provides a rich interaction API that automatically waits for elements to be actionable before performing actions, which eliminates many of the flaky timing issues that plague other automation tools.

This post covers every major interaction method with practical examples you can use in your AI agents.

Clicking Elements

Playwright's click() method automatically waits for the element to be visible, stable (not animating), enabled, and not obscured by other elements:

from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.chromium.launch()
    page = browser.new_page()
    page.goto("https://example.com")

    # Click a button by role
    page.get_by_role("button", name="Submit").click()

    # Click a link by text
    page.get_by_text("Learn More").click()

    # Double click
    page.locator("#editable-field").dblclick()

    # Right click (context menu)
    page.locator("#item").click(button="right")

    # Click at specific position within an element
    page.locator("#canvas").click(position={"x": 100, "y": 200})

    # Force click — bypass actionability checks (use sparingly)
    page.locator("#hidden-button").click(force=True)

    browser.close()

The force=True option should be reserved for edge cases where Playwright's actionability checks conflict with unusual page behavior. In most situations, if an element is not clickable, that is a real problem your agent should handle rather than force through.
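
One way an agent might handle an unclickable element, rather than forcing the click, is to bound the wait and report the failure to its caller. The sketch below is illustrative only: `click_or_report` is a hypothetical helper (not part of Playwright), the 5-second timeout is an arbitrary choice, and the fallback import exists only so the snippet can be dry-run without Playwright installed.

```python
try:
    # Playwright's own timeout exception (distinct from the Python built-in).
    from playwright.sync_api import TimeoutError as PlaywrightTimeoutError
except ImportError:
    # Assumption: fall back to the built-in so the helper can be dry-run
    # with stand-in objects when Playwright is not installed.
    PlaywrightTimeoutError = TimeoutError


def click_or_report(locator, timeout_ms: float = 5000) -> bool:
    """Try a normal click; return False if the element never becomes actionable.

    `locator` is expected to be a Playwright Locator, or any object with a
    compatible click(timeout=...) method. Hypothetical helper.
    """
    try:
        locator.click(timeout=timeout_ms)
        return True
    except PlaywrightTimeoutError:
        # The element stayed missing, hidden, disabled, or covered for the
        # whole timeout. The caller decides what to do next; nothing is forced.
        return False
```

An agent could call `click_or_report(page.get_by_role("button", name="Submit"))` and, on `False`, take a screenshot or re-plan instead of retrying with `force=True`.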

Filling Forms

Form filling is one of the most common tasks for AI agents. Playwright provides specialized methods for different input types:

# Text input — clears existing content first
page.get_by_label("Username").fill("ai_agent_user")
page.get_by_label("Password").fill("secure_password_123")

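# Type character by character (simulates real typing).
# Locator.type() is deprecated since Playwright 1.38; press_sequentially() is its replacement.
page.get_by_label("Search").press_sequentially("machine learning", delay=50)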

# Clear a field
page.get_by_label("Email").clear()

# Fill and press Enter in one flow
page.get_by_label("Search").fill("agentic AI")
page.get_by_label("Search").press("Enter")

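The difference between fill() and type() matters for AI agents. fill() sets the value in one step and fires a single input event (fast, reliable), while type() simulates individual keystrokes with an optional delay (slower, but it triggers the per-key event listeners that some sites rely on for validation or autocomplete). Recent Playwright releases deprecate Locator.type() in favor of press_sequentially(), which sends keystrokes the same way.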
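
The choice between the two approaches can be wrapped in one small function. This is a minimal sketch, assuming Playwright 1.38 or later (where `press_sequentially()` is available); `enter_text` and its `needs_key_events` flag are hypothetical names introduced for illustration.

```python
def enter_text(locator, text: str, needs_key_events: bool = False,
               delay_ms: float = 50) -> str:
    """Enter text into a field and return the name of the method used.

    `locator` is expected to be a Playwright Locator. Hypothetical helper.
    """
    if needs_key_events:
        # Send one key at a time so per-keystroke listeners
        # (autocomplete, live validation) get a chance to run.
        locator.press_sequentially(text, delay=delay_ms)
        return "press_sequentially"
    # Default path: set the whole value at once.
    locator.fill(text)
    return "fill"
```

Returning the method name lets an agent log which strategy it used, which helps when deciding whether a retry with key events is worth trying.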

Selecting from Dropdowns

Playwright handles both native HTML <select> elements and custom dropdown components:

# Native <select> — by value
page.get_by_label("Country").select_option("us")

# By visible text
page.get_by_label("Country").select_option(label="United States")

# By index
page.get_by_label("Country").select_option(index=2)

# Multiple selection
page.get_by_label("Skills").select_option(["python", "javascript", "rust"])

# Custom dropdown (not a <select>) — click to open, then click option
page.locator(".custom-dropdown-trigger").click()
page.locator(".dropdown-option", has_text="United States").click()
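
An agent often does not know in advance whether a dropdown is a native `<select>` or a custom widget. One possible approach, sketched below, is to inspect the element's tag name and branch. `choose_option` is a hypothetical helper, and the two-click flow for custom widgets is an assumption that will not fit every component library.

```python
def choose_option(dropdown, option, label: str) -> str:
    """Pick `label` from a native <select> or a click-to-open custom dropdown.

    dropdown: Locator for the <select> or for the custom trigger element.
    option:   Locator for the custom option to click (unused for <select>).
    Returns "native" or "custom" depending on the path taken.
    """
    # Ask the browser which kind of element this is.
    tag = dropdown.evaluate("el => el.tagName.toLowerCase()")
    if tag == "select":
        dropdown.select_option(label=label)
        return "native"
    # Assumed custom widget: open it, then click the matching option.
    dropdown.click()
    option.click()
    return "custom"
```

Because locators are lazy, the `option` locator can be built up front, for example `page.locator(".dropdown-option", has_text="United States")`, even though the option is not rendered until the trigger is clicked.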

Checkbox and Radio Button Interactions

# Check a checkbox
page.get_by_label("I agree to terms").check()

# Uncheck
page.get_by_label("Subscribe to newsletter").uncheck()

# Set to a specific state (check if unchecked, noop if already checked)
page.get_by_label("Enable notifications").set_checked(True)

# Verify state
is_checked = page.get_by_label("I agree to terms").is_checked()
print(f"Terms accepted: {is_checked}")
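
When an agent receives a set of preferences, the calls above can be driven from a dictionary. The following is a small sketch; `apply_checkbox_states` is a hypothetical helper, and the labels passed to it are whatever the target page actually uses.

```python
def apply_checkbox_states(page, desired: dict[str, bool]) -> dict[str, bool]:
    """Set each labelled checkbox to its desired state and read it back.

    `page` is expected to be a Playwright Page. Returns the observed state
    per label so the caller can compare it with `desired`.
    """
    observed = {}
    for label, state in desired.items():
        box = page.get_by_label(label)
        box.set_checked(state)  # does nothing if already in that state
        observed[label] = box.is_checked()
    return observed
```

Comparing the returned dictionary with the input gives a quick check that every box ended up in the intended state.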

Keyboard Actions

AI agents sometimes need to trigger keyboard shortcuts or special keys:

# Press a single key
page.keyboard.press("Escape")
page.keyboard.press("Tab")
page.keyboard.press("Enter")

# Keyboard shortcuts
page.keyboard.press("Control+a")  # Select all
page.keyboard.press("Control+c")  # Copy
page.keyboard.press("Control+v")  # Paste

# Type a string (fires keydown, keypress, keyup for each char)
page.keyboard.type("Hello, World!", delay=100)

# Hold and release keys
page.keyboard.down("Shift")
page.keyboard.press("ArrowDown")
page.keyboard.press("ArrowDown")
page.keyboard.up("Shift")
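
The shortcuts above use Control, but on macOS the equivalent modifier is usually Command, which Playwright calls Meta. A minimal sketch of picking the modifier per platform follows; `shortcut` is a hypothetical helper, and it assumes the browser runs on the same machine as the script. Recent Playwright versions also document a `ControlOrMeta` modifier that may make this unnecessary; check the version you have installed.

```python
import sys


def shortcut(key: str, platform: str = sys.platform) -> str:
    """Build a 'Modifier+key' string suitable for page.keyboard.press().

    Uses Meta (Command) on macOS and Control everywhere else.
    """
    modifier = "Meta" if platform == "darwin" else "Control"
    return f"{modifier}+{key}"
```

For example, `page.keyboard.press(shortcut("a"))` would send select-all with the modifier appropriate for the local platform.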

Mouse Actions

For complex interactions like drag-and-drop or hover menus:

# Hover to reveal a tooltip or dropdown
page.locator(".user-menu").hover()
page.locator(".dropdown-item", has_text="Settings").click()

# Drag and drop
page.locator("#source-item").drag_to(page.locator("#target-area"))

# Manual mouse movement
page.mouse.move(100, 200)
page.mouse.down()
page.mouse.move(300, 400)
page.mouse.up()
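
Some drag targets only react when they see intermediate mouse-move events. The manual sequence above can be wrapped so that the button is always released, even if the move fails. This is a sketch: `drag_between` is a hypothetical helper, and the default of 10 steps is an arbitrary choice.

```python
def drag_between(mouse, start: tuple[float, float], end: tuple[float, float],
                 steps: int = 10) -> None:
    """Press at `start`, move to `end` in several steps, then release.

    `mouse` is expected to be page.mouse from Playwright.
    """
    mouse.move(*start)
    mouse.down()
    try:
        # steps > 1 makes Playwright emit intermediate mousemove events.
        mouse.move(*end, steps=steps)
    finally:
        # Always release the button so the page is not left mid-drag.
        mouse.up()
```

Calling `drag_between(page.mouse, (100, 200), (300, 400))` reproduces the manual example with interpolated movement.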

Waiting Strategies for Reliable Interactions

Playwright auto-waits before actions, but sometimes your agent needs explicit waits:

# Wait for a loading spinner to disappear
page.wait_for_selector(".loading-spinner", state="hidden")

# Wait for an element to be visible
page.locator(".results-panel").wait_for(state="visible")

# Wait for a specific condition with a custom timeout
page.get_by_role("button", name="Download").wait_for(
    state="visible",
    timeout=10000
)

# Wait for a function to return true
page.wait_for_function("document.querySelector('.data-loaded') !== null")

# Wait for navigation after a click
with page.expect_navigation():
    page.get_by_text("Next Page").click()
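
Explicit waits are often combined: wait for a spinner to go away, then for the content to show up. The sketch below turns that pair into a yes/no answer an agent can branch on. `wait_for_results` and both selector arguments are hypothetical, and the fallback import exists only so the snippet can be dry-run without Playwright installed.

```python
try:
    from playwright.sync_api import TimeoutError as PlaywrightTimeoutError
except ImportError:
    # Assumption: built-in fallback for dry runs without Playwright.
    PlaywrightTimeoutError = TimeoutError


def wait_for_results(page, spinner: str, results: str,
                     timeout_ms: float = 10_000) -> bool:
    """Return True once the spinner is hidden and the results are visible.

    Each wait gets its own timeout, so the worst case is roughly
    2 * timeout_ms before False is returned.
    """
    try:
        page.locator(spinner).wait_for(state="hidden", timeout=timeout_ms)
        page.locator(results).wait_for(state="visible", timeout=timeout_ms)
        return True
    except PlaywrightTimeoutError:
        return False
```

With the selectors from the snippet above, the call would be `wait_for_results(page, ".loading-spinner", ".results-panel")`.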

Complete Form Automation Example

Here is a complete example that demonstrates a realistic AI agent form-filling workflow:

from playwright.sync_api import sync_playwright

def fill_contact_form():
    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        page = browser.new_page()
        page.goto("https://httpbin.org/forms/post")

        # Fill text fields
        page.get_by_label("Customer name").fill("AI Agent Demo")
        page.get_by_label("Telephone").fill("555-0100")
        page.get_by_label("E-mail address").fill("[email protected]")

        # Select pizza size (radio buttons)
        page.get_by_label("Medium").check()

        # Select toppings (checkboxes)
        page.get_by_label("Bacon").check()
        page.get_by_label("Onion").check()

        # Fill delivery time
        page.get_by_label("Preferred delivery time").fill("19:30")

        # Add special instructions
        page.get_by_label("Delivery instructions").fill(
            "Ring doorbell twice. Leave at door if no answer."
        )

        # Submit the form
        page.get_by_role("button", name="Submit order").click()

        # Wait for and capture the response
        page.wait_for_load_state("networkidle")
        print("Form submitted successfully")
        print(f"Response URL: {page.url}")

        browser.close()

fill_contact_form()

Assertions for Verification

After performing actions, your AI agent should verify the results:

import re

from playwright.sync_api import expect

# Verify text content
expect(page.locator(".success-message")).to_have_text("Form submitted")

# Verify visibility
expect(page.locator(".error-banner")).not_to_be_visible()

# Verify input value
expect(page.get_by_label("Email")).to_have_value("[email protected]")

# Verify URL after navigation. to_have_url() is documented to take a URL
# string or a regular expression, so a partial match uses a regex.
expect(page).to_have_url(re.compile(r".*/success.*"))
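
In Playwright's Python bindings a failed `expect(...)` raises `AssertionError`, which would normally stop the agent. One way to turn several checks into a report the agent can reason about is sketched below; `run_checks` is a hypothetical helper, and the check names are whatever the caller chooses.

```python
from typing import Callable


def run_checks(checks: dict[str, Callable[[], None]]) -> dict[str, str]:
    """Run each named check and collect failures instead of stopping.

    Each value is a zero-argument callable that raises AssertionError on
    failure. Returns a mapping of failed check name to error message;
    an empty dict means everything passed.
    """
    failures = {}
    for name, check in checks.items():
        try:
            check()
        except AssertionError as exc:
            failures[name] = str(exc)
    return failures
```

A check can be passed as a lambda, for example `{"no_error": lambda: expect(page.locator(".error-banner")).not_to_be_visible()}`, so that the assertion only runs inside `run_checks`.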

FAQ

When should an AI agent use type() instead of fill()?

Use type() when the website relies on keystroke events for functionality like autocomplete suggestions, real-time validation, or search-as-you-type features. Use fill() for everything else because it is faster and more reliable. A good heuristic is to start with fill() and switch to type() only if the site does not respond correctly.

How does Playwright handle elements that are not yet on the page?

Playwright's locator API is lazy: it does not query the DOM until you perform an action. When you call page.get_by_role("button", name="Submit").click(), Playwright waits up to 30 seconds (configurable) for the button to appear, become visible, and be actionable before clicking. If the element never becomes actionable in that time, it raises a TimeoutError that your agent can catch and handle.

Can Playwright interact with iframes?

Yes. Use page.frame_locator() to target elements inside iframes. For example, page.frame_locator("#payment-iframe").get_by_label("Card number").fill("4242..."). Each iframe is treated as a separate frame, and Playwright handles cross-origin iframes transparently.
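
The iframe pattern can be wrapped in a small function so an agent passes the frame selector alongside the field label. This is a minimal sketch; `fill_in_frame` is a hypothetical helper, and the selector and label are supplied by the caller.

```python
def fill_in_frame(page, frame_selector: str, label: str, value: str) -> None:
    """Fill a labelled field that lives inside an iframe.

    `page` is expected to be a Playwright Page; `frame_selector` is a
    selector that matches the <iframe> element itself.
    """
    frame = page.frame_locator(frame_selector)
    # Locators created from a frame locator are scoped to that iframe.
    frame.get_by_label(label).fill(value)
```

Like other locator calls, this waits for the iframe and the field to become available before filling.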

