Playwright Testing Playbook | Boots On The Ground AI

01

Why Playwright for Boots Portal

What Playwright Actually Is

Playwright is an open-source end-to-end testing framework built by Microsoft. Unlike unit tests (which test functions) or integration tests (which test APIs), Playwright launches a real browser, clicks real buttons, fills real forms, and asserts what a real user would see. It is the closest thing to having a human QA tester running through your app 24/7.

Why Playwright Over Cypress or Selenium

Capability	Playwright	Cypress	Selenium
Multi-browser (Chromium, Firefox, WebKit/Safari engine)	✓ Native	Chromium-family and Firefox; WebKit experimental	Yes, but heavier setup
Multi-tab / multi-context testing	✓ Native	No	Limited
Auto-wait for elements	Built-in	Built-in	Manual waits
API testing built-in	✓ request context	cy.request only	No
Parallel execution	Native sharding	Requires Cypress Cloud	Grid required
Trace viewer / debugging	Trace viewer + codegen	Time travel UI	Logs only
Network interception	Full route control	cy.intercept	Limited (Selenium 4, via CDP/BiDi)
Multi-tenant context isolation	BrowserContext per tenant	Not supported	Separate drivers
Language support	JS, TS, Python, Java, .NET	JS/TS only	All major
Speed	Fast (WebSocket)	Medium (in-browser)	Slow (HTTP bridge)

🔐

Why This Matters for Your P0

Playwright's BrowserContext feature lets you simulate two completely separate users (Tenant A and Tenant B) in the same test, each with their own cookies, auth state, and storage. This is exactly what you need to verify multi-tenant isolation — one test can prove that Tenant A's proposals never leak into Tenant B's dashboard.

02

Installation & Project Setup

Install Playwright in Boots Portal

Run these commands from your Boots Portal project root:

terminal (bash)

# Install Playwright and browsers
npm init playwright@latest

# When prompted, select:
#  TypeScript
#  tests folder: e2e/
#  GitHub Actions CI: Yes
#  Install browsers: Yes

# This creates:
#  playwright.config.ts
#  e2e/ folder
#  .github/workflows/playwright.yml

Project Structure

boots-proposals/ folder structure (tree)

boots-proposals/
├── e2e/
│   ├── fixtures/          # Reusable test setup (auth, tenants)
│   │   ├── auth.setup.ts  # Login + save session state
│   │   └── tenant.fixture.ts  # Multi-tenant context factory
│   ├── pages/             # Page Object Models
│   │   ├── login.page.ts
│   │   ├── dashboard.page.ts
│   │   ├── proposal.page.ts
│   │   └── brain.page.ts
│   ├── tests/
│   │   ├── auth.spec.ts
│   │   ├── proposal-crud.spec.ts
│   │   ├── multi-tenant.spec.ts    # P0 regression tests
│   │   ├── brain-generation.spec.ts
│   │   └── sharing.spec.ts
│   └── utils/
│       ├── test-data.ts   # Factory functions for test data
│       └── api-helpers.ts # Direct Supabase API calls
├── playwright.config.ts
└── .env.test              # Test environment variables

playwright.config.ts — Production Ready

playwright.config.ts (typescript)

import { defineConfig, devices } from '@playwright/test';

export default defineConfig({
  // Root is e2e/ (not e2e/tests/) so the setup project can find e2e/fixtures/auth.setup.ts
  testDir: './e2e',
  fullyParallel: true,
  forbidOnly: !!process.env.CI,
  retries: process.env.CI ? 2 : 0,
  workers: process.env.CI ? 1 : undefined,
  reporter: [
    ['html', { open: 'never' }],
    ['list'],
    ...(process.env.CI ? [['github']] : []),
  ],
  use: {
    baseURL: process.env.BASE_URL || 'http://localhost:3000',
    trace: 'on-first-retry',
    screenshot: 'only-on-failure',
    video: 'retain-on-failure',
  },
  projects: [
    { name: 'setup', testMatch: /.*\.setup\.ts/ },
    {
      name: 'chromium',
      use: { ...devices['Desktop Chrome'] },
      dependencies: ['setup'],
    },
    {
      name: 'tenant-isolation',
      testMatch: '**/multi-tenant.spec.ts',
      use: { ...devices['Desktop Chrome'] },
    },
  ],
  webServer: {
    command: 'npm run dev',
    url: 'http://localhost:3000',
    reuseExistingServer: !process.env.CI,
  },
});
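
The config above reads `process.env.BASE_URL`, and the project tree lists a `.env.test` file, yet the tests later in this playbook hardcode tenant emails and passwords. One possible alternative is to read credentials from the environment and fail fast when one is missing. This is a minimal sketch: the variable names (`TENANT_A_EMAIL`, `TENANT_A_PASSWORD`, and so on) and the helpers `requireEnv` and `tenantCredentials` are hypothetical, not something Boots Portal is known to define.

```typescript
// Hypothetical helpers for e2e/utils/test-data.ts. Variable names are assumptions.
export interface TenantCredentials {
  email: string;
  password: string;
}

type Env = Record<string, string | undefined>;

// Read one required variable, failing with a clear message if it is unset or empty.
export function requireEnv(name: string, env: Env): string {
  const value = env[name];
  if (value === undefined || value.trim() === '') {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Build credentials for a tenant label such as 'A' or 'B'.
export function tenantCredentials(label: string, env: Env): TenantCredentials {
  const prefix = `TENANT_${label.toUpperCase()}`;
  return {
    email: requireEnv(`${prefix}_EMAIL`, env),
    password: requireEnv(`${prefix}_PASSWORD`, env),
  };
}
```

In a spec this would be called as `tenantCredentials('A', process.env)`. Passing the environment in explicitly, rather than reading `process.env` inside the helper, keeps the function easy to unit test.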

03

Terminal Command Reference

These are the commands you will use daily. Bookmark this page.

Essential Commands

npx playwright test

Run all tests headless — use for CI/CD and quick validation

npx playwright test --headed

Run with visible browser — use when debugging or watching behavior

npx playwright test --ui

Open interactive test runner — use during development

npx playwright test --debug

Step through with inspector — use for debugging specific failures

npx playwright test multi-tenant

Run only multi-tenant tests — use after P0-related code changes

npx playwright test --project=tenant-isolation

Run isolation project only — targeted regression testing

npx playwright show-report

Open HTML test report — run after any test run

npx playwright show-trace trace.zip

Open trace viewer — use for debugging failed tests

npx playwright codegen http://localhost:3000

Record actions as code — fastest way to write new tests

npx playwright test --grep @p0

Run only P0-tagged tests — run this before EVERY deploy
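
`--grep` takes a regular expression and runs only the tests whose title matches it. Because the `describe` title is part of each test's full title, putting `@p0` in a `describe` title selects every test inside that block. The sketch below only illustrates that matching idea in plain TypeScript. `fullTitle` and `selectByGrep` are hypothetical names, and Playwright's real title may include more parts (such as the project and file name).

```typescript
// Illustrative only: approximates how a --grep pattern selects tests by title.
interface TestEntry {
  describe: string; // title of the enclosing test.describe block
  title: string; // title of the test itself
}

// Join the describe title and the test title, similar in spirit to a full test title.
function fullTitle(entry: TestEntry): string {
  return `${entry.describe} ${entry.title}`;
}

// Keep only the entries whose full title matches the grep pattern.
function selectByGrep(entries: TestEntry[], grep: RegExp): TestEntry[] {
  return entries.filter((entry) => grep.test(fullTitle(entry)));
}

const entries: TestEntry[] = [
  {
    describe: 'Multi-Tenant Isolation @p0',
    title: 'Tenant A proposals must NOT appear for Tenant B',
  },
  { describe: 'Proposal CRUD', title: 'creates a proposal' },
];

const selected = selectByGrep(entries, /@p0/);
console.log(selected.map((entry) => entry.title));
```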

Power Combos

terminal (bash)

# Run multi-tenant tests with a full trace
# (there is no --video CLI flag; video recording is controlled by `video` in playwright.config.ts)
npx playwright test multi-tenant --trace on

# Run a specific test by name
npx playwright test -g 'should not leak proposals between tenants'

# Run with retries for flaky investigation
npx playwright test --retries=3 --reporter=html

# Update screenshots for visual regression
npx playwright test --update-snapshots

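# Run on all browsers (requires firefox and webkit projects in playwright.config.ts;
# the config in Section 02 defines neither)
npx playwright test --project=chromium --project=firefox --project=webkit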

# Run in parallel with 4 workers
npx playwright test --workers=4

package.json Scripts — Add These Now

package.json (json)

{
  "scripts": {
    "test:unit": "vitest run",
    "test:e2e": "npx playwright test",
    "test:e2e:ui": "npx playwright test --ui",
    "test:e2e:p0": "npx playwright test --grep @p0",
    "test:all": "npm run test:unit && npm run test:e2e",
    "test:ci": "npm run test:unit && npx playwright test --project=chromium"
  }
}

🛡️

Pre-Deploy Safety Net

Before every git push or Vercel deploy, run: npm run test:e2e:p0. This runs only your critical multi-tenant isolation tests in under 30 seconds. Make this muscle memory.

04

Multi-Tenant Isolation Testing

🚨

P0 Context — Why This Section Exists

A multi-tenant data leak was discovered in Boots Portal. The tests in this section are designed to catch this exact class of bug and prevent it from ever reaching production again. These tests must run on every deploy.

The Core Pattern: Two Tenants, One Test

Playwright's BrowserContext lets you simulate completely separate users in the same test. Each context has its own cookies, localStorage, and session — just like two different people opening your app in two different browsers.

e2e/tests/multi-tenant.spec.ts (typescript)

import { test, expect } from '@playwright/test';

test.describe('Multi-Tenant Isolation @p0', () => {
  test('Tenant A proposals must NOT appear for Tenant B', async ({ browser }) => {
    // Create two completely separate browser contexts
    const tenantAContext = await browser.newContext();
    const tenantBContext = await browser.newContext();
    const tenantAPage = await tenantAContext.newPage();
    const tenantBPage = await tenantBContext.newPage();

    // Login as Tenant A
    await tenantAPage.goto('/login');
    await tenantAPage.getByLabel('Email').fill('tenantA@test.com');
    await tenantAPage.getByLabel('Password').fill('testpass123');
    await tenantAPage.getByRole('button', { name: 'Sign In' }).click();
    await tenantAPage.waitForURL('/dashboard');

    // Tenant A creates a proposal
    await tenantAPage.getByRole('link', { name: 'New Proposal' }).click();
    await tenantAPage.getByLabel('Meeting Notes').fill(
      'SECRET: Client wants $50K budget for AI chatbot'
    );
    await tenantAPage.getByRole('button', { name: 'Generate' }).click();
    await tenantAPage.waitForSelector('[data-testid="proposal-card"]');

    // Login as Tenant B
    await tenantBPage.goto('/login');
    await tenantBPage.getByLabel('Email').fill('tenantB@test.com');
    await tenantBPage.getByLabel('Password').fill('testpass456');
    await tenantBPage.getByRole('button', { name: 'Sign In' }).click();
    await tenantBPage.waitForURL('/dashboard');

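    // Let the dashboard finish loading first; otherwise the checks below could
    // pass simply because nothing has rendered yet. Waiting for a known dashboard
    // element is more reliable than 'networkidle' if the app exposes one.
    await tenantBPage.waitForLoadState('networkidle');

    // CRITICAL ASSERTION: Tenant B must NOT see Tenant A's data
    const allText = await tenantBPage.textContent('body');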
    expect(allText).not.toContain('SECRET');
    expect(allText).not.toContain('$50K budget');
    expect(allText).not.toContain('tenantA');

    // Cleanup
    await tenantAContext.close();
    await tenantBContext.close();
  });
});
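
The test above searches Tenant B's page for the fixed strings `SECRET` and `$50K budget`. If an earlier run left the same text behind, or another test uses similar wording, the result becomes ambiguous. One option is to generate a marker that is unique to each run and search for exactly that. This is a minimal sketch that assumes Node's built-in `crypto` module; `makeLeakMarker` and `findLeaks` are hypothetical helper names.

```typescript
import { randomUUID } from 'node:crypto';

// Create a marker that is unique to this test run, e.g. "LEAK-CHECK-tenantA-<uuid>".
export function makeLeakMarker(tenantLabel: string): string {
  return `LEAK-CHECK-${tenantLabel}-${randomUUID()}`;
}

// Return every marker that appears in the given page text.
// An empty result means none of the other tenant's markers leaked.
export function findLeaks(pageText: string | null, markers: string[]): string[] {
  const text = pageText ?? '';
  return markers.filter((marker) => text.includes(marker));
}
```

In the spec, Tenant A would type the marker into the Meeting Notes field, and the final check could become `expect(findLeaks(allText, [marker])).toEqual([])`.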

API-Level Isolation Tests (Supabase RLS)

e2e/tests/multi-tenant-api.spec.ts (typescript)

import { test, expect } from '@playwright/test';

test.describe('API Multi-Tenant Isolation @p0', () => {
  test('Supabase RLS blocks cross-tenant reads', async ({ request }) => {
    // Authenticate as Tenant A via API
    const authA = await request.post('/api/auth/login', {
      data: { email: 'tenantA@test.com', password: 'testpass123' }
    });
    expect(authA.ok()).toBeTruthy();
    const tokenA = (await authA.json()).access_token;

    // Create proposal as Tenant A
    const createRes = await request.post('/api/proposals', {
      headers: { Authorization: `Bearer ${tokenA}` },
      data: { title: 'Secret Proposal', notes: 'Confidential data' }
    });
    expect(createRes.ok()).toBeTruthy();
    const proposalId = (await createRes.json()).id;
    // Guard against a vacuous pass: without a real id, the 403/404 check below proves nothing
    expect(proposalId).toBeTruthy();

    // Authenticate as Tenant B
    const authB = await request.post('/api/auth/login', {
      data: { email: 'tenantB@test.com', password: 'testpass456' }
    });
    expect(authB.ok()).toBeTruthy();
    const tokenB = (await authB.json()).access_token;

    // Tenant B tries to read Tenant A's proposal
    const readRes = await request.get(`/api/proposals/${proposalId}`, {
      headers: { Authorization: `Bearer ${tokenB}` }
    });

    // MUST be 403 or 404, NEVER 200
    expect([403, 404]).toContain(readRes.status());

    // Tenant B lists all proposals; the list must load and must not include Tenant A's
    const listRes = await request.get('/api/proposals', {
      headers: { Authorization: `Bearer ${tokenB}` }
    });
    expect(listRes.ok()).toBeTruthy();
    const proposals: Array<{ id: unknown }> = await listRes.json();
    const ids = proposals.map(p => p.id);
    expect(ids).not.toContain(proposalId);
  });
});

What to Test for Multi-Tenant

Test Case	Priority	Method
Proposals don't leak between tenants	P0	Browser + API
Brain/client intelligence is tenant-scoped	P0	API
Shared proposal links respect permissions	P0	Browser
Dashboard data filtered by org_id	P0	Browser + API
Contract generation scoped to tenant	P1	API
Report builder data isolated	P1	Browser
Search results tenant-scoped	P1	API
Audit logs show correct tenant actions	P2	API
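
Every row in the table above should hold in both directions: Tenant B must not see Tenant A's data, and Tenant A must not see Tenant B's. The browser test earlier in this section checks only one direction. A small helper can enumerate every ordered pair so that a spec can loop over them. This is a sketch; `crossTenantPairs` and `IsolationPair` are hypothetical names.

```typescript
// One isolation check: `viewer` must not be able to see data created by `owner`.
export interface IsolationPair {
  owner: string;
  viewer: string;
}

// Build every ordered (owner, viewer) pair of distinct tenants.
export function crossTenantPairs(tenants: string[]): IsolationPair[] {
  const pairs: IsolationPair[] = [];
  for (const owner of tenants) {
    for (const viewer of tenants) {
      if (owner !== viewer) {
        pairs.push({ owner, viewer });
      }
    }
  }
  return pairs;
}

console.log(crossTenantPairs(['tenantA', 'tenantB']));
```

For two tenants this yields two pairs (A as owner with B as viewer, then the reverse). Adding a third tenant later raises the count to six without any change to the specs that loop over the pairs.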

05

Playwright Superpowers

These capabilities go far beyond basic click-and-assert testing. Each one saves hours of manual work.

Codegen — Record Tests by Clicking

This is the fastest way to write tests. Playwright opens a browser, you click through your app, and it writes the test code in real time.

terminal (bash)

# Open codegen pointed at your local dev server
npx playwright codegen http://localhost:3000

# With device emulation
npx playwright codegen --device='iPhone 14' http://localhost:3000

# Save directly to a file
npx playwright codegen --output e2e/tests/recorded.spec.ts http://localhost:3000

Trace Viewer — Time-Travel Debugging

When a test fails, the trace viewer lets you step through every action, see screenshots at each step, inspect the DOM, view network requests, and see console logs.

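terminal (bash)

# Run tests with traces
npx playwright test --trace on

# Open a trace file (each test writes its own trace.zip
# under test-results/<test-output-folder>/)
npx playwright show-trace test-results/<test-output-folder>/trace.zip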

# Or drag-and-drop the zip to:
# https://trace.playwright.dev

API Testing — Skip the Browser

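e2e/tests/api.spec.ts (typescript)

import { test, expect } from '@playwright/test';

test('API: create and retrieve proposal', async ({ request }) => {
  // Create a proposal
  const response = await request.post('/api/proposals', {
    data: { title: 'Test', notes: 'Meeting notes here' }
  });
  expect(response.ok()).toBeTruthy();
  const body = await response.json();
  expect(body.id).toBeDefined();

  // Retrieve it again (assumes GET /api/proposals/:id, as used in the isolation test in Section 04)
  const fetched = await request.get(`/api/proposals/${body.id}`);
  expect(fetched.ok()).toBeTruthy();
});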

Network Interception — Mock & Monitor

e2e/tests/error-handling.spec.ts (typescript)

import { test, expect } from '@playwright/test';

test('handles AI generation failure gracefully', async ({ page }) => {
  // Mock the AI endpoint to return an error
  await page.route('**/api/generate', route =>
    route.fulfill({ status: 500, body: 'AI service down' })
  );

  await page.goto('/proposals/new');
  await page.getByRole('button', { name: 'Generate' }).click();

  // Assert graceful error handling
  await expect(page.getByText('generation failed')).toBeVisible();
});

Authentication State Reuse

Login once, reuse the session across all tests. This dramatically speeds up your test suite.

e2e/fixtures/auth.setup.ts (typescript)

import { test as setup } from '@playwright/test';

setup('authenticate', async ({ page }) => {
  await page.goto('/login');
  await page.getByLabel('Email').fill('test@bootsagentai.com');
  await page.getByLabel('Password').fill('testpassword');
  await page.getByRole('button', { name: 'Sign In' }).click();
  await page.waitForURL('/dashboard');

  // Save the authenticated state. A project reuses it only if its `use` block
  // sets storageState: '.auth/user.json'. Keep .auth/ out of version control.
  await page.context().storageState({ path: '.auth/user.json' });
});

💡

Codegen + Claude Code Combo

Use npx playwright codegen to record a rough test, then paste it into Claude Code and ask it to refactor using Page Object Model patterns and add proper assertions. Speed of recording + quality of AI-assisted code.

06

Page Object Model

Page Objects keep your tests clean and maintainable. When the UI changes, you update one file instead of every test.

e2e/pages/proposal.page.ts (typescript)

import { Page, Locator, expect } from '@playwright/test';

export class ProposalPage {
  readonly page: Page;
  readonly meetingNotesInput: Locator;
  readonly generateButton: Locator;
  readonly streamingStatus: Locator;
  readonly proposalOutput: Locator;
  readonly proposalCards: Locator;

  constructor(page: Page) {
    this.page = page;
    this.meetingNotesInput = page.getByLabel('Meeting Notes');
    this.generateButton = page.getByRole('button', { name: /generate/i });
    this.streamingStatus = page.locator('[data-testid="streaming-status"]');
    this.proposalOutput = page.locator('[data-testid="proposal-output"]');
    this.proposalCards = page.locator('[data-testid="proposal-card"]');
  }

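  async generateProposal(notes: string) {
    await this.meetingNotesInput.fill(notes);
    await this.generateButton.click();
    // AI generation can take up to 90 seconds, which exceeds Playwright's default
    // 30-second test timeout; callers need test.setTimeout(120_000) or a higher
    // `timeout` in playwright.config.ts.
    // Wait for the output first: waiting only for the status to be hidden would
    // pass immediately if the streaming indicator has not rendered yet.
    await expect(this.proposalOutput).toBeVisible({ timeout: 90_000 });
    await this.streamingStatus.waitFor({ state: 'hidden', timeout: 90_000 });
  }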

  async getProposalCount(): Promise<number> {
    return await this.proposalCards.count();
  }

  async getProposalText(): Promise<string> {
    return await this.proposalOutput.textContent() || '';
  }
}

07

CI/CD Integration

GitHub Actions Workflow

.github/workflows/playwright.yml (yaml)

name: Playwright Tests

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

jobs:
  test:
    timeout-minutes: 30
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 22
      - run: npm ci
      - run: npx playwright install --with-deps
      - run: npx playwright test
      - uses: actions/upload-artifact@v4
        if: ${{ !cancelled() }}
        with:
          name: playwright-report
          path: playwright-report/
          retention-days: 30

Pre-Push Git Hook (Husky)

.husky/pre-push (bash)

#!/bin/sh
echo "Running P0 isolation tests..."
# Abort the push if any P0 test fails
npx playwright test --grep @p0 --project=chromium || exit 1
echo "All P0 tests passed. Safe to push."

08

Quick Reference Cheat Sheet

Selector Priority (Best to Worst)

Priority	Locator	Example
1	getByRole	page.getByRole('button', { name: 'Generate' })
2	getByLabel	page.getByLabel('Meeting Notes')
3	getByPlaceholder	page.getByPlaceholder('Enter notes...')
4	getByText	page.getByText('Proposal Generated')
5	getByTestId	page.getByTestId('proposal-card')
6	CSS/XPath	page.locator('.btn-primary') (avoid)

Common Assertions

Assertion	Use For
`await expect(locator).toBeVisible()`	Element is on screen
`await expect(locator).toHaveText('...')`	Element has specific text
`await expect(locator).toHaveCount(n)`	Number of matching elements
`await expect(page).toHaveURL('...')`	Current page URL
`await expect(page).toHaveTitle('...')`	Page title
`await expect(response).toBeOK()`	API returned 2xx
`await expect(locator).not.toBeVisible()`	Element is hidden/removed

09

Your Testing Workflow

Complete testing hierarchy showing where Playwright fits with your existing Vitest tests.

Layer	Tool	What It Tests	Speed	Run When
Unit	Vitest	Individual functions, utils	< 1 sec	Every save (watch mode)
Integration	Vitest	API routes, DB queries, RLS	1–5 sec	Before commit
E2E Critical	Playwright `@p0`	Multi-tenant isolation, auth	10–30 sec	Before every push
E2E Full	Playwright	All user journeys	2–5 min	CI on every PR
Visual	Playwright screenshots	UI regression	1–3 min	Weekly or on UI changes

Pre-Deploy Command Sequence

terminal (bash)

# Step 1: Unit tests (fast feedback)
npm run test:unit

# Step 2: P0 isolation tests (must pass — no exceptions)
npx playwright test --grep @p0

# Step 3: Full E2E suite (if time allows)
npx playwright test

# Step 4: Check the report
npx playwright show-report

# Step 5: Deploy with confidence
git push
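
The sequence above only protects you if a failing step actually stops the steps after it. Below is a sketch of that fail-fast behaviour as a small Node script. `runPreDeploy`, `shellRunner`, and the `Runner` type are hypothetical names, and the command list simply mirrors steps 1 to 3 above (the report and the final `git push` are left to the caller).

```typescript
import { spawnSync } from 'node:child_process';

// Runs one shell command and returns its exit code.
export type Runner = (command: string) => number;

// Default runner: execute the command in a shell, streaming output to the terminal.
export const shellRunner: Runner = (command) => {
  const result = spawnSync(command, { shell: true, stdio: 'inherit' });
  return result.status ?? 1; // a null status (killed by a signal) counts as a failure
};

export const preDeploySteps: string[] = [
  'npm run test:unit',
  'npx playwright test --grep @p0',
  'npx playwright test',
];

// Run steps in order and stop at the first non-zero exit code.
// Returns the command that failed, or null if every step passed.
export function runPreDeploy(steps: string[], run: Runner): string | null {
  for (const step of steps) {
    if (run(step) !== 0) {
      return step;
    }
  }
  return null;
}
```

A wrapper script could call `runPreDeploy(preDeploySteps, shellRunner)` and exit non-zero when a command is returned, so that `git push` only runs after a clean result. Taking the runner as a parameter keeps the ordering logic testable without launching any real commands.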

⚡ The Golden Rule

Never deploy without running npm run test:e2e:p0 first. This single command runs your multi-tenant isolation tests and takes under 30 seconds. It is the cheapest insurance policy against shipping another P0.

10

Implementation Checklist

Work through these in order. Each step builds on the previous one.

Phase 1

Emergency — This Week

Install Playwright (npm init playwright@latest)
Write multi-tenant isolation test (Section 04, The Core Pattern)
Write API-level RLS test (Section 04, API-Level Isolation Tests)
Add @p0 tag to all isolation tests
Run and verify all P0 tests pass

Phase 2

Foundation — Next 2 Weeks

Create Page Object Models (login, dashboard, proposal)
Set up auth state reuse (Section 05, Authentication State Reuse)
Add auth flow tests (login, logout, session expiry)
Add proposal CRUD tests (create, edit, delete)
Set up GitHub Actions CI (Section 07, GitHub Actions Workflow)

Phase 3

Coverage — Month 2

Add Brain generation flow tests (streaming UI)
Add proposal sharing/publishing tests
Add contract generation tests
Add visual regression baselines
Add mobile device emulation tests

Phase 4

Enterprise-Ready — Month 3

Add pre-push git hook for P0 tests
Set up test data factories for reproducible tenant creation
Add performance benchmarking (proposal gen <60 sec)
Document test coverage for SOC 2 audit trail
Add kill switch verification tests
