Playwright is an open-source end-to-end testing framework built by Microsoft. Unlike unit tests (which test functions) or integration tests (which test APIs), Playwright launches a real browser, clicks real buttons, fills real forms, and asserts what a real user would see. It is the closest thing to having a human QA tester running through your app 24/7.
| Capability | Playwright | Cypress | Selenium |
|---|---|---|---|
| Multi-browser (Chrome, Firefox, Safari/WebKit) | ✓ Native | Chromium-family and Firefox; WebKit experimental | Yes but heavy |
| Multi-tab / multi-context testing | ✓ Native | No | Limited |
| Auto-wait for elements | Built-in | Built-in | Manual waits |
| API testing built-in | ✓ request context | cy.request only | No |
| Parallel execution | Native sharding | Requires Cypress Cloud | Grid required |
| Trace viewer / debugging | Trace viewer + codegen | Time travel UI | Logs only |
| Network interception | Full route control | cy.intercept | Limited (CDP/BiDi in Selenium 4) |
| Multi-tenant context isolation | BrowserContext per tenant | Not supported | Separate drivers |
| Language support | JS, TS, Python, Java, .NET | JS/TS only | All major |
| Speed | Fast (WebSocket) | Medium (in-browser) | Slow (HTTP bridge) |
Playwright's BrowserContext feature lets you simulate two completely separate users (Tenant A and Tenant B) in the same test, each with their own cookies, auth state, and storage. This is exactly what you need to verify multi-tenant isolation — one test can prove that Tenant A's proposals never leak into Tenant B's dashboard.
Run these commands from your Boots Portal project root:
    # Install Playwright and browsers
    npm init playwright@latest

    # When prompted, select:
    #   TypeScript
    #   tests folder: e2e/
    #   GitHub Actions CI: Yes
    #   Install browsers: Yes

    # This creates:
    #   playwright.config.ts
    #   e2e/ folder
    #   .github/workflows/playwright.yml
    boots-proposals/
    ├── e2e/
    │   ├── fixtures/                   # Reusable test setup (auth, tenants)
    │   │   ├── auth.setup.ts           # Login + save session state
    │   │   └── tenant.fixture.ts       # Multi-tenant context factory
    │   ├── pages/                      # Page Object Models
    │   │   ├── login.page.ts
    │   │   ├── dashboard.page.ts
    │   │   ├── proposal.page.ts
    │   │   └── brain.page.ts
    │   ├── tests/
    │   │   ├── auth.spec.ts
    │   │   ├── proposal-crud.spec.ts
    │   │   ├── multi-tenant.spec.ts    # P0 regression tests
    │   │   ├── brain-generation.spec.ts
    │   │   └── sharing.spec.ts
    │   └── utils/
    │       ├── test-data.ts            # Factory functions for test data
    │       └── api-helpers.ts          # Direct Supabase API calls
    ├── playwright.config.ts
    └── .env.test                       # Test environment variables
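The tree above lists utils/test-data.ts as the home for test data factories. A minimal sketch of what such a file might contain follows. The type names, field names, and the marker convention are all assumptions made for illustration, not part of the existing project.

```typescript
// Hypothetical contents for e2e/utils/test-data.ts. Every name here is illustrative.
interface TestTenant {
  orgId: string;
  email: string;
  password: string;
}

interface TestProposal {
  title: string;
  notes: string;
  marker: string; // unique string to search for when checking for leaks
}

let sequence = 0;

// Builds a short suffix that differs on every call within a run.
function uniqueSuffix(): string {
  sequence += 1;
  return `${Date.now().toString(36)}-${sequence}`;
}

// Creates credentials for a throwaway tenant with a unique org id and email.
function buildTenant(label: string): TestTenant {
  const suffix = uniqueSuffix();
  return {
    orgId: `org-${label}-${suffix}`,
    email: `${label}-${suffix}@test.example`,
    password: `pw-${suffix}`,
  };
}

// Creates proposal input whose notes embed a unique marker for leak checks.
function buildProposal(tenant: TestTenant): TestProposal {
  const marker = `LEAK-CHECK-${uniqueSuffix()}`;
  return {
    title: `Proposal for ${tenant.orgId}`,
    notes: `Confidential notes ${marker}`,
    marker,
  };
}
```

Searching another tenant's pages for a generated marker avoids false matches on fixed strings such as 'SECRET' that could legitimately appear elsewhere in the UI.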
    import { defineConfig, devices } from '@playwright/test';

    export default defineConfig({
      testDir: './e2e/tests',
      fullyParallel: true,
      forbidOnly: !!process.env.CI,            // Fail CI if a test.only is left in
      retries: process.env.CI ? 2 : 0,
      workers: process.env.CI ? 1 : undefined,
      reporter: [
        ['html', { open: 'never' }],
        ['list'],
        ...(process.env.CI ? [['github']] : []),
      ],
      use: {
        baseURL: process.env.BASE_URL || 'http://localhost:3000',
        trace: 'on-first-retry',
        screenshot: 'only-on-failure',
        video: 'retain-on-failure',
      },
      projects: [
        // auth.setup.ts lives in e2e/fixtures, outside the default testDir
        { name: 'setup', testDir: './e2e/fixtures', testMatch: /.*\.setup\.ts/ },
        {
          name: 'chromium',
          use: { ...devices['Desktop Chrome'] },
          dependencies: ['setup'],
        },
        {
          name: 'tenant-isolation',
          testMatch: '**/multi-tenant.spec.ts',
          use: { ...devices['Desktop Chrome'] },
        },
      ],
      webServer: {
        command: 'npm run dev',
        url: 'http://localhost:3000',
        reuseExistingServer: !process.env.CI,
      },
    });
These are the commands you will use daily. Bookmark this page.
    # Run multi-tenant tests with a full trace
    # (video is controlled by use.video in playwright.config.ts, not a CLI flag)
    npx playwright test multi-tenant --trace on

    # Run a specific test by name
    npx playwright test -g 'should not leak proposals between tenants'

    # Run with retries for flaky investigation
    npx playwright test --retries=3 --reporter=html

    # Update screenshots for visual regression
    npx playwright test --update-snapshots

    # Run on all browsers (firefox and webkit projects must be defined in playwright.config.ts)
    npx playwright test --project=chromium --project=firefox --project=webkit

    # Run in parallel with 4 workers
    npx playwright test --workers=4
    {
      "scripts": {
        "test:unit": "vitest run",
        "test:e2e": "npx playwright test",
        "test:e2e:ui": "npx playwright test --ui",
        "test:e2e:p0": "npx playwright test --grep @p0",
        "test:all": "npm run test:unit && npm run test:e2e",
        "test:ci": "npm run test:unit && npx playwright test --project=chromium"
      }
    }
Before every git push or Vercel deploy, run: npm run test:e2e:p0. This runs only your critical multi-tenant isolation tests in under 30 seconds. Make this muscle memory.
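The --grep @p0 filter selects tests by matching a regular expression against their titles, which is why a tag placed in a describe title also selects every test nested under it. The sketch below is a simplified model of that behaviour, not Playwright's actual implementation. The function names are hypothetical, and the real matcher takes additional inputs.

```typescript
// Simplified model of title-based filtering (illustrative only).
// Joins the enclosing describe titles and the test title into one string.
function fullTitle(describeTitles: string[], testTitle: string): string {
  return [...describeTitles, testTitle].join(' ');
}

// Returns true when the pattern matches anywhere in the combined title.
function matchesGrep(pattern: string, describeTitles: string[], testTitle: string): boolean {
  return new RegExp(pattern).test(fullTitle(describeTitles, testTitle));
}
```

Under this model, a test inside a describe block titled 'Multi-Tenant Isolation @p0' is selected by @p0 even though its own title carries no tag.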
A multi-tenant data leak was discovered in Boots Portal. The tests in this section are designed to catch this exact class of bug and prevent it from ever reaching production again. These tests must run on every deploy.
Playwright's BrowserContext lets you simulate completely separate users in the same test. Each context has its own cookies, localStorage, and session — just like two different people opening your app in two different browsers.
    import { test, expect } from '@playwright/test';

    test.describe('Multi-Tenant Isolation @p0', () => {
      test('Tenant A proposals must NOT appear for Tenant B', async ({ browser }) => {
        // Create two completely separate browser contexts
        const tenantAContext = await browser.newContext();
        const tenantBContext = await browser.newContext();
        const tenantAPage = await tenantAContext.newPage();
        const tenantBPage = await tenantBContext.newPage();

        // Login as Tenant A
        await tenantAPage.goto('/login');
        await tenantAPage.getByLabel('Email').fill('tenantA@test.com');
        await tenantAPage.getByLabel('Password').fill('testpass123');
        await tenantAPage.getByRole('button', { name: 'Sign In' }).click();
        await tenantAPage.waitForURL('/dashboard');

        // Tenant A creates a proposal
        await tenantAPage.getByRole('link', { name: 'New Proposal' }).click();
        await tenantAPage.getByLabel('Meeting Notes').fill(
          'SECRET: Client wants $50K budget for AI chatbot'
        );
        await tenantAPage.getByRole('button', { name: 'Generate' }).click();
        await tenantAPage.getByTestId('proposal-card').first().waitFor();

        // Login as Tenant B
        await tenantBPage.goto('/login');
        await tenantBPage.getByLabel('Email').fill('tenantB@test.com');
        await tenantBPage.getByLabel('Password').fill('testpass456');
        await tenantBPage.getByRole('button', { name: 'Sign In' }).click();
        await tenantBPage.waitForURL('/dashboard');
        // Let the dashboard finish loading so the absence checks are not vacuous
        await tenantBPage.waitForLoadState('networkidle');

        // CRITICAL ASSERTION: Tenant B must NOT see Tenant A's data
        const allText = await tenantBPage.textContent('body');
        expect(allText).not.toContain('SECRET');
        expect(allText).not.toContain('$50K budget');
        expect(allText).not.toContain('tenantA');

        // Cleanup
        await tenantAContext.close();
        await tenantBContext.close();
      });
    });
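In the test above, the explicit close() calls at the end are not reached if an assertion throws first. One possible way to guarantee cleanup is a small helper that wraps context creation in try/finally, which could live in the tenant.fixture.ts file from the project layout. The helper name and the minimal interfaces below are assumptions for illustration. They describe only the two methods the helper needs, which Playwright's Browser and BrowserContext objects provide.

```typescript
// Minimal structural types so the sketch has no external dependencies.
interface ContextLike<P> {
  newPage(): Promise<P>;
  close(): Promise<void>;
}
interface BrowserLike<P> {
  newContext(): Promise<ContextLike<P>>;
}

// Hypothetical helper: opens one isolated context and page per tenant name,
// runs the callback, and always closes every context afterwards.
async function withTenantPages<P, R>(
  browser: BrowserLike<P>,
  tenantNames: string[],
  run: (pages: Record<string, P>) => Promise<R>,
): Promise<R> {
  const contexts: ContextLike<P>[] = [];
  const pages: Record<string, P> = {};
  try {
    for (const name of tenantNames) {
      const context = await browser.newContext();
      contexts.push(context);
      pages[name] = await context.newPage();
    }
    return await run(pages);
  } finally {
    // Runs on success and on failure, so no context outlives the callback.
    await Promise.all(contexts.map((context) => context.close()));
  }
}
```

Inside a test this might be called as withTenantPages(browser, ['a', 'b'], async ({ a, b }) => { ... }), with the login steps and assertions placed in the callback.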
    import { test, expect } from '@playwright/test';

    test.describe('API Multi-Tenant Isolation @p0', () => {
      test('Supabase RLS blocks cross-tenant reads', async ({ request }) => {
        // Authenticate as Tenant A via API
        const authA = await request.post('/api/auth/login', {
          data: { email: 'tenantA@test.com', password: 'testpass123' }
        });
        expect(authA.ok()).toBeTruthy();
        const tokenA = (await authA.json()).access_token;

        // Create proposal as Tenant A
        const createRes = await request.post('/api/proposals', {
          headers: { Authorization: `Bearer ${tokenA}` },
          data: { title: 'Secret Proposal', notes: 'Confidential data' }
        });
        // Without these checks a failed create would leave proposalId undefined
        // and the isolation assertions below would pass for the wrong reason
        expect(createRes.ok()).toBeTruthy();
        const proposalId = (await createRes.json()).id;
        expect(proposalId).toBeDefined();

        // Authenticate as Tenant B
        const authB = await request.post('/api/auth/login', {
          data: { email: 'tenantB@test.com', password: 'testpass456' }
        });
        expect(authB.ok()).toBeTruthy();
        const tokenB = (await authB.json()).access_token;

        // Tenant B tries to read Tenant A's proposal
        const readRes = await request.get(`/api/proposals/${proposalId}`, {
          headers: { Authorization: `Bearer ${tokenB}` }
        });
        // MUST be 403 or 404, NEVER 200
        expect([403, 404]).toContain(readRes.status());

        // Tenant B lists all proposals: must not include Tenant A's
        const listRes = await request.get('/api/proposals', {
          headers: { Authorization: `Bearer ${tokenB}` }
        });
        const proposals: Array<{ id: string }> = await listRes.json();
        const ids = proposals.map((p) => p.id);
        expect(ids).not.toContain(proposalId);
      });
    });
| Test Case | Priority | Method |
|---|---|---|
| Proposals don't leak between tenants | P0 | Browser + API |
| Brain/client intelligence is tenant-scoped | P0 | API |
| Shared proposal links respect permissions | P0 | Browser |
| Dashboard data filtered by org_id | P0 | Browser + API |
| Contract generation scoped to tenant | P1 | API |
| Report builder data isolated | P1 | Browser |
| Search results tenant-scoped | P1 | API |
| Audit logs show correct tenant actions | P2 | API |
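Several rows in the table above (dashboard data, search results, reports) reduce to the same check: every record returned to a tenant should carry that tenant's org_id. A reusable helper for that check might look like the sketch below. It assumes API responses expose an org_id field on each row, which the source does not confirm, and the helper name is hypothetical.

```typescript
// Assumed shape: each row returned by the API carries its owning organisation.
interface TenantScopedRow {
  id: string;
  org_id: string;
}

// Returns every row that belongs to a different organisation than expected.
// An empty result means no cross-tenant rows were found in this response.
function findForeignRows<T extends TenantScopedRow>(rows: T[], expectedOrgId: string): T[] {
  return rows.filter((row) => row.org_id !== expectedOrgId);
}
```

In a test, asserting that findForeignRows(rows, tenantOrgId) equals an empty array reports the offending rows directly when the check fails, which makes a leak easier to trace than a bare boolean.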
These capabilities go far beyond basic click-and-assert testing. Each one saves hours of manual work.
This is the fastest way to write tests. Playwright opens a browser, you click through your app, and it writes the test code in real time.
    # Open codegen pointed at your local dev server
    npx playwright codegen http://localhost:3000

    # With device emulation
    npx playwright codegen --device='iPhone 14' http://localhost:3000

    # Save directly to a file
    npx playwright codegen --output e2e/tests/recorded.spec.ts http://localhost:3000
When a test fails, the trace viewer lets you step through every action, see screenshots at each step, inspect the DOM, view network requests, and see console logs.
    # Run tests with traces
    npx playwright test --trace on

    # Open a trace file (each test writes its own trace.zip in a subfolder of test-results/)
    npx playwright show-trace test-results/<test-output-folder>/trace.zip

    # Or drag-and-drop the zip to:
    # https://trace.playwright.dev
    import { test, expect } from '@playwright/test';

    test('API: create and retrieve proposal', async ({ request }) => {
      const response = await request.post('/api/proposals', {
        data: { title: 'Test', notes: 'Meeting notes here' }
      });
      expect(response.ok()).toBeTruthy();

      const body = await response.json();
      expect(body.id).toBeDefined();
    });
    import { test, expect } from '@playwright/test';

    test('handles AI generation failure gracefully', async ({ page }) => {
      // Mock the AI endpoint to return an error
      await page.route('**/api/generate', route =>
        route.fulfill({ status: 500, body: 'AI service down' })
      );

      await page.goto('/proposals/new');
      await page.getByRole('button', { name: 'Generate' }).click();

      // Assert graceful error handling
      await expect(page.getByText('generation failed')).toBeVisible();
    });
Login once, reuse the session across all tests. This dramatically speeds up your test suite.
    import { test as setup, expect } from '@playwright/test';

    setup('authenticate', async ({ page }) => {
      await page.goto('/login');
      await page.getByLabel('Email').fill('test@bootsagentai.com');
      await page.getByLabel('Password').fill('testpassword');
      await page.getByRole('button', { name: 'Sign In' }).click();
      await page.waitForURL('/dashboard');

      // Save the authenticated state (reused by all tests)
      await page.context().storageState({ path: '.auth/user.json' });
    });
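Saving the state is only half of the reuse: a project also has to load the file through the storageState option. The fragment below is a sketch of how that wiring might look, showing only the relevant projects. The file path matches the setup script above, and the rest of the configuration is omitted.

```typescript
import { defineConfig, devices } from '@playwright/test';

export default defineConfig({
  projects: [
    // Runs the setup script first, which writes .auth/user.json.
    { name: 'setup', testMatch: /.*\.setup\.ts/ },
    {
      name: 'chromium',
      use: {
        ...devices['Desktop Chrome'],
        // Start every test in this project with the saved session.
        storageState: '.auth/user.json',
      },
      dependencies: ['setup'],
    },
  ],
});
```

Tests that need to start logged out, such as login-flow or tenant-isolation tests, can opt out in their own file with test.use({ storageState: { cookies: [], origins: [] } }).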
Use npx playwright codegen to record a rough test, then paste it into Claude Code and ask it to refactor the recording using Page Object Model patterns and add proper assertions. This combines the speed of recording with the quality of AI-assisted code.
Page Objects keep your tests clean and maintainable. When the UI changes, you update one file instead of every test.
    import { Page, Locator, expect } from '@playwright/test';

    export class ProposalPage {
      readonly page: Page;
      readonly meetingNotesInput: Locator;
      readonly generateButton: Locator;
      readonly streamingStatus: Locator;
      readonly proposalOutput: Locator;
      readonly proposalCards: Locator;

      constructor(page: Page) {
        this.page = page;
        this.meetingNotesInput = page.getByLabel('Meeting Notes');
        this.generateButton = page.getByRole('button', { name: /generate/i });
        this.streamingStatus = page.locator('[data-testid="streaming-status"]');
        this.proposalOutput = page.locator('[data-testid="proposal-output"]');
        this.proposalCards = page.locator('[data-testid="proposal-card"]');
      }

      async generateProposal(notes: string) {
        await this.meetingNotesInput.fill(notes);
        await this.generateButton.click();
        // Wait for streaming to complete (90 second timeout for AI generation)
        await this.streamingStatus.waitFor({ state: 'hidden', timeout: 90000 });
        await expect(this.proposalOutput).toBeVisible();
      }

      async getProposalCount(): Promise<number> {
        return await this.proposalCards.count();
      }

      async getProposalText(): Promise<string> {
        return (await this.proposalOutput.textContent()) ?? '';
      }
    }
    name: Playwright Tests
    on:
      push:
        branches: [main]
      pull_request:
        branches: [main]
    jobs:
      test:
        timeout-minutes: 30
        runs-on: ubuntu-latest
        steps:
          - uses: actions/checkout@v4
          - uses: actions/setup-node@v4
            with:
              node-version: 22
          - run: npm ci
          - run: npx playwright install --with-deps
          - run: npx playwright test
          - uses: actions/upload-artifact@v4
            if: ${{ !cancelled() }}
            with:
              name: playwright-report
              path: playwright-report/
              retention-days: 30
    #!/bin/sh
    # Pre-push hook: a non-zero exit aborts the push
    echo "Running P0 isolation tests..."
    npx playwright test --grep @p0 --project=chromium || exit 1
    echo "All P0 tests passed. Safe to push."
- 1. getByRole: page.getByRole('button', { name: 'Generate' })
- 2. getByLabel: page.getByLabel('Meeting Notes')
- 3. getByPlaceholder: page.getByPlaceholder('Enter notes...')
- 4. getByText: page.getByText('Proposal Generated')
- 5. getByTestId: page.getByTestId('proposal-card')
- 6. CSS/XPath: page.locator('.btn-primary') (avoid this)
| Assertion | Use For |
|---|---|
| expect(locator).toBeVisible() | Element is on screen |
| expect(locator).toHaveText('...') | Element has specific text |
| expect(locator).toHaveCount(n) | Number of matching elements |
| expect(page).toHaveURL('...') | Current page URL |
| expect(page).toHaveTitle('...') | Page title |
| expect(response).toBeOK() | API returned 2xx |
| expect(locator).not.toBeVisible() | Element is hidden/removed |
The complete testing hierarchy below shows where Playwright fits alongside your existing Vitest tests.
| Layer | Tool | What It Tests | Speed | Run When |
|---|---|---|---|---|
| Unit | Vitest | Individual functions, utils | < 1 sec | Every save (watch mode) |
| Integration | Vitest | API routes, DB queries, RLS | 1–5 sec | Before commit |
| E2E Critical | Playwright @p0 | Multi-tenant isolation, auth | 10–30 sec | Before every push |
| E2E Full | Playwright | All user journeys | 2–5 min | CI on every PR |
| Visual | Playwright screenshots | UI regression | 1–3 min | Weekly or on UI changes |
    # Step 1: Unit tests (fast feedback)
    npm run test:unit

    # Step 2: P0 isolation tests (must pass, no exceptions)
    npx playwright test --grep @p0

    # Step 3: Full E2E suite (if time allows)
    npx playwright test

    # Step 4: Check the report
    npx playwright show-report

    # Step 5: Deploy with confidence
    git push
Never deploy without running npm run test:e2e:p0 first. This single command runs your multi-tenant isolation tests and takes under 30 seconds. It is the cheapest insurance policy against shipping another P0.
Work through these in order. Each step builds on the previous one.
- Install Playwright (npm init playwright@latest)
- Write multi-tenant isolation test (Section 4.1)
- Write API-level RLS test (Section 4.2)
- Add @p0 tag to all isolation tests
- Run and verify all P0 tests pass
- Create Page Object Models (login, dashboard, proposal)
- Set up auth state reuse (Section 5.6)
- Add auth flow tests (login, logout, session expiry)
- Add proposal CRUD tests (create, edit, delete)
- Set up GitHub Actions CI (Section 7.1)
- Add Brain generation flow tests (streaming UI)
- Add proposal sharing/publishing tests
- Add contract generation tests
- Add visual regression baselines
- Add mobile device emulation tests
- Add pre-push git hook for P0 tests
- Set up test data factories for reproducible tenant creation
- Add performance benchmarking (proposal gen <60 sec)
- Document test coverage for SOC 2 audit trail
- Add kill switch verification tests
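For the performance benchmarking item in the checklist above (proposal generation under 60 seconds), a timing wrapper is one simple starting point. The sketch below uses hypothetical helper names and measures wall-clock time around any async action. It is an illustration of the idea, not an established part of the test suite.

```typescript
// Hypothetical helper: runs an async action and reports how long it took.
async function timed<T>(action: () => Promise<T>): Promise<{ result: T; elapsedMs: number }> {
  const start = Date.now();
  const result = await action();
  return { result, elapsedMs: Date.now() - start };
}

// Hypothetical budget check; the default mirrors the 60 second target in the checklist.
function withinBudget(elapsedMs: number, budgetMs: number = 60_000): boolean {
  return elapsedMs < budgetMs;
}
```

A test could wrap a page object call such as generateProposal in timed(...) and then assert that withinBudget(elapsedMs) is true. Wall-clock timings vary between machines and CI runners, so a budget with some headroom is likely to be less flaky than a tight one.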