Boots On The Ground AI — Platform Playbook

Section 01

The Problem We Solve

Small businesses lose deals because proposals take too long, sound generic, and carry no memory of what worked before.

4–8 Hours Per Proposal

A solo consultant or small team spends half a workday or more writing each proposal manually. That's time not spent closing deals or delivering work.

No Memory, No Learning

Every proposal starts from scratch. What worked for a similar client last quarter? What tone landed the deal? Nobody remembers, nobody tracks it.

Enterprise Tools Don't Fit

PandaDoc, Proposify, and Qwilr cost $500+/mo, require team onboarding, and weren't built for AI-native workflows. Overkill for a 1-10 person shop.

The gap: There's no proposal tool that's affordable for small businesses, learns from past deals, and generates genuinely strategic proposals — not just templates with variable fields. That's what we built.

Generation Time: <2 min
AI Agents: 7
API Endpoints: 50+
Pricing Tiers: 4

Section 02

Business Requirements

What the system must do, why it must do it, and the constraints that shaped every decision.

Multi-Tenant SaaS

Every organization's data is completely isolated. One codebase, one database, any number of tenants. Row-Level Security is enforced at the database layer, not in application code.

AI That Matches Your Voice

Generated proposals must sound like the business, not like ChatGPT. The system ingests org knowledge (voice, services, pricing philosophy) and uses it in every generation.

Intelligence That Compounds

Each proposal teaches the system. Client psychographics, deal outcomes, and strategy patterns are stored and retrieved via vector search for future proposals.

Production-Grade Security

Auth with JWT cookies, RLS on every table, rate limiting on every endpoint, audit logging on every action. No shortcuts, no "we'll add that later."

Tiered Billing

Free tier for trial, paid tiers with real limits. Stripe-powered checkout, webhook-driven plan updates, usage tracking per org.

White-Label Ready

Each org can customize branding — colors, logo, proposal templates. The platform itself is white-labelable from a single config file.

Section 03

What We Built

A complete feature inventory of the Boots Proposal System, organized by domain.

Document Generation

AI Proposal Generation

Live

Multi-method creation: AI chat, notes-to-proposal, or brain pipeline. Full proposal with cover, summary, features, architecture, timeline, and pricing.

Contract Generation

Live

Generate service agreements from scratch or convert approved proposals. Legal section templates with customizable terms.

Report / Assessment

Live

Technical assessment reports with protocol parsing. Professional formatting with executive summary.

Rich Editors

Live

Full-featured editors for proposals, contracts, and reports with tabbed sections, live preview, image upload, and auto-save.

Template Builder

Live

Create and manage reusable templates. System templates across industries. Save any proposal as a template.

Public Share Links

Live

Generate shareable URLs for proposals, contracts, and reports. Read-only, no auth required, with view tracking.

Intelligence Layer

Client Intelligence

Live

Psychographic profiles per client: preferred tone, key themes, risks, pricing strategy. Persisted across proposals with deal history.

Vector Search (pgvector)

Live

OpenAI embeddings (1536-dim) stored in PostgreSQL. Cosine similarity search finds similar past clients to inform new proposals.

7-Agent Brain Pipeline

Live

Context → Research → Strategist → Writer → Reviewer → Commit → Memory. Feedback loops, resume-from-step, and cached outputs.

Org Brain

Live

Knowledge base per org: voice guidance, service descriptions, pricing context, company profile. Feeds every AI generation.

Prospect Research

Live

Auto-scrapes prospect websites, extracts industry/services/tone profile, caches results. Never blocks the pipeline on failure.

Deal Outcome Tracking

Foundation

Track won/lost/pending on proposals. Atomic update propagated to client intelligence deal history for future strategy.

Platform & Operations

Multi-Tenant Auth

Live

Supabase Auth with JWT cookies, Google OAuth, email/password, OTP. Role hierarchy: owner > admin > member > viewer.

Stripe Billing

Live

4 tiers (free/starter/pro/enterprise). Checkout sessions, customer portal, webhook-driven plan updates, usage enforcement.

Audit Logging

Live

Every action logged: user, org, action type, resource, IP, user agent. Fire-and-forget — never crashes the main operation.
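
A minimal TypeScript sketch of the fire-and-forget pattern described above. The `AuditEntry` shape mirrors the fields listed; `logAudit`, `AuditWriter`, and the `onError` callback are hypothetical names for illustration, not the platform's actual API.

```typescript
// Hypothetical shape based on the fields listed above.
interface AuditEntry {
  userId: string;
  orgId: string;
  action: string;
  resourceType: string;
  ip: string;
  userAgent: string;
}

// Assumed writer signature; in practice this would insert into audit_log.
type AuditWriter = (entry: AuditEntry) => Promise<void>;

// Fire-and-forget: start the write, never await it, never let it throw.
function logAudit(
  entry: AuditEntry,
  write: AuditWriter,
  onError: (err: unknown) => void = () => {},
): void {
  try {
    // A rejected promise is routed to onError instead of becoming unhandled.
    write(entry).catch(onError);
  } catch (err) {
    // Covers a writer that throws synchronously before returning a promise.
    onError(err);
  }
}
```

The caller continues immediately; a failed audit write is reported through `onError` (for example to an error tracker) rather than surfacing in the main operation.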

Rate Limiting

Live

Upstash Redis distributed rate limiting. Per-endpoint configs: AI generation (10/min), auth (10/min), standard (60/min).
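
The per-endpoint limits above can be sketched as a small config plus a counter. This is an in-memory fixed-window illustration only: the production system uses Upstash Redis, and the algorithm it applies is not stated here. `RATE_LIMITS`, `FixedWindowLimiter`, and the endpoint class names are assumptions.

```typescript
// Limits taken from the text; the key names are illustrative.
const RATE_LIMITS = {
  aiGeneration: { limit: 10, windowMs: 60000 },
  auth: { limit: 10, windowMs: 60000 },
  standard: { limit: 60, windowMs: 60000 },
} as const;

type EndpointClass = keyof typeof RATE_LIMITS;

// In-memory fixed-window counter. The Map stands in for the shared Redis store.
class FixedWindowLimiter {
  private windows = new Map<string, { start: number; count: number }>();

  allow(endpoint: EndpointClass, ip: string, now: number = Date.now()): boolean {
    const { limit, windowMs } = RATE_LIMITS[endpoint];
    const key = `${endpoint}:${ip}`;
    const current = this.windows.get(key);
    // Start a fresh window when none exists or the previous one has expired.
    if (!current || now - current.start >= windowMs) {
      this.windows.set(key, { start: now, count: 1 });
      return true;
    }
    if (current.count >= limit) return false;
    current.count += 1;
    return true;
  }
}
```

An API route would call `allow("aiGeneration", clientIp)` before doing any work and return a 429 response when it yields `false`.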

Error Tracking

Live

Sentry on client, server, and edge. Source maps, full traces, console log forwarding. User-friendly error messages only.

Magic Onboarding

Live

AI-powered onboarding: scrapes your website, extracts company profile, generates org brain entries. Ready to generate in minutes.

Section 04

Architecture Overview

Three deployment targets, one shared database, and generation services that fail independently of each other.

System Architecture

Vercel · IAD1

Next.js 16 App: App Router · React 19 · TypeScript
50+ API Routes: Auth · CRUD · AI · Billing · Admin
Edge Middleware: Auth gate · Onboarding · Rate limiting
Rich Editors: Proposal · Contract · Report

Supabase · Cloud

PostgreSQL + RLS: Multi-tenant isolation · 25+ tables
pgvector Extension: 1536-dim embeddings · HNSW index
Auth Service: JWT · OAuth · OTP · Cookie sessions
Realtime: Agent progress streaming to UI
Storage: Org logos · Proposal images

Fly.io · ORD

boots-proposal-worker: TypeScript · Job processing · Retry logic
boots-brain: Python FastAPI · 7-agent pipeline
Shared State via Supabase: Jobs table · Agent runs · Proposals

External Service Integrations

AI Models: Anthropic Claude (Opus · Sonnet · Haiku)
Embeddings: OpenAI (text-embedding-3-small)
Payments: Stripe (Checkout · Webhooks)
Rate Limiting: Upstash Redis (Distributed · IP-based)
Errors: Sentry (Client · Server · Edge)
Email: Resend (Transactional · Leads)

Request Flow: Proposal Generation

User clicks "Generate" → Middleware (Auth Gate) → API Route (Rate Limit) → Insert Job to Supabase → Worker/Brain Claims Job → 7-Agent Pipeline → Save Result to Supabase → UI Updates via Realtime

Section 05

Tech Stack

Every technology was chosen for a specific reason. No resume-driven development.

Next.js 16

Framework

App Router, React Server Components, streaming, edge middleware, Turbopack. The full-stack React framework.

Supabase

Database + Auth

PostgreSQL with RLS, pgvector, Realtime, Auth, Storage. One platform replaces 5 services.

Python FastAPI

Brain Service

Multi-agent orchestration, async pipeline, ML ecosystem access. Where TypeScript doesn't fit.

Claude (Anthropic)

AI Models

Sonnet for strategy and writing, Haiku for validation, with Opus also available. Multi-model by design.

pgvector

Vector Search

Embeddings in PostgreSQL. No separate vector DB. HNSW index for fast cosine similarity.

Stripe

Billing

Checkout sessions, subscription management, webhook-driven plan updates. Test mode ready.

Fly.io

Worker Hosting

Docker containers with no timeout limits. Background jobs that run for 5+ minutes. ORD region.

GitHub Actions

CI/CD

Automated quality gates: lint, typecheck, test, build. Conditional deploys to Vercel + Fly.io.

Section 06

Design Decisions

The "why" behind every architectural choice. These aren't arbitrary — each one solved a real problem.

TypeScript for the Web App, Python for the Brain

Options: All TypeScript vs Hybrid TS + Python

The brain service orchestrates 7 agents with complex state management, async execution, and structured JSON output. Python's Anthropic SDK, asyncio, and the ML ecosystem (embeddings, NLP) are significantly more mature for this use case. The web app, API routes, and worker stay TypeScript for type-safe React and shared types with the database. They communicate via Supabase as shared state — no direct HTTP calls between services.

Supabase over Firebase / Prisma / Raw PostgreSQL

Options: Firebase vs Prisma + PG vs Supabase

Supabase gives us PostgreSQL with RLS for multi-tenancy, pgvector for embeddings, Realtime for streaming agent progress, Auth for user management, and Storage for file uploads — all in one platform. Firebase lacks RLS and pgvector. Prisma adds an ORM layer we don't need when Supabase's client SDK is type-safe from generated types.

pgvector in Supabase over Pinecone / Weaviate

Options: Separate vector DB vs pgvector extension

Our vector data (client intelligence embeddings) is tightly coupled to relational data (org_id, client_name, deal_history). A separate vector DB means syncing two databases. pgvector keeps vectors in the same PostgreSQL instance, queried with the same RLS policies, in the same transaction. At our scale (<100K vectors per org), HNSW index performance is more than sufficient.

Separate Brain Service over In-Worker Agents

Options: Agents in TS worker vs Dedicated Python brain

The 7-agent pipeline typically runs 60-120 seconds, with a 300-second hard limit, and needs complex state management (resume-from-step, cached outputs, feedback loops). Running this in the TypeScript worker would require reimplementing Python's asyncio patterns and the Anthropic Python SDK's superior structured output support. The brain polls Supabase independently; if it goes down, the worker still processes simple jobs.

Poll-Based Job Queue over Message Broker

Options: RabbitMQ / SQS vs PostgreSQL polling

Both the worker and brain poll Supabase using FOR UPDATE SKIP LOCKED — an atomic claim pattern that prevents duplicate processing and supports horizontal scaling. No message broker to manage, no dead letter queues to monitor, no additional infrastructure. The jobs table IS the queue. Debuggable with a SQL query.

Return-Based Auth Errors over Throw-Based

Options: throw new AuthError() vs return Response(401)

requireSession() returns SessionContext | Response. API routes check if (ctx instanceof Response) return ctx. This eliminates try/catch boilerplate, makes the error path explicit in the type system, and prevents unhandled auth errors from leaking stack traces.
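
The pattern can be sketched as follows. This is a simplified illustration, not the platform's code: the real `requireSession()` takes no arguments and reads the Supabase JWT cookie, whereas here the token and a hypothetical `SessionLookup` function are passed in so the example is self-contained. `handleListProposals` is also a made-up route handler.

```typescript
interface SessionContext {
  userId: string;
  orgId: string;
  role: "owner" | "admin" | "member" | "viewer";
}

// Hypothetical lookup standing in for JWT validation plus profile/org lookup.
type SessionLookup = (token: string | null) => SessionContext | null;

// Returns either a usable session or a ready-made 401 response. Never throws.
function requireSession(token: string | null, lookup: SessionLookup): SessionContext | Response {
  const session = lookup(token);
  if (!session) {
    return new Response(JSON.stringify({ error: "Unauthorized" }), {
      status: 401,
      headers: { "Content-Type": "application/json" },
    });
  }
  return session;
}

function handleListProposals(token: string | null, lookup: SessionLookup): Response {
  const ctx = requireSession(token, lookup);
  if (ctx instanceof Response) return ctx; // error path is explicit, no try/catch
  // From here on TypeScript has narrowed ctx to SessionContext.
  return new Response(JSON.stringify({ orgId: ctx.orgId }), { status: 200 });
}
```

Because the return type is a union, the compiler refuses to read `ctx.orgId` until the `instanceof Response` branch has been handled, which is what makes the error path hard to forget.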

Language Boundary Map

TypeScript: Next.js App (Vercel) · 50+ API Routes · Edge Middleware · Worker (Fly.io) · React Components

Shared state: Supabase

Python: FastAPI Brain (Fly.io) · 7-Agent Orchestrator · Memory Manager · Embedding Generator · Prospect Scraper

Section 07

Database Architecture

25+ tables, Row-Level Security on every one, pgvector for embeddings. Tables are grouped by domain: Core / Auth · Content · Intelligence · Billing · System.

Entity Relationship Diagram

organizations: id PK · name text · plan text · branding jsonb · stripe_customer_id text · chatbot_instructions text

profiles: id PK · email text · full_name text · phone text · onboarding_complete bool

org_members: org_id FK → organizations · user_id FK → profiles · role text · invited_by uuid

clients: id PK · org_id FK → organizations · name text · email, phone text · status text

proposals: id PK · org_id FK → organizations · client_id FK → clients · title, status, slug text · executive_summary text · features, tech_stack jsonb · brain_status text · deal_outcome text

contracts: id PK · org_id FK → organizations · client_name text · title, status, slug text · scope, terms text

templates: id PK · org_id FK → organizations · name, type text · content jsonb · is_system bool

services: id PK · org_id FK → organizations · name text · description text

client_intelligence: org_id FK → organizations · client_name text (UNIQUE) · psychographics jsonb · deal_history jsonb[] · embedding vector(1536)

org_brain: id PK · org_id FK → organizations · content_type text · content_text text · embedding vector(1536)

agent_runs: id PK · proposal_id FK → proposals · status, current_step text · internal_log jsonb[] · cached_writer_output jsonb · resume_from text

prospect_intelligence: org_id FK → organizations · prospect_name text (UNIQUE) · website_url text · profile jsonb · cached_at timestamptz

jobs: id PK · org_id FK → organizations · job_type text · status text · payload, result jsonb · actual_tokens int

audit_log: user_id FK → profiles · org_id FK → organizations · action text · resource_type text · ip, user_agent text

system_settings: key PK · value jsonb · description text

invoices: id PK · org_id FK → organizations · stripe_invoice_id text · amount, status text

Row-Level Security (RLS): Every table has RLS policies that filter by org_id. When a user queries proposals, they only see their organization's proposals — enforced at the database layer, not application code. Even a bug in the API route can't leak data across tenants.

Section 08

Brain Service — Multi-Agent Orchestration

A 7-agent Python pipeline that researches prospects, builds strategy, writes proposals, self-reviews, and remembers what it learned.

Agent Pipeline Flow

1

Context Node

Loads org brain, client intelligence (psychographics + deal history), runs pgvector similarity search for similar past clients. Detects cold start vs returning client.

2

Research Node

Scrapes prospect website via Jina Reader. Extracts industry, services, tone, size signals, differentiator. Caches in prospect_intelligence. Never blocks pipeline on failure.

3

Strategist Node

Claude Sonnet analyzes all context: client history + research + org knowledge. Outputs: positioning, tone, key_themes, pricing_strategy, risks_to_address. The strategic brain.

Feedback Loop

4

Writer Node

Claude Sonnet generates full ProposalData JSON: cover, summary, problem, features, architecture, tech stack, timeline, pricing, about, next steps. Matches org voice from brain.

5

Reviewer Node

Claude Haiku validates completeness against proposal heuristics. Flags embarrassing gaps. Soft pass on parse failure — never blocks delivery. Loops back to Writer if gaps found.

6

Commit Node

Writes final proposal JSON to Supabase. Sets brain_status = 'content_generated'. Marks needs_human_review if reviewer didn't pass.

7

Memory Node

Non-fatal write-back. Upserts client_intelligence with psychographics and deal entry. Generates OpenAI embedding for future vector search. Pipeline succeeds even if this fails.

Production Safety Patterns

300s Wall-Clock Timeout

The entire pipeline is wrapped in asyncio.wait_for() with a 300-second hard limit. If any agent hangs, the pipeline is killed cleanly.

Resume-From-Step

Writer output is cached in cached_writer_output. If the reviewer or commit fails, the next attempt skips directly to the cached checkpoint.

Non-Fatal Memory

The Memory Node runs after the proposal is committed. If embedding generation or intelligence upsert fails, the proposal is still delivered.

Atomic Job Claiming

FOR UPDATE SKIP LOCKED prevents duplicate processing. Multiple brain instances can poll simultaneously without conflict.

Multi-Model Strategy: The brain uses different Claude models for different tasks. Sonnet for writing and strategy (creative, long-form). Haiku for review (fast validation, 80% cheaper). This optimizes for both quality and cost.

Section 09

Client Intelligence — Memory That Compounds

Every proposal teaches the system. Client psychographics, deal outcomes, and strategy patterns are stored and retrieved via vector search.

Intelligence Flywheel

Client Intelligence

Generate Proposal: Brain pipeline runs
Extract Psychographics: Tone, themes, strategy
Store + Embed: pgvector 1536-dim
Track Outcome: Won / Lost / Pending
Next Proposal: Same or similar client
Vector Search: Find similar clients
Load History: Prior deals inform strategy
Smarter Strategy: Personalized positioning

Data Model

Psychographics (JSONB)

{
  "preferred_tone": "professional but approachable",
  "key_themes": ["scalability", "cost savings"],
  "risks_noted": ["budget conscious"],
  "pricing_strategy_used": "value-based"
}

Deal History (JSONB Array)

[{
  "proposal_id": "uuid-abc...",
  "date": "2026-03-15",
  "positioning": "technical partner",
  "key_themes": ["automation"],
  "outcome": "won"
}]
        

How vector search works: When a new proposal is created, the brain calls match_client_intelligence() with the requirements text embedded via OpenAI's text-embedding-3-small (1536 dimensions). PostgreSQL's pgvector extension finds the most similar past clients using cosine similarity, returning their psychographics and deal history. The strategist uses this to personalize the approach.
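
To make the similarity step concrete, here is a small TypeScript sketch of cosine-similarity ranking. In the real system this happens inside PostgreSQL via pgvector and `match_client_intelligence()`; the `ClientIntelligenceRow` shape, `cosineSimilarity`, and `matchSimilarClients` below are illustrative stand-ins, and the vectors in any usage would be 1536-dimensional rather than toy-sized.

```typescript
// Illustrative row shape: only the fields needed for ranking.
interface ClientIntelligenceRow {
  clientName: string;
  embedding: number[];
}

// Cosine similarity: dot(a, b) / (|a| * |b|). Returns 0 for a zero vector.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("Embedding dimensions must match");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Rank stored clients by similarity to the query embedding and keep the top matches.
function matchSimilarClients(
  query: number[],
  rows: ClientIntelligenceRow[],
  matchCount: number,
): { clientName: string; similarity: number }[] {
  return rows
    .map((row) => ({
      clientName: row.clientName,
      similarity: cosineSimilarity(query, row.embedding),
    }))
    .sort((x, y) => y.similarity - x.similarity)
    .slice(0, matchCount);
}
```

This brute-force scan is only for explanation; an HNSW index gives approximate nearest neighbors without comparing against every row.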

Day 1

Cold start. No history. Generic but on-brand proposal using org brain knowledge.

Day 30

5 proposals generated. Vector search finds similar clients. Strategy starts personalizing.

Day 90

Returning client with full context. Knows their tone, themes, what won before. Near-automatic.

Section 10

Job Architecture

Two job processors, one shared queue in PostgreSQL, atomic claiming, and production-grade reliability.

Job Lifecycle State Machine

Pending → Claimed → Processing → Completed

Other states: Failed · Cancelled

TypeScript Worker (Fly.io)

Job types: generate_proposal, generate_contract, parse_readme, extract_text

Polling: 100ms interval via claim_next_job() RPC

Timeout: 60s hard limit per job

Retry: Transient errors (rate_limit, timeout, ai_overloaded, network) auto-retry up to max_retries
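
The retry rule above can be sketched as a pure decision function. The error codes come from the list above; the `JobAttempt` shape, the function name, and the choice to send a retried job back to `pending` are assumptions made for illustration.

```typescript
// Error codes listed in the text as transient.
const TRANSIENT_ERRORS = new Set(["rate_limit", "timeout", "ai_overloaded", "network"]);

// Hypothetical view of a failed attempt; field names are illustrative.
interface JobAttempt {
  errorCode: string;
  retryCount: number;
  maxRetries: number;
}

type NextStatus = "pending" | "failed";

// Decide whether a failed job goes back to the queue or is marked failed.
function nextStatusAfterFailure(job: JobAttempt): NextStatus {
  const transient = TRANSIENT_ERRORS.has(job.errorCode);
  // Only transient errors are retried, and only while retries remain.
  return transient && job.retryCount < job.maxRetries ? "pending" : "failed";
}
```

Keeping the decision separate from the polling loop makes it easy to unit test without a database.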

Python Brain (Fly.io)

Job types: Brain proposals (7-agent pipeline)

Polling: claim_next_brain_proposal() RPC

Timeout: 300s wall-clock limit

Progress: Real-time updates via agent_runs table + Supabase Realtime

Kill Switches: Three levels of circuit breakers. Global kill switch in system_settings stops all processing. Per-org kill switch stops one tenant. Per-job cancellation marks individual jobs. The worker checks every 5 seconds during processing.
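
A minimal sketch of how the three levels might be checked in order, from broadest to narrowest. The `KillSwitchState` shape and `checkKillSwitches` are hypothetical; in the real system the global flag lives in `system_settings` and the worker would refresh this state from Supabase on each check.

```typescript
// Hypothetical snapshot of the three switch levels.
interface KillSwitchState {
  globalKill: boolean;
  killedOrgIds: ReadonlySet<string>;
  cancelledJobIds: ReadonlySet<string>;
}

type StopReason = "global" | "org" | "job" | null;

// Returns the broadest switch that applies, or null if processing may continue.
function checkKillSwitches(state: KillSwitchState, orgId: string, jobId: string): StopReason {
  if (state.globalKill) return "global";
  if (state.killedOrgIds.has(orgId)) return "org";
  if (state.cancelledJobIds.has(jobId)) return "job";
  return null;
}
```

A worker could call this on its periodic check during processing and abort the job as soon as the result is non-null.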

Section 11

Auth & Multi-Tenancy

Supabase Auth with JWT cookies, RLS-enforced tenant isolation, and a four-tier role hierarchy.

Authentication Flow

User Login → Supabase Auth (JWT + Cookie) → Middleware (getUser()) → Profile + Org Lookup → SessionContext (userId, orgId, role) → RLS Filters (by org_id)

Three Supabase Client Types

Client	Used By	RLS
`createBrowserClient()`	React components	Enforced
`createSupabaseServer()`	API routes, server components	Enforced
`supabaseAdmin()`	Webhooks, migrations only	Bypassed

Role Hierarchy

Role	Can Do
Owner	Everything + delete org + manage billing
Admin	Everything + manage team members
Member	Create, edit, publish documents
Viewer	Read-only access to all documents
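
One way to encode the hierarchy in the table above is a numeric rank per role, so that a "minimum role" check becomes a comparison. This is a sketch under that assumption; `ROLE_RANK` and `hasRole` are illustrative names, and the platform's own `requireRole()` may be implemented differently.

```typescript
// Higher rank means more privileges: owner > admin > member > viewer.
const ROLE_RANK = { viewer: 0, member: 1, admin: 2, owner: 3 } as const;

type Role = keyof typeof ROLE_RANK;

// True when the user's role is at least the required role.
function hasRole(actual: Role, required: Role): boolean {
  return ROLE_RANK[actual] >= ROLE_RANK[required];
}
```

For example, a route that manages team members would require `"admin"`, which both admins and owners satisfy.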

Defense in depth: Auth is checked at THREE layers. (1) Middleware validates JWT and blocks unauthenticated requests. (2) API routes call requireSession() and optionally requireRole(). (3) PostgreSQL RLS policies filter every query by org_id. Even if layers 1 and 2 fail, RLS prevents data leaks.

Section 12

Billing Integration

Stripe-powered with four tiers, real usage enforcement, and webhook-driven plan management.

Feature	Free	Starter	Professional	Enterprise
AI Proposals / Month	3	25	100	Unlimited
Total Proposals	5	50	500	Unlimited
Total Contracts	3	25	250	Unlimited
Total Reports	2	15	100	Unlimited
Clients	5	50	500	Unlimited
Team Members	1	3	10	Unlimited
Custom Branding	—	✓	✓	✓
Remove Watermark	—	—	✓	✓
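
The numeric limits in the table can be expressed as a lookup used for usage enforcement. The values below are copied from the table; the `PLAN_LIMITS` structure, the resource key names, and `canCreate` are assumptions for illustration rather than the platform's actual code.

```typescript
type Plan = "free" | "starter" | "professional" | "enterprise";
type Resource =
  | "aiProposalsPerMonth"
  | "proposals"
  | "contracts"
  | "reports"
  | "clients"
  | "teamMembers";

// "Unlimited" in the table is modeled as Infinity so comparisons stay uniform.
const UNLIMITED = Number.POSITIVE_INFINITY;

const PLAN_LIMITS: Record<Plan, Record<Resource, number>> = {
  free: { aiProposalsPerMonth: 3, proposals: 5, contracts: 3, reports: 2, clients: 5, teamMembers: 1 },
  starter: { aiProposalsPerMonth: 25, proposals: 50, contracts: 25, reports: 15, clients: 50, teamMembers: 3 },
  professional: { aiProposalsPerMonth: 100, proposals: 500, contracts: 250, reports: 100, clients: 500, teamMembers: 10 },
  enterprise: {
    aiProposalsPerMonth: UNLIMITED,
    proposals: UNLIMITED,
    contracts: UNLIMITED,
    reports: UNLIMITED,
    clients: UNLIMITED,
    teamMembers: UNLIMITED,
  },
};

// True when the org can create one more of the given resource.
function canCreate(plan: Plan, resource: Resource, currentUsage: number): boolean {
  return currentUsage < PLAN_LIMITS[plan][resource];
}
```

An API route would check `canCreate(org.plan, "proposals", currentCount)` before inserting and return an upgrade prompt when it fails.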

Section 13

CI/CD Pipeline

Automated quality gates on every PR. No code reaches production without passing all four checks.

Deploy Pipeline

PR Created → Lint → Typecheck → Test → Build → Merge to Main → Vercel Auto-Deploy

Quality Gate

Every PR runs: npm run lint, npm run typecheck, npm test, npm run build. All four must pass. No build bypasses allowed.

Type Sync

GitHub Actions auto-generates Supabase TypeScript types on merge. database.types.ts stays in sync with the actual schema.

Smoke Tests

After Vercel deploys, Playwright runs against the live preview URL. Health check + critical path validation.

Section 14

Enterprise Readiness

What exists today for enterprise deployments, and what's on the roadmap.

What We Have

Multi-Tenant Data Isolation

RLS on every table. Org data never crosses boundaries.

Role-Based Access Control

4-tier hierarchy with per-route enforcement.

Audit Logging

Every action captured: who, what, when, from where.

Rate Limiting

Distributed, per-endpoint, IP-based via Upstash Redis.

Kill Switches

3-level circuit breakers: global, per-org, per-job.

Error Sanitization

No raw API errors reach users. Sentry captures everything.

What's Missing

SSO / SAML

Currently limited to Google OAuth, email/password, and OTP. Enterprise SSO requires Supabase Pro.

SOC 2 Compliance

Architecture supports it, but formal audit not yet conducted.

API Access

No public API for integrations yet. All access via web UI.

SLA Monitoring

Sentry for errors, but no uptime SLA or status page.

Multi-Region

Each service runs in a single US region (Vercel IAD1, Fly.io ORD). Multi-region requires Supabase + Fly.io config.

Advanced Analytics

Basic AI usage dashboard. No win-rate analysis or pipeline metrics yet.

Section 15

Architecture Scorecard

Honest grades against industry best practices. No vanity metrics.

A

AI Integration

Multi-model strategy, prompt caching, embeddings, persistent memory, multi-agent orchestration with feedback loops. Cutting edge.

A

Developer Experience

TypeScript strict, auto-generated types, CI/CD quality gates, agent-governed workflow, comprehensive project documentation.

A-

Security

RLS, rate limiting, audit logging, input validation, error sanitization. Missing: SAML/SSO, formal penetration testing.

A-

Data Architecture

pgvector in PostgreSQL, JSONB for flexibility, RLS for isolation. Missing: data retention policies, backup verification.

B+

Scalability

Stateless API, async jobs, atomic claiming supports horizontal scaling. Single region, no auto-scaling on Fly.io.

B+

Reliability

Error tracking, retry logic, timeout protection, kill switches. No chaos testing, no dead letter queue.

B

Observability

Sentry errors, audit logs, console forwarding. Missing: structured metrics, APM dashboards, distributed tracing.

B

Testing

Vitest + Playwright framework established. Coverage exists but has gaps. E2E tests for critical paths.

Section 16

Lessons Learned

Hard-won knowledge from building this system. Each one cost real time and debugging sessions.

1

Build-Time vs Runtime Are Different Worlds

Next.js prerenders pages at build time where env vars may not exist. Runtime handles real requests where they MUST exist. We learned this the hard way when a "helpful" placeholder URL masked a broken Supabase connection in production. Rule: runtime code must throw on missing config. Build-time code can use placeholders.
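
The rule can be sketched with two small helpers. Both names are hypothetical, and the environment is passed in as a plain object so the example stays self-contained; in an application the argument would typically be `process.env`.

```typescript
type Env = Record<string, string | undefined>;

// Runtime code: fail loudly when config is missing or empty.
function requireEnv(env: Env, name: string): string {
  const value = env[name];
  if (!value) {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Build-time code: a placeholder is acceptable because no real request is served.
function envOrPlaceholder(env: Env, name: string, placeholder: string): string {
  return env[name] || placeholder;
}
```

The important part is that the placeholder helper is never reachable from request-handling code, so a missing value fails at the first real request instead of silently pointing at a fake URL.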

2

Return-Based Auth Beats Throw-Based

requireSession() returns SessionContext | Response instead of throwing. This eliminated try/catch boilerplate in every API route, made auth errors explicit in the type system, and prevented stack traces from leaking to users.

3

Reviewer Parse Failures Must Never Block Delivery

The Reviewer agent validates proposals with Claude Haiku. If Haiku returns malformed JSON (it happens), the proposal should still be delivered. We implemented a soft pass pattern — parse failure = pass with needs_human_review flag set.
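
A sketch of the soft pass pattern, written in TypeScript for illustration even though the brain service itself is Python. The `ReviewResult` shape and the expected JSON fields (`passed`, `gaps`) are assumptions; the feedback loop back to the Writer is left out.

```typescript
// Assumed result shape; the real reviewer output format is not specified here.
interface ReviewResult {
  passed: boolean;
  gaps: string[];
  needsHumanReview: boolean;
}

function parseReview(raw: string): ReviewResult {
  try {
    const parsed = JSON.parse(raw) as { passed?: unknown; gaps?: unknown } | null;
    if (parsed !== null && typeof parsed === "object" && typeof parsed.passed === "boolean") {
      // Keep only string entries so a partially malformed gaps list is still usable.
      const gaps = Array.isArray(parsed.gaps)
        ? parsed.gaps.filter((g): g is string => typeof g === "string")
        : [];
      return { passed: parsed.passed, gaps, needsHumanReview: !parsed.passed };
    }
  } catch {
    // Malformed JSON falls through to the soft pass below.
  }
  // Soft pass: unparseable reviewer output never blocks delivery,
  // but the proposal is flagged for a human to look at.
  return { passed: true, gaps: [], needsHumanReview: true };
}
```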

4

Memory Writes Must Be Non-Fatal

The Memory Node (step 7 in the brain pipeline) runs AFTER the proposal is committed. If the OpenAI embedding API is down, the user still gets their proposal. Intelligence accumulation is valuable but never worth blocking delivery.

5

Promise.allSettled Over Promise.all

When fetching usage counts across 6 tables, one failed query shouldn't crash the entire response. Promise.allSettled() gives you partial results. We only use Promise.all() when ALL results are truly required to proceed.
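
A minimal sketch of the partial-results approach. `fetchUsageCounts` and `CountQuery` are hypothetical names, and the per-table queries are injected as functions so the example does not depend on a database client.

```typescript
// Hypothetical per-table query that resolves to a row count.
type CountQuery = () => Promise<number>;

// Returns a count per table, or null where that table's query failed.
async function fetchUsageCounts(
  queries: Record<string, CountQuery>,
): Promise<Record<string, number | null>> {
  const names = Object.keys(queries);
  // allSettled waits for every query and never rejects as a whole.
  const settled = await Promise.allSettled(names.map((name) => queries[name]()));
  const counts: Record<string, number | null> = {};
  settled.forEach((result, i) => {
    counts[names[i]] = result.status === "fulfilled" ? result.value : null;
  });
  return counts;
}
```

With `Promise.all()`, the single failing query would reject the whole call and the dashboard would show nothing; here the caller can render the counts it has and mark the rest as unavailable.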

6

One Change Per Branch, Always

Batching unrelated changes makes root cause analysis impossible when something breaks. We enforce: one logical change per branch, one branch per PR. It feels slower but catches issues faster.

7

Vercel and Fly.io Are Separate Worlds

Secrets set in Vercel don't propagate to Fly.io. The worker and brain have their own env vars managed via fly secrets set. We've been burned by updating one and forgetting the other. Now it's in the deployment checklist.

8

Solo Founder Architecture: Optimize for Debuggability

As a solo builder, the ability to debug at 2 AM matters more than theoretical scalability. PostgreSQL polling is debuggable with a SQL query. Message brokers add infrastructure I'd have to monitor alone. Every architectural choice was filtered through: "Can I debug this by myself?"

Section 17

Frequently Asked Questions

How long does AI proposal generation take?

The TypeScript worker generates a proposal in 15-30 seconds using Claude Sonnet. The full brain pipeline (7 agents) takes 60-120 seconds depending on research complexity. Users see real-time progress updates via Supabase Realtime.

Why Python for the brain when everything else is TypeScript?

The brain orchestrates 7 agents with complex state management, async execution, and structured JSON output. Python's Anthropic SDK, asyncio, and the ML ecosystem are significantly more mature for this. TypeScript handles web, API, and simple jobs. They communicate via Supabase as shared state — no direct HTTP calls.

How does the intelligence get smarter over time?

Every proposal triggers a memory write: client psychographics (tone, themes, strategy) and a deal history entry are stored with a 1536-dimensional embedding. Future proposals for the same or similar clients retrieve this context via pgvector cosine similarity search. The more proposals you create, the more personalized they become.

What happens if the brain service goes down?

The TypeScript worker continues processing simple jobs (contracts, text extraction). Brain proposals stay in "queued" status until the brain comes back. No data is lost — Supabase is the system of record. Users can still use the standard AI generation path via the worker.

How is this different from PandaDoc or Proposify?

Those are template-filling tools. Boots generates proposals from scratch using AI that knows your business (org brain), remembers your clients (intelligence), and learns from outcomes (deal tracking). It's an AI-native proposal generator, not a document template editor.

Can this be self-hosted?

The architecture supports it. Next.js runs anywhere Node.js runs. The worker and brain are Docker containers. Supabase can be self-hosted. You'd need to manage your own Stripe, Anthropic, and OpenAI API keys. Not currently offered as a product, but architecturally feasible.

How are secrets managed?

17 environment variables across three deployment targets. Vercel manages web app secrets. Fly.io manages worker and brain secrets separately via fly secrets set. No secrets in source code, no .env files committed. Sensitive vars in Vercel cannot be viewed after setting.