Boots On The Ground AI
A comprehensive technical and business reference for the AI-powered proposal generation platform built for small businesses.
The Problem We Solve
Small businesses lose deals because proposals take too long, sound generic, and carry no memory of what worked before.
4–8 Hours Per Proposal
A solo consultant or small team spends half a workday or more writing each proposal manually. That's time not spent closing deals or delivering work.
No Memory, No Learning
Every proposal starts from scratch. What worked for a similar client last quarter? What tone landed the deal? Nobody remembers, nobody tracks it.
Enterprise Tools Don't Fit
PandaDoc, Proposify, and Qwilr cost $500+/mo, require team onboarding, and weren't built for AI-native workflows. Overkill for a 1-10 person shop.
Business Requirements
What the system must do, why it must do it, and the constraints that shaped every decision.
Multi-Tenant SaaS
Every organization's data is completely isolated. One codebase, one database, any number of tenants. Row-Level Security is enforced at the database layer, not in application code.
AI That Matches Your Voice
Generated proposals must sound like the business, not like ChatGPT. The system ingests org knowledge (voice, services, pricing philosophy) and uses it in every generation.
Intelligence That Compounds
Each proposal teaches the system. Client psychographics, deal outcomes, and strategy patterns are stored and retrieved via vector search for future proposals.
Production-Grade Security
Auth with JWT cookies, RLS on every table, rate limiting on every endpoint, audit logging on every action. No shortcuts, no "we'll add that later."
Tiered Billing
Free tier for trial, paid tiers with real limits. Stripe-powered checkout, webhook-driven plan updates, usage tracking per org.
White-Label Ready
Each org can customize branding — colors, logo, proposal templates. The platform itself is white-labelable from a single config file.
What We Built
A complete feature inventory of the Boots Proposal System, organized by domain.
Document Generation
AI Proposal Generation
Status: Live. Multi-method creation: AI chat, notes-to-proposal, or brain pipeline. Full proposal with cover, summary, features, architecture, timeline, and pricing.
Contract Generation
Status: Live. Generate service agreements from scratch or convert approved proposals. Legal section templates with customizable terms.
Report / Assessment
Status: Live. Technical assessment reports with protocol parsing. Professional formatting with executive summary.
Rich Editors
Status: Live. Full-featured editors for proposals, contracts, and reports with tabbed sections, live preview, image upload, and auto-save.
Template Builder
Status: Live. Create and manage reusable templates. System templates across industries. Save any proposal as a template.
Public Share Links
Status: Live. Generate shareable URLs for proposals, contracts, and reports. Read-only, no auth required, with view tracking.
Intelligence Layer
Client Intelligence
Status: Live. Psychographic profiles per client: preferred tone, key themes, risks, pricing strategy. Persisted across proposals with deal history.
Vector Search (pgvector)
Status: Live. OpenAI embeddings (1536-dim) stored in PostgreSQL. Cosine similarity search finds similar past clients to inform new proposals.
7-Agent Brain Pipeline
Status: Live. Context → Research → Strategist → Writer → Reviewer → Commit → Memory. Feedback loops, resume-from-step, and cached outputs.
Org Brain
Status: Live. Knowledge base per org: voice guidance, service descriptions, pricing context, company profile. Feeds every AI generation.
Prospect Research
Status: Live. Auto-scrapes prospect websites, extracts industry/services/tone profile, caches results. Never blocks the pipeline on failure.
Deal Outcome Tracking
Status: Foundation. Track won/lost/pending on proposals. Atomic update propagated to client intelligence deal history for future strategy.
Platform & Operations
Multi-Tenant Auth
Status: Live. Supabase Auth with JWT cookies, Google OAuth, email/password, OTP. Role hierarchy: owner > admin > member > viewer.
Stripe Billing
Status: Live. 4 tiers (free/starter/pro/enterprise). Checkout sessions, customer portal, webhook-driven plan updates, usage enforcement.
Audit Logging
Status: Live. Every action logged: user, org, action type, resource, IP, user agent. Fire-and-forget: never crashes the main operation.
Rate Limiting
Status: Live. Upstash Redis distributed rate limiting. Per-endpoint configs: AI generation (10/min), auth (10/min), standard (60/min).
Error Tracking
Status: Live. Sentry on client, server, and edge. Source maps, full traces, console log forwarding. User-friendly error messages only.
Magic Onboarding
Status: Live. AI-powered onboarding: scrapes your website, extracts company profile, generates org brain entries. Ready to generate in minutes.
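The per-endpoint limits in the Rate Limiting entry above (10/min for AI generation and auth, 60/min for standard routes) can be pictured as a small config table plus a counter per caller. The sketch below is an in-memory fixed-window limiter, for illustration only. The production limiter is Upstash Redis and its algorithm is not described here, so the class name, the key layout, and the fixed-window choice are all assumptions.

```python
import time
from collections import defaultdict

# Requests allowed per 60-second window, taken from the Rate Limiting entry.
LIMITS_PER_MINUTE = {"ai_generation": 10, "auth": 10, "standard": 60}


class FixedWindowLimiter:
    """In-memory stand-in for a distributed limiter (hypothetical, not the real one)."""

    def __init__(self, limits, window_seconds=60, clock=time.monotonic):
        self.limits = limits
        self.window_seconds = window_seconds
        self.clock = clock
        # (endpoint, caller, window index) -> hits so far.
        # Old windows are never evicted here; a real store would expire them.
        self.counts = defaultdict(int)

    def allow(self, endpoint, caller):
        """Return True if this request still fits in the caller's current window."""
        window = int(self.clock() // self.window_seconds)
        key = (endpoint, caller, window)
        if self.counts[key] >= self.limits[endpoint]:
            return False
        self.counts[key] += 1
        return True
```

With the limits above, the eleventh AI generation request from one caller inside the same minute is rejected, while a different caller is unaffected.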
Architecture Overview
Three deployment targets share one database. The worker and the brain poll it independently, so an outage in one job processor does not stop the other.
Request Flow: Proposal Generation
[Flow diagram: "Generate" → Auth Gate → Rate Limit → … to Supabase → Claims Job → Pipeline → … to Supabase → … via Realtime]
Tech Stack
Every technology was chosen for a specific reason. No resume-driven development.
Next.js 16
App Router, React Server Components, streaming, edge middleware, Turbopack. The full-stack React framework.
Supabase
PostgreSQL with RLS, pgvector, Realtime, Auth, Storage. One platform replaces 5 services.
Python FastAPI
Multi-agent orchestration, async pipeline, ML ecosystem access. Where TypeScript doesn't fit.
Claude (Anthropic)
Opus for strategy, Sonnet for writing, Haiku for validation. Multi-model by design.
pgvector
Embeddings in PostgreSQL. No separate vector DB. HNSW index for fast cosine similarity.
Stripe
Checkout sessions, subscription management, webhook-driven plan updates. Test mode ready.
Fly.io
Docker containers with no timeout limits. Background jobs that run for 5+ minutes. ORD region.
GitHub Actions
Automated quality gates: lint, typecheck, test, build. Conditional deploys to Vercel + Fly.io.
Design Decisions
The "why" behind every architectural choice. These aren't arbitrary — each one solved a real problem.
TypeScript for the Web App, Python for the Brain
The brain service orchestrates 7 agents with complex state management, async execution, and structured JSON output. Python's Anthropic SDK, asyncio, and the ML ecosystem (embeddings, NLP) are significantly more mature for this use case. The web app, API routes, and worker stay TypeScript for type-safe React and shared types with the database. They communicate via Supabase as shared state — no direct HTTP calls between services.
Supabase over Firebase / Prisma / Raw PostgreSQL
Supabase gives us PostgreSQL with RLS for multi-tenancy, pgvector for embeddings, Realtime for streaming agent progress, Auth for user management, and Storage for file uploads — all in one platform. Firebase lacks RLS and pgvector. Prisma adds an ORM layer we don't need when Supabase's client SDK is type-safe from generated types.
pgvector in Supabase over Pinecone / Weaviate
Our vector data (client intelligence embeddings) is tightly coupled to relational data (org_id, client_name, deal_history). A separate vector DB means syncing two databases. pgvector keeps vectors in the same PostgreSQL instance, queried with the same RLS policies, in the same transaction. At our scale (<100K vectors per org), HNSW index performance is more than sufficient.
Separate Brain Service over In-Worker Agents
The 7-agent pipeline needs 60-300 seconds and complex state management (resume-from-step, cached outputs, feedback loops). Running this in the TypeScript worker would require reimplementing Python's asyncio patterns and the Anthropic Python SDK's superior structured output support. The brain polls Supabase independently — if it goes down, the worker still processes simple jobs.
Poll-Based Job Queue over Message Broker
Both the worker and brain poll Supabase using FOR UPDATE SKIP LOCKED — an atomic claim pattern that prevents duplicate processing and supports horizontal scaling. No message broker to manage, no dead letter queues to monitor, no additional infrastructure. The jobs table IS the queue. Debuggable with a SQL query.
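To make the claim pattern concrete, here is a rough sketch. CLAIM_SQL shows the kind of statement a claim RPC might run; the table name, column names, and status values are assumptions, not the project's actual schema. Because that query needs a live PostgreSQL instance, the runnable part uses an in-memory table with a lock as a stand-in. The lock blocks rather than skips, but it gives the same guarantee: each job goes to exactly one worker.

```python
import threading

# Hypothetical claim query; table, columns, and statuses are assumptions.
CLAIM_SQL = """
UPDATE jobs
   SET status = 'processing'
 WHERE id = (
       SELECT id
         FROM jobs
        WHERE status = 'pending'
        ORDER BY created_at
        LIMIT 1
          FOR UPDATE SKIP LOCKED
       )
RETURNING id;
"""


class InMemoryJobTable:
    """Stand-in for the jobs table so the claim semantics can run locally."""

    def __init__(self, job_ids):
        self._pending = list(job_ids)
        self._lock = threading.Lock()  # plays the role of the row lock

    def claim_next(self):
        """Atomically take the oldest pending job, or None if nothing is left."""
        with self._lock:
            return self._pending.pop(0) if self._pending else None


def drain(table, claimed):
    """Worker loop: keep claiming until the table is empty."""
    while (job_id := table.claim_next()) is not None:
        claimed.append(job_id)


def run_workers(job_ids, worker_count=4):
    """Start several pollers against one table and collect what they claimed."""
    table = InMemoryJobTable(job_ids)
    claimed = []
    threads = [
        threading.Thread(target=drain, args=(table, claimed))
        for _ in range(worker_count)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return claimed
```

Running four pollers over the same table yields every job exactly once, which is the property that lets more worker or brain instances be added without coordination.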
Return-Based Auth Errors over Throw-Based
requireSession() returns SessionContext | Response. API routes check if (ctx instanceof Response) return ctx. This eliminates try/catch boilerplate, makes the error path explicit in the type system, and prevents unhandled auth errors from leaking stack traces.
Database Architecture
25+ tables, Row-Level Security on every one, pgvector for embeddings. Color-coded by domain.
Every RLS policy filters by org_id. When a user queries proposals, they see only their organization's proposals. This is enforced at the database layer, not in application code, so even a bug in an API route can't leak data across tenants.
Brain Service — Multi-Agent Orchestration
A 7-agent Python pipeline that researches prospects, builds strategy, writes proposals, self-reviews, and remembers what it learned.
Context Node
Loads org brain, client intelligence (psychographics + deal history), runs pgvector similarity search for similar past clients. Detects cold start vs returning client.
Research Node
Scrapes prospect website via Jina Reader. Extracts industry, services, tone, size signals, differentiator. Caches in prospect_intelligence. Never blocks pipeline on failure.
Strategist Node
Claude Sonnet analyzes all context: client history + research + org knowledge. Outputs: positioning, tone, key_themes, pricing_strategy, risks_to_address. The strategic brain.
Writer Node
Claude Sonnet generates full ProposalData JSON: cover, summary, problem, features, architecture, tech stack, timeline, pricing, about, next steps. Matches org voice from brain.
Reviewer Node
Claude Haiku validates completeness against proposal heuristics. Flags embarrassing gaps. Soft pass on parse failure — never blocks delivery. Loops back to Writer if gaps found.
Commit Node
Writes final proposal JSON to Supabase. Sets brain_status = 'content_generated'. Marks needs_human_review if reviewer didn't pass.
Memory Node
Non-fatal write-back. Upserts client_intelligence with psychographics and deal entry. Generates OpenAI embedding for future vector search. Pipeline succeeds even if this fails.
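The back half of the pipeline (Writer → Reviewer → Commit → Memory) can be sketched as a short control loop. This is an illustration, not the brain service's code: the four callables are hypothetical stand-ins for the agents, the verdict keys are invented for the example, and the revision cap is an assumption because the loop limit is not stated here.

```python
MAX_REVISIONS = 2  # assumption: the actual loop limit is not documented here


def run_pipeline(write, review, commit, remember):
    """Writer/reviewer loop, then commit, then a memory step that may fail.

    All four arguments are hypothetical callables standing in for the agents.
    """
    draft = write(feedback=None)
    passed = False
    for attempt in range(MAX_REVISIONS + 1):
        verdict = review(draft)
        if verdict["passed"]:
            passed = True
            break
        if attempt < MAX_REVISIONS:
            # Loop back to the writer with the gaps the reviewer flagged.
            draft = write(feedback=verdict["gaps"])

    # Commit always happens; an unresolved review only sets the flag.
    commit(draft, needs_human_review=not passed)

    # Memory runs after the commit, so a failure here cannot block delivery.
    try:
        remember(draft)
        memory_ok = True
    except Exception:
        memory_ok = False

    return {
        "draft": draft,
        "needs_human_review": not passed,
        "memory_ok": memory_ok,
    }
```

The ordering is the point: the proposal is committed before the memory write-back is attempted, so a failing embedding call shows up as memory_ok = False rather than a lost proposal.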
Production Safety Patterns
300s Wall-Clock Timeout
The entire pipeline is wrapped in asyncio.wait_for() with a 300-second hard limit. If any agent hangs past that limit, the pipeline task is cancelled and a timeout error is raised, so the job fails cleanly instead of hanging.
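A minimal sketch of the wall-clock guard, assuming the pipeline is a single coroutine. The function names and the returned status dict are hypothetical; only asyncio.wait_for() and the 300-second limit come from the description above.

```python
import asyncio

PIPELINE_TIMEOUT_SECONDS = 300  # hard limit stated in the text


async def run_with_deadline(pipeline, timeout=PIPELINE_TIMEOUT_SECONDS):
    """Await `pipeline` (a coroutine) under a hard wall-clock limit."""
    try:
        # wait_for cancels the inner task when the timeout expires.
        result = await asyncio.wait_for(pipeline, timeout=timeout)
    except asyncio.TimeoutError:
        return {"status": "timed_out", "result": None}
    return {"status": "ok", "result": result}


async def quick_pipeline():
    """Stand-in for a pipeline that finishes in time."""
    await asyncio.sleep(0)
    return "proposal"


async def hung_pipeline():
    """Stand-in for an agent call that never returns."""
    await asyncio.sleep(3600)
    return "never reached"
```

Calling asyncio.run(run_with_deadline(hung_pipeline(), timeout=0.05)) returns the timed_out status after roughly 50 ms instead of waiting an hour.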
Resume-From-Step
Writer output is cached in cached_writer_output. If the reviewer or commit fails, the next attempt skips directly to the cached checkpoint.
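The checkpoint idea fits in a few lines. In this sketch a dict stands in for the proposal row, and write and review_and_commit are hypothetical callables; only the cached_writer_output field name comes from the description above.

```python
def generate_with_resume(record, write, review_and_commit):
    """Run the writer only if no cached output exists, then review and commit.

    `record` is a dict standing in for the proposal row (an assumption).
    """
    if record.get("cached_writer_output") is None:
        # Expensive step: persist its output before anything else can fail.
        record["cached_writer_output"] = write()
    draft = record["cached_writer_output"]
    review_and_commit(draft)  # may raise; the checkpoint survives for a retry
    return draft
```

If review_and_commit raises on the first attempt, the second attempt finds the cached draft and skips the writer call entirely.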
Non-Fatal Memory
The Memory Node runs after the proposal is committed. If embedding generation or intelligence upsert fails, the proposal is still delivered.
Atomic Job Claiming
FOR UPDATE SKIP LOCKED prevents duplicate processing. Multiple brain instances can poll simultaneously without conflict.
Client Intelligence — Memory That Compounds
Every proposal teaches the system. Client psychographics, deal outcomes, and strategy patterns are stored and retrieved via vector search.
Data Model
Each client intelligence record holds two JSONB fields: Psychographics (JSONB) and Deal History (JSONB Array).
At generation time, the Context Node calls match_client_intelligence() with the requirements text embedded via OpenAI's text-embedding-3-small (1536 dimensions). PostgreSQL's pgvector extension finds the most similar past clients using cosine similarity, returning their psychographics and deal history. The strategist uses this to personalize the approach.
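For intuition, this is what the similarity ranking computes, written out in plain Python. It is a toy model, not the database function: in production the work is done by pgvector inside PostgreSQL, real embeddings have 1536 dimensions, and the record layout and function names below are assumptions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar_clients(query_embedding, clients, org_id, limit=3):
    """Rank one org's clients by similarity to the query embedding."""
    scored = [
        (cosine_similarity(query_embedding, client["embedding"]), client["client_name"])
        for client in clients
        if client["org_id"] == org_id  # mirrors the tenant filter that RLS applies
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [name for _, name in scored[:limit]]
```

The org_id filter in the list comprehension is the in-memory equivalent of running the search under the same RLS policies as every other query.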
Day 1
Cold start. No history. Generic but on-brand proposal using org brain knowledge.
Day 30
5 proposals generated. Vector search finds similar clients. Strategy starts personalizing.
Day 90
Returning client with full context. Knows their tone, themes, what won before. Near-automatic.
Job Architecture
Two job processors, one shared queue in PostgreSQL, atomic claiming, and production-grade reliability.
TypeScript Worker (Fly.io)
Job types: generate_proposal, generate_contract, parse_readme, extract_text
Polling: 100ms interval via claim_next_job() RPC
Timeout: 60s hard limit per job
Retry: Transient errors (rate_limit, timeout, ai_overloaded, network) auto-retry up to max_retries
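The retry rule above boils down to one decision per failed job. The sketch below shows that decision in Python for consistency with the other examples, even though the worker itself is TypeScript. The error names and max_retries come from the list above; the function name and the "pending"/"failed" status strings are assumptions.

```python
# Error kinds the text lists as transient.
TRANSIENT_ERRORS = {"rate_limit", "timeout", "ai_overloaded", "network"}


def next_status(error_kind, retry_count, max_retries):
    """Decide what happens to a job after a failed attempt.

    Returns "pending" to requeue the job or "failed" to stop retrying.
    Both status strings are hypothetical.
    """
    if error_kind in TRANSIENT_ERRORS and retry_count < max_retries:
        return "pending"
    return "failed"
```

A rate-limited job with retries left goes back to the queue; a validation error, or any error once retries are exhausted, is marked failed.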
Python Brain (Fly.io)
Job types: Brain proposals (7-agent pipeline)
Polling: claim_next_brain_proposal() RPC
Timeout: 300s wall-clock limit
Progress: Real-time updates via agent_runs table + Supabase Realtime
Auth & Multi-Tenancy
Supabase Auth with JWT cookies, RLS-enforced tenant isolation, and a four-tier role hierarchy.
[Auth flow diagram: JWT + Cookie → getUser() → Org Lookup → userId, orgId, role → … by org_id]
Three Supabase Client Types
| Client | Used By | RLS |
|---|---|---|
| createBrowserClient() | React components | Enforced |
| createSupabaseServer() | API routes, server components | Enforced |
| supabaseAdmin() | Webhooks, migrations only | Bypassed |
Role Hierarchy
| Role | Can Do |
|---|---|
| Owner | Everything an Admin can do + delete org + manage billing |
| Admin | Everything a Member can do + manage team members |
| Member | Create, edit, publish documents |
| Viewer | Read-only access to all documents |
… requireSession() and optionally requireRole(). (3) PostgreSQL RLS policies filter every query by org_id. Even if layers 1 and 2 fail, RLS prevents data leaks.
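The role hierarchy (owner > admin > member > viewer) reduces to a rank comparison. The sketch below shows only that comparison. It is not the actual requireRole() function, which lives in the TypeScript app, and the names used here are assumptions.

```python
# Higher number = more privilege, following owner > admin > member > viewer.
ROLE_RANK = {"viewer": 0, "member": 1, "admin": 2, "owner": 3}


def has_role(actual_role, required_role):
    """True if `actual_role` is at least as privileged as `required_role`."""
    return ROLE_RANK[actual_role] >= ROLE_RANK[required_role]
```

An admin passes a check that requires member, while a member fails a check that requires admin.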
Billing Integration
Stripe-powered with four tiers, real usage enforcement, and webhook-driven plan management.
| Feature | Free | Starter | Professional | Enterprise |
|---|---|---|---|---|
| AI Proposals / Month | 3 | 25 | 100 | Unlimited |
| Total Proposals | 5 | 50 | 500 | Unlimited |
| Total Contracts | 3 | 25 | 250 | Unlimited |
| Total Reports | 2 | 15 | 100 | Unlimited |
| Clients | 5 | 50 | 500 | Unlimited |
| Team Members | 1 | 3 | 10 | Unlimited |
| Custom Branding | — | ✓ | ✓ | ✓ |
| Remove Watermark | — | — | ✓ | ✓ |
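Usage enforcement against the table above amounts to a lookup and a comparison. In this sketch the numbers are copied from the table, while the dictionary keys, the function name, and the use of None for "Unlimited" are assumptions made for the example.

```python
UNLIMITED = None  # assumption: how "Unlimited" is represented

# Limits copied from the tier table; key names are hypothetical.
PLAN_LIMITS = {
    "free": {
        "ai_proposals_per_month": 3, "proposals": 5, "contracts": 3,
        "reports": 2, "clients": 5, "team_members": 1,
    },
    "starter": {
        "ai_proposals_per_month": 25, "proposals": 50, "contracts": 25,
        "reports": 15, "clients": 50, "team_members": 3,
    },
    "professional": {
        "ai_proposals_per_month": 100, "proposals": 500, "contracts": 250,
        "reports": 100, "clients": 500, "team_members": 10,
    },
    "enterprise": {
        "ai_proposals_per_month": UNLIMITED, "proposals": UNLIMITED,
        "contracts": UNLIMITED, "reports": UNLIMITED,
        "clients": UNLIMITED, "team_members": UNLIMITED,
    },
}


def can_create(plan, resource, current_count):
    """True if the org may create one more `resource` under `plan`."""
    limit = PLAN_LIMITS[plan][resource]
    return limit is UNLIMITED or current_count < limit
```

A free-tier org with four proposals may create a fifth but not a sixth; an enterprise org is never blocked by this check.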
CI/CD Pipeline
Automated quality gates on every PR. No code reaches production without passing all four checks.
[Pipeline diagram: … to Main → Auto-Deploy]
Quality Gate
Every PR runs: npm run lint, npm run typecheck, npm test, npm run build. All four must pass. No build bypasses allowed.
Type Sync
GitHub Actions auto-generates Supabase TypeScript types on merge. database.types.ts stays in sync with the actual schema.
Smoke Tests
After Vercel deploys, Playwright runs against the live preview URL. Health check + critical path validation.
Enterprise Readiness
What exists today for enterprise deployments, and what's on the roadmap.
What We Have
Multi-Tenant Data Isolation
RLS on every table. Org data never crosses boundaries.
Role-Based Access Control
4-tier hierarchy with per-route enforcement.
Audit Logging
Every action captured: who, what, when, from where.
Rate Limiting
Distributed, per-endpoint, IP-based via Upstash Redis.
Kill Switches
3-level circuit breakers: global, per-org, per-job.
Error Sanitization
No raw API errors reach users. Sentry captures everything.
What's Missing
SSO / SAML
Google OAuth is the only third-party sign-in today, alongside email/password and OTP. Enterprise SSO requires Supabase Pro.
SOC 2 Compliance
Architecture supports it, but formal audit not yet conducted.
API Access
No public API for integrations yet. All access via web UI.
SLA Monitoring
Sentry for errors, but no uptime SLA or status page.
Multi-Region
Single region (us-east). Multi-region requires Supabase + Fly.io config.
Advanced Analytics
Basic AI usage dashboard. No win-rate analysis or pipeline metrics yet.
Architecture Scorecard
Honest grades against industry best practices. No vanity metrics.
AI Integration
Multi-model strategy, prompt caching, embeddings, persistent memory, multi-agent orchestration with feedback loops. Cutting edge.
Developer Experience
TypeScript strict, auto-generated types, CI/CD quality gates, agent-governed workflow, comprehensive project documentation.
Security
RLS, rate limiting, audit logging, input validation, error sanitization. Missing: SAML/SSO, formal penetration testing.
Data Architecture
pgvector in PostgreSQL, JSONB for flexibility, RLS for isolation. Missing: data retention policies, backup verification.
Scalability
Stateless API, async jobs, and atomic claiming support horizontal scaling. Missing: multi-region deployment, auto-scaling on Fly.io.
Reliability
Error tracking, retry logic, timeout protection, kill switches. No chaos testing, no dead letter queue.
Observability
Sentry errors, audit logs, console forwarding. Missing: structured metrics, APM dashboards, distributed tracing.
Testing
Vitest + Playwright framework established. Coverage exists but has gaps. E2E tests for critical paths.
Lessons Learned
Hard-won knowledge from building this system. Each one cost real time and debugging sessions.
Build-Time vs Runtime Are Different Worlds
Next.js prerenders pages at build time where env vars may not exist. Runtime handles real requests where they MUST exist. We learned this the hard way when a "helpful" placeholder URL masked a broken Supabase connection in production. Rule: runtime code must throw on missing config. Build-time code can use placeholders.
Return-Based Auth Beats Throw-Based
requireSession() returns SessionContext | Response instead of throwing. This eliminated try/catch boilerplate in every API route, made auth errors explicit in the type system, and prevented stack traces from leaking to users.
Reviewer Parse Failures Must Never Block Delivery
The Reviewer agent validates proposals with Claude Haiku. If Haiku returns malformed JSON (it happens), the proposal should still be delivered. We implemented a soft pass pattern — parse failure = pass with needs_human_review flag set.
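A sketch of the soft pass, assuming the reviewer is asked to return a JSON object. The "passed" and "gaps" keys and the function name are hypothetical; the behavior on parse failure (pass, with needs_human_review set) follows the rule above.

```python
import json


def parse_review(raw_text):
    """Parse the reviewer's JSON verdict; unparseable output becomes a soft pass."""
    try:
        verdict = json.loads(raw_text)
        passed = bool(verdict["passed"])
        gaps = list(verdict.get("gaps", []))
    except (json.JSONDecodeError, KeyError, TypeError, AttributeError):
        # Malformed or unexpected output: deliver anyway, but flag for a human.
        return {"passed": True, "needs_human_review": True, "gaps": []}
    return {"passed": passed, "needs_human_review": not passed, "gaps": gaps}
```

A well-formed failing verdict still loops back with its gaps; only output that cannot be read at all takes the soft-pass branch.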
Memory Writes Must Be Non-Fatal
The Memory Node (step 7 in the brain pipeline) runs AFTER the proposal is committed. If the OpenAI embedding API is down, the user still gets their proposal. Intelligence accumulation is valuable but never worth blocking delivery.
Promise.allSettled Over Promise.all
When fetching usage counts across 6 tables, one failed query shouldn't crash the entire response. Promise.allSettled() gives you partial results. We only use Promise.all() when ALL results are truly required to proceed.
One Change Per Branch, Always
Batching unrelated changes makes root cause analysis impossible when something breaks. We enforce: one logical change per branch, one branch per PR. It feels slower but catches issues faster.
Vercel and Fly.io Are Separate Worlds
Secrets set in Vercel don't propagate to Fly.io. The worker and brain have their own env vars managed via fly secrets set. We've been burned by updating one and forgetting the other. Now it's in the deployment checklist.
Solo Founder Architecture: Optimize for Debuggability
As a solo builder, the ability to debug at 2 AM matters more than theoretical scalability. PostgreSQL polling is debuggable with a SQL query. Message brokers add infrastructure I'd have to monitor alone. Every architectural choice was filtered through: "Can I debug this by myself?"
Frequently Asked Questions
Worker and brain secrets are set on Fly.io via fly secrets set. No secrets in source code, no .env files committed. Sensitive vars in Vercel cannot be viewed after setting.
Roadmap
Where we're going next. Foundation first, then features.
Stabilize Foundation
Dev environment hardening, dependency pinning, Doppler credentials, Playwright testing, CI/CD pipeline validation. No new features until this passes.
Contract Dashboard + QuickBooks
Full contract management dashboard, e-signature flow, QuickBooks integration for invoice sync.
Intelligence UI
Surface the intelligence backend in the frontend. Client psychographic profiles visible in UI, deal outcome tracking, win-rate analytics, org brain management.
Enterprise Features
SSO/SAML, API access for integrations, advanced analytics, custom webhook notifications, multi-region deployment options.