Service Architecture

This document defines the internal services layer that sits between Terra and Airtable. The goal is to add document extraction, fraud signals, program intelligence, and multilingual messaging without replacing Airtable as the long-term review UI or source of truth for case management.

Core Constraints

  • Airtable remains the review source of truth for the foreseeable future.
  • All Airtable writes are intentional and happen only after data has landed in Supabase.
  • A pre-Airtable visualizer is required to control what gets pushed.
  • Terra stays the intake system; services enrich submissions after intake.

Service Catalog

1) Document Extraction Service

Purpose: Extract structured fields from uploaded documents and attach results to submissions.
Inputs
  • Uploaded documents (Supabase Storage)
  • Submission metadata (program, applicant, document type)
Outputs
  • documents.extracted_data + confidence scores
  • Optional field mappings for Terra form prefills
  • Signals for fraud analysis
Notes
  • Extraction runs asynchronously after submission.
  • Terra remains the form system; extraction only enriches or pre-fills.
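As a minimal sketch of how extraction output could feed Terra prefills while staying advisory, the TypeScript below keeps only fields whose confidence clears a threshold and routes the rest to review. The `ExtractedField` shape, the `selectPrefills` helper, and the 0.9 threshold are illustrative assumptions; only `documents.extracted_data` and confidence scores come from this document.

```typescript
// Hypothetical shape for one entry in documents.extracted_data.
interface ExtractedField {
  name: string;       // e.g. "employer_name"
  value: string;
  confidence: number; // 0..1, as reported by the extractor
}

interface PrefillResult {
  prefills: { [field: string]: string }; // candidates for Terra form prefill
  needsReview: string[];                 // low-confidence fields for staging
}

// Split extracted fields into prefill suggestions and review items.
// The default threshold is an assumption, to be tuned per document type.
function selectPrefills(
  fields: ExtractedField[],
  threshold: number = 0.9
): PrefillResult {
  const result: PrefillResult = { prefills: {}, needsReview: [] };
  for (const field of fields) {
    if (field.confidence >= threshold) {
      result.prefills[field.name] = field.value;
    } else {
      result.needsReview.push(field.name);
    }
  }
  return result;
}

const paystub = selectPrefills([
  { name: "employer_name", value: "Acme Co", confidence: 0.97 },
  { name: "gross_pay", value: "2150.00", confidence: 0.62 },
]);
console.log(paystub);
```

Keeping low-confidence fields in a separate list, rather than dropping them, lets the staging UI show reviewers exactly what the extractor was unsure about.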

2) Review Staging (Pre-Airtable Visualizer)

Purpose: Internal UI to review enriched submissions and control what flows into Airtable.
Inputs
  • Submissions + extraction results + fraud flags (Supabase)
Outputs
  • Approved payloads for Airtable push
  • Rejected/held submissions with audit trail
Notes
  • Airtable sync is manual and intentional.
  • Staging UI is a control point, not a replacement for Airtable.
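One way to model this control point is a small status record with an append-only audit trail, where an Airtable payload can be built only from an approved submission. This is a sketch under assumptions: the status names, the `decide` and `buildAirtablePayload` helpers, and the `{ fields: ... }` payload shape are hypothetical and not an agreed schema.

```typescript
type StagingStatus = "pending" | "approved" | "rejected" | "held";

interface AuditEntry {
  at: string; // ISO timestamp of the decision
  reviewer: string;
  from: StagingStatus;
  to: StagingStatus;
  reason: string;
}

interface StagedSubmission {
  submissionId: string;
  status: StagingStatus;
  fields: { [name: string]: string };
  audit: AuditEntry[];
}

// Record a reviewer decision without mutating the input record.
function decide(
  s: StagedSubmission,
  to: StagingStatus,
  reviewer: string,
  reason: string,
  now: Date = new Date()
): StagedSubmission {
  const entry: AuditEntry = {
    at: now.toISOString(),
    reviewer: reviewer,
    from: s.status,
    to: to,
    reason: reason,
  };
  return {
    submissionId: s.submissionId,
    status: to,
    fields: s.fields,
    audit: s.audit.concat([entry]),
  };
}

// Build a payload only for approved submissions, and only from the fields
// the reviewer selected. Anything else returns null, so nothing is pushed.
function buildAirtablePayload(
  s: StagedSubmission,
  selected: string[]
): { fields: { [name: string]: string } } | null {
  if (s.status !== "approved") return null;
  const fields: { [name: string]: string } = {};
  for (const name of selected) {
    const value = s.fields[name];
    if (value !== undefined) fields[name] = value;
  }
  return { fields: fields };
}
```

Returning `null` for anything not approved keeps the push manual by construction: the sync code has nothing to send until a reviewer acts.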

3) Sentinel (Fraud Analysis)

Purpose: Run rules and scoring on submissions to surface risk flags.
Inputs
  • Supabase submissions + identity + documents
  • Extraction results and derived features
Outputs
  • Fraud signals + risk scores
  • Evidence bundle for review and audit
Notes
  • Human review required; no automated denials.
  • Flags surface in staging and can be pushed to Airtable.
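The rules-and-scoring idea can be sketched as a list of small rule functions that each return evidence when they fire. Everything named here is an assumption for illustration: the feature names, the two example rules, their point values, and the 20% income tolerance. The one property taken from this document is that the result can only route to review and never to a denial.

```typescript
// Hypothetical derived features for one submission.
interface SubmissionFeatures {
  submissionId: string;
  declaredIncome: number;         // from the Terra form
  extractedIncome: number | null; // from document extraction, if any
  sharedIdentityCount: number;    // other submissions using the same ID
}

interface FraudSignal {
  rule: string;
  points: number;   // contribution to the 0-100 risk score
  evidence: string; // human-readable, kept for the evidence bundle
}

interface Rule {
  name: string;
  points: number;
  // Returns evidence text when the rule fires, or null otherwise.
  check: (f: SubmissionFeatures) => string | null;
}

const RULES: Rule[] = [
  {
    name: "income_mismatch",
    points: 40,
    check: (f) => {
      if (f.extractedIncome === null) return null;
      const base = Math.max(Math.abs(f.extractedIncome), 1);
      const gap = Math.abs(f.declaredIncome - f.extractedIncome) / base;
      return gap > 0.2
        ? "declared " + f.declaredIncome + " vs extracted " + f.extractedIncome
        : null;
    },
  },
  {
    name: "shared_identity",
    points: 50,
    check: (f) =>
      f.sharedIdentityCount > 0
        ? f.sharedIdentityCount + " other submission(s) share this ID"
        : null,
  },
];

interface RiskReport {
  submissionId: string;
  riskScore: number;
  signals: FraudSignal[];
  // Deliberately no "deny" outcome: flags only route to human review.
  action: "needs_review" | "no_flags";
}

function evaluate(f: SubmissionFeatures, rules: Rule[] = RULES): RiskReport {
  const signals: FraudSignal[] = [];
  for (const rule of rules) {
    const evidence = rule.check(f);
    if (evidence !== null) {
      signals.push({ rule: rule.name, points: rule.points, evidence: evidence });
    }
  }
  const total = signals.reduce((sum, s) => sum + s.points, 0);
  return {
    submissionId: f.submissionId,
    riskScore: Math.min(total, 100),
    signals: signals,
    action: signals.length > 0 ? "needs_review" : "no_flags",
  };
}
```

Encoding the allowed outcomes in the `action` type means an automated denial would be a compile error, not just a policy violation.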

4) Forge (Program Studio)

Purpose: Let program managers upload reference docs, ask questions, and generate templates.
Inputs
  • Program documents, policies, past templates
  • Public benefit program references
Outputs
  • Program template schema
  • Terra-ready form draft
  • Suggested eligibility logic + question bank
Notes
  • Uses RAG over Supabase embeddings.
  • Generates drafts for human refinement.

5) Messaging Studio (Multilingual Outreach)

Purpose: Generate program-aware messaging across channels and languages.
Inputs
  • Program template + lifecycle events
  • Approved messaging guidelines
Outputs
  • Notifications, letters, banners, social copy
  • Translations via DeepL + human review queue
Notes
  • Sends are optional; can export copy for manual use.
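A rough sketch of the generation step, assuming a simple `{{placeholder}}` template syntax and a review queue with one job per target language. The template format, the `render` and `planTranslations` helpers, the status names, and the language codes are all hypothetical; the actual DeepL call is left out, since its integration details are not specified here.

```typescript
interface MessageTemplate {
  id: string;
  channel: "notification" | "letter" | "banner" | "social";
  body: string; // source-language text with {{placeholders}}
}

interface TranslationJob {
  templateId: string;
  targetLang: string;
  sourceText: string;
  // Every job passes through human review before it can be used.
  status: "awaiting_translation" | "awaiting_human_review" | "approved";
}

// Fill {{placeholders}} from program variables. Unknown placeholders are
// left intact so a reviewer can spot them instead of seeing a silent blank.
function render(body: string, vars: { [key: string]: string }): string {
  return body.replace(/\{\{(\w+)\}\}/g, (match: string, key: string) => {
    const value = vars[key];
    return value !== undefined ? value : match;
  });
}

// Create one queued job per target language from the rendered source text.
function planTranslations(
  template: MessageTemplate,
  vars: { [key: string]: string },
  targetLangs: string[]
): TranslationJob[] {
  const sourceText = render(template.body, vars);
  return targetLangs.map((lang) => {
    const job: TranslationJob = {
      templateId: template.id,
      targetLang: lang,
      sourceText: sourceText,
      status: "awaiting_translation",
    };
    return job;
  });
}

const reminder: MessageTemplate = {
  id: "deadline-reminder",
  channel: "notification",
  body: "Hi {{name}}, your {{program}} application is due on {{deadline}}.",
};
console.log(planTranslations(reminder, { name: "Ana", program: "Rent Relief" }, ["ES", "VI"]));
```

Because jobs carry the rendered source text rather than sending anything, the same queue supports both direct sends and exporting copy for manual use.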

6) Hub (Applicant 360)

Purpose: A read-only, unified applicant view across Terra and Pathfinder.
Inputs
  • Supabase applicant + submission data
  • Airtable records (synced back when needed)
Outputs
  • Consolidated timeline for internal visibility
Notes
  • Hub is a secondary layer and not the review source of truth.
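The consolidated timeline amounts to a read-only merge of events from each source, ordered by time. The sketch below assumes a hypothetical `TimelineEvent` shape with ISO 8601 timestamps; the real Supabase and Airtable record layouts may differ.

```typescript
interface TimelineEvent {
  source: "terra" | "pathfinder" | "airtable";
  applicantId: string;
  at: string; // ISO 8601 timestamp
  summary: string;
}

// Merge events from several sources into one chronological list for a
// single applicant. Inputs are only read, never modified.
function buildTimeline(
  applicantId: string,
  sources: TimelineEvent[][]
): TimelineEvent[] {
  const merged: TimelineEvent[] = [];
  for (const events of sources) {
    for (const event of events) {
      if (event.applicantId === applicantId) merged.push(event);
    }
  }
  return merged.sort((a, b) => Date.parse(a.at) - Date.parse(b.at));
}

const terraEvents: TimelineEvent[] = [
  { source: "terra", applicantId: "a1", at: "2024-03-01T10:00:00Z", summary: "Submitted application" },
];
const airtableEvents: TimelineEvent[] = [
  { source: "airtable", applicantId: "a1", at: "2024-03-05T09:30:00Z", summary: "Review started" },
];
console.log(buildTimeline("a1", [airtableEvents, terraEvents]));
```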

End-to-End Data Flow

Terra (intake) → Supabase (system of record) → Document Extraction and Sentinel (enrichment after intake) → Review Staging (manual approval) → Airtable (review). Hub reads from Supabase and from Airtable records synced back when needed.


Control Points (Why This Matters)

  • Supabase is the system of record for all incoming data.
  • Airtable only receives curated data from staging.
  • Fraud signals never auto-block; they require review.
  • Extraction results are advisory until reviewed and pushed.

RAG + Embeddings Strategy (Supabase)

Given the team’s SQL strength, the initial RAG system should be SQL-first:
  • Store program documents and templates in Supabase.
  • Generate embeddings into a program_docs_embeddings table.
  • Query with pgvector (Supabase) for nearest neighbors.
  • Build a thin TypeScript service for prompt assembly and generation.
This avoids a Python-first stack and keeps iteration speed high.
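A sketch of what the thin TypeScript layer might look like: a nearest-neighbor query string for pgvector plus a prompt-assembly function. The `program_docs_embeddings` table name comes from this document; its columns (`id`, `content`, `embedding`, `program_id`), the helper names, and the character budget are assumptions. The query uses pgvector's `<=>` cosine-distance operator, and running it needs whichever Postgres client the team picks, which is left out here.

```typescript
// Parameterized query: $1 = query embedding as a vector literal,
// $2 = program id, $3 = number of chunks. Column names are assumed.
const NEAREST_CHUNKS_SQL = `
  select id, content, embedding <=> $1::vector as distance
  from program_docs_embeddings
  where program_id = $2
  order by embedding <=> $1::vector
  limit $3
`;

interface RetrievedChunk {
  id: string;
  content: string;
  distance: number; // cosine distance, smaller means closer
}

// Format an embedding as the text form pgvector accepts, e.g. "[0.1,0.2]".
function toVectorLiteral(embedding: number[]): string {
  return "[" + embedding.join(",") + "]";
}

// Assemble a grounded prompt: closest chunks first, stopping once the
// context budget (in characters, an assumed proxy for tokens) is reached.
function assemblePrompt(
  question: string,
  chunks: RetrievedChunk[],
  maxContextChars: number = 4000
): string {
  const ordered = chunks.slice().sort((a, b) => a.distance - b.distance);
  const context: string[] = [];
  let used = 0;
  for (const chunk of ordered) {
    if (used + chunk.content.length > maxContextChars) break;
    context.push("[" + chunk.id + "] " + chunk.content);
    used += chunk.content.length;
  }
  return [
    "Answer using only the program documents below.",
    "If the answer is not in them, say so.",
    "",
    "Documents:",
    context.join("\n"),
    "",
    "Question: " + question,
  ].join("\n");
}

console.log(NEAREST_CHUNKS_SQL.trim());
console.log(
  assemblePrompt("Who is eligible?", [
    { id: "policy-3", content: "Households under 80% AMI qualify.", distance: 0.12 },
  ])
);
```

Keeping retrieval in SQL and limiting TypeScript to prompt assembly matches the SQL-first approach: ranking logic can be changed in a query without redeploying the service.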

Suggested Build Sequence

  1. Review Staging Visualizer (control point + manual Airtable push)
  2. Document Extraction (enrichment + confidence + review surfacing)
  3. Forge Program Studio (RAG + template export to Terra)
  4. Sentinel Fraud (signals framework + review queue)
  5. Messaging Studio (multilingual content generation + approvals)

Open Questions

  • Where should the staging UI live: new app or a Terra admin module?
  • Should Airtable push be per-submission or batch-based?
  • Which document types need extraction first (paystub, ID, bank)?
  • What is the minimum template schema for Forge → Terra export?