Service Architecture

This document defines the internal services layer that sits between Terra and Airtable. The goal is to add document extraction, fraud signals, program intelligence, and multilingual messaging without replacing Airtable as the long-term review UI or source of truth for case management.

Core Constraints

Airtable remains the review source of truth for the foreseeable future.
All Airtable writes are intentional and happen after data lands in Supabase.
A pre-Airtable visualizer is required to control what gets pushed.
Terra stays the intake system; services enrich submissions after intake.

Service Catalog

1) Document Extraction Service

Purpose: Extract structured fields from uploaded documents and attach results to submissions.

Inputs

Uploaded documents (Supabase Storage)
Submission metadata (program, applicant, document type)

Outputs

documents.extracted_data + confidence scores
Optional field mappings for Terra form prefills
Signals for fraud analysis

Notes

Extraction runs asynchronously after submission.
Terra remains the form system; extraction only enriches or pre-fills.
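
As a minimal sketch of how extraction output and confidence scores could drive prefills, the TypeScript below filters extracted fields by a confidence threshold. The `ExtractedField` shape, the `selectPrefills` helper, the field names, and the 0.9 threshold are all assumptions for illustration, not the actual `documents.extracted_data` schema.

```typescript
// Hypothetical shape for one extracted field; not the real schema.
interface ExtractedField {
  value: string;
  confidence: number; // 0..1, as reported by the extractor
}

type ExtractedData = Record<string, ExtractedField>;

// Keep only fields confident enough to offer as Terra prefills.
// The default threshold is an illustrative assumption.
function selectPrefills(
  data: ExtractedData,
  minConfidence = 0.9,
): Record<string, string> {
  const prefills: Record<string, string> = {};
  for (const [name, field] of Object.entries(data)) {
    if (field.confidence >= minConfidence) {
      prefills[name] = field.value;
    }
  }
  return prefills;
}

// Placeholder values standing in for a paystub extraction result.
const sampleExtraction: ExtractedData = {
  employer_name: { value: "Acme Corp", confidence: 0.97 },
  gross_pay: { value: "2150.00", confidence: 0.62 },
};

const prefills = selectPrefills(sampleExtraction);
```

Low-confidence fields are left out of the prefill set but stay in the extraction result, so they can still be surfaced to reviewers.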

2) Review Staging (Pre-Airtable Visualizer)

Purpose: Internal UI to review enriched submissions and control what flows into Airtable.

Inputs

Submissions + extraction results + fraud flags (Supabase)

Outputs

Approved payloads for Airtable push
Rejected/held submissions with audit trail

Notes

Airtable sync is manual and intentional.
Staging UI is a control point, not a replacement for Airtable.
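
One way to picture the staging control point is a review function that always records an audit entry and only produces an Airtable payload on approval. This is a sketch under assumed names (`StagedSubmission`, `reviewSubmission`, the field names, and the reviewer address are hypothetical); it does not call Airtable.

```typescript
type StagingDecision = "approved" | "rejected" | "held";

// Hypothetical record shapes for illustration only.
interface StagedSubmission {
  id: string;
  fields: Record<string, string>;
}

interface AuditEntry {
  submissionId: string;
  decision: StagingDecision;
  reviewer: string;
  reason: string;
  decidedAt: string; // ISO 8601 timestamp
}

interface ReviewResult {
  audit: AuditEntry;
  airtablePayload: Record<string, string> | null;
}

// Every decision is audited; only approvals yield a payload, and the
// payload contains just the fields the reviewer explicitly selected.
function reviewSubmission(
  submission: StagedSubmission,
  decision: StagingDecision,
  reviewer: string,
  reason: string,
  approvedFields: string[] = [],
  now: Date = new Date(),
): ReviewResult {
  const audit: AuditEntry = {
    submissionId: submission.id,
    decision,
    reviewer,
    reason,
    decidedAt: now.toISOString(),
  };
  if (decision !== "approved") {
    return { audit, airtablePayload: null };
  }
  const airtablePayload: Record<string, string> = {};
  for (const name of approvedFields) {
    const value = submission.fields[name];
    if (value !== undefined) {
      airtablePayload[name] = value;
    }
  }
  return { audit, airtablePayload };
}

const staged: StagedSubmission = {
  id: "sub_123",
  fields: { employer_name: "Acme Corp", gross_pay: "2150.00" },
};
const approved = reviewSubmission(
  staged, "approved", "reviewer@example.org", "Paystub verified", ["employer_name"],
);
const held = reviewSubmission(
  staged, "held", "reviewer@example.org", "Awaiting clearer ID scan",
);
```

Returning `null` for held and rejected submissions makes it hard to push anything to Airtable by accident.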

3) Sentinel (Fraud Analysis)

Purpose: Run rules and scoring on submissions to surface risk flags.

Inputs

Supabase submissions + identity + documents
Extraction results and derived features

Outputs

Fraud signals + risk scores
Evidence bundle for review and audit

Notes

Human review required; no automated denials.
Flags surface in staging and can be pushed to Airtable.
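
The rules-plus-scoring idea can be sketched as a list of weighted rules that each return evidence when they fire. The rule IDs, weights, and feature names below are invented for illustration; the key property is that the result always requires human review and never encodes a denial.

```typescript
// Hypothetical derived features for one submission.
interface SubmissionFeatures {
  nameMatchesId: boolean;
  documentHashSeenBefore: boolean;
  minExtractionConfidence: number; // lowest field confidence, 0..1
}

interface FraudSignal {
  ruleId: string;
  weight: number;
  evidence: string;
}

interface FraudRule {
  id: string;
  weight: number;
  // Returns a human-readable evidence string when the rule fires, else null.
  check: (f: SubmissionFeatures) => string | null;
}

interface RiskAssessment {
  score: number; // 0..100
  signals: FraudSignal[];
  requiresHumanReview: true; // no automated denials
}

// Illustrative rules and weights, not a tuned rule set.
const exampleRules: FraudRule[] = [
  {
    id: "name_mismatch",
    weight: 40,
    check: (f) => (f.nameMatchesId ? null : "Name on ID differs from applicant name"),
  },
  {
    id: "duplicate_document",
    weight: 30,
    check: (f) =>
      f.documentHashSeenBefore ? "Document hash matches an earlier submission" : null,
  },
  {
    id: "low_extraction_confidence",
    weight: 10,
    check: (f) =>
      f.minExtractionConfidence < 0.5
        ? `Lowest field confidence is ${f.minExtractionConfidence}`
        : null,
  },
];

function assessRisk(
  features: SubmissionFeatures,
  rules: FraudRule[] = exampleRules,
): RiskAssessment {
  const signals: FraudSignal[] = [];
  for (const rule of rules) {
    const evidence = rule.check(features);
    if (evidence !== null) {
      signals.push({ ruleId: rule.id, weight: rule.weight, evidence });
    }
  }
  // Sum the weights of fired rules, capped at 100.
  const score = Math.min(100, signals.reduce((sum, s) => sum + s.weight, 0));
  return { score, signals, requiresHumanReview: true };
}

const assessment = assessRisk({
  nameMatchesId: false,
  documentHashSeenBefore: true,
  minExtractionConfidence: 0.8,
});
```

The `signals` array doubles as the evidence bundle: each entry names the rule that fired and why, which is what a reviewer or auditor would need to see.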

4) Forge (Program Studio)

Purpose: Let program managers upload reference docs, ask questions, and generate templates.

Inputs

Program documents, policies, past templates
Public benefit program references

Outputs

Program template schema
Terra-ready form draft
Suggested eligibility logic + question bank

Notes

Uses RAG over Supabase embeddings.
Generates drafts for human refinement.

5) Messaging Studio (Multilingual Outreach)

Purpose: Generate program-aware messaging across channels and languages.

Inputs

Program template + lifecycle events
Approved messaging guidelines

Outputs

Notifications, letters, banners, social copy
Translations via DeepL + human review queue

Notes

Sends are optional; can export copy for manual use.

6) Hub (Applicant 360)

Purpose: A read-only, unified applicant view across Terra and Pathfinder.

Inputs

Supabase applicant + submission data
Airtable records (synced back when needed)

Outputs

Consolidated timeline for internal visibility

Notes

Hub is a secondary layer and not the review source of truth.

End-to-End Data Flow

Control Points (Why This Matters)

Supabase is the system of record for all incoming data.
Airtable only receives curated data from staging.
Fraud signals never auto-block; they require review.
Extraction results are advisory until reviewed and pushed.
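
The control points above could be enforced with an explicit status machine, so that a submission can only reach Airtable after passing through review. The status names and transition table here are assumptions for illustration, not an existing schema.

```typescript
// Hypothetical submission statuses.
type Status =
  | "received"
  | "enriched"
  | "in_review"
  | "approved"
  | "held"
  | "rejected"
  | "pushed_to_airtable";

// Allowed next statuses for each status. The only route to Airtable is
// in_review -> approved -> pushed_to_airtable.
const ALLOWED_TRANSITIONS: Record<Status, readonly Status[]> = {
  received: ["enriched"],
  enriched: ["in_review"],
  in_review: ["approved", "held", "rejected"],
  held: ["in_review"],
  approved: ["pushed_to_airtable"],
  rejected: [],
  pushed_to_airtable: [],
};

function canTransition(from: Status, to: Status): boolean {
  return ALLOWED_TRANSITIONS[from].includes(to);
}

// Throws instead of silently skipping a control point.
function transition(from: Status, to: Status): Status {
  if (!canTransition(from, to)) {
    throw new Error(`Transition ${from} -> ${to} is not allowed`);
  }
  return to;
}
```

For example, `canTransition("enriched", "pushed_to_airtable")` is false, and no status leads to `rejected` except `in_review`, so a fraud signal alone cannot reject a submission.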

RAG + Embeddings Strategy (Supabase)

Given the team’s SQL strength, the initial RAG system should be SQL-first:

Store program documents and templates in Supabase.
Generate embeddings into a program_docs_embeddings table.
Query with pgvector (Supabase) for nearest neighbors.
Build a thin TypeScript service for prompt assembly and generation.

This avoids a Python-first stack and keeps iteration speed high.
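
A rough sketch of the SQL-first approach: a pgvector nearest-neighbor query held as a string, plus a small prompt-assembly function in TypeScript. The column names (`id`, `content`, `embedding`), the `RetrievedChunk` shape, the prompt wording, and the sample chunks are assumptions; only the `program_docs_embeddings` table name comes from the plan above. The query is not executed here.

```typescript
// pgvector's <=> operator returns cosine distance (smaller is closer).
// $1 is the question embedding, $2 the number of chunks to return.
// Column names are assumed for illustration.
const NEAREST_CHUNKS_SQL = `
  select id, content, embedding <=> $1::vector as distance
  from program_docs_embeddings
  order by embedding <=> $1::vector
  limit $2
`;

// Hypothetical row shape returned by the query above.
interface RetrievedChunk {
  id: string;
  content: string;
  distance: number;
}

// Build a prompt from the closest chunks, nearest first.
function assemblePrompt(
  question: string,
  chunks: RetrievedChunk[],
  maxChunks = 3,
): string {
  const context = [...chunks]
    .sort((a, b) => a.distance - b.distance)
    .slice(0, maxChunks)
    .map((chunk, i) => `[${i + 1}] ${chunk.content}`)
    .join("\n");
  return [
    "Answer using only the program documents below.",
    "",
    "Documents:",
    context,
    "",
    `Question: ${question}`,
  ].join("\n");
}

// Placeholder chunks standing in for query results.
const ragPrompt = assemblePrompt("Which income documents are required?", [
  { id: "b", content: "Sample chunk about paystub uploads.", distance: 0.31 },
  { id: "a", content: "Sample chunk about income verification.", distance: 0.12 },
]);
```

Keeping retrieval in SQL and prompt assembly in a pure function means the retrieval query can be tuned in the database while the TypeScript layer stays thin and easy to test.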

Suggested Build Sequence

Review Staging Visualizer (control point + manual Airtable push)
Document Extraction (enrichment + confidence + review surfacing)
Forge Program Studio (RAG + template export to Terra)
Sentinel Fraud (signals framework + review queue)
Messaging Studio (multilingual content generation + approvals)

Open Questions

Where should the staging UI live: new app or a Terra admin module?
Should Airtable push be per-submission or batch-based?
Which document types need extraction first (paystub, ID, bank)?
What is the minimum template schema for Forge → Terra export?
