Architecture
How AI agent systems are designed, orchestrated, and connected to deliver production-grade automation at scale.
AI Agent Architecture
Production AI agents are not simple prompt-and-response systems. They require deliberate orchestration: how context is maintained across turns, how the LLM decides which tool to call, how sub-workflows are triggered, and how failures are handled gracefully. The architecture below reflects those design decisions.
System Architecture Diagram
High-level flow: user message → messaging channel → agent orchestrator → sub-workflows + LLM → external APIs & data sources → response
[Architecture diagram: the full Sophia AI system, showing Meta Lead Ads, WhatsApp Business API, the n8n orchestrator, OpenAI API, Node.js services, Google Sheets, and routing logic.]
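The high-level flow can be illustrated with a minimal sketch of one agent turn: load the user's context, let the model choose between replying and calling a tool, run the sub-workflow, and fall back gracefully on failure. Every name here (`handleMessage`, `Decide`, `Tool`, `FALLBACK`) is hypothetical and chosen for illustration; the production orchestration runs in n8n and is not reproduced here.

```typescript
// Hypothetical types for illustration only; none come from the production system.
interface Turn { role: "user" | "assistant"; text: string }
interface ToolCall { tool: string; args: Record<string, string> }
type Decision =
  | { kind: "reply"; text: string }
  | { kind: "tool"; call: ToolCall };

// The LLM is abstracted as a function that reads the history and picks an action.
type Decide = (history: Turn[]) => Promise<Decision>;
// A tool stands in for a sub-workflow (for example, appointment booking).
type Tool = (args: Record<string, string>) => Promise<string>;

const FALLBACK = "Sorry, something went wrong. A team member will follow up shortly.";

async function handleMessage(
  userId: string,
  text: string,
  memory: Map<string, Turn[]>,
  decide: Decide,
  tools: Record<string, Tool>,
): Promise<string> {
  // 1. Load the per-user context and record the incoming message.
  const history = memory.get(userId) ?? [];
  history.push({ role: "user", text });

  let reply: string;
  try {
    // 2. Let the model decide: answer directly or call a tool.
    const decision = await decide(history);
    if (decision.kind === "tool") {
      const tool = tools[decision.call.tool];
      if (!tool) throw new Error(`Unknown tool: ${decision.call.tool}`);
      // 3. Run the sub-workflow and use its result as the reply.
      reply = await tool(decision.call.args);
    } else {
      reply = decision.text;
    }
  } catch {
    // 4. Any failure (model error, unknown tool, tool error) yields a safe reply.
    reply = FALLBACK;
  }

  // 5. Persist the assistant turn so the next message sees the full context.
  history.push({ role: "assistant", text: reply });
  memory.set(userId, history);
  return reply;
}
```

Passing the model and the tools in as plain functions keeps the turn logic testable without network access, which is one reasonable way to keep such a system debuggable.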
Automation Stack
Each layer of the stack is selected for reliability, observability, and extensibility. The goal is a system that can be understood, debugged, and extended by any competent engineer — not a black box dependent on a single vendor or paradigm.
n8n provides visual, code-extensible workflow automation. It wires together triggers, AI model calls, API requests, and business logic into reliable, auditable pipelines, and currently runs 20+ active production workflows.
Lightweight, stateless Node.js service modules handle complex decision trees, data transformations, and routing logic that exceeds what n8n nodes can express natively. Prior to n8n, the entire lead pipeline ran on custom Node.js services.
Multi-model strategy: GPT-4o for conversational agents (Sophia) and intent classification; Whisper for audio transcription; Claude for coding and reasoning tasks; Gemini for large-context processing and GCP ecosystem integration.
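A multi-model strategy like this is often expressed as a small lookup from task type to model family. The sketch below is illustrative only: the `Task` names and the `pickModel` helper are hypothetical, and the values are family labels taken from the description above, not exact API model identifiers.

```typescript
// Hypothetical task categories derived from the multi-model description.
type Task =
  | "conversation"
  | "intent_classification"
  | "transcription"
  | "coding"
  | "large_context";

// Family labels only; real deployments would use provider-specific model IDs.
const MODEL_FOR_TASK: Record<Task, string> = {
  conversation: "GPT-4o",
  intent_classification: "GPT-4o",
  transcription: "Whisper",
  coding: "Claude",
  large_context: "Gemini",
};

// Resolve which model family should handle a given task.
function pickModel(task: Task): string {
  return MODEL_FOR_TASK[task];
}
```

Typing the table as `Record<Task, string>` means the compiler flags any task category that has no model assigned.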
ElevenLabs voice synthesis generates personalized audio greetings in the customer's detected language using language-specific voice models. The output is Opus-encoded audio delivered directly via WhatsApp, and the session is recorded in MongoDB chat memory.
Real-time bidirectional WhatsApp messaging via Meta's official Cloud API, covering webhook ingestion, template messages, interactive buttons, and media. Twilio was used for SMS in the pre-n8n Node.js pipeline.
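Webhook ingestion usually starts by flattening the nested notification payload into a list of inbound messages. The sketch below assumes the commonly documented Cloud API shape (`entry[].changes[].value.messages[]`); the `extractMessages` helper and the `InboundMessage` type are hypothetical, and the field names should be checked against Meta's current reference before use.

```typescript
// Simplified, hypothetical view of an inbound message.
interface InboundMessage {
  from: string;        // sender's WhatsApp ID
  id: string;          // message ID
  type: string;        // "text", "audio", "interactive", ...
  text: string | null; // body for text messages, otherwise null
}

// Only the fields this sketch reads; the real payload carries more.
interface WebhookPayload {
  entry?: Array<{
    changes?: Array<{
      field?: string;
      value?: {
        messages?: Array<{
          from: string;
          id: string;
          type: string;
          text?: { body: string };
        }>;
      };
    }>;
  }>;
}

function extractMessages(payload: WebhookPayload): InboundMessage[] {
  const out: InboundMessage[] = [];
  for (const entry of payload.entry ?? []) {
    for (const change of entry.changes ?? []) {
      // Ignore change notifications for other subscribed fields.
      if (change.field !== "messages") continue;
      // Status-only notifications have no messages array and yield nothing.
      for (const m of change.value?.messages ?? []) {
        out.push({
          from: m.from,
          id: m.id,
          type: m.type,
          text: m.text?.body ?? null,
        });
      }
    }
  }
  return out;
}
```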
Appointment booking runs as a sub-workflow invoked through LangChain tool use. The agent extracts appointment details from the conversation (time, client name, email, salesperson) via $fromAI() and creates structured calendar events with participant metadata.
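Because the appointment fields are extracted by a model, it helps to validate them before creating a calendar event. The following is a minimal sketch of that step under stated assumptions: `buildEvent`, `AppointmentInput`, and `CalendarEvent` are hypothetical names, the event shape is not the calendar provider's actual schema, and the email check is deliberately loose.

```typescript
// Hypothetical shapes for illustration; not the calendar provider's schema.
interface AppointmentInput {
  time: string;        // expected to be a parseable date-time, e.g. ISO 8601
  clientName: string;
  email: string;
  salesperson: string;
}

interface CalendarEvent {
  title: string;
  startsAt: string; // normalized ISO 8601 timestamp
  participants: Array<{ name: string; email: string }>;
  metadata: { salesperson: string };
}

type BuildResult =
  | { ok: true; event: CalendarEvent }
  | { ok: false; errors: string[] };

function buildEvent(input: Partial<AppointmentInput>): BuildResult {
  // Normalize first so missing fields become empty strings.
  const clientName = (input.clientName ?? "").trim();
  const email = (input.email ?? "").trim();
  const salesperson = (input.salesperson ?? "").trim();
  const start = new Date(input.time ?? "");

  const errors: string[] = [];
  if (Number.isNaN(start.getTime())) errors.push("time must be a parseable date-time");
  if (clientName === "") errors.push("clientName is required");
  if (!/^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(email)) errors.push("email looks invalid");
  if (salesperson === "") errors.push("salesperson is required");
  if (errors.length > 0) return { ok: false, errors };

  return {
    ok: true,
    event: {
      title: `Appointment: ${clientName} with ${salesperson}`,
      startsAt: start.toISOString(),
      participants: [{ name: clientName, email }],
      metadata: { salesperson },
    },
  };
}
```

Returning the list of errors instead of throwing lets the agent ask the customer for exactly the details that are still missing.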
MongoDB stores persistent conversation memory (chatMemory collection, keyed by WhatsApp ID) and the live inventory database (ucdInventory). Google Sheets API serves as a low-friction operational datastore for leads and records. MySQL provides relational persistence.
Document AI processors automate structured data extraction from automotive funding forms (credit applications, lease documents), replacing manual data entry. Vertex AI is used for Gemini orchestration and cloud-native ML tasks.
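Conversation memory keyed by WhatsApp ID can be pictured as a per-user message list with a bounded window. The sketch below uses an in-memory `Map` as a stand-in for the chatMemory collection so that it runs without a database; the `ChatMemory` class, its method names, and the 20-message default are assumptions made for illustration.

```typescript
interface ChatMessage {
  role: "user" | "assistant";
  text: string;
  at: number; // epoch milliseconds
}

// In-memory stand-in for a collection keyed by WhatsApp ID (hypothetical API).
class ChatMemory {
  private readonly sessions = new Map<string, ChatMessage[]>();
  private readonly maxMessages: number;

  constructor(maxMessages = 20) {
    this.maxMessages = maxMessages;
  }

  // Append a message and keep only the most recent maxMessages entries.
  append(waId: string, message: ChatMessage): void {
    const history = this.sessions.get(waId) ?? [];
    history.push(message);
    this.sessions.set(waId, history.slice(-this.maxMessages));
  }

  // Return a copy so callers cannot mutate the stored history.
  recent(waId: string): ChatMessage[] {
    return [...(this.sessions.get(waId) ?? [])];
  }
}
```

Capping the window keeps the prompt size predictable as conversations grow; a persistent store would apply the same idea with a query limit or a capped array update.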
Facebook/Instagram lead forms feed into the pipeline via webhook, triggering qualification within seconds. Google Sheets API, Drive API, and Gmail API power inventory synchronization, document management, and automated email workflows.
AWS: EC2 (compute), S3 (object storage), Route 53 (DNS), SES (transactional email), IAM (access management). GCP: Cloud Functions, Cloud Storage, Document AI, Vertex AI. Multi-cloud deployment strategy based on service fit.
Design Principles
Tools & Technologies
The full inventory of tools, platforms, and protocols used across AI agent and automation projects — organized by function.
AI & Language Models
- OpenAI GPT / GPT-4o-mini: Conversational agents & routing
- OpenAI Whisper: Audio transcription
- Anthropic Claude: Coding & complex reasoning
- Google Gemini / Vertex AI: Large-context processing & GCP
- ElevenLabs eleven_multilingual_v2: Voice synthesis (30+ languages)
- Function Calling / Tool Use: Structured agent actions
- Prompt Engineering: System prompts & context control
- RAG Pipelines: Retrieval-augmented generation
Orchestration & Automation
- n8n (20+ production workflows): Visual workflow orchestration
- LangChain (LLM agent framework): Agent + tool invocation
- Webhook ingestion: Event-driven triggers
- Cron scheduling: Time-based automation
- Error handling & retries: Resilient pipelines
- Mautic: Email campaign automation
- Zapier: Supplemental automation
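The "Error handling & retries" entry above can be illustrated with a generic retry wrapper using exponential backoff. This is a minimal sketch, not the pipelines' actual retry policy: `withRetry`, the default of three attempts, and the 200 ms base delay are assumptions, and the `sleep` parameter exists only so the delay can be replaced in tests.

```typescript
// Retry an async operation, doubling the wait after each failed attempt.
async function withRetry<T>(
  fn: () => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 200,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      // Wait baseDelayMs, then 2x, then 4x, ... but not after the final attempt.
      if (attempt < maxAttempts) {
        await sleep(baseDelayMs * 2 ** (attempt - 1));
      }
    }
  }
  // All attempts failed: surface the last error to the caller.
  throw lastError;
}
```

In practice a wrapper like this should only be applied to idempotent calls, since a request that timed out may still have succeeded on the remote side.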
Runtime & Backend
- Node.js: Primary runtime
- TypeScript / JavaScript: Service & agent code
- REST APIs: Integration pattern
- Python (scripting): Data processing & ML tasks
- Postman collections: API testing & CRM integration
Messaging, Calendar & Channels
- WhatsApp Business Cloud API: Bidirectional messaging
- Meta Lead Ads webhooks: Lead ingestion
- Twilio: SMS in early pipeline
- Nylas Calendar API: Appointment booking
- ElevenLabs Voice: Audio greetings via WhatsApp
Data & Storage
- MongoDB Atlas: Chat memory & inventory DB
- Google Sheets API: Operational datastore
- Google Drive & Gmail APIs: Document & email workflows
- MySQL: Relational persistence
- Google Cloud Document AI: OCR & structured extraction
Cloud & Infrastructure
- AWS EC2: Compute
- AWS S3: Object storage
- AWS Route 53 / SES / IAM: DNS, email, access control
- GCP Cloud Functions: Serverless execution
- GCP Cloud Storage: File storage
- Git / GitHub: Version control