Engineering Approach

Architecture

How AI agent systems are designed, orchestrated, and connected to deliver production-grade automation at scale.

AI Agent Architecture

Production AI agents are not simple prompt-and-response systems. They require deliberate orchestration: how context is maintained across turns, how the LLM decides which tool to call, how sub-workflows are triggered, and how failures are handled gracefully. The architecture below reflects those design decisions.

LLM Orchestration

Language model calls are structured with system prompts, conversation history, function definitions, and output schemas. The orchestrator manages context window limits, injects dynamic data (inventory, appointments), and routes the model's tool-use decisions to downstream services.
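
One way to picture the context-window management described above is a small budget-based trimmer. This is a minimal sketch, not the production orchestrator: the `ChatMessage` shape, the `buildContext` function, and the four-characters-per-token estimate are all assumptions made for illustration.

```typescript
// Hypothetical message shape; chat APIs generally use a similar role/content pair.
type Role = "system" | "user" | "assistant";
interface ChatMessage {
  role: Role;
  content: string;
}

// Rough token estimate (about 4 characters per token). This is an assumption;
// a real system would use the model's own tokenizer.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Build the message list: system prompt first, then injected dynamic data,
// then as many of the most recent turns as fit in the remaining budget.
function buildContext(
  systemPrompt: string,
  dynamicData: string,
  history: ChatMessage[],
  maxTokens: number,
): ChatMessage[] {
  const fixed: ChatMessage[] = [
    { role: "system", content: systemPrompt },
    { role: "system", content: `Current data:\n${dynamicData}` },
  ];
  let budget = maxTokens - fixed.reduce((sum, m) => sum + estimateTokens(m.content), 0);

  const kept: ChatMessage[] = [];
  // Walk from newest to oldest so recent turns are kept first.
  for (const msg of [...history].reverse()) {
    const cost = estimateTokens(msg.content);
    if (cost > budget) break;
    budget -= cost;
    kept.unshift(msg);
  }
  return [...fixed, ...kept];
}
```

Dropping the oldest turns first is the simplest policy; summarizing older turns instead is a common alternative when early context matters.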

Workflow Routing

Each user message is classified by intent before processing. Routing logic determines whether the request flows to a qualification workflow, an inventory query, a booking flow, or a human escalation path. This prevents the LLM from being the sole decision-maker for business-critical branching.
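
The routing step can be sketched as a classifier followed by a fixed lookup table. In this hedged example the keyword rules are a stand-in for the model-based classifier so the code runs offline, and the intent names and workflow paths are hypothetical.

```typescript
type Intent = "qualification" | "inventory_query" | "booking" | "escalation";

// Keyword rules stand in for the model-based classifier (illustrative only).
const RULES: Array<{ intent: Intent; pattern: RegExp }> = [
  { intent: "escalation", pattern: /\b(human|agent|complaint)\b/i },
  { intent: "booking", pattern: /\b(book|appointment|test drive|schedule)\b/i },
  { intent: "inventory_query", pattern: /\b(price|available|stock|model)\b/i },
];

function classifyIntent(message: string): Intent {
  for (const rule of RULES) {
    if (rule.pattern.test(message)) return rule.intent;
  }
  return "qualification"; // default path for new or unclear leads
}

// Deterministic routing table: the branch is chosen by code, not by the LLM.
// The workflow paths are hypothetical names.
const ROUTES: Record<Intent, string> = {
  qualification: "workflow/lead-qualification",
  inventory_query: "workflow/inventory-query",
  booking: "workflow/booking",
  escalation: "workflow/human-handoff",
};

function route(message: string): string {
  return ROUTES[classifyIntent(message)];
}
```

Because `ROUTES` is typed as `Record<Intent, string>`, adding a new intent without a matching route is a compile-time error, which keeps the branching logic auditable.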

External API Integrations

Agents interact with the outside world through typed integration modules: thin wrappers around external APIs that normalize errors, handle retries, and return structured data the LLM can reason about. The LLM never calls an API directly; every call goes through one of these modules.
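
A wrapper of this kind might look like the following sketch. The `ApiResult` type, `normalizeStatus`, and `callWithRetry` are hypothetical names, and the retry policy (HTTP 429 and 5xx are retryable, exponential backoff) is an assumption rather than the documented behavior of the production modules.

```typescript
// Structured result the orchestrator (and the LLM) can reason about.
type ApiResult<T> =
  | { ok: true; data: T }
  | { ok: false; retryable: boolean; message: string };

// Map an HTTP status onto the normalized failure shape.
function normalizeStatus(status: number): { ok: false; retryable: boolean; message: string } {
  const retryable = status === 429 || status >= 500;
  return { ok: false, retryable, message: `Upstream responded with HTTP ${status}` };
}

// Run a call up to maxAttempts times, retrying only retryable failures
// with exponential backoff. `sleep` is injected so callers control timing.
async function callWithRetry<T>(
  call: () => Promise<ApiResult<T>>,
  sleep: (ms: number) => Promise<void>,
  maxAttempts = 3,
  baseDelayMs = 200,
): Promise<ApiResult<T>> {
  let last: ApiResult<T> = { ok: false, retryable: false, message: "No attempts made" };
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      last = await call();
    } catch (err) {
      // Thrown (for example network-level) errors are treated as retryable.
      const message = err instanceof Error ? err.message : String(err);
      last = { ok: false, retryable: true, message };
    }
    if (last.ok || !last.retryable) return last;
    if (attempt < maxAttempts) await sleep(baseDelayMs * 2 ** (attempt - 1));
  }
  return last;
}
```

In a Node.js service, `sleep` would typically be a `setTimeout`-based promise; tests can pass a no-op to run instantly.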

Automation Pipelines

n8n workflows serve as the durable execution layer. They handle event triggers, sequential steps, parallel branches, error handling, and human-in-the-loop checkpoints — ensuring the agent operates reliably even when upstream services experience delays or failures.

System Architecture Diagram

User: WhatsApp message
Messaging Channel: WhatsApp Business Cloud API
Agent Orchestrator: n8n + Node.js
Lead Qualification: Intent classification
LLM Reasoning: OpenAI GPT-4
Booking Flow: Calendar integration
Meta Lead Ads: Lead ingestion
Google Sheets: CRM / data store
Inventory API: Vehicle data

High-level flow: user message → messaging channel → agent orchestrator → sub-workflows + LLM → external APIs & data sources → response

Automation Stack

Each layer of the stack is selected for reliability, observability, and extensibility. The goal is a system that can be understood, debugged, and extended by any competent engineer — not a black box dependent on a single vendor or paradigm.

Orchestration: n8n

Visual, code-extensible workflow automation. Wires together triggers, AI model calls, API requests, and business logic into reliable, auditable pipelines. Currently running 20+ active production workflows.

Business Logic: Node.js Microservices

Lightweight, stateless service modules handle complex decision trees, data transformations, and routing logic that exceeds what n8n nodes can express natively. Prior to n8n, the entire lead pipeline ran on custom Node.js services.

AI / LLM Layer: OpenAI GPT + Whisper · Claude · Gemini

Multi-model strategy: GPT-4o for conversational agents (Sophia) and intent classification; Whisper for audio transcription; Claude for coding and reasoning tasks; Gemini for large-context processing and GCP ecosystem integration.

Voice AI: ElevenLabs eleven_multilingual_v2

Generates personalized audio greetings in the customer's detected language using language-specific voice models. Output is Opus-encoded audio delivered directly via WhatsApp. Each session is recorded in the MongoDB chat memory.

Messaging Channel: WhatsApp Business Cloud API · Twilio

Real-time bidirectional messaging via Meta's official Cloud API: webhook ingestion, template messages, interactive buttons, and media. Twilio was used for SMS in the pre-n8n Node.js pipeline.

Calendar & Scheduling: Nylas Calendar API

The appointment booking sub-workflow is invoked as a LangChain tool. The agent extracts appointment details from the conversation (time, client name, email, salesperson) via $fromAI() and creates structured calendar events with participant metadata.
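
Details extracted by a model are worth validating before a calendar event is created. The sketch below shows one way to do that; the `AppointmentDetails` field names and the `validateAppointment` function are hypothetical and are not taken from the n8n workflow or the Nylas API.

```typescript
// Hypothetical shape of the details the agent extracts from the conversation.
interface AppointmentDetails {
  startTime: string; // expected to be an ISO 8601 timestamp
  clientName: string;
  email: string;
  salesperson: string;
}

type Validation =
  | { valid: true; value: AppointmentDetails }
  | { valid: false; errors: string[] };

// Check that every field is present and that the time and email are well formed.
function validateAppointment(raw: Record<string, unknown>): Validation {
  const errors: string[] = [];
  const text = (key: string): string => {
    const v = raw[key];
    if (typeof v !== "string" || v.trim() === "") {
      errors.push(`${key} is missing`);
      return "";
    }
    return v.trim();
  };

  const startTime = text("startTime");
  const clientName = text("clientName");
  const email = text("email");
  const salesperson = text("salesperson");

  if (startTime !== "" && Number.isNaN(Date.parse(startTime))) {
    errors.push("startTime is not a valid date");
  }
  // Deliberately loose email check; stricter validation is left to the mail system.
  if (email !== "" && !/^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(email)) {
    errors.push("email is not valid");
  }

  return errors.length > 0
    ? { valid: false, errors }
    : { valid: true, value: { startTime, clientName, email, salesperson } };
}
```

When validation fails, the error list can be fed back to the agent so it asks the customer for the missing detail instead of booking with incomplete data.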

Data & Storage: MongoDB Atlas · Google Sheets API · MySQL

MongoDB stores persistent conversation memory (chatMemory collection, keyed by WhatsApp ID) and the live inventory database (ucdInventory). Google Sheets API serves as a low-friction operational datastore for leads and records. MySQL provides relational persistence.
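
The idea of conversation memory keyed by WhatsApp ID can be illustrated with an in-memory stand-in. This sketch does not use the MongoDB driver, and the document fields (`waId`, `turns`, `at`) and the turn cap are assumptions, not the actual chatMemory schema.

```typescript
// Hypothetical document shape for one conversation.
interface StoredTurn {
  role: "user" | "assistant";
  text: string;
  at: string; // ISO 8601 timestamp
}
interface ChatMemoryDoc {
  waId: string; // WhatsApp ID used as the lookup key
  turns: StoredTurn[];
}

// In-memory stand-in for the collection, so the sketch runs without a database.
class InMemoryChatMemory {
  private docs = new Map<string, ChatMemoryDoc>();
  private maxTurns: number;

  constructor(maxTurns: number) {
    this.maxTurns = maxTurns;
  }

  append(waId: string, turn: StoredTurn): void {
    const doc = this.docs.get(waId) ?? { waId, turns: [] };
    doc.turns.push(turn);
    // Keep only the newest maxTurns entries so each document stays bounded.
    if (doc.turns.length > this.maxTurns) {
      doc.turns = doc.turns.slice(-this.maxTurns);
    }
    this.docs.set(waId, doc);
  }

  history(waId: string): StoredTurn[] {
    return this.docs.get(waId)?.turns ?? [];
  }
}
```

Keying by WhatsApp ID means a returning customer resumes the same conversation without any separate session lookup.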

Intelligent Document Processing: Google Cloud Document AI · Vertex AI

Document AI processors automate structured data extraction from automotive funding forms (credit applications, lease documents), replacing manual data entry. Vertex AI is used for Gemini orchestration and cloud-native ML tasks.

Lead Acquisition: Meta Lead Ads · Google Workspace APIs

Facebook/Instagram lead forms feed into the pipeline via webhook, triggering qualification within seconds. Google Sheets API, Drive API, and Gmail API power inventory synchronization, document management, and automated email workflows.

Cloud Infrastructure: AWS · Google Cloud Platform

AWS: EC2 (compute), S3 (object storage), Route 53 (DNS), SES (transactional email), IAM (access management). GCP: Cloud Functions, Cloud Storage, Document AI, Vertex AI. Each workload is deployed on whichever cloud offers the best-fitting service.

Design Principles

Separation of Concerns

The LLM handles language understanding. Orchestration handles workflow. APIs handle data. No single component does everything — this makes each layer independently testable and replaceable.

Graceful Degradation

When an external API is unavailable or an LLM call fails, the system routes to a fallback path rather than surfacing a raw error to the end user. Every integration point has an error strategy.
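
The fallback pattern can be sketched as a small wrapper around response generation. `replyWithFallback` and the `Reply` shape are hypothetical, and the example is synchronous for brevity; real model and API calls are asynchronous, but the same structure applies with async/await.

```typescript
interface Reply {
  text: string;
  degraded: boolean; // true when the fallback path was used
}

// Try the primary generator; on any failure, record the raw error internally
// and return a safe, pre-written message instead of surfacing the error.
function replyWithFallback(
  generate: () => string,
  fallbackText: string,
  onError: (err: unknown) => void,
): Reply {
  try {
    const text = generate();
    if (text.trim() === "") throw new Error("empty model response");
    return { text, degraded: false };
  } catch (err) {
    onError(err);
    return { text: fallbackText, degraded: true };
  }
}
```

The `degraded` flag lets the orchestrator decide what happens next, for example queueing the conversation for a human follow-up.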

Auditability

Every significant agent decision — intent classification result, tool call made, response sent — is logged with timestamps and metadata. This enables debugging, performance analysis, and compliance.
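
A structured audit record for these decisions might look like the sketch below. The event type names, the `AuditEvent` fields, and the one-JSON-object-per-line format are assumptions for illustration, not the system's actual log schema.

```typescript
type AuditEventType = "intent_classified" | "tool_called" | "response_sent";

interface AuditEvent {
  timestamp: string; // ISO 8601, UTC
  conversationId: string;
  type: AuditEventType;
  metadata: Record<string, unknown>;
}

// Build an audit record. `now` is injectable so tests can use a fixed clock.
function auditEvent(
  conversationId: string,
  type: AuditEventType,
  metadata: Record<string, unknown>,
  now: Date = new Date(),
): AuditEvent {
  return { timestamp: now.toISOString(), conversationId, type, metadata };
}

// Serialize as a single JSON line, a format most log pipelines can ingest.
function toLogLine(event: AuditEvent): string {
  return JSON.stringify(event);
}
```

Restricting `type` to a closed union keeps event names consistent, which makes later filtering and aggregation straightforward.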

Human-in-the-Loop Checkpoints

For high-stakes decisions (escalation, large transactions, ambiguous intent), the system pauses and routes to a human operator rather than guessing. Automation should increase reliability, not introduce unchecked risk.
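
The checkpoint logic can be expressed as an explicit policy check. In this sketch the `AgentDecision` fields, the `EscalationPolicy` thresholds, and the reason strings are hypothetical; actual thresholds would be set per deployment.

```typescript
interface AgentDecision {
  intent: string;
  confidence: number; // 0..1, as reported by the intent classifier
  transactionAmount: number; // 0 when no transaction is involved
  userAskedForHuman: boolean;
}

interface EscalationPolicy {
  minConfidence: number; // below this, intent is treated as ambiguous
  maxAutoAmount: number; // above this, a human must approve
}

// Collect every reason the decision should pause for a human operator.
function escalationReasons(d: AgentDecision, p: EscalationPolicy): string[] {
  const reasons: string[] = [];
  if (d.userAskedForHuman) reasons.push("user requested a human");
  if (d.confidence < p.minConfidence) reasons.push("ambiguous intent");
  if (d.transactionAmount > p.maxAutoAmount) reasons.push("large transaction");
  return reasons;
}

function shouldPauseForHuman(d: AgentDecision, p: EscalationPolicy): boolean {
  return escalationReasons(d, p).length > 0;
}
```

Returning the reasons rather than a bare boolean gives the human operator context for why the conversation was handed over.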

Tools & Technologies

The full inventory of tools, platforms, and protocols used across AI agent and automation projects — organized by function.

AI & Language Models

OpenAI GPT / GPT-4o-mini: Conversational agents & routing
OpenAI Whisper: Audio transcription
Anthropic Claude: Coding & complex reasoning
Google Gemini / Vertex AI: Large-context processing & GCP
ElevenLabs eleven_multilingual_v2: Voice synthesis (30+ languages)
Function Calling / Tool Use: Structured agent actions
Prompt Engineering: System prompts & context control
RAG Pipelines: Retrieval-augmented generation

Orchestration & Automation

n8n (20+ production workflows): Visual workflow orchestration
LangChain (LLM agent framework): Agent + tool invocation
Webhook ingestion: Event-driven triggers
Cron scheduling: Time-based automation
Error handling & retries: Resilient pipelines
Mautic: Email campaign automation
Zapier: Supplemental automation

Runtime & Backend

Node.js: Primary runtime
TypeScript / JavaScript: Service & agent code
REST APIs: Integration pattern
Python (scripting): Data processing & ML tasks
Postman collections: API testing & CRM integration

Messaging, Calendar & Channels

WhatsApp Business Cloud API: Bidirectional messaging
Meta Lead Ads webhooks: Lead ingestion
Twilio: SMS in early pipeline
Nylas Calendar API: Appointment booking
ElevenLabs Voice: Audio greetings via WhatsApp

Data & Storage

MongoDB Atlas: Chat memory & inventory DB
Google Sheets API: Operational datastore
Google Drive & Gmail APIs: Document & email workflows
MySQL: Relational persistence
Google Cloud Document AI: OCR & structured extraction

Cloud & Infrastructure

AWS EC2: Compute
AWS S3: Object storage
AWS Route 53 / SES / IAM: DNS, email, access control
GCP Cloud Functions: Serverless execution
GCP Cloud Storage: File storage
Git / GitHub: Version control