
High-Level Architecture

Architecture diagram

Architectural Principles

The system design is governed by five core principles:
  1. Separation of Concerns: Clear logical boundaries are enforced between ingestion (write), retrieval (read), and optimization (background) processes.
  2. Distributed Processing: High-volume ingestion is handled via async queues to absorb load without blocking the retrieval pipeline.
  3. Dual-Write Consistency: The system enforces synchronized persistence between the Metadata Store and Vector Database to ensure index integrity (see the sketch after this list).
  4. Horizontal Scalability: The API layer is stateless, enabling automatic scaling across nodes.
  5. Schema Agnosticism: The architecture supports user-defined memory schemas via the Console without requiring code changes.
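
As a rough illustration of the dual-write principle, the sketch below persists a memory record to the Metadata Store first and only then upserts its embedding into the Vector Database, undoing the metadata write if the vector upsert fails. The store interfaces, record shape, and function names are assumptions made for the example, not the actual implementation.

```typescript
// Hypothetical store interfaces; names and shapes are illustrative only.
interface MetadataStore {
  put(id: string, record: Record<string, unknown>): Promise<void>;
  delete(id: string): Promise<void>;
}

interface VectorStore {
  upsert(id: string, embedding: number[]): Promise<void>;
}

// Dual-write: persist metadata first, then the vector; undo the metadata
// write if the vector upsert fails so the two stores never drift apart.
async function writeMemory(
  metadata: MetadataStore,
  vectors: VectorStore,
  id: string,
  record: Record<string, unknown>,
  embedding: number[],
): Promise<void> {
  await metadata.put(id, record);
  try {
    await vectors.upsert(id, embedding);
  } catch (err) {
    await metadata.delete(id); // compensating action keeps the index consistent
    throw err;
  }
}
```

A production system would likely back this up with reconciliation jobs, but the compensating delete is enough to convey the consistency guarantee the principle describes.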

Component Layers

1. Management Layer

  • Visual Configuration Console: A React SPA for zero-code configuration and schema definition (an example schema is sketched after this list).
  • AI Configuration Wizard: Automated tool for generating system configurations.
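
To make schema definition concrete, here is a hypothetical example of the kind of memory schema a user might assemble through the Console. The field names, types, and flags are illustrative assumptions rather than the Console's actual output format.

```typescript
// Hypothetical user-defined memory schema as the Console might produce it;
// field names and structure are illustrative assumptions.
const customerSupportMemory = {
  name: "customer_support_ticket",
  fields: [
    { name: "summary", type: "text", embed: true },            // embedded for similarity search
    { name: "customerId", type: "string", filterable: true },
    { name: "priority", type: "enum", values: ["low", "medium", "high"] },
    { name: "createdAt", type: "timestamp", filterable: true },
  ],
} as const;
```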

2. API Gateway

  • Request Router: Stateless serverless functions that route traffic to the appropriate engine (a minimal sketch follows this list).
  • Rate Limiting & Auth: Managed services handling security and throughput control.
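
The sketch below shows one way a stateless router handler could hand traffic to the engines. The engine clients, route paths, and event shape are assumptions made for the example, not the real gateway code.

```typescript
// Minimal stateless routing sketch; engine clients and routes are illustrative.
type Engine = { handle(body: unknown): Promise<unknown> };

interface HttpEvent { method: string; path: string; body?: string }

function makeRouter(ingestion: Engine, search: Engine) {
  // The closure holds only configuration, never per-request state,
  // so instances can be added or removed freely (horizontal scalability).
  return async (event: HttpEvent) => {
    const payload = event.body ? JSON.parse(event.body) : {};
    if (event.method === "POST" && event.path === "/memories") {
      return { statusCode: 202, body: await ingestion.handle(payload) };
    }
    if (event.method === "POST" && event.path === "/search") {
      return { statusCode: 200, body: await search.handle(payload) };
    }
    return { statusCode: 404, body: { error: "unknown route" } };
  };
}
```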

3. Processing Engine

  • Ingestion Pipeline: Async extraction and validation engine.
  • Adaptive Search Engine: The core logic for intent detection and strategy routing (see the routing sketch after this list).
  • Optimization Workers: Background processes that handle deduplication, parameter tuning, and centroid calibration.
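
The sketch below shows one plausible shape for intent detection and strategy routing. The intents, heuristics, and stubbed strategy functions are illustrative assumptions, not the engine's actual rules.

```typescript
// Illustrative intent detection and strategy routing; all names are assumed.
type SearchIntent = "lookup" | "semantic" | "temporal";

function detectIntent(query: string): SearchIntent {
  if (/\b(yesterday|last week|since|before|after)\b/i.test(query)) return "temporal";
  if (/^[\w-]+:\S+$/.test(query)) return "lookup";   // e.g. "ticket:1234" style exact lookups
  return "semantic";                                 // default to vector similarity search
}

async function search(query: string): Promise<unknown[]> {
  switch (detectIntent(query)) {
    case "lookup":   return exactMatch(query);       // hit the metadata store directly
    case "temporal": return hybridSearch(query, { sortBy: "timestamp" });
    case "semantic": return vectorSearch(query);     // embed the query and search the vector DB
  }
}

// Hypothetical strategy implementations, stubbed for the sketch.
async function exactMatch(q: string): Promise<unknown[]> { return []; }
async function hybridSearch(q: string, opts: { sortBy: string }): Promise<unknown[]> { return []; }
async function vectorSearch(q: string): Promise<unknown[]> { return []; }
```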

4. Intelligence Layer

  • Cloud AI APIs: Abstraction layer for LLM integration, pattern detection, and embedding generation (a possible interface is sketched after this list).
  • Semantic Enrichment: Service responsible for bidirectional context expansion.
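
One possible shape for this abstraction is sketched below, assuming a provider interface with embedding and completion calls and a hypothetical EnrichmentService; neither is taken from the actual codebase, and the enrichment prompt is only a stand-in for bidirectional context expansion.

```typescript
// Possible shape of the LLM/embedding abstraction layer; the interface and
// the EnrichmentService wiring are illustrative assumptions.
interface AiProvider {
  embed(texts: string[]): Promise<number[][]>;   // embedding generation
  complete(prompt: string): Promise<string>;     // LLM calls for pattern detection, enrichment
}

// Swapping cloud vendors only requires a new AiProvider implementation;
// the processing engine depends solely on this interface.
class EnrichmentService {
  constructor(private readonly ai: AiProvider) {}

  // Bidirectional context expansion, sketched: ask the model for the likely
  // preceding and following context, then embed the expanded text.
  async enrich(memoryText: string): Promise<{ expanded: string; embedding: number[] }> {
    const expanded = await this.ai.complete(
      `Expand the following note with the likely preceding and following context:\n${memoryText}`,
    );
    const [embedding] = await this.ai.embed([expanded]);
    return { expanded, embedding };
  }
}
```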

5. Storage Layer

  • Metadata Store: NoSQL database providing ACID transactions for metadata and configuration records.
  • Vector Database: Managed Vector Store for high-performance similarity search.
  • Analytics Store: Time-Series Database for recording retrieval telemetry and optimization feedback.
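
As a sketch of what the Analytics Store might record, the example below defines a hypothetical retrieval-telemetry point and a write helper; the field names and the TimeSeriesStore interface are assumptions, not the actual telemetry schema.

```typescript
// Illustrative retrieval-telemetry record; fields and interface are assumed.
interface RetrievalTelemetry {
  timestamp: number;     // epoch millis of the query
  queryIntent: string;   // intent chosen by the adaptive search engine
  strategy: string;      // strategy actually executed
  latencyMs: number;     // end-to-end retrieval latency
  resultCount: number;   // number of memories returned
}

interface TimeSeriesStore {
  write(measurement: string, point: RetrievalTelemetry): Promise<void>;
}

// Optimization workers could later read these points back to tune parameters
// (e.g. strategy thresholds) from observed retrieval behaviour.
async function recordRetrieval(store: TimeSeriesStore, point: RetrievalTelemetry): Promise<void> {
  await store.write("retrieval_telemetry", point);
}
```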