12  System Architecture

This section describes the architectural principles that guide the design, evolution, and scalability of the chatbot system. The objective is to support the initial deployment for the SRTFA production line and ensure seamless expansion to additional lines and domains across the factory.


12.1 Scalability Principles (Horizontal Scaling)

The system architecture prioritizes horizontal scalability, enabling growth as new production lines, machine families, and operational areas adopt the chatbot.

12.1.1 Why Horizontal Scaling

  • Allows multiple backend instances to run in parallel with load balancing.
  • Supports increased user concurrency across different production lines.
  • Ensures the vector database can grow as new technical documents and knowledge domains are added.
  • Enables isolated knowledge contexts (e.g., SRTFA, SRTFB, SMT, Final Assembly) without major rework.

12.1.2 Practical Application

  • Stateless backend services
  • Decoupled frontend deployment
  • Isolated AI/NLP engine
  • Containerized services (Docker)
  • Load balancer for traffic distribution
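
The points above can be sketched in miniature. The snippet below is an illustrative assumption, not the actual codebase: names such as SessionStore and handle_query are hypothetical. The key property is that the handler depends only on its inputs and an external store, so any backend instance behind the load balancer can serve any request.

```python
class SessionStore:
    """Stand-in for an external shared store (e.g. Redis); here a plain dict."""

    def __init__(self):
        self._data = {}

    def get_history(self, session_id):
        return self._data.get(session_id, [])

    def append(self, session_id, message):
        self._data.setdefault(session_id, []).append(message)


def handle_query(store, session_id, question):
    """Stateless handler: no instance-local state, only inputs + shared store."""
    history = store.get_history(session_id)
    answer = f"answer to {question!r} (context: {len(history)} prior turns)"
    store.append(session_id, question)
    return answer


store = SessionStore()
print(handle_query(store, "op-1", "What does error E42 mean?"))
print(handle_query(store, "op-1", "How do I clear it?"))
```

Because no conversation state lives in process memory, instances can be added or removed freely as new lines come online.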

12.2 Clean Architecture (Applied to Backend and Frontend)

The system follows Clean Architecture (Robert C. Martin) to ensure modularity, testability, and long-term maintainability.

12.2.1 Backend Structure

  • Domain Layer
    Core logic and manufacturing rules.
  • Use Cases Layer
    Operational flows (e.g., retrieve error code, fetch procedure, query machine documentation).
  • Interface Adapters
    Controllers, DTOs, presenters, mapping layers.
  • Infrastructure Layer
    REST APIs, vector database integration, LLM communication, and external services.
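
A minimal sketch of this layer separation, assuming hypothetical names (ErrorCode, RetrieveErrorCode, InMemoryErrorCodeRepository) rather than the actual codebase: the domain entity has no framework dependencies, the use case depends only on an abstract port, and the infrastructure adapter is injected from outside.

```python
from dataclasses import dataclass


# Domain layer: core manufacturing entity, no framework dependencies.
@dataclass(frozen=True)
class ErrorCode:
    code: str
    description: str


# Use case layer: operational flow, depends only on an injected port.
class RetrieveErrorCode:
    def __init__(self, repository):
        self._repository = repository  # abstract port, supplied from outside

    def execute(self, code: str) -> ErrorCode:
        found = self._repository.find_by_code(code)
        if found is None:
            raise LookupError(f"unknown error code: {code}")
        return found


# Infrastructure layer: one concrete adapter (here an in-memory stub;
# a real adapter would wrap a database or REST client).
class InMemoryErrorCodeRepository:
    def __init__(self, entries):
        self._entries = {e.code: e for e in entries}

    def find_by_code(self, code):
        return self._entries.get(code)


repo = InMemoryErrorCodeRepository([ErrorCode("E42", "Conveyor jam at station 3")])
use_case = RetrieveErrorCode(repo)
print(use_case.execute("E42").description)
```

Swapping the in-memory stub for a database-backed adapter requires no change to the domain or use case layers, which is what keeps the core testable.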

12.2.2 Frontend Structure

  • Reusable and isolated components
  • Separation of UI logic and business logic
  • Service layer for API communication
  • Architecture ready for mobile apps or operator HMIs (future)

12.3 Cloud Architecture Fundamentals

12.3.1 Initial Deployment

The initial deployment will use Render, chosen for its simplicity, low operational cost, and ease of setup during early development.

12.3.2 Future Cloud Scaling (AWS-Ready)

As usage grows across multiple lines, the system is prepared for migration to AWS services:

  • Compute: ECS, EKS (Kubernetes), or Lambda
  • Networking: API Gateway, VPC
  • Storage: S3 for manuals and documents
  • Databases: RDS, DynamoDB, or Aurora
  • Security: IAM, KMS
  • Container orchestration: Fargate or EKS

12.3.3 Benefits

  • Horizontal autoscaling
  • Enterprise-grade security
  • Full observability and monitoring
  • CI/CD expansion
  • High resilience for factory operations

12.4 AI/NLP Architecture Fundamentals

The AI layer is based on a Retrieval-Augmented Generation (RAG) pipeline to ensure precise, context-aware responses aligned with manufacturing documentation.

12.4.1 Components

  • LLM: GPT-4
  • Embeddings: OpenAI embedding models
  • Vector Database: Chroma, Pinecone, or Qdrant
  • Ingestion Pipeline:
    • Document parsing (PDFs, manuals, logs)
    • Text normalization
    • Chunking
    • Embedding generation
  • Retriever:
    • Similarity search
    • Contextual filtering (line, machine, error category)
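
The retriever step above can be sketched with toy data. This is an assumption-laden illustration, not the production pipeline: real embeddings come from an embedding model and the search runs inside the vector database, but cosine similarity plus a metadata filter captures the idea of contextual filtering by production line.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)


# Each chunk carries its embedding plus metadata used for contextual filtering.
# Vectors here are tiny hand-made stand-ins for real embedding-model output.
chunks = [
    {"text": "E42: conveyor jam, reset station 3", "line": "SRTFA", "vec": [0.9, 0.1, 0.0]},
    {"text": "SMT reflow oven temperature profile", "line": "SMT",   "vec": [0.1, 0.9, 0.0]},
    {"text": "E42 troubleshooting checklist",       "line": "SRTFA", "vec": [0.8, 0.2, 0.1]},
]


def retrieve(query_vec, line, top_k=2):
    """Similarity search restricted to one production line's knowledge context."""
    candidates = [c for c in chunks if c["line"] == line]
    ranked = sorted(candidates, key=lambda c: cosine(query_vec, c["vec"]), reverse=True)
    return [c["text"] for c in ranked[:top_k]]


print(retrieve([1.0, 0.0, 0.0], line="SRTFA"))
```

Filtering on metadata before ranking is what keeps the SRTFA, SRTFB, SMT, and Final Assembly knowledge contexts isolated without separate deployments.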

12.4.2 Benefits

  • Reduces hallucinations by grounding answers in retrieved documents
  • Ensures responses based on verified factory documents
  • Easy to add new machines or production lines
  • Continuous improvement of the knowledge base
  • Supports future AI workflows (summaries, anomaly detection, guided troubleshooting)

12.5 High-Level System Architecture Diagram (C4 – Container Level)

The following diagram provides a container-level view of the system and the relationships between components.