12  System Architecture

This section describes the architectural principles that guide the design, evolution, and scalability of the chatbot system. The objective is to support the initial deployment for the SRTFA production line and ensure seamless expansion to additional lines and domains across the factory.


12.1 Scalability Principles (Horizontal Scaling)

The system architecture prioritizes horizontal scalability, enabling growth as new production lines, machine families, and operational areas adopt the chatbot.

12.1.1 Why Horizontal Scaling

  • Allows multiple backend instances to run in parallel with load balancing.
  • Supports increased user concurrency across different production lines.
  • Ensures the vector database can grow as new technical documents and knowledge domains are added.
  • Enables isolated knowledge contexts (e.g., SRTFA, SRTFB, SMT, Final Assembly) without major rework.

12.1.2 Practical Application

  • Stateless backend services
  • Decoupled frontend deployment
  • Isolated AI/NLP engine
  • Containerized services (Docker)
  • Load balancer for traffic distribution
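
The points above can be sketched in miniature. The snippet below is an illustrative assumption, not the actual codebase: names such as SessionStore and handle_query are hypothetical. The key property is that the handler depends only on its inputs and an external store, so any backend instance behind the load balancer can serve any request.

```python
class SessionStore:
    """Stand-in for an external shared store (e.g. Redis); here a plain dict."""

    def __init__(self):
        self._data = {}

    def get_history(self, session_id):
        return self._data.get(session_id, [])

    def append(self, session_id, message):
        self._data.setdefault(session_id, []).append(message)


def handle_query(store, session_id, question):
    """Stateless handler: no instance-local state, only inputs + shared store."""
    history = store.get_history(session_id)
    answer = f"answer to {question!r} (context: {len(history)} prior turns)"
    store.append(session_id, question)
    return answer


store = SessionStore()
print(handle_query(store, "op-1", "What does error E42 mean?"))
print(handle_query(store, "op-1", "How do I clear it?"))
```

Because no conversation state lives in process memory, instances can be added or removed freely as new lines come online.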

12.2 Clean Architecture (Applied to Backend and Frontend)

The system follows Clean Architecture (Robert C. Martin) to ensure modularity, testability, and long-term maintainability.

12.2.1 Backend Structure

  • Domain Layer
    Core logic and manufacturing rules.
  • Use Cases Layer
    Operational flows (e.g., retrieve error code, fetch procedure, query machine documentation).
  • Interface Adapters
    Controllers, DTOs, presenters, mapping layers.
  • Infrastructure Layer
    REST APIs, vector database integration, LLM communication, and external services.
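
A minimal sketch of this layer separation, assuming hypothetical names (ErrorCode, RetrieveErrorCode, InMemoryErrorCodeRepository) rather than the actual codebase: the domain entity has no framework dependencies, the use case depends only on an abstract port, and the infrastructure adapter is injected from outside.

```python
from dataclasses import dataclass


# Domain layer: core manufacturing entity, no framework dependencies.
@dataclass(frozen=True)
class ErrorCode:
    code: str
    description: str


# Use case layer: operational flow, depends only on an injected port.
class RetrieveErrorCode:
    def __init__(self, repository):
        self._repository = repository  # abstract port, supplied from outside

    def execute(self, code: str) -> ErrorCode:
        found = self._repository.find_by_code(code)
        if found is None:
            raise LookupError(f"unknown error code: {code}")
        return found


# Infrastructure layer: one concrete adapter (here an in-memory stub;
# a real adapter would wrap a database or REST client).
class InMemoryErrorCodeRepository:
    def __init__(self, entries):
        self._entries = {e.code: e for e in entries}

    def find_by_code(self, code):
        return self._entries.get(code)


repo = InMemoryErrorCodeRepository([ErrorCode("E42", "Conveyor jam at station 3")])
use_case = RetrieveErrorCode(repo)
print(use_case.execute("E42").description)
```

Swapping the in-memory stub for a database-backed adapter requires no change to the domain or use case layers, which is what keeps the core testable.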

12.2.2 Frontend Structure

  • Reusable and isolated components
  • Separation of UI logic and business logic
  • Service layer for API communication
  • Architecture ready for mobile apps or operator HMIs (future)

12.3 Cloud Architecture Fundamentals

12.3.1 Initial Deployment

The initial deployment will use Render, chosen for its simplicity, low operational cost, and ease of setup during early development.

12.3.2 Future Cloud Scaling (AWS-Ready)

As usage grows across multiple lines, the system is prepared for migration to AWS services:

  • Compute: ECS, EKS (Kubernetes), or Lambda
  • Networking: API Gateway, VPC
  • Storage: S3 for manuals and documents
  • Databases: RDS, DynamoDB, or Aurora
  • Security: IAM, KMS
  • Container orchestration: Fargate or EKS

12.3.3 Benefits

  • Horizontal autoscaling
  • Enterprise-grade security
  • Full observability and monitoring
  • CI/CD expansion
  • High resilience for factory operations

12.4 AI/NLP Architecture Fundamentals

The AI layer is based on a Retrieval-Augmented Generation (RAG) pipeline to ensure precise, context-aware responses aligned with manufacturing documentation.

12.4.1 Components

  • LLM: GPT-4
  • Embeddings: OpenAI embedding models
  • Vector Database: Chroma, Pinecone, or Qdrant
  • Ingestion Pipeline:
    • Document parsing (PDFs, manuals, logs)
    • Text normalization
    • Chunking
    • Embedding generation
  • Retriever:
    • Similarity search
    • Contextual filtering (line, machine, error category)
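
The retriever step above can be sketched with toy data. This is an assumption-laden illustration, not the production pipeline: real embeddings come from an embedding model and the search runs inside the vector database, but cosine similarity plus a metadata filter captures the idea of contextual filtering by production line.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)


# Each chunk carries its embedding plus metadata used for contextual filtering.
# Vectors here are tiny hand-made stand-ins for real embedding-model output.
chunks = [
    {"text": "E42: conveyor jam, reset station 3", "line": "SRTFA", "vec": [0.9, 0.1, 0.0]},
    {"text": "SMT reflow oven temperature profile", "line": "SMT",   "vec": [0.1, 0.9, 0.0]},
    {"text": "E42 troubleshooting checklist",       "line": "SRTFA", "vec": [0.8, 0.2, 0.1]},
]


def retrieve(query_vec, line, top_k=2):
    """Similarity search restricted to one production line's knowledge context."""
    candidates = [c for c in chunks if c["line"] == line]
    ranked = sorted(candidates, key=lambda c: cosine(query_vec, c["vec"]), reverse=True)
    return [c["text"] for c in ranked[:top_k]]


print(retrieve([1.0, 0.0, 0.0], line="SRTFA"))
```

Filtering on metadata before ranking is what keeps the SRTFA, SRTFB, SMT, and Final Assembly knowledge contexts isolated without separate deployments.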

12.4.2 Benefits

  • Reduces hallucinations by grounding answers in retrieved documents
  • Ensures responses based on verified factory documents
  • Easy to add new machines or production lines
  • Continuous improvement of the knowledge base
  • Supports future AI workflows (summaries, anomaly detection, guided troubleshooting)

12.5 High-Level System Architecture Diagram (C4 – Container Level)

The following diagram provides a container-level view of the system and the relationships between components.