12 System Architecture
This section describes the architectural principles that guide the design, evolution, and scalability of the chatbot system. The objective is to support the initial deployment for the SRTFA production line and ensure seamless expansion to additional lines and domains across the factory.
12.1 Scalability Principles (Horizontal Scaling)
The system architecture prioritizes horizontal scalability, enabling growth as new production lines, machine families, and operational areas adopt the chatbot.
12.1.1 Why Horizontal Scaling
- Allows multiple backend instances to run in parallel with load balancing.
- Supports increased user concurrency across different production lines.
- Ensures the vector database can grow as new technical documents and knowledge domains are added.
- Enables isolated knowledge contexts (e.g., SRTFA, SRTFB, SMT, Final Assembly) without major rework.
12.1.2 Practical Application
- Stateless backend services
- Decoupled frontend deployment
- Isolated AI/NLP engine
- Containerized services (Docker)
- Load balancer for traffic distribution
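The interaction between stateless services and a load balancer can be illustrated with a minimal sketch. It assumes conversation history lives in an external shared store rather than in any backend process; the names `SharedSessionStore`, `BackendInstance`, and `RoundRobinBalancer` are hypothetical and do not come from the project code. Round-robin is used only as one simple example of a distribution policy.

```python
from dataclasses import dataclass
from itertools import cycle


class SharedSessionStore:
    """Stand-in for an external store (a cache or database); hypothetical name."""

    def __init__(self) -> None:
        self._data: dict[str, list[str]] = {}

    def append(self, session_id: str, message: str) -> list[str]:
        history = self._data.setdefault(session_id, [])
        history.append(message)
        return list(history)


@dataclass
class BackendInstance:
    name: str
    store: SharedSessionStore  # shared, so the instance itself keeps no session state

    def handle(self, session_id: str, message: str) -> dict:
        history = self.store.append(session_id, message)
        return {"served_by": self.name, "turns": len(history)}


class RoundRobinBalancer:
    """Sends each request to the next instance in turn."""

    def __init__(self, instances: list[BackendInstance]) -> None:
        self._instances = cycle(instances)

    def route(self, session_id: str, message: str) -> dict:
        return next(self._instances).handle(session_id, message)


store = SharedSessionStore()
balancer = RoundRobinBalancer(
    [BackendInstance(f"backend-{i}", store) for i in range(3)]
)
# The same operator session is served by different instances without losing history.
responses = [balancer.route("operator-42", f"question {n}") for n in range(4)]
```

Because no instance owns the session, adding a fourth backend only means adding one more entry to the list passed to the balancer.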
12.2 Clean Architecture (Applied to Backend and Frontend)
The system follows Clean Architecture, as described by Robert C. Martin, to keep the code modular, testable, and maintainable over the long term.
12.2.1 Backend Structure
- Domain Layer: Core logic and manufacturing rules.
- Use Cases Layer: Operational flows (e.g., retrieve error code, fetch procedure, query machine documentation).
- Interface Adapters: Controllers, DTOs, presenters, mapping layers.
- Infrastructure Layer: REST APIs, vector database integration, LLM communication, and external services.
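The four layers can be sketched for the "retrieve error code" flow as follows. This is an illustrative outline under assumptions, not the project's actual code: the class and function names (`ErrorCode`, `ErrorCodeRepository`, `RetrieveErrorCode`, `InMemoryErrorCodeRepository`, `error_code_controller`) are hypothetical, and the sample record is made-up data. The point it shows is the dependency direction: the use case depends only on an abstract port, so the storage adapter can be swapped without touching domain or use-case code.

```python
from dataclasses import dataclass
from typing import Optional, Protocol


# Domain layer: a plain entity with no framework imports.
@dataclass(frozen=True)
class ErrorCode:
    code: str
    line: str
    description: str
    procedure: str


# Use cases layer: depends only on an abstract port, not on a concrete database.
class ErrorCodeRepository(Protocol):
    def find(self, line: str, code: str) -> Optional[ErrorCode]: ...


class RetrieveErrorCode:
    def __init__(self, repository: ErrorCodeRepository) -> None:
        self._repository = repository

    def execute(self, line: str, code: str) -> ErrorCode:
        found = self._repository.find(line, code.strip().upper())
        if found is None:
            raise LookupError(f"Unknown error code {code!r} on line {line!r}")
        return found


# Infrastructure layer: an in-memory adapter standing in for a real data source.
class InMemoryErrorCodeRepository:
    def __init__(self, records: list[ErrorCode]) -> None:
        self._by_key = {(r.line, r.code): r for r in records}

    def find(self, line: str, code: str) -> Optional[ErrorCode]:
        return self._by_key.get((line, code))


# Interface adapter: a controller that maps the use-case result to a DTO.
def error_code_controller(use_case: RetrieveErrorCode, line: str, code: str) -> dict:
    try:
        result = use_case.execute(line, code)
    except LookupError as exc:
        return {"status": 404, "message": str(exc)}
    return {"status": 200, "code": result.code, "procedure": result.procedure}


# Made-up sample record, for illustration only.
repo = InMemoryErrorCodeRepository(
    [ErrorCode("E101", "SRTFA", "Feeder jam", "Clear feeder and reset")]
)
use_case = RetrieveErrorCode(repo)
ok = error_code_controller(use_case, "SRTFA", " e101 ")
missing = error_code_controller(use_case, "SRTFA", "E999")
```

In tests, the in-memory adapter can replace the real infrastructure entirely, which is where the testability benefit of this layering comes from.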
12.2.2 Frontend Structure
- Reusable and isolated components
- Separation of UI logic and business logic
- Service layer for API communication
- Architecture ready for mobile apps or operator HMIs (future)
12.3 Cloud Architecture Fundamentals
12.3.1 Initial Deployment
The initial deployment will use Render, chosen for its simplicity, low operational cost, and ease of setup during early development.
12.3.2 Future Cloud Scaling (AWS-Ready)
As usage grows across multiple lines, the system is prepared for migration to AWS services:
- Compute: ECS, EKS (Kubernetes), or Lambda
- Networking: API Gateway, VPC
- Storage: S3 for manuals and documents
- Databases: RDS, DynamoDB, or Aurora
- Security: IAM, KMS
- Containerization: Fargate (serverless container runtime for ECS or EKS)
12.3.3 Benefits
- Horizontal autoscaling
- Enterprise-grade security
- Full observability and monitoring
- CI/CD expansion
- High resilience for factory operations
12.4 AI/NLP Architecture Fundamentals
The AI layer is based on a RAG (Retrieval-Augmented Generation) pipeline to ensure precise and context-aware responses aligned with manufacturing documentation.
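The retrieve-then-generate flow of a RAG pipeline can be outlined as below. This is a minimal sketch, assuming the retriever and the LLM are passed in as plain callables; `build_prompt`, `answer`, `fake_retrieve`, and `fake_generate` are hypothetical names, and the prompt wording and sample passage are illustrative only. No real model or vector database is called.

```python
from typing import Callable


def build_prompt(question: str, passages: list[str]) -> str:
    """Place the retrieved passages ahead of the question as numbered context."""
    context = "\n".join(f"[{i}] {p}" for i, p in enumerate(passages, start=1))
    return (
        "Answer using only the context below. "
        "If the context is insufficient, say so.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )


def answer(
    question: str,
    retrieve: Callable[[str], list[str]],
    generate: Callable[[str], str],
) -> str:
    passages = retrieve(question)
    if not passages:
        # Refuse rather than let the model answer without documentation.
        return "No matching documentation was found."
    return generate(build_prompt(question, passages))


# Stand-ins for the vector search and the LLM call (assumptions, not real APIs).
def fake_retrieve(question: str) -> list[str]:
    return ["E101: clear the feeder and press reset."]


def fake_generate(prompt: str) -> str:
    return f"(model output for a prompt of {len(prompt)} characters)"


prompt = build_prompt("How do I fix E101?", fake_retrieve("How do I fix E101?"))
reply = answer("How do I fix E101?", fake_retrieve, fake_generate)
empty_reply = answer("How do I fix E101?", lambda q: [], fake_generate)
```

Keeping retrieval and generation behind callables makes it possible to swap the vector database or the model without changing the orchestration logic.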
12.4.1 Components
- LLM: GPT-4
- Embeddings: OpenAI embedding models
- Vector Database: Chroma, Pinecone, or Qdrant
- Ingestion Pipeline:
  - Document parsing (PDFs, manuals, logs)
  - Text normalization
  - Chunking
  - Embedding generation
- Retriever:
  - Similarity search
  - Contextual filtering (line, machine, error category)
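The ingestion and retrieval steps listed above can be sketched end to end in a few lines. This is a toy illustration under stated assumptions: `embed` is a bag-of-words counter standing in for a real embedding model, `TinyVectorStore` is a hypothetical in-memory class standing in for Chroma, Pinecone, or Qdrant, and the two sample documents are made up. Document parsing is omitted; the input is assumed to be already extracted text.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace."""
    return re.sub(r"\s+", " ", text.lower()).strip()


def chunk(text: str, max_words: int = 12) -> list[str]:
    """Split text into consecutive chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def embed(text: str) -> Counter:
    """Toy bag-of-words vector; a stand-in for a real embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


@dataclass
class Record:
    text: str
    vector: Counter
    metadata: dict


class TinyVectorStore:
    def __init__(self) -> None:
        self.records: list[Record] = []

    def ingest(self, document: str, metadata: dict) -> None:
        # Normalization, chunking, then embedding generation.
        for piece in chunk(normalize(document)):
            self.records.append(Record(piece, embed(piece), metadata))

    def search(self, query: str, top_k: int = 1, **filters: str) -> list[Record]:
        # Contextual filtering first, then similarity ranking.
        query_vector = embed(normalize(query))
        candidates = [
            r for r in self.records
            if all(r.metadata.get(k) == v for k, v in filters.items())
        ]
        ranked = sorted(
            candidates, key=lambda r: cosine(query_vector, r.vector), reverse=True
        )
        return ranked[:top_k]


store = TinyVectorStore()
# Made-up documents, for illustration only.
store.ingest("Error E101: feeder jam. Open the cover and clear the feeder.", {"line": "SRTFA"})
store.ingest("Error E101: solder paste low. Refill the paste cartridge.", {"line": "SMT"})

unfiltered = store.search("how do I fix E101 feeder")
smt_only = store.search("how do I fix E101 feeder", line="SMT")
```

The same error code can mean different things on different lines, so filtering on metadata before ranking keeps an operator on one line from receiving another line's procedure.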
12.4.2 Benefits
- Reduces hallucinations by constraining answers to retrieved context
- Grounds responses in verified factory documents
- Easy to add new machines or production lines
- Continuous improvement of the knowledge base
- Supports future AI workflows (summaries, anomaly detection, guided troubleshooting)
12.5 High-Level System Architecture Diagram (C4 – Container Level)
The following diagram provides a container-level view of the system and the relationships between components.