# Production-Grade RAG System with LightRAG

A high-performance Retrieval-Augmented Generation (RAG) system built on the LightRAG framework, with multi-format document processing, GPU-accelerated OCR, and production-grade database storage.

## System Architecture

### Storage Backends

- **KV Storage**: Redis (Memurai) - `redis://localhost:6379`
- **Graph Storage**: Neo4j - `bolt://localhost:7687` (username: `neo4j`, password: `jleu1212`)
- **Vector Storage**: Qdrant - `http://localhost:6333/`
- **Document Status Storage**: PostgreSQL - `postgresql://jleu3482:jleu1212@localhost:5432/rag_anything`

### AI Model Configuration

- **LLM**: DeepSeek API (API Key: `sk-55f6e57f1d834b0e93ceaf98cc2cb715`)
- **Embeddings**: Jina AI
- **Entity Extraction**: spaCy models for fast indexing
- **Reranker**: Disabled for performance optimization
- **OCR**: PaddleOCR with GPU acceleration

### Performance Settings

- **Multi-core processing**: Parallel document processing
- **GPU acceleration**: NVIDIA RTX 4070 Super
- **Chunking**: 1200 tokens with 100 token overlap
- **Max graph nodes**: 1000
- **Max parallel insert**: 4 concurrent operations

## Installation

### Prerequisites

- Python 3.9+
- Redis (Memurai) running on port 6379
- Neo4j running on `bolt://localhost:7687`
- Qdrant running on `http://localhost:6333`
- PostgreSQL running on port 5432 with database `rag_anything`

### Install Dependencies

```bash
cd LightRAG-main
pip install -r requirements.txt
```

### Download spaCy Model

```bash
python -m spacy download en_core_web_lg
```

### Environment Setup

Set the following environment variables (the commands use Windows `cmd` syntax; on Linux or macOS, use `export` instead of `set`):

```bash
# For Jina embeddings (optional)
set JINA_API_KEY=your_jina_api_key_here

# For DeepSeek LLM
set DEEPSEEK_API_KEY=sk-55f6e57f1d834b0e93ceaf98cc2cb715

# For Windows terminal encoding
set PYTHONIOENCODING=utf-8
```

## Running the System

### Start the Server

```bash
python -m lightrag.api.lightrag_server --port 3015 --working-dir rag_storage --input-dir inputs --key jleu1212 --auto-scan-at-startup --llm-binding openai --embedding-binding jina --rerank-binding null
```

### Server Parameters

- `--port 3015`: Web server port
- `--working-dir rag_storage`: Storage directory for processed data
- `--input-dir inputs`: Directory for document uploads
- `--key jleu1212`: Authentication key
- `--auto-scan-at-startup`: Automatically process documents in the input directory
- `--llm-binding openai`: Use an OpenAI-compatible API (DeepSeek)
- `--embedding-binding jina`: Use Jina embeddings
- `--rerank-binding null`: Disable the reranker for performance

## Document Processing Capabilities

### Supported Formats

- **DOCX/DOC**: python-docx + custom table parser
- **XLSX/XLS**: pandas + openpyxl
- **PDF (Text-based)**: pymupdf (fitz)
- **PDF (Image-based)**: paddleocr + layout detection
- **PPTX/PPT**: python-pptx
- **Images**: MobileOne classification → PaddleOCR (fast filter, OCR only when needed)
- **TXT/CSV**: Direct read
- **HTML**: beautifulsoup4

### OCR Processing

- Uses PaddleOCR with GPU acceleration
- MobileOne-S1 model for image classification
- Automatic detection of scanned pages and images
- Fast filtering so that OCR runs only when necessary

## API Endpoints

### Authentication

```http
POST /login
Content-Type: application/x-www-form-urlencoded

username=admin&password=jleu1212
```

### Document Management

```http
POST /upload
Content-Type: multipart/form-data
Authorization: Bearer {token}

POST /documents/status
GET /documents
```

### Search

```http
POST /search
Content-Type: application/json
Authorization: Bearer {token}

{
  "query": "search query",
  "top_k": 10
}
```

### System Health

```http
GET /health
```

## Web UI

The system includes a web interface running on port 3015.
Access it at:

```
http://localhost:3015
```

### Web UI Features

- Document upload and management
- Real-time search interface
- Document processing status
- System health monitoring
- Authentication-protected access

## Testing

### Test Script

Run the comprehensive test script:

```bash
python test_lightrag_webui.py
```

### Test Documents

Place test documents in the `test_documents/` directory:

- Text files (.txt)
- PDF documents (.pdf)
- Word documents (.docx)
- Excel files (.xlsx)
- PowerPoint files (.pptx)
- Images with text (.jpg, .png)

### Health Check

```bash
curl http://localhost:3015/health
```

## Performance Optimization

### GPU Configuration

- Uses NVIDIA RTX 4070 Super for OCR and model inference
- CUDA device: `cuda:0`
- Batch processing for optimal throughput

### Memory Management

- Parallel processing with configurable worker count
- Chunked document processing
- Efficient entity extraction with spaCy

### Database Optimization

- Connection pooling for all databases
- Batch insert operations
- Indexed queries for fast retrieval

## Troubleshooting

### Common Issues

1. **Authentication 401 Errors**
   - Ensure the login request sends form data, not JSON
   - Check that the password matches the `--key` parameter
2. **OCR Initialization Failures**
   - Verify that PaddleOCR is version 3.3.0 or later
   - Check GPU availability and the CUDA installation
3. **Database Connection Issues**
   - Verify that all databases are running
   - Check the connection strings in `config.ini`
4. **Unicode Encoding Problems**
   - Set the `PYTHONIOENCODING=utf-8` environment variable

### Logs and Monitoring

- Server logs are written to the console
- Document processing status is available via the API
- The system health endpoint can be used for monitoring

## Configuration

Edit `config.ini` for custom settings:

- Database connection strings
- Performance parameters
- OCR and processing settings
- Server configuration

## License

This project is built on the LightRAG framework. See individual component licenses for details.
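
## Example: Scripted Login and Search

The login and search requests listed under API Endpoints can also be issued from a script. The sketch below is a minimal illustration using only the Python standard library, not part of the project itself. It builds the form-encoded `POST /login` request (the troubleshooting section notes that JSON logins fail with 401) and the JSON `POST /search` request with a bearer token. The helper names (`build_login_request`, `build_search_request`, `send_json`, `login_and_search`) are hypothetical, and the `access_token` field read from the login response is an assumption about the response shape; adjust it to whatever the server actually returns.

```python
import json
import urllib.parse
import urllib.request

# Base URL taken from the server command in this README (--port 3015).
BASE_URL = "http://localhost:3015"


def build_login_request(username, password, base_url=BASE_URL):
    """Build POST /login with a form-encoded body (not JSON)."""
    body = urllib.parse.urlencode(
        {"username": username, "password": password}
    ).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/login",
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


def build_search_request(token, query, top_k=10, base_url=BASE_URL):
    """Build POST /search with a JSON body and a bearer token."""
    body = json.dumps({"query": query, "top_k": top_k}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/search",
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def send_json(request, timeout=30):
    """Send a prepared request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def login_and_search(username, password, query, top_k=10):
    """Log in, then run one search with the returned token."""
    login_response = send_json(build_login_request(username, password))
    # Assumption: the token is returned under "access_token".
    token = login_response["access_token"]
    return send_json(build_search_request(token, query, top_k))
```

With the server running, `login_and_search("admin", "<value passed to --key>", "search query")` would perform both calls in sequence. Keeping request construction separate from sending makes it possible to inspect the headers and body without a live server.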