Production-Grade RAG System with LightRAG
A high-performance Retrieval-Augmented Generation (RAG) system built on the LightRAG framework, with multi-format document processing, GPU-accelerated OCR, and database-backed storage for production use.
System Architecture
Storage Backends
- KV Storage: Redis (Memurai) - redis://localhost:6379
- Graph Storage: Neo4j - bolt://localhost:7687 (username: neo4j, password: jleu1212)
- Vector Storage: Qdrant - http://localhost:6333/
- Document Status Storage: PostgreSQL - postgresql://jleu3482:jleu1212@localhost:5432/rag_anything
AI Model Configuration
- LLM: DeepSeek API (API Key: sk-55f6e57f1d834b0e93ceaf98cc2cb715)
- Embeddings: Jina AI
- Entity Extraction: spaCy models for fast indexing
- Reranker: Disabled for performance optimization
- OCR: PaddleOCR with GPU acceleration
Performance Settings
- Multi-core processing: Parallel document processing
- GPU acceleration: NVIDIA RTX 4070 Super
- Chunking: 1200 tokens with 100 token overlap
- Max graph nodes: 1000
- Max parallel insert: 4 concurrent operations
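The chunking setting above (1200 tokens with a 100-token overlap) can be pictured as a sliding window. The following is a minimal sketch, not LightRAG's actual chunker: it assumes the text has already been tokenized into a list, and `chunk_tokens` is a hypothetical helper name.

```python
from typing import List, Sequence


def chunk_tokens(
    tokens: Sequence[str], chunk_size: int = 1200, overlap: int = 100
) -> List[List[str]]:
    """Split tokens into windows of chunk_size that share `overlap` tokens."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")
    step = chunk_size - overlap  # with the defaults, a new window starts every 1100 tokens
    chunks: List[List[str]] = []
    for start in range(0, len(tokens), step):
        chunks.append(list(tokens[start:start + chunk_size]))
        if start + chunk_size >= len(tokens):
            break  # this window already reached the end of the text
    return chunks
```

With these defaults, a 2500-token document yields three chunks, and the last 100 tokens of each chunk reappear at the start of the next one.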
Installation
Prerequisites
- Python 3.9+
- Redis (Memurai) running on port 6379
- Neo4j running on bolt://localhost:7687
- Qdrant running on http://localhost:6333
- PostgreSQL running on port 5432 with database rag_anything
Install Dependencies
cd LightRAG-main
pip install -r requirements.txt
Download spaCy Model
python -m spacy download en_core_web_lg
Environment Setup
Set the following environment variables:
# For Jina embeddings (optional)
set JINA_API_KEY=your_jina_api_key_here
# For DeepSeek LLM
set DEEPSEEK_API_KEY=sk-55f6e57f1d834b0e93ceaf98cc2cb715
# For Windows terminal encoding
set PYTHONIOENCODING=utf-8
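Before starting the server, it can help to confirm that the variables above are visible to Python. This is a small sketch under stated assumptions: `load_api_settings` is a hypothetical helper, and it treats JINA_API_KEY as optional because the comment above marks it that way.

```python
import os
from typing import Dict, Mapping, Optional


def load_api_settings(env: Optional[Mapping[str, str]] = None) -> Dict[str, Optional[str]]:
    """Read the API keys from the environment and fail early if one is missing."""
    env = os.environ if env is None else env
    deepseek_key = env.get("DEEPSEEK_API_KEY", "").strip()
    if not deepseek_key:
        raise RuntimeError("DEEPSEEK_API_KEY is not set")
    # Marked optional in the setup notes, so an absent key becomes None.
    jina_key = env.get("JINA_API_KEY", "").strip() or None
    return {"deepseek_api_key": deepseek_key, "jina_api_key": jina_key}
```

Calling `load_api_settings()` with no arguments reads from the current process environment.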
Running the System
Start the Server
python -m lightrag.api.lightrag_server --port 3015 --working-dir rag_storage --input-dir inputs --key jleu1212 --auto-scan-at-startup --llm-binding openai --embedding-binding jina --rerank-binding null
Server Parameters
- --port 3015: Web server port
- --working-dir rag_storage: Storage directory for processed data
- --input-dir inputs: Directory for document uploads
- --key jleu1212: Authentication key
- --auto-scan-at-startup: Automatically process documents in input directory
- --llm-binding openai: Use OpenAI-compatible API (DeepSeek)
- --embedding-binding jina: Use Jina embeddings
- --rerank-binding null: Disable reranker for performance
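If the server is launched from a script rather than typed by hand, the same flags can be assembled as an argument list. This is only a sketch: `build_server_command` is a hypothetical helper, it uses no flags beyond those shown in the start command, and the key is passed in by the caller rather than hard-coded.

```python
import sys
from typing import List


def build_server_command(
    key: str,
    port: int = 3015,
    working_dir: str = "rag_storage",
    input_dir: str = "inputs",
) -> List[str]:
    """Return the argv list for the start command shown above."""
    return [
        sys.executable, "-m", "lightrag.api.lightrag_server",
        "--port", str(port),
        "--working-dir", working_dir,
        "--input-dir", input_dir,
        "--key", key,
        "--auto-scan-at-startup",
        "--llm-binding", "openai",
        "--embedding-binding", "jina",
        "--rerank-binding", "null",
    ]
```

The result can be handed to `subprocess.run(build_server_command(key="..."), check=True)`; passing a list avoids shell quoting issues on Windows.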
Document Processing Capabilities
Supported Formats
- DOCX/DOC: python-docx + custom table parser
- XLSX/XLS: pandas + openpyxl
- PDF (Text-based): pymupdf (fitz)
- PDF (Image-based): paddleocr + layout detection
- PPTX/PPT: python-pptx
- Images: MobileOne classification → PaddleOCR (Fast filter, OCR only when needed)
- TXT/CSV: Direct read
- HTML: beautifulsoup4
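The format list above amounts to a lookup from file extension to parsing library. The sketch below illustrates that idea only; `PARSER_BY_EXTENSION` and `pick_parser` are hypothetical names, and the project's real dispatch code may be organized differently.

```python
from pathlib import Path

# Extension -> library named in the list above (labels are illustrative strings).
PARSER_BY_EXTENSION = {
    ".docx": "python-docx", ".doc": "python-docx",
    ".xlsx": "pandas+openpyxl", ".xls": "pandas+openpyxl",
    ".pdf": "pymupdf, with paddleocr for image-based pages",
    ".pptx": "python-pptx", ".ppt": "python-pptx",
    ".jpg": "mobileone+paddleocr", ".png": "mobileone+paddleocr",
    ".txt": "direct read", ".csv": "direct read",
    ".html": "beautifulsoup4",
}


def pick_parser(path: str) -> str:
    """Return the parser label for a file, based on its lowercase extension."""
    suffix = Path(path).suffix.lower()
    try:
        return PARSER_BY_EXTENSION[suffix]
    except KeyError:
        raise ValueError(f"unsupported file type: {suffix or path}") from None
```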
OCR Processing
- Uses PaddleOCR with GPU acceleration
- MobileOne-S1 model for image classification
- Automatic detection of scanned pages and images
- Fast filtering to only OCR when necessary
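The "OCR only when necessary" idea can be sketched as two small gates: one for PDF pages and one for standalone images. This is an illustration under assumptions, not the project's code. The 20-character threshold is made up, and the classifier and OCR engine are passed in as plain callables so the example does not depend on the PaddleOCR or MobileOne APIs.

```python
from typing import Callable

MIN_TEXT_CHARS = 20  # hypothetical threshold for "this page has a real text layer"


def extract_page_text(
    embedded_text: str, run_ocr: Callable[[], str], min_chars: int = MIN_TEXT_CHARS
) -> str:
    """Use the PDF text layer when present; fall back to OCR for scanned pages."""
    text = embedded_text.strip()
    if len(text) >= min_chars:
        return text  # text-based page: skip the expensive OCR call
    return run_ocr()


def extract_image_text(
    image: object,
    looks_like_text: Callable[[object], bool],
    run_ocr: Callable[[object], str],
) -> str:
    """Run OCR only when the fast classifier says the image contains text."""
    if not looks_like_text(image):
        return ""
    return run_ocr(image)
```

In a real pipeline, `looks_like_text` would wrap the image classifier and `run_ocr` would wrap the OCR engine.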
API Endpoints
Authentication
POST /login
Content-Type: application/x-www-form-urlencoded
username=admin&password=jleu1212
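The login call above sends form-encoded data rather than JSON. A minimal sketch with the standard library follows; `build_login_request` is a hypothetical helper, and the base URL assumes the server runs on the documented port.

```python
import urllib.parse
import urllib.request


def build_login_request(base_url: str, username: str, password: str) -> urllib.request.Request:
    """Build the POST /login request with a form-encoded body."""
    body = urllib.parse.urlencode({"username": username, "password": password}).encode("ascii")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/login",
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )
```

The request can be sent with `urllib.request.urlopen(build_login_request("http://localhost:3015", "admin", "..."))`. The shape of the response body is not documented here, so inspect it to find the token field.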
Document Management
POST /upload
Content-Type: multipart/form-data
Authorization: Bearer {token}
POST /documents/status
GET /documents
Search
POST /search
Content-Type: application/json
Authorization: Bearer {token}
{
"query": "search query",
"top_k": 10
}
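For reference, the search call above can be built in Python as follows. This is a sketch, assuming a token was already obtained from the login endpoint; `build_search_request` is a hypothetical helper name.

```python
import json
import urllib.request


def build_search_request(
    base_url: str, token: str, query: str, top_k: int = 10
) -> urllib.request.Request:
    """Build the POST /search request with a JSON body and a Bearer token."""
    payload = json.dumps({"query": query, "top_k": top_k}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/search",
        data=payload,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )
```

Unlike the login call, this endpoint takes JSON, so the two content types should not be mixed up.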
System Health
GET /health
Web UI
The system includes a web interface running on port 3015. Access it at:
http://localhost:3015
Web UI Features
- Document upload and management
- Real-time search interface
- Document processing status
- System health monitoring
- Authentication-protected access
Testing
Test Script
Run the comprehensive test script:
python test_lightrag_webui.py
Test Documents
Place test documents in the test_documents/ directory:
- Text files (.txt)
- PDF documents (.pdf)
- Word documents (.docx)
- Excel files (.xlsx)
- PowerPoint files (.pptx)
- Images with text (.jpg, .png)
Health Check
curl http://localhost:3015/health
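In scripts that start the server and then run tests, it is often useful to wait until the health endpoint responds. The sketch below polls it with the standard library. The retry count and delay are arbitrary choices, and treating HTTP 200 as "healthy" is an assumption about the endpoint.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def fetch_status(url: str, timeout: float = 5.0) -> int:
    """Return the HTTP status code, or 0 if the server is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        return exc.code  # the server answered, but with an error status
    except (urllib.error.URLError, OSError):
        return 0


def wait_until_healthy(
    url: str = "http://localhost:3015/health",
    attempts: int = 10,
    delay: float = 2.0,
    fetch: Callable[[str], int] = fetch_status,
) -> bool:
    """Poll the health endpoint until it returns 200 or attempts run out."""
    for attempt in range(attempts):
        if fetch(url) == 200:
            return True
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```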
Performance Optimization
GPU Configuration
- Uses NVIDIA RTX 4070 Super for OCR and model inference
- CUDA device: cuda:0
- Batch processing for optimal throughput
Memory Management
- Parallel processing with configurable worker count
- Chunked document processing
- Efficient entity extraction with spaCy
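A configurable worker count, such as the limit of 4 concurrent inserts listed under Performance Settings, is commonly enforced with a semaphore. The following sketch shows that pattern with asyncio. It is an assumption about how such a limit could be implemented, not a description of LightRAG internals, and `process_one` stands in for whatever coroutine handles a single document.

```python
import asyncio
from typing import Awaitable, Callable, Iterable, List


async def process_all(
    paths: Iterable[str],
    process_one: Callable[[str], Awaitable[str]],
    max_parallel: int = 4,
) -> List[str]:
    """Process every path, with at most max_parallel running at once."""
    semaphore = asyncio.Semaphore(max_parallel)

    async def guarded(path: str) -> str:
        async with semaphore:  # waits here while max_parallel tasks are active
            return await process_one(path)

    # gather preserves input order in its results
    return list(await asyncio.gather(*(guarded(p) for p in paths)))
```

Raising `max_parallel` trades memory and database load for throughput, so it is worth tuning against the slowest backend.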
Database Optimization
- Connection pooling for all databases
- Batch insert operations
- Indexed queries for fast retrieval
Troubleshooting
Common Issues
- Authentication 401 Errors
  - Ensure login uses form data, not JSON
  - Check password matches the --key parameter
- OCR Initialization Failures
  - Verify PaddleOCR version 3.3.0+
  - Check GPU availability and CUDA installation
- Database Connection Issues
  - Verify all databases are running
  - Check connection strings in config.ini
- Unicode Encoding Problems
  - Set PYTHONIOENCODING=utf-8 environment variable
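For database connection issues, a quick first step is to check whether each backend port accepts TCP connections. This sketch uses the hosts and ports from the Storage Backends section, with credentials left out of the URLs on purpose. It only shows that something is listening, not that authentication or the database itself works; `is_reachable` and `check_backends` are hypothetical helper names.

```python
import socket
from typing import Dict, Mapping
from urllib.parse import urlparse

BACKEND_URLS = {
    "Redis": "redis://localhost:6379",
    "Neo4j": "bolt://localhost:7687",
    "Qdrant": "http://localhost:6333",
    "PostgreSQL": "postgresql://localhost:5432/rag_anything",
}


def is_reachable(url: str, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to the URL's host and port succeeds."""
    parsed = urlparse(url)
    try:
        with socket.create_connection((parsed.hostname, parsed.port), timeout=timeout):
            return True
    except OSError:
        return False


def check_backends(urls: Mapping[str, str] = BACKEND_URLS) -> Dict[str, bool]:
    """Map each backend name to whether its port is accepting connections."""
    return {name: is_reachable(url) for name, url in urls.items()}
```

Printing `check_backends()` shows at a glance which service is down.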
Logs and Monitoring
- Server logs output to console
- Document processing status available via API
- System health endpoint for monitoring
Configuration
Edit config.ini for custom settings:
- Database connection strings
- Performance parameters
- OCR and processing settings
- Server configuration
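Settings in an INI file can be read with Python's standard configparser. The section and key names in this sketch are hypothetical, since the layout of config.ini is not shown here. The values mirror the defaults listed under Performance Settings.

```python
import configparser
from typing import Dict

# Hypothetical layout; the real config.ini may use different sections and keys.
SAMPLE_CONFIG = """
[server]
port = 3015

[performance]
chunk_size = 1200
chunk_overlap = 100
max_parallel_insert = 4
"""


def load_settings(text: str) -> Dict[str, int]:
    """Parse INI text, falling back to the documented defaults for missing keys."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return {
        "port": parser.getint("server", "port", fallback=3015),
        "chunk_size": parser.getint("performance", "chunk_size", fallback=1200),
        "chunk_overlap": parser.getint("performance", "chunk_overlap", fallback=100),
        "max_parallel_insert": parser.getint("performance", "max_parallel_insert", fallback=4),
    }
```

To read from disk instead, replace `read_string(text)` with `parser.read("config.ini", encoding="utf-8")`.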
License
This project is built on the LightRAG framework. See the individual component licenses for details.