Production-Grade RAG System with LightRAG

A high-performance Retrieval-Augmented Generation (RAG) system built on the LightRAG framework, with multi-format document processing, GPU-accelerated OCR, and production-grade database storage.

System Architecture

Storage Backends

  • KV Storage: Redis (Memurai) - redis://localhost:6379
  • Graph Storage: Neo4j - bolt://localhost:7687 (username: neo4j, password: jleu1212)
  • Vector Storage: Qdrant - http://localhost:6333/
  • Document Status Storage: PostgreSQL - postgresql://jleu3482:jleu1212@localhost:5432/rag_anything
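
Before starting the server it can help to confirm that each backend is listening. The sketch below is not part of LightRAG: it only opens a TCP connection to the hosts and ports listed above, so it cannot detect wrong credentials or a missing database. The names BACKENDS, is_port_open, and check_backends are illustrative.

```python
import socket

# Host/port pairs taken from the storage backend list above.
BACKENDS = {
    "Redis (Memurai)": ("localhost", 6379),
    "Neo4j (bolt)": ("localhost", 7687),
    "Qdrant": ("localhost", 6333),
    "PostgreSQL": ("localhost", 5432),
}


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def check_backends(backends=None) -> dict:
    """Map each backend name to True (reachable) or False (not reachable)."""
    backends = BACKENDS if backends is None else backends
    return {name: is_port_open(host, port) for name, (host, port) in backends.items()}
```

Calling check_backends() returns a dictionary keyed by backend name, which makes it easy to print only the services that are down.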

AI Model Configuration

  • LLM: DeepSeek API (key supplied through the DEEPSEEK_API_KEY environment variable)
  • Embeddings: Jina AI
  • Entity Extraction: spaCy models for fast indexing
  • Reranker: Disabled for performance optimization
  • OCR: PaddleOCR with GPU acceleration

Performance Settings

  • Multi-core processing: Parallel document processing
  • GPU acceleration: NVIDIA RTX 4070 Super
  • Chunking: 1200 tokens with a 100-token overlap
  • Max graph nodes: 1000
  • Max parallel insert: 4 concurrent operations
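
The chunking setting above means that consecutive chunks share 100 tokens, so each new chunk starts 1100 tokens after the previous one. The sketch below illustrates that sliding window on a plain list of tokens. It is not LightRAG's own chunker, and chunk_tokens is a hypothetical helper.

```python
from typing import List, Sequence


def chunk_tokens(tokens: Sequence[str], size: int = 1200, overlap: int = 100) -> List[List[str]]:
    """Split tokens into windows of `size` that share `overlap` tokens."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap  # 1100 with the defaults
    chunks: List[List[str]] = []
    for start in range(0, len(tokens), step):
        chunks.append(list(tokens[start:start + size]))
        if start + size >= len(tokens):
            break  # this window already reaches the end of the input
    return chunks
```

With these defaults a 2500-token document yields three chunks: tokens 0 to 1199, 1100 to 2299, and 2200 to 2499.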

Installation

Prerequisites

  • Python 3.9+
  • Redis (Memurai) running on port 6379
  • Neo4j running on bolt://localhost:7687
  • Qdrant running on http://localhost:6333
  • PostgreSQL running on port 5432 with database rag_anything

Install Dependencies

cd LightRAG-main
pip install -r requirements.txt

Download spaCy Model

python -m spacy download en_core_web_lg

Environment Setup

Set the following environment variables:

# For Jina embeddings (optional)
set JINA_API_KEY=your_jina_api_key_here

# For DeepSeek LLM
set DEEPSEEK_API_KEY=your_deepseek_api_key_here

# For Windows terminal encoding
set PYTHONIOENCODING=utf-8
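
A small pre-flight check can catch a missing key before the server starts. This sketch relies only on the variable names listed above. The helper missing_env_vars and the REQUIRED_VARS/OPTIONAL_VARS split are illustrative, with JINA_API_KEY treated as optional as the comment above states.

```python
import os
from typing import List, Mapping, Optional

REQUIRED_VARS = ["DEEPSEEK_API_KEY"]
OPTIONAL_VARS = ["JINA_API_KEY", "PYTHONIOENCODING"]


def missing_env_vars(names: List[str], env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Return the names that are unset or empty in env (default: os.environ)."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]
```

For example, missing_env_vars(REQUIRED_VARS) returns an empty list when the DeepSeek key is set in the current shell.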

Running the System

Start the Server

python -m lightrag.api.lightrag_server --port 3015 --working-dir rag_storage --input-dir inputs --key jleu1212 --auto-scan-at-startup --llm-binding openai --embedding-binding jina --rerank-binding null

Server Parameters

  • --port 3015: Web server port
  • --working-dir rag_storage: Storage directory for processed data
  • --input-dir inputs: Directory for document uploads
  • --key jleu1212: Authentication key
  • --auto-scan-at-startup: Automatically process documents in input directory
  • --llm-binding openai: Use OpenAI-compatible API (DeepSeek)
  • --embedding-binding jina: Use Jina embeddings
  • --rerank-binding null: Disable reranker for performance
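
If the server is launched from a script rather than typed by hand, the same flags can be assembled as an argument list. The sketch below only restates the command shown above. build_server_command is a hypothetical helper, and the key is a required argument so that it is not hard-coded.

```python
import sys
from typing import List


def build_server_command(key: str, port: int = 3015,
                         working_dir: str = "rag_storage",
                         input_dir: str = "inputs") -> List[str]:
    """Return the argv list for the server start command shown above."""
    return [
        sys.executable, "-m", "lightrag.api.lightrag_server",
        "--port", str(port),
        "--working-dir", working_dir,
        "--input-dir", input_dir,
        "--key", key,
        "--auto-scan-at-startup",
        "--llm-binding", "openai",
        "--embedding-binding", "jina",
        "--rerank-binding", "null",
    ]
```

The returned list can be passed directly to subprocess.run or subprocess.Popen, which avoids shell quoting issues on Windows.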

Document Processing Capabilities

Supported Formats

  • DOCX/DOC: python-docx + custom table parser
  • XLSX/XLS: pandas + openpyxl
  • PDF (Text-based): pymupdf (fitz)
  • PDF (Image-based): paddleocr + layout detection
  • PPTX/PPT: python-pptx
  • Images: MobileOne classification → PaddleOCR (Fast filter, OCR only when needed)
  • TXT/CSV: Direct read
  • HTML: beautifulsoup4
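
One way to picture the format list above is as a lookup from file extension to parsing library. The mapping below restates the list as written. The names PARSERS and pick_parser are illustrative and do not come from the code base, and the image extensions are the two named under Test Documents.

```python
from pathlib import Path

# Extension -> library, restating the supported-formats list above.
PARSERS = {
    ".docx": "python-docx", ".doc": "python-docx",
    ".xlsx": "pandas + openpyxl", ".xls": "pandas + openpyxl",
    ".pdf": "pymupdf (text-based) or paddleocr (image-based)",
    ".pptx": "python-pptx", ".ppt": "python-pptx",
    ".jpg": "MobileOne + PaddleOCR", ".png": "MobileOne + PaddleOCR",
    ".txt": "direct read", ".csv": "direct read",
    ".html": "beautifulsoup4",
}


def pick_parser(filename: str) -> str:
    """Return the parser description for a file, based on its extension."""
    suffix = Path(filename).suffix.lower()
    if suffix not in PARSERS:
        raise ValueError(f"unsupported format: {suffix or filename}")
    return PARSERS[suffix]
```

Lower-casing the suffix means that Report.PDF and report.pdf are routed the same way.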

OCR Processing

  • Uses PaddleOCR with GPU acceleration
  • MobileOne-S1 model for image classification
  • Automatic detection of scanned pages and images
  • Fast filtering to only OCR when necessary
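
This document does not say how scanned pages are detected. One common heuristic, shown here purely as an assumption, is to treat a page as scanned when its extracted text layer has almost no visible characters. The function needs_ocr and the threshold of 20 characters are both hypothetical.

```python
def needs_ocr(text_layer: str, min_chars: int = 20) -> bool:
    """Return True when a page's extracted text layer looks empty or near-empty."""
    # Drop all whitespace so that pages containing only blank lines count as empty.
    visible = "".join(text_layer.split())
    return len(visible) < min_chars
```

A check like this is cheap compared with running OCR, which is the point of filtering first.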

API Endpoints

Authentication

POST /login
Content-Type: application/x-www-form-urlencoded

username=admin&password=jleu1212
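
The login request above is form-encoded rather than JSON. A minimal sketch of building it with the Python standard library follows. build_login_request is a hypothetical helper, and the shape of the response (for example, the name of the token field) is not stated in this document, so it is not parsed here.

```python
import urllib.parse
import urllib.request


def build_login_request(base_url: str, username: str, password: str) -> urllib.request.Request:
    """Build a form-encoded POST /login request (the request is not sent)."""
    body = urllib.parse.urlencode({"username": username, "password": password}).encode("ascii")
    return urllib.request.Request(
        f"{base_url}/login",
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )
```

The request can then be sent with urllib.request.urlopen(build_login_request("http://localhost:3015", "admin", password)).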

Document Management

POST /upload
Content-Type: multipart/form-data
Authorization: Bearer {token}

POST /documents/status

GET /documents

POST /search
Content-Type: application/json
Authorization: Bearer {token}

{
  "query": "search query",
  "top_k": 10
}
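
The search request above can be built the same way from Python. This sketch assumes only the path, headers, and JSON fields shown in this section. build_search_request is a hypothetical helper, and the response format is not documented here.

```python
import json
import urllib.request


def build_search_request(base_url: str, token: str, query: str, top_k: int = 10) -> urllib.request.Request:
    """Build a JSON POST /search request with a Bearer token (the request is not sent)."""
    payload = json.dumps({"query": query, "top_k": top_k}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/search",
        data=payload,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
    )
```

The token argument is whatever value the login step returned.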

System Health

GET /health

Web UI

The system includes a web interface running on port 3015. Access it at:

http://localhost:3015

Web UI Features

  • Document upload and management
  • Real-time search interface
  • Document processing status
  • System health monitoring
  • Authentication-protected access

Testing

Test Script

Run the comprehensive test script:

python test_lightrag_webui.py

Test Documents

Place test documents in the test_documents/ directory:

  • Text files (.txt)
  • PDF documents (.pdf)
  • Word documents (.docx)
  • Excel files (.xlsx)
  • PowerPoint files (.pptx)
  • Images with text (.jpg, .png)

Health Check

curl http://localhost:3015/health
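
In a test script it is often more convenient to poll the health endpoint until the server is ready than to call curl once. The sketch below assumes that a healthy server answers GET /health with HTTP 200; the body of the response is not described in this document and is ignored. The function names are illustrative.

```python
import time
import urllib.request


def is_healthy(url: str = "http://localhost:3015/health", timeout: float = 2.0) -> bool:
    """Return True if GET url answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # URLError and HTTPError are OSError subclasses, as are socket timeouts.
        return False


def wait_until_healthy(check=is_healthy, attempts: int = 10, delay: float = 1.0) -> bool:
    """Call check() up to `attempts` times, sleeping `delay` seconds between tries."""
    for attempt in range(attempts):
        if check():
            return True
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

Passing the check as a parameter keeps the retry loop testable without a running server.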

Performance Optimization

GPU Configuration

  • Uses NVIDIA RTX 4070 Super for OCR and model inference
  • CUDA device: cuda:0
  • Batch processing for optimal throughput

Memory Management

  • Parallel processing with configurable worker count
  • Chunked document processing
  • Efficient entity extraction with spaCy
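
A configurable worker count is commonly enforced with a semaphore. The sketch below shows the general pattern with asyncio, using the limit of 4 concurrent operations from the performance settings as the default. It is not the project's actual implementation, and process_all and worker are hypothetical names.

```python
import asyncio
from typing import Awaitable, Callable, List, Sequence


async def process_all(paths: Sequence[str],
                      worker: Callable[[str], Awaitable[str]],
                      max_parallel: int = 4) -> List[str]:
    """Run worker(path) for every path, with at most max_parallel running at once."""
    semaphore = asyncio.Semaphore(max_parallel)

    async def guarded(path: str) -> str:
        async with semaphore:  # blocks while max_parallel workers are active
            return await worker(path)

    # gather preserves input order in its result list.
    return await asyncio.gather(*(guarded(p) for p in paths))
```

Bounding concurrency this way keeps memory use predictable when many large documents arrive at once.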

Database Optimization

  • Connection pooling for all databases
  • Batch insert operations
  • Indexed queries for fast retrieval
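
Batch inserts group many records into one database call. The helper below is a generic sketch of the grouping step only. It is written for Python 3.9, the minimum version listed in the prerequisites, and batches is a hypothetical name. How each batch is written depends on the backend and is not shown.

```python
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batches(items: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive lists of up to batch_size items."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    batch: List[T] = []
    for item in items:
        batch.append(item)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch  # final, possibly shorter, batch
```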

Troubleshooting

Common Issues

  1. Authentication 401 Errors

    • Ensure the login request sends form data, not JSON
    • Check that the password matches the value passed to --key

  2. OCR Initialization Failures

    • Verify that PaddleOCR is version 3.3.0 or later
    • Check GPU availability and the CUDA installation

  3. Database Connection Issues

    • Verify that all databases are running
    • Check the connection strings in config.ini

  4. Unicode Encoding Problems

    • Set the PYTHONIOENCODING=utf-8 environment variable

Logs and Monitoring

  • Server logs output to console
  • Document processing status available via API
  • System health endpoint for monitoring

Configuration

Edit config.ini for custom settings:

  • Database connection strings
  • Performance parameters
  • OCR and processing settings
  • Server configuration

License

This project is built on the LightRAG framework. See the individual component licenses for details.