LLM Server

A high-performance LLM inference server with an OpenAI-compatible API.

Features

- OpenAI-compatible HTTP API
- ASGI-based serving via Uvicorn
- Configuration through environment variables (.env)
- Docker / docker-compose deployment

Installation

# Install dependencies
pip install -r requirements.txt

# Copy environment file
cp .env.example .env

# Edit .env with your settings

Quick Start

# Start the server
python -m uvicorn src.main:app --host 0.0.0.0 --port 8000

# Or use the start script
./scripts/start_server.sh
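
Once the server is up, you can verify it responds. A minimal smoke-test sketch, assuming the server is listening on localhost:8000 as in the Quick Start command and follows the OpenAI /v1/models convention:

```python
# Smoke test: list the models the server exposes.
# Assumes the host/port from the Quick Start; adjust BASE_URL if yours differ.
import json
import urllib.request

BASE_URL = "http://localhost:8000"

def list_models(base_url: str = BASE_URL) -> dict:
    """Fetch the model list from the OpenAI-style /v1/models endpoint."""
    with urllib.request.urlopen(f"{base_url}/v1/models", timeout=10) as resp:
        return json.load(resp)

if __name__ == "__main__":
    # Only hits the network when run directly.
    print(list_models())
```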

API Endpoints

As an OpenAI-compatible server, it exposes the standard OpenAI REST routes, typically:

- POST /v1/chat/completions  chat completions
- POST /v1/completions       text completions
- GET  /v1/models            list available models

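A minimal client sketch for the chat completions route, assuming the server is running locally on port 8000; the model name "my-model" is a placeholder, so substitute one returned by /v1/models:

```python
# Build and send an OpenAI-style chat completion request (sketch).
import json
import urllib.request

def build_chat_payload(prompt: str, model: str = "my-model") -> dict:
    """Build an OpenAI-style chat completion request body.

    "my-model" is a placeholder; use a model name your server actually serves.
    """
    return {"model": model, "messages": [{"role": "user", "content": prompt}]}

def chat_completion(prompt: str, base_url: str = "http://localhost:8000") -> dict:
    """POST the payload to /v1/chat/completions and return the parsed response."""
    req = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(build_chat_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=60) as resp:
        return json.load(resp)
```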
Docker

# Build and run with Docker
cd docker
docker-compose up -d

Configuration

See .env.example for all configuration options.
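
For illustration only, a .env typically pairs names with values like the sketch below; the variable names here are hypothetical, so check .env.example for the real option names:

```shell
# Illustrative settings only -- the actual variable names live in .env.example
HOST=0.0.0.0        # hypothetical: interface to bind
PORT=8000           # hypothetical: port to listen on
```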