# Nexus One AI — Installer

## Quick Start

```bash
git clone <cgit-url>
cd cgit
sudo bash install.sh
```

The server reboots automatically once the NVIDIA drivers are installed. Phase 2 then runs on its own after the reboot.

On the custom ISO, Ubuntu autoinstall now pauses on the installer network screen so the operator can choose the final IP address from the VM console before installation continues.

## Software-Only / Existing Hardware

Run a feasibility scan before quoting or installing on customer-owned hardware:

```bash
bash scripts/cezen-feasibility.sh
```

The checker reports CPU, RAM, disk, NVIDIA GPU/VRAM, tool readiness, available features, and a recommended Cezen profile. It writes JSON to `/opt/cezen/feasibility.json` when possible, otherwise `./feasibility.json`.

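Because the report can land in either of two locations, a small lookup helps when scripting around it. The following is a minimal sketch: `find_feasibility_report` is a hypothetical helper, not part of the installer, and it assumes nothing about the fields inside the JSON.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the first readable path among the arguments.
find_feasibility_report() {
  local candidate
  for candidate in "$@"; do
    if [ -r "$candidate" ]; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  return 1
}

# Check the two documented locations in their documented order.
if report="$(find_feasibility_report /opt/cezen/feasibility.json ./feasibility.json)"; then
  echo "Feasibility report: $report"
  # Pretty-print with the Python standard library; no field names are assumed.
  python3 -m json.tool "$report"
else
  echo "No feasibility report found; run scripts/cezen-feasibility.sh first." >&2
fi
```
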
Install on existing hardware without the appliance NVIDIA phase:

```bash
sudo bash install.sh --software-only --profile=auto
```

For small systems or slow customer networks, the installer skips default model downloads on lightweight profiles. To force the same behavior manually:

```bash
sudo bash install.sh --software-only --profile=cpu-ai --skip-model-pull
```

Profiles:

| Profile | Use When | Installs |
|---|---|---|
| `core` | no GPU / low RAM | portal, backend, nginx, health/metrics API |
| `cpu-ai` | 32 GB+ RAM, no usable GPU | core + Chroma/Ollama CPU path, model pull optional |
| `gpu-starter` | 24-32 GB VRAM | local AI starter stack, model pull optional |
| `gpu-standard` | 48-96 GB VRAM | standard GPU stack |
| `gpu-pro` | multi/high-VRAM GPU | advanced GPU stack |
| `gpu-max` | multi-node or HGX-class | full stack, custom sizing |

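To make the table easier to reason about, here is a rough sketch of how RAM and VRAM figures might map to a profile name. `suggest_profile` is a hypothetical helper and the thresholds are assumptions read off the table as lower bounds; the table leaves gaps (for example 33-47 GB VRAM) that this sketch fills arbitrarily. It is not the logic behind `--profile=auto`, and it never suggests `gpu-max`, which needs custom sizing.

```bash
#!/usr/bin/env bash
# Hypothetical helper: map system RAM and total VRAM (whole GB) to a profile.
# Thresholds are assumptions derived from the profile table, not installer code.
suggest_profile() {
  local ram_gb="$1" vram_gb="$2"
  if   [ "$vram_gb" -gt 96 ]; then echo "gpu-pro"
  elif [ "$vram_gb" -ge 48 ]; then echo "gpu-standard"
  elif [ "$vram_gb" -ge 24 ]; then echo "gpu-starter"
  elif [ "$ram_gb"  -ge 32 ]; then echo "cpu-ai"
  else                             echo "core"
  fi
}

suggest_profile 64 0     # prints: cpu-ai
suggest_profile 128 48   # prints: gpu-standard
```

Treat the result as a starting point and confirm it against the feasibility report before passing a value to `--profile=`.
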
## Sellable v1 Admin APIs

The backend exposes the first productization APIs for software-only and appliance deployments:

| API | Purpose |
|---|---|
| `GET /api/license` | Shows current tier, feature matrix, and whether the tier is locked by Cezen. |
| `GET /api/system/feasibility` | Returns the generated hardware feasibility report or live fallback. |
| `GET /api/system/readiness-report` | Combines license, feasibility, and install readiness into a customer-facing report payload. |
| `GET /api/audit/report?days=7` | Basic audit summary for handover and admin review. |
| `GET /api/system/backups` | Lists local backups. |
| `POST /api/system/backups` | Creates a local backup of Cezen data. |
| `POST /api/system/backups/{name}/restore` | Restores a named local backup and creates a pre-restore safety snapshot. |

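These endpoints can be exercised from the shell with `curl`. The sketch below makes two assumptions that this README does not confirm: the backend is reachable at `http://localhost` (override with `CEZEN_URL`), and no authentication header is required. `api_url` and `api_get` are hypothetical helpers.

```bash
#!/usr/bin/env bash
# Assumption: base URL of the backend; adjust for your deployment.
CEZEN_URL="${CEZEN_URL:-http://localhost}"

# Hypothetical helper: join the base URL and an API path without a double slash.
api_url() {
  printf '%s/%s\n' "${CEZEN_URL%/}" "${1#/}"
}

# Hypothetical helper: GET an endpoint and fail on HTTP errors (-f),
# staying quiet except for error messages (-sS).
api_get() {
  curl -fsS "$(api_url "$1")"
}

# Example calls, left commented out so that sourcing this file has no side effects:
#   api_get /api/license
#   api_get "/api/audit/report?days=7"
#   curl -fsS -X POST "$(api_url /api/system/backups)"
```
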
CLI backup helper:

```bash
sudo bash scripts/cezen-backup.sh backup
sudo bash scripts/cezen-backup.sh list
sudo bash scripts/cezen-backup.sh restore /opt/cezen/backups/cezen-backup-YYYYmmdd-HHMMSS.zip
```

## What Gets Installed (Entry Tier)

| Service | Port | Notes |
|---|---|---|
| Ollama | 11434 | LLM inference, 2 models pre-loaded |
| Open WebUI | 3001 | Chat interface |
| vLLM | 8000 | OpenAI-compatible API (start manually) |
| JupyterLab | 8888 | Token: `cezen2024` |
| ChromaDB | 8100 | Vector DB for RAG |
| MLflow | 5000 | Experiment tracking |
| MinIO | 9001 | Object storage (user: cezenadmin / Cezen@2024!) |
| Grafana | 3000 | GPU + system monitoring (admin / cezen2024) |

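After installation, a quick way to see which of these services are listening is to probe each port. This is a minimal sketch, assuming Bash (for `/dev/tcp`) and GNU `timeout`; `port_open` and `check_services` are hypothetical helpers, and an open port only shows that something is listening, not that the service is healthy.

```bash
#!/usr/bin/env bash
# Service names and ports copied from the Entry Tier service table.
SERVICES=(
  "Ollama:11434" "Open WebUI:3001" "vLLM:8000" "JupyterLab:8888"
  "ChromaDB:8100" "MLflow:5000" "MinIO:9001" "Grafana:3000"
)

# Hypothetical helper: succeed if a TCP connection to host:port opens within 2 s.
port_open() {
  timeout 2 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

# Hypothetical helper: print one status line per service for the given host.
check_services() {
  local host="${1:-127.0.0.1}" entry name port
  for entry in "${SERVICES[@]}"; do
    name="${entry%%:*}"
    port="${entry##*:}"
    if port_open "$host" "$port"; then
      printf '%-12s %5s  up\n' "$name" "$port"
    else
      printf '%-12s %5s  DOWN\n' "$name" "$port"
    fi
  done
}

check_services
```

vLLM is started manually, so a `DOWN` line for port 8000 is expected on a fresh install.
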
## Testing Without a GPU (Multipass)

```bash
# On your MacBook:
multipass launch 22.04 --name cezen-test --cpus 4 --mem 8G --disk 40G
multipass shell cezen-test

# Inside the VM:
git clone <cgit-url>
cd cgit
sudo bash install.sh
```

The NVIDIA driver install will succeed, but `nvidia-smi` won't show any GPUs. That is expected; all other services will run fine.

## Pull More Models

```bash
bash models/pull-models.sh --tier=starter   # phi3:mini + embeddings
bash models/pull-models.sh --tier=basic     # llama3.1:8b, mistral:7b, codellama
bash models/pull-models.sh --tier=pro       # + llama3.1:70b, mixtral, deepseek-coder
bash models/pull-models.sh --tier=max       # + llama3.1:405b, mixtral:8x22b
```

## File Structure

```
cgit/
├── install.sh                  ← Entry point
├── ansible/
│   ├── phase1_nvidia.yml       ← Phase 1: drivers (triggers reboot)
│   ├── starter.yml             ← Phase 2: Starter tier (1 GPU, small team)
│   ├── entry.yml               ← Phase 2: Basic tier (1–2 GPU, department)
│   ├── pro.yml                 ← Phase 2: Pro tier (2+ GPU, multi-team)
│   ├── max.yml                 ← Phase 2: Max tier (4–8 GPU, enterprise)
│   └── roles/
│       ├── base/               ← OS, Python, Miniconda, LangChain
│       ├── nvidia/             ← Drivers, CUDA 12.4, cuDNN 9
│       ├── docker/             ← Docker CE + NVIDIA Container Toolkit
│       ├── k3s/                ← Lightweight Kubernetes
│       ├── ollama/             ← Ollama + Open WebUI
│       ├── vllm/               ← vLLM inference server
│       ├── jupyterlab/         ← JupyterLab notebooks
│       ├── chromadb/           ← Vector database
│       ├── mlflow/             ← Experiment tracking
│       ├── minio/              ← Object storage
│       └── monitoring/         ← Grafana + Prometheus + DCGM
└── models/
    └── pull-models.sh          ← Pull additional models
```

## Change Default Passwords

Before shipping to a customer, change these defaults:

- JupyterLab token: `/opt/cezen/.jupyter/jupyter_lab_config.py`
- MinIO: `/etc/default/minio`
- Grafana: environment vars in monitoring role, or via UI after first login
- MLflow: no auth by default (add reverse proxy if needed)
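
When replacing the defaults, random values are safer than hand-picked ones. The sketch below only generates candidate secrets; `gen_secret` is a hypothetical helper, and writing the values into the files listed above is left to the operator because the exact config keys are not documented here.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print N random bytes from /dev/urandom as lowercase hex
# (2*N characters). Defaults to 16 bytes.
gen_secret() {
  local bytes="${1:-16}"
  head -c "$bytes" /dev/urandom | od -An -tx1 | tr -d ' \n'
  echo
}

echo "JupyterLab token:  $(gen_secret 24)"
echo "MinIO password:    $(gen_secret 16)"
echo "Grafana password:  $(gen_secret 16)"
```

Hex output avoids characters that would need quoting in environment files or YAML.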