Document installer field workflow
parent f407a9331e
commit 3ee2698e87

README.md (317 lines changed)
@@ -1,65 +1,188 @@
-# Nexus One AI — Installer
+# Nexus One AI Installer
 
-## Quick Start
+This repository is the source of truth for Nexus One AI ISO and server installs.
+The ISO keeps itself small by pulling this package from cgit during setup, then
+the installer deploys the selected tier on the target server.
 
-```bash
-git clone <cgit-url>
-cd cgit
-sudo bash install.sh
+## 1. Choose The Install Path
+
+| Scenario | Use This Path |
+|---|---|
+| New appliance/server with ISO | Boot from the Nexus One AI ISO and complete first-boot setup. |
+| PSU/offline field install by pendrive | Boot the ISO from USB, enter license/tier details during first boot, upload large models later if needed. |
+| Existing Ubuntu server | Clone this repo and run the feasibility check before installing. |
+| Lab test without GPU | Use Multipass/VM and expect GPU services to be limited. |
+
+## 2. New ISO Install
+
+1. Flash the Nexus One AI ISO to a USB drive or attach it to the VM/server.
+2. Boot the server from the ISO.
+3. On the Ubuntu installer network screen, choose DHCP or the final static IP.
+4. Let Ubuntu finish installation and reboot.
+5. On first boot, open the setup URL shown on the server console:
+
+```text
+http://<server-ip>
 ```
 
-Server reboots automatically after NVIDIA drivers install. Phase 2 runs on its own after reboot.
-
-On the custom ISO, Ubuntu autoinstall now pauses on the installer network screen so the operator can choose the final IP address from the VM console before installation continues.
-
-## Software-Only / Existing Hardware
-
-Run a feasibility scan before quoting or installing on customer-owned hardware:
+6. Complete the setup wizard:
+   - Network: DHCP or static IP.
+   - License & customer details: customer name, project/customer ID, contact email, license key, support date.
+   - Tier: Starter, Entry, Pro, or Max.
+   - Tools: keep defaults unless a component should be skipped.
+7. Click **Start Installation**.
+8. Wait for Phase 1 NVIDIA driver setup. The server may reboot once.
+9. After reboot, Phase 2 continues automatically through `cezen-phase2.service`.
+10. Monitor progress:
 
 ```bash
-bash scripts/cezen-feasibility.sh
+ssh cezen@<server-ip>
+sudo journalctl -fu cezen-phase2.service
+sudo tail -f /var/log/cezen-install.log
 ```
 
-The checker reports CPU, RAM, disk, NVIDIA GPU/VRAM, tool readiness, available features, and a recommended Cezen profile. It writes JSON to `/opt/cezen/feasibility.json` when possible, otherwise `./feasibility.json`.
+11. Open the portal after install:
 
-Install on existing hardware without the appliance NVIDIA phase:
+```text
+http://<server-ip>/
+```
+
+## 3. PSU / Pendrive Field Install
+
+Use this when a team physically visits the site and installs from a USB drive.
+
+1. Carry the latest ISO on a bootable USB drive.
+2. Boot the PSU/customer server from the USB.
+3. Configure the final network on the Ubuntu installer screen.
+4. After first boot, use either:
+   - Browser setup at `http://<server-ip>`, or
+   - Physical console terminal wizard if no browser is available.
+5. Enter customer/license details during setup. If the final license key is not available, leave it blank; the system records the install as field staging/evaluation.
+6. Select the commercial tier sold to the customer.
+7. Complete install.
+8. Upload or pull large models later after bandwidth/storage is confirmed.
+
+License details are stored on the installed server at:
+
+```text
+/opt/cezen/license.json
+```
+
+Installer selections are stored at:
+
+```text
+/opt/cezen/install.conf
+```
+
+## 4. Existing Server Feasibility Check
+
+Run this before quoting, committing a tier, or installing on customer-owned hardware.
+
+```bash
+git clone https://cgit.cezentech.com/jinojose/aipackage.git
+cd aipackage
+sudo bash install.sh --feasibility-only
+```
+
+The report checks CPU, RAM, disk, NVIDIA GPU/VRAM, and likely supported features.
+It writes JSON to:
+
+```text
+/opt/cezen/feasibility.json
+```
+
+If `/opt/cezen` is not writable, it writes:
+
+```text
+./feasibility.json
+```
+
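
The output-path fallback described in the added section 4 text (prefer `/opt/cezen/feasibility.json`, fall back to `./feasibility.json` when `/opt/cezen` is not writable) can be sketched in a few lines of shell. This is only an illustration of the rule as the README states it, not the logic in `scripts/cezen-feasibility.sh`; the function name `feasibility_report_path` and its optional directory argument are hypothetical.

```bash
# Hypothetical helper: print the path a feasibility report would be written to.
# Assumption: "writable" means the directory exists and the current user can write to it.
feasibility_report_path() {
  dir="${1:-/opt/cezen}"
  if [ -d "$dir" ] && [ -w "$dir" ]; then
    printf '%s/feasibility.json\n' "$dir"
  else
    # Fall back to the current working directory.
    printf './feasibility.json\n'
  fi
}
```

Calling `feasibility_report_path` with no argument applies the rule to `/opt/cezen`; passing another directory makes the rule easy to try out without root.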
+Recommended interpretation:
+
+| Result | Meaning |
+|---|---|
+| `core` | Portal/backend only; no local model serving recommended. |
+| `cpu-ai` | CPU-only RAG/chat possible, but constrained. |
+| `gpu-starter` | Starter GPU deployment. |
+| `gpu-standard` | Entry tier style deployment. |
+| `gpu-pro` | Pro tier candidate. |
+| `gpu-max` | Max tier candidate. |
+
+## 5. Existing Server Install
+
+After feasibility check, install on an existing Ubuntu server:
 
 ```bash
 sudo bash install.sh --software-only --profile=auto
 ```
 
-For small systems or slow customer networks, the installer skips default model downloads on lightweight profiles. To force the same behavior manually:
+For small systems or slow customer networks, skip default model downloads:
 
 ```bash
 sudo bash install.sh --software-only --profile=cpu-ai --skip-model-pull
 ```
 
-Profiles:
+To force a commercial tier:
 
-| Profile | Use When | Installs |
-|---|---|---|
-| `core` | no GPU / low RAM | portal, backend, nginx, health/metrics API |
-| `cpu-ai` | 32 GB+ RAM, no usable GPU | core + Chroma/Ollama CPU path, model pull optional |
-| `gpu-starter` | 24-32 GB VRAM | local AI starter stack, model pull optional |
-| `gpu-standard` | 48-96 GB VRAM | standard GPU stack |
-| `gpu-pro` | multi/high-VRAM GPU | advanced GPU stack |
-| `gpu-max` | multi-node or HGX-class | full stack, custom sizing |
+```bash
+sudo bash install.sh --software-only --tier=starter
+sudo bash install.sh --software-only --tier=basic
+sudo bash install.sh --software-only --tier=pro
+sudo bash install.sh --software-only --tier=max
+```
 
-## Sellable v1 Admin APIs
+The installer warns if selected tier and hardware recommendation do not match.
+The selected tier still wins, because the sale/license decision is commercial.
 
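
The rule in the two added lines above (warn on a mismatch, but let the selected tier win) can be illustrated with a short sketch. The mapping from feasibility result to `--tier` value is an assumption read off the "Recommended interpretation" table and the `--tier=` examples; the function names `profile_to_tier` and `resolve_tier` are hypothetical and are not taken from `install.sh`.

```bash
# Assumed mapping from a feasibility result to the closest --tier value.
profile_to_tier() {
  case "$1" in
    gpu-starter)  echo starter ;;
    gpu-standard) echo basic ;;
    gpu-pro)      echo pro ;;
    gpu-max)      echo max ;;
    *)            echo none ;;  # core / cpu-ai: no GPU tier recommended (assumption)
  esac
}

# Warn on stderr when the selection and the recommendation differ,
# then print the selected tier unchanged: the commercial decision wins.
resolve_tier() {
  selected="$1"
  profile="$2"
  recommended="$(profile_to_tier "$profile")"
  if [ "$recommended" != "$selected" ]; then
    echo "WARNING: selected tier '$selected' differs from hardware recommendation '$recommended' (profile: $profile)" >&2
  fi
  echo "$selected"
}
```

For example, `resolve_tier pro gpu-starter` prints `pro` on stdout and the warning on stderr, so a caller can capture the tier and still surface the mismatch to the operator.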
-The backend exposes the first productization APIs for software-only and appliance deployments:
+## 6. Tier Guide
+
+| Tier | Target Hardware | Typical Use | Default Models |
+|---|---|---|---|
+| Starter | 1 GPU around 24-32 GB VRAM, or constrained CPU system | Small team, RAG/admin portal, light chat | `phi3:mini`, `nomic-embed-text` |
+| Entry / Basic | 1 RTX Pro 6000 class GPU, around 48-96 GB VRAM | Department deployment | `llama3.1:8b`, `mistral:7b`, `codellama:13b`, `nomic-embed-text` |
+| Pro | 2+ high VRAM GPUs | Multi-team deployment, heavier coding/RAG/fine-tuning workflows | Entry models plus `llama3.1:70b`, `mixtral:8x7b`, `deepseek-coder-v2:16b` |
+| Max | 4-8 enterprise GPUs such as H100/H200/A100 class | Enterprise deployment, large models, high concurrency | Pro models plus `llama3.1:405b`, `mixtral:8x22b` |
+
+Large models can be pulled later. The ISO does not need to contain them.
+
+```bash
+bash models/pull-models.sh --tier=starter
+bash models/pull-models.sh --tier=basic
+bash models/pull-models.sh --tier=pro
+bash models/pull-models.sh --tier=max
+```
+
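
The "Default Models" column of the tier guide is cumulative: Pro adds to the Entry/Basic set and Max adds to the Pro set. A minimal sketch of that expansion is shown below. The model names come from the table; the function name `tier_models` is hypothetical, and the actual `models/pull-models.sh` may organize its lists differently.

```bash
# Print the space-separated default model list for a tier, per the tier guide table.
tier_models() {
  case "$1" in
    starter) echo "phi3:mini nomic-embed-text" ;;
    basic)   echo "llama3.1:8b mistral:7b codellama:13b nomic-embed-text" ;;
    pro)     echo "$(tier_models basic) llama3.1:70b mixtral:8x7b deepseek-coder-v2:16b" ;;
    max)     echo "$(tier_models pro) llama3.1:405b mixtral:8x22b" ;;
    *)       echo "unknown tier: $1" >&2; return 1 ;;
  esac
}
```

A loop such as `for m in $(tier_models pro); do ollama pull "$m"; done` would then pull each model in turn, assuming the `ollama` CLI is installed on the server.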
+## 7. What Gets Installed
+
+All tiers install the Nexus One AI portal, backend API, nginx, health/readiness
+reporting, license/tier handling, and selected AI tools.
+
+| Component | Port | Notes |
+|---|---:|---|
+| Nexus One AI portal | 80 | Main UI served by nginx. |
+| cezen-api backend | 8080 | FastAPI backend, systemd service `cezen-api`. |
+| Ollama | 11434 | Local model inference. |
+| Open WebUI | 3001 | Chat UI. |
+| ChromaDB | 8100 | Vector database for RAG. |
+| vLLM | 8000 | OpenAI-compatible serving path, mainly Pro/Max. |
+| JupyterLab | 8888 | Notebook environment. |
+| MLflow | 5000 | Experiment tracking. |
+| MinIO | 9001 | S3-compatible object/model storage. |
+| Grafana | 3000 | Monitoring dashboard. |
+
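
For handover notes or quick checks, the port table above can be turned into a small lookup. The sketch below only encodes the ports listed in the table; the short service keys (`portal`, `open-webui`, and so on) and the function names are hypothetical, not identifiers used by the installer.

```bash
# Port lookup taken from the "What Gets Installed" table.
service_port() {
  case "$1" in
    portal)     echo 80 ;;
    cezen-api)  echo 8080 ;;
    ollama)     echo 11434 ;;
    open-webui) echo 3001 ;;
    chromadb)   echo 8100 ;;
    vllm)       echo 8000 ;;
    jupyterlab) echo 8888 ;;
    mlflow)     echo 5000 ;;
    minio)      echo 9001 ;;
    grafana)    echo 3000 ;;
    *)          echo "unknown service: $1" >&2; return 1 ;;
  esac
}

# Build a plain-HTTP URL for a service on a given host.
service_url() {
  port="$(service_port "$2")" || return 1
  printf 'http://%s:%s/\n' "$1" "$port"
}
```

For instance, `service_url "<server-ip>" grafana` prints the URL to hand to the customer for the monitoring dashboard.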
+## 8. Admin And Readiness APIs
 
 | API | Purpose |
 |---|---|
-| `GET /api/license` | Shows current tier, feature matrix, and whether the tier is locked by Cezen. |
-| `GET /api/system/feasibility` | Returns the generated hardware feasibility report or live fallback. |
-| `GET /api/system/readiness-report` | Combines license, feasibility, and install readiness into a customer-facing report payload. |
-| `GET /api/audit/report?days=7` | Basic audit summary for handover and admin review. |
-| `GET /api/system/backups` | Lists local backups. |
-| `POST /api/system/backups` | Creates a local backup of Cezen data. |
-| `POST /api/system/backups/{name}/restore` | Restores a named local backup and creates a pre-restore safety snapshot. |
+| `GET /api/license` | Current tier, feature matrix, and safe license metadata. |
+| `GET /api/system/feasibility` | Hardware feasibility report or live fallback. |
+| `GET /api/system/readiness-report` | License + feasibility + install readiness payload. |
+| `GET /api/audit/report?days=7` | Audit summary for handover/admin review. |
+| `GET /api/system/backups` | List local backups. |
+| `POST /api/system/backups` | Create local backup. |
+| `POST /api/system/backups/{name}/restore` | Restore backup with pre-restore safety snapshot. |
 
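
As a usage illustration for the endpoints in the table, the sketch below builds request URLs against the backend on port 8080, the address used by the README's own post-install `curl` checks. The environment variable `CEZEN_API_BASE` and the function names are hypothetical conveniences; whether these endpoints require authentication is not stated in the README.

```bash
# Build a backend URL; CEZEN_API_BASE is a hypothetical override for remote hosts.
cezen_api_url() {
  base="${CEZEN_API_BASE:-http://localhost:8080}"
  printf '%s%s\n' "$base" "$1"
}

# URL for the audit summary, defaulting to the 7-day window shown in the table.
audit_report_url() {
  cezen_api_url "/api/audit/report?days=${1:-7}"
}

# URL for restoring a named backup (to be sent with POST).
backup_restore_url() {
  cezen_api_url "/api/system/backups/$1/restore"
}
```

A 30-day audit summary could then be fetched with `curl -s "$(audit_report_url 30)"`, and a restore triggered with `curl -s -X POST "$(backup_restore_url <name>)"`.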
-CLI backup helper:
+Backup helper:
 
 ```bash
 sudo bash scripts/cezen-backup.sh backup
@@ -67,74 +190,84 @@ sudo bash scripts/cezen-backup.sh list
 sudo bash scripts/cezen-backup.sh restore /opt/cezen/backups/cezen-backup-YYYYmmdd-HHMMSS.zip
 ```
 
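
The restore example uses an archive name of the form `cezen-backup-YYYYmmdd-HHMMSS.zip`. Because that timestamp sorts lexically in chronological order, the newest archive in a directory is simply the last match of a sorted glob. The sketch below shows both the naming pattern and that selection; the function names are hypothetical and this is not the code of `scripts/cezen-backup.sh`.

```bash
# Print an archive name following the cezen-backup-YYYYmmdd-HHMMSS.zip pattern.
backup_archive_name() {
  date +"cezen-backup-%Y%m%d-%H%M%S.zip"
}

# Print the newest archive in a directory (empty line if none exist).
latest_backup() {
  newest=""
  for f in "$1"/cezen-backup-*.zip; do
    [ -e "$f" ] || continue   # skip the unexpanded pattern when nothing matches
    newest="$f"               # glob results are sorted, so the last one is the newest
  done
  printf '%s\n' "$newest"
}
```

Under these assumptions, `sudo bash scripts/cezen-backup.sh restore "$(latest_backup /opt/cezen/backups)"` would restore the most recent archive.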
-## What Gets Installed (Entry Tier)
+## 9. Post-Install Checks
 
-| Service | Port | Notes |
-|---|---|---|
-| Ollama | 11434 | LLM inference, 2 models pre-loaded |
-| Open WebUI | 3001 | Chat interface |
-| vLLM | 8000 | OpenAI-compatible API (start manually) |
-| JupyterLab | 8888 | Token: `cezen2024` |
-| ChromaDB | 8100 | Vector DB for RAG |
-| MLflow | 5000 | Experiment tracking |
-| MinIO | 9001 | Object storage (user: cezenadmin / Cezen@2024!) |
-| Grafana | 3000 | GPU + system monitoring (admin / cezen2024) |
-
-## Testing Without a GPU (Multipass)
+Run these after install:
 
 ```bash
+systemctl status cezen-api --no-pager
+systemctl status cezen-phase2.service --no-pager
+curl -s http://localhost:8080/api/settings/branding
+curl -s http://localhost:8080/api/system/feasibility
+```
+
+Check service ports:
+
+```bash
+ss -lntp
+```
+
+Check Ollama models:
+
+```bash
+curl -s http://localhost:11434/api/tags
+```
+
+## 10. Test Without A GPU
+
+On a MacBook:
+
+```bash
-# On your MacBook:
 multipass launch 22.04 --name cezen-test --cpus 4 --mem 8G --disk 40G
 multipass shell cezen-test
-
-# Inside the VM:
-git clone <cgit-url>
-sudo bash install.sh
 ```
 
-NVIDIA driver install will succeed but `nvidia-smi` won't show GPUs — that's expected. All other services will run fine.
-
-## Pull More Models
+Inside the VM:
 
 ```bash
-bash models/pull-models.sh --tier=starter # phi3:mini + embeddings
-bash models/pull-models.sh --tier=basic # llama3.1:8b, mistral:7b, codellama
-bash models/pull-models.sh --tier=pro # + llama3.1:70b, mixtral, deepseek-coder
-bash models/pull-models.sh --tier=max # + llama3.1:405b, mixtral:8x22b
+git clone https://cgit.cezentech.com/jinojose/aipackage.git
+cd aipackage
+sudo bash install.sh --feasibility-only
+sudo bash install.sh --software-only --profile=auto --skip-model-pull
 ```
 
-## File Structure
+No GPU will be detected. That is expected.
 
-```
-cgit/
-├── install.sh                   ← Entry point
-├── ansible/
-│   ├── phase1_nvidia.yml        ← Phase 1: drivers (triggers reboot)
-│   ├── starter.yml              ← Phase 2: Starter tier (1 GPU, small team)
-│   ├── entry.yml                ← Phase 2: Basic tier (1–2 GPU, department)
-│   ├── pro.yml                  ← Phase 2: Pro tier (2+ GPU, multi-team)
-│   ├── max.yml                  ← Phase 2: Max tier (4–8 GPU, enterprise)
-│   └── roles/
-│       ├── base/                ← OS, Python, Miniconda, LangChain
-│       ├── nvidia/              ← Drivers, CUDA 12.4, cuDNN 9
-│       ├── docker/              ← Docker CE + NVIDIA Container Toolkit
-│       ├── k3s/                 ← Lightweight Kubernetes
-│       ├── ollama/              ← Ollama + Open WebUI
-│       ├── vllm/                ← vLLM inference server
-│       ├── jupyterlab/          ← JupyterLab notebooks
-│       ├── chromadb/            ← Vector database
-│       ├── mlflow/              ← Experiment tracking
-│       ├── minio/               ← Object storage
-│       └── monitoring/          ← Grafana + Prometheus + DCGM
-└── models/
-    └── pull-models.sh           ← Pull additional models
-```
+## 11. Change Default Passwords Before Customer Handover
 
-## Change Default Passwords
-
-Before shipping to a customer, update these:
+Before shipping to a customer, rotate these:
 
+- Initial OS/admin account password.
 - JupyterLab token: `/opt/cezen/.jupyter/jupyter_lab_config.py`
-- MinIO: `/etc/default/minio`
-- Grafana: environment vars in monitoring role, or via UI after first login
-- MLflow: no auth by default (add reverse proxy if needed)
+- MinIO credentials: `/etc/default/minio`
+- Grafana admin password.
+- Any temporary portal/backend admin credentials.
+- Any staging license key if the final license is issued later.
+
+## 12. Useful Files
+
+```text
+cgit/
+├── install.sh                        # Main installer entry point
+├── autoinstall/                      # ISO first-boot setup and web setup
+├── scripts/cezen-feasibility.sh      # Existing-server feasibility checker
+├── scripts/cezen-backup.sh           # Backup/restore helper
+├── ansible/
+│   ├── phase1_nvidia.yml             # NVIDIA/CUDA phase
+│   ├── starter.yml                   # Starter tier
+│   ├── entry.yml                     # Entry/Basic tier
+│   ├── pro.yml                       # Pro tier
+│   ├── max.yml                       # Max tier
+│   └── roles/
+│       ├── cezen-backend/            # FastAPI backend, cezen-api service
+│       ├── cezen-nginx/              # Portal/nginx deployment
+│       ├── ollama/                   # Ollama + Open WebUI
+│       ├── chromadb/                 # RAG vector DB
+│       ├── vllm/                     # vLLM serving
+│       ├── jupyterlab/               # Notebooks
+│       ├── mlflow/                   # Experiment tracking
+│       ├── minio/                    # Object storage
+│       └── monitoring/               # Grafana/Prometheus/DCGM
+├── cezen-portal/                     # Packaged portal UI
+└── models/pull-models.sh             # Pull tier-specific models
+```