Document installer field workflow

Jino Jose 2026-06-30 12:31:09 +05:30
parent f407a9331e
commit 3ee2698e87

317
README.md

@@ -1,65 +1,188 @@
# Nexus One AI Installer
## Quick Start This repository is the source of truth for Nexus One AI ISO and server installs.
The ISO keeps itself small by pulling this package from cgit during setup, then
the installer deploys the selected tier on the target server.
```bash ## 1. Choose The Install Path
git clone <cgit-url>
cd cgit | Scenario | Use This Path |
sudo bash install.sh |---|---|
| New appliance/server with ISO | Boot from the Nexus One AI ISO and complete first-boot setup. |
| PSU/offline field install by pendrive | Boot the ISO from USB, enter license/tier details during first boot, upload large models later if needed. |
| Existing Ubuntu server | Clone this repo and run the feasibility check before installing. |
| Lab test without GPU | Use Multipass/VM and expect GPU services to be limited. |
## 2. New ISO Install
1. Flash the Nexus One AI ISO to a USB drive or attach it to the VM/server.
2. Boot the server from the ISO.
3. On the Ubuntu installer network screen, choose DHCP or the final static IP.
4. Let Ubuntu finish installation and reboot.
5. On first boot, open the setup URL shown on the server console:
```text
http://<server-ip>
``` ```
Server reboots automatically after NVIDIA drivers install. Phase 2 runs on its own after reboot. 6. Complete the setup wizard:
- Network: DHCP or static IP.
On the custom ISO, Ubuntu autoinstall now pauses on the installer network screen so the operator can choose the final IP address from the VM console before installation continues. - License & customer details: customer name, project/customer ID, contact email, license key, support date.
- Tier: Starter, Entry, Pro, or Max.
## Software-Only / Existing Hardware - Tools: keep defaults unless a component should be skipped.
7. Click **Start Installation**.
Run a feasibility scan before quoting or installing on customer-owned hardware: 8. Wait for Phase 1 NVIDIA driver setup. The server may reboot once.
9. After reboot, Phase 2 continues automatically through `cezen-phase2.service`.
10. Monitor progress:
```bash ```bash
bash scripts/cezen-feasibility.sh ssh cezen@<server-ip>
sudo journalctl -fu cezen-phase2.service
sudo tail -f /var/log/cezen-install.log
``` ```
The checker reports CPU, RAM, disk, NVIDIA GPU/VRAM, tool readiness, available features, and a recommended Cezen profile. It writes JSON to `/opt/cezen/feasibility.json` when possible, otherwise `./feasibility.json`. 11. Open the portal after install:
Install on existing hardware without the appliance NVIDIA phase: ```text
http://<server-ip>/
```
## 3. PSU / Pendrive Field Install
Use this when a team physically visits the site and installs from a USB drive.
1. Carry the latest ISO on a bootable USB drive.
2. Boot the PSU/customer server from the USB.
3. Configure the final network on the Ubuntu installer screen.
4. After first boot, use either:
- Browser setup at `http://<server-ip>`, or
- Physical console terminal wizard if no browser is available.
5. Enter customer/license details during setup. If the final license key is not available, leave it blank; the system records the install as field staging/evaluation.
6. Select the commercial tier sold to the customer.
7. Complete install.
8. Upload or pull large models later after bandwidth/storage is confirmed.
License details are stored on the installed server at:
```text
/opt/cezen/license.json
```
Installer selections are stored at:
```text
/opt/cezen/install.conf
```
## 4. Existing Server Feasibility Check
Run this before quoting, committing a tier, or installing on customer-owned hardware.
```bash
git clone https://cgit.cezentech.com/jinojose/aipackage.git
cd aipackage
sudo bash install.sh --feasibility-only
```
The report checks CPU, RAM, disk, NVIDIA GPU/VRAM, and likely supported features.
It writes JSON to:
```text
/opt/cezen/feasibility.json
```
If `/opt/cezen` is not writable, it writes:
```text
./feasibility.json
```
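A script that consumes the report can apply the same fallback order when looking for it. The helper below is a minimal sketch, not part of the installer: `find_feasibility_report` is a hypothetical name, and the only details taken from this README are the two paths above.

```bash
# Hypothetical helper: print the path of the feasibility report.
# Prefers /opt/cezen/feasibility.json and falls back to ./feasibility.json.
# Both paths can be overridden through the two optional arguments.
find_feasibility_report() {
  local primary="${1:-/opt/cezen/feasibility.json}"
  local fallback="${2:-./feasibility.json}"
  if [ -r "$primary" ]; then
    printf '%s\n' "$primary"
  elif [ -r "$fallback" ]; then
    printf '%s\n' "$fallback"
  else
    # Neither file is readable; let the caller decide what to do.
    return 1
  fi
}
```

For example, `report="$(find_feasibility_report)" && python3 -m json.tool "$report"` pretty-prints whichever file exists, assuming `python3` is available on the server.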
Recommended interpretation:
| Result | Meaning |
|---|---|
| `core` | Portal/backend only; no local model serving recommended. |
| `cpu-ai` | CPU-only RAG/chat possible, but constrained. |
| `gpu-starter` | Starter GPU deployment. |
| `gpu-standard` | Entry tier style deployment. |
| `gpu-pro` | Pro tier candidate. |
| `gpu-max` | Max tier candidate. |
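When scripting a quote or an install, the result can be translated into a `--tier` value before calling `install.sh`. The mapping below is one reading of the table, offered as a sketch: `profile_to_tier` is a hypothetical helper, and treating `core` and `cpu-ai` as having no GPU tier is an assumption rather than the installer's own logic.

```bash
# Hypothetical helper: translate a feasibility result into a --tier value.
# The mapping is an assumption based on the interpretation table above.
profile_to_tier() {
  case "$1" in
    gpu-starter)  echo "starter" ;;
    gpu-standard) echo "basic" ;;   # Entry / Basic tier
    gpu-pro)      echo "pro" ;;
    gpu-max)      echo "max" ;;
    core|cpu-ai)  echo "none" ;;    # assumption: no GPU tier implied
    *)            echo "unknown profile: $1" >&2; return 1 ;;
  esac
}
```

A result of `none` is a signal to discuss the hardware with the customer before committing to a tier, not an instruction to the installer.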
## 5. Existing Server Install
After the feasibility check, install on an existing Ubuntu server:
```bash
sudo bash install.sh --software-only --profile=auto
```
For small systems or slow customer networks, the installer skips default model downloads on lightweight profiles. To force the same behavior manually: For small systems or slow customer networks, skip default model downloads:
```bash
sudo bash install.sh --software-only --profile=cpu-ai --skip-model-pull
```
Profiles: To force a commercial tier:
| Profile | Use When | Installs | ```bash
|---|---|---| sudo bash install.sh --software-only --tier=starter
| `core` | no GPU / low RAM | portal, backend, nginx, health/metrics API | sudo bash install.sh --software-only --tier=basic
| `cpu-ai` | 32 GB+ RAM, no usable GPU | core + Chroma/Ollama CPU path, model pull optional | sudo bash install.sh --software-only --tier=pro
| `gpu-starter` | 24-32 GB VRAM | local AI starter stack, model pull optional | sudo bash install.sh --software-only --tier=max
| `gpu-standard` | 48-96 GB VRAM | standard GPU stack | ```
| `gpu-pro` | multi/high-VRAM GPU | advanced GPU stack |
| `gpu-max` | multi-node or HGX-class | full stack, custom sizing |
## Sellable v1 Admin APIs The installer warns if selected tier and hardware recommendation do not match.
The selected tier still wins, because the sale/license decision is commercial.
The backend exposes the first productization APIs for software-only and appliance deployments: ## 6. Tier Guide
| Tier | Target Hardware | Typical Use | Default Models |
|---|---|---|---|
| Starter | 1 GPU around 24-32 GB VRAM, or constrained CPU system | Small team, RAG/admin portal, light chat | `phi3:mini`, `nomic-embed-text` |
| Entry / Basic | 1 RTX Pro 6000 class GPU, around 48-96 GB VRAM | Department deployment | `llama3.1:8b`, `mistral:7b`, `codellama:13b`, `nomic-embed-text` |
| Pro | 2+ high VRAM GPUs | Multi-team deployment, heavier coding/RAG/fine-tuning workflows | Entry models plus `llama3.1:70b`, `mixtral:8x7b`, `deepseek-coder-v2:16b` |
| Max | 4-8 enterprise GPUs such as H100/H200/A100 class | Enterprise deployment, large models, high concurrency | Pro models plus `llama3.1:405b`, `mixtral:8x22b` |
Large models can be pulled later. The ISO does not need to contain them.
```bash
bash models/pull-models.sh --tier=starter
bash models/pull-models.sh --tier=basic
bash models/pull-models.sh --tier=pro
bash models/pull-models.sh --tier=max
```
## 7. What Gets Installed
All tiers install the Nexus One AI portal, backend API, nginx, health/readiness
reporting, license/tier handling, and selected AI tools.
| Component | Port | Notes |
|---|---:|---|
| Nexus One AI portal | 80 | Main UI served by nginx. |
| cezen-api backend | 8080 | FastAPI backend, systemd service `cezen-api`. |
| Ollama | 11434 | Local model inference. |
| Open WebUI | 3001 | Chat UI. |
| ChromaDB | 8100 | Vector database for RAG. |
| vLLM | 8000 | OpenAI-compatible serving path, mainly Pro/Max. |
| JupyterLab | 8888 | Notebook environment. |
| MLflow | 5000 | Experiment tracking. |
| MinIO | 9001 | S3-compatible object/model storage. |
| Grafana | 3000 | Monitoring dashboard. |
## 8. Admin And Readiness APIs
| API | Purpose | | API | Purpose |
|---|---| |---|---|
| `GET /api/license` | Shows current tier, feature matrix, and whether the tier is locked by Cezen. | | `GET /api/license` | Current tier, feature matrix, and safe license metadata. |
| `GET /api/system/feasibility` | Returns the generated hardware feasibility report or live fallback. | | `GET /api/system/feasibility` | Hardware feasibility report or live fallback. |
| `GET /api/system/readiness-report` | Combines license, feasibility, and install readiness into a customer-facing report payload. | | `GET /api/system/readiness-report` | License + feasibility + install readiness payload. |
| `GET /api/audit/report?days=7` | Basic audit summary for handover and admin review. | | `GET /api/audit/report?days=7` | Audit summary for handover/admin review. |
| `GET /api/system/backups` | Lists local backups. | | `GET /api/system/backups` | List local backups. |
| `POST /api/system/backups` | Creates a local backup of Cezen data. | | `POST /api/system/backups` | Create local backup. |
| `POST /api/system/backups/{name}/restore` | Restores a named local backup and creates a pre-restore safety snapshot. | | `POST /api/system/backups/{name}/restore` | Restore backup with pre-restore safety snapshot. |
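These endpoints can be exercised from the server itself with `curl`. The sketch below assumes the backend listens on `http://localhost:8080`, the address used elsewhere in this README. `CEZEN_API_BASE`, `cezen_api_url`, and `cezen_api_get` are hypothetical names, and no authentication is assumed.

```bash
# Base URL of the backend; override for remote checks (assumed variable name).
CEZEN_API_BASE="${CEZEN_API_BASE:-http://localhost:8080}"

# Build a full URL from an API path, avoiding a doubled slash.
cezen_api_url() {
  printf '%s/%s\n' "${CEZEN_API_BASE%/}" "${1#/}"
}

# GET an endpoint; -f makes curl return non-zero on HTTP errors.
cezen_api_get() {
  curl -fsS "$(cezen_api_url "$1")"
}
```

For example, `cezen_api_get /api/system/readiness-report` fetches the readiness payload, and `cezen_api_get '/api/audit/report?days=7'` fetches the audit summary.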
CLI backup helper: Backup helper:
```bash
sudo bash scripts/cezen-backup.sh backup
@@ -67,74 +190,84 @@ sudo bash scripts/cezen-backup.sh list
sudo bash scripts/cezen-backup.sh restore /opt/cezen/backups/cezen-backup-YYYYmmdd-HHMMSS.zip
```
## What Gets Installed (Entry Tier) ## 9. Post-Install Checks
| Service | Port | Notes | Run these after install:
|---|---|---|
| Ollama | 11434 | LLM inference, 2 models pre-loaded | ```bash
| Open WebUI | 3001 | Chat interface | systemctl status cezen-api --no-pager
| vLLM | 8000 | OpenAI-compatible API (start manually) | systemctl status cezen-phase2.service --no-pager
| JupyterLab | 8888 | Token: `cezen2024` | curl -s http://localhost:8080/api/settings/branding
| ChromaDB | 8100 | Vector DB for RAG | curl -s http://localhost:8080/api/system/feasibility
| MLflow | 5000 | Experiment tracking | ```
| MinIO | 9001 | Object storage (user: cezenadmin / Cezen@2024!) |
| Grafana | 3000 | GPU + system monitoring (admin / cezen2024) | Check service ports:
## Testing Without a GPU (Multipass) ```bash
ss -lntp
```
Check Ollama models:
```bash
curl -s http://localhost:11434/api/tags
```
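The port check can be scripted by comparing the `ss` listing against the ports expected for the installed components. This is a sketch only: `missing_ports` is a hypothetical helper, and the port list passed to it should match the tools actually selected during setup.

```bash
# Hypothetical helper: print each expected port that is absent from an
# `ss -lnt` listing. First argument is the listing, the rest are ports.
missing_ports() {
  local listing="$1"
  shift
  local port
  for port in "$@"; do
    # ss prints local addresses as ADDR:PORT, so match ":PORT" followed by
    # whitespace or end of line to avoid 80 matching 8080.
    if ! printf '%s\n' "$listing" | grep -Eq ":${port}([[:space:]]|\$)"; then
      printf '%s\n' "$port"
    fi
  done
}
```

For example, `missing_ports "$(ss -lnt)" 80 8080 11434 3001 8100` prints nothing when the portal, backend, Ollama, Open WebUI, and ChromaDB are all listening.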
## 10. Test Without A GPU
On a MacBook:
```bash ```bash
# On your MacBook:
multipass launch 22.04 --name cezen-test --cpus 4 --mem 8G --disk 40G multipass launch 22.04 --name cezen-test --cpus 4 --mem 8G --disk 40G
multipass shell cezen-test multipass shell cezen-test
# Inside the VM:
git clone <cgit-url>
sudo bash install.sh
``` ```
NVIDIA driver install will succeed but `nvidia-smi` won't show GPUs — that's expected. All other services will run fine. Inside the VM:
## Pull More Models
```bash ```bash
bash models/pull-models.sh --tier=starter # phi3:mini + embeddings git clone https://cgit.cezentech.com/jinojose/aipackage.git
bash models/pull-models.sh --tier=basic # llama3.1:8b, mistral:7b, codellama cd aipackage
bash models/pull-models.sh --tier=pro # + llama3.1:70b, mixtral, deepseek-coder sudo bash install.sh --feasibility-only
bash models/pull-models.sh --tier=max # + llama3.1:405b, mixtral:8x22b sudo bash install.sh --software-only --profile=auto --skip-model-pull
``` ```
## File Structure No GPU will be detected. That is expected.
``` ## 11. Change Default Passwords Before Customer Handover
cgit/
├── install.sh ← Entry point
├── ansible/
│ ├── phase1_nvidia.yml ← Phase 1: drivers (triggers reboot)
│ ├── starter.yml ← Phase 2: Starter tier (1 GPU, small team)
│ ├── entry.yml ← Phase 2: Basic tier (1-2 GPU, department)
│ ├── pro.yml ← Phase 2: Pro tier (2+ GPU, multi-team)
│ ├── max.yml ← Phase 2: Max tier (4-8 GPU, enterprise)
│ └── roles/
│ ├── base/ ← OS, Python, Miniconda, LangChain
│ ├── nvidia/ ← Drivers, CUDA 12.4, cuDNN 9
│ ├── docker/ ← Docker CE + NVIDIA Container Toolkit
│ ├── k3s/ ← Lightweight Kubernetes
│ ├── ollama/ ← Ollama + Open WebUI
│ ├── vllm/ ← vLLM inference server
│ ├── jupyterlab/ ← JupyterLab notebooks
│ ├── chromadb/ ← Vector database
│ ├── mlflow/ ← Experiment tracking
│ ├── minio/ ← Object storage
│ └── monitoring/ ← Grafana + Prometheus + DCGM
└── models/
└── pull-models.sh ← Pull additional models
```
## Change Default Passwords Before shipping to a customer, rotate these:
Before shipping to a customer, update these:
- Initial OS/admin account password.
- JupyterLab token: `/opt/cezen/.jupyter/jupyter_lab_config.py` - JupyterLab token: `/opt/cezen/.jupyter/jupyter_lab_config.py`
- MinIO: `/etc/default/minio` - MinIO credentials: `/etc/default/minio`
- Grafana: environment vars in monitoring role, or via UI after first login - Grafana admin password.
- MLflow: no auth by default (add reverse proxy if needed) - Any temporary portal/backend admin credentials.
- Any staging license key if the final license is issued later.
## 12. Useful Files
```text
cgit/
├── install.sh # Main installer entry point
├── autoinstall/ # ISO first-boot setup and web setup
├── scripts/cezen-feasibility.sh # Existing-server feasibility checker
├── scripts/cezen-backup.sh # Backup/restore helper
├── ansible/
│ ├── phase1_nvidia.yml # NVIDIA/CUDA phase
│ ├── starter.yml # Starter tier
│ ├── entry.yml # Entry/Basic tier
│ ├── pro.yml # Pro tier
│ ├── max.yml # Max tier
│ └── roles/
│ ├── cezen-backend/ # FastAPI backend, cezen-api service
│ ├── cezen-nginx/ # Portal/nginx deployment
│ ├── ollama/ # Ollama + Open WebUI
│ ├── chromadb/ # RAG vector DB
│ ├── vllm/ # vLLM serving
│ ├── jupyterlab/ # Notebooks
│ ├── mlflow/ # Experiment tracking
│ ├── minio/ # Object storage
│ └── monitoring/ # Grafana/Prometheus/DCGM
├── cezen-portal/ # Packaged portal UI
└── models/pull-models.sh # Pull tier-specific models
```