Document installer field workflow
parent f407a9331e
commit 3ee2698e87
317
README.md
@@ -1,65 +1,188 @@
|
|||||||
# Nexus One AI — Installer
|
# Nexus One AI Installer
|
||||||
|
|
||||||
## Quick Start
|
This repository is the source of truth for Nexus One AI ISO and server installs.
|
||||||
|
The ISO keeps itself small by pulling this package from cgit during setup, then
|
||||||
|
the installer deploys the selected tier on the target server.
|
||||||
|
|
||||||
```bash
|
## 1. Choose The Install Path
|
||||||
git clone <cgit-url>
|
|
||||||
cd cgit
|
| Scenario | Use This Path |
|
||||||
sudo bash install.sh
|
|---|---|
|
||||||
|
| New appliance/server with ISO | Boot from the Nexus One AI ISO and complete first-boot setup. |
|
||||||
|
| PSU/offline field install by pendrive | Boot the ISO from USB, enter license/tier details during first boot, and upload large models later if needed. |
|
||||||
|
| Existing Ubuntu server | Clone this repo and run the feasibility check before installing. |
|
||||||
|
| Lab test without GPU | Use Multipass/VM and expect GPU services to be limited. |
|
||||||
|
|
||||||
|
## 2. New ISO Install
|
||||||
|
|
||||||
|
1. Flash the Nexus One AI ISO to a USB drive or attach it to the VM/server.
|
||||||
|
2. Boot the server from the ISO.
|
||||||
|
3. On the Ubuntu installer network screen, choose DHCP or the final static IP.
|
||||||
|
4. Let Ubuntu finish installation and reboot.
|
||||||
|
5. On first boot, open the setup URL shown on the server console:
|
||||||
|
|
||||||
|
```text
|
||||||
|
http://<server-ip>
|
||||||
```
|
```
|
||||||
|
|
||||||
Server reboots automatically after NVIDIA drivers install. Phase 2 runs on its own after reboot.
|
6. Complete the setup wizard:
|
||||||
|
- Network: DHCP or static IP.
|
||||||
On the custom ISO, Ubuntu autoinstall now pauses on the installer network screen so the operator can choose the final IP address from the VM console before installation continues.
|
- License & customer details: customer name, project/customer ID, contact email, license key, support date.
|
||||||
|
- Tier: Starter, Entry, Pro, or Max.
|
||||||
## Software-Only / Existing Hardware
|
- Tools: keep defaults unless a component should be skipped.
|
||||||
|
7. Click **Start Installation**.
|
||||||
Run a feasibility scan before quoting or installing on customer-owned hardware:
|
8. Wait for Phase 1 NVIDIA driver setup. The server may reboot once.
|
||||||
|
9. After reboot, Phase 2 continues automatically through `cezen-phase2.service`.
|
||||||
|
10. Monitor progress:
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
bash scripts/cezen-feasibility.sh
|
ssh cezen@<server-ip>
|
||||||
|
sudo journalctl -fu cezen-phase2.service
|
||||||
|
sudo tail -f /var/log/cezen-install.log
|
||||||
```
|
```
|
||||||
|
|
||||||
The checker reports CPU, RAM, disk, NVIDIA GPU/VRAM, tool readiness, available features, and a recommended Cezen profile. It writes JSON to `/opt/cezen/feasibility.json` when possible, otherwise `./feasibility.json`.
|
11. Open the portal after install:
|
||||||
|
|
||||||
Install on existing hardware without the appliance NVIDIA phase:
|
```text
|
||||||
|
http://<server-ip>/
|
||||||
|
```
|
||||||
|
|
||||||
|
## 3. PSU / Pendrive Field Install
|
||||||
|
|
||||||
|
Use this when a team physically visits the site and installs from a USB drive.
|
||||||
|
|
||||||
|
1. Carry the latest ISO on a bootable USB drive.
|
||||||
|
2. Boot the PSU/customer server from the USB.
|
||||||
|
3. Configure the final network on the Ubuntu installer screen.
|
||||||
|
4. After first boot, use either:
|
||||||
|
- Browser setup at `http://<server-ip>`, or
|
||||||
|
- Physical console terminal wizard if no browser is available.
|
||||||
|
5. Enter customer/license details during setup. If the final license key is not available, leave it blank; the system records the install as field staging/evaluation.
|
||||||
|
6. Select the commercial tier sold to the customer.
|
||||||
|
7. Complete install.
|
||||||
|
8. Upload or pull large models later, once bandwidth and storage are confirmed.
|
||||||
|
|
||||||
|
License details are stored on the installed server at:
|
||||||
|
|
||||||
|
```text
|
||||||
|
/opt/cezen/license.json
|
||||||
|
```
|
||||||
|
|
||||||
|
Installer selections are stored at:
|
||||||
|
|
||||||
|
```text
|
||||||
|
/opt/cezen/install.conf
|
||||||
|
```
|
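A quick way to confirm that both records exist after a field install is sketched below. It only checks for the two files named above and reports their size; the function name `show_install_records`, its optional directory argument, and the output wording are illustrative assumptions, not part of the installer.

```bash
#!/usr/bin/env bash
# Sketch: report whether the license and installer records are present.
# The function name, the directory argument, and the messages are
# hypothetical; only the two file names come from the README.

show_install_records() {
    local dir="${1:-/opt/cezen}"
    local name path
    for name in license.json install.conf; do
        path="$dir/$name"
        if [ -r "$path" ]; then
            # wc output is padded on some systems, so strip the spaces.
            echo "found: $path ($(wc -c < "$path" | tr -d ' ') bytes)"
        else
            echo "missing: $path"
        fi
    done
}
```

Calling `show_install_records` with no argument inspects `/opt/cezen`; pass another directory to check a copy taken off the server.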
||||||
|
|
||||||
|
## 4. Existing Server Feasibility Check
|
||||||
|
|
||||||
|
Run this before quoting, committing a tier, or installing on customer-owned hardware.
|
||||||
|
|
||||||
|
```bash
|
||||||
|
git clone https://cgit.cezentech.com/jinojose/aipackage.git
|
||||||
|
cd aipackage
|
||||||
|
sudo bash install.sh --feasibility-only
|
||||||
|
```
|
||||||
|
|
||||||
|
The report checks CPU, RAM, disk, NVIDIA GPU/VRAM, and likely supported features.
|
||||||
|
It writes JSON to:
|
||||||
|
|
||||||
|
```text
|
||||||
|
/opt/cezen/feasibility.json
|
||||||
|
```
|
||||||
|
|
||||||
|
If `/opt/cezen` is not writable, it writes to:
|
||||||
|
|
||||||
|
```text
|
||||||
|
./feasibility.json
|
||||||
|
```
|
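Because the report can land in either location, a script that consumes it has to look in both. The sketch below prefers the primary path and falls back to the working directory; the function name `find_feasibility_report` and its two optional arguments are assumptions made for illustration.

```bash
#!/usr/bin/env bash
# Sketch: locate the feasibility report, preferring the primary location.
# The function name and its optional arguments are hypothetical; the two
# default paths are the ones listed in the README.

find_feasibility_report() {
    local primary="${1:-/opt/cezen/feasibility.json}"
    local fallback="${2:-./feasibility.json}"
    if [ -r "$primary" ]; then
        echo "$primary"
    elif [ -r "$fallback" ]; then
        echo "$fallback"
    else
        echo "no feasibility report found" >&2
        return 1
    fi
}
```

For example, `report="$(find_feasibility_report)" && cat "$report"` prints whichever copy exists and stops quietly when neither does.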
||||||
|
|
||||||
|
Recommended interpretation:
|
||||||
|
|
||||||
|
| Result | Meaning |
|
||||||
|
|---|---|
|
||||||
|
| `core` | Portal/backend only; no local model serving recommended. |
|
||||||
|
| `cpu-ai` | CPU-only RAG/chat possible, but constrained. |
|
||||||
|
| `gpu-starter` | Starter GPU deployment. |
|
||||||
|
| `gpu-standard` | Entry tier style deployment. |
|
||||||
|
| `gpu-pro` | Pro tier candidate. |
|
||||||
|
| `gpu-max` | Max tier candidate. |
|
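One possible reading of the table, expressed as a lookup from feasibility result to the `--tier` values used elsewhere in this README, is sketched below. The mapping is an assumption drawn from the table wording ("Entry tier style" is taken to mean `basic`); it does not claim to match the installer's own logic, and `core` and `cpu-ai` are reported as `none` because the table names no tier for them.

```bash
#!/usr/bin/env bash
# Sketch: translate a feasibility result into a suggested --tier value.
# The function name and the mapping itself are assumptions based on the
# interpretation table; the installer may decide differently.

profile_to_tier() {
    case "$1" in
        gpu-starter)  echo "starter" ;;
        gpu-standard) echo "basic" ;;
        gpu-pro)      echo "pro" ;;
        gpu-max)      echo "max" ;;
        core|cpu-ai)  echo "none" ;;   # no tier named in the table
        *) echo "unknown profile: $1" >&2; return 1 ;;
    esac
}
```

The result is only a hardware-side suggestion for quoting; the tier actually installed remains a commercial decision.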
||||||
|
|
||||||
|
## 5. Existing Server Install
|
||||||
|
|
||||||
|
After the feasibility check, install on an existing Ubuntu server:
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
sudo bash install.sh --software-only --profile=auto
|
sudo bash install.sh --software-only --profile=auto
|
||||||
```
|
```
|
||||||
|
|
||||||
For small systems or slow customer networks, the installer skips default model downloads on lightweight profiles. To force the same behavior manually:
|
For small systems or slow customer networks, skip default model downloads:
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
sudo bash install.sh --software-only --profile=cpu-ai --skip-model-pull
|
sudo bash install.sh --software-only --profile=cpu-ai --skip-model-pull
|
||||||
```
|
```
|
||||||
|
|
||||||
Profiles:
|
To force a commercial tier:
|
||||||
|
|
||||||
| Profile | Use When | Installs |
|
```bash
|
||||||
|---|---|---|
|
sudo bash install.sh --software-only --tier=starter
|
||||||
| `core` | no GPU / low RAM | portal, backend, nginx, health/metrics API |
|
sudo bash install.sh --software-only --tier=basic
|
||||||
| `cpu-ai` | 32 GB+ RAM, no usable GPU | core + Chroma/Ollama CPU path, model pull optional |
|
sudo bash install.sh --software-only --tier=pro
|
||||||
| `gpu-starter` | 24-32 GB VRAM | local AI starter stack, model pull optional |
|
sudo bash install.sh --software-only --tier=max
|
||||||
| `gpu-standard` | 48-96 GB VRAM | standard GPU stack |
|
```
|
||||||
| `gpu-pro` | multi/high-VRAM GPU | advanced GPU stack |
|
|
||||||
| `gpu-max` | multi-node or HGX-class | full stack, custom sizing |
|
|
||||||
|
|
||||||
## Sellable v1 Admin APIs
|
The installer warns if the selected tier and the hardware recommendation do not match.
|
||||||
|
The selected tier still wins, because the sale/license decision is commercial.
|
||||||
|
|
||||||
The backend exposes the first productization APIs for software-only and appliance deployments:
|
## 6. Tier Guide
|
||||||
|
|
||||||
|
| Tier | Target Hardware | Typical Use | Default Models |
|
||||||
|
|---|---|---|---|
|
||||||
|
| Starter | 1 GPU around 24-32 GB VRAM, or constrained CPU system | Small team, RAG/admin portal, light chat | `phi3:mini`, `nomic-embed-text` |
|
||||||
|
| Entry / Basic | 1 RTX Pro 6000 class GPU, around 48-96 GB VRAM | Department deployment | `llama3.1:8b`, `mistral:7b`, `codellama:13b`, `nomic-embed-text` |
|
||||||
|
| Pro | 2+ high VRAM GPUs | Multi-team deployment, heavier coding/RAG/fine-tuning workflows | Entry models plus `llama3.1:70b`, `mixtral:8x7b`, `deepseek-coder-v2:16b` |
|
||||||
|
| Max | 4-8 enterprise GPUs such as H100/H200/A100 class | Enterprise deployment, large models, high concurrency | Pro models plus `llama3.1:405b`, `mixtral:8x22b` |
|
||||||
|
|
||||||
|
Large models can be pulled later. The ISO does not need to contain them.
|
||||||
|
|
||||||
|
```bash
|
||||||
|
bash models/pull-models.sh --tier=starter
|
||||||
|
bash models/pull-models.sh --tier=basic
|
||||||
|
bash models/pull-models.sh --tier=pro
|
||||||
|
bash models/pull-models.sh --tier=max
|
||||||
|
```
|
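The "plus" wording in the tier table means the model lists are cumulative. The sketch below spells that out so the download size of a tier can be estimated before pulling; the function name `models_for_tier` is hypothetical, and the lists are copied from the table rather than from `models/pull-models.sh`, whose contents are not shown here.

```bash
#!/usr/bin/env bash
# Sketch: expand the cumulative default model list for a tier.
# The function name is hypothetical; model names come from the tier table,
# not from pull-models.sh itself.

models_for_tier() {
    case "$1" in
        starter) echo "phi3:mini nomic-embed-text" ;;
        basic)   echo "llama3.1:8b mistral:7b codellama:13b nomic-embed-text" ;;
        pro)     echo "$(models_for_tier basic) llama3.1:70b mixtral:8x7b deepseek-coder-v2:16b" ;;
        max)     echo "$(models_for_tier pro) llama3.1:405b mixtral:8x22b" ;;
        *) echo "unknown tier: $1" >&2; return 1 ;;
    esac
}
```

Running `models_for_tier pro | tr ' ' '\n'` lists one model per line, which is convenient when checking storage on a constrained site.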
||||||
|
|
||||||
|
## 7. What Gets Installed
|
||||||
|
|
||||||
|
All tiers install the Nexus One AI portal, backend API, nginx, health/readiness
|
||||||
|
reporting, license/tier handling, and selected AI tools.
|
||||||
|
|
||||||
|
| Component | Port | Notes |
|
||||||
|
|---|---:|---|
|
||||||
|
| Nexus One AI portal | 80 | Main UI served by nginx. |
|
||||||
|
| cezen-api backend | 8080 | FastAPI backend, systemd service `cezen-api`. |
|
||||||
|
| Ollama | 11434 | Local model inference. |
|
||||||
|
| Open WebUI | 3001 | Chat UI. |
|
||||||
|
| ChromaDB | 8100 | Vector database for RAG. |
|
||||||
|
| vLLM | 8000 | OpenAI-compatible serving path, mainly Pro/Max. |
|
||||||
|
| JupyterLab | 8888 | Notebook environment. |
|
||||||
|
| MLflow | 5000 | Experiment tracking. |
|
||||||
|
| MinIO | 9001 | S3-compatible object/model storage. |
|
||||||
|
| Grafana | 3000 | Monitoring dashboard. |
|
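To see which of these components are actually listening, the ports in the table can be probed in a loop. This is a minimal sketch: the short names in `CEZEN_PORTS` and both helper functions are illustrative assumptions, and the probe relies on `timeout` from GNU coreutils and on a bash build with `/dev/tcp` redirection support.

```bash
#!/usr/bin/env bash
# Sketch: probe the component ports listed in the table above.
# The short names and helper functions are hypothetical; the port numbers
# are the ones from the README table.

CEZEN_PORTS="portal:80 cezen-api:8080 ollama:11434 open-webui:3001 chromadb:8100 vllm:8000 jupyterlab:8888 mlflow:5000 minio:9001 grafana:3000"

# Succeeds if a TCP connection to host:port opens within one second.
port_open() {
    timeout 1 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

# Prints "<name> <port> open|closed" for every entry in CEZEN_PORTS.
report_ports() {
    local host="${1:-127.0.0.1}" entry name port
    for entry in $CEZEN_PORTS; do
        name="${entry%%:*}"
        port="${entry##*:}"
        if port_open "$host" "$port"; then
            echo "$name $port open"
        else
            echo "$name $port closed"
        fi
    done
}
```

`report_ports` checks the local machine by default; pass the server IP to probe from another host. A closed port is not necessarily a fault, since some components apply only to certain tiers or may have been skipped in the setup wizard.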
||||||
|
|
||||||
|
## 8. Admin And Readiness APIs
|
||||||
|
|
||||||
| API | Purpose |
|
| API | Purpose |
|
||||||
|---|---|
|
|---|---|
|
||||||
| `GET /api/license` | Shows current tier, feature matrix, and whether the tier is locked by Cezen. |
|
| `GET /api/license` | Current tier, feature matrix, and safe license metadata. |
|
||||||
| `GET /api/system/feasibility` | Returns the generated hardware feasibility report or live fallback. |
|
| `GET /api/system/feasibility` | Hardware feasibility report or live fallback. |
|
||||||
| `GET /api/system/readiness-report` | Combines license, feasibility, and install readiness into a customer-facing report payload. |
|
| `GET /api/system/readiness-report` | License + feasibility + install readiness payload. |
|
||||||
| `GET /api/audit/report?days=7` | Basic audit summary for handover and admin review. |
|
| `GET /api/audit/report?days=7` | Audit summary for handover/admin review. |
|
||||||
| `GET /api/system/backups` | Lists local backups. |
|
| `GET /api/system/backups` | List local backups. |
|
||||||
| `POST /api/system/backups` | Creates a local backup of Cezen data. |
|
| `POST /api/system/backups` | Create local backup. |
|
||||||
| `POST /api/system/backups/{name}/restore` | Restores a named local backup and creates a pre-restore safety snapshot. |
|
| `POST /api/system/backups/{name}/restore` | Restore backup with pre-restore safety snapshot. |
|
||||||
|
|
||||||
CLI backup helper:
|
Backup helper:
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
sudo bash scripts/cezen-backup.sh backup
|
sudo bash scripts/cezen-backup.sh backup
|
||||||
@@ -67,74 +190,84 @@ sudo bash scripts/cezen-backup.sh list
|
|||||||
sudo bash scripts/cezen-backup.sh restore /opt/cezen/backups/cezen-backup-YYYYmmdd-HHMMSS.zip
|
sudo bash scripts/cezen-backup.sh restore /opt/cezen/backups/cezen-backup-YYYYmmdd-HHMMSS.zip
|
||||||
```
|
```
|
||||||
|
|
||||||
## What Gets Installed (Entry Tier)
|
## 9. Post-Install Checks
|
||||||
|
|
||||||
| Service | Port | Notes |
|
Run these after install:
|
||||||
|---|---|---|
|
|
||||||
| Ollama | 11434 | LLM inference, 2 models pre-loaded |
|
```bash
|
||||||
| Open WebUI | 3001 | Chat interface |
|
systemctl status cezen-api --no-pager
|
||||||
| vLLM | 8000 | OpenAI-compatible API (start manually) |
|
systemctl status cezen-phase2.service --no-pager
|
||||||
| JupyterLab | 8888 | Token: `cezen2024` |
|
curl -s http://localhost:8080/api/settings/branding
|
||||||
| ChromaDB | 8100 | Vector DB for RAG |
|
curl -s http://localhost:8080/api/system/feasibility
|
||||||
| MLflow | 5000 | Experiment tracking |
|
```
|
||||||
| MinIO | 9001 | Object storage (user: cezenadmin / Cezen@2024!) |
|
|
||||||
| Grafana | 3000 | GPU + system monitoring (admin / cezen2024) |
|
Check service ports:
|
||||||
|
|
||||||
## Testing Without a GPU (Multipass)
|
```bash
|
||||||
|
ss -lntp
|
||||||
|
```
|
||||||
|
|
||||||
|
Check Ollama models:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
curl -s http://localhost:11434/api/tags
|
||||||
|
```
|
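The `/api/tags` response is a JSON document with a `models` array whose entries carry a `name` field, so a handover script can check for a specific model. The sketch below uses a plain text match rather than a JSON parser, which is an assumption that holds for simple responses but is not a robust parse; both function names are hypothetical.

```bash
#!/usr/bin/env bash
# Sketch: check whether a model appears in an Ollama /api/tags response.
# Function names are hypothetical. The extraction is a rough text match on
# "name" fields, not a full JSON parse.

# Reads the JSON on stdin and prints one model name per line.
list_model_names() {
    grep -o '"name":[[:space:]]*"[^"]*"' | sed 's/.*"\([^"]*\)"$/\1/'
}

# Reads the JSON on stdin; succeeds if the exact name in $1 is present.
has_model() {
    list_model_names | grep -Fxq "$1"
}
```

A typical call is `curl -s http://localhost:11434/api/tags | has_model phi3:mini && echo present`. The match is exact, so a model that Ollama reports with a `:latest` suffix must be queried with that suffix.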
||||||
|
|
||||||
|
## 10. Test Without A GPU
|
||||||
|
|
||||||
|
On a MacBook:
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
# On your MacBook:
|
|
||||||
multipass launch 22.04 --name cezen-test --cpus 4 --mem 8G --disk 40G
|
multipass launch 22.04 --name cezen-test --cpus 4 --mem 8G --disk 40G
|
||||||
multipass shell cezen-test
|
multipass shell cezen-test
|
||||||
|
|
||||||
# Inside the VM:
|
|
||||||
git clone <cgit-url>
|
|
||||||
sudo bash install.sh
|
|
||||||
```
|
```
|
||||||
|
|
||||||
NVIDIA driver install will succeed but `nvidia-smi` won't show GPUs — that's expected. All other services will run fine.
|
Inside the VM:
|
||||||
|
|
||||||
## Pull More Models
|
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
bash models/pull-models.sh --tier=starter # phi3:mini + embeddings
|
git clone https://cgit.cezentech.com/jinojose/aipackage.git
|
||||||
bash models/pull-models.sh --tier=basic # llama3.1:8b, mistral:7b, codellama
|
cd aipackage
|
||||||
bash models/pull-models.sh --tier=pro # + llama3.1:70b, mixtral, deepseek-coder
|
sudo bash install.sh --feasibility-only
|
||||||
bash models/pull-models.sh --tier=max # + llama3.1:405b, mixtral:8x22b
|
sudo bash install.sh --software-only --profile=auto --skip-model-pull
|
||||||
```
|
```
|
||||||
|
|
||||||
## File Structure
|
No GPU will be detected. That is expected.
|
||||||
|
|
||||||
```
|
## 11. Change Default Passwords Before Customer Handover
|
||||||
cgit/
|
|
||||||
├── install.sh ← Entry point
|
|
||||||
├── ansible/
|
|
||||||
│ ├── phase1_nvidia.yml ← Phase 1: drivers (triggers reboot)
|
|
||||||
│ ├── starter.yml ← Phase 2: Starter tier (1 GPU, small team)
|
|
||||||
│ ├── entry.yml ← Phase 2: Basic tier (1–2 GPU, department)
|
|
||||||
│ ├── pro.yml ← Phase 2: Pro tier (2+ GPU, multi-team)
|
|
||||||
│ ├── max.yml ← Phase 2: Max tier (4–8 GPU, enterprise)
|
|
||||||
│ └── roles/
|
|
||||||
│ ├── base/ ← OS, Python, Miniconda, LangChain
|
|
||||||
│ ├── nvidia/ ← Drivers, CUDA 12.4, cuDNN 9
|
|
||||||
│ ├── docker/ ← Docker CE + NVIDIA Container Toolkit
|
|
||||||
│ ├── k3s/ ← Lightweight Kubernetes
|
|
||||||
│ ├── ollama/ ← Ollama + Open WebUI
|
|
||||||
│ ├── vllm/ ← vLLM inference server
|
|
||||||
│ ├── jupyterlab/ ← JupyterLab notebooks
|
|
||||||
│ ├── chromadb/ ← Vector database
|
|
||||||
│ ├── mlflow/ ← Experiment tracking
|
|
||||||
│ ├── minio/ ← Object storage
|
|
||||||
│ └── monitoring/ ← Grafana + Prometheus + DCGM
|
|
||||||
└── models/
|
|
||||||
└── pull-models.sh ← Pull additional models
|
|
||||||
```
|
|
||||||
|
|
||||||
## Change Default Passwords
|
Before shipping to a customer, rotate these:
|
||||||
|
|
||||||
Before shipping to a customer, update these:
|
|
||||||
|
|
||||||
|
- Initial OS/admin account password.
|
||||||
- JupyterLab token: `/opt/cezen/.jupyter/jupyter_lab_config.py`
|
- JupyterLab token: `/opt/cezen/.jupyter/jupyter_lab_config.py`
|
||||||
- MinIO: `/etc/default/minio`
|
- MinIO credentials: `/etc/default/minio`
|
||||||
- Grafana: environment vars in monitoring role, or via UI after first login
|
- Grafana admin password.
|
||||||
- MLflow: no auth by default (add reverse proxy if needed)
|
- Any temporary portal/backend admin credentials.
|
||||||
|
- Any staging license key if the final license is issued later.
|
||||||
|
|
||||||
|
## 12. Useful Files
|
||||||
|
|
||||||
|
```text
|
||||||
|
cgit/
|
||||||
|
├── install.sh # Main installer entry point
|
||||||
|
├── autoinstall/ # ISO first-boot setup and web setup
|
||||||
|
├── scripts/cezen-feasibility.sh # Existing-server feasibility checker
|
||||||
|
├── scripts/cezen-backup.sh # Backup/restore helper
|
||||||
|
├── ansible/
|
||||||
|
│ ├── phase1_nvidia.yml # NVIDIA/CUDA phase
|
||||||
|
│ ├── starter.yml # Starter tier
|
||||||
|
│ ├── entry.yml # Entry/Basic tier
|
||||||
|
│ ├── pro.yml # Pro tier
|
||||||
|
│ ├── max.yml # Max tier
|
||||||
|
│ └── roles/
|
||||||
|
│ ├── cezen-backend/ # FastAPI backend, cezen-api service
|
||||||
|
│ ├── cezen-nginx/ # Portal/nginx deployment
|
||||||
|
│ ├── ollama/ # Ollama + Open WebUI
|
||||||
|
│ ├── chromadb/ # RAG vector DB
|
||||||
|
│ ├── vllm/ # vLLM serving
|
||||||
|
│ ├── jupyterlab/ # Notebooks
|
||||||
|
│ ├── mlflow/ # Experiment tracking
|
||||||
|
│ ├── minio/ # Object storage
|
||||||
|
│ └── monitoring/ # Grafana/Prometheus/DCGM
|
||||||
|
├── cezen-portal/ # Packaged portal UI
|
||||||
|
└── models/pull-models.sh # Pull tier-specific models
|
||||||
|
```
|
||||||
|
|||||||