aipackage/README.md
2026-06-30 12:31:09 +05:30

274 lines
9.1 KiB
Markdown

# Nexus One AI Installer
This repository is the source of truth for Nexus One AI ISO and server installs.
The ISO keeps itself small by pulling this package from cgit during setup, then
the installer deploys the selected tier on the target server.
## 1. Choose The Install Path
| Scenario | Use This Path |
|---|---|
| New appliance/server with ISO | Boot from the Nexus One AI ISO and complete first-boot setup. |
| PSU/offline field install by pendrive | Boot the ISO from USB, enter license/tier details during first boot, upload large models later if needed. |
| Existing Ubuntu server | Clone this repo and run the feasibility check before installing. |
| Lab test without GPU | Use Multipass/VM and expect GPU services to be limited. |
## 2. New ISO Install
1. Flash the Nexus One AI ISO to a USB drive or attach it to the VM/server.
2. Boot the server from the ISO.
3. On the Ubuntu installer network screen, choose DHCP or the final static IP.
4. Let Ubuntu finish installation and reboot.
5. On first boot, open the setup URL shown on the server console:
```text
http://<server-ip>
```
6. Complete the setup wizard:
- Network: DHCP or static IP.
- License & customer details: customer name, project/customer ID, contact email, license key, support date.
- Tier: Starter, Entry, Pro, or Max.
- Tools: keep defaults unless a component should be skipped.
7. Click **Start Installation**.
8. Wait for Phase 1 NVIDIA driver setup. The server may reboot once.
9. After reboot, Phase 2 continues automatically through `cezen-phase2.service`.
10. Monitor progress:
```bash
ssh cezen@<server-ip>
sudo journalctl -fu cezen-phase2.service
sudo tail -f /var/log/cezen-install.log
```
11. Open the portal after install:
```text
http://<server-ip>/
```
## 3. PSU / Pendrive Field Install
Use this when a team physically visits the site and installs from a USB drive.
1. Carry the latest ISO on a bootable USB drive.
2. Boot the PSU/customer server from the USB.
3. Configure the final network on the Ubuntu installer screen.
4. After first boot, use either:
- Browser setup at `http://<server-ip>`, or
- Physical console terminal wizard if no browser is available.
5. Enter customer/license details during setup. If the final license key is not available, leave it blank; the system records the install as field staging/evaluation.
6. Select the commercial tier sold to the customer.
7. Complete install.
8. Upload or pull large models later after bandwidth/storage is confirmed.
License details are stored on the installed server at:
```text
/opt/cezen/license.json
```
Installer selections are stored at:
```text
/opt/cezen/install.conf
```
## 4. Existing Server Feasibility Check
Run this before quoting, committing a tier, or installing on customer-owned hardware.
```bash
git clone https://cgit.cezentech.com/jinojose/aipackage.git
cd aipackage
sudo bash install.sh --feasibility-only
```
The report checks CPU, RAM, disk, NVIDIA GPU/VRAM, and likely supported features.
It writes JSON to:
```text
/opt/cezen/feasibility.json
```
If `/opt/cezen` is not writable, it writes:
```text
./feasibility.json
```
Recommended interpretation:
| Result | Meaning |
|---|---|
| `core` | Portal/backend only; no local model serving recommended. |
| `cpu-ai` | CPU-only RAG/chat possible, but constrained. |
| `gpu-starter` | Starter GPU deployment. |
| `gpu-standard` | Entry tier style deployment. |
| `gpu-pro` | Pro tier candidate. |
| `gpu-max` | Max tier candidate. |
## 5. Existing Server Install
After feasibility check, install on an existing Ubuntu server:
```bash
sudo bash install.sh --software-only --profile=auto
```
For small systems or slow customer networks, skip default model downloads:
```bash
sudo bash install.sh --software-only --profile=cpu-ai --skip-model-pull
```
To force a commercial tier:
```bash
sudo bash install.sh --software-only --tier=starter
sudo bash install.sh --software-only --tier=basic
sudo bash install.sh --software-only --tier=pro
sudo bash install.sh --software-only --tier=max
```
The installer warns if selected tier and hardware recommendation do not match.
The selected tier still wins, because the sale/license decision is commercial.
## 6. Tier Guide
| Tier | Target Hardware | Typical Use | Default Models |
|---|---|---|---|
| Starter | 1 GPU around 24-32 GB VRAM, or constrained CPU system | Small team, RAG/admin portal, light chat | `phi3:mini`, `nomic-embed-text` |
| Entry / Basic | 1 RTX Pro 6000 class GPU, around 48-96 GB VRAM | Department deployment | `llama3.1:8b`, `mistral:7b`, `codellama:13b`, `nomic-embed-text` |
| Pro | 2+ high VRAM GPUs | Multi-team deployment, heavier coding/RAG/fine-tuning workflows | Entry models plus `llama3.1:70b`, `mixtral:8x7b`, `deepseek-coder-v2:16b` |
| Max | 4-8 enterprise GPUs such as H100/H200/A100 class | Enterprise deployment, large models, high concurrency | Pro models plus `llama3.1:405b`, `mixtral:8x22b` |
Large models can be pulled later. The ISO does not need to contain them.
```bash
bash models/pull-models.sh --tier=starter
bash models/pull-models.sh --tier=basic
bash models/pull-models.sh --tier=pro
bash models/pull-models.sh --tier=max
```
## 7. What Gets Installed
All tiers install the Nexus One AI portal, backend API, nginx, health/readiness
reporting, license/tier handling, and selected AI tools.
| Component | Port | Notes |
|---|---:|---|
| Nexus One AI portal | 80 | Main UI served by nginx. |
| cezen-api backend | 8080 | FastAPI backend, systemd service `cezen-api`. |
| Ollama | 11434 | Local model inference. |
| Open WebUI | 3001 | Chat UI. |
| ChromaDB | 8100 | Vector database for RAG. |
| vLLM | 8000 | OpenAI-compatible serving path, mainly Pro/Max. |
| JupyterLab | 8888 | Notebook environment. |
| MLflow | 5000 | Experiment tracking. |
| MinIO | 9001 | S3-compatible object/model storage. |
| Grafana | 3000 | Monitoring dashboard. |
## 8. Admin And Readiness APIs
| API | Purpose |
|---|---|
| `GET /api/license` | Current tier, feature matrix, and safe license metadata. |
| `GET /api/system/feasibility` | Hardware feasibility report or live fallback. |
| `GET /api/system/readiness-report` | License + feasibility + install readiness payload. |
| `GET /api/audit/report?days=7` | Audit summary for handover/admin review. |
| `GET /api/system/backups` | List local backups. |
| `POST /api/system/backups` | Create local backup. |
| `POST /api/system/backups/{name}/restore` | Restore backup with pre-restore safety snapshot. |
Backup helper:
```bash
sudo bash scripts/cezen-backup.sh backup
sudo bash scripts/cezen-backup.sh list
sudo bash scripts/cezen-backup.sh restore /opt/cezen/backups/cezen-backup-YYYYmmdd-HHMMSS.zip
```
## 9. Post-Install Checks
Run these after install:
```bash
systemctl status cezen-api --no-pager
systemctl status cezen-phase2.service --no-pager
curl -s http://localhost:8080/api/settings/branding
curl -s http://localhost:8080/api/system/feasibility
```
Check service ports:
```bash
ss -lntp
```
Check Ollama models:
```bash
curl -s http://localhost:11434/api/tags
```
## 10. Test Without A GPU
On a MacBook:
```bash
multipass launch 22.04 --name cezen-test --cpus 4 --mem 8G --disk 40G
multipass shell cezen-test
```
Inside the VM:
```bash
git clone https://cgit.cezentech.com/jinojose/aipackage.git
cd aipackage
sudo bash install.sh --feasibility-only
sudo bash install.sh --software-only --profile=auto --skip-model-pull
```
No GPU will be detected. That is expected.
## 11. Change Default Passwords Before Customer Handover
Before shipping to a customer, rotate these:
- Initial OS/admin account password.
- JupyterLab token: `/opt/cezen/.jupyter/jupyter_lab_config.py`
- MinIO credentials: `/etc/default/minio`
- Grafana admin password.
- Any temporary portal/backend admin credentials.
- Any staging license key if the final license is issued later.
## 12. Useful Files
```text
cgit/
├── install.sh # Main installer entry point
├── autoinstall/ # ISO first-boot setup and web setup
├── scripts/cezen-feasibility.sh # Existing-server feasibility checker
├── scripts/cezen-backup.sh # Backup/restore helper
├── ansible/
│ ├── phase1_nvidia.yml # NVIDIA/CUDA phase
│ ├── starter.yml # Starter tier
│ ├── entry.yml # Entry/Basic tier
│ ├── pro.yml # Pro tier
│ ├── max.yml # Max tier
│ └── roles/
│ ├── cezen-backend/ # FastAPI backend, cezen-api service
│ ├── cezen-nginx/ # Portal/nginx deployment
│ ├── ollama/ # Ollama + Open WebUI
│ ├── chromadb/ # RAG vector DB
│ ├── vllm/ # vLLM serving
│ ├── jupyterlab/ # Notebooks
│ ├── mlflow/ # Experiment tracking
│ ├── minio/ # Object storage
│ └── monitoring/ # Grafana/Prometheus/DCGM
├── cezen-portal/ # Packaged portal UI
└── models/pull-models.sh # Pull tier-specific models
```