# Nexus One AI Installer

This repository is the source of truth for Nexus One AI ISO and server installs.
The ISO keeps itself small by pulling this package from cgit during setup, then
the installer deploys the selected tier on the target server.

## 1. Choose The Install Path

| Scenario | Use This Path |
|---|---|
| New appliance/server with ISO | Boot from the Nexus One AI ISO and complete first-boot setup. |
| PSU/offline field install by pendrive | Boot the ISO from USB, enter license/tier details during first boot, upload large models later if needed. |
| Existing Ubuntu server | Clone this repo and run the feasibility check before installing. |
| Lab test without GPU | Use Multipass/VM and expect GPU services to be limited. |

## 2. New ISO Install

1. Flash the Nexus One AI ISO to a USB drive or attach it to the VM/server.
2. Boot the server from the ISO.
3. On the Ubuntu installer network screen, choose DHCP or the final static IP.
4. Let Ubuntu finish installation and reboot.
5. On first boot, open the setup URL shown on the server console:

```text
http://<server-ip>
```

6. Complete the setup wizard:
   - Network: DHCP or static IP.
   - License & customer details: customer name, project/customer ID, contact email, license key, support date.
   - Tier: Starter, Entry, Pro, or Max.
   - Tools: keep defaults unless a component should be skipped.
7. Click **Start Installation**.
8. Wait for Phase 1 NVIDIA driver setup. The server may reboot once.
9. After reboot, Phase 2 continues automatically through `cezen-phase2.service`.
10. Monitor progress:

```bash
ssh cezen@<server-ip>
sudo journalctl -fu cezen-phase2.service
sudo tail -f /var/log/cezen-install.log
```

11. Open the portal after install:

```text
http://<server-ip>/
```

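When the install is watched from a script rather than a browser, the final step (opening the portal) can be approximated by polling until the portal answers. The helper below is a generic retry sketch, not part of the installer; the name `wait_until` and the attempt/delay values are illustrative assumptions.

```bash
# Hypothetical helper (not shipped with the installer): retry a command
# until it succeeds or the attempt budget is used up.
# Usage: wait_until <attempts> <delay-seconds> <command...>
wait_until() {
  local attempts=$1 delay=$2 i=1
  shift 2
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    # Pause between attempts, but not after the final one.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}

# Example: poll the portal every 10 seconds, up to 60 times.
#   wait_until 60 10 curl -fsS -o /dev/null "http://<server-ip>/"
```

The command to retry is passed as arguments, so the same helper works for any readiness probe.
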
## 3. PSU / Pendrive Field Install

Use this when a team physically visits the site and installs from a USB drive.

1. Carry the latest ISO on a bootable USB drive.
2. Boot the PSU/customer server from the USB.
3. Configure the final network on the Ubuntu installer screen.
4. After first boot, use either:
   - Browser setup at `http://<server-ip>`, or
   - Physical console terminal wizard if no browser is available.
5. Enter customer/license details during setup. If the final license key is not available, leave it blank; the system records the install as field staging/evaluation.
6. Select the commercial tier sold to the customer.
7. Complete install.
8. Upload or pull large models later after bandwidth/storage is confirmed.

License details are stored on the installed server at:

```text
/opt/cezen/license.json
```

Installer selections are stored at:

```text
/opt/cezen/install.conf
```

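Before leaving the site, it can be useful to confirm that both files above were actually written. This is a minimal sketch, assuming a non-empty file means the step completed. The function name `check_install_records` and its optional root-prefix argument are hypothetical and not part of the installer.

```bash
# Hypothetical check: confirm the license and installer-selection files
# exist and are non-empty. The optional argument is a root prefix
# (for example a mounted disk); by default the live filesystem is used.
check_install_records() {
  local root=${1:-}
  local missing=0 f
  for f in /opt/cezen/license.json /opt/cezen/install.conf; do
    if [ -s "${root}${f}" ]; then
      echo "ok       ${root}${f}"
    else
      echo "missing  ${root}${f}" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

Run `check_install_records` on the installed server; a non-zero exit status means at least one file is absent or empty.
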
## 4. Existing Server Feasibility Check

Run this before quoting, committing a tier, or installing on customer-owned hardware.

```bash
git clone https://cgit.cezentech.com/jinojose/aipackage.git
cd aipackage
sudo bash install.sh --feasibility-only
```

The report checks CPU, RAM, disk, NVIDIA GPU/VRAM, and likely supported features.
It writes JSON to:

```text
/opt/cezen/feasibility.json
```

If `/opt/cezen` is not writable, it writes to the current directory instead:

```text
./feasibility.json
```

Recommended interpretation:

| Result | Meaning |
|---|---|
| `core` | Portal/backend only; no local model serving recommended. |
| `cpu-ai` | CPU-only RAG/chat possible, but constrained. |
| `gpu-starter` | Starter GPU deployment. |
| `gpu-standard` | Entry-tier style deployment. |
| `gpu-pro` | Pro tier candidate. |
| `gpu-max` | Max tier candidate. |

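The interpretation table can be expressed as a small lookup when scripting quotes or installs. The sketch below is illustrative and not part of `install.sh`. The function names are hypothetical, the tier names reuse the `--tier=` values accepted by the installer, and mapping `cpu-ai` to `starter` is an assumption based on the Tier Guide's "constrained CPU system" wording. How the result value is read out of the JSON report is left out, because the report schema is not documented here.

```bash
# Print the report location described above: the /opt/cezen copy if it
# exists, otherwise the fallback in the current directory.
feasibility_report_path() {
  if [ -f /opt/cezen/feasibility.json ]; then
    echo /opt/cezen/feasibility.json
  else
    echo ./feasibility.json
  fi
}

# Map a feasibility result to a commercial tier name.
# "none" means no local model serving is recommended.
result_to_tier() {
  case "$1" in
    core)         echo none ;;
    cpu-ai)       echo starter ;;  # assumption: Starter covers constrained CPU systems
    gpu-starter)  echo starter ;;
    gpu-standard) echo basic ;;
    gpu-pro)      echo pro ;;
    gpu-max)      echo max ;;
    *)            echo "unknown result: $1" >&2; return 1 ;;
  esac
}
```

For example, `result_to_tier gpu-standard` prints `basic`, which is the value the installer's `--tier=` flag uses for the Entry tier.
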
## 5. Existing Server Install

After the feasibility check, install on an existing Ubuntu server:

```bash
sudo bash install.sh --software-only --profile=auto
```

For small systems or slow customer networks, skip default model downloads:

```bash
sudo bash install.sh --software-only --profile=cpu-ai --skip-model-pull
```

To force a commercial tier:

```bash
sudo bash install.sh --software-only --tier=starter
sudo bash install.sh --software-only --tier=basic
sudo bash install.sh --software-only --tier=pro
sudo bash install.sh --software-only --tier=max
```

The installer warns if the selected tier and the hardware recommendation do not match.
The selected tier still takes precedence, because the tier is a commercial (sale/license) decision.

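The "warn, but the selected tier wins" rule can be pictured as follows. This is a sketch of the behavior described above, not the installer's actual code; `resolve_tier` and the wording of the warning are hypothetical.

```bash
# Sketch of the precedence rule (not the installer's implementation).
# $1 = tier selected for the sale, $2 = tier recommended by the hardware check.
resolve_tier() {
  local selected=$1 recommended=$2
  if [ "$selected" != "$recommended" ]; then
    echo "WARNING: selected tier '$selected' differs from hardware recommendation '$recommended'" >&2
  fi
  # The commercial selection is what gets installed, with or without a warning.
  echo "$selected"
}
```

The warning goes to stderr so that the resolved tier on stdout can be captured cleanly, for example `tier=$(resolve_tier max starter)`.
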
## 6. Tier Guide

| Tier | Target Hardware | Typical Use | Default Models |
|---|---|---|---|
| Starter | 1 GPU around 24-32 GB VRAM, or constrained CPU system | Small team, RAG/admin portal, light chat | `phi3:mini`, `nomic-embed-text` |
| Entry / Basic | 1 RTX Pro 6000 class GPU, around 48-96 GB VRAM | Department deployment | `llama3.1:8b`, `mistral:7b`, `codellama:13b`, `nomic-embed-text` |
| Pro | 2+ high VRAM GPUs | Multi-team deployment, heavier coding/RAG/fine-tuning workflows | Entry models plus `llama3.1:70b`, `mixtral:8x7b`, `deepseek-coder-v2:16b` |
| Max | 4-8 enterprise GPUs such as H100/H200/A100 class | Enterprise deployment, large models, high concurrency | Pro models plus `llama3.1:405b`, `mixtral:8x22b` |

Large models can be pulled later. The ISO does not need to contain them.

```bash
bash models/pull-models.sh --tier=starter
bash models/pull-models.sh --tier=basic
bash models/pull-models.sh --tier=pro
bash models/pull-models.sh --tier=max
```

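The "plus" wording in the Default Models column means the lists are cumulative: Pro includes the Entry models, and Max includes the Pro models. The sketch below spells that out. It is illustrative only and may differ from what `models/pull-models.sh` actually does; `models_for_tier` is a hypothetical name.

```bash
# Illustrative only: print the default models for a tier, one per line,
# following the Tier Guide table. Tier names match the --tier= values.
models_for_tier() {
  case "$1" in
    starter)
      printf '%s\n' phi3:mini nomic-embed-text ;;
    basic)
      printf '%s\n' llama3.1:8b mistral:7b codellama:13b nomic-embed-text ;;
    pro)
      models_for_tier basic
      printf '%s\n' llama3.1:70b mixtral:8x7b deepseek-coder-v2:16b ;;
    max)
      models_for_tier pro
      printf '%s\n' llama3.1:405b mixtral:8x22b ;;
    *)
      echo "unknown tier: $1" >&2
      return 1 ;;
  esac
}

# Example (needs a running Ollama):
#   models_for_tier pro | while read -r m; do ollama pull "$m"; done
```

Listing the models first makes it easy to estimate download size before pulling anything over a slow site link.
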
## 7. Product Features

Nexus One AI includes these application features through the portal and backend:

| Feature | What It Does |
|---|---|
| Secure admin portal | Browser UI for setup, chat, tools, users, models, reports, and system status. |
| Authentication and sessions | JWT login, role-aware admin access, brute-force lockout, active session tracking. |
| User and team management | Admin-managed users, teams, roles, and account status. |
| Private chat | On-prem chat over local or routed models. |
| RAG knowledge base | Upload documents, index them into ChromaDB, and query private knowledge. |
| Prompt library | Government/enterprise prompt templates grouped by use case. |
| Model management | View local models, pull Ollama models, upload GGUF models, and track model status. |
| Model router | Route requests by rule to local, GPU, or external model endpoints on supported tiers. |
| Document intelligence | Parse, summarize, and extract structured information from documents. |
| Meeting assistant | Transcript/audio processing, summaries, decisions, action items, and follow-ups. |
| Agent builder | Create and run configured agents, including scheduled agent jobs. |
| Workflow automation | Run portal workflows with HTTP, email, RAG, save-to-knowledge-base, and filter steps. |
| Connectors | Store and sync supported data connectors. |
| Guardrails | Keyword, regex, and PII checks for safer prompts and responses. |
| Analytics and audit | Query logs, usage summaries, audit reports, and admin visibility. |
| Evaluation suite | Manage datasets, eval jobs, and model/prompt quality checks. |
| Fine-tuning jobs | QLoRA and advanced training paths for higher tiers. |
| API key manager | Create, list, and revoke API keys for integrations. |
| Backups and restore | Local backup, list, restore, and pre-restore safety snapshot APIs. |
| System readiness | Feasibility, license, and readiness reports for handover and support. |

## 8. Features By Tier

The backend exposes this same matrix from `GET /api/license`.

| Feature | Starter | Entry / Basic | Pro | Max |
|---|---|---|---|---|
| Max users | 10 | 25 | 100 | Custom |
| Portal | Yes | Yes | Yes | Yes |
| Private chat | Yes | Yes | Yes | Yes |
| RAG knowledge base | Yes | Yes | Advanced | Advanced |
| Meeting assistant | No | Yes | Yes | Yes |
| Workflows | Basic | Basic | Advanced | Advanced |
| Connectors | No | Limited | Yes | Yes |
| Model router | No | No | Yes | Yes |
| Audit reports | Yes | Yes | Yes | Yes |
| Backup and restore | Yes | Yes | Yes | Yes |
| Guardrails | Basic | Basic | Advanced | Advanced |
| GPU inference | No | Optional | Yes | Yes |
| Fine-tuning | No | No | QLoRA | Advanced |
| DeepSpeed / distributed training | No | No | No | Custom |

## 9. What Gets Installed

All tiers install the Nexus One AI portal, backend API, nginx, health/readiness
reporting, license/tier handling, and selected AI tools.

| Component | Port | Notes |
|---|---:|---|
| Nexus One AI portal | 80 | Main UI served by nginx. |
| cezen-api backend | 8080 | FastAPI backend, systemd service `cezen-api`. |
| Ollama | 11434 | Local model inference. |
| Open WebUI | 3001 | Chat UI. |
| ChromaDB | 8100 | Vector database for RAG. |
| vLLM | 8000 | OpenAI-compatible serving path, mainly Pro/Max. |
| JupyterLab | 8888 | Notebook environment. |
| MLflow | 5000 | Experiment tracking. |
| MinIO | 9001 | S3-compatible object/model storage. |
| Grafana | 3000 | Monitoring dashboard. |

## 10. Admin And Readiness APIs

| API | Purpose |
|---|---|
| `GET /api/license` | Current tier, feature matrix, and safe license metadata. |
| `GET /api/system/feasibility` | Hardware feasibility report or live fallback. |
| `GET /api/system/readiness-report` | License + feasibility + install readiness payload. |
| `GET /api/audit/report?days=7` | Audit summary for handover/admin review. |
| `GET /api/system/backups` | List local backups. |
| `POST /api/system/backups` | Create local backup. |
| `POST /api/system/backups/{name}/restore` | Restore backup with pre-restore safety snapshot. |

Backup helper:

```bash
sudo bash scripts/cezen-backup.sh backup
sudo bash scripts/cezen-backup.sh list
sudo bash scripts/cezen-backup.sh restore /opt/cezen/backups/cezen-backup-YYYYmmdd-HHMMSS.zip
```

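Because the archive names embed a `YYYYmmdd-HHMMSS` timestamp, sorting them as plain text also sorts them by time. The sketch below uses that to find the newest archive for a restore. `latest_backup` is a hypothetical helper, not part of `scripts/cezen-backup.sh`, and the default directory is taken from the restore example above.

```bash
# Hypothetical helper: print the newest backup archive in a directory.
# Relies on the cezen-backup-YYYYmmdd-HHMMSS.zip naming, which sorts
# chronologically when sorted as text.
latest_backup() {
  local dir=${1:-/opt/cezen/backups}
  local f latest=
  for f in "$dir"/cezen-backup-*.zip; do
    [ -e "$f" ] || continue   # the glob matched nothing
    latest=$f                 # globs expand in sorted order; keep the last one
  done
  [ -n "$latest" ] || return 1
  echo "$latest"
}

# Example:
#   sudo bash scripts/cezen-backup.sh restore "$(latest_backup)"
```

The function returns a non-zero status when no archive exists, so a calling script can stop before attempting a restore.
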
## 11. Post-Install Checks

Run these after install:

```bash
systemctl status cezen-api --no-pager
systemctl status cezen-phase2.service --no-pager
curl -s http://localhost:8080/api/settings/branding
curl -s http://localhost:8080/api/system/feasibility
```

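The curl checks above can be wrapped so that HTTP errors produce a failing exit status instead of an error page on stdout. This is a minimal sketch: `api_url`, `api_get`, and the `CEZEN_API_BASE` override variable are hypothetical names, and whether a given endpoint requires authentication is not covered here.

```bash
# Build a backend URL. 8080 is the cezen-api port used in the checks above;
# CEZEN_API_BASE is a hypothetical override, not an installer setting.
api_url() {
  local base=${CEZEN_API_BASE:-http://localhost:8080}
  echo "${base%/}/${1#/}"
}

# GET an endpoint; curl -f turns HTTP 4xx/5xx responses into a non-zero exit.
api_get() {
  curl -fsS "$(api_url "$1")"
}

# Example: probe the readiness endpoints in one pass.
#   for p in /api/license /api/system/feasibility /api/system/readiness-report; do
#     if api_get "$p" >/dev/null; then echo "ok   $p"; else echo "FAIL $p"; fi
#   done
```

Keeping URL construction separate from the request makes it easy to point the same checks at a remote server during handover.
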
Check service ports:

```bash
ss -lntp
```

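To compare the `ss` output against the ports listed under "What Gets Installed", the listening ports can be extracted and checked one by one. This is a sketch with hypothetical function names. Not every tier runs every service (vLLM, for example, is mainly Pro/Max), so a closed port is not necessarily a fault.

```bash
# Extract listening TCP port numbers from `ss -lnt` output on stdin.
# Column 4 is the local address (for example 0.0.0.0:80 or [::]:80);
# the port is whatever follows the last colon.
listening_ports() {
  awk 'NR > 1 { n = split($4, a, ":"); print a[n] }' | sort -un
}

# Report the state of each port from the component table.
check_expected_ports() {
  local open port
  open=$(ss -lnt | listening_ports)
  for port in 80 8080 11434 3001 8100 8000 8888 5000 9001 3000; do
    if echo "$open" | grep -qx "$port"; then
      echo "listening  $port"
    else
      echo "closed     $port"
    fi
  done
}
```

Run `check_expected_ports` on the installed server; no root access is needed because process names are not requested.
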
Check Ollama models:

```bash
curl -s http://localhost:11434/api/tags
```

## 12. Test Without A GPU

On a MacBook:

```bash
multipass launch 22.04 --name cezen-test --cpus 4 --mem 8G --disk 40G
multipass shell cezen-test
```

Inside the VM:

```bash
git clone https://cgit.cezentech.com/jinojose/aipackage.git
cd aipackage
sudo bash install.sh --feasibility-only
sudo bash install.sh --software-only --profile=auto --skip-model-pull
```

No GPU will be detected. That is expected.

## 13. Change Default Passwords Before Customer Handover

Before shipping to a customer, rotate these:

- Initial OS/admin account password.
- JupyterLab token: `/opt/cezen/.jupyter/jupyter_lab_config.py`
- MinIO credentials: `/etc/default/minio`
- Grafana admin password.
- Any temporary portal/backend admin credentials.
- Any staging license key if the final license is issued later.

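Each rotation needs a fresh random value. One way to generate one with standard tools is sketched below. `gen_secret` is a hypothetical helper, and how each service picks up its new credential (config file edit, service restart) is not shown and should follow that tool's own documentation.

```bash
# Generate a random alphanumeric secret of the requested length (default 32).
# Reads four times the needed bytes so that enough characters remain after
# the non-alphanumeric base64 symbols are filtered out.
gen_secret() {
  local len=${1:-32}
  head -c "$((len * 4))" /dev/urandom | base64 | LC_ALL=C tr -dc 'A-Za-z0-9' | cut -c "1-$len"
}

# Example: one secret per credential in the list above.
#   gen_secret 40
```

Restricting the output to letters and digits avoids quoting problems when the value is pasted into config files or shell environment files.
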
## 14. Useful Files

```text
cgit/
├── install.sh                       # Main installer entry point
├── autoinstall/                     # ISO first-boot setup and web setup
├── scripts/cezen-feasibility.sh     # Existing-server feasibility checker
├── scripts/cezen-backup.sh          # Backup/restore helper
├── ansible/
│   ├── phase1_nvidia.yml            # NVIDIA/CUDA phase
│   ├── starter.yml                  # Starter tier
│   ├── entry.yml                    # Entry/Basic tier
│   ├── pro.yml                      # Pro tier
│   ├── max.yml                      # Max tier
│   └── roles/
│       ├── cezen-backend/           # FastAPI backend, cezen-api service
│       ├── cezen-nginx/             # Portal/nginx deployment
│       ├── ollama/                  # Ollama + Open WebUI
│       ├── chromadb/                # RAG vector DB
│       ├── vllm/                    # vLLM serving
│       ├── jupyterlab/              # Notebooks
│       ├── mlflow/                  # Experiment tracking
│       ├── minio/                   # Object storage
│       └── monitoring/              # Grafana/Prometheus/DCGM
├── cezen-portal/                    # Packaged portal UI
└── models/pull-models.sh            # Pull tier-specific models
```