Jino Jose 84c45c1ad6 Document tier feature matrix

2026-06-30 12:41:13 +05:30

12 KiB

Nexus One AI Installer

This repository is the source of truth for Nexus One AI ISO and server installs. The ISO stays small because it pulls this package from cgit during setup; the installer then deploys the selected tier on the target server.

1. Choose The Install Path

Scenario	Use This Path
New appliance/server with ISO	Boot from the Nexus One AI ISO and complete first-boot setup.
PSU/offline field install by pendrive	Boot the ISO from USB, enter license/tier details during first boot, upload large models later if needed.
Existing Ubuntu server	Clone this repo and run the feasibility check before installing.
Lab test without GPU	Use Multipass/VM and expect GPU services to be limited.

2. New ISO Install

Flash the Nexus One AI ISO to a USB drive or attach it to the VM/server.
Boot the server from the ISO.
On the Ubuntu installer network screen, choose DHCP or the final static IP.
Let Ubuntu finish installation and reboot.
On first boot, open the setup URL shown on the server console:

http://<server-ip>

Complete the setup wizard:
- Network: DHCP or static IP.
- License & customer details: customer name, project/customer ID, contact email, license key, support date.
- Tier: Starter, Entry, Pro, or Max.
- Tools: keep defaults unless a component should be skipped.
Click Start Installation.
Wait for Phase 1 (NVIDIA driver setup) to finish. The server may reboot once during this phase.
After reboot, Phase 2 continues automatically through cezen-phase2.service.
Monitor progress:

ssh cezen@<server-ip>
sudo journalctl -fu cezen-phase2.service
sudo tail -f /var/log/cezen-install.log

Open the portal after install:

http://<server-ip>/
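
For unattended installs it can help to poll until the portal answers instead of watching the logs. The helper below is a minimal sketch, not part of this repository: `wait_until` is a hypothetical name, and the timeout and interval values in the example are assumptions to adjust per site.

```bash
#!/usr/bin/env bash
# wait_until TIMEOUT_SECONDS INTERVAL_SECONDS COMMAND [ARGS...]
# Re-runs COMMAND until it succeeds or TIMEOUT_SECONDS have elapsed.
# Returns 0 on success, 1 on timeout. (Hypothetical helper, not shipped with the installer.)
wait_until() {
  local timeout="$1" interval="$2" deadline
  shift 2
  deadline=$(( $(date +%s) + timeout ))
  while :; do
    # Discard the probe's output; only its exit status matters.
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    if [ "$(date +%s)" -ge "$deadline" ]; then
      return 1
    fi
    sleep "$interval"
  done
}

# Example (replace <server-ip> with the real address):
#   wait_until 3600 15 curl -fsS "http://<server-ip>/" && echo "Portal is up"
```

Because the probe is an ordinary command, the same helper can wait on `systemctl is-active cezen-api` or any other check.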

3. PSU / Pendrive Field Install

Use this when a team physically visits the site and installs from a USB drive.

Carry the latest ISO on a bootable USB drive.
Boot the PSU/customer server from the USB.
Configure the final network on the Ubuntu installer screen.
After first boot, use either:
- Browser setup at http://<server-ip>, or
- Physical console terminal wizard if no browser is available.
Enter customer/license details during setup. If the final license key is not available, leave it blank; the system records the install as field staging/evaluation.
Select the commercial tier sold to the customer.
Complete the installation.
Upload or pull large models later after bandwidth/storage is confirmed.

License details are stored on the installed server at:

/opt/cezen/license.json

Installer selections are stored at:

/opt/cezen/install.conf

4. Existing Server Feasibility Check

Run this before quoting, committing a tier, or installing on customer-owned hardware.

git clone https://cgit.cezentech.com/jinojose/aipackage.git
cd aipackage
sudo bash install.sh --feasibility-only

The report checks CPU, RAM, disk, NVIDIA GPU/VRAM, and likely supported features. It writes JSON to:

/opt/cezen/feasibility.json

If /opt/cezen is not writable, it writes to the current directory instead:

./feasibility.json
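
Scripts that consume the report need to handle both locations. A minimal sketch, assuming the two paths above are the only places the report can land; `find_feasibility_report` is a hypothetical helper name, not something the installer provides.

```bash
#!/usr/bin/env bash
# find_feasibility_report [PRIMARY_DIR] [FALLBACK_DIR]
# Prints the path of the first non-empty feasibility.json found,
# checking /opt/cezen first and the current directory second.
# (Hypothetical helper; the directory arguments exist only to make it testable.)
find_feasibility_report() {
  local primary="${1:-/opt/cezen}" fallback="${2:-.}" dir
  for dir in "$primary" "$fallback"; do
    if [ -s "$dir/feasibility.json" ]; then
      printf '%s\n' "$dir/feasibility.json"
      return 0
    fi
  done
  echo "no feasibility.json found in $primary or $fallback" >&2
  return 1
}

# Example:
#   report=$(find_feasibility_report) && cat "$report"
```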

Recommended interpretation:

Result	Meaning
`core`	Portal/backend only; no local model serving recommended.
`cpu-ai`	CPU-only RAG/chat possible, but constrained.
`gpu-starter`	Starter GPU deployment.
`gpu-standard`	Entry tier style deployment.
`gpu-pro`	Pro tier candidate.
`gpu-max`	Max tier candidate.
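
When quoting, the result can be translated into a `--tier` value. The sketch below is one reading of the table, not the installer's own logic: mapping `cpu-ai` to `starter` is an assumption based on the Starter tier's "constrained CPU system" target, and `none` is a placeholder meaning no tier is suggested. `profile_to_tier` is a hypothetical name.

```bash
#!/usr/bin/env bash
# profile_to_tier RESULT
# Maps a feasibility result to a suggested --tier value.
# Assumptions: cpu-ai -> starter (inferred from the tier guide),
# core -> "none" (placeholder; portal/backend only).
profile_to_tier() {
  case "$1" in
    core)         echo "none" ;;
    cpu-ai)       echo "starter" ;;
    gpu-starter)  echo "starter" ;;
    gpu-standard) echo "basic" ;;
    gpu-pro)      echo "pro" ;;
    gpu-max)      echo "max" ;;
    *) echo "unknown result: $1" >&2; return 1 ;;
  esac
}

# Example:
#   profile_to_tier gpu-standard    # prints: basic
```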

5. Existing Server Install

After the feasibility check, install on an existing Ubuntu server:

sudo bash install.sh --software-only --profile=auto

For small systems or slow customer networks, skip default model downloads:

sudo bash install.sh --software-only --profile=cpu-ai --skip-model-pull

To force a commercial tier, run one of:

sudo bash install.sh --software-only --tier=starter
sudo bash install.sh --software-only --tier=basic
sudo bash install.sh --software-only --tier=pro
sudo bash install.sh --software-only --tier=max

The installer warns if the selected tier and the hardware recommendation do not match. The selected tier still wins, because the sale/license decision is commercial.
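
The "warn, but the selected tier wins" rule can be sketched as follows. This illustrates the behaviour described above and is not the code in install.sh; the function names, the rank ordering, and the message wording are all assumptions.

```bash
#!/usr/bin/env bash
# tier_rank TIER: prints a numeric rank so tiers can be compared.
tier_rank() {
  case "$1" in
    starter)     echo 1 ;;
    basic|entry) echo 2 ;;
    pro)         echo 3 ;;
    max)         echo 4 ;;
    *) return 1 ;;
  esac
}

# check_tier_choice SELECTED RECOMMENDED
# Warns on stderr when the two differ, then prints the tier to install,
# which is always SELECTED. Returns 2 for an unknown tier name.
check_tier_choice() {
  local selected="$1" recommended="$2" s r
  s=$(tier_rank "$selected") || { echo "unknown tier: $selected" >&2; return 2; }
  r=$(tier_rank "$recommended") || { echo "unknown tier: $recommended" >&2; return 2; }
  if [ "$s" -gt "$r" ]; then
    echo "WARNING: tier '$selected' exceeds the hardware recommendation '$recommended'" >&2
  elif [ "$s" -lt "$r" ]; then
    echo "NOTE: hardware could support '$recommended'; installing '$selected' as sold" >&2
  fi
  printf '%s\n' "$selected"
}

# Example:
#   check_tier_choice pro starter    # warns on stderr, prints: pro
```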

6. Tier Guide

Tier	Target Hardware	Typical Use	Default Models
Starter	1 GPU around 24-32 GB VRAM, or constrained CPU system	Small team, RAG/admin portal, light chat	`phi3:mini`, `nomic-embed-text`
Entry / Basic	1 RTX Pro 6000 class GPU, around 48-96 GB VRAM	Department deployment	`llama3.1:8b`, `mistral:7b`, `codellama:13b`, `nomic-embed-text`
Pro	2+ high VRAM GPUs	Multi-team deployment, heavier coding/RAG/fine-tuning workflows	Entry models plus `llama3.1:70b`, `mixtral:8x7b`, `deepseek-coder-v2:16b`
Max	4-8 enterprise GPUs such as H100/H200/A100 class	Enterprise deployment, large models, high concurrency	Pro models plus `llama3.1:405b`, `mixtral:8x22b`

Large models can be pulled later, so the ISO does not need to contain them. Run the command that matches the installed tier:

bash models/pull-models.sh --tier=starter
bash models/pull-models.sh --tier=basic
bash models/pull-models.sh --tier=pro
bash models/pull-models.sh --tier=max
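
To pull models one at a time, for example over a slow link, the tier table can be transcribed into a small lookup. This is a sketch built only from the Default Models column above; it is an assumption that models/pull-models.sh uses the same lists, and `models_for_tier` is a hypothetical name.

```bash
#!/usr/bin/env bash
# models_for_tier TIER
# Prints the default models for a tier, one per line, as listed in the
# tier guide. Pro includes the Entry/Basic models; Max includes the Pro models.
models_for_tier() {
  case "$1" in
    starter)
      printf '%s\n' phi3:mini nomic-embed-text ;;
    basic|entry)
      printf '%s\n' llama3.1:8b mistral:7b codellama:13b nomic-embed-text ;;
    pro)
      models_for_tier basic
      printf '%s\n' llama3.1:70b mixtral:8x7b deepseek-coder-v2:16b ;;
    max)
      models_for_tier pro
      printf '%s\n' llama3.1:405b mixtral:8x22b ;;
    *)
      echo "unknown tier: $1" >&2
      return 1 ;;
  esac
}

# Example: pull each model individually with the Ollama CLI.
#   models_for_tier pro | while read -r m; do ollama pull "$m"; done
```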

7. Product Features

Nexus One AI includes these application features through the portal and backend:

Feature	What It Does
Secure admin portal	Browser UI for setup, chat, tools, users, models, reports, and system status.
Authentication and sessions	JWT login, role-aware admin access, brute-force lockout, active session tracking.
User and team management	Admin-managed users, teams, roles, and account status.
Private chat	On-prem chat over local or routed models.
RAG knowledge base	Upload documents, index them into ChromaDB, and query private knowledge.
Prompt library	Government/enterprise prompt templates grouped by use case.
Model management	View local models, pull Ollama models, upload GGUF models, and track model status.
Model router	Route requests by rule to local, GPU, or external model endpoints on supported tiers.
Document intelligence	Parse, summarize, and extract structured information from documents.
Meeting assistant	Transcript/audio processing, summaries, decisions, action items, and follow-ups.
Agent builder	Create and run configured agents, including scheduled agent jobs.
Workflow automation	Run portal workflows with HTTP, email, RAG, save-to-knowledge-base, and filter steps.
Connectors	Store and sync supported data connectors.
Guardrails	Keyword, regex, and PII checks for safer prompts and responses.
Analytics and audit	Query logs, usage summaries, audit reports, and admin visibility.
Evaluation suite	Manage datasets, eval jobs, and model/prompt quality checks.
Fine-tuning jobs	QLoRA and advanced training paths for higher tiers.
API key manager	Create, list, and revoke API keys for integrations.
Backups and restore	Local backup, list, restore, and pre-restore safety snapshot APIs.
System readiness	Feasibility, license, and readiness reports for handover and support.

8. Features By Tier

The backend exposes this same matrix from GET /api/license.

Feature	Starter	Entry / Basic	Pro	Max
Max users	10	25	100	Custom
Portal	Yes	Yes	Yes	Yes
Private chat	Yes	Yes	Yes	Yes
RAG knowledge base	Yes	Yes	Advanced	Advanced
Meeting assistant	No	Yes	Yes	Yes
Workflows	Basic	Basic	Advanced	Advanced
Connectors	No	Limited	Yes	Yes
Model router	No	No	Yes	Yes
Audit reports	Yes	Yes	Yes	Yes
Backup and restore	Yes	Yes	Yes	Yes
Guardrails	Basic	Basic	Advanced	Advanced
GPU inference	No	Optional	Yes	Yes
Fine-tuning	No	No	QLoRA	Advanced
DeepSpeed / distributed training	No	No	No	Custom
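
For quoting or handover scripts that run away from the server, the matrix can be kept as a small offline lookup. The sketch below copies the table above verbatim; `feature_for_tier` is a hypothetical helper, and GET /api/license on the installed server remains the authoritative source.

```bash
#!/usr/bin/env bash
# feature_for_tier FEATURE TIER
# Looks up one cell of the tier matrix. FEATURE must match the first
# column exactly; TIER is starter, basic (or entry), pro, or max.
feature_for_tier() {
  local col
  case "$2" in
    starter)     col=2 ;;
    basic|entry) col=3 ;;
    pro)         col=4 ;;
    max)         col=5 ;;
    *) echo "unknown tier: $2" >&2; return 1 ;;
  esac
  # Exit status is 1 when the feature name is not in the table.
  awk -F'|' -v f="$1" -v c="$col" '
    $1 == f { print $c; found = 1 }
    END { exit (found ? 0 : 1) }
  ' <<'EOF'
Max users|10|25|100|Custom
Portal|Yes|Yes|Yes|Yes
Private chat|Yes|Yes|Yes|Yes
RAG knowledge base|Yes|Yes|Advanced|Advanced
Meeting assistant|No|Yes|Yes|Yes
Workflows|Basic|Basic|Advanced|Advanced
Connectors|No|Limited|Yes|Yes
Model router|No|No|Yes|Yes
Audit reports|Yes|Yes|Yes|Yes
Backup and restore|Yes|Yes|Yes|Yes
Guardrails|Basic|Basic|Advanced|Advanced
GPU inference|No|Optional|Yes|Yes
Fine-tuning|No|No|QLoRA|Advanced
DeepSpeed / distributed training|No|No|No|Custom
EOF
}

# Example:
#   feature_for_tier "Fine-tuning" pro    # prints: QLoRA
```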

9. What Gets Installed

All tiers install the Nexus One AI portal, backend API, nginx, health/readiness reporting, license/tier handling, and selected AI tools.

Component	Port	Notes
Nexus One AI portal	80	Main UI served by nginx.
cezen-api backend	8080	FastAPI backend, systemd service `cezen-api`.
Ollama	11434	Local model inference.
Open WebUI	3001	Chat UI.
ChromaDB	8100	Vector database for RAG.
vLLM	8000	OpenAI-compatible serving path, mainly Pro/Max.
JupyterLab	8888	Notebook environment.
MLflow	5000	Experiment tracking.
MinIO	9001	S3-compatible object/model storage.
Grafana	3000	Monitoring dashboard.
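
A quick way to compare a running server against this table is to filter the output of `ss -lnt`. The sketch below is an assumption-laden helper, not part of the installer: `missing_ports` is a hypothetical name, and a missing port is not always a fault, since some services (vLLM, for example) are mainly deployed on Pro/Max or may have been deselected in the setup wizard.

```bash
#!/usr/bin/env bash
# Ports taken from the component table.
EXPECTED_PORTS="80 8080 11434 3001 8100 8000 8888 5000 9001 3000"

# missing_ports
# Reads `ss -lnt` output on stdin and prints every expected port that
# has no listening socket, one per line.
missing_ports() {
  local listening port
  # Column 4 is "Local Address:Port"; keep only the part after the last colon.
  # NR > 1 skips the header line.
  listening=$(awk 'NR > 1 { n = split($4, a, ":"); print a[n] }')
  for port in $EXPECTED_PORTS; do
    printf '%s\n' "$listening" | grep -qx "$port" || printf '%s\n' "$port"
  done
}

# Example:
#   ss -lnt | missing_ports
```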

10. Admin And Readiness APIs

API	Purpose
`GET /api/license`	Current tier, feature matrix, and safe license metadata.
`GET /api/system/feasibility`	Hardware feasibility report or live fallback.
`GET /api/system/readiness-report`	License + feasibility + install readiness payload.
`GET /api/audit/report?days=7`	Audit summary for handover/admin review.
`GET /api/system/backups`	List local backups.
`POST /api/system/backups`	Create local backup.
`POST /api/system/backups/{name}/restore`	Restore backup with pre-restore safety snapshot.

Backup helper:

sudo bash scripts/cezen-backup.sh backup
sudo bash scripts/cezen-backup.sh list
sudo bash scripts/cezen-backup.sh restore /opt/cezen/backups/cezen-backup-YYYYmmdd-HHMMSS.zip
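
The restore command needs a full backup path. Assuming backups follow the cezen-backup-YYYYmmdd-HHMMSS.zip naming shown above, the newest one can be found by name alone, because the timestamp sorts lexicographically. `latest_backup` is a hypothetical helper, not part of cezen-backup.sh.

```bash
#!/usr/bin/env bash
# latest_backup [DIR]
# Prints the newest cezen-backup-*.zip in DIR (default /opt/cezen/backups).
# Relies on the timestamp in the file name, not on file modification times.
latest_backup() {
  local dir="${1:-/opt/cezen/backups}" newest="" f
  for f in "$dir"/cezen-backup-*.zip; do
    [ -e "$f" ] || continue   # the glob stays literal when nothing matches
    newest="$f"               # glob results are sorted, so the last one is newest
  done
  if [ -z "$newest" ]; then
    echo "no backups found in $dir" >&2
    return 1
  fi
  printf '%s\n' "$newest"
}

# Example:
#   sudo bash scripts/cezen-backup.sh restore "$(latest_backup)"
```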

11. Post-Install Checks

Run these after install:

systemctl status cezen-api --no-pager
systemctl status cezen-phase2.service --no-pager
curl -s http://localhost:8080/api/settings/branding
curl -s http://localhost:8080/api/system/feasibility

Check service ports:

ss -lntp

Check Ollama models:

curl -s http://localhost:11434/api/tags
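
The /api/tags response is JSON with a `models` array whose entries carry a `name` field. The sketch below checks that response for expected models using plain text matching; it assumes the compact JSON that Ollama normally returns, and `jq` would be more robust where it is installed. Both function names are hypothetical.

```bash
#!/usr/bin/env bash
# ollama_model_names
# Reads an /api/tags JSON response on stdin and prints each model name.
# Text-based extraction; assumes compact JSON ("name":"...").
ollama_model_names() {
  grep -o '"name":"[^"]*"' | sed 's/^"name":"//; s/"$//'
}

# missing_models MODEL...
# Reads an /api/tags response on stdin and prints each expected model
# that is not installed. Ollama lists untagged models as NAME:latest,
# so a name without a tag is compared as NAME:latest.
missing_models() {
  local have want m
  have=$(ollama_model_names)
  for m in "$@"; do
    case "$m" in
      *:*) want="$m" ;;
      *)   want="$m:latest" ;;
    esac
    printf '%s\n' "$have" | grep -qxF "$want" || printf '%s\n' "$m"
  done
}

# Example (Starter tier defaults):
#   curl -s http://localhost:11434/api/tags | missing_models phi3:mini nomic-embed-text
```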

12. Test Without A GPU

On a MacBook:

multipass launch 22.04 --name cezen-test --cpus 4 --mem 8G --disk 40G
multipass shell cezen-test

Inside the VM:

git clone https://cgit.cezentech.com/jinojose/aipackage.git
cd aipackage
sudo bash install.sh --feasibility-only
sudo bash install.sh --software-only --profile=auto --skip-model-pull

No GPU will be detected. That is expected.

13. Change Default Passwords Before Customer Handover

Before shipping to a customer, rotate these:

- Initial OS/admin account password.
- JupyterLab token: /opt/cezen/.jupyter/jupyter_lab_config.py
- MinIO credentials: /etc/default/minio
- Grafana admin password.
- Any temporary portal/backend admin credentials.
- Any staging license key if the final license is issued later.
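
Replacement secrets should be random rather than hand-typed. The sketch below only generates values; how each one is applied depends on the service's own configuration and is not shown here. `gen_secret` is a hypothetical helper, and the default length of 32 is an assumption.

```bash
#!/usr/bin/env bash
# gen_secret [LENGTH]
# Prints a random alphanumeric string (default 32 characters) read from
# /dev/urandom. Alphanumeric only, so the value can be pasted into shell
# or env files without quoting problems.
gen_secret() {
  local length="${1:-32}"
  LC_ALL=C tr -dc 'A-Za-z0-9' < /dev/urandom | head -c "$length"
  echo
}

# Example: one fresh value per credential being rotated.
#   gen_secret        # 32 characters
#   gen_secret 48     # longer value, e.g. for a token
```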

14. Useful Files

cgit/
├── install.sh                         # Main installer entry point
├── autoinstall/                       # ISO first-boot setup and web setup
├── scripts/cezen-feasibility.sh       # Existing-server feasibility checker
├── scripts/cezen-backup.sh            # Backup/restore helper
├── ansible/
│   ├── phase1_nvidia.yml              # NVIDIA/CUDA phase
│   ├── starter.yml                    # Starter tier
│   ├── entry.yml                      # Entry/Basic tier
│   ├── pro.yml                        # Pro tier
│   ├── max.yml                        # Max tier
│   └── roles/
│       ├── cezen-backend/             # FastAPI backend, cezen-api service
│       ├── cezen-nginx/               # Portal/nginx deployment
│       ├── ollama/                    # Ollama + Open WebUI
│       ├── chromadb/                  # RAG vector DB
│       ├── vllm/                      # vLLM serving
│       ├── jupyterlab/                # Notebooks
│       ├── mlflow/                    # Experiment tracking
│       ├── minio/                     # Object storage
│       └── monitoring/                # Grafana/Prometheus/DCGM
├── cezen-portal/                      # Packaged portal UI
└── models/pull-models.sh              # Pull tier-specific models
