# Nexus One AI Installer

This repository is the source of truth for Nexus One AI ISO and server installs.
The ISO keeps itself small by pulling this package from cgit during setup, then
the installer deploys the selected tier on the target server.

## 1. Choose The Install Path

| Scenario | Use This Path |
|---|---|
| New appliance/server with ISO | Boot from the Nexus One AI ISO and complete first-boot setup. |
| PSU/offline field install by pendrive | Boot the ISO from USB, enter license/tier details during first boot, upload large models later if needed. |
| Existing Ubuntu server | Clone this repo and run the feasibility check before installing. |
| Lab test without GPU | Use Multipass/VM and expect GPU services to be limited. |

## 2. New ISO Install

1. Flash the Nexus One AI ISO to a USB drive or attach it to the VM/server.
2. Boot the server from the ISO.
3. On the Ubuntu installer network screen, choose DHCP or the final static IP.
4. Let Ubuntu finish installation and reboot.
5. On first boot, open the setup URL shown on the server console:

```text
http://<server-ip>
```

6. Complete the setup wizard:
   - Network: DHCP or static IP.
   - License & customer details: customer name, project/customer ID, contact email, license key, support date.
   - Tier: Starter, Entry, Pro, or Max.
   - Tools: keep defaults unless a component should be skipped.
7. Click **Start Installation**.
8. Wait for Phase 1 (NVIDIA driver setup) to finish. The server may reboot once during this phase.
9. After reboot, Phase 2 continues automatically through `cezen-phase2.service`.
10. Monitor progress:

```bash
ssh cezen@<server-ip>
# Follow the Phase 2 service log (Ctrl-C to stop)
sudo journalctl -fu cezen-phase2.service
# Or follow the installer log file
sudo tail -f /var/log/cezen-install.log
```

11. Open the portal after install:

```text
http://<server-ip>/
```
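
When the install is being watched from a script rather than a browser, it can help to wait until the portal answers. The following is a minimal sketch, assuming `curl` is available on the machine running the check. The function name `wait_for_portal`, the attempt count, and the delay are illustrative and are not part of the installer.

```bash
# Hypothetical helper: poll a URL until it returns a successful HTTP response.
# Usage: wait_for_portal <url> [attempts] [delay-seconds]
wait_for_portal() {
  local url="$1" attempts="${2:-60}" delay="${3:-10}" i
  for ((i = 1; i <= attempts; i++)); do
    # --fail makes curl exit non-zero on HTTP 4xx/5xx responses.
    if curl --silent --fail --max-time 5 --output /dev/null "$url"; then
      echo "portal is up: $url"
      return 0
    fi
    sleep "$delay"
  done
  echo "portal did not respond after $attempts attempts: $url" >&2
  return 1
}
```

A typical call would be `wait_for_portal "http://<server-ip>/"`, with the server address filled in.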

## 3. PSU / Pendrive Field Install

Use this when a team physically visits the site and installs from a USB drive.

1. Carry the latest ISO on a bootable USB drive.
2. Boot the PSU/customer server from the USB.
3. Configure the final network on the Ubuntu installer screen.
4. After first boot, use either:
   - Browser setup at `http://<server-ip>`, or
   - Physical console terminal wizard if no browser is available.
5. Enter customer/license details during setup. If the final license key is not available, leave it blank; the system records the install as field staging/evaluation.
6. Select the commercial tier sold to the customer.
7. Complete install.
8. Upload or pull large models later after bandwidth/storage is confirmed.

License details are stored on the installed server at:

```text
/opt/cezen/license.json
```

Installer selections are stored at:

```text
/opt/cezen/install.conf
```
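
For a handover checklist, it may be useful to confirm that both files exist and that the license file parses as JSON. This is a sketch only: it assumes `python3` is installed on the server (used just for its standard `json.tool` module), and it makes no assumption about the keys inside either file. The helper names are hypothetical.

```bash
# Hypothetical helper: report whether a file exists and is non-empty.
check_file() {
  if [ -s "$1" ]; then
    echo "ok: $1"
  else
    echo "missing or empty: $1" >&2
    return 1
  fi
}

# Hypothetical helper: succeed only if the file parses as JSON.
# Assumes python3 is available; json.tool ships with its standard library.
json_is_valid() {
  python3 -m json.tool "$1" >/dev/null 2>&1
}

# Example (may need sudo, depending on file permissions):
#   check_file /opt/cezen/install.conf
#   check_file /opt/cezen/license.json && json_is_valid /opt/cezen/license.json
```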

## 4. Existing Server Feasibility Check

Run this before quoting, committing a tier, or installing on customer-owned hardware.

```bash
git clone https://cgit.cezentech.com/jinojose/aipackage.git
cd aipackage
sudo bash install.sh --feasibility-only
```

The report checks CPU, RAM, disk, NVIDIA GPU/VRAM, and likely supported features.
It writes JSON to:

```text
/opt/cezen/feasibility.json
```

If `/opt/cezen` is not writable, it writes:

```text
./feasibility.json
```
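
Because the report can land in either location, a script that consumes it has to check both. A minimal sketch of that lookup follows; the function name `find_feasibility_report` is hypothetical, and the optional arguments exist only so the paths can be overridden.

```bash
# Hypothetical helper: print the path of the feasibility report, preferring
# /opt/cezen and falling back to the current directory.
find_feasibility_report() {
  local primary="${1:-/opt/cezen/feasibility.json}"
  local fallback="${2:-./feasibility.json}"
  if [ -f "$primary" ]; then
    echo "$primary"
  elif [ -f "$fallback" ]; then
    echo "$fallback"
  else
    echo "no feasibility report found" >&2
    return 1
  fi
}
```

For example, `report="$(find_feasibility_report)" && cat "$report"` prints whichever copy exists.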

Recommended interpretation:

| Result | Meaning |
|---|---|
| `core` | Portal/backend only; no local model serving recommended. |
| `cpu-ai` | CPU-only RAG/chat possible, but constrained. |
| `gpu-starter` | Starter GPU deployment. |
| `gpu-standard` | Entry / Basic tier deployment. |
| `gpu-pro` | Pro tier candidate. |
| `gpu-max` | Max tier candidate. |
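
One way to read the table when preparing a quote is as a lookup from the result label to a `--tier` value for `install.sh`. The sketch below is an interpretation of the table, not the installer's own logic: the function name `suggest_tier` is hypothetical, and treating `core` and `cpu-ai` as "no GPU tier" is an assumption.

```bash
# Hypothetical helper: map a feasibility result label to a --tier value.
suggest_tier() {
  case "$1" in
    gpu-starter)  echo "starter" ;;
    gpu-standard) echo "basic" ;;   # Entry / Basic tier
    gpu-pro)      echo "pro" ;;
    gpu-max)      echo "max" ;;
    core|cpu-ai)  echo "none" ;;    # assumption: no GPU tier suggested
    *) echo "unknown result: $1" >&2; return 1 ;;
  esac
}

suggest_tier gpu-standard   # prints: basic
```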

## 5. Existing Server Install

After the feasibility check, install on an existing Ubuntu server:

```bash
sudo bash install.sh --software-only --profile=auto
```

For small systems or slow customer networks, skip default model downloads:

```bash
sudo bash install.sh --software-only --profile=cpu-ai --skip-model-pull
```

To force a commercial tier:

```bash
# Run only the line for the tier being installed
sudo bash install.sh --software-only --tier=starter
sudo bash install.sh --software-only --tier=basic
sudo bash install.sh --software-only --tier=pro
sudo bash install.sh --software-only --tier=max
```

The installer warns if the selected tier and the hardware recommendation do not match.
The selected tier still wins, because the sale/license decision is a commercial one.
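
The "warn, but the selected tier wins" rule can be illustrated in a few lines. This is a sketch of the behavior described above, not code from `install.sh`; the function name `resolve_tier` and the warning text are hypothetical.

```bash
# Hypothetical illustration: print the tier to install on stdout and,
# if it differs from the hardware recommendation, a warning on stderr.
resolve_tier() {
  local selected="$1" recommended="$2"
  if [ "$selected" != "$recommended" ]; then
    echo "warning: selected tier '$selected' differs from recommendation '$recommended'" >&2
  fi
  echo "$selected"
}

resolve_tier pro starter 2>/dev/null   # prints: pro
```

Keeping the warning on stderr means the chosen tier can still be captured cleanly with `tier="$(resolve_tier pro starter)"`.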

## 6. Tier Guide

| Tier | Target Hardware | Typical Use | Default Models |
|---|---|---|---|
| Starter | 1 GPU around 24-32 GB VRAM, or constrained CPU system | Small team, RAG/admin portal, light chat | `phi3:mini`, `nomic-embed-text` |
| Entry / Basic | 1 RTX Pro 6000 class GPU, around 48-96 GB VRAM | Department deployment | `llama3.1:8b`, `mistral:7b`, `codellama:13b`, `nomic-embed-text` |
| Pro | 2+ high VRAM GPUs | Multi-team deployment, heavier coding/RAG/fine-tuning workflows | Entry models plus `llama3.1:70b`, `mixtral:8x7b`, `deepseek-coder-v2:16b` |
| Max | 4-8 enterprise GPUs such as H100/H200/A100 class | Enterprise deployment, large models, high concurrency | Pro models plus `llama3.1:405b`, `mixtral:8x22b` |

Large models can be pulled later. The ISO does not need to contain them.

```bash
bash models/pull-models.sh --tier=starter
bash models/pull-models.sh --tier=basic
bash models/pull-models.sh --tier=pro
bash models/pull-models.sh --tier=max
```
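
For planning disk space or bandwidth, the default model list per tier can be written out from the Tier Guide table. The sketch below only restates that table; `tier_models` is a hypothetical helper, and the list actually used by `models/pull-models.sh` may differ.

```bash
# Hypothetical helper: print the default models for a tier, one per line,
# following the Tier Guide table (Pro extends Entry/Basic, Max extends Pro).
tier_models() {
  case "$1" in
    starter) printf '%s\n' phi3:mini nomic-embed-text ;;
    basic)   printf '%s\n' llama3.1:8b mistral:7b codellama:13b nomic-embed-text ;;
    pro)     tier_models basic
             printf '%s\n' llama3.1:70b mixtral:8x7b deepseek-coder-v2:16b ;;
    max)     tier_models pro
             printf '%s\n' llama3.1:405b mixtral:8x22b ;;
    *) echo "unknown tier: $1" >&2; return 1 ;;
  esac
}

tier_models pro | wc -l   # counts 4 Entry/Basic models plus 3 Pro models
```

The output can be fed to Ollama one model at a time, for example `tier_models starter | while read -r m; do ollama pull "$m"; done`.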

## 7. Product Features

Nexus One AI includes these application features through the portal and backend:

| Feature | What It Does |
|---|---|
| Secure admin portal | Browser UI for setup, chat, tools, users, models, reports, and system status. |
| Authentication and sessions | JWT login, role-aware admin access, brute-force lockout, active session tracking. |
| User and team management | Admin-managed users, teams, roles, and account status. |
| Private chat | On-prem chat over local or routed models. |
| RAG knowledge base | Upload documents, index them into ChromaDB, and query private knowledge. |
| Prompt library | Government/enterprise prompt templates grouped by use case. |
| Model management | View local models, pull Ollama models, upload GGUF models, and track model status. |
| Model router | Route requests by rule to local, GPU, or external model endpoints on supported tiers. |
| Document intelligence | Parse, summarize, and extract structured information from documents. |
| Meeting assistant | Transcript/audio processing, summaries, decisions, action items, and follow-ups. |
| Agent builder | Create and run configured agents, including scheduled agent jobs. |
| Workflow automation | Run portal workflows with HTTP, email, RAG, save-to-knowledge-base, and filter steps. |
| Connectors | Store and sync supported data connectors. |
| Guardrails | Keyword, regex, and PII checks for safer prompts and responses. |
| Analytics and audit | Query logs, usage summaries, audit reports, and admin visibility. |
| Evaluation suite | Manage datasets, eval jobs, and model/prompt quality checks. |
| Fine-tuning jobs | QLoRA and advanced training paths for higher tiers. |
| API key manager | Create, list, and revoke API keys for integrations. |
| Backups and restore | Local backup, list, restore, and pre-restore safety snapshot APIs. |
| System readiness | Feasibility, license, and readiness reports for handover and support. |

## 8. Features By Tier

The backend exposes this same matrix from `GET /api/license`.

| Feature | Starter | Entry / Basic | Pro | Max |
|---|---|---|---|---|
| Max users | 10 | 25 | 100 | Custom |
| Portal | Yes | Yes | Yes | Yes |
| Private chat | Yes | Yes | Yes | Yes |
| RAG knowledge base | Yes | Yes | Advanced | Advanced |
| Meeting assistant | No | Yes | Yes | Yes |
| Workflows | Basic | Basic | Advanced | Advanced |
| Connectors | No | Limited | Yes | Yes |
| Model router | No | No | Yes | Yes |
| Audit reports | Yes | Yes | Yes | Yes |
| Backup and restore | Yes | Yes | Yes | Yes |
| Guardrails | Basic | Basic | Advanced | Advanced |
| GPU inference | No | Optional | Yes | Yes |
| Fine-tuning | No | No | QLoRA | Advanced |
| DeepSpeed / distributed training | No | No | No | Custom |

## 9. What Gets Installed

All tiers install the Nexus One AI portal, backend API, nginx, health/readiness
reporting, license/tier handling, and selected AI tools.

| Component | Port | Notes |
|---|---:|---|
| Nexus One AI portal | 80 | Main UI served by nginx. |
| cezen-api backend | 8080 | FastAPI backend, systemd service `cezen-api`. |
| Ollama | 11434 | Local model inference. |
| Open WebUI | 3001 | Chat UI. |
| ChromaDB | 8100 | Vector database for RAG. |
| vLLM | 8000 | OpenAI-compatible serving path, mainly Pro/Max. |
| JupyterLab | 8888 | Notebook environment. |
| MLflow | 5000 | Experiment tracking. |
| MinIO | 9001 | S3-compatible object/model storage. |
| Grafana | 3000 | Monitoring dashboard. |

## 10. Admin And Readiness APIs

| API | Purpose |
|---|---|
| `GET /api/license` | Current tier, feature matrix, and safe license metadata. |
| `GET /api/system/feasibility` | Hardware feasibility report or live fallback. |
| `GET /api/system/readiness-report` | License + feasibility + install readiness payload. |
| `GET /api/audit/report?days=7` | Audit summary for handover/admin review. |
| `GET /api/system/backups` | List local backups. |
| `POST /api/system/backups` | Create local backup. |
| `POST /api/system/backups/{name}/restore` | Restore backup with pre-restore safety snapshot. |

Backup helper:

```bash
sudo bash scripts/cezen-backup.sh backup
sudo bash scripts/cezen-backup.sh list
sudo bash scripts/cezen-backup.sh restore /opt/cezen/backups/cezen-backup-YYYYmmdd-HHMMSS.zip
```
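
The HTTP endpoints in the table can also be called directly with `curl`. The sketch below assumes the backend is reachable on port 8080 of the local host, as in the component table. Whether the endpoints require a login token, and the `Authorization: Bearer` header format used here, are assumptions; `api_call` and `backup_restore_url` are hypothetical helper names.

```bash
# Base URL of the backend; override CEZEN_API to target another host.
CEZEN_API="${CEZEN_API:-http://localhost:8080}"

# Hypothetical helper: build the restore URL for a named backup.
backup_restore_url() {
  echo "$CEZEN_API/api/system/backups/$1/restore"
}

# Hypothetical wrapper: call an endpoint with the given HTTP method.
# If TOKEN is set, it is sent as a bearer token (assumed header format).
api_call() {
  local method="$1" url="$2"
  local -a auth=()
  if [ -n "${TOKEN:-}" ]; then
    auth=(--header "Authorization: Bearer $TOKEN")
  fi
  curl --silent --show-error --request "$method" "${auth[@]}" "$url"
}

# Examples:
#   api_call GET  "$CEZEN_API/api/system/backups"
#   api_call POST "$CEZEN_API/api/system/backups"
#   api_call POST "$(backup_restore_url <name>)"
```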

## 11. Post-Install Checks

Run these after install:

```bash
systemctl status cezen-api --no-pager
systemctl status cezen-phase2.service --no-pager
curl -s http://localhost:8080/api/settings/branding
curl -s http://localhost:8080/api/system/feasibility
```

Check service ports:

```bash
ss -lntp
```
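
To compare the listening sockets against the ports in the component table, the `ss` output can be filtered. This is a rough sketch: `missing_ports` is a hypothetical helper, and it assumes the default column layout of `ss -lnt`, where the fourth column is the local address and port. Not every tier runs every component, so a missing port is not necessarily a fault.

```bash
# Ports from the "What Gets Installed" table.
EXPECTED_PORTS="80 8080 11434 3001 8100 8000 8888 5000 9001 3000"

# Hypothetical helper: read `ss -lnt` output on stdin and print every
# expected port that has no listening socket.
missing_ports() {
  local listening port
  # Skip the header line; keep only the port part of Local Address:Port.
  listening="$(awk 'NR > 1 { n = split($4, a, ":"); print a[n] }')"
  for port in $EXPECTED_PORTS; do
    echo "$listening" | grep -qx "$port" || echo "$port"
  done
}
```

Run it as `ss -lnt | missing_ports`; no output means every expected port is listening.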

Check Ollama models:

```bash
curl -s http://localhost:11434/api/tags
```
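
To check for one specific model rather than reading the whole listing, the response can be filtered. The sketch below uses a plain text match on the `name` field instead of a JSON parser, so it is only a rough check; `has_model` is a hypothetical helper. It also accepts a `:latest` suffix, since Ollama lists untagged models that way.

```bash
# Hypothetical helper: read the /api/tags JSON on stdin and succeed if a
# model with the given name (optionally suffixed with :latest) is listed.
has_model() {
  grep -Eq "\"name\":[[:space:]]*\"$1(:latest)?\""
}
```

For example: `curl -s http://localhost:11434/api/tags | has_model phi3:mini && echo present`.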

## 12. Test Without A GPU

On a MacBook:

```bash
multipass launch 22.04 --name cezen-test --cpus 4 --memory 8G --disk 40G
multipass shell cezen-test
```

Inside the VM:

```bash
git clone https://cgit.cezentech.com/jinojose/aipackage.git
cd aipackage
sudo bash install.sh --feasibility-only
sudo bash install.sh --software-only --profile=auto --skip-model-pull
```

No GPU will be detected. That is expected.

## 13. Change Default Passwords Before Customer Handover

Before shipping to a customer, rotate these:

- Initial OS/admin account password.
- JupyterLab token: `/opt/cezen/.jupyter/jupyter_lab_config.py`
- MinIO credentials: `/etc/default/minio`
- Grafana admin password.
- Any temporary portal/backend admin credentials.
- Any staging license key if the final license is issued later.
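
When rotating several credentials at once, generating each replacement value from the system random source avoids reusing a guessable password. A minimal sketch follows, assuming standard coreutils (`head`, `od`, `tr`); `gen_secret` is a hypothetical helper. How each value is applied depends on the individual service's configuration and is not shown here.

```bash
# Hypothetical helper: print a random hex string read from /dev/urandom.
# The argument is the number of random bytes (default 16, giving 32 hex chars).
gen_secret() {
  local bytes="${1:-16}"
  head -c "$bytes" /dev/urandom | od -An -tx1 | tr -d ' \n'
  echo
}
```

For example, `gen_secret 24` prints a 48-character value that can be pasted into the relevant configuration file or password prompt.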

## 14. Useful Files

```text
cgit/
├── install.sh                         # Main installer entry point
├── autoinstall/                       # ISO first-boot setup and web setup
├── scripts/cezen-feasibility.sh       # Existing-server feasibility checker
├── scripts/cezen-backup.sh            # Backup/restore helper
├── ansible/
│   ├── phase1_nvidia.yml              # NVIDIA/CUDA phase
│   ├── starter.yml                    # Starter tier
│   ├── entry.yml                      # Entry/Basic tier
│   ├── pro.yml                        # Pro tier
│   ├── max.yml                        # Max tier
│   └── roles/
│       ├── cezen-backend/             # FastAPI backend, cezen-api service
│       ├── cezen-nginx/               # Portal/nginx deployment
│       ├── ollama/                    # Ollama + Open WebUI
│       ├── chromadb/                  # RAG vector DB
│       ├── vllm/                      # vLLM serving
│       ├── jupyterlab/                # Notebooks
│       ├── mlflow/                    # Experiment tracking
│       ├── minio/                     # Object storage
│       └── monitoring/                # Grafana/Prometheus/DCGM
├── cezen-portal/                      # Packaged portal UI
└── models/pull-models.sh              # Pull tier-specific models
```