diff --git a/README.md b/README.md index 40c24e7..8f0aa39 100644 --- a/README.md +++ b/README.md @@ -1,65 +1,188 @@ -# Nexus One AI — Installer +# Nexus One AI Installer -## Quick Start +This repository is the source of truth for Nexus One AI ISO and server installs. +The ISO keeps itself small by pulling this package from cgit during setup, then +the installer deploys the selected tier on the target server. -```bash -git clone -cd cgit -sudo bash install.sh +## 1. Choose The Install Path + +| Scenario | Use This Path | +|---|---| +| New appliance/server with ISO | Boot from the Nexus One AI ISO and complete first-boot setup. | +| PSU/offline field install by pendrive | Boot the ISO from USB, enter license/tier details during first boot, upload large models later if needed. | +| Existing Ubuntu server | Clone this repo and run the feasibility check before installing. | +| Lab test without GPU | Use Multipass/VM and expect GPU services to be limited. | + +## 2. New ISO Install + +1. Flash the Nexus One AI ISO to a USB drive or attach it to the VM/server. +2. Boot the server from the ISO. +3. On the Ubuntu installer network screen, choose DHCP or the final static IP. +4. Let Ubuntu finish installation and reboot. +5. On first boot, open the setup URL shown on the server console: + +```text +http:// ``` -Server reboots automatically after NVIDIA drivers install. Phase 2 runs on its own after reboot. - -On the custom ISO, Ubuntu autoinstall now pauses on the installer network screen so the operator can choose the final IP address from the VM console before installation continues. - -## Software-Only / Existing Hardware - -Run a feasibility scan before quoting or installing on customer-owned hardware: +6. Complete the setup wizard: + - Network: DHCP or static IP. + - License & customer details: customer name, project/customer ID, contact email, license key, support date. + - Tier: Starter, Entry, Pro, or Max. + - Tools: keep defaults unless a component should be skipped. +7. Click **Start Installation**. +8. Wait for Phase 1 NVIDIA driver setup. The server may reboot once. +9. After reboot, Phase 2 continues automatically through `cezen-phase2.service`. +10. Monitor progress: ```bash -bash scripts/cezen-feasibility.sh +ssh cezen@ +sudo journalctl -fu cezen-phase2.service +sudo tail -f /var/log/cezen-install.log ``` -The checker reports CPU, RAM, disk, NVIDIA GPU/VRAM, tool readiness, available features, and a recommended Cezen profile. It writes JSON to `/opt/cezen/feasibility.json` when possible, otherwise `./feasibility.json`. +11. Open the portal after install: -Install on existing hardware without the appliance NVIDIA phase: +```text +http:/// +``` + +## 3. PSU / Pendrive Field Install + +Use this when a team physically visits the site and installs from a USB drive. + +1. Carry the latest ISO on a bootable USB drive. +2. Boot the PSU/customer server from the USB. +3. Configure the final network on the Ubuntu installer screen. +4. After first boot, use either: + - Browser setup at `http://`, or + - Physical console terminal wizard if no browser is available. +5. Enter customer/license details during setup. If the final license key is not available, leave it blank; the system records the install as field staging/evaluation. +6. Select the commercial tier sold to the customer. +7. Complete install. +8. Upload or pull large models later after bandwidth/storage is confirmed. + +License details are stored on the installed server at: + +```text +/opt/cezen/license.json +``` + +Installer selections are stored at: + +```text +/opt/cezen/install.conf +``` + +## 4. Existing Server Feasibility Check + +Run this before quoting, committing a tier, or installing on customer-owned hardware. + +```bash +git clone https://cgit.cezentech.com/jinojose/aipackage.git +cd aipackage +sudo bash install.sh --feasibility-only +``` + +The report checks CPU, RAM, disk, NVIDIA GPU/VRAM, and likely supported features. +It writes JSON to: + +```text +/opt/cezen/feasibility.json +``` + +If `/opt/cezen` is not writable, it writes: + +```text +./feasibility.json +``` + +Recommended interpretation: + +| Result | Meaning | +|---|---| +| `core` | Portal/backend only; no local model serving recommended. | +| `cpu-ai` | CPU-only RAG/chat possible, but constrained. | +| `gpu-starter` | Starter GPU deployment. | +| `gpu-standard` | Entry tier style deployment. | +| `gpu-pro` | Pro tier candidate. | +| `gpu-max` | Max tier candidate. | + +## 5. Existing Server Install + +After feasibility check, install on an existing Ubuntu server: ```bash sudo bash install.sh --software-only --profile=auto ``` -For small systems or slow customer networks, the installer skips default model downloads on lightweight profiles. To force the same behavior manually: +For small systems or slow customer networks, skip default model downloads: ```bash sudo bash install.sh --software-only --profile=cpu-ai --skip-model-pull ``` -Profiles: +To force a commercial tier: -| Profile | Use When | Installs | -|---|---|---| -| `core` | no GPU / low RAM | portal, backend, nginx, health/metrics API | -| `cpu-ai` | 32 GB+ RAM, no usable GPU | core + Chroma/Ollama CPU path, model pull optional | -| `gpu-starter` | 24-32 GB VRAM | local AI starter stack, model pull optional | -| `gpu-standard` | 48-96 GB VRAM | standard GPU stack | -| `gpu-pro` | multi/high-VRAM GPU | advanced GPU stack | -| `gpu-max` | multi-node or HGX-class | full stack, custom sizing | +```bash +sudo bash install.sh --software-only --tier=starter +sudo bash install.sh --software-only --tier=basic +sudo bash install.sh --software-only --tier=pro +sudo bash install.sh --software-only --tier=max +``` -## Sellable v1 Admin APIs +The installer warns if selected tier and hardware recommendation do not match. +The selected tier still wins, because the sale/license decision is commercial. -The backend exposes the first productization APIs for software-only and appliance deployments: +## 6. Tier Guide + +| Tier | Target Hardware | Typical Use | Default Models | +|---|---|---|---| +| Starter | 1 GPU around 24-32 GB VRAM, or constrained CPU system | Small team, RAG/admin portal, light chat | `phi3:mini`, `nomic-embed-text` | +| Entry / Basic | 1 RTX Pro 6000 class GPU, around 48-96 GB VRAM | Department deployment | `llama3.1:8b`, `mistral:7b`, `codellama:13b`, `nomic-embed-text` | +| Pro | 2+ high VRAM GPUs | Multi-team deployment, heavier coding/RAG/fine-tuning workflows | Entry models plus `llama3.1:70b`, `mixtral:8x7b`, `deepseek-coder-v2:16b` | +| Max | 4-8 enterprise GPUs such as H100/H200/A100 class | Enterprise deployment, large models, high concurrency | Pro models plus `llama3.1:405b`, `mixtral:8x22b` | + +Large models can be pulled later. The ISO does not need to contain them. + +```bash +bash models/pull-models.sh --tier=starter +bash models/pull-models.sh --tier=basic +bash models/pull-models.sh --tier=pro +bash models/pull-models.sh --tier=max +``` + +## 7. What Gets Installed + +All tiers install the Nexus One AI portal, backend API, nginx, health/readiness +reporting, license/tier handling, and selected AI tools. + +| Component | Port | Notes | +|---|---:|---| +| Nexus One AI portal | 80 | Main UI served by nginx. | +| cezen-api backend | 8080 | FastAPI backend, systemd service `cezen-api`. | +| Ollama | 11434 | Local model inference. | +| Open WebUI | 3001 | Chat UI. | +| ChromaDB | 8100 | Vector database for RAG. | +| vLLM | 8000 | OpenAI-compatible serving path, mainly Pro/Max. | +| JupyterLab | 8888 | Notebook environment. | +| MLflow | 5000 | Experiment tracking. | +| MinIO | 9001 | S3-compatible object/model storage. | +| Grafana | 3000 | Monitoring dashboard. | + +## 8. Admin And Readiness APIs | API | Purpose | |---|---| -| `GET /api/license` | Shows current tier, feature matrix, and whether the tier is locked by Cezen. | -| `GET /api/system/feasibility` | Returns the generated hardware feasibility report or live fallback. | -| `GET /api/system/readiness-report` | Combines license, feasibility, and install readiness into a customer-facing report payload. | -| `GET /api/audit/report?days=7` | Basic audit summary for handover and admin review. | -| `GET /api/system/backups` | Lists local backups. | -| `POST /api/system/backups` | Creates a local backup of Cezen data. | -| `POST /api/system/backups/{name}/restore` | Restores a named local backup and creates a pre-restore safety snapshot. | +| `GET /api/license` | Current tier, feature matrix, and safe license metadata. | +| `GET /api/system/feasibility` | Hardware feasibility report or live fallback. | +| `GET /api/system/readiness-report` | License + feasibility + install readiness payload. | +| `GET /api/audit/report?days=7` | Audit summary for handover/admin review. | +| `GET /api/system/backups` | List local backups. | +| `POST /api/system/backups` | Create local backup. | +| `POST /api/system/backups/{name}/restore` | Restore backup with pre-restore safety snapshot. | -CLI backup helper: +Backup helper: ```bash sudo bash scripts/cezen-backup.sh backup @@ -67,74 +190,84 @@ sudo bash scripts/cezen-backup.sh list sudo bash scripts/cezen-backup.sh restore /opt/cezen/backups/cezen-backup-YYYYmmdd-HHMMSS.zip ``` -## What Gets Installed (Entry Tier) +## 9. Post-Install Checks -| Service | Port | Notes | -|---|---|---| -| Ollama | 11434 | LLM inference, 2 models pre-loaded | -| Open WebUI | 3001 | Chat interface | -| vLLM | 8000 | OpenAI-compatible API (start manually) | -| JupyterLab | 8888 | Token: `cezen2024` | -| ChromaDB | 8100 | Vector DB for RAG | -| MLflow | 5000 | Experiment tracking | -| MinIO | 9001 | Object storage (user: cezenadmin / Cezen@2024!) | -| Grafana | 3000 | GPU + system monitoring (admin / cezen2024) | - -## Testing Without a GPU (Multipass) +Run these after install: + +```bash +systemctl status cezen-api --no-pager +systemctl status cezen-phase2.service --no-pager +curl -s http://localhost:8080/api/settings/branding +curl -s http://localhost:8080/api/system/feasibility +``` + +Check service ports: + +```bash +ss -lntp +``` + +Check Ollama models: + +```bash +curl -s http://localhost:11434/api/tags +``` + +## 10. Test Without A GPU + +On a MacBook: ```bash -# On your MacBook: multipass launch 22.04 --name cezen-test --cpus 4 --mem 8G --disk 40G multipass shell cezen-test - -# Inside the VM: -git clone -sudo bash install.sh ``` -NVIDIA driver install will succeed but `nvidia-smi` won't show GPUs — that's expected. All other services will run fine. - -## Pull More Models +Inside the VM: ```bash -bash models/pull-models.sh --tier=starter # phi3:mini + embeddings -bash models/pull-models.sh --tier=basic # llama3.1:8b, mistral:7b, codellama -bash models/pull-models.sh --tier=pro # + llama3.1:70b, mixtral, deepseek-coder -bash models/pull-models.sh --tier=max # + llama3.1:405b, mixtral:8x22b +git clone https://cgit.cezentech.com/jinojose/aipackage.git +cd aipackage +sudo bash install.sh --feasibility-only +sudo bash install.sh --software-only --profile=auto --skip-model-pull ``` -## File Structure +No GPU will be detected. That is expected. -``` -cgit/ -├── install.sh ← Entry point -├── ansible/ -│ ├── phase1_nvidia.yml ← Phase 1: drivers (triggers reboot) -│ ├── starter.yml ← Phase 2: Starter tier (1 GPU, small team) -│ ├── entry.yml ← Phase 2: Basic tier (1–2 GPU, department) -│ ├── pro.yml ← Phase 2: Pro tier (2+ GPU, multi-team) -│ ├── max.yml ← Phase 2: Max tier (4–8 GPU, enterprise) -│ └── roles/ -│ ├── base/ ← OS, Python, Miniconda, LangChain -│ ├── nvidia/ ← Drivers, CUDA 12.4, cuDNN 9 -│ ├── docker/ ← Docker CE + NVIDIA Container Toolkit -│ ├── k3s/ ← Lightweight Kubernetes -│ ├── ollama/ ← Ollama + Open WebUI -│ ├── vllm/ ← vLLM inference server -│ ├── jupyterlab/ ← JupyterLab notebooks -│ ├── chromadb/ ← Vector database -│ ├── mlflow/ ← Experiment tracking -│ ├── minio/ ← Object storage -│ └── monitoring/ ← Grafana + Prometheus + DCGM -└── models/ - └── pull-models.sh ← Pull additional models -``` +## 11. Change Default Passwords Before Customer Handover -## Change Default Passwords - -Before shipping to a customer, update these: +Before shipping to a customer, rotate these: +- Initial OS/admin account password. - JupyterLab token: `/opt/cezen/.jupyter/jupyter_lab_config.py` -- MinIO: `/etc/default/minio` -- Grafana: environment vars in monitoring role, or via UI after first login -- MLflow: no auth by default (add reverse proxy if needed) +- MinIO credentials: `/etc/default/minio` +- Grafana admin password. +- Any temporary portal/backend admin credentials. +- Any staging license key if the final license is issued later. + +## 12. Useful Files + +```text +cgit/ +├── install.sh # Main installer entry point +├── autoinstall/ # ISO first-boot setup and web setup +├── scripts/cezen-feasibility.sh # Existing-server feasibility checker +├── scripts/cezen-backup.sh # Backup/restore helper +├── ansible/ +│ ├── phase1_nvidia.yml # NVIDIA/CUDA phase +│ ├── starter.yml # Starter tier +│ ├── entry.yml # Entry/Basic tier +│ ├── pro.yml # Pro tier +│ ├── max.yml # Max tier +│ └── roles/ +│ ├── cezen-backend/ # FastAPI backend, cezen-api service +│ ├── cezen-nginx/ # Portal/nginx deployment +│ ├── ollama/ # Ollama + Open WebUI +│ ├── chromadb/ # RAG vector DB +│ ├── vllm/ # vLLM serving +│ ├── jupyterlab/ # Notebooks +│ ├── mlflow/ # Experiment tracking +│ ├── minio/ # Object storage +│ └── monitoring/ # Grafana/Prometheus/DCGM +├── cezen-portal/ # Packaged portal UI +└── models/pull-models.sh # Pull tier-specific models +```