Self-Hosting Guide

StudioBrain can be self-hosted on your own infrastructure for full control over data, performance, and security. This guide covers the architecture, prerequisites, configuration, and deployment of a complete StudioBrain instance.

Architecture Overview

StudioBrain uses a two-host topology that separates the application services from the GPU-accelerated AI service.

Host 1: App Server                          Host 2: GPU Server (optional)
+--------------------------------------+    +----------------------------+
|  Docker Compose Stack                |    |  AI Service Container      |
|                                      |    |                            |
|  +----------+    +-----------+       |    |  +--------------------+    |
|  |  Caddy   |--->| Frontend  |       |    |  | studiobrain-ai     |    |
|  | (HTTPS)  |    | (Next.js) |       |    |  | (FastAPI + PyTorch)|    |
|  +----------+    +-----+-----+       |    |  +--------------------+    |
|                        |             |    |       |                    |
|                  /api/* proxy        |    |       | GPU (NVIDIA)       |
|                        |             |    +----------------------------+
|                  +-----v-----+       |
|                  |  Backend  |       |         Databases (optional)
|                  | (FastAPI) |       |    +----------------------------+
|                  +-----+-----+       |    | Auth DB    (PostgreSQL)    |
|                        |             |    | Content DB (PostgreSQL)    |
|                  NFS / Local FS      |    | Qdrant     (vectors)       |
|                  /data/content/      |    | Redis      (cache/state)   |
+--------------------------------------+    +----------------------------+

Why two hosts? The app server handles standard web traffic and does not need a GPU. The AI service requires an NVIDIA GPU for local model inference, embedding generation, and vision analysis. Separating them lets you place the AI service on a machine with appropriate hardware while running the app server on any commodity hardware.

Single-host mode. If you do not need the AI service (or plan to use only external API providers like OpenAI/Anthropic), you can run the entire stack on a single host without a GPU.

Prerequisites

| Requirement              | Minimum                  | Recommended                          |
|--------------------------|--------------------------|--------------------------------------|
| Docker                   | 24.0+                    | Latest stable                        |
| Docker Compose           | v2.20+                   | Latest stable                        |
| RAM (app host)           | 4 GB                     | 8 GB                                 |
| CPU (app host)           | 2 cores                  | 4 cores                              |
| Disk (app host)          | 20 GB                    | 100 GB                               |
| GPU (AI host)            | NVIDIA with 8 GB VRAM    | NVIDIA with 24+ GB VRAM              |
| NVIDIA Driver            | 535+                     | Latest stable                        |
| NVIDIA Container Toolkit | Required for AI service  | ---                                  |
| Storage                  | Local filesystem or NFS  | NFS or shared storage for multi-host |

Operating system: Any Linux distribution that supports Docker. Ubuntu 22.04+ and Debian 12+ are tested. macOS and Windows (via WSL2) work for development but are not recommended for production.
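As a quick sanity check before installing, something like the following can compare the host's Docker version against the table above. This is a sketch: `ver_ge` is a small helper defined here, not a Docker tool, and the check only reports status rather than failing.

```shell
# ver_ge VERSION MINIMUM -- true when VERSION >= MINIMUM in version-sort order.
ver_ge() { [ "$(printf '%s\n' "$2" "$1" | sort -V | head -n1)" = "$2" ]; }

if command -v docker >/dev/null 2>&1; then
  # Falls back to "0" if the daemon is unreachable.
  docker_ver=$(docker version --format '{{.Server.Version}}' 2>/dev/null || echo 0)
  if ver_ge "$docker_ver" 24.0; then
    echo "docker $docker_ver: OK"
  else
    echo "docker $docker_ver: upgrade to 24.0+"
  fi
else
  echo "docker: not installed"
fi
```

The same `ver_ge` helper works for the Compose version (`docker compose version --short`) and the NVIDIA driver version reported by `nvidia-smi`.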

Docker Compose Stack

The core application runs as three containers managed by a single docker-compose.yml:

version: "3.8"
 
services:
  caddy:
    image: caddy:2-alpine
    container_name: studiobrain-caddy
    restart: unless-stopped
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - ./Caddyfile:/etc/caddy/Caddyfile
      - caddy_data:/data
      - caddy_config:/config
    depends_on:
      - frontend
 
  frontend:
    build:
      context: ..
      dockerfile: docker/Dockerfile.frontend
      args:
        BACKEND_URL: http://backend:8201
        AI_SERVICE_URL: http://your-gpu-host:8202
    container_name: studiobrain-frontend
    restart: unless-stopped
    expose:
      - "3100"
    depends_on:
      - backend
 
  backend:
    build:
      context: ..
      dockerfile: docker/Dockerfile.backend
    container_name: studiobrain-backend
    restart: unless-stopped
    ports:
      - "8201:8201"
    env_file:
      - .env
    volumes:
      - /path/to/content:/data/content
      - /path/to/db:/data/db
 
volumes:
  caddy_data:
  caddy_config:

How Requests Flow

  1. Caddy terminates HTTPS on port 443 and reverse-proxies all traffic to the frontend container on port 3100.
  2. Frontend (Next.js 15) serves the React application. Requests to /api/* are proxied internally to the backend container on port 8201.
  3. Frontend proxies /ai-proxy/* requests to the AI service at the AI_SERVICE_URL configured at build time.
  4. Backend (FastAPI) handles all business logic, entity CRUD, authentication, and markdown synchronization. It reads and writes entity markdown files from the mounted /data/content/ volume.

Caddy Configuration

A minimal Caddyfile for HTTPS with a custom domain:

studio.yourdomain.com {
    reverse_proxy frontend:3100
}

Caddy automatically provisions and renews TLS certificates via Let’s Encrypt. For local/internal deployments without a public domain, you can use a self-signed certificate or plain HTTP:

:80 {
    reverse_proxy frontend:3100
}

Environment Variables

Backend (.env)

Create a .env file in your docker/ directory with the following variables:

| Variable             | Required | Default                             | Description |
|----------------------|----------|-------------------------------------|-------------|
| CONTENT_BASE_PATH    | Yes      | /data/content                       | Root path for entity markdown files and assets |
| DATABASE_URL         | Yes      | sqlite:////data/db/city_brains.db   | SQLite connection string for desktop/single-user mode |
| AUTH_DATABASE_URL    | No       | ---                                 | PostgreSQL connection for auth database (cloud mode) |
| CONTENT_DATABASE_URL | No       | ---                                 | PostgreSQL connection for content database (cloud mode) |
| QDRANT_URL           | No       | ---                                 | Qdrant vector database URL |
| QDRANT_API_KEY       | No       | ---                                 | Qdrant API key for authenticated access |
| REDIS_URL            | No       | ---                                 | Redis connection string for sessions, rate limits, PubSub |
| AI_SERVICE_URL       | No       | http://localhost:8202               | URL of the AI service |
| JWT_SECRET           | Yes      | ---                                 | Secret key for JWT token signing. Must be stable across restarts or all user sessions will be invalidated |
| CORS_ORIGINS         | No       | *                                   | Comma-separated list of allowed CORS origins |
| SKIP_STARTUP_SYNC    | No       | false                               | Set to true to skip the initial content scan on startup (faster restarts during development) |
| STRIPE_SECRET_KEY    | No       | ---                                 | Stripe secret key for billing integration |
| STRIPE_WEBHOOK_SECRET| No       | ---                                 | Stripe webhook signing secret |
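Since JWT_SECRET must be a stable, high-entropy value, it is worth generating it once and storing it in the `.env` file. One way to produce a suitable secret, assuming `openssl` is available on the host:

```shell
# Generate a 64-character hex secret. Any stable random string of 32+
# characters works; openssl is just a convenient source of randomness.
JWT_SECRET=$(openssl rand -hex 32)
echo "JWT_SECRET=${JWT_SECRET}"
```

Append the printed line to your `.env` and keep it out of version control; rotating it logs every user out.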

Example .env file for a self-hosted single-user deployment:

# Storage
CONTENT_BASE_PATH=/data/content
DATABASE_URL=sqlite:////data/db/city_brains.db
 
# Authentication
JWT_SECRET=your-secure-random-string-at-least-32-characters
 
# AI Service (optional, omit if not running AI service)
AI_SERVICE_URL=http://10.0.0.50:8202
 
# CORS
CORS_ORIGINS=https://studio.yourdomain.com
 
# Performance
SKIP_STARTUP_SYNC=false

Example .env for a multi-tenant cloud deployment with PostgreSQL:

# Storage
CONTENT_BASE_PATH=/data/content
 
# Databases
AUTH_DATABASE_URL=postgresql://studiobrain_auth:your_password@10.0.0.100:5432/studiobrain_auth
CONTENT_DATABASE_URL=postgresql://studiobrain_app:your_password@10.0.0.101:5432/studiobrain_content
 
# Vector Store
QDRANT_URL=http://10.0.0.102:6333
QDRANT_API_KEY=your-qdrant-api-key
 
# Cache & State
REDIS_URL=redis://studiobrain_app:your_password@10.0.0.103:6379/0
 
# Authentication
JWT_SECRET=your-secure-random-string-at-least-32-characters
 
# AI
AI_SERVICE_URL=http://10.0.0.50:8202
 
# Billing (optional)
STRIPE_SECRET_KEY=sk_live_your_stripe_key
STRIPE_WEBHOOK_SECRET=whsec_your_webhook_secret

Frontend Build-Time Variables

These are passed as Docker build arguments and are baked into the Next.js application at build time:

| Variable       | Required | Default             | Description |
|----------------|----------|---------------------|-------------|
| BACKEND_URL    | Yes      | http://backend:8201 | Backend API URL. Use the Docker service name when running in the same Compose stack |
| AI_SERVICE_URL | No       | ---                 | AI service URL for frontend proxy rewrites. Must be reachable from the frontend container |

AI Service (.env)

If running the AI service, create a separate .env file for it:

| Variable             | Required | Default       | Description |
|----------------------|----------|---------------|-------------|
| SERVICE_PORT         | No       | 8202          | Port the AI service listens on |
| BACKEND_URL          | Yes      | ---           | URL of the backend API (e.g., http://10.0.0.10:8201) |
| PROJECT_ROOT         | Yes      | /data/content | Path to entity content (same files the backend reads) |
| CONTENT_DATABASE_URL | No       | ---           | PostgreSQL connection for content (read-only access) |
| QDRANT_URL           | No       | ---           | Qdrant vector database URL |
| QDRANT_API_KEY       | No       | ---           | Qdrant API key |
| REDIS_URL            | No       | ---           | Redis connection string |
| RAG_BACKEND          | No       | chroma        | Vector backend: chroma for local (ChromaDB) or qdrant for cloud |
| OPENAI_API_KEY       | No       | ---           | OpenAI API key for GPT models |
| ANTHROPIC_API_KEY    | No       | ---           | Anthropic API key for Claude models |
| GOOGLE_API_KEY       | No       | ---           | Google API key for Gemini models |
| GROK_API_KEY         | No       | ---           | xAI API key for Grok models |

Example AI service .env:

SERVICE_PORT=8202
BACKEND_URL=http://10.0.0.10:8201
PROJECT_ROOT=/data/content
 
# Vector store (use 'chroma' for local-only, 'qdrant' for cloud)
RAG_BACKEND=chroma
 
# AI providers (add keys for providers you want to use)
OPENAI_API_KEY=sk-your-openai-key
ANTHROPIC_API_KEY=sk-ant-your-anthropic-key

Storage Configuration

Content Storage

StudioBrain stores all entity data as markdown files with YAML frontmatter. The backend reads from and writes to the path specified by CONTENT_BASE_PATH. This directory must be mounted into the backend container.

Local filesystem:

# docker-compose.yml
backend:
  volumes:
    - /home/studiobrain/content:/data/content
    - /home/studiobrain/db:/data/db

NFS mount (recommended for multi-host):

# On the host machine, mount the NFS share
sudo mount -t nfs 10.0.0.5:/export/studiobrain/content /mnt/studiobrain/content
sudo mount -t nfs 10.0.0.5:/export/studiobrain/db /mnt/studiobrain/db
# docker-compose.yml
backend:
  volumes:
    - /mnt/studiobrain/content:/data/content
    - /mnt/studiobrain/db:/data/db

Add to /etc/fstab for persistent mounts:

10.0.0.5:/export/studiobrain/content  /mnt/studiobrain/content  nfs  defaults,_netdev  0  0
10.0.0.5:/export/studiobrain/db       /mnt/studiobrain/db       nfs  defaults,_netdev  0  0
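Before bringing the stack up, it can be worth confirming the shares are actually mounted, since Docker will happily bind-mount an empty directory if the NFS mount failed. A minimal check, using the example paths above:

```shell
# Warn if either NFS share is missing from the kernel mount table.
for m in /mnt/studiobrain/content /mnt/studiobrain/db; do
  if grep -qs " $m " /proc/mounts; then
    echo "$m: mounted"
  else
    echo "$m: NOT mounted"
  fi
done
```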

Content Directory Structure

The content directory follows this layout:

/data/content/
  _Templates/
    Standard/
      CHARACTER_TEMPLATE.md
      LOCATION_TEMPLATE.md
      BRAND_TEMPLATE.md
      ...
    Layouts/
      character.json
      location.json
      ...
  _Rules/
    RULES_INDEX.md
    CHARACTER_RULES.md
    LOCATION_RULES.md
    DIALOGUE_RULES.md
    ...
  _Plugins/
    plugin-name/
      plugin.json
      index.html
      ...
  Characters/
    character_name/
      CH_character_name.md
      portrait.png
      ...
  Locations/
    location_name/
      LOC_location_name.md
      concept_art.png
      ...
  Brands/
    brand_name/
      BR_brand_name.md
      logo.png
      ...

AI Service Deployment

The AI service runs on a separate host with GPU access. It uses the nvcr.io/nvidia/pytorch base image and volume-mounts the application code rather than baking it into the image.

Docker Compose Entry

Add this to a docker-compose.yml on the GPU host:

services:
  studiobrain-ai:
    image: nvcr.io/nvidia/pytorch:25.11-py3
    container_name: studiobrain-ai
    restart: unless-stopped
    ports:
      - "8202:8202"
    env_file:
      - /opt/studiobrain-ai/app/.env
    volumes:
      - /opt/studiobrain-ai/app:/app
      - /opt/studiobrain-ai/requirements-docker.txt:/requirements-docker.txt
      - /mnt/content:/data/content:ro
    working_dir: /app
    command: >
      bash -c "pip install -r /requirements-docker.txt &&
               python run_server.py"
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
    environment:
      - CUDA_VISIBLE_DEVICES=0

Key Points

  • Code is volume-mounted from the host at /opt/studiobrain-ai/app/, not copied into the image. This makes updates fast: git pull and restart.
  • Dependencies are installed at startup via pip install -r /requirements-docker.txt. First boot is slower; subsequent restarts are fast because pip caches installed packages.
  • GPU access is configured via Docker’s deploy.resources.reservations.devices section. The NVIDIA Container Toolkit must be installed on the host.
  • Content is mounted read-only (:ro) from NFS or local storage. The AI service reads entity files for RAG indexing but does not write to them.
  • CUDA_VISIBLE_DEVICES controls which GPU(s) the service can use. Set this if the host has multiple GPUs.
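A quick way to confirm the toolkit is working is to run `nvidia-smi` inside a throwaway container. This is a sketch: the image tag matches the compose file above, `--pull=never` assumes the image is already present (it will be after the first `docker compose up`), and the script only reports status rather than failing.

```shell
# Report whether containers can see the GPU. Sets $status instead of
# exiting nonzero, so it is safe on hosts without docker or a GPU.
status=ok
if ! command -v docker >/dev/null 2>&1; then
  status=docker-missing
elif ! docker run --rm --gpus all --pull=never \
       nvcr.io/nvidia/pytorch:25.11-py3 nvidia-smi; then
  status=gpu-unreachable
fi
echo "gpu check: $status"
```

A `gpu-unreachable` result usually means the NVIDIA Container Toolkit is not installed or the Docker daemon was not restarted after installing it.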

Sharing GPU Resources

If the GPU host runs other services (ComfyUI, other AI models), use CUDA_VISIBLE_DEVICES to partition GPU access:

environment:
  - CUDA_VISIBLE_DEVICES=0    # Use only GPU 0

StudioBrain dynamically loads and unloads models to manage VRAM. It can coexist with other GPU services, but heavy concurrent use may cause VRAM pressure.

HTTPS Setup

With a Public Domain

Caddy handles automatic HTTPS when you have a public domain pointing to your server:

# Caddyfile
studio.yourdomain.com {
    reverse_proxy frontend:3100
}

Ensure ports 80 and 443 are open and the domain’s DNS A record points to your server’s public IP.

With a Custom Certificate

studio.internal.company.com {
    tls /etc/caddy/certs/cert.pem /etc/caddy/certs/key.pem
    reverse_proxy frontend:3100
}

Mount your certificate files into the Caddy container:

caddy:
  volumes:
    - ./Caddyfile:/etc/caddy/Caddyfile
    - ./certs:/etc/caddy/certs:ro

Without HTTPS (Development Only)

:80 {
    reverse_proxy frontend:3100
}

First Run

On the first startup, the backend performs an initial content scan:

  1. Directory scan. The backend recursively scans CONTENT_BASE_PATH for markdown files with YAML frontmatter.
  2. Entity parsing. Each file is parsed into the database (SQLite or PostgreSQL depending on mode).
  3. Template registration. Templates from _Templates/Standard/ are loaded and registered as entity type definitions.
  4. Rule loading. Rules from _Rules/ are loaded into the rules engine.
  5. Plugin discovery. Plugins from _Plugins/ are scanned and registered.

For a project with ~500 entities, the initial scan takes 10-30 seconds. Set SKIP_STARTUP_SYNC=true to skip this on subsequent restarts if you know the content has not changed.

Verifying the Deployment

After docker compose up -d, verify all services are healthy:

# Check container status
docker compose ps
 
# Check backend health
curl http://localhost:8201/health
 
# Check frontend is serving
curl -s -o /dev/null -w "%{http_code}" http://localhost:3100
 
# Check AI service health (if deployed)
curl http://your-gpu-host:8202/health

Expected backend health response:

{
  "status": "healthy",
  "entity_count": 236,
  "entity_types": ["character", "location", "brand", "district", "faction", "item", "job"]
}
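For scripted deploys, a small polling helper avoids racing the backend's startup content scan, which can keep `/health` from responding for the first few seconds. A sketch; the URL and timeout below are assumptions to adapt to your deployment:

```shell
# wait_for_http URL TIMEOUT_SECONDS -- poll until HTTP 200 or timeout.
wait_for_http() {
  url=$1
  deadline=$(( $(date +%s) + ${2:-60} ))
  until [ "$(curl -s -o /dev/null -w '%{http_code}' "$url")" = "200" ]; do
    if [ "$(date +%s)" -ge "$deadline" ]; then
      return 1   # gave up before the endpoint came healthy
    fi
    sleep 2
  done
}

# Example: wait_for_http http://localhost:8201/health 120 && echo "backend up"
```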

Hardware Sizing Guide

Small Deployment (up to 100 users)

| Component     | Specification |
|---------------|---------------|
| App Host CPU  | 2 cores |
| App Host RAM  | 4 GB |
| App Host Disk | 20 GB SSD |
| GPU Host      | Not required (use external AI providers) |
| Database      | SQLite (single-user) or PostgreSQL on the same host |

Medium Deployment (up to 500 users)

| Component     | Specification |
|---------------|---------------|
| App Host CPU  | 4 cores |
| App Host RAM  | 8 GB |
| App Host Disk | 100 GB SSD |
| GPU Host CPU  | 4 cores |
| GPU Host RAM  | 16 GB |
| GPU Host GPU  | NVIDIA with 16+ GB VRAM (e.g., RTX 4080, A4000) |
| Database      | Dedicated PostgreSQL instance, 50 GB |
| Redis         | 2 GB RAM |
| Qdrant        | 8 GB RAM, 20 GB NVMe |

Large Deployment (up to 2,000 users)

| Component     | Specification |
|---------------|---------------|
| App Host CPU  | 8 cores |
| App Host RAM  | 16 GB |
| App Host Disk | 200 GB SSD |
| GPU Host CPU  | 8 cores |
| GPU Host RAM  | 32 GB |
| GPU Host GPU  | NVIDIA with 24+ GB VRAM (e.g., RTX 4090, A5000, RTX PRO 6000) |
| Auth DB       | Dedicated PostgreSQL, 10 GB, encrypted at rest |
| Content DB    | Dedicated PostgreSQL + RLS, 50 GB |
| Qdrant        | 8 GB RAM, 20 GB NVMe |
| Redis         | 2 GB RAM, NVMe storage |

For large deployments, separate the Auth DB from the Content DB on different hosts. See the Database Guide for the full multi-database architecture.

Template and Rules Management

Templates and rules are core to how StudioBrain defines entity types and governs AI generation. As an administrator, you will need to understand how to manage both.

Templates

Templates live in _Templates/Standard/ within your content directory. Each template is a markdown file with YAML frontmatter that defines the fields and structure for an entity type.

How templates define entity types:

  • Each template file (e.g., CHARACTER_TEMPLATE.md) defines a single entity type.
  • The YAML frontmatter specifies all fields: names, types, default values, and relationships.
  • The backend parses these templates on startup and registers them as entity type definitions.
  • The frontend generates TypeScript types and Zod validation schemas from the templates at build time.
  • Adding a new template file creates a new entity type — no code changes required.

Template file example:

---
entity_id: ""
entity_name: ""
age: 0
gender: ""
faction: ""
primary_location: ""
personality_traits: []
associated_brands: []
---
 
# Character Template
 
Description and guidelines for creating characters...

Managing templates as an admin:

  • Adding an entity type: Create a new MYTYPE_TEMPLATE.md file in _Templates/Standard/. Restart the backend to pick up the new type. Rebuild the frontend to generate TypeScript types.
  • Modifying an entity type: Edit the template’s YAML frontmatter to add, rename, or remove fields. Existing entities will retain their data; new fields will use defaults.
  • System templates: Templates with is_system: true in the database (such as Assembly and Timeline) are core types and should not be removed.
  • Template versioning: Templates are versioned in the database. The version field increments on each change, which the sync system uses for conflict detection.
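As an illustration of the first point, a hypothetical "vehicle" entity type could be added like this. The field names are illustrative, and `TEMPLATES_DIR` defaults to the current directory here; point it at `_Templates/Standard/` in your content directory.

```shell
# Write a new entity type template; the backend registers it on restart.
TEMPLATES_DIR=${TEMPLATES_DIR:-.}
cat > "$TEMPLATES_DIR/VEHICLE_TEMPLATE.md" <<'EOF'
---
entity_id: ""
entity_name: ""
manufacturer: ""
top_speed_kph: 0
associated_brands: []
---

# Vehicle Template

Description and guidelines for creating vehicles...
EOF
echo "template written; restart the backend to register the new type"
```

After the backend restart, rebuild the frontend so the generated TypeScript types and Zod schemas include the new type.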

Rules

Rules live in _Rules/ and govern how the AI generates content. Each rule file contains constraints, style guidelines, and validation criteria for a specific domain.

Rule files include:

| File               | Purpose |
|--------------------|---------|
| RULES_INDEX.md     | Master rules directory and priority ordering |
| CHARACTER_RULES.md | Character creation constraints (naming, personality) |
| LOCATION_RULES.md  | Location atmosphere and description guidelines |
| DIALOGUE_RULES.md  | Speech patterns and conversation rules |
| WORLD_RULES.md     | Global world consistency constraints |

How rules work:

  1. When the AI service generates content, it loads applicable rules from the rules engine.
  2. Rules are injected into the AI prompt as constraints.
  3. The AI validates its output against the rules before returning results.
  4. Rules support pre-generation validation (checking inputs) and post-generation validation (checking outputs).

Managing rules as an admin:

  • Edit rule files directly in the _Rules/ directory.
  • Changes are picked up on the next AI generation request (no restart needed).
  • Rules are scoped per entity type — add type-specific rules by creating {TYPE}_RULES.md.
  • The RULES_INDEX.md file controls which rules are active and their priority order.
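A minimal type-scoped rule file might look like the sketch below. The contents are illustrative, and `RULES_DIR` defaults to the current directory here; point it at `_Rules/` in your content directory.

```shell
# Write a type-scoped rule file; it is picked up on the next AI
# generation request, with no backend restart needed.
RULES_DIR=${RULES_DIR:-.}
cat > "$RULES_DIR/VEHICLE_RULES.md" <<'EOF'
# Vehicle Rules

- Vehicle names must be unique.
- Top speeds must respect the world's established technology level.
- Every vehicle must reference an existing brand in associated_brands.
EOF
# Add the file to RULES_INDEX.md if you need to control its priority.
```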

Deployment Commands Reference

Build and Start

cd /path/to/studiobrain/docker
docker compose up -d --build

Upgrade

cd /path/to/studiobrain
git pull origin main
cd docker
docker compose down
docker compose build
docker compose up -d

Backend-Only Rebuild

When only backend code has changed, skip the frontend rebuild for faster deploys:

cd /path/to/studiobrain/docker
docker compose build backend
docker compose up -d backend

AI Service Update

On the GPU host, update the code and restart:

cd /opt/studiobrain-ai/app
git pull origin main
 
# If only code changed (deps unchanged):
docker restart studiobrain-ai
 
# If requirements changed:
docker compose restart studiobrain-ai

View Logs

# All services
docker compose logs -f
 
# Individual service
docker logs -f studiobrain-backend
docker logs -f studiobrain-frontend
docker logs -f studiobrain-caddy
 
# AI service (on GPU host)
docker logs -f studiobrain-ai