Database Guide

StudioBrain supports two database modes: SQLite for desktop/single-user deployments and PostgreSQL for multi-tenant cloud deployments. This guide covers the full database architecture, security model, migration workflow, and scaling strategy.

Database Modes

SQLite (Desktop / Single-User)

In desktop mode, StudioBrain uses a single SQLite database file. This is the default for self-hosted and open-core installations.

  • Zero configuration. No external database server required.
  • Single file. The entire database lives at the path specified by DATABASE_URL (e.g., sqlite:////data/db/city_brains.db).
  • Single tenant. All entities use tenant_id='local' as a hardcoded value.
  • No RLS. SQLite does not support row-level security; tenant isolation is enforced at the application layer.
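
With no RLS available, the tenant filter must appear in every query the application issues. A minimal sketch of that pattern (the entities columns here are illustrative, not StudioBrain's actual schema):

```python
import sqlite3

TENANT_ID = "local"  # hardcoded in desktop mode, per the guide

def fetch_entities(conn: sqlite3.Connection, entity_type: str) -> list:
    """Every query must carry the tenant filter; SQLite will not add it for us."""
    cur = conn.execute(
        "SELECT id, name FROM entities WHERE tenant_id = ? AND type = ?",
        (TENANT_ID, entity_type),
    )
    return cur.fetchall()

# Demo against a throwaway in-memory database
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entities (id TEXT, tenant_id TEXT, type TEXT, name TEXT)")
conn.execute("INSERT INTO entities VALUES ('1', 'local', 'character', 'Ada')")
conn.execute("INSERT INTO entities VALUES ('2', 'other', 'character', 'Eve')")
print(fetch_entities(conn, "character"))  # only the tenant-filtered row
```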

Connection string format:

DATABASE_URL=sqlite:////data/db/city_brains.db

PostgreSQL (Cloud / Multi-Tenant)

In cloud mode, StudioBrain uses a five-service database architecture with strict separation between PII and content data.

  • Two PostgreSQL instances. Auth (PII/billing) and Content (entities/templates) run on separate servers.
  • Row-Level Security (RLS). Content database enforces tenant isolation at the database level.
  • Qdrant. Dedicated vector database for RAG embeddings, one collection per tenant.
  • Redis. Sessions, rate limits, real-time presence, and PubSub channels.

Connection string format:

AUTH_DATABASE_URL=postgresql://studiobrain_auth:password@auth-host:5432/studiobrain_auth
CONTENT_DATABASE_URL=postgresql://studiobrain_app:password@content-host:5432/studiobrain_content
QDRANT_URL=http://qdrant-host:6333
REDIS_URL=redis://user:password@redis-host:6379/0

Five-Service Architecture

The cloud deployment uses five dedicated database services:

+---------------------------------------------------+
|              ZONE 1: RESTRICTED (PII)             |
|                                                   |
|  +-----------------------------------------+      |
|  |  Auth DB (PostgreSQL)                   |      |
|  |                                         |      |
|  |  Tables:                                |      |
|  |    tenants       (company info, PII)    |      |
|  |    users         (emails, names, PII)   |      |
|  |    subscriptions (Stripe, billing)      |      |
|  |    oauth_tokens  (encrypted)            |      |
|  |    api_keys      (hashed)               |      |
|  |    audit_logs    (immutable)            |      |
|  |    usage_metrics (billing)              |      |
|  |    revoked_tokens                       |      |
|  |    stripe_events                        |      |
|  |                                         |      |
|  |  Access: App backend ONLY               |      |
|  |  Firewall: Block all except app server  |      |
|  +-----------------------------------------+      |
|                                                   |
+---------------------------------------------------+

+---------------------------------------------------+
|              ZONE 2: STANDARD (Content)           |
|                                                   |
|  +-----------------------------------------+      |
|  |  Content DB (PostgreSQL + RLS)          |      |
|  |                                         |      |
|  |  Tables:                                |      |
|  |    entities        (all entity types)   |      |
|  |    assets          (metadata, not files)|      |
|  |    templates       (type definitions)   |      |
|  |    plugin_data     (plugin state)       |      |
|  |    relationships   (entity links)       |      |
|  |    timeline_entries                     |      |
|  |    sync_state                           |      |
|  |                                         |      |
|  |  Access: App backend + AI service (RO)  |      |
|  |  All tables: RLS by tenant_id           |      |
|  |  NO PII. tenant_id is a UUID.           |      |
|  +-----------------------------------------+      |
|                                                   |
|  +-------------------+  +--------------------+    |
|  | Qdrant            |  | Redis              |    |
|  |                   |  |                    |    |
|  | Collection/tenant |  | Sessions (no PII)  |    |
|  | Entity embeddings |  | Rate limits        |    |
|  | HNSW indexes      |  | Presence/PubSub    |    |
|  | No PII in payload |  | BrainBits tracking |    |
|  +-------------------+  +--------------------+    |
|                                                   |
+---------------------------------------------------+

Auth Database

The Auth DB stores all personally identifiable information and billing data. It is the only database that maps tenant_id UUIDs to real company names, user emails, and billing information.

Key tables:

Table           Purpose                                           Row Estimate (Year 1)
--------------  ------------------------------------------------  ---------------------
tenants         Organizations with billing, plan, storage config  ~600
users           User accounts with email, name, auth credentials  ~2,400
subscriptions   Stripe subscription tracking                      ~600
oauth_tokens    Encrypted Google Drive/OAuth tokens               ~1,000
api_keys        Hashed API keys for programmatic access           ~200
audit_logs      Immutable append-only security log                ~1M/year
usage_metrics   Monthly billing metrics per tenant                ~7,200
revoked_tokens  Invalidated JWTs (for logout)                     Variable
stripe_events   Webhook idempotency tracking                      Variable

Security measures:

  • Encrypted at rest (LUKS or ZFS encryption)
  • TLS required for all connections (sslmode=require)
  • pgaudit extension enabled for query logging
  • Firewall restricts access to the app backend server only
  • OAuth tokens and billing addresses are AES-256 encrypted at the application layer before storage

Content Database

The Content DB stores all entity data, templates, plugin state, and relationships. It contains zero PII. The tenant_id column is a UUID — it never contains names, emails, or company information.

Key tables:

Table             Purpose                                                                RLS
----------------  ---------------------------------------------------------------------  ---
entities          All entity types in one table (schema-agnostic JSON blob)              Yes
assets            Asset metadata (filenames, AI descriptions, tags), not binary files    Yes
templates         Entity type definitions (parsed from template markdown)                Yes
plugin_data       Per-plugin key-value state                                             Yes
relationships     Entity-to-entity links                                                 Yes
timeline_entries  Chronological events for entities                                      Yes
sync_state        Client sync tracking (last push/pull timestamps)                       No
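
To make the schema-agnostic design concrete, a sketch of what one entities row might hold (only tenant_id and the JSON blob come from the guide; the other field names are illustrative):

```python
import json
import uuid

# Hypothetical shape of one `entities` row. Only tenant_id and the
# schema-agnostic JSON blob come from the guide; other fields are illustrative.
row = {
    "id": str(uuid.uuid4()),
    "tenant_id": "7c9e6679-7425-40de-944b-e07fc1f90ae7",  # a UUID, never PII
    "type": "character",
    "data": json.dumps({       # blob shape is defined by the tenant's template
        "name": "Ada",
        "role": "protagonist",
        "tags": ["act-1", "main-cast"],
    }),
}

# The application, not the database, interprets the blob
payload = json.loads(row["data"])
print(payload["name"])
```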

Row-Level Security:

Every table (except sync_state) has RLS enabled with a tenant isolation policy:

ALTER TABLE entities ENABLE ROW LEVEL SECURITY;
CREATE POLICY tenant_isolation ON entities
    USING (tenant_id = current_setting('app.current_tenant')::UUID);

The backend sets app.current_tenant on each database session, using the tenant_id claim from the JWT. Even if application-level filtering fails, the database itself prevents cross-tenant data access.
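
In practice this means running a SET on the connection before any tenant-scoped query. A hedged sketch of the helper (names are illustrative; SET cannot take bind parameters, so the UUID is validated before interpolation):

```python
import uuid

def set_tenant_sql(tenant_id: str) -> str:
    """Build the per-session SET for RLS. SET cannot take bind parameters,
    so validate the UUID first; interpolating a parsed UUID is then safe."""
    tid = uuid.UUID(tenant_id)  # raises ValueError on anything malformed
    return f"SET app.current_tenant = '{tid}'"

# Sketch of use in a request handler, after the JWT has been verified:
#   with engine.connect() as conn:
#       conn.execute(text(set_tenant_sql(claims["tenant_id"])))
#       ...run tenant-scoped queries on this connection...
print(set_tenant_sql("7c9e6679-7425-40de-944b-e07fc1f90ae7"))
```

A parameterized alternative is PostgreSQL's `SELECT set_config('app.current_tenant', $1, false)`, which sidesteps string interpolation entirely.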

Qdrant (Vector Store)

Qdrant stores vector embeddings for RAG (Retrieval-Augmented Generation). Each tenant gets its own collection with isolated HNSW indexes.

Sizing (Year 1):

Vectors:    240 paid tenants x 500 entities x 10 chunks x 384 dims x 4 bytes
            = ~1.8 GB vectors
HNSW:       ~2x vectors = ~3.6 GB indexes
Total:      ~5.5 GB storage, ~4 GB RAM for hot indexes
Collections: 240 (one per paid tenant)

Configuration:

QDRANT_URL=http://qdrant-host:6333
QDRANT_API_KEY=your-api-key
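
Provisioning a tenant amounts to one collection-create request at signup. A sketch of the request the provisioning code might build (the tenant_<hex> naming scheme is an assumption for illustration; the 384-dim/Cosine settings mirror the sizing figures above):

```python
import json

EMBED_DIM = 384  # matches the sizing figures above

def collection_request(tenant_id: str) -> tuple[str, dict]:
    """Collection name and body for PUT /collections/{name}.
    The tenant_<hex> naming scheme is an assumption for illustration."""
    name = f"tenant_{tenant_id.replace('-', '')}"
    body = {"vectors": {"size": EMBED_DIM, "distance": "Cosine"}}
    return name, body

name, body = collection_request("7c9e6679-7425-40de-944b-e07fc1f90ae7")
print(name)
print(json.dumps(body))
```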

Redis

Redis handles transient state that benefits from sub-millisecond latency:

Usage        Data Stored                                        Persistence
-----------  -------------------------------------------------  ------------------
Sessions     JWT metadata (tenant_id, user_id, role — no PII)   TTL-based eviction
Rate limits  Request counts per tenant per endpoint             TTL-based eviction
Presence     Online user indicators per tenant                  TTL-based eviction
BrainBits    Credit balance tracking                            Persisted (AOF)
PubSub       Real-time event channels per tenant                Transient
AI cache     Cached generation results                          TTL-based eviction

Memory usage (Year 1): ~25 MB of active data. The allocated 2 GB of RAM provides ample headroom.

Configuration:

REDIS_URL=redis://studiobrain_app:password@redis-host:6379/0

The AI service uses a separate Redis database number to avoid key collisions:

REDIS_URL=redis://studiobrain_ai:password@redis-host:6379/1

Security Zones

Zone 1: Restricted (Auth DB)

The Auth DB is in a restricted security zone. Access rules:

  • Firewall: Only the app backend server IP can connect on port 5432. All other IPs are blocked.
  • AI service is BLOCKED. The AI service cannot reach the Auth DB — enforced by firewall rules.
  • Dedicated DB user: studiobrain_auth with CRUD permissions. No superuser access.
  • All queries logged via pgaudit extension.
  • Full disk encryption on the host.

Zone 2: Standard (Content DB, Qdrant, Redis)

Content services are in a standard security zone. Access rules:

  • App backend: Full CRUD access with RLS enforcement.
  • AI service: Read-only access to Content DB (studiobrain_ai_ro user). Full access to Qdrant and Redis.
  • No PII exposure. The content zone never sees user emails, names, or billing information.

JWT Token Design

JWTs bridge the two zones. The token contains only UUIDs and roles — never PII:

{
  "sub": "550e8400-e29b-41d4-a716-446655440000",
  "tenant_id": "7c9e6679-7425-40de-944b-e07fc1f90ae7",
  "role": "editor",
  "plan": "indie",
  "iat": 1771882000,
  "exp": 1771968400
}

The content layer never needs to know who you are — only what tenant you belong to and what role you have.
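
To make that claim concrete, here is a minimal HS256 signer built only from the standard library; the content layer can read tenant_id and role from the payload without ever seeing PII. Illustration only, production code should use a vetted JWT library:

```python
import base64
import hashlib
import hmac
import json

def b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def sign_jwt(claims: dict, secret: bytes) -> str:
    """Minimal HS256 signer, for illustration only."""
    header = b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = b64url(json.dumps(claims).encode())
    signing_input = f"{header}.{payload}".encode()
    sig = b64url(hmac.new(secret, signing_input, hashlib.sha256).digest())
    return f"{header}.{payload}.{sig}"

token = sign_jwt(
    {"tenant_id": "7c9e6679-7425-40de-944b-e07fc1f90ae7", "role": "editor"},
    b"dev-secret",
)
# The content layer reads tenant and role from the payload; no PII is present
seg = token.split(".")[1]
claims = json.loads(base64.urlsafe_b64decode(seg + "=" * (-len(seg) % 4)))
print(claims["tenant_id"], claims["role"])
```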

Database Router

StudioBrain uses a DatabaseRouter class that routes SQLAlchemy operations to the correct database engine based on the deployment mode.

DeploymentMode Enum

class DeploymentMode(Enum):
    DESKTOP = "desktop"          # SQLite, single user
    SELF_HOSTED = "self_hosted"  # Single PostgreSQL instance
    CLOUD = "cloud"              # Separate Auth + Content databases

Routing Logic

class DatabaseRouter:
    AUTH_MODELS = {
        'tenants', 'users', 'subscriptions', 'api_keys',
        'oauth_tokens', 'audit_logs', 'revoked_tokens',
        'usage_metrics', 'stripe_events'
    }
 
    def get_engine(self, model_class):
        if self.mode == DeploymentMode.DESKTOP:
            return self.default_engine  # SQLite for everything
 
        if model_class.__tablename__ in self.AUTH_MODELS:
            return self.auth_engine     # Auth DB for PII models
        return self.content_engine      # Content DB for everything else

In desktop mode, all models use the same SQLite engine. In cloud mode, auth-related models are routed to the Auth DB and content models to the Content DB.

Schema Parity

The same SQLAlchemy models work across all modes. The differences are minimal:

Feature             Desktop (SQLite)         Self-Hosted (PostgreSQL)  Cloud (Split PostgreSQL)
------------------  -----------------------  ------------------------  ------------------------
All content tables  Yes (tenant_id='local')  Yes                       Yes (Content DB + RLS)
Users table         Yes (single user)        Yes                       Yes (Auth DB)
Tenants table       No                       Yes                       Yes (Auth DB)
Subscriptions       No                       No                        Yes (Auth DB)
Audit logs          No                       Yes                       Yes (Auth DB)
RLS policies        No                       No                        Yes

Database Users

Follow the principle of least privilege. Each service gets a dedicated database user with only the permissions it needs.

DB User              Instance    Permissions               Used By
-------------------  ----------  ------------------------  ---------------------------
studiobrain_auth     Auth DB     Full CRUD on auth schema  App backend (auth layer)
studiobrain_auth_ro  Auth DB     SELECT only               Admin dashboard (future)
studiobrain_app      Content DB  Full CRUD with RLS        App backend (content layer)
studiobrain_ai_ro    Content DB  SELECT only with RLS      AI service (read-only)
studiobrain_sync     Content DB  INSERT/UPDATE with RLS    Sync protocol endpoints

Creating Database Users

-- Auth DB
CREATE USER studiobrain_auth WITH PASSWORD 'secure_password';
GRANT ALL PRIVILEGES ON ALL TABLES IN SCHEMA public TO studiobrain_auth;
-- Cover tables created by future migrations as well
ALTER DEFAULT PRIVILEGES IN SCHEMA public
    GRANT ALL PRIVILEGES ON TABLES TO studiobrain_auth;

-- Content DB (read-write for app)
CREATE USER studiobrain_app WITH PASSWORD 'secure_password';
GRANT ALL PRIVILEGES ON ALL TABLES IN SCHEMA public TO studiobrain_app;

-- Content DB (read-only for AI service)
CREATE USER studiobrain_ai_ro WITH PASSWORD 'secure_password';
GRANT SELECT ON ALL TABLES IN SCHEMA public TO studiobrain_ai_ro;
ALTER DEFAULT PRIVILEGES IN SCHEMA public
    GRANT SELECT ON TABLES TO studiobrain_ai_ro;

Alembic Migrations

StudioBrain uses Alembic for database schema migrations. The migration system is deployment-mode-aware: it applies different changes based on whether the target is SQLite, self-hosted PostgreSQL, or cloud PostgreSQL.

Generating a Migration

# Enter the backend container
docker exec -it studiobrain-backend bash
 
# Generate a new migration
cd /app/backend
alembic revision --autogenerate -m "add_new_field_to_entities"

Reviewing a Migration

Always review the generated migration before applying:

# View the migration file
cat alembic/versions/xxxx_add_new_field_to_entities.py

Check that:

  • New columns have sensible defaults
  • Indexes are added for frequently queried columns
  • Cloud-only changes are wrapped in deployment mode checks
  • The downgrade() function correctly reverses all changes

Applying Migrations

# Apply all pending migrations
alembic upgrade head
 
# Apply to a specific revision
alembic upgrade abc123
 
# Check current state
alembic current

Rolling Back

# Roll back one migration
alembic downgrade -1
 
# Roll back to a specific revision
alembic downgrade abc123

Mode-Aware Migrations

Migrations can conditionally apply changes based on deployment mode:

from alembic import op
import sqlalchemy as sa

from config import get_deployment_mode, DeploymentMode
 
def upgrade():
    mode = get_deployment_mode()
 
    # Applied in ALL modes (including SQLite)
    op.add_column('entities',
        sa.Column('new_field', sa.String(), server_default=''))
 
    # Applied only in cloud mode (PostgreSQL with RLS)
    if mode == DeploymentMode.CLOUD:
        op.execute('ALTER TABLE entities ENABLE ROW LEVEL SECURITY')
        op.execute("""
            CREATE POLICY tenant_isolation ON entities
            USING (tenant_id = current_setting('app.current_tenant', true)::uuid)
        """)

Backup and Restore

Auth Database

Daily full backup:

pg_dump -h auth-host -U studiobrain_auth -d studiobrain_auth \
  -F c -f /backups/auth/auth_$(date +%Y%m%d_%H%M%S).dump

Point-in-time recovery (PITR) setup:

# postgresql.conf on Auth DB host
wal_level = replica
archive_mode = on
archive_command = 'cp %p /backups/auth/wal/%f'

Restore from full backup:

pg_restore -h auth-host -U postgres -d studiobrain_auth \
  --clean --if-exists /backups/auth/auth_20260224.dump

Content Database

Daily full backup:

pg_dump -h content-host -U studiobrain_app -d studiobrain_content \
  -F c -f /backups/content/content_$(date +%Y%m%d_%H%M%S).dump

Per-tenant export (for data portability). pg_dump cannot filter rows, so use psql's \copy with a tenant-scoped query:

psql -h content-host -U studiobrain_app -d studiobrain_content \
  -c "\copy (SELECT * FROM entities WHERE tenant_id = 'tenant-uuid') TO '/backups/tenant_export.csv' CSV HEADER"

Qdrant

# Create a snapshot of one collection
curl -X POST "http://qdrant-host:6333/collections/{collection_name}/snapshots"
 
# List snapshots for a collection
curl "http://qdrant-host:6333/collections/{collection_name}/snapshots"
 
# Restore a collection from a snapshot
curl -X PUT "http://qdrant-host:6333/collections/{collection_name}/snapshots/recover" \
  -H "Content-Type: application/json" \
  -d '{"location": "file:///qdrant/snapshots/{collection_name}/{snapshot_name}"}'

Redis

# Trigger snapshot
redis-cli -h redis-host -a password BGSAVE
 
# Copy RDB file
scp redis-host:/var/lib/redis/dump.rdb /backups/redis/redis_$(date +%Y%m%d).rdb

SQLite (Desktop)

# Online backup (no downtime)
sqlite3 /data/db/city_brains.db ".backup '/backups/city_brains_$(date +%Y%m%d).db'"

Backup Strategy Summary

Service        Frequency                Method                   Retention  RPO      RTO
-------------  -----------------------  -----------------------  ---------  -------  -------
Auth DB        Continuous + daily full  WAL archiving + pg_dump  90 days    0 (WAL)  15 min
Content DB     Hourly WAL + daily full  WAL archiving + pg_dump  30 days    0 (WAL)  15 min
Qdrant         Daily                    Snapshot API             7 days     24 h     30 min
Redis          Hourly                   RDB snapshot             24 hours   1 h      5 min
Content files  Hourly                   ZFS snapshots or rsync   30 days    1 h      Instant

The Auth DB has the most aggressive backup strategy because it is the hardest to rebuild. The Content DB can, in principle, be rebuilt from users' local markdown files via the sync protocol.
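
The retention columns imply a pruning job running alongside each backup cron. A hedged sketch of one (the directory and glob are assumptions; match them to your dump naming):

```python
import time
from pathlib import Path

def prune_backups(directory: str, retention_days: int) -> list[str]:
    """Delete .dump files older than the retention window. Illustrative helper;
    adjust the glob to match your file naming (e.g. auth_YYYYMMDD_HHMMSS.dump)."""
    cutoff = time.time() - retention_days * 86400
    removed = []
    for path in Path(directory).glob("*.dump"):
        if path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(path.name)
    return sorted(removed)

# Example: prune_backups("/backups/auth", retention_days=90)
```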

Connection Strings Reference

Format

Database          Format
----------------  -------------------------------------------------------------
SQLite            sqlite:////absolute/path/to/database.db
PostgreSQL        postgresql://user:password@host:5432/database
PostgreSQL (SSL)  postgresql://user:password@host:5432/database?sslmode=require
Qdrant            http://host:6333
Redis             redis://user:password@host:6379/db_number
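
A small helper can sanity-check which backend a given URL selects before the app boots. This sketch (helper name is illustrative) classifies by URL scheme:

```python
from urllib.parse import urlparse

def backend_for(url: str) -> str:
    """Classify a connection string by its scheme (illustrative helper)."""
    scheme = urlparse(url).scheme
    if scheme == "sqlite":
        return "sqlite"
    if scheme.startswith("postgresql"):
        return "postgresql"
    if scheme == "redis":
        return "redis"
    return scheme or "unknown"

print(backend_for("sqlite:////data/db/city_brains.db"))              # → sqlite
print(backend_for("postgresql://u:p@host:5432/db?sslmode=require"))  # → postgresql
```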

Service-to-Database Mapping

Service          Auth DB     Content DB  Qdrant      Redis
---------------  ----------  ----------  ----------  -----------------
App Backend      Read/Write  Read/Write  Read/Write  Read/Write (DB 0)
AI Service       BLOCKED     Read-Only   Read/Write  Read/Write (DB 1)
Admin Dashboard  Read-Only   --          --          --

Scaling

Year 1 Sizing (2,400 users, ~600 tenants)

Service     Storage  RAM   Notes
----------  -------  ----  ---------------------------------
Auth DB     10 GB    2 GB  ~515 MB data, mostly audit logs
Content DB  50 GB    4 GB  ~1.2 GB data for 240 paid tenants
Qdrant      20 GB    8 GB  ~5.5 GB vectors + indexes
Redis       5 GB     2 GB  ~25 MB active data

Connection Pooling

For deployments beyond 500 concurrent connections, add PgBouncer in front of PostgreSQL:

# pgbouncer.ini
[databases]
studiobrain_content = host=content-host port=5432 dbname=studiobrain_content
 
[pgbouncer]
listen_port = 6432
pool_mode = transaction
max_client_conn = 1000
default_pool_size = 25

Update the connection string to point to PgBouncer:

CONTENT_DATABASE_URL=postgresql://studiobrain_app:password@pgbouncer-host:6432/studiobrain_content

Scaling Path

Scale                 Auth DB                    Content DB                      Qdrant                        Redis
--------------------  -------------------------  ------------------------------  ----------------------------  ----------------------------
Year 1 (2,400 users)  Single instance, 10 GB     Single instance, 50 GB          Single instance, 20 GB        Single instance, 2 GB
Year 2 (10K users)    Add read replica           Add PgBouncer, consider Citus   Add replica for read scaling  Redis Sentinel (HA)
Year 3 (50K+ users)   Dedicated HA cluster       Citus distributed by tenant_id  Qdrant distributed mode       Redis Cluster
Enterprise tenant     Stays shared (small data)  Dedicated instance option       Dedicated Qdrant node         Shared (per-tenant namespace)

Qdrant Sharding

As the vector count grows, Qdrant supports horizontal scaling:

# Create a distributed collection with 3 shards
curl -X PUT "http://qdrant-host:6333/collections/tenant_collection" \
  -H "Content-Type: application/json" \
  -d '{
    "vectors": {"size": 384, "distance": "Cosine"},
    "shard_number": 3,
    "replication_factor": 2
  }'

PostgreSQL Read Replicas

For read-heavy workloads (particularly the AI service reading entity content), add streaming replicas:

# primary postgresql.conf
wal_level = replica
max_wal_senders = 3
 
# replica postgresql.conf (PostgreSQL 12+ no longer uses recovery.conf;
# also create an empty standby.signal file in the replica's data directory)
primary_conninfo = 'host=primary-host port=5432 user=replicator password=password'

Point the AI service’s read-only connection to the replica:

CONTENT_DATABASE_URL=postgresql://studiobrain_ai_ro:password@replica-host:5432/studiobrain_content

Desktop-to-Cloud Migration

Users who start on the desktop (SQLite) tier can migrate to cloud (PostgreSQL) when they upgrade their plan.

Migration Flow

  1. User upgrades to Indie or Team plan via Stripe checkout.
  2. A tenant record is created in the Auth DB.
  3. The desktop app prompts: “Import your local project to the cloud?”
  4. Desktop sends the SQLite database to POST /api/migrate/import.
  5. Backend reads SQLite entities, rewrites tenant_id from 'local' to the new tenant UUID.
  6. Entities are inserted into the Content DB with RLS.
  7. Qdrant embeddings are generated from entity content.
  8. Desktop switches to cloud sync mode.
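
Step 5, rewriting tenant_id, can be sketched as a single UPDATE against the uploaded SQLite file (the helper name and error handling here are illustrative):

```python
import sqlite3
import uuid

def rewrite_tenant(db_path: str, new_tenant_id: str) -> int:
    """Replace the desktop 'local' tenant_id with the new cloud tenant UUID
    before rows are pushed into the Content DB. A sketch only -- the real
    importer also handles assets, relationships, and timeline entries."""
    uuid.UUID(new_tenant_id)  # fail fast on a malformed tenant id
    conn = sqlite3.connect(db_path)
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "UPDATE entities SET tenant_id = ? WHERE tenant_id = 'local'",
            (new_tenant_id,),
        )
    conn.close()
    return cur.rowcount  # number of entities migrated
```

The reverse export described below is the same operation in the other direction, rewriting the cloud tenant UUID back to 'local'.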

Reverse Export

Cloud data can be exported back to SQLite for offline use:

curl -H "Authorization: Bearer $TOKEN" \
  "http://localhost:8201/api/migrate/export" \
  -o studiobrain_export.db

This creates a standalone SQLite database with all tenant entities, with tenant_id rewritten to 'local' for desktop compatibility.