Database Guide
StudioBrain supports two database modes: SQLite for desktop/single-user deployments and PostgreSQL for multi-tenant cloud deployments. This guide covers the full database architecture, security model, migration workflow, and scaling strategy.
Database Modes
SQLite (Desktop / Single-User)
In desktop mode, StudioBrain uses a single SQLite database file. This is the default for self-hosted and open-core installations.
- Zero configuration. No external database server required.
- Single file. The entire database lives at the path specified by `DATABASE_URL` (e.g., `sqlite:////data/db/city_brains.db`).
- Single tenant. All entities use `tenant_id='local'` as a hardcoded value.
- No RLS. SQLite does not support row-level security; tenant isolation is enforced at the application layer.
Connection string format:
```
DATABASE_URL=sqlite:////data/db/city_brains.db
```
PostgreSQL (Cloud / Multi-Tenant)
In cloud mode, StudioBrain uses a five-service database architecture with strict separation between PII and content data.
- Two PostgreSQL instances. Auth (PII/billing) and Content (entities/templates) run on separate servers.
- Row-Level Security (RLS). Content database enforces tenant isolation at the database level.
- Qdrant. Dedicated vector database for RAG embeddings, one collection per tenant.
- Redis. Sessions, rate limits, real-time presence, and PubSub channels.
Connection string format:
```
AUTH_DATABASE_URL=postgresql://studiobrain_auth:password@auth-host:5432/studiobrain_auth
CONTENT_DATABASE_URL=postgresql://studiobrain_app:password@content-host:5432/studiobrain_content
QDRANT_URL=http://qdrant-host:6333
REDIS_URL=redis://user:password@redis-host:6379/0
```
Five-Service Architecture
The cloud deployment uses five dedicated database services:
```
+---------------------------------------------------+
|  ZONE 1: RESTRICTED (PII)                         |
|                                                   |
|  +-----------------------------------------+      |
|  |  Auth DB (PostgreSQL)                   |      |
|  |                                         |      |
|  |  Tables:                                |      |
|  |    tenants        (company info, PII)   |      |
|  |    users          (emails, names, PII)  |      |
|  |    subscriptions  (Stripe, billing)     |      |
|  |    oauth_tokens   (encrypted)           |      |
|  |    api_keys       (hashed)              |      |
|  |    audit_logs     (immutable)           |      |
|  |    usage_metrics  (billing)             |      |
|  |    revoked_tokens                       |      |
|  |    stripe_events                        |      |
|  |                                         |      |
|  |  Access: App backend ONLY               |      |
|  |  Firewall: Block all except app server  |      |
|  +-----------------------------------------+      |
|                                                   |
+---------------------------------------------------+
```
```
+---------------------------------------------------+
|  ZONE 2: STANDARD (Content)                       |
|                                                   |
|  +-----------------------------------------+      |
|  |  Content DB (PostgreSQL + RLS)          |      |
|  |                                         |      |
|  |  Tables:                                |      |
|  |    entities       (all entity types)    |      |
|  |    assets         (metadata, not files) |      |
|  |    templates      (type definitions)    |      |
|  |    plugin_data    (plugin state)        |      |
|  |    relationships  (entity links)        |      |
|  |    timeline_entries                     |      |
|  |    sync_state                           |      |
|  |                                         |      |
|  |  Access: App backend + AI service (RO)  |      |
|  |  All tables: RLS by tenant_id           |      |
|  |  NO PII. tenant_id is a UUID.           |      |
|  +-----------------------------------------+      |
|                                                   |
|  +-------------------+  +--------------------+    |
|  | Qdrant            |  | Redis              |    |
|  |                   |  |                    |    |
|  | Collection/tenant |  | Sessions (no PII)  |    |
|  | Entity embeddings |  | Rate limits        |    |
|  | HNSW indexes      |  | Presence/PubSub    |    |
|  | No PII in payload |  | BrainBits tracking |    |
|  +-------------------+  +--------------------+    |
|                                                   |
+---------------------------------------------------+
```
Auth Database
The Auth DB stores all personally identifiable information and billing data. It is the only database that maps tenant_id UUIDs to real company names, user emails, and billing information.
Key tables:
| Table | Purpose | Row Estimate (Year 1) |
|---|---|---|
| `tenants` | Organizations with billing, plan, storage config | ~600 |
| `users` | User accounts with email, name, auth credentials | ~2,400 |
| `subscriptions` | Stripe subscription tracking | ~600 |
| `oauth_tokens` | Encrypted Google Drive/OAuth tokens | ~1,000 |
| `api_keys` | Hashed API keys for programmatic access | ~200 |
| `audit_logs` | Immutable append-only security log | ~1M/year |
| `usage_metrics` | Monthly billing metrics per tenant | ~7,200 |
| `revoked_tokens` | Invalidated JWTs (for logout) | Variable |
| `stripe_events` | Webhook idempotency tracking | Variable |
Security measures:
- Encrypted at rest (LUKS or ZFS encryption)
- TLS required for all connections (`sslmode=require`)
- `pgaudit` extension enabled for query logging
- Firewall restricts access to the app backend server only
- OAuth tokens and billing addresses are AES-256 encrypted at the application layer before storage
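The application-layer encryption step can be sketched with AES-256-GCM using the `cryptography` package. This is an illustrative sketch, not StudioBrain's actual code: the helper names are invented, and key management (loading the key from a KMS or environment secret) is omitted.

```python
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

def encrypt_field(key: bytes, plaintext: str) -> bytes:
    """Encrypt a sensitive column value (e.g. an OAuth token) before storage."""
    nonce = os.urandom(12)                         # unique nonce per encryption
    ct = AESGCM(key).encrypt(nonce, plaintext.encode(), None)
    return nonce + ct                              # store nonce alongside ciphertext

def decrypt_field(key: bytes, blob: bytes) -> str:
    nonce, ct = blob[:12], blob[12:]
    return AESGCM(key).decrypt(nonce, ct, None).decode()

key = AESGCM.generate_key(bit_length=256)          # in production: from a secret store
blob = encrypt_field(key, "ya29.example-oauth-token")
assert decrypt_field(key, blob) == "ya29.example-oauth-token"
```

GCM also authenticates the ciphertext, so a tampered row fails to decrypt rather than silently returning garbage.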
Content Database
The Content DB stores all entity data, templates, plugin state, and relationships. It contains zero PII. The tenant_id column is a UUID — it never contains names, emails, or company information.
Key tables:
| Table | Purpose | RLS |
|---|---|---|
| `entities` | All entity types in one table (schema-agnostic JSON blob) | Yes |
| `assets` | Asset metadata (filenames, AI descriptions, tags) — not binary files | Yes |
| `templates` | Entity type definitions (parsed from template markdown) | Yes |
| `plugin_data` | Per-plugin key-value state | Yes |
| `relationships` | Entity-to-entity links | Yes |
| `timeline_entries` | Chronological events for entities | Yes |
| `sync_state` | Client sync tracking (last push/pull timestamps) | No |
Row-Level Security:
Every table (except sync_state) has RLS enabled with a tenant isolation policy:
```sql
ALTER TABLE entities ENABLE ROW LEVEL SECURITY;

CREATE POLICY tenant_isolation ON entities
    USING (tenant_id = current_setting('app.current_tenant')::UUID);
```
The backend sets app.current_tenant on each database session from the JWT. Even if application-level filtering fails, the database itself prevents cross-tenant data access.
Qdrant (Vector Store)
Qdrant stores vector embeddings for RAG (Retrieval-Augmented Generation). Each tenant gets its own collection with isolated HNSW indexes.
Sizing (Year 1):
```
Vectors: 240 paid tenants x 500 entities x 10 chunks x 384 dims x 4 bytes
       = ~1.8 GB vectors
HNSW:    ~2x vectors = ~3.6 GB indexes
Total:   ~5.5 GB storage, ~4 GB RAM for hot indexes

Collections: 240 (one per paid tenant)
```
Configuration:
```
QDRANT_URL=http://qdrant-host:6333
QDRANT_API_KEY=your-api-key
```
Redis
Redis handles transient state that benefits from sub-millisecond latency:
| Usage | Data Stored | Persistence |
|---|---|---|
| Sessions | JWT metadata (tenant_id, user_id, role — no PII) | TTL-based eviction |
| Rate limits | Request counts per tenant per endpoint | TTL-based eviction |
| Presence | Online user indicators per tenant | TTL-based eviction |
| BrainBits | Credit balance tracking | Persisted (AOF) |
| PubSub | Real-time event channels per tenant | Transient |
| AI cache | Cached generation results | TTL-based eviction |
Memory usage (Year 1): ~25 MB of active data; the allocated 2 GB of RAM leaves ample headroom.
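The rate-limit entry above, for example, reduces to one small counter per tenant, endpoint, and time window. A fixed-window sketch (the key scheme and window size are illustrative, not StudioBrain's actual convention):

```python
import time

WINDOW_S = 60  # one counter per 60-second window

def rate_limit_key(tenant_id: str, endpoint: str, now=None) -> str:
    """Redis key for the current fixed window: rl:<tenant>:<endpoint>:<window>."""
    now = time.time() if now is None else now
    return f"rl:{tenant_id}:{endpoint}:{int(now // WINDOW_S)}"

# With a live redis-py client r (not shown), the check would be:
#   count = r.incr(key)
#   r.expire(key, WINDOW_S)    # the key evicts itself after the window
#   allowed = count <= limit
print(rate_limit_key("7c9e6679", "/api/entities", now=120.0))
# rl:7c9e6679:/api/entities:2
```

Because every key carries a TTL, Redis reclaims the memory automatically, which is why the table above lists "TTL-based eviction" as the persistence mode.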
Configuration:
```
REDIS_URL=redis://studiobrain_app:password@redis-host:6379/0
```
The AI service uses a separate Redis database number to avoid key collisions:
```
REDIS_URL=redis://studiobrain_ai:password@redis-host:6379/1
```
Security Zones
Zone 1: Restricted (Auth DB)
The Auth DB is in a restricted security zone. Access rules:
- Firewall: Only the app backend server IP can connect on port 5432. All other IPs are blocked.
- AI service is BLOCKED. The AI service cannot reach the Auth DB — enforced by firewall rules.
- Dedicated DB user: `studiobrain_auth` with CRUD permissions. No superuser access.
- All queries logged via the `pgaudit` extension.
- Full disk encryption on the host.
Zone 2: Standard (Content DB, Qdrant, Redis)
Content services are in a standard security zone. Access rules:
- App backend: Full CRUD access with RLS enforcement.
- AI service: Read-only access to the Content DB (`studiobrain_ai_ro` user). Full access to Qdrant and Redis.
- No PII exposure. The content zone never sees user emails, names, or billing information.
JWT Token Design
JWTs bridge the two zones. The token contains only UUIDs and roles — never PII:
```json
{
  "sub": "550e8400-e29b-41d4-a716-446655440000",
  "tenant_id": "7c9e6679-7425-40de-944b-e07fc1f90ae7",
  "role": "editor",
  "plan": "indie",
  "iat": 1771882000,
  "exp": 1771968400
}
```
The content layer never needs to know who you are — only what tenant you belong to and what role you have.
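For illustration, a JWT payload like the one above can be inspected with the standard library alone. This is a sketch only: real services must verify the signature (e.g. with the PyJWT library) before trusting any claim.

```python
import base64
import json

def decode_claims_unverified(token: str) -> dict:
    """Decode a JWT payload WITHOUT verifying the signature.

    Illustration only: production code must verify the signature
    before trusting tenant_id or role.
    """
    payload_b64 = token.split(".")[1]
    payload_b64 += "=" * (-len(payload_b64) % 4)   # restore base64 padding
    return json.loads(base64.urlsafe_b64decode(payload_b64))

# The content layer only ever needs the UUID/role claims:
#   claims = decode_claims_unverified(token)
#   tenant, role = claims["tenant_id"], claims["role"]
```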
Database Router
StudioBrain uses a DatabaseRouter class that routes SQLAlchemy operations to the correct database engine based on the deployment mode.
DeploymentMode Enum
```python
class DeploymentMode(Enum):
    DESKTOP = "desktop"          # SQLite, single user
    SELF_HOSTED = "self_hosted"  # Single PostgreSQL instance
    CLOUD = "cloud"              # Separate Auth + Content databases
```
Routing Logic
```python
class DatabaseRouter:
    AUTH_MODELS = {
        'tenants', 'users', 'subscriptions', 'api_keys',
        'oauth_tokens', 'audit_logs', 'revoked_tokens',
        'usage_metrics', 'stripe_events'
    }

    def get_engine(self, model_class):
        if self.mode == DeploymentMode.DESKTOP:
            return self.default_engine   # SQLite for everything
        if model_class.__tablename__ in self.AUTH_MODELS:
            return self.auth_engine      # Auth DB for PII models
        return self.content_engine       # Content DB for everything else
```
In desktop mode, all models use the same SQLite engine. In cloud mode, auth-related models are routed to the Auth DB and content models to the Content DB.
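The routing rule can be exercised with stand-in engines and model classes. The following is a self-contained restatement for demonstration (the constructor and string "engines" are invented for the example, not the production class):

```python
from enum import Enum

class DeploymentMode(Enum):
    DESKTOP = "desktop"
    CLOUD = "cloud"

class DatabaseRouter:
    AUTH_MODELS = {'tenants', 'users', 'subscriptions', 'api_keys',
                   'oauth_tokens', 'audit_logs', 'revoked_tokens',
                   'usage_metrics', 'stripe_events'}

    def __init__(self, mode, default_engine=None, auth_engine=None, content_engine=None):
        self.mode = mode
        self.default_engine = default_engine
        self.auth_engine = auth_engine
        self.content_engine = content_engine

    def get_engine(self, model_class):
        if self.mode == DeploymentMode.DESKTOP:
            return self.default_engine
        if model_class.__tablename__ in self.AUTH_MODELS:
            return self.auth_engine
        return self.content_engine

# Stand-in model classes carrying only a __tablename__
class User:   __tablename__ = 'users'
class Entity: __tablename__ = 'entities'

router = DatabaseRouter(DeploymentMode.CLOUD, auth_engine="AUTH", content_engine="CONTENT")
assert router.get_engine(User) == "AUTH"       # PII model -> Auth DB
assert router.get_engine(Entity) == "CONTENT"  # content model -> Content DB
```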
Schema Parity
The same SQLAlchemy models work across all modes. The differences are minimal:
| Feature | Desktop (SQLite) | Self-Hosted (PostgreSQL) | Cloud (Split PostgreSQL) |
|---|---|---|---|
| All content tables | Yes (tenant_id='local') | Yes | Yes (Content DB + RLS) |
| Users table | Yes (single user) | Yes | Yes (Auth DB) |
| Tenants table | No | Yes | Yes (Auth DB) |
| Subscriptions | No | No | Yes (Auth DB) |
| Audit logs | No | Yes | Yes (Auth DB) |
| RLS policies | No | No | Yes |
Database Users
Follow the principle of least privilege. Each service gets a dedicated database user with only the permissions it needs.
| DB User | Instance | Permissions | Used By |
|---|---|---|---|
| `studiobrain_auth` | Auth DB | Full CRUD on auth schema | App backend (auth layer) |
| `studiobrain_auth_ro` | Auth DB | SELECT only | Admin dashboard (future) |
| `studiobrain_app` | Content DB | Full CRUD with RLS | App backend (content layer) |
| `studiobrain_ai_ro` | Content DB | SELECT only with RLS | AI service (read-only) |
| `studiobrain_sync` | Content DB | INSERT/UPDATE with RLS | Sync protocol endpoints |
Creating Database Users
```sql
-- Auth DB
CREATE USER studiobrain_auth WITH PASSWORD 'secure_password';
GRANT ALL PRIVILEGES ON ALL TABLES IN SCHEMA public TO studiobrain_auth;

-- Content DB (read-write for app)
CREATE USER studiobrain_app WITH PASSWORD 'secure_password';
GRANT ALL PRIVILEGES ON ALL TABLES IN SCHEMA public TO studiobrain_app;

-- Content DB (read-only for AI service)
CREATE USER studiobrain_ai_ro WITH PASSWORD 'secure_password';
GRANT SELECT ON ALL TABLES IN SCHEMA public TO studiobrain_ai_ro;
```
Alembic Migrations
StudioBrain uses Alembic for database schema migrations. The migration system is deployment-mode-aware: it applies different changes based on whether the target is SQLite, self-hosted PostgreSQL, or cloud PostgreSQL.
Generating a Migration
```bash
# Enter the backend container
docker exec -it studiobrain-backend bash

# Generate a new migration
cd /app/backend
alembic revision --autogenerate -m "add_new_field_to_entities"
```
Reviewing a Migration
Always review the generated migration before applying:
```bash
# View the migration file
cat alembic/versions/xxxx_add_new_field_to_entities.py
```
Check that:
- New columns have sensible defaults
- Indexes are added for frequently queried columns
- Cloud-only changes are wrapped in deployment mode checks
- The `downgrade()` function correctly reverses all changes
Applying Migrations
```bash
# Apply all pending migrations
alembic upgrade head

# Apply to a specific revision
alembic upgrade abc123

# Check current state
alembic current
```
Rolling Back
```bash
# Roll back one migration
alembic downgrade -1

# Roll back to a specific revision
alembic downgrade abc123
```
Mode-Aware Migrations
Migrations can conditionally apply changes based on deployment mode:
```python
import sqlalchemy as sa
from alembic import op
from config import get_deployment_mode, DeploymentMode

def upgrade():
    mode = get_deployment_mode()

    # Applied in ALL modes (including SQLite)
    op.add_column('entities',
        sa.Column('new_field', sa.String(), server_default=''))

    # Applied only in cloud mode (PostgreSQL with RLS)
    if mode == DeploymentMode.CLOUD:
        op.execute('ALTER TABLE entities ENABLE ROW LEVEL SECURITY')
        op.execute("""
            CREATE POLICY tenant_isolation ON entities
            USING (tenant_id = current_setting('app.current_tenant', true)::uuid)
        """)
```
Backup and Restore
Auth Database
Daily full backup:
```bash
pg_dump -h auth-host -U studiobrain_auth -d studiobrain_auth \
  -F c -f /backups/auth/auth_$(date +%Y%m%d_%H%M%S).dump
```
Point-in-time recovery (PITR) setup:
```ini
# postgresql.conf on Auth DB host
wal_level = replica
archive_mode = on
archive_command = 'cp %p /backups/auth/wal/%f'
```
Restore from full backup:
```bash
pg_restore -h auth-host -U postgres -d studiobrain_auth \
  --clean --if-exists /backups/auth/auth_20260224.dump
```
Content Database
Daily full backup:
```bash
pg_dump -h content-host -U studiobrain_app -d studiobrain_content \
  -F c -f /backups/content/content_$(date +%Y%m%d_%H%M%S).dump
```
Per-tenant export (for data portability):
```bash
# pg_dump cannot filter rows, so export one tenant's entities via COPY
psql -h content-host -U studiobrain_app -d studiobrain_content \
  -c "\copy (SELECT * FROM entities WHERE tenant_id = 'tenant-uuid') TO '/backups/tenant_export.csv' CSV HEADER"
```
Qdrant
```bash
# Create a full-storage snapshot
curl -X POST "http://qdrant-host:6333/snapshots"

# List snapshots
curl "http://qdrant-host:6333/snapshots"

# Restore a single collection from a snapshot (full-storage snapshots
# are restored offline at startup, not over HTTP)
curl -X PUT "http://qdrant-host:6333/collections/{collection_name}/snapshots/recover" \
  -H "Content-Type: application/json" \
  -d '{"location": "file:///qdrant/snapshots/{snapshot_name}"}'
```
Redis
```bash
# Trigger snapshot
redis-cli -h redis-host -a password BGSAVE

# Copy RDB file
scp redis-host:/var/lib/redis/dump.rdb /backups/redis/redis_$(date +%Y%m%d).rdb
```
SQLite (Desktop)
```bash
# Online backup (no downtime)
sqlite3 /data/db/city_brains.db ".backup '/backups/city_brains_$(date +%Y%m%d).db'"
```
Backup Strategy Summary
| Service | Frequency | Method | Retention | RPO | RTO |
|---|---|---|---|---|---|
| Auth DB | Continuous + daily full | WAL archiving + pg_dump | 90 days | 0 (WAL) | 15 min |
| Content DB | Hourly WAL + daily full | WAL archiving + pg_dump | 30 days | 0 (WAL) | 15 min |
| Qdrant | Daily | Snapshot API | 7 days | 24h | 30 min |
| Redis | Hourly | RDB snapshot | 24 hours | 1h | 5 min |
| Content files | Hourly | ZFS snapshots or rsync | 30 days | 1h | Instant |
The Auth DB has the most aggressive backup strategy because it is the hardest to rebuild. The Content DB can, in principle, be rebuilt from users' local markdown files via the sync protocol.
Connection Strings Reference
Format
| Database | Format |
|---|---|
| SQLite | sqlite:////absolute/path/to/database.db |
| PostgreSQL | postgresql://user:password@host:5432/database |
| PostgreSQL (SSL) | postgresql://user:password@host:5432/database?sslmode=require |
| Qdrant | http://host:6333 |
| Redis | redis://user:password@host:6379/db_number |
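All of these URL formats can be parsed with the Python standard library, which is handy for config validation at startup. The function name is illustrative, and the password is deliberately left out of the result so it never reaches logs:

```python
from urllib.parse import urlsplit

def describe_dsn(dsn: str) -> dict:
    """Break a connection string into its parts (password omitted)."""
    u = urlsplit(dsn)
    return {
        "scheme": u.scheme,
        "host": u.hostname,
        "port": u.port,
        "database": u.path.lstrip("/"),
    }

print(describe_dsn("postgresql://studiobrain_app:pw@content-host:5432/studiobrain_content"))
# {'scheme': 'postgresql', 'host': 'content-host', 'port': 5432, 'database': 'studiobrain_content'}
```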
Service-to-Database Mapping
| Service | Auth DB | Content DB | Qdrant | Redis |
|---|---|---|---|---|
| App Backend | Read/Write | Read/Write | Read/Write | Read/Write (DB 0) |
| AI Service | BLOCKED | Read-Only | Read/Write | Read/Write (DB 1) |
| Admin Dashboard | Read-Only | --- | --- | --- |
Scaling
Year 1 Sizing (2,400 users, ~600 tenants)
| Service | Storage | RAM | Notes |
|---|---|---|---|
| Auth DB | 10 GB | 2 GB | ~515 MB data, mostly audit logs |
| Content DB | 50 GB | 4 GB | ~1.2 GB data for 240 paid tenants |
| Qdrant | 20 GB | 8 GB | ~5.5 GB vectors + indexes |
| Redis | 5 GB | 2 GB | ~25 MB active data |
Connection Pooling
For deployments beyond 500 concurrent connections, add PgBouncer in front of PostgreSQL:
```ini
# pgbouncer.ini
[databases]
studiobrain_content = host=content-host port=5432 dbname=studiobrain_content

[pgbouncer]
listen_port = 6432
pool_mode = transaction
max_client_conn = 1000
default_pool_size = 25
```
Update the connection string to point to PgBouncer:
```
CONTENT_DATABASE_URL=postgresql://studiobrain_app:password@pgbouncer-host:6432/studiobrain_content
```
Scaling Path
| Scale | Auth DB | Content DB | Qdrant | Redis |
|---|---|---|---|---|
| Year 1 (2,400 users) | Single instance, 10 GB | Single instance, 50 GB | Single instance, 20 GB | Single instance, 2 GB |
| Year 2 (10K users) | Add read replica | Add PgBouncer, consider Citus | Add replica for read scaling | Redis Sentinel (HA) |
| Year 3 (50K+ users) | Dedicated HA cluster | Citus distributed by tenant_id | Qdrant distributed mode | Redis Cluster |
| Enterprise tenant | Stays shared (small data) | Dedicated instance option | Dedicated Qdrant node | Shared (per-tenant namespace) |
Qdrant Sharding
As the vector count grows, Qdrant supports horizontal scaling:
```bash
# Create a distributed collection with 3 shards
curl -X PUT "http://qdrant-host:6333/collections/tenant_collection" \
  -H "Content-Type: application/json" \
  -d '{
        "vectors": {"size": 384, "distance": "Cosine"},
        "shard_number": 3,
        "replication_factor": 2
      }'
```
PostgreSQL Read Replicas
For read-heavy workloads (particularly the AI service reading entity content), add streaming replicas:
```ini
# primary postgresql.conf
wal_level = replica
max_wal_senders = 3

# replica postgresql.conf (PostgreSQL 12+; create an empty
# standby.signal file instead of the old recovery.conf)
primary_conninfo = 'host=primary-host port=5432 user=replicator password=password'
```
Point the AI service's read-only connection to the replica:
```
CONTENT_DATABASE_URL=postgresql://studiobrain_ai_ro:password@replica-host:5432/studiobrain_content
```
Desktop-to-Cloud Migration
Users who start on the desktop (SQLite) tier can migrate to cloud (PostgreSQL) when they upgrade their plan.
Migration Flow
1. User upgrades to Indie or Team plan via Stripe checkout.
2. A tenant record is created in the Auth DB.
3. The desktop app prompts: “Import your local project to the cloud?”
4. Desktop sends the SQLite database to `POST /api/migrate/import`.
5. Backend reads SQLite entities and rewrites `tenant_id` from `'local'` to the new tenant UUID.
6. Entities are inserted into the Content DB with RLS.
7. Qdrant embeddings are generated from entity content.
8. Desktop switches to cloud sync mode.
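The tenant_id rewrite in the flow above is the core of the import. A minimal sketch of that step (the helper name is hypothetical; in practice this runs over rows read from the uploaded SQLite file):

```python
LOCAL_TENANT_ID = "local"

def rewrite_tenant(rows, new_tenant_id):
    """Rewrite tenant_id from the desktop sentinel to the tenant's UUID."""
    migrated = []
    for row in rows:
        r = dict(row)                        # copy; leave the source rows intact
        if r.get("tenant_id") == LOCAL_TENANT_ID:
            r["tenant_id"] = new_tenant_id
        migrated.append(r)
    return migrated

rows = [{"id": 1, "tenant_id": "local", "name": "Ayla"}]
print(rewrite_tenant(rows, "7c9e6679-7425-40de-944b-e07fc1f90ae7"))
# [{'id': 1, 'tenant_id': '7c9e6679-7425-40de-944b-e07fc1f90ae7', 'name': 'Ayla'}]
```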
Reverse Export
Cloud data can be exported back to SQLite for offline use:
```bash
curl -H "Authorization: Bearer $TOKEN" \
  "http://localhost:8201/api/migrate/export" \
  -o studiobrain_export.db
```
This creates a standalone SQLite database with all tenant entities, with tenant_id rewritten to 'local' for desktop compatibility.