Export System Architecture

This document provides a comprehensive overview of the StudioBrain Export System architecture, implementation details, and guidelines for extending the system.

Table of Contents

  1. Architecture Overview
  2. Service Architecture
  3. Template Engine
  4. PDF Generation Pipeline
  5. Async Task Processing
  6. File System Organization
  7. Adding Custom Export Templates
  8. Extending Export Formats
  9. Testing Custom Formats
  10. Error Handling
  11. Performance Optimization
  12. Security Considerations

Architecture Overview

The Export System follows a layered architecture pattern:

┌─────────────────────────────────────────────────────────────┐
│                     API Layer (FastAPI)                     │
│                   /backend/routes/export_routes.py          │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────────┐  │
│  │ /api/export  │  │ /api/export  │  │ /api/export/{id} │  │
│  │   /layouts   │  │    /html     │  │    /pdf, /docx   │  │
│  └──────────────┘  └──────────────┘  └──────────────────┘  │
└───────────────────────────┬─────────────────────────────────┘

┌───────────────────────────▼─────────────────────────────────┐
│                  Service Layer (Python)                     │
│              /backend/services/export_service.py            │
│  ┌──────────────────────────────────────────────────────┐  │
│  │              DocumentExporter Class                  │  │
│  │  - export_html()                                     │  │
│  │  - export_pdf()                                      │  │
│  │  - export_docx()                                     │  │
│  │  - _render_entity_data()                             │  │
│  │  - _embed_images()                                   │  │
│  └──────────────────────────────────────────────────────┘  │
└───────────────────────────┬─────────────────────────────────┘

┌───────────────────────────▼─────────────────────────────────┐
│                Template Layer (Jinja2)                      │
│         /backend/services/export_layouts.py                 │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────────┐  │
│  │  GDD Layout  │  │  Script      │  │   Style Guide    │  │
│  │   Template   │  │   Format     │  │    Layout        │  │
│  └──────────────┘  └──────────────┘  └──────────────────┘  │
└─────────────────────────────────────────────────────────────┘

Components

1. API Layer (export_routes.py)

Purpose: RESTful HTTP endpoints for export operations

Responsibilities:

  • Validate request parameters
  • Authenticate users
  • Fetch entity data from database
  • Call service layer for export processing
  • Return formatted responses

Key Functions:

@router.post("/{entity_type}/{entity_id}/html")
async def export_html(...)
 
@router.post("/{entity_type}/{entity_id}/pdf")
async def export_pdf(...)
 
@router.post("/{entity_type}/{entity_id}/docx")
async def export_docx(...)

2. Service Layer (export_service.py)

Purpose: Core export logic and document generation

Main Class: DocumentExporter

Responsibilities:

  • Entity data rendering
  • Template processing
  • Image embedding
  • Format-specific generation

Key Methods:

def export_html(entity, template, layout_name, theme, embed_images)
def export_pdf(entity, template, layout_name, theme, embed_images)
def export_docx(entity, template, layout_name, theme, embed_images)

3. Template Layer (export_layouts.py)

Purpose: Define layout templates and styles

Data Structures:

  • LIST_OF_LAYOUTS - Available layouts
  • LAYOUT_TEMPLATES - Template definitions

Key Functions:

def get_layout_template(layout_name)
def get_layout_css(layout_name, theme)
def list_layouts()

Service Architecture

DocumentExporter Class

The DocumentExporter class is the heart of the export system.

Initialization

from pathlib import Path

from services.export_service import DocumentExporter
 
exporter = DocumentExporter(
    templates_path=Path("/data/content/_Templates/Export")
)

Entity Data Rendering

The _render_entity_data() method transforms entity data into template-ready format:

def _render_entity_data(self, entity, template) -> Dict[str, Any]:
    """
    Convert entity to template data structure.
    
    Returns:
    {
        'entity_type': 'character',
        'entity_id': 'uuid',
        'name': 'Character Name',
        'status': 'active',
        'created_date': '2026-03-09',
        'last_updated': '2026-03-09',
        'fields': {...},
        'tags': [...],
        'assets': [...],
        'primary_asset': {...},
        'template': {...},
        'frontmatter': {...}
    }
    """

Entity-Specific Fields

The _add_entity_specific_fields() method adds type-specific data:

Character:

{
    'full_name': fields.get('full_name', ''),
    'age': fields.get('age', ''),
    'occupation': fields.get('occupation', ''),
    'affiliation': fields.get('affiliation', ''),
}

Location:

{
    'location_type': fields.get('location_type', ''),
    'district': fields.get('district', ''),
    'description': fields.get('description', ''),
}

Brand:

{
    'brand_name': fields.get('brand_name', ''),
    'industry': fields.get('industry', ''),
    'founded': fields.get('founded', ''),
}
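
One way the per-type mappings above could be wired together is a lookup table keyed by entity type. This is a minimal sketch, not the actual _add_entity_specific_fields() implementation; the ENTITY_FIELD_KEYS table and the standalone function name are assumptions made for illustration.

```python
from typing import Any, Dict, Tuple

# Hypothetical lookup table mirroring the per-type field lists above.
ENTITY_FIELD_KEYS: Dict[str, Tuple[str, ...]] = {
    "character": ("full_name", "age", "occupation", "affiliation"),
    "location": ("location_type", "district", "description"),
    "brand": ("brand_name", "industry", "founded"),
}


def entity_specific_fields(entity_type: str, fields: Dict[str, Any]) -> Dict[str, Any]:
    """Return the type-specific keys, defaulting missing values to ''."""
    keys = ENTITY_FIELD_KEYS.get(entity_type, ())
    return {key: fields.get(key, "") for key in keys}
```

With this shape, an unknown entity type yields an empty dict, so supporting a new type only requires a new table entry.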

Image Embedding

The _embed_images() method converts image references to base64 data URIs:

def _embed_images(self, html_content: str, entity: Any) -> str:
    """
    Convert image references to base64 data URIs.
    
    Args:
        html_content: HTML with image src references
        entity: Entity with assets
    
    Returns:
        HTML with embedded base64 images
    
    Supported formats:
    - PNG (image/png)
    - JPEG (image/jpeg)
    - GIF (image/gif)
    - WebP (image/webp)
    - SVG (image/svg+xml)
    """

Process:

  1. Iterate through entity assets
  2. Read image file from disk
  3. Encode as base64
  4. Determine MIME type from file extension
  5. Replace src="path/to/image.png" with src="data:image/png;base64,..."
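
The steps above can be sketched with the standard library alone. This is an illustrative sketch rather than the actual _embed_images() code; the helper names to_data_uri and embed_image are assumptions, and mimetypes may not recognize every extension (for example .webp) on older Python versions.

```python
import base64
import mimetypes
from pathlib import Path


def to_data_uri(image_path: Path) -> str:
    """Read an image file and return it as a base64 data URI."""
    mime_type, _ = mimetypes.guess_type(image_path.name)
    if mime_type is None:
        mime_type = "application/octet-stream"  # unknown extension
    encoded = base64.b64encode(image_path.read_bytes()).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def embed_image(html_content: str, src: str, image_path: Path) -> str:
    """Replace one src="..." reference with the embedded data URI."""
    return html_content.replace(f'src="{src}"', f'src="{to_data_uri(image_path)}"')
```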

Template Engine

Simple Template Rendering

The export system uses a custom template engine (not full Jinja2) that supports:

Variable Substitution

# Replace {{variable}} with value
html_content = html_template.replace('{{name}}', render_data['name'])

Supported patterns:

  • {{variable}} - Simple variable
  • {{fields.field_name}} - Nested field access
  • {{entity_type|title}} - Filter support
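
The three patterns above could be handled by a single regular-expression pass. The sketch below is one possible approach, not the engine's actual code; FILTERS, PLACEHOLDER, resolve, and render are hypothetical names, and only the title filter is modeled.

```python
import re
from typing import Any, Callable, Dict

# Hypothetical filter table; only `title` appears in the patterns above.
FILTERS: Dict[str, Callable[[str], str]] = {"title": str.title}

# Matches {{path}} and {{path|filter}}, where path may contain dots.
PLACEHOLDER = re.compile(r"\{\{\s*([\w.]+)\s*(?:\|\s*(\w+)\s*)?\}\}")


def resolve(path: str, data: Dict[str, Any]) -> Any:
    """Follow a dotted path such as 'fields.age' through nested dicts."""
    value: Any = data
    for part in path.split("."):
        value = value.get(part, "") if isinstance(value, dict) else ""
    return value


def render(template: str, data: Dict[str, Any]) -> str:
    """Substitute every {{path}} or {{path|filter}} placeholder."""
    def substitute(match: re.Match) -> str:
        text = str(resolve(match.group(1), data))
        filter_name = match.group(2)
        if filter_name in FILTERS:
            text = FILTERS[filter_name](text)
        return text

    return PLACEHOLDER.sub(substitute, template)
```

Missing keys resolve to an empty string here; a stricter variant could raise instead so that typos in templates surface early.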

Loop Support

# Handle lists (e.g., tags)
value = render_data['tags']
list_html = ''.join(f'<span class="tag">{item}</span>' for item in value)
html_content = html_content.replace('{{tags}}', list_html)

Template Structure

A complete template consists of:

HTML Template (.html.jinja)

<div class="document">
    <!-- Cover Page -->
    <div class="cover-page">
        <h1 class="document-title">{{name}}</h1>
        <h2 class="document-subtitle">{{entity_type|title}} Document</h2>
        <p class="status">Status: {{status|title}}</p>
    </div>

    <!-- Content Sections -->
    <div class="content">
        <section id="overview">
            <h1>Overview</h1>
            <p>{{fields.description}}</p>
        </section>
        
        <section id="details">
            <h1>Details</h1>
            <table class="details-table">
                {% for key, value in fields.items() %}
                <tr>
                    <th>{{key|replace('_', ' ')|title}}</th>
                    <td>{{value}}</td>
                </tr>
                {% endfor %}
            </table>
        </section>
    </div>
</div>

CSS Template (.css)

:root {
    --primary-color: #2c3e50;
    --secondary-color: #3498db;
    --font-family: 'Segoe UI', sans-serif;
}
 
.document {
    font-family: var(--font-family);
    color: var(--text-color, #333333);  /* fallback if the theme does not define --text-color */
    max-width: 210mm;
    padding: 20mm;
}
 
.cover-page {
    page-break-after: always;
    text-align: center;
}

Template Loading

Templates are loaded at startup:

def _load_layout_templates(self):
    """Load all layout templates from files."""
    for layout_name in LIST_OF_LAYOUTS:
        html_file = self.templates_path / f"{layout_name}.html.jinja"
        css_file = self.templates_path / f"{layout_name}.css"
        
        layout_data = {}
        
        if html_file.exists():
            with open(html_file, 'r', encoding='utf-8') as f:
                layout_data['html'] = f.read()
        
        if css_file.exists():
            with open(css_file, 'r', encoding='utf-8') as f:
                layout_data['css'] = f.read()
        
        self._layout_cache[layout_name] = layout_data

PDF Generation Pipeline

The PDF export pipeline uses Playwright headless browser for accurate rendering.

Pipeline Steps

Step 1: HTML Generation

html_content = self.export_html(
    entity=entity,
    template=template,
    layout_name='gdd_standard',
    theme='default',
    embed_images=True
)

Step 2: Browser Setup

from playwright.sync_api import sync_playwright
 
with sync_playwright() as p:
    browser = p.chromium.launch(headless=True)
    page = browser.new_page()

Step 3: Content Rendering

page.set_content(html_content, wait_until='networkidle')

Options:

  • wait_until='networkidle' - Wait for network to be idle
  • wait_until='domcontentloaded' - Wait for DOM content loaded
  • wait_until='load' - Wait for complete page load

Step 4: PDF Generation

pdf_bytes = page.pdf(
    format='A4',
    print_background=True,
    margin={
        'top': '20mm',
        'right': '20mm',
        'bottom': '20mm',
        'left': '20mm',
    }
)

CSS @media print rules are applied:

@media print {
    .document {
        padding: 0;
        max-width: none;
    }
    
    .cover-page,
    .toc-page {
        page-break-after: always;
    }
    
    .section {
        page-break-inside: avoid;
    }
}
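
Steps 2 to 4 can be assembled into a single function. This is a minimal sketch that assumes the HTML from Step 1 is passed in as a string; the function name html_to_pdf is hypothetical, and the try/finally cleanup is an addition not shown in the fragments above.

```python
def html_to_pdf(html_content: str) -> bytes:
    """Render an HTML string to A4 PDF bytes with headless Chromium."""
    # Imported lazily so the module still loads when Playwright is missing.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        try:
            page = browser.new_page()
            page.set_content(html_content, wait_until="networkidle")
            return page.pdf(
                format="A4",
                print_background=True,
                margin={"top": "20mm", "right": "20mm",
                        "bottom": "20mm", "left": "20mm"},
            )
        finally:
            browser.close()  # release the browser even if rendering fails
```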

Requirements

# Install Python package (usually in requirements.txt)
pip install playwright
 
# Install Playwright browser binaries
playwright install chromium

Async Task Processing

Async task processing for large exports and batch operations is a planned enhancement and is not yet implemented. The structure below describes the intended task format.

Task Structure

{
    "task_id": "uuid",
    "entity_type": "character",
    "entity_id": "uuid",
    "format": "pdf",
    "status": "processing",  # pending, processing, completed, failed
    "progress": 0.5,  # 0.0 to 1.0
    "created_at": "2026-03-09T00:00:00Z",
    "started_at": "2026-03-09T00:00:01Z",
    "completed_at": "2026-03-09T00:00:10Z",
    "result": {
        "download_url": "/api/export/download/task-uuid",
        "file_size": 1048576,
        "filename": "character.pdf"
    }
}
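
Since this feature is still planned, the following is only a sketch of how the status field could be modeled so that invalid transitions are rejected. The names TaskStatus, ALLOWED_TRANSITIONS, and ExportTask are assumptions, and the lifecycle is inferred from the four status values listed in the structure above.

```python
from dataclasses import dataclass
from enum import Enum


class TaskStatus(str, Enum):
    PENDING = "pending"
    PROCESSING = "processing"
    COMPLETED = "completed"
    FAILED = "failed"


# Assumed lifecycle: pending -> processing -> completed | failed.
ALLOWED_TRANSITIONS = {
    TaskStatus.PENDING: {TaskStatus.PROCESSING},
    TaskStatus.PROCESSING: {TaskStatus.COMPLETED, TaskStatus.FAILED},
    TaskStatus.COMPLETED: set(),
    TaskStatus.FAILED: set(),
}


@dataclass
class ExportTask:
    task_id: str
    status: TaskStatus = TaskStatus.PENDING
    progress: float = 0.0  # 0.0 to 1.0

    def advance(self, new_status: TaskStatus) -> None:
        """Move to new_status, rejecting transitions outside the lifecycle."""
        if new_status not in ALLOWED_TRANSITIONS[self.status]:
            raise ValueError(
                f"cannot go from {self.status.value} to {new_status.value}"
            )
        self.status = new_status
        if new_status is TaskStatus.COMPLETED:
            self.progress = 1.0
```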

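Task Workflow

A task starts as pending, moves to processing while the export runs, and ends as either completed or failed (the status values listed under Task Structure).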


File System Organization

Directory Structure

/data/content/
├── _Templates/
│   └── Export/
│       ├── gdd_standard.html.jinja
│       ├── gdd_standard.css
│       ├── script_format.html.jinja
│       ├── script_format.css
│       ├── style_guide.html.jinja
│       └── style_guide.css
│
├── characters/
│   └── uuid-1/
│       └── markdown.md
│
└── assets/
    └── images/
        └── character-portrait.png

Template Files

Each layout requires two files:

  1. .html.jinja - HTML template structure
  2. .css - CSS styling (themes in :root variables)

Asset Storage

Entity assets are stored in:

/data/content/assets/{asset_type}/{entity_id}/

Or globally:

/data/content/assets/{file_path}

Content Base Path

The CONTENT_BASE_PATH environment variable defines the root:

# From shared.py
CONTENT_BASE_PATH = Path(os.environ.get('CONTENT_BASE_PATH', '/data/content'))
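
When asset paths come from entity data, it can help to confirm that the resolved file stays inside the assets directory. The sketch below shows one way to do that; resolve_asset_path is a hypothetical helper, not a function from shared.py.

```python
import os
from pathlib import Path

CONTENT_BASE_PATH = Path(os.environ.get("CONTENT_BASE_PATH", "/data/content"))


def resolve_asset_path(file_path: str, base: Path = CONTENT_BASE_PATH) -> Path:
    """Map a relative asset path to an absolute path under <base>/assets."""
    assets_root = (base / "assets").resolve()
    candidate = (assets_root / file_path).resolve()
    # Reject paths such as '../../etc/passwd' that escape the assets root.
    if candidate != assets_root and assets_root not in candidate.parents:
        raise ValueError(f"asset path escapes the assets directory: {file_path}")
    return candidate
```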

Adding Custom Export Templates

Step 1: Create Layout Files

Create two files in _Templates/Export/:

my_layout.html.jinja:

<div class="document">
    <div class="cover">
        <h1>{{name}}</h1>
        <p>{{entity_type|title}} Document</p>
    </div>
    
    <div class="content">
        {{fields.description}}
    </div>
</div>

my_layout.css:

:root {
    --primary-color: #4a90d9;
    --font-family: 'Arial', sans-serif;
}
 
.document {
    font-family: var(--font-family);
    padding: 20mm;
}
 
.cover {
    text-align: center;
    margin-bottom: 30mm;
}
 
.cover h1 {
    font-size: 36pt;
    color: var(--primary-color);
}

Step 2: Register the Layout

Add to LIST_OF_LAYOUTS in export_layouts.py:

LIST_OF_LAYOUTS = [
    'gdd_standard',
    'script_format',
    'style_guide',
    'my_layout',  # New layout
]

Step 3: Update Template Registry

Add to LAYOUT_TEMPLATES dictionary:

LAYOUT_TEMPLATES = {
    'gdd_standard': {...},
    'script_format': {...},
    'style_guide': {...},
    'my_layout': {
        'html': MY_LAYOUT_HTML,  # Load from file or inline
        'css': {
            'default': MY_LAYOUT_CSS,
        },
        'description': 'My custom layout',
        'features': ['cover', 'content'],
    },
}
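
Because a layout is registered by name but loaded from files, a missing file is easy to overlook. A small check such as the following could be run in a test or at startup; find_missing_layout_files is a hypothetical helper, and it assumes the two-file naming convention described under Template Files.

```python
from pathlib import Path
from typing import Dict, List


def find_missing_layout_files(
    layout_names: List[str], templates_path: Path
) -> Dict[str, List[str]]:
    """Return {layout_name: [missing file names]} for incomplete layouts."""
    missing: Dict[str, List[str]] = {}
    for name in layout_names:
        expected = [f"{name}.html.jinja", f"{name}.css"]
        absent = [f for f in expected if not (templates_path / f).is_file()]
        if absent:
            missing[name] = absent
    return missing
```

An empty result means every registered layout has both its HTML and CSS file on disk.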

Step 4: Restart Service

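Restart the service so the new layout is picked up. Layout templates are read from disk at startup and cached in memory (see Template Loading), so the new layout becomes available after the restart.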


Extending Export Formats

Adding a New Format (e.g., Markdown)

1. Create Export Method

Add to DocumentExporter class:

def export_markdown(
    self,
    entity: Any,
    template: Dict[str, Any],
    include_assets: bool = True,
) -> bytes:
    """
    Export entity to Markdown format.
    """
    # Get template
    md_content = template.get('markdown_body', '')
    
    # Add entity metadata
    metadata = f"""---
title: {entity.name}
entity_type: {entity.entity_type}
status: {entity.status}
created: {entity.created_date.isoformat() if entity.created_date else 'N/A'}
last_updated: {entity.last_updated.isoformat() if entity.last_updated else 'N/A'}
---
 
"""
    
    # Add fields section
    fields_md = "\n## Fields\n\n"
    for key, value in (entity.fields or {}).items():
        fields_md += f"### {key.replace('_', ' ').title()}\n\n"
        fields_md += f"{value}\n\n"
    
    # Add assets section
    if include_assets and entity.assets:
        assets_md = "\n## Assets\n\n"
        for asset in entity.assets:
            assets_md += f"- **{asset.get('file_name', 'Asset')}**: {asset.get('description', '')}\n"
            if asset.get('file_path'):
                assets_md += f"  - Path: `{asset['file_path']}`\n"
        assets_md += "\n"
    else:
        assets_md = ""
    
    return (metadata + md_content + fields_md + assets_md).encode('utf-8')

2. Add API Endpoint

Add to export_routes.py:

@router.post("/{entity_type}/{entity_id}/markdown")
async def export_markdown(
    entity_type: str,
    entity_id: str,
    include_assets: bool = Query(default=True),
    current_user: User = Depends(get_current_user),
    db: Session = Depends(get_db),
):
    entity = _get_entity_or_404(entity_type, entity_id, db)
    template = _get_template_or_default(entity_type, db)
    
    exporter = get_document_exporter()
    md_bytes = exporter.export_markdown(entity, template, include_assets)
    
    return Response(
        content=md_bytes,
        media_type="text/markdown",
        headers={
            "Content-Disposition": f'attachment; filename="{entity.name}_{entity_type}.md"',
        },
    )
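
The endpoint above places entity.name directly into the Content-Disposition header, so names containing quotes, slashes, or non-ASCII characters may produce a malformed header. One possible mitigation is to normalize the name first; safe_filename below is a hypothetical helper, not part of export_routes.py.

```python
import re


def safe_filename(name: str, entity_type: str, extension: str = "md") -> str:
    """Build an ASCII-only download filename from an entity name."""
    # Collapse every run of characters outside [A-Za-z0-9.-] into one underscore.
    stem = re.sub(r"[^A-Za-z0-9.-]+", "_", f"{name}_{entity_type}").strip("._")
    return f"{stem or 'export'}.{extension}"
```

The header would then be built from safe_filename(entity.name, entity_type) instead of the raw name.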

Testing Custom Formats

Unit Testing

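# tests/test_export_service.py
from datetime import datetime
from unittest.mock import Mock

from services.export_service import DocumentExporter
 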
def test_export_markdown():
    exporter = DocumentExporter()
    
    # Create mock entity
    mock_entity = Mock()
    mock_entity.entity_type = 'character'
    mock_entity.entity_id = 'test-uuid'
    mock_entity.name = 'Test Character'
    mock_entity.status = 'active'
    mock_entity.created_date = datetime(2026, 3, 9)
    mock_entity.last_updated = datetime(2026, 3, 9)
    mock_entity.fields = {'description': 'Test description', 'age': 25}
    mock_entity.assets = []
    
    mock_template = {
        'name': 'Character',
        'frontmatter': {},
        'markdown_body': '# Character\n\nTest content',
    }
    
    result = exporter.export_markdown(mock_entity, mock_template)
    
    assert b'Character' in result
    assert b'25' in result
    assert b'Test description' in result

Integration Testing

# tests/test_export_routes.py
 
def test_export_markdown_endpoint(client, authenticated_user):
    response = client.post(
        '/api/export/character/test-uuid/markdown',
        headers={'Authorization': f'Bearer {authenticated_user.token}'}
    )
    
    assert response.status_code == 200
    assert response.headers['content-type'] == 'text/markdown; charset=utf-8'
    assert b'Test Character' in response.content

Manual Testing

# Test Markdown export
curl -X POST \
  -H "Authorization: Bearer YOUR_TOKEN" \
  "http://localhost:8201/api/export/character/uuid/markdown" \
  -o character.md
 
# View result
cat character.md

Error Handling

Common Errors

Entity Not Found

# In _get_entity_or_404()
if not entity:
    raise HTTPException(
        status_code=404,
        detail=f"Entity '{entity_type}/{entity_id}' not found"
    )

Template Not Found

# In _render_html_template()
if not layout:
    logger.warning("Layout template '%s' not found, using default", layout_name)
    layout_name = 'gdd_standard'
    layout = get_layout_template(layout_name)

Missing Dependencies

# In export_pdf()
try:
    from playwright.sync_api import sync_playwright
except ImportError as exc:
    raise RuntimeError(
        "Playwright is required for PDF export. Install with: "
        "pip install playwright && playwright install chromium"
    ) from exc

Logging

All export operations are logged:

logger.info("Export started: %s/%s to %s", entity_type, entity_id, format)
logger.error("Export failed: %s", str(e))

Performance Optimization

Caching

Template data is cached in memory:

self._layout_cache: Dict[str, Dict[str, str]] = {}

Async Processing

For batch exports, consider:

import asyncio

# Use asyncio for concurrent exports.
# export_single must be a coroutine function (async def) for gather() to work.
async def export_multiple(entities):
    tasks = [export_single(entity) for entity in entities]
    results = await asyncio.gather(*tasks)
    return results
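
The DocumentExporter methods shown earlier are synchronous, so calling them directly inside a coroutine would block the event loop. A possible variant runs each blocking export in a worker thread and caps concurrency. This is a sketch under assumptions: export_many and max_concurrent are hypothetical names, asyncio.to_thread requires Python 3.9 or later, and each export call is assumed to create its own browser instance.

```python
import asyncio
from typing import Callable, List, Sequence


async def export_many(
    export_one: Callable[[object], bytes],
    entities: Sequence[object],
    max_concurrent: int = 4,
) -> List[bytes]:
    """Run a blocking export function for each entity in worker threads."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def run(entity: object) -> bytes:
        async with semaphore:  # cap the number of exports running at once
            return await asyncio.to_thread(export_one, entity)

    # gather() returns results in the same order as the input entities.
    return await asyncio.gather(*(run(entity) for entity in entities))
```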

Image Optimization

For large images:

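# Compress before embedding
from io import BytesIO
from PIL import Image
 
def compress_image(image_path, max_size=(800, 800)):
    with Image.open(image_path) as img:
        fmt = img.format or 'PNG'  # keep the source format
        img.thumbnail(max_size)    # in-place resize, preserves aspect ratio
        buffer = BytesIO()
        img.save(buffer, format=fmt)
    return buffer.getvalue()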

Security Considerations

Input Validation

All entity types are validated:

ALLOWED_ENTITY_TYPES = ['character', 'location', 'item', 'brand', ...]
 
if entity_type not in ALLOWED_ENTITY_TYPES:
    raise HTTPException(status_code=400, detail="Invalid entity type")

Template Injection Prevention

The simple template engine doesn’t support arbitrary Python code:

# Safe: Only variable substitution
html_content = html_content.replace('{{name}}', entity.name)
 
# Not supported (no eval, no exec)
# html_content = eval(html_content)  # NEVER DO THIS

XSS Prevention

Images are embedded as data URIs:

data_uri = f'data:{mime_type};base64,{image_data}'
html_content = html_content.replace(
    f'src="{file_path}"',
    f'src="{data_uri}"'
)
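
Embedding images does not cover text values that are substituted into the template. If entity fields can contain user-supplied text, escaping them before substitution is a common safeguard. The sketch below uses the standard library; substitute_escaped is a hypothetical helper, and whether the current engine already escapes values is not stated in this document.

```python
import html


def substitute_escaped(html_content: str, placeholder: str, value: object) -> str:
    """Replace a placeholder with an HTML-escaped value."""
    # html.escape converts &, <, > and (by default) both quote characters.
    return html_content.replace(placeholder, html.escape(str(value)))
```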
