Export System Architecture
This document provides a comprehensive overview of the StudioBrain Export System architecture, implementation details, and guidelines for extending the system.
Table of Contents
- Architecture Overview
- Service Architecture
- Template Engine
- PDF Generation Pipeline
- Async Task Processing
- File System Organization
- Adding Custom Export Templates
- Extending Export Formats
- Testing Custom Formats
- Error Handling
- Performance Optimization
- Security Considerations
Architecture Overview
The Export System follows a layered architecture pattern:
┌─────────────────────────────────────────────────────────────┐
│ API Layer (FastAPI) │
│ /backend/routes/export_routes.py │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────────┐ │
│ │ /api/export │ │ /api/export │ │ /api/export/{id} │ │
│ │ /layouts │ │ /html │ │ /pdf, /docx │ │
│ └──────────────┘ └──────────────┘ └──────────────────┘ │
└───────────────────────────┬─────────────────────────────────┘
│
┌───────────────────────────▼─────────────────────────────────┐
│ Service Layer (Python) │
│ /backend/services/export_service.py │
│ ┌──────────────────────────────────────────────────────┐ │
│ │ DocumentExporter Class │ │
│ │ - export_html() │ │
│ │ - export_pdf() │ │
│ │ - export_docx() │ │
│ │ - _render_entity_data() │ │
│ │ - _embed_images() │ │
│ └──────────────────────────────────────────────────────┘ │
└───────────────────────────┬─────────────────────────────────┘
│
┌───────────────────────────▼─────────────────────────────────┐
│ Template Layer (Jinja2) │
│ /backend/services/export_layouts.py │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────────┐ │
│ │ GDD Layout │ │ Script │ │ Style Guide │ │
│ │ Template │ │ Format │ │ Layout │ │
│ └──────────────┘ └──────────────┘ └──────────────────┘ │
└─────────────────────────────────────────────────────────────┘
Components
1. API Layer (export_routes.py)
Purpose: RESTful HTTP endpoints for export operations
Responsibilities:
- Validate request parameters
- Authenticate users
- Fetch entity data from database
- Call service layer for export processing
- Return formatted responses
Key Functions:
@router.post("/{entity_type}/{entity_id}/html")
async def export_html(...)
@router.post("/{entity_type}/{entity_id}/pdf")
async def export_pdf(...)
@router.post("/{entity_type}/{entity_id}/docx")
async def export_docx(...)
2. Service Layer (export_service.py)
Purpose: Core export logic and document generation
Main Class: DocumentExporter
Responsibilities:
- Entity data rendering
- Template processing
- Image embedding
- Format-specific generation
Key Methods:
def export_html(entity, template, layout_name, theme, embed_images)
def export_pdf(entity, template, layout_name, theme, embed_images)
def export_docx(entity, template, layout_name, theme, embed_images)
3. Template Layer (export_layouts.py)
Purpose: Define layout templates and styles
Data Structures:
- LIST_OF_LAYOUTS - Available layouts
- LAYOUT_TEMPLATES - Template definitions
Key Functions:
def get_layout_template(layout_name)
def get_layout_css(layout_name, theme)
def list_layouts()
Service Architecture
DocumentExporter Class
The DocumentExporter class is the heart of the export system.
Initialization
from pathlib import Path
from services.export_service import DocumentExporter
exporter = DocumentExporter(
    templates_path=Path("/data/content/_Templates/Export")
)
Entity Data Rendering
The _render_entity_data() method transforms entity data into template-ready format:
def _render_entity_data(self, entity, template) -> Dict[str, Any]:
    """
    Convert entity to template data structure.

    Returns:
        {
            'entity_type': 'character',
            'entity_id': 'uuid',
            'name': 'Character Name',
            'status': 'active',
            'created_date': '2026-03-09',
            'last_updated': '2026-03-09',
            'fields': {...},
            'tags': [...],
            'assets': [...],
            'primary_asset': {...},
            'template': {...},
            'frontmatter': {...}
        }
    """
Entity-Specific Fields
The _add_entity_specific_fields() method adds type-specific data:
Character:
{
    'full_name': fields.get('full_name', ''),
    'age': fields.get('age', ''),
    'occupation': fields.get('occupation', ''),
    'affiliation': fields.get('affiliation', ''),
}
Location:
{
    'location_type': fields.get('location_type', ''),
    'district': fields.get('district', ''),
    'description': fields.get('description', ''),
}
Brand:
{
    'brand_name': fields.get('brand_name', ''),
    'industry': fields.get('industry', ''),
    'founded': fields.get('founded', ''),
}
Image Embedding
The _embed_images() method converts image references to base64 data URIs:
def _embed_images(self, html_content: str, entity: Any) -> str:
    """
    Convert image references to base64 data URIs.

    Args:
        html_content: HTML with image src references
        entity: Entity with assets

    Returns:
        HTML with embedded base64 images

    Supported formats:
        - PNG (image/png)
        - JPEG (image/jpeg)
        - GIF (image/gif)
        - WebP (image/webp)
        - SVG (image/svg+xml)
    """
Process:
- Iterate through entity assets
- Read image file from disk
- Encode as base64
- Determine MIME type from file extension
- Replace src="path/to/image.png" with src="data:image/png;base64,..."
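The process above can be sketched as follows. This is a minimal illustration rather than the actual _embed_images() code: the helper names to_data_uri and embed_image are hypothetical, and the extension-to-MIME mapping is taken from the supported formats listed in the docstring.

```python
import base64
from pathlib import Path

# MIME types from the _embed_images() docstring, keyed by file extension.
MIME_TYPES = {
    ".png": "image/png",
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".gif": "image/gif",
    ".webp": "image/webp",
    ".svg": "image/svg+xml",
}


def to_data_uri(image_path: Path) -> str:
    """Read an image from disk and return it as a base64 data URI."""
    mime_type = MIME_TYPES.get(image_path.suffix.lower())
    if mime_type is None:
        raise ValueError(f"Unsupported image type: {image_path.suffix}")
    encoded = base64.b64encode(image_path.read_bytes()).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def embed_image(html_content: str, src: str, image_path: Path) -> str:
    """Replace one src="..." reference with the embedded data URI."""
    return html_content.replace(f'src="{src}"', f'src="{to_data_uri(image_path)}"')
```

Raising on an unknown extension, instead of guessing a MIME type, keeps unsupported files from being embedded silently.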
Template Engine
Simple Template Rendering
The export system uses a custom template engine (not full Jinja2) that supports:
Variable Substitution
# Replace {{variable}} with value
html_content = html_template.replace('{{name}}', render_data['name'])
Supported patterns:
- {{variable}} - Simple variable
- {{fields.field_name}} - Nested field access
- {{entity_type|title}} - Filter support
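The three patterns above can be handled by a single regular-expression pass. The following is a minimal sketch and not the actual engine: the render function, the PLACEHOLDER pattern, and the FILTERS table are assumptions made for illustration, and missing variables are rendered as empty strings.

```python
import re
from typing import Any, Callable, Dict

# Hypothetical filter table; only "title" appears in the documented patterns.
FILTERS: Dict[str, Callable[[str], str]] = {
    "title": str.title,
    "upper": str.upper,
    "lower": str.lower,
}

# Matches {{name}}, {{fields.description}} and {{entity_type|title}}.
PLACEHOLDER = re.compile(r"\{\{\s*([\w.]+)\s*(?:\|\s*(\w+)\s*)?\}\}")


def render(template: str, data: Dict[str, Any]) -> str:
    """Substitute placeholders with values looked up in data."""

    def substitute(match: "re.Match[str]") -> str:
        path, filter_name = match.group(1), match.group(2)
        value: Any = data
        # Walk dotted paths such as fields.description one key at a time.
        for part in path.split("."):
            value = value.get(part, "") if isinstance(value, dict) else ""
        text = str(value)
        if filter_name in FILTERS:
            text = FILTERS[filter_name](text)
        return text

    return PLACEHOLDER.sub(substitute, template)
```

Note that this sketch inserts values verbatim; it does not HTML-escape them.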
Loop Support
# Handle lists (e.g., tags)
list_html = ''.join(f'<span class="tag">{item}</span>' for item in value)
html_content = html_content.replace('{{tags}}', list_html)
Template Structure
A complete template consists of:
HTML Template (.html.jinja)
<div class="document">
  {# Cover Page #}
  <div class="cover-page">
    <h1 class="document-title">{{name}}</h1>
    <h2 class="document-subtitle">{{entity_type|title}} Document</h2>
    <p class="status">Status: {{status|title}}</p>
  </div>
  {# Content Sections #}
  <div class="content">
    <section id="overview">
      <h1>Overview</h1>
      <p>{{fields.description}}</p>
    </section>
    <section id="details">
      <h1>Details</h1>
      <table class="details-table">
        {% for key, value in fields.items() %}
        <tr>
          <th>{{key|replace('_', ' ')|title}}</th>
          <td>{{value}}</td>
        </tr>
        {% endfor %}
      </table>
    </section>
  </div>
</div>
CSS Template (.css)
:root {
    --primary-color: #2c3e50;
    --secondary-color: #3498db;
    --text-color: #333333; /* assumed value; referenced by .document below */
    --font-family: 'Segoe UI', sans-serif;
}
.document {
    font-family: var(--font-family);
    color: var(--text-color);
    max-width: 210mm;
    padding: 20mm;
}
.cover-page {
    page-break-after: always;
    text-align: center;
}
Template Loading
Templates are loaded at startup:
def _load_layout_templates(self):
    """Load all layout templates from files."""
    for layout_name in LIST_OF_LAYOUTS:
        html_file = self.templates_path / f"{layout_name}.html.jinja"
        css_file = self.templates_path / f"{layout_name}.css"
        layout_data = {}
        if html_file.exists():
            with open(html_file, 'r') as f:
                layout_data['html'] = f.read()
        if css_file.exists():
            with open(css_file, 'r') as f:
                layout_data['css'] = f.read()
        self._layout_cache[layout_name] = layout_data
PDF Generation Pipeline
The PDF export pipeline renders the exported HTML in headless Chromium, driven by Playwright, so the PDF matches what a browser would display.
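The four steps described in this section can be combined into a single function. The following is a minimal sketch, not the actual export_pdf() implementation: the names render_pdf and build_pdf_options are hypothetical, and the try/finally that closes the browser is an addition that the step-by-step snippets do not show. Playwright is imported inside the function so that the module still loads where Playwright is not installed.

```python
from typing import Any, Dict


def build_pdf_options(margin: str = "20mm") -> Dict[str, Any]:
    """Return keyword arguments for page.pdf() (hypothetical helper)."""
    return {
        "format": "A4",
        "print_background": True,
        "margin": {"top": margin, "right": margin, "bottom": margin, "left": margin},
    }


def render_pdf(html_content: str, margin: str = "20mm") -> bytes:
    """Render an HTML string to PDF bytes with headless Chromium."""
    # Imported lazily so that importing this module does not require Playwright.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        try:
            page = browser.new_page()
            page.set_content(html_content, wait_until="networkidle")
            return page.pdf(**build_pdf_options(margin))
        finally:
            # Always release the browser process, even if rendering fails.
            browser.close()
```

Keeping the PDF options in a separate function makes the margins and page format testable without launching a browser.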
Pipeline Steps
Step 1: HTML Generation
html_content = self.export_html(
    entity=entity,
    template=template,
    layout_name='gdd_standard',
    theme='default',
    embed_images=True
)
Step 2: Browser Setup
from playwright.sync_api import sync_playwright
with sync_playwright() as p:
    browser = p.chromium.launch(headless=True)
    page = browser.new_page()
Step 3: Content Rendering
page.set_content(html_content, wait_until='networkidle')
Options:
- wait_until='networkidle' - Wait until there are no network connections for at least 500 ms
- wait_until='domcontentloaded' - Wait for the DOMContentLoaded event
- wait_until='load' - Wait for the load event (complete page load)
Step 4: PDF Generation
pdf_bytes = page.pdf(
    format='A4',
    print_background=True,
    margin={
        'top': '20mm',
        'right': '20mm',
        'bottom': '20mm',
        'left': '20mm',
    }
)
Print Optimization
CSS @media print rules are applied:
@media print {
    .document {
        padding: 0;
        max-width: none;
    }
    .cover-page,
    .toc-page {
        page-break-after: always;
    }
    .section {
        page-break-inside: avoid;
    }
}
Requirements
# Install Python package (usually in requirements.txt)
pip install playwright
# Install Playwright browser binaries
playwright install chromium
Async Task Processing
Async task processing for large exports and batch operations is a planned enhancement and is not yet implemented. The structure in this section describes the intended design.
Task Structure
{
    "task_id": "uuid",
    "entity_type": "character",
    "entity_id": "uuid",
    "format": "pdf",
    "status": "processing",  # pending, processing, completed, failed
    "progress": 0.5,  # 0.0 to 1.0
    "created_at": "2026-03-09T00:00:00Z",
    "started_at": "2026-03-09T00:00:01Z",
    "completed_at": "2026-03-09T00:00:10Z",
    "result": {
        "download_url": "/api/export/download/task-uuid",
        "file_size": 1048576,
        "filename": "character.pdf"
    }
}
Task Workflow
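The status values in the task structure (pending, processing, completed, failed) imply a simple lifecycle. The following is a minimal sketch of that lifecycle, assuming tasks only move forward and that a task can fail from either non-terminal state. The ExportTask class and the ALLOWED_TRANSITIONS table are hypothetical, since async processing is not yet part of the system.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from enum import Enum
from typing import Optional


class TaskStatus(str, Enum):
    PENDING = "pending"
    PROCESSING = "processing"
    COMPLETED = "completed"
    FAILED = "failed"


# Assumed transition table: completed and failed are terminal states.
ALLOWED_TRANSITIONS = {
    TaskStatus.PENDING: {TaskStatus.PROCESSING, TaskStatus.FAILED},
    TaskStatus.PROCESSING: {TaskStatus.COMPLETED, TaskStatus.FAILED},
    TaskStatus.COMPLETED: set(),
    TaskStatus.FAILED: set(),
}


@dataclass
class ExportTask:
    task_id: str
    status: TaskStatus = TaskStatus.PENDING
    progress: float = 0.0
    started_at: Optional[datetime] = None
    completed_at: Optional[datetime] = None

    def transition(self, new_status: TaskStatus) -> None:
        """Move to new_status, rejecting transitions the table does not allow."""
        if new_status not in ALLOWED_TRANSITIONS[self.status]:
            raise ValueError(
                f"Cannot move from {self.status.value} to {new_status.value}"
            )
        now = datetime.now(timezone.utc)
        if new_status is TaskStatus.PROCESSING:
            self.started_at = now
        else:
            # Both terminal states record a completion timestamp.
            self.completed_at = now
            if new_status is TaskStatus.COMPLETED:
                self.progress = 1.0
        self.status = new_status
```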
File System Organization
Directory Structure
/data/content/
├── _Templates/
│   └── Export/
│       ├── gdd_standard.html.jinja
│       ├── gdd_standard.css
│       ├── script_format.html.jinja
│       ├── script_format.css
│       ├── style_guide.html.jinja
│       └── style_guide.css
│
├── characters/
│   └── uuid-1/
│       └── markdown.md
│
└── assets/
    └── images/
        └── character-portrait.png
Template Files
Each layout requires two files:
- .html.jinja - HTML template structure
- .css - CSS styling (themes in :root variables)
Asset Storage
Entity assets are stored in:
/data/content/assets/{asset_type}/{entity_id}/
Or globally:
/data/content/assets/{file_path}
Content Base Path
The CONTENT_BASE_PATH environment variable defines the root:
# From shared.py
CONTENT_BASE_PATH = Path(os.environ.get('CONTENT_BASE_PATH', '/data/content'))
Adding Custom Export Templates
Step 1: Create Layout Files
Create two files in _Templates/Export/:
my_layout.html.jinja:
<div class="document">
  <div class="cover">
    <h1>{{name}}</h1>
    <p>{{entity_type|title}} Document</p>
  </div>
  <div class="content">
    {{fields.description}}
  </div>
</div>
my_layout.css:
:root {
    --primary-color: #4a90d9;
    --font-family: 'Arial', sans-serif;
}
.document {
    font-family: var(--font-family);
    padding: 20mm;
}
.cover {
    text-align: center;
    margin-bottom: 30mm;
}
.cover h1 {
    font-size: 36pt;
    color: var(--primary-color);
}
Step 2: Register the Layout
Add to LIST_OF_LAYOUTS in export_layouts.py:
LIST_OF_LAYOUTS = [
    'gdd_standard',
    'script_format',
    'style_guide',
    'my_layout',  # New layout
]
Step 3: Update Template Registry
Add to LAYOUT_TEMPLATES dictionary:
LAYOUT_TEMPLATES = {
    'gdd_standard': {...},
    'script_format': {...},
    'style_guide': {...},
    'my_layout': {
        'html': MY_LAYOUT_HTML,  # Load from file or inline
        'css': {
            'default': MY_LAYOUT_CSS,
        },
        'description': 'My custom layout',
        'features': ['cover', 'content'],
    },
}
Step 4: Restart Service
Restart the service so the new layout is picked up. Templates are loaded at startup and cached in memory (see Template Loading).
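Because each layout needs both an .html.jinja and a .css file, a quick check before restarting can catch a missing or misnamed file. The following is a minimal sketch, assuming the file naming shown under Template Files; missing_layout_files is a hypothetical helper and not part of export_layouts.py.

```python
from pathlib import Path
from typing import List


def missing_layout_files(templates_path: Path, layout_name: str) -> List[str]:
    """Return the names of the expected layout files that are not on disk."""
    expected = [f"{layout_name}.html.jinja", f"{layout_name}.css"]
    return [name for name in expected if not (templates_path / name).is_file()]
```

An empty list means both files are present; for example, missing_layout_files(Path("/data/content/_Templates/Export"), "my_layout") could be run before restarting the service.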
Extending Export Formats
Adding a New Format (e.g., Markdown)
1. Create Export Method
Add to DocumentExporter class:
def export_markdown(
    self,
    entity: Any,
    template: Dict[str, Any],
    include_assets: bool = True,
) -> bytes:
    """
    Export entity to Markdown format.
    """
    # Get template
    md_content = template.get('markdown_body', '')
    # Add entity metadata
    metadata = f"""---
title: {entity.name}
entity_type: {entity.entity_type}
status: {entity.status}
created: {entity.created_date.isoformat() if entity.created_date else 'N/A'}
last_updated: {entity.last_updated.isoformat() if entity.last_updated else 'N/A'}
---
"""
    # Add fields section
    fields_md = "\n## Fields\n\n"
    for key, value in (entity.fields or {}).items():
        fields_md += f"### {key.replace('_', ' ').title()}\n\n"
        fields_md += f"{value}\n\n"
    # Add assets section
    if include_assets and entity.assets:
        assets_md = "\n## Assets\n\n"
        for asset in entity.assets:
            assets_md += f"- **{asset.get('file_name', 'Asset')}**: {asset.get('description', '')}\n"
            if asset.get('file_path'):
                assets_md += f"  - Path: `{asset['file_path']}`\n"
        assets_md += "\n"
    else:
        assets_md = ""
    return (metadata + md_content + fields_md + assets_md).encode('utf-8')
2. Add API Endpoint
Add to export_routes.py:
@router.post("/{entity_type}/{entity_id}/markdown")
async def export_markdown(
    entity_type: str,
    entity_id: str,
    include_assets: bool = Query(default=True),
    current_user: User = Depends(get_current_user),
    db: Session = Depends(get_db),
):
    entity = _get_entity_or_404(entity_type, entity_id, db)
    template = _get_template_or_default(entity_type, db)
    exporter = get_document_exporter()
    md_bytes = exporter.export_markdown(entity, template, include_assets)
    return Response(
        content=md_bytes,
        media_type="text/markdown",
        headers={
            "Content-Disposition": f'attachment; filename="{entity.name}_{entity_type}.md"',
        },
    )
Testing Custom Formats
Unit Testing
# tests/test_export_service.py
from datetime import datetime
from unittest.mock import Mock
from services.export_service import DocumentExporter
def test_export_markdown():
    exporter = DocumentExporter()
    # Create mock entity
    mock_entity = Mock()
    mock_entity.entity_type = 'character'
    mock_entity.entity_id = 'test-uuid'
    mock_entity.name = 'Test Character'
    mock_entity.status = 'active'
    mock_entity.created_date = datetime(2026, 3, 9)
    mock_entity.last_updated = datetime(2026, 3, 9)
    mock_entity.fields = {'description': 'Test description', 'age': 25}
    mock_entity.assets = []
    mock_template = {
        'name': 'Character',
        'frontmatter': {},
        'markdown_body': '# Character\n\nTest content',
    }
    result = exporter.export_markdown(mock_entity, mock_template)
    assert b'Character' in result
    assert b'25' in result
    assert b'Test description' in result
Integration Testing
# tests/test_export_routes.py
def test_export_markdown_endpoint(client, authenticated_user):
    response = client.post(
        '/api/export/character/test-uuid/markdown',
        headers={'Authorization': f'Bearer {authenticated_user.token}'}
    )
    assert response.status_code == 200
    assert response.headers['content-type'] == 'text/markdown; charset=utf-8'
    assert b'Test Character' in response.content
Manual Testing
# Test Markdown export
curl -X POST \
  -H "Authorization: Bearer YOUR_TOKEN" \
  "http://localhost:8201/api/export/character/uuid/markdown" \
  -o character.md
# View result
cat character.md
Error Handling
Common Errors
Entity Not Found
# In _get_entity_or_404()
if not entity:
    raise HTTPException(
        status_code=404,
        detail=f"Entity '{entity_type}/{entity_id}' not found"
    )
Template Not Found
# In _render_html_template()
if not layout:
    logger.warning("Layout template '%s' not found, using default", layout_name)
    layout_name = 'gdd_standard'
    layout = get_layout_template(layout_name)
Missing Dependencies
# In export_pdf()
try:
    from playwright.sync_api import sync_playwright
except ImportError as exc:
    raise RuntimeError(
        "Playwright is required for PDF export. Install with: "
        "pip install playwright && playwright install chromium"
    ) from exc
Logging
All export operations are logged:
logger.info("Export started: %s/%s to %s", entity_type, entity_id, format)
logger.error("Export failed: %s", str(e))
Performance Optimization
Caching
Template data is cached in memory:
self._layout_cache: Dict[str, Dict[str, str]] = {}
Async Processing
For batch exports, consider:
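# Use asyncio for concurrent exports
import asyncio
async def export_multiple(entities):
    # export_single is assumed to be a coroutine defined elsewhere; wrap blocking
    # exporters (such as the sync Playwright API) with asyncio.to_thread()
    tasks = [export_single(entity) for entity in entities]
    results = await asyncio.gather(*tasks)
    return results
Image Optimization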
For large images:
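# Compress before embedding
from io import BytesIO
from PIL import Image
def compress_image(image_path, max_size=(800, 800)):
    img = Image.open(image_path)
    img.thumbnail(max_size)  # Resizes in place, preserving aspect ratio
    # Save to bytes (the PNG fallback is an assumption)
    buffer = BytesIO()
    img.save(buffer, format=img.format or 'PNG')
    return buffer.getvalue()
Security Considerations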
Input Validation
All entity types are validated:
ALLOWED_ENTITY_TYPES = ['character', 'location', 'item', 'brand', ...]
if entity_type not in ALLOWED_ENTITY_TYPES:
    raise HTTPException(status_code=400, detail="Invalid entity type")
Template Injection Prevention
The simple template engine doesn’t support arbitrary Python code:
# Safe: Only variable substitution
html_content = html_content.replace('{{name}}', entity.name)
# Not supported (no eval, no exec)
# html_content = eval(html_content)  # NEVER DO THIS
XSS Prevention
Images are embedded as data URIs:
data_uri = f'data:{mime_type};base64,{image_data}'
html_content = html_content.replace(
    f'src="{file_path}"',
    f'src="{data_uri}"'
)
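Embedding images only addresses external image references. Text values such as entity names are inserted by plain string replacement (see Variable Substitution), so markup inside a field would reach the output unchanged. The following is a minimal sketch of escaping values before substitution, using html.escape from the standard library. The helper name substitute_escaped is hypothetical, and this is not the existing implementation.

```python
from html import escape


def substitute_escaped(html_content: str, variable: str, value: object) -> str:
    """Replace {{variable}} with an HTML-escaped string form of value."""
    placeholder = "{{" + variable + "}}"
    # quote=True also escapes " and ', which matters inside attribute values.
    return html_content.replace(placeholder, escape(str(value), quote=True))
```

Values that are meant to contain markup, such as the generated tag spans, would need to bypass this helper deliberately.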