# Multimodal Story Generation System

Transform visual inputs into structured narratives. The system combines a vision-language model with a large language model to generate dynamic, multi-chapter stories from images.
## Features
- 🖼️ Image Analysis - Extract narrative elements from images using LLaVA
- 📖 Adaptive Story Generation - Generate 5-chapter stories with Gemma2-27B
- 🧠 Context Awareness - Maintain narrative consistency with ChromaDB RAG
- 📊 Interactive Visualization - ReactFlow-powered story graph interface
- 🚀 Production Ready - Dockerized microservices architecture
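At a high level, the features above form a pipeline: analyze an image, then generate chapters while carrying context forward. The sketch below models that flow in plain Python. Every name here (`ImageAnalysis`, `Chapter`, `generate_story`) is illustrative, not the project's actual API; the real system sends the prompts to Gemma2-27B via Ollama.

```python
from dataclasses import dataclass

@dataclass
class ImageAnalysis:
    """Narrative elements extracted from an image (illustrative shape)."""
    setting: str
    characters: list[str]
    mood: str

@dataclass
class Chapter:
    index: int
    text: str

def generate_story(analysis: ImageAnalysis, num_chapters: int = 5) -> list[Chapter]:
    """Hypothetical driver: one generation step per chapter,
    feeding the previous chapter forward as context."""
    chapters: list[Chapter] = []
    context = f"Setting: {analysis.setting}; cast: {', '.join(analysis.characters)}"
    for i in range(1, num_chapters + 1):
        # In the real system this prompt goes to the LLM, with
        # ChromaDB supplying retrieved context for consistency.
        text = f"[Chapter {i} drafted from: {context} | mood={analysis.mood}]"
        chapters.append(Chapter(index=i, text=text))
        context = text  # carry the latest chapter forward as context
    return chapters
```

The forward-carried `context` is what makes the stories "adaptive": each chapter is conditioned on what came before.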
## Table of Contents

- [Quick Start](#quick-start)
- [System Requirements](#system-requirements)
- [Architecture](#architecture)
- [Production Deployment](#production-deployment)
- [Troubleshooting](#troubleshooting)
## Quick Start

### Local Development Setup
- Clone the repository

  ```bash
  git clone https://github.com/kliewerdaniel/ITB02
  cd ITB02
  ```
- Create a virtual environment

  ```bash
  python -m venv venv
  source venv/bin/activate   # Linux/macOS
  venv\Scripts\activate      # Windows
  ```
- Install dependencies

  ```bash
  pip install -r requirements.txt

  # Apple Silicon only: nightly CPU wheels for PyTorch, plus image codecs
  pip install --pre torch --extra-index-url https://download.pytorch.org/whl/nightly/cpu
  brew install libjpeg webp
  ```
- Initialize AI models

  ```bash
  ollama pull gemma2:27b
  ollama pull llava
  ```
- Start services

  ```bash
  # Backend (FastAPI)
  uvicorn backend.main:app --reload

  # Frontend (new terminal)
  cd frontend
  npm install && npm run dev
  ```
- Verify the installation

  ```bash
  curl http://localhost:8000/health
  # Expected response: {"status":"healthy"}
  ```
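The same health check can be scripted for use in CI or setup automation. The `/health` endpoint and its `{"status":"healthy"}` body come from the step above; the helper names below are mine, not part of the repository.

```python
import json
import urllib.request

def is_healthy(body: str) -> bool:
    """Return True if a /health response body reports a healthy status."""
    try:
        return json.loads(body).get("status") == "healthy"
    except json.JSONDecodeError:
        return False

def check_backend(url: str = "http://localhost:8000/health", timeout: float = 2.0) -> bool:
    """Fetch the health endpoint; False on connection errors or bad payloads."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return is_healthy(resp.read().decode("utf-8"))
    except OSError:
        return False
```

Splitting the parsing (`is_healthy`) from the network call (`check_backend`) keeps the response-format check testable without a running server.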
## System Requirements
- Python 3.11+
- Node.js 18+
- Ollama runtime
- 16GB RAM (24GB+ recommended for GPU acceleration)
- 10GB+ Disk Space
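The Python 3.11+ requirement can be verified before installing anything. `meets_python_requirement` is a hypothetical preflight helper, not part of the repository.

```python
import sys

def meets_python_requirement(version=None, minimum=(3, 11)):
    """True when the interpreter version is at least `minimum` (major, minor)."""
    version = tuple(sys.version_info[:2]) if version is None else tuple(version[:2])
    return version >= minimum

if __name__ == "__main__":
    # Exit non-zero on an unsupported interpreter, for use in setup scripts.
    sys.exit(0 if meets_python_requirement() else 1)
```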
Architecture
[Frontend] ←HTTP→ [FastAPI]
↓ ↑
[Ollama] ←→ [ChromaDB]
↓
[Redis]
↓
[Celery Workers]
### Key Components

| Component | Technology Stack | Function |
|---|---|---|
| Image Analysis | LLaVA, Pillow | Visual narrative extraction |
| Story Engine | Gemma2-27B, LangChain | Context-aware chapter generation |
| Knowledge Base | ChromaDB | Narrative consistency management |
| API Layer | FastAPI | REST endpoint management |
| Visualization | ReactFlow, Zustand | Interactive story mapping |
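To illustrate what the knowledge base does for narrative consistency, here is a minimal in-memory retrieval sketch: chapters are stored as embedding/text pairs and ranked by cosine similarity against a query. The real system delegates storage and search to ChromaDB; every name below is illustrative.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

class ToyNarrativeStore:
    """In-memory stand-in for a ChromaDB collection of chapter embeddings."""

    def __init__(self):
        self._entries = []  # (embedding, text) pairs

    def add(self, embedding, text):
        self._entries.append((embedding, text))

    def most_similar(self, query, k=2):
        """Return the k stored texts closest to the query embedding."""
        ranked = sorted(self._entries, key=lambda e: cosine(query, e[0]), reverse=True)
        return [text for _, text in ranked[:k]]
```

Before generating a chapter, the story engine retrieves the most similar earlier passages and feeds them into the prompt, which is what keeps characters and settings consistent across chapters.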
## Production Deployment

### Docker Setup
```bash
# Build and launch all services
docker-compose up --build

# Initialize the vector store
docker exec -it backend python -c "from backend.core.rag_manager import NarrativeRAG; NarrativeRAG()"
```
Cluster Configuration
# docker-compose.yml excerpt
services:
ollama:
deploy:
resources:
limits:
memory: 12G
cpus: '4'
## Troubleshooting

### Common Issues
- Missing vector store: recreate the ChromaDB directory

  ```bash
  rm -rf chroma_db && mkdir chroma_db
  ```
- Out-of-memory errors: cap how many models Ollama keeps loaded

  ```bash
  export OLLAMA_MAX_LOADED_MODELS=2
  ```
- CUDA compatibility issues: reinstall PyTorch against matching CUDA wheels

  ```bash
  pip uninstall torch
  pip install torch --extra-index-url https://download.pytorch.org/whl/cu117
  ```
---

**Daniel Kliewer**
AI Systems Developer
[GitHub Profile](https://github.com/kliewerdaniel)