AI Models

Complete reference of all AI models available in OpenStory

OpenStory integrates with a wide range of AI models across four categories: script analysis, image generation, motion/video generation, and music/audio generation. All media models are accessed via Fal.ai, while script analysis uses OpenRouter.

Script Analysis Models

These LLMs analyze your script, extract scenes, characters, and locations, and generate prompts. You can select multiple models to generate parallel sequences for comparison.

| Model | Provider | Context Window | License |
| --- | --- | --- | --- |
| Grok 4.1 Fast | xAI | 2M tokens | Proprietary |
| Claude Sonnet 4.6 | Anthropic | 1M tokens | Proprietary |
| Grok 4.2 | xAI | 2M tokens | Proprietary |
| Claude Opus 4.6 | Anthropic | 1M tokens | Proprietary |
| Mistral Small 4 | Mistral | 262K tokens | Open Source (Apache 2.0) |
| DeepSeek V3.2 | DeepSeek | 164K tokens | Open Source (MIT) |
| GLM-5 | Z.ai | 203K tokens | Open Source (MIT) |
| Gemini 3.1 Pro | Google | 1M tokens | Proprietary |
| GPT-5.4 | OpenAI | 1M tokens | Proprietary |
| Gemini 3 Flash | Google | 1M tokens | Proprietary |
| GPT-5.4 Mini | OpenAI | 400K tokens | Proprietary |
| Seed 2.0 Mini | ByteDance | 262K tokens | Proprietary |
| GPT-5.4 Nano | OpenAI | 400K tokens | Proprietary |
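
The context windows above determine how long a script each model can take in a single pass. The sketch below is a rough way to check this before choosing a model. It is not how OpenStory itself measures scripts: the 4-characters-per-token ratio is a common rule of thumb for English text, the `reserve` value is a hypothetical allowance for instructions and output, and only a few rows of the table are included, using the rounded figures as listed.

```python
# Context windows (in tokens) copied from a few rows of the table above.
CONTEXT_WINDOWS = {
    "Grok 4.1 Fast": 2_000_000,
    "Claude Sonnet 4.6": 1_000_000,
    "Mistral Small 4": 262_000,
    "DeepSeek V3.2": 164_000,
    "GLM-5": 203_000,
    "GPT-5.4 Mini": 400_000,
}

# Assumption: roughly 4 characters per token. Real tokenizers will differ.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Return a very rough token estimate using ceiling division."""
    return -(-len(text) // CHARS_PER_TOKEN)


def models_that_fit(script: str, reserve: int = 8_000) -> list[str]:
    """List models whose window covers the script plus a reserve.

    `reserve` is a hypothetical allowance for instructions and model output.
    """
    needed = estimate_tokens(script) + reserve
    return [name for name, window in CONTEXT_WINDOWS.items() if window >= needed]
```

For example, a script of about 800,000 characters is estimated at 200,000 tokens, which with the reserve rules out DeepSeek V3.2 and GLM-5 but still fits Mistral Small 4.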

Image Generation Models

These models create the visual images for each scene. You can select multiple models to generate variant images for comparison.

| Model | Provider | License | Notes |
| --- | --- | --- | --- |
| Nano Banana 2 | Google | Proprietary | Fast generation and editing (default) |
| Nano Banana Pro | Google | Proprietary | Enhanced realism and typography |
| Grok Imagine Image | Grok | Proprietary | Aesthetic with low censoring |
| FLUX.2 Max | Black Forest Labs | Proprietary | Exceptional realism |
| Phota | Phota | Proprietary | Character consistency via profiles |
| Hunyuan Image v3 | Tencent | Open Source | Strong composition |
| FLUX.2 Dev | Black Forest Labs | Open Source | 32B open weights with native editing |
| Qwen Image 2 Pro | Alibaba | Open Source (Apache 2.0) | Native 2K, text rendering |
| HiDream I1 | HiDream | Open Source (MIT) | 17B parameters |
| Seedream 5 | ByteDance | Proprietary | Unified generation and editing |

Edit Endpoints

Most image models support reference image editing via dedicated edit endpoints. This allows the AI to use character and location reference images when generating scenes, improving visual consistency.
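
As an illustration of the idea, the sketch below chooses between a model's regular endpoint and its edit endpoint depending on whether reference images are available. The `ImageModel` class, the `pick_endpoint` function, and the endpoint strings are all hypothetical placeholders. They are not real Fal.ai endpoint IDs, and this is not necessarily how OpenStory implements the choice.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ImageModel:
    name: str
    endpoint: str                        # placeholder ID, not a real Fal.ai path
    edit_endpoint: Optional[str] = None  # None when the model has no edit endpoint


def pick_endpoint(model: ImageModel, reference_images: list[str]) -> str:
    """Use the edit endpoint only when references exist and the model has one."""
    if reference_images and model.edit_endpoint is not None:
        return model.edit_endpoint
    return model.endpoint
```

A model without an edit endpoint simply falls back to plain text-to-image generation, which is why the paragraph above says "most" rather than "all" image models benefit from reference images.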

Motion/Video Models

These models animate still images into video clips.

| Model | Provider | Est. Time | License | Notes |
| --- | --- | --- | --- | --- |
| LTX 2.3 Pro | Lightricks | ~15s | Open Source | Best quality ranking |
| Veo 3.1 | Google | ~25s | Proprietary | 20K max prompt length |
| Kling v3 Pro | Kling | ~20s | Proprietary | Default model |
| Grok Imagine Video | Grok | ~20s | Proprietary | |
| MiniMax Hailuo 02 | MiniMax | ~15s | Proprietary | |
| Seedance 1.5 Pro | ByteDance | ~12s | Proprietary | 4K max prompt |
| Seedance 2 | ByteDance | ~20s | Proprietary | Animation styles only |

Aspect Ratio Compatibility

Not all motion models support all aspect ratios. OpenStory automatically filters to show only compatible models and will switch to a compatible default if your current model doesn't support the selected ratio.
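
The filter-then-fallback behavior described above can be sketched roughly as follows. The model names and the per-model ratio sets are hypothetical placeholders, since this page does not list which ratios each model supports, and the function names are illustrative only.

```python
# Hypothetical compatibility data using placeholder model names.
SUPPORTED_RATIOS = {
    "model-a": {"16:9", "9:16", "1:1"},
    "model-b": {"16:9", "9:16"},
    "model-c": {"16:9"},
}

# Preference order for the fallback; the first entry acts as the default.
DEFAULT_ORDER = ["model-a", "model-b", "model-c"]


def compatible_models(ratio: str) -> list[str]:
    """Return the models that support the given aspect ratio, in preference order."""
    return [name for name in DEFAULT_ORDER if ratio in SUPPORTED_RATIOS[name]]


def resolve_model(current: str, ratio: str) -> str:
    """Keep the current model if it is compatible, else switch to the first match."""
    options = compatible_models(ratio)
    if not options:
        raise ValueError(f"no motion model supports aspect ratio {ratio}")
    return current if current in options else options[0]
```

With this data, selecting 9:16 while `model-c` is active switches to `model-a`, whereas `model-b` stays selected because it already supports that ratio.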

Audio Support

Some motion models can generate audio alongside video. OpenStory checks each model's capabilities to determine audio support.
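
A capability check of this kind might look like the sketch below. The `MotionCapabilities` class, the `supports_audio` flag, and the `generate_audio` request field are hypothetical names chosen for illustration. The actual capability data and request parameters are not documented on this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MotionCapabilities:
    supports_audio: bool = False  # hypothetical capability flag


def build_request(prompt: str, caps: MotionCapabilities, want_audio: bool) -> dict:
    """Ask for audio only when the user wants it and the model can produce it."""
    request: dict = {"prompt": prompt}
    if want_audio and caps.supports_audio:
        request["generate_audio"] = True  # hypothetical parameter name
    return request


def needs_separate_audio(caps: MotionCapabilities, want_audio: bool) -> bool:
    """True when audio must come from a dedicated music or SFX model instead."""
    return want_audio and not caps.supports_audio
```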

Music & Audio Models

| Model | Provider | Max Duration | Type | License |
| --- | --- | --- | --- | --- |
| ElevenLabs Music | ElevenLabs | 600s (10 min) | Music | Proprietary |
| MiniMax Music v2 | MiniMax | 300s (5 min) | Music | Proprietary |
| ACE-Step 1.5 | ACE Studio | 240s (4 min) | Music | Open Source |
| Lyria 2 | Google | 30s | Music | Proprietary |
| MMAudio V2 | MMAudio | 8s | SFX | Open Source |
| ElevenLabs SFX | ElevenLabs | 22s | SFX | Proprietary |

Music vs. Sound Effects

Music models generate background music tracks from text prompts and optional tags. SFX models generate short sound effects. MMAudio V2 is unique in that it can also generate audio from video input (video-to-audio).
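
The split between the two model types can be expressed as a small routing function. This is a minimal sketch: the model names and types come from this page, while the `candidate_models` function and its parameters are hypothetical.

```python
# Model types as listed on this page.
AUDIO_MODELS = {
    "ElevenLabs Music": "music",
    "MiniMax Music v2": "music",
    "ACE-Step 1.5": "music",
    "Lyria 2": "music",
    "MMAudio V2": "sfx",
    "ElevenLabs SFX": "sfx",
}


def candidate_models(kind: str, has_video_input: bool = False) -> list[str]:
    """Return the models that can serve a 'music' or 'sfx' request.

    Video-to-audio SFX requests are routed to MMAudio V2 only, since it is
    the one model described here as accepting video input.
    """
    if kind not in ("music", "sfx"):
        raise ValueError(f"unknown audio kind: {kind}")
    if kind == "sfx" and has_video_input:
        return ["MMAudio V2"]
    return [name for name, model_kind in AUDIO_MODELS.items() if model_kind == kind]
```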

Capabilities

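| Feature | ElevenLabs Music | MiniMax v2 | ACE-Step | Lyria 2 |
| --- | --- | --- | --- | --- |
| Prompt-based | Yes | Yes | Yes | Yes |
| Lyrics support | No | Yes | Yes | No |
| Instrumental | Yes | Yes | Yes | Yes |
| Long-form | Yes (10 min) | Yes (5 min) | Yes (4 min) | No (30s) |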
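
To narrow down a music model from the capabilities above, you can filter on duration and lyrics support together. The sketch below encodes the values from this page. The `MusicModel` class and `matching_models` function are hypothetical helpers, not part of OpenStory.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MusicModel:
    name: str
    max_seconds: int  # maximum track length, from the Max Duration column
    lyrics: bool      # whether the model supports lyrics


MUSIC_MODELS = [
    MusicModel("ElevenLabs Music", 600, False),
    MusicModel("MiniMax Music v2", 300, True),
    MusicModel("ACE-Step 1.5", 240, True),
    MusicModel("Lyria 2", 30, False),
]


def matching_models(duration_s: int, needs_lyrics: bool = False) -> list[str]:
    """Return models that cover the requested duration and lyrics requirement."""
    return [
        m.name
        for m in MUSIC_MODELS
        if m.max_seconds >= duration_s and (m.lyrics or not needs_lyrics)
    ]
```

For instance, a 270-second track with lyrics leaves only MiniMax Music v2, while any track longer than 5 minutes with lyrics has no match among these models.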