Happy Horse 1.0 Features

Apr 7, 2026

Joint Video + Audio Synthesis

Stop syncing by hand. Dialogue, ambient sound, and foley are all generated together, perfectly in phase.

Happy Horse 1.0's core architecture treats video and audio as one unified sequence. This means no more stitching audio tracks or fighting with cross-attention modules. Dialogue, ambient sounds, and foley effects are generated simultaneously in a single pass.
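
As a rough illustration of what "one unified sequence" can mean, the sketch below interleaves the video tokens and audio tokens for each time slice into a single list, so one model pass sees both modalities side by side. The `Token` type, the `build_joint_sequence` helper, and the per-slice interleaving are hypothetical choices made for this example; the actual token layout used by Happy Horse 1.0 is not described here.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Token:
    modality: str  # "video" or "audio"
    frame: int     # index of the time slice this token belongs to
    value: int     # placeholder token id


def build_joint_sequence(video: List[List[int]], audio: List[List[int]]) -> List[Token]:
    """Interleave per-slice video and audio token ids into one sequence.

    Assumption: both inputs are already split into the same number of time slices.
    """
    if len(video) != len(audio):
        raise ValueError("video and audio must cover the same number of time slices")
    sequence: List[Token] = []
    for frame, (video_ids, audio_ids) in enumerate(zip(video, audio)):
        sequence.extend(Token("video", frame, v) for v in video_ids)
        sequence.extend(Token("audio", frame, a) for a in audio_ids)
    return sequence


# Two time slices: two video tokens and one audio token per slice (made-up ids).
video_tokens = [[11, 12], [13, 14]]
audio_tokens = [[901], [902]]
joint = build_joint_sequence(video_tokens, audio_tokens)
print([(t.modality, t.frame) for t in joint])
```

Because the audio tokens for a time slice sit next to the video tokens for the same slice, a model attending over this sequence needs no separate alignment step between the two streams.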

Global Lip Sync

Speak to your audience in their own language. Seven languages are supported with phoneme-perfect accuracy.

Supported Languages:
English
Mandarin
Cantonese
Japanese
Korean
German
French

8-Step Fast Rendering

From prompt to preview in ~38 seconds. No more waiting around for high-quality renders.

The 8-step DMD-2 distilled model renders a 1080p video in about 38 seconds on an H100 GPU. MagiCompiler makes it faster still.
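
To make the "8-step" figure concrete, here is a minimal sketch of the inference-side shape of a few-step sampler. It does not show DMD-2 distillation itself, which happens at training time. The evenly spaced schedule, the linear noise model `x_t = (1 - t) * x0 + t * noise`, and the `predict_clean` callback are all assumptions for illustration, not the Happy Horse 1.0 implementation.

```python
from typing import Callable, List

Vector = List[float]


def make_timesteps(num_steps: int) -> List[float]:
    """Evenly spaced noise levels from 1.0 (pure noise) down to 1/num_steps."""
    return [1.0 - i / num_steps for i in range(num_steps)]


def few_step_sample(
    noise: Vector,
    predict_clean: Callable[[Vector, float], Vector],
    num_steps: int = 8,
) -> Vector:
    """Run a fixed number of denoising steps, one model call per step."""
    x = list(noise)
    timesteps = make_timesteps(num_steps)
    for i, t in enumerate(timesteps):
        x0 = predict_clean(x, t)  # hypothetical model call: estimate of the clean sample
        t_next = timesteps[i + 1] if i + 1 < num_steps else 0.0
        # Under the assumed linear noise model, moving from level t to t_next
        # rescales the residual (x - x0) by t_next / t.
        x = [c + (t_next / t) * (xi - c) for xi, c in zip(x, x0)]
    return x


calls: List[float] = []


def toy_predictor(x: Vector, t: float) -> Vector:
    """Stand-in for the distilled model: always predicts the same clean target."""
    calls.append(t)
    return [0.5, -0.25]


sample = few_step_sample([1.0, 1.0], toy_predictor)
print(sample, len(calls))
```

The point of the sketch is that cost scales with the number of model calls. If the full 38 seconds were spent in the eight denoising steps, that would be roughly 4.75 seconds per step; the actual split between sampling and decoding is not stated.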

Multi-Shot Consistency

Your characters stay exactly as you designed them. Consistent identity across every cut and scene.

Maintain character identity and scene continuity across an entire sequence. No jarring cuts, no flickering faces, just a coherent story from the first frame to the last.

15B Sandwich Transformer

40 layers of architectural brilliance. It understands the difference between a camera pan and a character turn.

The 40-layer unified sandwich Transformer architecture handles video and sound as one seamless flow. Modality-specific and shared layers work together to deliver exceptional motion realism.
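
One way to picture a "sandwich" of modality-specific and shared layers is a stack whose outer layers are specialized per modality and whose middle layers are shared. The sketch below only builds that layer plan. The 40-layer total comes from the description above, but the `edge_layers` count of 4 and the exact placement of the specialized layers are hypothetical values chosen for illustration.

```python
from typing import List


def sandwich_layout(total_layers: int = 40, edge_layers: int = 4) -> List[str]:
    """Label each layer in a stack with specialized outer layers and a shared core.

    Assumption: `edge_layers` modality-specific layers sit at each end of the stack.
    """
    if edge_layers < 0 or total_layers < 2 * edge_layers:
        raise ValueError("edge layers must fit inside the total layer count")
    shared_layers = total_layers - 2 * edge_layers
    return (
        ["modality_specific"] * edge_layers
        + ["shared"] * shared_layers
        + ["modality_specific"] * edge_layers
    )


layout = sandwich_layout()
print(layout.count("shared"), layout.count("modality_specific"))
```

In a layout like this, the outer layers can adapt to the different statistics of video and audio tokens, while the shared core processes both as one sequence.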

Open Source Commitment

The code, the weights, the future. Everything you need to build on top of it.

The team has committed to releasing the full open-source package by mid-2026, including the base model, distilled versions, and inference code.

Image to Video

Animate anything. Give life to products, concepts, and memories with a single click.

Text to Video

Describe it and watch it come to life. From a rough idea to a polished video in minutes.

World Model Physics

Explosions feel heavy. Liquids flow naturally. Motion respects the physical world.

Built for complex, multi-layered scenes. Generate realistic explosions, particle debris, and chaotic weather with frame-perfect consistency.