I built this Wan 2.6 prompt template library after a month of back-to-back tests on i2v (image-to-video) runs. If you've ever nailed the look but lost the words, you know the pain: the image is perfect, but the text is wrong. That's the problem I'm here to solve. Below you'll find reusable, production-ready templates that keep motion clean and on-model while preserving readable titles, labels, packaging, and UI. I also call out where these help produce realistic AI images for marketing and, yes, AI images with accurate text (the hardest part).
How to Use the Wan 2.6 Prompt Template Library

I treat Wan 2.6 like a camera with a small, picky crew: it needs structure, continuity, and a clear job. When I followed the structure below, my text accuracy and motion stability jumped.
Here's how I run it every time:
- Start from a clean reference image (i2v): consistent lighting, minimal lens distortion, and text at comfortable legibility (don't cram tiny fonts).
- Lock the seed for A/B tests. Change only one variable per iteration (camera, motion, or style).
- Add a small but strong negative prompt. It does more than a long, messy one.
- Keep motion short and purposeful (2–4 seconds) before trying longer.
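To make the "one variable per iteration" habit concrete, here is a minimal Python sketch. The `RunConfig` class and `one_variable_variants` helper are hypothetical names made up for illustration; nothing here calls Wan 2.6 itself. It only builds the list of configurations you would then render by hand or through whatever tool you use.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class RunConfig:
    """Hypothetical container for one test run; not a Wan 2.6 API object."""
    seed: int
    camera: str
    motion: str
    style: str
    duration_s: float = 3.0


def one_variable_variants(baseline, field_name, candidates):
    """Return copies of the baseline that differ only in one field."""
    if field_name == "seed":
        # The seed stays locked so results remain comparable.
        raise ValueError("keep the seed locked for A/B comparisons")
    return [replace(baseline, **{field_name: value}) for value in candidates]


baseline = RunConfig(
    seed=1234,
    camera="slow push-in",
    motion="gentle head nod",
    style="commercial clean",
)

# Vary the camera only; seed, motion, and style stay identical.
variants = one_variable_variants(baseline, "camera", ["static", "slow pan"])
for variant in variants:
    print(variant)
```

Because every variant shares the baseline's seed, motion, and style, any difference in the output can be attributed to the camera change alone.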
Prompt Structure Explained
I use a labeled block so I can swap parts quickly. This is the backbone I apply across all templates:
Template scaffold
- Subject: {what/who}, framing {close-up/medium/wide}, environment {indoor/outdoor, time of day}.
- Style: {cinematic/clean/commercial/brand keywords}, color palette {warm/cool}, lighting {soft/hard, key/fill/rim}.
- Camera: {lens in mm}, movement {push-in/pan/dolly/orbit}, speed {slow/steady}, composition {rule of thirds/center}.
- Motion Cues: {hair sway, hand gesture, product rotation, liquid pour}. One motion, one verb.
- Text Fidelity: "preserve legible text on {object/sign/packaging}: {EXACT TEXT}." Use quotes.
- Technical Hints: "sharp details, no flicker, stable facial features, clean edges."
- Negative: see library below (anti-flicker, anti-blur…).
Parameters I keep consistent in testing:
- Duration: 3–4s for stability
- FPS: 24
- Guidance/CFG: moderate (around 6–8 tends to behave; I lower it if the output starts overfitting to the prompt)
- Seed: locked for comparisons
Why this matters: Wan 2.6 respects short, clear verbs for motion and gets confused by stacked actions. If I want "AI images with accurate text," I explicitly tell it what to keep legible and where.
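The scaffold can be assembled with a few lines of Python, which makes swapping one block at a time easier. This is a rough sketch under assumptions: `FIELD_ORDER`, `text_fidelity`, and `build_prompt` are hypothetical helpers, the brand text is a placeholder, and how the `params` values actually get passed to Wan 2.6 depends on the tool you run it in, so that part is not shown.

```python
# Block names follow the template scaffold, in the same order.
FIELD_ORDER = (
    "Subject",
    "Style",
    "Camera",
    "Motion Cues",
    "Text Fidelity",
    "Technical Hints",
    "Negative",
)


def text_fidelity(target, exact_text):
    """Build the text-preservation line with the exact text in quotes."""
    return f'preserve legible text on {target}: "{exact_text}"'


def build_prompt(blocks):
    """Join the labeled blocks in a fixed order, failing if one is missing."""
    missing = [name for name in FIELD_ORDER if name not in blocks]
    if missing:
        raise ValueError(f"missing blocks: {', '.join(missing)}")
    return "\n".join(f"{name}: {blocks[name]}" for name in FIELD_ORDER)


blocks = {
    "Subject": "cold brew bottle, close-up, indoor, morning",
    "Style": "clean commercial, cool palette, soft key light",
    "Camera": "85mm look, slow push-in, centered",
    "Motion Cues": "slow product rotation",
    "Text Fidelity": text_fidelity("packaging", "ACME COLD BREW"),  # placeholder text
    "Technical Hints": "sharp details, no flicker, clean edges",
    "Negative": "no flicker, no jitter, no motion blur",
}

# Settings held constant while testing (values taken from the list above).
params = {"duration_s": 3, "fps": 24, "cfg": 7, "seed": 1234}

prompt = build_prompt(blocks)
print(prompt)
```

Keeping the blocks in a dictionary means a test iteration is a one-key change, and the fixed order keeps prompts comparable between runs.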
If you want to turn these Wan 2.6 templates into repeatable, client-ready results, try running them inside z-image.ai. It’s built for testing i2v variations, comparing seeds, and checking text fidelity before anything goes live.
i2v Portrait Prompts (5 Templates)
These portrait templates aim for expressive but stable motion. I wrote them to work with clean studio or natural light and to avoid the dreaded face-drift. Great for social intros, talking heads, founder clips, and realistic AI images for marketing where captions or name plates must stay readable.
Subtle Movement

- Subject: {person} in soft window light, medium close-up, neutral background.
- Style: natural skin tone, commercial clean retouch, shallow depth of field (85mm look).
- Camera: slow push-in, steady, centered framing.
- Motion Cues: gentle head nod + eye blink only.
- Text Fidelity: preserve legible lower-third name tag: "{NAME – TITLE}".
- Technical: sharp eyes, smooth skin texture, no face warp.
- Negative: anti-flicker, anti-face-drift, anti-blur.
Talking Head
- Subject: {speaker} in studio, key light + soft fill, medium shot.
- Style: corporate yet friendly, true-to-life color.
- Camera: static camera.
- Motion Cues: natural mouth movement (2–3 words), light hand raise.
- Text Fidelity: keep crisp subtitle card: "{SHORT LINE, 3–5 WORDS}".
- Technical: clear lip sync impression without over-exaggeration.
- Negative: anti-flicker, anti-artifact.
Emotional Expression
- Subject: {person} warm backlight, close-up.
- Style: cinematic, soft contrast, slight film grain.
- Camera: slow push-in.
- Motion Cues: brief inhale + micro-smile (or subtle brow furrow).
- Text Fidelity: preserve quote card at frame edge: "{QUOTE}".
- Technical: steady skin detail, no jitter.
- Negative: anti-face-drift, anti-blur.
Fashion Showcase
- Subject: {model} full-body in seamless studio.
- Style: editorial, high key, crisp fabric detail.
- Camera: left-to-right pan.
- Motion Cues: one-step turn, fabric sway.
- Text Fidelity: preserve brand wordmark on T-shirt: "{BRAND}".
- Technical: sharp edges, accurate garment seams.
- Negative: anti-artifact, anti-flicker.
Professional Headshot
- Subject: {professional} medium close-up, neutral gray backdrop.
- Style: corporate headshot, even light, realistic skin.
- Camera: static, minimal breathing.
- Motion Cues: tiny head tilt + blink.
- Text Fidelity: preserve badge text: "{COMPANY}".
- Technical: natural pores, no plastic look.
- Negative: anti-blur, general quality boost.
Testing note: I compared 35mm vs 85mm looks. 85mm framing gave me fewer edge distortions and a better read on badges and lower-thirds. If text accuracy in portraits is the goal, lens-equivalent cues and a static camera help more than cranking up guidance.
i2v Product Prompts (5 Templates)

Product videos are where text accuracy gets judged the hardest. Labels, packaging, and UI must hold. I kept motion simple and foreground lighting consistent.
Product Rotation
- Subject: {product} on matte turntable, seamless background.
- Style: clean commercial, high micro-contrast.
- Camera: static; product rotates 45–90°.
- Motion Cues: slow, even rotation only.
- Text Fidelity: preserve packaging text and logo: "{BRAND} {MODEL}".
- Technical: shadow grounding, no reflections over text.
- Negative: anti-flicker, anti-artifact.
Product in Use
- Subject: {hand model} interacting with {product}.
- Style: lifestyle, natural window light.
- Camera: gentle push-in.
- Motion Cues: one tap, one slide, or one pour (choose one).
- Text Fidelity: preserve UI/label text on product: "{KEY TEXT}".
- Technical: keep fingertips sharp, avoid motion smear.
- Negative: anti-blur, anti-face-drift (if a face is in frame).
Luxury Reveal
- Subject: premium {product} with soft top light, dark backdrop.
- Style: high-end, specular highlights controlled.
- Camera: slow dolly from shadow to light.
- Motion Cues: reveal cloth lift or case opening.
- Text Fidelity: preserve embossed logo: "{BRAND}".
- Technical: clean edges, no glittering noise.
- Negative: anti-artifact, anti-flicker.
Tech Demo
- Subject: device on desk with simple UI.
- Style: tech clean, cool palette.
- Camera: static.
- Motion Cues: finger triggers one UI state change.
- Text Fidelity: preserve on-screen UI labels: "{MENU}", "{CTA}".
- Technical: screen moiré minimized, crisp icons.
- Negative: anti-blur, general quality boost.
Food / Beverage Appeal
- Subject: drink bottle/can, chilled, with condensation beads.
- Style: appetizing, saturated but believable.
- Camera: slow orbit 20–30°.
- Motion Cues: quick spritz or cap twist.
- Text Fidelity: preserve label nutrition line and brand name: "{BRAND} {FLAVOR}".
- Technical: keep droplets, avoid smear over text.
- Negative: anti-flicker, anti-artifact.
Tip from my runs: If label text starts drifting mid-rotation, shorten duration to 2.5–3s and reduce the orbit angle. That alone rescued a lot of clips I'd normally throw away.
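That fallback can be written down as a small rule. The sketch below is only an illustration: `tighten_for_label_drift` is a hypothetical helper, and halving the orbit angle is my assumption, since the tip says to reduce the angle but not by how much.

```python
def tighten_for_label_drift(duration_s, orbit_deg, orbit_factor=0.5):
    """Return a shorter duration and a smaller orbit for a retry.

    Durations above 3s drop to 3s; durations already at or below 3s
    drop to 2.5s (or stay put if they are shorter than that).
    The orbit_factor of 0.5 is an assumed default, not a tested value.
    """
    if duration_s > 3.0:
        shorter = 3.0
    else:
        shorter = min(duration_s, 2.5)
    return shorter, orbit_deg * orbit_factor


# First retry after a drifting 4s, 30° orbit clip.
print(tighten_for_label_drift(4.0, 30.0))
# Second retry if the label still drifts.
print(tighten_for_label_drift(3.0, 15.0))
```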
Wan 2.6 Multi-shot Story Prompts (5 Templates)

Multi-shot asks more of the model: continuity, matching light, and text that stays consistent across shots (like a repeating logo). I keep the same seed for each shot block when possible and repeat key phrases.
3-Shot Ad Structure
- Shot 1 (Hook): close-up detail of {product}, static, sharp logo: "{BRAND}".
- Shot 2 (Use): mid shot, hand interaction, single action (tap/pour/zip).
- Shot 3 (CTA): hero angle with lower-third card: "{CTA TEXT}".
- Style: cohesive color and light across all shots.
- Negative: anti-flicker, anti-blur.
- Note: reuse "preserve {BRAND}" in each shot text.
Tutorial Sequence
- Step 1: top-down desk, tools aligned, overlay label: "Step 1".
- Step 2: hands perform one simple action, overlay: "Step 2".
- Step 3: finished result, overlay: "Step 3".
- Style: minimalist, readable typography cues.
- Camera: static or tiny push-in per step.
- Negative: anti-artifact.
Mini Drama
- Scene 1: person discovers {problem}; label on note reads "{KEY WORD}".
- Scene 2: tries {solution} with product; logo visible.
- Scene 3: reaction close-up, text bubble: "{SHORT LINE}".
- Style: cinematic but not noisy.
- Camera: straightforward blocking, no wild moves.
- Negative: anti-face-drift, anti-flicker.
Day-in-Life
- Morning: coffee cup with brand text: "{BRAND}".
- Work: laptop close-up with clean UI labels: "{APP}".
- Evening: gym bottle label legible: "{BRAND}".
- Style: natural light shifts, consistent LUT.
- Camera: pans between frames.
- Negative: general quality boost.
Transformation Story
- Before: product on cluttered background, small logo visible.
- During: wipe/replace action; motion limited.
- After: clean hero setup, bold label: "{BRAND} {MODEL}".
- Style: commercial reveal, crisp typography.
- Camera: dolly out ending.
- Negative: anti-blur.
I tested variations with and without on-screen text. Repeating the exact logo phrase in each shot ("preserve logo: 'BRAND'") reduced drift and gave me more consistent clips that are actually usable for campaign edits, which is exactly what designers and marketers need from AI tools.
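Repeating the phrase by hand is easy to forget on shot three, so here is a small sketch that appends it to every shot description. `with_brand_lock` is a hypothetical helper and "ACME" is a placeholder brand; the shot strings are plain text you would paste into your own multi-shot prompt.

```python
def with_brand_lock(shots, brand):
    """Append the same logo-preservation phrase to every shot that lacks it."""
    lock = f'preserve logo: "{brand}"'
    locked = []
    for shot in shots:
        if lock in shot:
            # Already carries the exact phrase; leave it untouched.
            locked.append(shot)
        else:
            locked.append(f"{shot.rstrip('. ')}, {lock}.")
    return locked


shots = [
    "Shot 1 (Hook): close-up detail of bottle, static.",
    "Shot 2 (Use): mid shot, hand pours once.",
    'Shot 3 (CTA): hero angle, lower-third card: "Try it today".',
]

locked = with_brand_lock(shots, "ACME")
for line in locked:
    print(line)
```

The helper is idempotent: running it again on its own output leaves the shots unchanged, so the phrase never gets duplicated within a shot.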
Wan 2.6 Style-Specific Prompts (5 Templates)
Style is where people tend to overload prompts. I've had better luck steering with small, strong cues and one lighting statement.
Cinematic / Film Look

- Subject: {subject}, moody backlight, soft haze.
- Style: filmic contrast, subtle grain, rich skin tones.
- Camera: slow push-in, 50mm look.
- Motion Cues: one micro-gesture (blink, glance).
- Text Fidelity: lower-third card: "{TITLE}".
- Negative: anti-flicker, anti-blur.
Anime / Illustration
- Subject: {character}, cel-shaded, clean linework.
- Style: flat colors, controlled highlights, studio anime look.
- Camera: pan left to right.
- Motion Cues: hair sway only.
- Text Fidelity: signage or caption: "{TEXT}".
- Negative: anti-artifact.
Vintage / Retro
- Subject: {scene} with period props.
- Style: 70s photo vibe, warm cast, halation.
- Camera: static.
- Motion Cues: light subject motion (turn, smile).
- Text Fidelity: retro label remains sharp: "{BRAND}".
- Negative: anti-blur.
Minimalist / Clean
- Subject: {object/person} on white, soft shadow.
- Style: high key, clinical sharpness.
- Camera: slow dolly.
- Motion Cues: single gesture or rotation.
- Text Fidelity: crisp product text: "{MODEL}".
- Negative: anti-flicker, general quality boost.
High Energy / Dynamic
- Subject: {athlete/product} with rim light, saturated color.
- Style: punchy contrast, fast-cut feel (but a single shot).
- Camera: quick 15–20° orbit.
- Motion Cues: snap movement (jump, toss).
- Text Fidelity: short lockup card: "{TAGLINE}".
- Negative: anti-artifact.
These give me realistic AI images for marketing without devolving into noisy, over-stylized motion. Less is more here.
Camera Movement Prompts (5 Templates)
Camera is a big lever. In my tests, over-ambitious moves caused text drift. Keep it short and specific.
Push In / Zoom
- Move: slow push-in 5–8% scale over 3s.
- Best for: talking heads, hero products.
- Tip: combine with static subject to keep overlay text stable.
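To see how gentle that push-in really is, it helps to work out the per-frame scale. The sketch below assumes a linear ramp and 24 fps (the frame rate I keep in testing); `push_in_scales` is a hypothetical helper for the arithmetic only and does not drive the model.

```python
def push_in_scales(total_pct, duration_s, fps=24):
    """Return the scale factor for each frame of a linear push-in."""
    frames = round(duration_s * fps)
    if frames < 2:
        raise ValueError("need at least two frames")
    # Spread the total scale change evenly across the frame intervals.
    step = (total_pct / 100) / (frames - 1)
    return [1.0 + step * i for i in range(frames)]


scales = push_in_scales(total_pct=6, duration_s=3)
print(len(scales))             # number of frames in the clip
print(scales[1] - scales[0])   # scale change per frame
print(scales[-1])              # final scale, close to 1.06
```

A 6% push over 3 seconds works out to 72 frames with well under a tenth of a percent of scale change per frame, which is why overlay text has a good chance of staying stable.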
Pan Left / Right
- Move: lateral pan across static subject.
- Best for: fashion full-body, desk setups.
- Tip: keep pan under 20° equivalent to protect edge text.
Dolly / Track
- Move: forward/backward track with parallax.
- Best for: luxury reveals, cinematic scenes.
- Tip: lock label phrase ("preserve {BRAND}") to fight perspective smear.
Static with Subject Motion
- Move: none. Subject performs one clear action.
- Best for: UI taps, precise label reads.
- Tip: this wins when text clarity is the top priority.
Orbit / 360
- Move: gentle 15–30° orbit.
- Best for: bottles, shoes, gadgets.
- Tip: shorten duration if the label drifts; add anti-flicker + anti-artifact.
Negative Prompt Library (5 Templates)

I keep these short. They stack well with the templates above and are tuned to Wan 2.6 i2v behavior in my tests.
Anti-Flicker
- Negative: "no flicker, no jitter, stable exposure, stable colors, consistent lighting."
Anti-Face-Drift
- Negative: "no face warp, no mouth misalignment, stable facial features, no eye mis-track."
Anti-Blur
- Negative: "no motion blur, no smear, sharp edges, crisp micro-contrast."
Anti-Artifact
- Negative: "no ringing, no glitter noise, no compression blocks, clean gradients."
General Quality Boost
- Negative: "no banding, no aliasing, no ghosting, clean edges, preserve text legibility."
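Since the templates refer to these by name ("anti-flicker, anti-blur"), a lookup table saves retyping. This is a minimal sketch: `NEGATIVES` and `combine_negatives` are hypothetical names, and the terms are stored as separate entries so they can be de-duplicated and joined with commas.

```python
# Terms copied from the five library entries, one tuple per entry.
NEGATIVES = {
    "anti-flicker": ("no flicker", "no jitter", "stable exposure",
                     "stable colors", "consistent lighting"),
    "anti-face-drift": ("no face warp", "no mouth misalignment",
                        "stable facial features", "no eye mis-track"),
    "anti-blur": ("no motion blur", "no smear", "sharp edges",
                  "crisp micro-contrast"),
    "anti-artifact": ("no ringing", "no glitter noise",
                      "no compression blocks", "clean gradients"),
    "general quality boost": ("no banding", "no aliasing", "no ghosting",
                              "clean edges", "preserve text legibility"),
}


def combine_negatives(*names):
    """Merge the named entries into one string, keeping order, dropping repeats."""
    seen = []
    for name in names:
        for term in NEGATIVES[name]:
            if term not in seen:
                seen.append(term)
    return ", ".join(seen)


# Naming an entry twice does not repeat its terms.
combined = combine_negatives("anti-flicker", "anti-blur", "anti-flicker")
print(combined)
```

An unknown name raises a `KeyError`, which is a cheap way to catch a typo like "anti-flickr" before a render is wasted on it.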
If you try these today, start with a simple push-in, lock your seed, and keep a single verb for motion. The image will be right, and this time, the text will be right too.


