Hi, I'm Dora. I built this "product photo image to image" workflow because I kept seeing the same problem: the image was right, but the text was wrong. If you sell, pitch, or design for real clients, you can't have melted labels, fuzzy fonts, or fake-looking shadows. So let's get practical right away. I'll show you a tested, step-by-step process to turn a decent product shot into a polished, realistic AI image for marketing, including how I keep text accurate and materials intact. If you've been hunting for the best AI image generator for text, I'll also explain where each tool fits and where it doesn't. If you just want to test a product photo image-to-image tool quickly before diving into the full workflow, tools like Z-Image can get you a clean first pass without heavy setup.

Product Photo Img2Img Workflow 2.png

What You’ll Achieve

  • Clean, studio-quality product shots from your base image, quick and repeatable.
  • Accurate, readable text on labels, packaging, and tags (no melted letters).
  • Realistic shadows and reflections that match your light direction.
  • Preserved materials and micro-textures: glass stays glass, paper stays paper, brushed metal still looks like metal.
  • A modular setup you can apply across tools (Stable Diffusion, Adobe Firefly, Midjourney) and adapt for different campaigns.
Product Photo Img2Img Workflow 3.png

I tested this on everyday items (bottles, boxes, jars, cosmetics, apparel tags). It works best when you start with a sharp, well-lit photo. When clients ask for realistic AI images for marketing, this is the workflow I lean on.

Best Input Product Photo Checklist

Before any product photo image-to-image run, I audit the source photo. It routinely saves me 30–50% of cleanup time.

  • Lighting: Soft, single key light at ~45° angle. Avoid mixed color temps. No heavy color casts.
  • Background: Neutral gray or white. Busy backgrounds confuse the model.
  • Framing: Leave margin around the product for clean crops and shadow generation.
  • Sharpness: Shutter 1/125 s or faster, f/8–f/11 if you're shooting it yourself. No motion blur.
  • Exposure: Slightly underexposed is safer than blown highlights; specular highlights on glass are hard to recover.
  • Label/Text: If the product has text, ensure the original is readable. Garbage in → garbage out.
  • File: 2048 px on the long edge minimum; 16-bit PNG is ideal.

I also record the seed when the tool allows it. Seed control lets me iterate on small changes without losing the overall look, which is essential for campaign consistency.
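
If you run Stable Diffusion locally through the diffusers library, locking the seed is one line. A minimal sketch, assuming a CUDA-or-CPU setup; the seed value here is just a placeholder for whatever you record:

```python
import torch

# Hypothetical seed; log whatever value you actually use for the campaign.
SEED = 123456

device = "cuda" if torch.cuda.is_available() else "cpu"
generator = torch.Generator(device).manual_seed(SEED)

# Pass generator=generator into every pipeline call; small prompt tweaks
# then change details without shifting the overall composition.
```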

Step-by-Step Workflow for Product Photo Image to Image

Product Photo Img2Img Workflow 4.png

Here's the compact version of my tested workflow. I'll note settings where relevant.

1. Prep and crop

  • Duplicate your original. Non-destructive edits only.
  • Crop to target aspect (1:1 or 4:5 for storefronts). Keep 10–15% breathing room around the product.

2. First img2img pass (structure lock)

  • Strength/Denoise: 0.25–0.35 (Stable Diffusion) to preserve shape and labels (a minimal code sketch of this pass follows the list).
  • Guidance/CFG: 5–7. Lower values stay closer to your original.
  • Sampler: DPM++ 2M Karras or Euler a; both are stable for product surfaces.
  • Prompt: "clean studio product photography, accurate label text, crisp edges, true-to-life materials, commercial lighting, minimal post"
  • Negative prompt: "warped text, melted labels, extra reflections, fingerprints, excessive contrast, extreme stylization"
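
If you run this step with Stable Diffusion through the Hugging Face diffusers library, the pass looks roughly like the sketch below. The checkpoint ID, file names, and image size are placeholder assumptions; only the strength and guidance values come from the list above.

```python
import torch
from PIL import Image
from diffusers import DPMSolverMultistepScheduler, StableDiffusionImg2ImgPipeline

device = "cuda" if torch.cuda.is_available() else "cpu"
dtype = torch.float16 if device == "cuda" else torch.float32

# Any SD 1.5-class checkpoint works; this model ID is only an example.
pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=dtype
).to(device)

# DPM++ 2M Karras, per the sampler note above.
pipe.scheduler = DPMSolverMultistepScheduler.from_config(
    pipe.scheduler.config, use_karras_sigmas=True
)

# Assumes the 1:1 crop from the prep step.
source = Image.open("product.jpg").convert("RGB").resize((768, 768))
generator = torch.Generator(device).manual_seed(123456)  # the recorded seed

result = pipe(
    prompt=(
        "clean studio product photography, accurate label text, crisp edges, "
        "true-to-life materials, commercial lighting, minimal post"
    ),
    negative_prompt=(
        "warped text, melted labels, extra reflections, fingerprints, "
        "excessive contrast, extreme stylization"
    ),
    image=source,
    strength=0.3,        # 0.25-0.35 preserves shape and labels
    guidance_scale=6.0,  # 5-7; lower stays closer to the original
    generator=generator,
).images[0]

result.save("product_pass1.png")
```

The same call, run on the output with a lower strength and the white-background or lighting prompts from later sections, covers the refinement passes too.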

3. Lighting and background refinements

  • If shadows look off, run a light-only pass (see Shadow Direction below) with denoise 0.15–0.25.
  • For white seamless, generate a soft ground shadow and keep falloff subtle.

4. Label/text correction

  • If the model drifts, inpaint just the label area at denoise 0.2–0.35 (a minimal inpaint sketch follows this list).
  • For critical text, I often comp the real label in as a final step (vector/PDF); clients care about exact typefaces. This is the most reliable way to deliver AI images with accurate text.
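
For the inpaint route in diffusers, a minimal sketch is below. The inpainting checkpoint and file names are assumptions; the mask is one you make yourself, white over the label and black everywhere else:

```python
import torch
from PIL import Image
from diffusers import StableDiffusionInpaintPipeline

device = "cuda" if torch.cuda.is_available() else "cpu"
dtype = torch.float16 if device == "cuda" else torch.float32

# Example inpainting checkpoint; use one that matches your base model.
pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-inpainting", torch_dtype=dtype
).to(device)

image = Image.open("product_pass1.png").convert("RGB")
mask = Image.open("label_mask.png").convert("L")  # white = repaint, black = keep

fixed = pipe(
    prompt="sharp printed label, accurate typography, matte paper texture",
    negative_prompt="warped text, melted letters, blur",
    image=image,
    mask_image=mask,
    strength=0.3,        # 0.2-0.35: clean up the label without redrawing it
    guidance_scale=6.0,
).images[0]

fixed.save("product_label_fixed.png")
```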

5. Final polish

  • Micro-contrast with clarity or local dodge/burn.
  • Check color consistency across variants; lock the seed when possible.

Tool notes

  • Stable Diffusion (SDXL/SD 1.5): Best control via img2img + inpaint + ControlNet Reference/Tile. Strong for labels.
  • Adobe Firefly: Great for background cleanup and realistic shadows. Text is improving, but I still verify.
  • Midjourney: Beautiful renders via image prompts, but less direct control for precise label text.

In practice, Z-Image sits in a different lane than SDXL or Midjourney. It’s fast, predictable, and good for everyday product shots where readable text and clean lighting matter more than full prompt control. For a quick practical overview of how Z-Image delivers fast, HD product-oriented results with readable text and minimal setup, see “Z-Image AI: Free & Fast AI Image Generator”, which breaks down its workflow, text-handling strengths, and simple prompt templates.

White Background Prompt for Clean Product Shots

Use this after your structure-lock pass:

Prompt: "e-commerce studio shot on seamless white background, soft gradient floor, natural ground shadow, high-CRI lighting, no halo, no cutout edges"

Settings: Denoise 0.2–0.3, Guidance 5–6. Keep the negative prompt: "floating product, hard rim, blown highlights, fake reflection".

Shadow Direction Control to Maintain Realism

I match the original light when possible. If I need to change it:

Prompt: "key light from left at 45Β°, softbox 70cm, gentle falloff to the right, single shadow, no multiple light sources"

Tip: Add "shadow length proportional, contact shadow anchored under base." If reflections are needed (glass), include "subtle planar reflection on glossy surface".

Preserve Materials and Texture During Editing

  • Lower denoise. Texture dies above ~0.4.
  • Add material terms: "matte paper label with slight tooth", "brushed aluminum", "translucent PET", "glossy glass with soft specular".
  • Use ControlNet Reference to keep label layout. For boxes, Reference + Tile helps preserve edges (see the Tile sketch after this list).
  • If you see plastic where it should be glass, add: "index-of-refraction accurate highlights, not plastic". Small detail, big difference.
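
Reference-only ControlNet isn't a standard pretrained model in diffusers, so here's a minimal sketch of just the Tile half; the model IDs, file name, and conditioning scale are assumptions you'd tune for your own setup:

```python
import torch
from PIL import Image
from diffusers import ControlNetModel, StableDiffusionControlNetImg2ImgPipeline

device = "cuda" if torch.cuda.is_available() else "cpu"
dtype = torch.float16 if device == "cuda" else torch.float32

# Tile ControlNet keeps edges and label layout locked to the source.
controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/control_v11f1e_sd15_tile", torch_dtype=dtype
)
pipe = StableDiffusionControlNetImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=dtype
).to(device)

source = Image.open("product.jpg").convert("RGB").resize((768, 768))

result = pipe(
    prompt="premium packaging box, accurate typography, realistic contact shadow",
    image=source,          # img2img input
    control_image=source,  # tile control uses the same image as its guide
    strength=0.3,
    guidance_scale=6.0,
    controlnet_conditioning_scale=0.8,  # ease off if it fights the prompt
).images[0]

result.save("box_tile_pass.png")
```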

6 Product Templates (copy prompts)

Copy, paste, and swap brackets. I keep denoise at 0.25–0.35 unless noted.

1. Bottle on white

"clean studio bottle product photo, [material: frosted glass], [label: matte paper], accurate text, seamless white background, soft ground shadow, key light 45Β° left, f/8, 85mm look"

Product Photo Img2Img Workflow 5.png

2. Box with emboss

"premium packaging box, [color], subtle emboss logo, accurate typography, soft gradient background, single softbox, realistic contact shadow, commercial catalog style"

3. Jar with metal lid

"cosmetic jar with [finish: brushed aluminum lid], clear glass, readable label text, controlled reflections, white sweep, gentle falloff, no hotspots"

4. Apparel tag

"close-up apparel hangtag, textured kraft paper, crisp printed text, macro studio lighting, shallow shadow, neutral gray background, true fibers visible"

5. Can with reflection

"aluminum can, correct cylindrical label warp, legible text, soft front light, subtle planar reflection on glossy surface, no condensation, product catalog"

Product Photo Img2Img Workflow 6.png

6. Lifestyle swap (lower control)

Denoise 0.4–0.5 for environment change.

"on-counter kitchen scene, natural window light from right, product remains unchanged, realistic shadow integration, no props blocking label, balanced color"

These templates work well for AI tools for designers who need repeatable results under time pressure.
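
If you batch these templates across SKUs, it's worth filling the brackets programmatically instead of by hand. A minimal sketch using the bottle template above; the slot names and example values are mine, not part of any tool:

```python
# Fill bracketed slots in a prompt template with per-SKU values.
template = (
    "clean studio bottle product photo, {material}, {label}, accurate text, "
    "seamless white background, soft ground shadow, key light 45° left, "
    "f/8, 85mm look"
)

skus = [
    {"material": "frosted glass", "label": "matte paper label"},
    {"material": "amber PET", "label": "glossy film label"},
]

prompts = [template.format(**sku) for sku in skus]
for prompt in prompts:
    print(prompt)
```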

Common Failures (melted labels, weird reflections)

What goes wrong when you push too hard:

  • Melted or wavy text: Denoise too high or no layout reference. Fix with inpaint + lower denoise, or composite the real label.
  • Double shadows: Multiple light cues in the prompt. Keep a single key light.
  • Plastic-y glass: Add material constraints and reduce contrast boosts.
  • Wrong brand colors: White balance drift. Lock WB and correct in post.
  • Weird reflections: Too many light sources in prompt or glossy backgrounds. Simplify and control angles.

I share failures because they're how we shave hours off the process.

FAQ for Commercial Use of Product Photo Image to Image

  • Can I use these images for ads? Yes, if your base assets and model license allow commercial use. Check the model's terms (e.g., SDXL local models are typically fine; some cloud tools restrict logos or sensitive subjects).
  • How do I guarantee accurate text? For regulated packaging or legal copy, I inpaint the label area lightly and, if needed, overlay the exact vector/PDF label. It's the most reliable way to get AI images with accurate text.
  • What about trademarks and logos? Use assets you own or have rights to. Avoid generating known logos from scratch.
  • Which tool is best? For text control, Stable Diffusion with ControlNet is my current pick for the best AI image generator for text; Firefly is great for cleanup; Midjourney shines for mood/lifestyle when text precision is less critical.
  • What resolution should I export? 2048–4096 px on the long edge for web ads; 300 DPI at print size for packaging mocks (e.g., a 6 × 4 in label panel at 300 DPI is 1800 × 1200 px).

Try It Now – Start Your Product Image-to-Image Workflow

Grab one solid product shot and run the structure-lock pass at denoise 0.3 with the clean studio prompt. Then fix the shadow, and only then touch the label. Small moves, fast checks. If you get stuck, send me the before/after and your settings, and I'll point to the exact step that slipped. That's the fun part for me.