Wan 2.7 Reference to Video is Now on Segmind: Character-Consistent Video from Any Photo
Generate character-consistent AI videos from reference images at up to 1080P. Wan 2.7-R2V is now live on the Segmind API.
Search interest in AI video generation has been climbing steadily since the start of 2026, but most tools still struggle with one fundamental problem: keeping a face consistent across frames. Every time you try to generate a branded spokesperson, a recurring character for a YouTube series, or a digital actor for film pre-visualization, you get a different person in every clip. Wan 2.7 Reference to Video solves that.
What is Wan 2.7 Reference to Video?
Wan 2.7-R2V is a video generation model built specifically for character-consistent output from reference images. You pass in one or more portrait photos, write a scene description, and get back a video where that person does exactly what you described, in your chosen environment, at up to 1080P resolution. It also supports multi-subject inputs (so you can have two distinct characters in the same scene) and voice cloning, where the character in the video speaks in a specific voice driven by a reference audio clip. I ran it across a set of industry scenarios, and its character fidelity was noticeably better than general-purpose text-to-video models on this task.
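For the multi-subject case, a request body might look like the sketch below. The parameter names mirror the single-subject example later in this post, and the assumption that "Image1"/"Image2" prompt tokens map to the order of `reference_images` is inferred from the prompt syntax shown there; treat both as illustrative rather than authoritative API documentation.

```python
# Sketch: building a multi-subject request payload for Wan 2.7-R2V.
# Parameter names follow the single-subject example in this post;
# the Image1/Image2 -> reference_images ordering is an assumption.

def build_r2v_payload(prompt, image_urls, resolution="720P", duration=5, seed=42):
    """Assemble the JSON body for a reference-to-video request."""
    return {
        "prompt": prompt,
        "reference_images": image_urls,  # one URL per subject
        "resolution": resolution,
        "duration": duration,
        "seed": seed,
    }

payload = build_r2v_payload(
    "Image1 and Image2 sit across a cafe table, talking and laughing",
    ["https://your-cdn.com/person-a.jpg", "https://your-cdn.com/person-b.jpg"],
)

# Then POST it exactly as in the single-subject example:
# requests.post("https://api.segmind.com/v1/wan2.7-r2v",
#               headers={"x-api-key": "YOUR_API_KEY"}, json=payload)
```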
What you can build with it
- Marketing agencies can generate product spokesperson clips, lifestyle brand walkthroughs, and ad-style videos from a single photo of a model or brand ambassador, at a fraction of the cost of a shoot.
- Film studios and VFX teams can run rapid character pre-visualization, placing a digital actor into a scene and iterating on direction and lighting before committing to production time.
- Production houses and MCNs can create custom YouTube channel intros, talking head segments, and recurring on-screen characters at scale without repeat shoots.
See it in action
Here is a sample I generated using a single reference image. The model keeps the character's features stable across all five seconds of output.
Wan 2.7-R2V output — character-consistent garden walk from a single reference image
Get started on Segmind
Wan 2.7-R2V is live on the Segmind API right now. 720P clips come in at $0.625 per request, 1080P at $0.9375. No infrastructure to set up, no queue to manage. You call the endpoint, pass in your reference image URL and a prompt, and get back a ready-to-use MP4 in seconds.
import requests

response = requests.post(
    "https://api.segmind.com/v1/wan2.7-r2v",
    headers={"x-api-key": "YOUR_API_KEY"},
    json={
        "prompt": "Image1 walks into a bright product launch event, gestures at camera, confident smile",
        "reference_images": ["https://your-cdn.com/your-photo.jpg"],  # one URL per subject
        "resolution": "720P",
        "duration": 5,
        "seed": 42
    }
)

response.raise_for_status()  # fail fast on auth or validation errors

with open("output.mp4", "wb") as f:
    f.write(response.content)
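At the listed per-request prices, budgeting a batch is simple multiplication. Here is a quick helper with the prices hard-coded from this post; check Segmind's live pricing page before relying on them:

```python
# Per-request prices as listed in this post; verify against Segmind's
# current pricing before budgeting real workloads.
PRICE_PER_CLIP = {"720P": 0.625, "1080P": 0.9375}

def batch_cost(num_clips, resolution="720P"):
    """Estimated cost in USD for a batch of clips at one resolution."""
    return num_clips * PRICE_PER_CLIP[resolution]

print(batch_cost(100, "720P"))   # 100 clips at 720P  -> 62.5
print(batch_cost(100, "1080P"))  # 100 clips at 1080P -> 93.75
```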
Check the full docs and try it live at segmind.com/models/wan2.7-r2v.