When Motion Control Becomes a Lie Machine: Generative Filmmaking on the Edge of Truth and Falsehood

With Motion Control, Kling shows how quickly “just a cool meme” can turn into a convincing forgery. And suddenly film is no longer just illusion—it becomes a real identity problem.

By Thomas Fenkart · 6 min read

For a long time in film production, we've gotten used to the idea that "realistic" often simply means: expensive. Expensive sets, expensive lighting, expensive faces. And if you wanted to cheat, you needed time: VFX people, editing, sound, approvals, then another edit. That whole "we'll make it work somehow," just with a calendar and a budget. Now I'm sitting in front of an interface, uploading an image plus a short video as a motion reference, and a video model calculates how my subject moves. Not roughly, but with unsettling precision. Kling calls it Motion Control. And yes, I catch myself thinking: this isn't just "better." It feels different. Like one of those moments where generative filmmaking shifts. The issue isn't that we can suddenly produce more content. The issue is: it's going to get harder and harder to prove what actually happened.

Motion Control sounds like a tool, but it's really a trick with leverage

The concept is simple at first: you give Kling a reference video that defines the movement, and an image that represents "your" subject. Kling then takes the motion from the video and transfers it onto the image subject. The Motion Control guide lays this out clearly as a workflow: reference video as the motion source, image as the target subject, plus prompting and parameters to narrow down the result and style. That combination of motion constraint and identity constraint is what makes it so powerful. And, for exactly that reason, so easy to abuse. Because if motion and identity can be "plugged in" separately, then "making someone do something" suddenly becomes nothing more than assembling inputs.

And if we're honest: we wanted this. For previs, animated storyboards, fast iterations. For that classic "can you do the shot once more with more energy?" Now you can. In minutes. A few days ago, out of curiosity, I built a clip exactly like that: nothing crazy, just a still image, a reference video, some parameter tweaking. And for a moment I was pleased. Then the second thought hit: hold on. This is too good for what it is. Of course it won't stay in previs.

Over the past few weeks I've seen enough clips being passed around on social media: characters get swapped, and suddenly a well-known person is in a scene they were never in. It no longer feels like 2019-era deepfakes (shaky, uncanny, kind of funny). It feels damn real. Here are two examples:

[embed:https://x.com/AIWarper/status/2011491633649631436]

[embed:https://x.com/pabloprompt/status/2003820745265463633]

And yes, also damn dangerous. Although "dangerous" already sounds grand, and that's exactly what's annoying about it: you don't even need some massive conspiracy. Even meme-level content is enough to scratch away at trust. If every video is potentially "just" a clever motion transfer, then every real recording becomes automatically suspect. And at the same time, every fake becomes easier to sell, because it blends into the normal noise.

The political doomsday scenario here isn't science fiction; it's timing. Not because someone might someday make fake videos. That was always obvious. What's unsettling is how perfectly it fits our current climate: everywhere it's already "Fake!", "Manipulation!", "Propaganda!" If you then have a system that builds a scene from a photo plus reference motion, one that looks like it was "leaked," "shot on a phone," "randomly captured," then the threshold for escalation is ridiculously low.

And there's a second effect that worries me almost more: even when a fake gets exposed, something sticks. Not necessarily the lie, but the reflex: "You can't believe anything anymore." I'm not sure we've fully grasped how badly that reflex can break public debate. What makes me especially nervous: the typical social-media feed doesn't reward truth; it rewards the clip. And Motion Control is, at its core, a clip machine.
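To make the "assembling inputs" point concrete, here is what such a motion-transfer request boils down to from the caller's side. This is a deliberately minimal sketch: the function, field names, and parameters are invented for illustration and are not Kling's actual API. The point is only how little the caller has to supply: one identity image, one motion reference, one prompt.

```python
# Hypothetical sketch of a motion-transfer request payload.
# Field names and structure are invented for illustration;
# this is NOT Kling's real API.
import json


def build_motion_transfer_request(identity_image: str,
                                  motion_reference: str,
                                  prompt: str,
                                  duration_s: int = 10) -> dict:
    """Assemble the three inputs the article describes:
    identity (image), motion (reference video), style (prompt)."""
    if duration_s <= 0:
        raise ValueError("duration must be positive")
    return {
        "image": identity_image,               # who appears
        "motion_reference": motion_reference,  # how they move
        "prompt": prompt,                      # how it should look
        "duration_seconds": duration_s,
    }


request = build_motion_transfer_request(
    identity_image="subject.png",
    motion_reference="dance_take_03.mp4",
    prompt="handheld phone footage, evening light",
)
print(json.dumps(request, indent=2))
```

That the whole "shoot" collapses into a dozen lines of payload assembly is exactly the leverage, and exactly the problem.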
"Let's shoot the deceased star one more time": yes, that's possible now

Then there's this morally sticky corner: bringing long-dead performers back to life. The technology is seductive, even from a kind of cinematic romanticism: an iconic face showing up again in a new scene. Maybe in a truly great film. Maybe with respect. Maybe as a tribute. But whether that person would have wanted it remains unanswered. And even if an estate signs off, what does consent even mean here? Consent to a specific film? To a particular scene? Or just to the possibility that this face now sits in a library like an asset? I can feel myself tipping back and forth: part of me thinks "artistic freedom," another part thinks "this is weird."

Motion Control amplifies this because it's not just about a still image or an "avatar that says something," but about physicality. Gestures. Timing. Body language. In other words, exactly the things we unconsciously read as "authentic."

As an aside: for security processes that rely on photos or video-based ID, this is a kind of worst case. If a system is designed to verify "a person," and at the same time we get tools that can freely combine identity and motion, you're not picking a lock; you're making a key. And the core principle stays brutal: the better the synthesis, the less useful "looks real" is as a criterion.

Note: currently, a 30-second video still takes around 8 minutes to render, roughly 16 times slower than real time. As early as 2027, it could be generated in real time. Example:

[embed:https://x.com/levelsio/status/2012205057521902041]

And that puts us in front of an uncomfortable question: as an industry (film, audio, music), do we really want to keep moving toward "everything is generatable" without, in parallel, building a culture of provenance, labeling, and signatures? I notice this internal contradiction in myself: as a filmmaker, I love control. As a software entrepreneur, I want to put the best tools into our customers' hands. As a citizen, I need trust. That doesn't fit together neatly, at least not without new rules, new standards, new habits.

Maybe that's the new dividing line of competence: not only who can generate the best images, but who can credibly show where they came from. And anyone who can't will eventually have a hard time, not because of quality, but because of belief. And when I look at how quickly Motion Control clips get shared today as "haha funny": when does that tip into "wait, that really happened, didn't it"? Or worse: "I saw it, so it must be true." I don't know if we're prepared for that.

For anyone interested in the topic, here's a video that shows what's possible: https://www.youtube.com/watch?v=O-WFLK3em5I
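One closing, concrete note on what a "culture of provenance, labeling, and signatures" could mean in practice, reduced to its core idea: bind a content hash to creation metadata at export time, so anyone can later check whether a clip still matches its manifest. This is a minimal standard-library sketch with invented function and field names, not C2PA or any real standard; production systems add cryptographic signatures and certificate chains on top of the hash.

```python
# Minimal provenance sketch: hash the media bytes into a manifest,
# verify later. Illustrative only; real systems (e.g. C2PA) sign the
# manifest and chain it to a trusted certificate.
import hashlib
import json


def make_manifest(media: bytes, creator: str, tool: str) -> str:
    """Bind a SHA-256 content hash to creation metadata."""
    return json.dumps({
        "sha256": hashlib.sha256(media).hexdigest(),
        "creator": creator,
        "tool": tool,
    })


def verify(media: bytes, manifest_json: str) -> bool:
    """True only if the media bytes still match the recorded hash."""
    manifest = json.loads(manifest_json)
    return hashlib.sha256(media).hexdigest() == manifest["sha256"]


clip = b"\x00\x01fake-video-bytes"
manifest = make_manifest(clip, creator="studio-x", tool="motion-control")
print(verify(clip, manifest))                # True: untouched clip
print(verify(clip + b"tampered", manifest))  # False: any edit breaks the hash
```

The asymmetry is the point: generating a convincing clip is getting cheap, but proving a clip is the one you exported stays cheap too, as long as the manifest travels with it.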