I’m currently exploring Multi Elements, the new feature Kling released alongside their 2.0 Master model. Multi Elements is still running on the 1.6 model for now, but I’m sure it’s only a matter of time before that changes. In the meantime, I’m diving into what this new feature can do.

One of the users in the Kling Discord server had a use case where he wanted a Viking woman to hold a jar, and I advised him to try Multi Elements. That got me curious about how it would work, so I tried it myself. I learned a lot, and I want to share it with you all so you can understand Multi Elements better.

First I took the user’s material, but since I don’t want to repost his ideas here, I decided to look for material on pexels.com from Ron Lach, where I downloaded this video:

Next, I designed a tube, and for that I used ChatGPT, since Kling’s image feature still struggles a bit with rendering text. They’re making progress, but for now my personal preference is either ChatGPT or Ideogram.

I was pretty sure this would be a one-shot success. But… yeah. See for yourself how that turned out:

swap the tube with cream from @Reference Video the tube from  @Image1 , a hand squeezing a small amount of white cream from a metallic tube top onto the fingertip of the other hand. The tube is held firmly, angled downward, and the cream forms a soft, rounded dollop on the fingertip.
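Since I ended up re-running this prompt several times, here is a minimal sketch of how you could keep the wording consistent between attempts. The function name `build_swap_prompt` and its parameters are my own invention for illustration, not anything Kling provides. The `@Reference Video` and `@Image1` tokens are simply copied from the prompt above, and the output is a tidied-up variant of that prompt, not the exact text I submitted.

```python
def build_swap_prompt(original, replacement, scene,
                      video_ref="@Reference Video", image_ref="@Image1"):
    """Assemble a swap prompt: what to replace, what to use instead, then the scene.

    Hypothetical helper for drafting prompt text only; it does not call Kling.
    """
    instruction = f"Swap {original} from {video_ref} with {replacement} from {image_ref}."
    # Collapse stray whitespace so repeated edits don't leave double spaces behind.
    scene_clean = " ".join(scene.split())
    return f"{instruction} {scene_clean}"


prompt = build_swap_prompt(
    original="the tube with cream",
    replacement="the tube",
    scene=(
        "A hand squeezing a small amount of white cream from a metallic tube top "
        "onto the fingertip of the other hand. The tube is held firmly, angled "
        "downward, and the cream forms a soft, rounded dollop on the fingertip."
    ),
)
print(prompt)
```

Keeping the scene description fixed and only changing the reference image between runs makes it easier to tell whether a different result came from the image or from the wording.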

This video was actually really helpful: it taught me that getting it right means paying close attention to the details. The tube in the reference video has a metal screw cap, while the tube I designed had a plastic click-on cap, which is why my version didn’t work. So I had to create a new tube, just like the one in the video, with a metal screw cap and the lid already off.

But it seems the AI model struggles when the object isn’t in the same position as in the reference video, and the text on it tends to get distorted.

So I flipped the image and tried again:

We’re almost there—but as you can see, the end of the tube has a long silver tip. That means size matters—the replacement tube needs to match that length visually. So I went back, edited my image to make the tube longer, and gave it one more try.

Actually, it’s not about the size this time. I’m not entirely sure what the issue is—especially since we didn’t see this problem in the earlier video where the text got garbled. But I suspect the main problem is that the AI model can’t tell where the tube ends.

So here’s a tip: 
make sure your reference videos clearly show the entire object you’re trying to swap.

Now 3 out of 4 were pretty good:

Not perfect, but I am pretty happy with this result.

To wrap it up: Multi Elements has a lot of potential, but it really comes down to how clear and consistent your references are. Matching shape, position, and full object visibility makes all the difference. It’s not perfect yet—but with the right prep, you can get some seriously solid results. I’ll definitely keep experimenting and share more along the way.

If you’re curious to try Kling yourself, I’d really appreciate it if you use my affiliate link: https://klingaiaffiliate.pxf.io/XmG52M 💗
And if you hop into the Discord—come say hi! I’m there as a moderator: https://discord.gg/V2yPFTbD 🌼