Postby Viridian » Tue Aug 22, 2023 8:17 am
Now for today's updates. It's been a pretty busy day trying to upskill. This is what we have so far.
I've refined my process for voiceovers and lip sync. I currently use D-ID for AI voices, but instead of having it generate a lip-synced still image, I just export the audio and run the video through Wav2Lip, a free, open-source lip-sync tool. The result is "saf_runway": the same Safari Girl image (made with Stable Diffusion), with a voiceover from D-ID, animated with Runway, and lip-synced with Wav2Lip. That's quite a workflow for 4 seconds - and that's not even a polished product.
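For anyone who wants to try the Wav2Lip step themselves, it can be driven roughly like this. This is a minimal sketch based on the official repo's instructions (github.com/Rudrabha/Wav2Lip), assuming the repo is cloned and the pretrained wav2lip_gan.pth checkpoint is downloaded; the file names are placeholders, not my actual files:

[code]
# Sketch: lip-syncing a Runway-animated clip to a D-ID voiceover with Wav2Lip.
# Assumes the official Wav2Lip repo is cloned and wav2lip_gan.pth sits in checkpoints/.
import subprocess

subprocess.run(
    [
        "python", "inference.py",
        "--checkpoint_path", "checkpoints/wav2lip_gan.pth",
        "--face", "saf_runway_animated.mp4",   # placeholder: the Runway-animated clip
        "--audio", "did_voiceover.wav",        # placeholder: the audio exported from D-ID
        "--outfile", "results/saf_runway_synced.mp4",
    ],
    check=True,
)
[/code]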
I've also learned that Runway is, at this point, not very good at image-to-video. As MDLambert has also found, it very often morphs quicksand images because it doesn't recognise them: either the character jumps out of the quicksand, or the AI decides it's a plus-sized model. Only very specific images work - shoulder-deep shots like the ones I demonstrated earlier give the most consistent results. Otherwise you end up burning through a lot of credits gambling on a good generation.
However, Runway's TEXT-to-video is perhaps the best AI tool right now, edging out Pika. It is exceptionally good at generating dynamic scenes: "tombtest" shows both an environment and a character pan. "saf_runwaytest" is my attempt at rendering Safari Girl in this style, with Wav2Lip sync.
It's still in its infancy, but after seeing animators create animated trailers, I had to try it myself. Safari Girl ends up looking ghoulish when rendered in Runway from text alone, but Runway has a feature for training custom AI portraits, and all it takes is 15-30 sample images. I created around 20 AI portraits of Quicky Sanders, which lets Runway generate a much closer version of my OC whenever I type her name in a prompt - and that includes the videos.
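In case it helps anyone, sample portraits like these can be batch-generated with Stable Diffusion through the diffusers library. This is a rough sketch only - the model ID, prompt, and settings below are illustrative assumptions, not my actual workflow:

[code]
# Sketch: batch-generating 20 sample portraits with Stable Diffusion via diffusers.
# Model ID, prompt, and generation settings are illustrative placeholders.
import os
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

os.makedirs("samples", exist_ok=True)
prompt = "portrait of a young woman explorer, detailed face, consistent character"
for i in range(20):  # Runway's portrait training asks for 15-30 samples
    image = pipe(prompt, num_inference_steps=30, guidance_scale=7.5).images[0]
    image.save(f"samples/portrait_{i:02d}.png")
[/code]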
That brings me to my great work for today: "Into The Woods", a proof of concept for a quicksand film trailer.