Midjourney has released the first version of its video generation model to the public. Currently, the tool can generate short videos based on images uploaded or created on the platform, but Midjourney plans to roll out more features in the future.
After creating an image in Midjourney, users will see a new “animate” button that they can click to create a 5-second clip based on a text prompt. There is also an option to use an image uploaded to the platform as the “opening frame” for the video. By default, the tool generates a generic motion prompt that “just makes things move,” but a “manual” button allows users to describe how they want the movement to look.
Users can extend the animation by four seconds at a time, up to four times, for a maximum length of 21 seconds (the initial 5-second clip plus four 4-second extensions). There are also high and low motion settings that control whether both the subject and the camera move or just the subject.
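The 21-second figure follows from simple arithmetic. The short Python sketch below is only an illustration: it assumes a 5-second initial clip and 4 seconds added per extension, as reported here, and the constant and function names are hypothetical and not part of any Midjourney API.

```python
# Illustrative arithmetic only; names are hypothetical, not a Midjourney API.
BASE_CLIP_SECONDS = 5   # length of the initial clip, as reported
EXTENSION_SECONDS = 4   # seconds added by each extension (assumed from the report)
MAX_EXTENSIONS = 4      # maximum number of extensions, as reported


def video_length(extensions: int) -> int:
    """Return the total clip length in seconds after a number of extensions."""
    if not 0 <= extensions <= MAX_EXTENSIONS:
        raise ValueError(f"extensions must be between 0 and {MAX_EXTENSIONS}")
    return BASE_CLIP_SECONDS + EXTENSION_SECONDS * extensions


if __name__ == "__main__":
    for n in range(MAX_EXTENSIONS + 1):
        print(f"{n} extension(s): {video_length(n)} seconds")
```

With the maximum of four extensions, the total is 5 + 4 × 4 = 21 seconds.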
Midjourney’s AI video generator is currently available only on the web and on the startup’s Discord server. It requires a subscription, which starts at $10/month for 3.3 hours of “fast” GPU time (about 200 image generations). The startup says it will charge “about 8 times more for video than for images,” which works out to roughly “the cost of one image” per second of video.
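As a rough back-of-the-envelope estimate, the plan figures above can be turned into an approximate monthly video budget. The Python sketch below uses only the numbers quoted in this article (3.3 hours, about 200 image generations, and a multiplier of about 8). It assumes every image generation costs the same amount of “fast” time, which is a simplification, and the function names are hypothetical.

```python
# Rough estimate from the figures quoted in the article; not official pricing.
FAST_HOURS_PER_MONTH = 3.3        # "fast" time included in the $10/month plan
IMAGE_JOBS_PER_MONTH = 200        # approximate image generations that time covers
VIDEO_COST_MULTIPLIER = 8         # "about 8 times more for video than for images"


def seconds_per_image_job() -> float:
    """Approximate 'fast' seconds consumed by one image generation."""
    return FAST_HOURS_PER_MONTH * 3600 / IMAGE_JOBS_PER_MONTH


def seconds_per_video_job() -> float:
    """Approximate 'fast' seconds consumed by one video generation."""
    return seconds_per_image_job() * VIDEO_COST_MULTIPLIER


def video_jobs_per_month() -> float:
    """Approximate video generations the plan covers if used only for video."""
    return IMAGE_JOBS_PER_MONTH / VIDEO_COST_MULTIPLIER


if __name__ == "__main__":
    print(f"Per image job: ~{seconds_per_image_job():.0f} s of fast time")
    print(f"Per video job: ~{seconds_per_video_job():.0f} s of fast time")
    print(f"Video jobs per month: ~{video_jobs_per_month():.0f}")
```

Under these assumptions, the entry-level plan would cover roughly 25 video generations a month if no images were generated at all.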
Midjourney is currently being sued by Disney and Universal, which have cited the prospect of a video generator as a particular cause for concern. They claim that Midjourney offers “a virtual vending machine that generates endless unauthorized copies of Disney and Universal copyrighted works.” The video generation model was first announced in January, and Disney and Universal argued that its training process meant that “Midjourney is likely already infringing on Plaintiffs’ copyrights.”
In a post announcing the generator, Midjourney founder David Holz says this first version is just a “stepping stone” as the startup works to create “models capable of real-time simulations of the open world.” Google, OpenAI, and Meta have also launched AI video generators, each of which can create videos from text prompts.