A model for generating videos from text

A model for generating videos from text, with prompts that can change over time and videos that can run for several minutes.
Phenaki: A Model for Generating Realistic Videos from Text

Phenaki is a model that generates realistic videos from a sequence of text prompts, with videos that can run for several minutes. It relies on a causal video tokenizer that compresses a video into a sequence of discrete tokens. Because the tokenizer is causal in time, it can handle videos of variable length. Video tokens are then generated conditioned on the text and decoded back into frames. To work around the scarcity of video-text data, Phenaki is trained jointly on a large corpus of image-text pairs and a smaller set of video-text pairs, which lets it produce open-domain videos of arbitrary length. The authors also report that the model's video encoder-decoder outperforms the per-frame baselines it is compared against, both in spatio-temporal quality and in the number of tokens needed per video.
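
The idea of turning a video into discrete tokens, with each frame only seeing earlier frames, can be illustrated with a small toy. This is a minimal sketch, not Phenaki's implementation: the real tokenizer is learned, whereas this example assumes a tiny hand-written codebook and uses plain nearest-neighbour lookup. The names `nearest_code`, `tokenize_video`, and `causal_frame_mask` are hypothetical and chosen only for illustration.

```python
from typing import List, Sequence, Tuple

Vector = Tuple[float, ...]


def nearest_code(vector: Vector, codebook: Sequence[Vector]) -> int:
    """Return the index of the codebook entry closest to `vector` (squared L2)."""
    def sq_dist(a: Vector, b: Vector) -> float:
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return min(range(len(codebook)), key=lambda i: sq_dist(vector, codebook[i]))


def tokenize_video(frames: Sequence[Sequence[Vector]],
                   codebook: Sequence[Vector]) -> List[int]:
    """Map every patch vector of every frame to a token id, frame by frame."""
    return [nearest_code(patch, codebook) for frame in frames for patch in frame]


def causal_frame_mask(num_frames: int) -> List[List[bool]]:
    """mask[i][j] is True when frame i may attend to frame j, i.e. j <= i."""
    return [[j <= i for j in range(num_frames)] for i in range(num_frames)]


# Hypothetical 4-entry codebook of 2-D vectors (an assumption for this toy).
codebook = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]

# A toy "video": 2 frames, each made of 2 patch vectors.
video = [
    [(0.1, 0.0), (0.9, 0.1)],
    [(0.1, 0.9), (0.8, 1.1)],
]

tokens = tokenize_video(video, codebook)
mask = causal_frame_mask(len(video))
print(tokens)  # [0, 1, 2, 3]
print(mask)    # [[True, False], [True, True]]
```

Because the mask never lets a frame look ahead, a single image can be treated as a one-frame video, which is one way joint training on images and videos becomes straightforward.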

Recommended Users: Phenaki is suitable for individuals interested in artificial intelligence and video synthesis.

Keywords: Artificial intelligence, video synthesis, natural language processing.

Pricing Model: Free to use.

Compatible Devices: Computers, smartphones, tablets, and other devices.

