Sora’s AI Video Revolution Still Has a Long Way to Go

OpenAI’s Sora, the company’s new AI-powered video generator, made its debut recently, showcasing the ability to create videos from a wide range of prompts. Whether it’s superheroes, cityscapes, or animated animals, Sora’s versatility is apparent. However, the results often fall short, plagued by visual glitches and inconsistencies that limit its practicality.

Released on Monday after nearly a year of teasers, Sora has already garnered significant attention. However, accessing its features isn’t straightforward. Due to overwhelming demand, account creation was suspended shortly after launch. Those fortunate enough to sign up discovered that video generation features require a paid subscription. The $20 monthly “Plus” tier allows users to create videos at 480p or 720p resolution, limited to five or ten seconds depending on the quality. For higher resolutions, including 1080p, and longer clips of up to 20 seconds, users need the $200 per month “Pro” subscription.

Testing Sora on the Plus tier revealed mixed results. Basic prompts with minimal detail, such as “a cat playing with a ball of yarn,” produced somewhat realistic outcomes. The generated cat appeared lively and convincing, but anomalies like a temporary second tail and jittery, CGI-like yarn disrupted the realism. Complex prompts with detailed descriptions led to even worse results. Human motion, in particular, was a challenge. For instance, a video of someone applying makeup showed erratic hand movements, and videos generated from prompts featuring people eating salad or sausage rolls resembled bizarre viral AI videos, like the infamous clip of Will Smith eating spaghetti.

Sora’s Storyboard feature, designed to help users create more structured videos by breaking prompts into segments, seemed promising at first. This tool resembles a video editing timeline, allowing users to specify details for every two seconds of a clip. However, the results deteriorated with increased detail, introducing more distortions and oddities.

Despite its shortcomings, Sora does have strengths. Video generation is relatively fast, often taking under 30 seconds for a ten-second clip. Details like fur patterns, textiles, lighting, and shadows are handled well, even during rapid motion. Effects such as sunlight streaming through a window are impressively realistic, creating glare and interacting naturally with objects in the scene. At lower resolutions, objects maintain a high level of detail without becoming pixelated, which is notable for an AI video generator.

Compared with existing tools like Runway AI, Sora shows promise. It produces more realistic outputs with fewer distortions than its competitors. Sora’s quality is also comparable to Adobe’s Firefly Video Model, previewed in October. However, OpenAI has yet to implement safeguards similar to those of Adobe, which ensures its outputs are commercially safe by training its AI exclusively on licensed or public-domain content.

Despite these advantages, Sora’s videos remain unsuitable for professional or commercial use. The platform struggles with narrative coherence, making it impractical for tasks like storytelling or creating polished stock footage. While skilled users might refine the results with time and effort, this undermines the idea of democratizing video production. Instead, Sora feels inaccessible because of the steep subscription costs for its higher-quality features, which rival the price of traditional filmmaking tools.

Sora’s content moderation systems also present challenges. To prevent copyright violations and inappropriate content, Sora blocks prompts involving political figures such as Donald Trump or Kamala Harris, and celebrity names like Taylor Swift yield generic, unrecognizable characters. The system also avoids generating recognizable brand icons, even with descriptive prompts. However, its guardrails aren’t foolproof. While some violent prompts are blocked, others, like one describing an explosion at the Empire State Building, still produced videos, albeit with laughably cartoonish results. Other questionable outputs, such as toddlers modeling swimsuits or holding weapons, suggest inconsistencies in enforcement.

The platform includes a feature allowing users to upload reference images, but strict rules apply. Users must confirm ownership of uploaded images and avoid content featuring minors, violence, or explicit themes. Only Pro-tier subscribers can upload images containing people, a capability that likely contributed to the polished Sora demos showcased before its release. This limitation, combined with the high subscription cost, makes the feature less accessible to the average user.

While Sora represents a step forward in AI video generation, it’s far from ready to revolutionize the industry. The subscription model puts high-quality features out of reach for many, and the technology’s current state leaves much to be desired in terms of usability and output quality. For now, Sora is better suited for experimentation or niche applications than for creating content that demands a professional finish.

Interestingly, the imperfections haven’t stopped users from attempting to profit from AI-generated videos. Platforms like YouTube are already flooded with low-quality, nonsensical AI-generated content aimed at young children. Sora could easily contribute to this trend, offering users the ability to churn out similar videos for just $20 a month. However, until the technology matures, it’s unlikely to have a transformative impact on video production.
