How AI Tool Sora Generates Videos from Text

On February 16, 2024, OpenAI announced its new text-to-video model, Sora, on X (formerly Twitter).

The model can generate videos up to 60 seconds long, switching camera angles on its own and even cutting to close-ups along the way. Below are translations of two prompts, along with the “works” Sora generated from the original English prompts.

A fashionable lady walks down the neon-lit streets of Tokyo, wearing a black leather jacket, a red long skirt, and black boots, carrying a black handbag. She wears sunglasses and red lipstick, walking confidently and casually. The street is wet, and the water on the ground reflects the colorful lights like a mirror, with many pedestrians coming and going.

Video source: OpenAI official website

A 3D animation shows a small, round, furry creature exploring a vibrant, magical forest. This creature is a mix between a rabbit and a squirrel, with soft blue fur and a fluffy striped tail. It hops along a sparkling stream, its eyes filled with curiosity. The forest is filled with magical elements: flowers that glow and change colors, trees with purple and silver leaves, and floating lights similar to fireflies. The creature eventually stops to play with a group of fairies dancing around a mushroom. It looks up in awe at a giant glowing tree that seems to be the heart of the forest.

Video source: OpenAI official website

At first glance, you might think these videos were produced by a professional film crew or an animation studio. In the OpenAI community, some users have also voiced concern that Sora might take jobs away from animators.

Image: machine-translated screenshot from community.openai.com

Some people also worry that the technology could be used to forge videos, or even to give false evidence in court.

Image: machine-translated screenshot from X

So how does Sora generate such videos? Is it really all-powerful, and will it take away people's jobs?

How does Sora generate videos?

Video source: X message posted by Gabor Cselle

Sora is a diffusion model. Image source: OpenAI official website

Adding noise and removing noise. Image source: Reference [3]
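
The idea of adding noise and then removing it can be illustrated with a small sketch. This is not Sora's implementation: the linear noise schedule, the function names (`alpha_bar`, `add_noise`, `remove_noise`), and the four-value "image" are all assumptions made for illustration. In a real diffusion model a trained neural network predicts the noise; here the true noise is passed back in as a stand-in for that prediction, so the sketch only shows the arithmetic of the two steps.

```python
import math
import random

def alpha_bar(t, num_steps, beta_start=1e-4, beta_end=0.02):
    """Fraction of the original signal left after t noising steps.

    Assumes a linear schedule of per-step noise amounts (beta), a common
    textbook choice; the schedule Sora uses is not stated in the article.
    """
    product = 1.0
    for step in range(1, t + 1):
        beta = beta_start + (beta_end - beta_start) * (step - 1) / (num_steps - 1)
        product *= 1.0 - beta
    return product

def add_noise(x0, t, num_steps, rng):
    """Forward process: blend the clean values with Gaussian noise."""
    a = alpha_bar(t, num_steps)
    noise = [rng.gauss(0.0, 1.0) for _ in x0]
    xt = [math.sqrt(a) * v + math.sqrt(1.0 - a) * n for v, n in zip(x0, noise)]
    return xt, noise

def remove_noise(xt, predicted_noise, t, num_steps):
    """Reverse step: subtract the (predicted) noise and rescale.

    A trained network would supply predicted_noise; this sketch is handed
    the true noise, which is an assumption made to keep it self-contained.
    """
    a = alpha_bar(t, num_steps)
    return [(v - math.sqrt(1.0 - a) * n) / math.sqrt(a)
            for v, n in zip(xt, predicted_noise)]

rng = random.Random(0)
pixels = [0.1, 0.5, 0.9, 0.3]  # hypothetical stand-in for image data
noisy, noise = add_noise(pixels, t=500, num_steps=1000, rng=rng)
restored = remove_noise(noisy, noise, t=500, num_steps=1000)
print("noisy:   ", noisy)
print("restored:", restored)
```

With a perfect noise estimate the original values come back up to floating-point rounding. The hard part, which this sketch skips entirely, is training a network to estimate the noise well.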

Sora processes video data. Image source: OpenAI official website

Sora’s Powerful Video Creation Ability

These three videos all lead to the same ending. Image: screenshot from the OpenAI official website

Image: screenshot from the OpenAI official website

Video source: OpenAI official website

“Powerful Sora” still has some flaws

So will Sora replace human video workers?

Sora's emergence may well threaten some creators of animation materials.

For example, in January 2024, The Hollywood Reporter reported on a survey of 300 entertainment industry leaders [4]. Three-quarters of respondents said that AI would reduce future job opportunities, with approximately 200,000 positions affected over the next three years. Sora's strong performance may exacerbate this impact.

But looked at from another angle, every emerging technology brings new opportunities along with threats.

Video generation AIs, including Sora, are just tools; the creative ideas behind a video still need to come from humans. Sora may help people produce videos more efficiently, while also giving everyone the chance to make creative videos of their own.

References

[1] https://openai.com/research/video-generation-models-as-world-simulators

[2] https://openai.com/Sora

[3] https://scholar.harvard.edu/binxuw/classes/machine-learning-scratch/materials/foundation-diffusion-generative-models

[4] https://www.hollywoodreporter.com/business/business-news/ai-hollywood-workers-job-cuts-1235811009/

Source: Popular Science China

Editor: Hao Hao

Proofreader: Chen Peng

Reviewer: Xia Wanxiang
