Almost a year has passed. OpenAI announced a 12-day livestream, and for the first two days everyone stayed up late just to complain.
On the third day, Altman finally released Sora, and the shocking news was that a subscription with unlimited generation in 1080p Pro mode costs up to $200!
Everyone grumbled and opened their computers, ready to subscribe.
As soon as the website went live, Sora's traffic surged, forcing registration to be paused on the morning of the 10th. Luckily, we managed to get in just before it closed. After a night of wrestling with the payment flow, we finally reached the subscription-success page!
Almost 1,500 RMB. What kind of results can we expect…
Actually, after nearly a year of explosive development in DiT technology, people's expectations for Sora aren't that high anymore, especially since many domestic video models have launched, like KeLing, JiMeng, HaiLuo, Vidu, PixVerse, HunYuan, ZhiPu, etc., with solid results from various closed-source and open-source DiT architectures. People are already used to AI-generated footage that looks real.
However, Sora, as the first DiT model to show a demo, still holds a special place in the hearts of AI enthusiasts, and everyone bit the bullet and placed an order. (Honestly, the cumbersome payment process alone can deter 95% of people; domestic Visa cards are not accepted, and we finally completed the payment with a U.S.-issued Visa card, thanks to classmate zs…)
Now, let’s get to the main topic!
Website: https://sora.com/
Everyone can first watch our Sora test comparison video and the experimental short film made with Sora!
—-
1. Basic Function Settings
Style presets, aspect ratio, resolution, duration, concurrency, text-to-video, image-to-video, etc.
Once you enter the webpage, you can immediately see the generation UI, including aspect ratio, resolution, duration, and concurrency, which is very intuitive. The UI design is quite good, but why can't you just drag and drop images onto the page to upload them? You have to open a file-selection window, which is a bit cumbersome…
Currently, it supports 3 aspect ratios, 3 resolutions, 4 durations, and 3 concurrency levels.
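To make that option space concrete, here is a minimal sketch of the settings as plain Python data. The specific values are my own reading of the current UI (and may change), not an official API:

```python
# Sora web-UI generation settings, written out as plain data.
# NOTE: these values are an assumption based on the UI described above, not an official API.
SORA_UI_OPTIONS = {
    "aspect_ratios": ["16:9", "1:1", "9:16"],    # 3 aspect ratios
    "resolutions":   ["480p", "720p", "1080p"],  # 3 resolutions (1080p tied to the Pro plan)
    "durations_s":   [5, 10, 15, 20],            # 4 durations, in seconds
    "variations":    [1, 2, 4],                  # 3 concurrency levels (videos per prompt)
}

# Quick sanity check against the counts mentioned in the text.
for name, values in SORA_UI_OPTIONS.items():
    print(f"{name}: {len(values)} options -> {values}")
```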
Next to the image-upload button there is a style-preset option, essentially a pre-set style filter for text-to-video that keeps the generated style consistent. It may not mean much to professional users, but offering style customization is a nice touch.
Text-to-video and image-to-video functions don’t need much introduction; just write the text prompt and upload the image, and you’re good to go…
Sora Text-to-Video / Image-to-Video Results:
The following videos are sourced from Vicky
Sora's text-to-video still has many issues with semantic understanding, physical space, and direction, often sending the main subject moving the wrong way.
· Extra limbs still appear frequently.
· Unnatural hand movements, character twitching, an overly strong 3D look, poor anime quality, and so on. Text-to-video is Sora's headline feature, and if it is already this messy, image-to-video is even worse.
· Image-to-video quality is so poor that words fail me; the only relatively decent results were the few clips usable enough to cut into the video.
· Image-to-video mainly suffers in three areas. First, it cannot handle foreground and background at the same time and has a poor grasp of physical space; the background often stays completely static. Second, it cuts frequently and at random; as creators we do our own editing, so in theory we don't need any cutting at all, just good motion within a single shot. Sora's cuts not only change the composition but often switch the style outright, which is very frustrating. Third, motion range and generalization are weak; the motion range in image-to-video is so limited I don't even want to complain anymore, and it often shifts color saturation and flickers.
I also made a comparison video of Sora's text-to-video and image-to-video:
—-
2. Storyboard Function
The storyboard function allows more precise control over what happens within a clip, at specific points on the timeline, using text, images, and videos. Previously, a 5-10 second generation could only be steered by a single prompt; with the storyboard you can specify exactly which stretch of time each text, image, or video input controls, though uploading videos that contain people is currently not supported.
The maximum duration reaches 20 seconds, and multiple storyboard segments can be added. The idea is great, but the reality is…
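To make the concept concrete, here is a tiny illustrative sketch of a 20-second storyboard as an ordered list of timed cards. The structure, field names, and file name are my own invention for illustration, not Sora's internal format or API:

```python
# Illustrative only: a storyboard as timed cards on a single 20-second timeline.
# Structure, field names, and the image file are hypothetical, not Sora's actual format.
TOTAL_DURATION_S = 20  # current maximum clip length

storyboard = [
    {"start_s": 0,  "type": "text",  "content": "wide shot of a foggy harbor at dawn"},
    {"start_s": 8,  "type": "image", "content": "harbor_closeup.png"},  # hypothetical upload
    {"start_s": 14, "type": "text",  "content": "a fishing boat slowly leaves the dock"},
]

# Each card steers the clip from its start time until the next card (or the end).
ends = [card["start_s"] for card in storyboard[1:]] + [TOTAL_DURATION_S]
for card, end in zip(storyboard, ends):
    print(f"{card['start_s']:>2}s-{end}s  {card['type']}: {card['content']}")
```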
Storyboard Function Results:
It's actually quite convenient and quick, usually good enough for concept videos and trailers. However, it often runs into semantic-understanding issues: when you input multiple prompts, say 4 segments, it often only picks up one or two of them, which is quite frustrating.
The image storyboard function is currently a disaster. Uploading images is really slow…
I even suspect that Sora's image-to-video has nothing to do with DiT… how can it feel so much like a PowerPoint slideshow?!
I wouldn't normally spend much time testing text-to-video, since it can't really serve as a productivity tool, but apart from text-to-video there's nothing else in Sora worth testing.
Honestly, if this weren't Sora, and if I hadn't already spent the $200, I definitely wouldn't sink this much time into it, trying to dig out that $200 of value and find Sora's highlights…
—-
3. Video Editing Function
The video-editing features are another highlight Sora is promoting this time, but with the underlying generation model this weak, they feel useless to me…
Sora has clearly put a lot of thought into building a product that serves creators, and the features and interactions are designed with real creativity, but with the model's current capabilities this poor, how are we supposed to use them! How!!! Oh dear!!!
Let’s still politely introduce the features…

The subscription has two tiers: $20 gets you only 50 watermarked 720p videos; $200 gets you 500 videos at up to 1080p, plus unlimited relaxed-speed (slow-queue) generations and watermark-free downloads.
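For a rough sense of value, using only the numbers above, the cost per quota video works out the same on both tiers; what the $200 buys on top is 1080p, no watermark, and the relaxed-queue generations:

```python
# Back-of-the-envelope cost per quota video for each tier (USD), using the figures above.
plus_price, plus_videos = 20, 50     # $20 tier: 720p, watermarked
pro_price, pro_videos = 200, 500     # $200 tier: up to 1080p, watermark-free, plus relaxed queue

print(plus_price / plus_videos)  # 0.4 USD per 720p video
print(pro_price / pro_videos)    # 0.4 USD per 1080p video, before counting the extra perks
```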
4. Sora Generation Cases
Some artists have created Sora sample pieces, and I really don't know how many rerolls ("card draws") it took them to get there…
Sora’s realistic light and shadow performance is quite good, but this certainly isn’t something that can be achieved in just one or two draws; most shots look like text-to-video.
Conclusion:
1. Sora’s image-to-video is currently really hard to use.
Why is Sora's image understanding so poor, why!!! I tested image-to-video to the point of breaking down; it's just too hard to get good quality.
2. Sora's text-to-video results are acceptable but take a lot of rerolling, making it difficult to create continuous narrative content.
Text-to-video struggles to keep characters and scenes consistent, so it mostly lends itself to stream-of-consciousness, visually conceptual "art films".
3. There are two highlights: first, the storyboard concept is quite good, but the model's capabilities can't keep up; second, text-to-video has a strong sense of realism, but image-to-video falls short, which leaves that realism feeling wasted.
With well-written prompts for text-to-video and some familiarity with the product's rules, you can still produce some good work, even if the rerolling drives you mad…
People’s expectations for it were indeed too high before.
Currently, domestic AI video products are catching up fast and are very competitive; people have already seen plenty of good results, and the days when a casually generated 4-second clip could impress anyone are gone. Beyond raw model capability, users also care about hit rates per reroll, interactivity, and the cost-performance of subscription packages. From any angle, Sora currently does not have enough of an advantage.
4. If you just want to make AI videos, there's no need to buy it; $200 is not worth it, and even $20 isn't necessary. But if you're a Sora fan, or have money to burn and enjoy making text-to-video concept films, then pretend I didn't say anything…
Sora no longer brings surprises, and domestic products are closely following.
Seeing AI technology iterate, innovate, and truly mature is what every creator hopes for.
But we are still in the early days: which companies will stay the course? What kind of technology ecosystem will carry it to the finish line?
No one knows.
It’s still too early to draw conclusions. But we sincerely hope that day will come.
—-
A little Easter egg:
I hastily created a Sora AI experimental short film called “MOVIE ELEMENTS”
Can everyone guess what classic movie elements are in the short film?
Benefits
Address: https://www.yuque.com/frannnk7/aidesign?# 《AI Knowledge Base Collection PRO》
Interested or curious friends are welcome to join the AI learning group.