
Written by: Xu Tao
Edited by: Wu Yangyang
Key Points
Creative individuals were Runway’s earliest user group;
Runway made rapid progress in its early days, completing early financing within 4 weeks;
The Series A financing changed Runway’s strategic direction, shifting from an open-source creative model community to a “next-generation creative toolkit” that takes aim at image-editing incumbents such as Adobe;
After conflicts with Stability, Runway shifted its research focus from image generation to video generation;
Runway’s competitors are not only AI peers but also visual effects companies – they have started developing video generation models themselves.
In April 2023, a sensational advertisement appeared on Twitter (now renamed X). Accompanied by dynamic background music, a middle-aged male voice promoted a pizzeria called “Pepperoni Hug Spot,” highlighting its ample cheese and delivery service.
The entire advertisement lasts 30 seconds and contains nothing particularly novel; what drew the clicks was how it was made. A Twitter user named Pizza Later created the video using AI at every step, from script to shots, voiceover to music. The restaurant’s name and ad copy were generated by GPT-4, including the line “(the pizza from this restaurant) is like family, but with more cheese.” The still frames were generated by Midjourney, which produced images with a “1980s pizzeria look and grainy texture.” He then opened the text-to-video tool Gen-2 and had it generate more than 30 video clips from the script, picking the best 16 for the final cut. Another AI service, ElevenLabs, read the GPT-4 script in a series of preset AI voices, which he adjusted until the tone was right. Finally, he assembled all the AI-generated material in Adobe’s editing tool After Effects into an advertisement created entirely by AI.
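Viewed abstractly, the workflow is simply a chain of generation steps with human curation in between. The sketch below is a purely hypothetical illustration of that chain in Python; every function is a stub standing in for one of the tools named above, not a real API of GPT-4, Midjourney, Gen-2, ElevenLabs, or After Effects.

```python
# Purely hypothetical sketch of the pipeline described above. Every function is
# a stub standing in for one of the tools named in the article (GPT-4,
# Midjourney, Gen-2, ElevenLabs, After Effects); none of these are real APIs,
# and the real workflow involved manual curation at every step.

def generate_script(brief: str) -> list[str]:
    """Stand-in for GPT-4: turn a one-line brief into shot descriptions."""
    return [f"Shot {i}: {brief}" for i in range(1, 5)]

def generate_still(shot: str) -> str:
    """Stand-in for Midjourney: pretend to render a grainy, 1980s-style image."""
    return f"still_{abs(hash(shot)) % 1000:03d}.png"

def generate_clip(shot: str) -> str:
    """Stand-in for Gen-2: pretend to render a short video clip for one shot."""
    return f"clip_{abs(hash(shot)) % 1000:03d}.mp4"

def synthesize_voiceover(script: list[str]) -> str:
    """Stand-in for ElevenLabs: pretend to read the script in a preset AI voice."""
    return "voiceover.mp3"

def assemble(clips: list[str], stills: list[str], voiceover: str) -> str:
    """Stand-in for the manual edit in After Effects."""
    return f"final_ad.mp4 ({len(clips)} clips, {len(stills)} stills, {voiceover})"

brief = "a cheesy ad for the Pepperoni Hug Spot pizzeria"
script = generate_script(brief)
stills = [generate_still(shot) for shot in script]
candidates = [generate_clip(f"{shot}, take {t}") for shot in script for t in range(8)]
best_takes = candidates[:16]   # the creator hand-picked the best 16 of ~30 clips
print(assemble(best_takes, stills, synthesize_voiceover(script)))
```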

The Twitter user Pizza Later made a 30-second advertisement for a pizzeria with the help of Runway’s Gen-2 and other AI tools.
The advertisement is far from polished and even looks a little strange: the customers’ jaws sometimes twitch uncontrollably, and their mouths never actually bite the pizza, merely deforming as if chewing. Yet viewers flocked to it. The ad drew 350,000 views on Twitter and 1.16 million on YouTube, outperforming the meticulously produced campaigns many pizza brands commission from large advertising agencies.
The entire advertisement production involved text generation, image generation, sound generation, and video generation, with video being the most challenging. The AI tool Gen-2, which generated video clips for the above advertisement, comes from the American AI company Runway ML (hereinafter referred to as “Runway”).
Developing AI Image Editing Tools for Creative Individuals
The founding of Runway resembles the “American Dream” entrepreneurial story glorified in Hollywood films.
Cristóbal Valenzuela, while working in Chile, discovered the work of American new media artist Gene Kogan on neural style transfer and became interested in neural networks. He later quit his job and went to New York University in 2016 to study the Interactive Telecommunications Program (ITP). It was during his time in the ITP program that he met Chilean Alejandro Matamala-Ortiz and Greek Anastasis Germanidis, and they formed a startup team.
Valenzuela himself has no technical background; he studied economics and business management as an undergraduate and later earned a master’s degree in design. He has taught design at Adolfo Ibáñez University in Santiago, Chile.
In contrast, the other two co-founders had work experience in technology and product roles. Germanidis studied computer science at Wesleyan University, worked as a product engineer, and was a computer vision researcher at IBM. Matamala-Ortiz has a background in product design and front-end development and founded Deenty, an online dental appointment platform in Chile.
Valenzuela’s entrepreneurial project grew out of his research in the ITP program. While studying at NYU, he spent two years learning AI technology, including Fei-Fei Li’s image database ImageNet and convolutional neural networks such as AlexNet. He tried integrating models for image segmentation, image understanding, and video understanding into Photoshop and Premiere to help users speed up stylization, coloring, and editing work, and shared the results on Twitter. Many artists and designers were drawn to the demos, and some were willing to try the tools because they were simple to operate. These creators, much like Valenzuela himself, became Runway’s earliest users.
When he presented Runway as a research project at NYU, an Adobe employee offered him a job, one that might be considered a dream job: “I had been an immigrant in New York for two years, and a perfect, dream company offered me a dream job, a visa, and a perfect salary – this was a dream.” He nonetheless turned Adobe down, choosing to turn Runway from a research project into a business.
From Open Source Model Community to Proprietary Model Products
Runway made rapid progress in its early days; when Valenzuela started the company, many venture capital firms had already expressed interest, and they completed early financing within 4 weeks.
However, the difficulty of transforming a research project into a startup began to reveal itself later.
“The Series A financing in 2020 was arguably our most difficult round.” On the UK podcast “20VC,” Valenzuela recounted the company’s financing journey. He pitched investors on the idea of “building a generative AI company” and received hundreds of rejections; investors at the time had little interest in generative AI.
Before this, Runway was essentially a model community, hosting dozens of models, both its own and those built by independent developers, to serve the varied needs of creative users. Some of the models were downright quirky: one developer, for instance, trained a special version of the GPT-2 text generator on lyrics by the Korean pop group BTS.
The model community concept may simply have been ahead of its time. Without a critical mass of models or recognition from end users, the appeal of such a platform was limited. In 2023, Alibaba Cloud, Amazon, and Baidu successively launched large-model platforms, and in each case the platform generally arrived only after the company had released its own self-developed large model.
The Series A financing changed Runway’s strategic direction, shifting from an open-source creative model community to a “next-generation creative toolkit” that takes aim at image-editing incumbents such as Adobe. Valenzuela wanted to build system-level interfaces on top of the platform’s models and algorithms to help end users work more efficiently, with product development focused on film and television creation. In the funding announcement, Series A lead investor Amplify Partners stated, “We believe this will have a profound impact, akin to that of the camera.” Reports at the time mentioned Runway’s development of a video editing tool called Green Screen, which helps creators remove unwanted elements from their footage.
This pivot was welcomed by the film industry. The visual effects team behind the hit film “Everything Everywhere All At Once” approached Valenzuela for technical solutions and used AI tools including Green Screen. For a scene in which two rocks converse, the crew moved the rocks on set with a slider, then used Green Screen to erase the slider in post-production.
Today, Runway’s tools focus on film and television creation, including video generation and editing, image generation and editing, 3D capture and texturing, etc.
I Firmly Believe 2023 Is the Year of Video
In October 2022, a dispute broke out on the open-source model hosting platform Hugging Face: Stability AI (hereinafter referred to as “Stability”) accused Runway of leaking the company’s intellectual property by releasing version 1.5 of the text-to-image model Stable Diffusion there, and demanded that Runway take the published model down. At the time, the public perception was that Stability, an emerging AIGC star company, was the developer of Stable Diffusion’s algorithm, while Runway was a far less recognized name. Stable Diffusion is the foundational algorithm behind most of today’s text-to-image models.
Valenzuela’s response made clear that Runway was in fact a core developer of Stable Diffusion: in April 2022, Runway’s chief research scientist Patrick Esser and Robin Rombach of the Machine Vision and Learning research group at the University of Munich (who later joined Stability as head of its research team) developed the first version of the text-to-image tool Stable Diffusion, and Runway continued to take part in subsequent iterations. Stability supplied computing resources and funding in the later stages of the research to turn the project into a commercial product.
This dispute gained Runway, founded in 2018, more exposure. By the end of June 2023, Runway completed a Series C+ financing of $141 million, with investments from Google, NVIDIA, and Salesforce. Bloomberg reported that this Series C+ financing had raised the startup’s valuation from $500 million to $1.5 billion.
It was also after the conflict with Stability that Runway shifted its research focus from image generation to video generation.
In February 2023, Runway released its first-generation video model, Gen-1, and a month later it released the second-generation model, Gen-2. Gen-1 lets users modify existing video clips, transforming them into anything from watercolor to claymation, while Gen-2 turns text prompts into short, AI-generated moving clips. The two also differ in generation length: Gen-1 can generate 15 seconds of video, while Gen-2 extends that to 18 seconds.
Duration is one of the biggest challenges for text-to-video models. At its simplest, a video is just a series of frames (still images) played back in sequence to create the illusion of motion. But the human eye is quick to spot the slightest flaw between frames, so a model’s output has to stay coherent from frame to frame for the illusion to hold; the core of text-to-video modeling lies in capturing the relationships and consistency between frames.
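To make the frame-consistency point concrete, the following minimal sketch (an illustration only, not how Runway’s models work) treats a clip as a stack of frames and uses a naive proxy for temporal consistency, the average per-pixel change between consecutive frames; a generator whose consecutive frames drift too far apart breaks the illusion of motion.

```python
import numpy as np

# Minimal illustration (not Runway's method): a "video" is just a stack of
# frames, and one naive proxy for temporal consistency is how much consecutive
# frames differ. Smooth motion gives small differences; flicker gives large ones.

rng = np.random.default_rng(0)

def make_video(num_frames=24, height=64, width=64, jitter=0.01):
    """Synthesize frames where each frame is the previous one plus small noise."""
    frames = [rng.random((height, width))]
    for _ in range(num_frames - 1):
        nxt = frames[-1] + jitter * rng.standard_normal((height, width))
        frames.append(np.clip(nxt, 0.0, 1.0))
    return np.stack(frames)  # shape: (num_frames, height, width)

def temporal_inconsistency(video):
    """Mean absolute per-pixel change between consecutive frames."""
    return float(np.abs(np.diff(video, axis=0)).mean())

smooth = make_video(jitter=0.01)  # frames evolve gradually, like real motion
jumpy = make_video(jitter=0.5)    # frames barely relate to one another
print(f"smooth clip: {temporal_inconsistency(smooth):.3f}")
print(f"jumpy clip:  {temporal_inconsistency(jumpy):.3f}")
```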
The race over generation length continues. In the short term, the technology cannot replace filming, let alone shake the vast film industry. Valenzuela nevertheless sees room for video generation to grow: at the start of 2023 he declared his firm belief that “2023 will be the year of video.”
Over the past two years, the combination of AI and video has indeed become an important niche market. In September 2022, Meta, the parent company of Facebook, released the video generation model Make-A-Video, which is likewise built on image generation. The following month, Google launched its own video generation model, Imagen Video. Neither Meta nor Google has brought those models to market; Google cited concerns that Imagen Video could produce violent or sexually explicit videos because of problematic content in the training data. In March 2023, Adobe launched its generative AI engine Firefly and began gradually integrating it into its audio, image, and video editing tools. On February 16, 2024, OpenAI unveiled a more powerful video generation model, Sora, capable of generating videos up to one minute long.
Back in January 2023, OpenAI CEO Sam Altman had already disclosed plans to launch a video model, though without giving a specific timeline.
For Runway, the competition comes not only from AI peers but also from visual effects companies, which have started developing video generation models of their own. In June 2023, the film “Indiana Jones and the Dial of Destiny” featured a young Indiana Jones created with technology from Industrial Light & Magic rather than played by a younger actor. Robert Weaver, a visual effects supervisor at Industrial Light & Magic, said the company used computers to compile footage of Harrison Ford from earlier Indiana Jones films and applied its in-house FaceSwap technology, combined with machine learning and other techniques, to make the actor appear younger in the finished sequences.
If major film-industry players are willing to develop their own video generation technology, their willingness to buy Runway’s products may decline, which is not good news for a company focused on that very industry. According to sources cited by Forbes at the end of 2022, Runway’s annual revenue hovered around $1 million, far from enough to cover the high cost of model training and video generation.
Company Profile: Runway
Founded: 2018
Founding Team: Cristóbal Valenzuela, Alejandro Matamala-Ortiz, and Anastasis Germanidis
Core Products: Video Generation Model Gen-1, Text-to-Video Model Gen-2
Financing History:
· 2018.12 Seed round, raised $2 million
· 2020.12 Series A, raised $8.5 million
· 2021.12 Series B, raised $35 million, led by Coatue
· 2022.12 Series C, raised $50 million, led by Felicis
· 2023.6 Series C+, raised $141 million, led by Google
Valuation: $1.5 billion
-END-
