Introduction
Once upon a time, creating a comic book was a tedious process that required writers, illustrators, and countless hours of effort. Today, artificial intelligence serves as a powerful tool that enhances the capabilities of creative professionals. Imagine handing a short story to an AI and watching it help transform that story into a vibrant, visually stunning comic book—while retaining the creator’s unique perspective. This is no longer just a fantasy; it is a reality made possible by cutting-edge generative AI models. In this blog, we will explore how CrewAI LLM agents enhance the creative process of comic book creation, delving into the architecture and implementation that make this magic possible.
Example: Creating a Storybook from Panchatantra
To demonstrate this process, let’s use a short story from the Panchatantra—a collection of ancient Indian fables known for their wisdom and moral lessons. Consider the story of “The Lion and the Rabbit”:
Short Story: “Once upon a time, there was a powerful lion named Basuraka who roamed the jungle. The animals grew tired of his tyranny and decided to deliver one prey to him each day. One day, it was the clever rabbit’s turn, and he devised a plan to get rid of the lion. He led Basuraka to a deep well, convincing him that another lion lived there. Basuraka, seeing his reflection in the water, jumped into the well in a fit of rage and never returned.”
Using the CrewAI framework, we will follow these steps:
1. Scriptwriting Agent: The scriptwriter will break the story down into scenes, such as:
   • Scene 1: The lion roams the jungle.
   • Scene 2: The animals decide to deliver one prey each day.
   • Scene 3: The rabbit plans to deceive the lion.
   • Scene 4: The lion jumps into the well.
2. Visual Artist Agent: The visual artist will generate illustrations for each scene, depicting key moments such as the lion roaring in the jungle, the rabbit leading the lion to the well, and the final scene of the lion jumping into the water.
3. Composer Agent: Finally, the composer will combine all these scenes and images into a coherent storybook, ready for viewing and sharing.
For more details about the Panchatantra stories, you can refer to external resources such as Panchatantra on Wikipedia or Panchatantra Story Collection.
Automating Creation Generation with LLM Agents
Generative AI agents can be seen as a digital “team” working together to execute complex creative processes. By assigning specific tasks to individual AI agents, the process of creating an entire comic book becomes efficient and automated. In the illustration below, the specialized agents work in collaboration:

1. Scriptwriter: Responsible for transforming the short story into detailed scene scripts.
2. Visual Artist: Responsible for turning each scene into engaging visual artwork.
3. Composer: Responsible for merging all generated scenes and their corresponding images into a complete comic book. The composer ensures a smooth narrative flow and that the final product is ready for publication.
The synergy between these agents automates the comic book creation process, achieving an efficient and creative workflow. The key lies in leveraging the capabilities of generative language models and image-generating AI systems in a coordinated manner.
Architecture Overview
The automated architecture is simple yet effective. The process starts with a short story, flowing as follows:
1. Short Story Input: The process begins with a short narrative as input, serving as the foundation for the comic book.
2. Scriptwriting Agent: This agent breaks the short story down into discrete scenes, each capturing a significant part of the storyline. In the illustration, this is shown as scenes labeled "Scene 1, Scene 2, Scene 3," etc., until the entire story is divided into several smaller scenes.
3. Visual Artist Agent: The visual artist is responsible for translating each scene description into visual representations, effectively illustrating the comic. Visual elements are created as images representing scenes like the lion in the sunlight, the lion meeting the rabbit, etc.
4. Composer: Finally, all scenes and their corresponding images are combined by the composer agent to create a complete picture book.
The entire process is designed to seamlessly transform the narrative into an engaging comic book, with minimal human intervention required.
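Before diving into the CrewAI implementation, the four stages above can be sketched as a plain pipeline of functions. This is only an illustration of the data flow; the function names and stub bodies are hypothetical, not part of the CrewAI API.

```python
# Illustrative pipeline sketch: each function stands in for one agent.
# Names, signatures, and stub logic are hypothetical, not CrewAI APIs.
from typing import List, Tuple

def write_script(story: str, number_of_scenes: int) -> List[str]:
    """Scriptwriter: split the story into scene narrations (stub)."""
    # A real implementation would call an LLM; here we split naively on periods.
    sentences = [s.strip() for s in story.split('.') if s.strip()]
    return sentences[:number_of_scenes]

def illustrate(scene: str) -> str:
    """Visual artist: return an image URL for a scene (stub)."""
    return f"https://example.com/images/{abs(hash(scene)) % 1000}.png"

def compose(scenes: List[str], images: List[str]) -> List[Tuple[str, str]]:
    """Composer: pair each narration with its illustration, in order."""
    return list(zip(scenes, images))

story = "The lion roams the jungle. The rabbit leads him to a well."
scenes = write_script(story, number_of_scenes=2)
book = compose(scenes, [illustrate(s) for s in scenes])
print(len(book))  # 2 (narration, image-url) pairs
```

In the real system each stub is replaced by an LLM- or DALL-E-backed agent, but the handoff between stages is the same: scene texts flow from the scriptwriter to the visual artist, and both streams meet at the composer.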
Implementing the CrewAI Framework
To bring this vision to life, we implemented the CrewAI framework, where three agents work in harmony. Below are the detailed steps of the implementation, along with code snippets so you can replicate the process step by step:
Defining Agents and Tasks: Using the CrewAI framework, we define two agents, the Scriptwriter and the Visual Artist. Each agent has a specific role, and their tasks are interconnected to achieve an efficient workflow.
## Agents
scriptwriter:
  role: >
    Write scene scripts for children's short stories
  goal: >
    Write simple, clear, and engaging scene scripts for children's picture books.
  backstory: >
    You are a scriptwriter focused on transforming children's short stories into scripts for performances or animations.
  llm: llm_model

## Tasks
scriptwriting:
  description: >
    You will receive a children's short story about important life lessons. This story needs to be transformed into an engaging picture book for children to read and engage with. You are responsible for breaking the story down into {number_of_scenes} unique scenes, each focusing on a specific event or moment in the story. Each scene will be transformed into an image. You must generate the following information, following the specified Pydantic model:
    - A suitable name for the story
    - A brief summary of the story
    - A short background introduction to provide the reader with important information
    - A detailed narration for each scene in the story, at least one to two sentences
    - The final lesson learned from the story
    <short_story>
    {story_text}
    </short_story>
  expected_output: >
    Output must strictly follow the Pydantic model. If not followed, there will be penalties.
  agent: scriptwriter
## Agents
visualartist:
  role: >
    Visual illustrations for the storybook
  goal: >
    Create engaging picture books.
  backstory: >
    An expert in creating illustrated storybooks.
  llm: llm_model

## Tasks
illustration:
  description: >
    You will receive a children's short story about important life lessons. This story will be transformed into an engaging picture book for children to read and engage with. The story has been broken down into unique scenes.
    Below is the description of one scene, along with a brief summary of the story.
    Generate a prompt that can be used by a text-to-image model to create an image for that scene. Send the prompt to the provided tool to generate an image that meets the scene's requirements for characters and background. The characters should be in a cartoon style. The prompt should be less than 40 words.
    <story_summary>
    {story_summary}
    </story_summary>
    <scene_description>
    {scene_description}
    </scene_description>
  expected_output: >
    Output must strictly follow the Pydantic model. If not followed, there will be penalties.
  agent: visualartist
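The curly-brace placeholders in these task descriptions ({number_of_scenes}, {story_text}, {story_summary}, {scene_description}) are filled in from the inputs passed at kickoff. As a minimal sketch of how this interpolation behaves, here is plain `str.format` applied to a trimmed-down, hypothetical description string:

```python
# Sketch of {placeholder} interpolation in a task description,
# using plain str.format on a trimmed-down description string.
description = (
    "You are responsible for breaking the story down into "
    "{number_of_scenes} unique scenes.\n"
    "<short_story>\n{story_text}\n</short_story>"
)

inputs = {
    "number_of_scenes": 4,
    "story_text": "Once upon a time, there was a powerful lion named Basuraka...",
}

rendered = description.format(**inputs)
print(rendered.splitlines()[0])
# "You are responsible for breaking the story down into 4 unique scenes."
```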
Crew Configuration: Define the Pydantic models that structure the agents' responses, configure the generative models (here, OpenAI chat models and DALL-E), and bind the agents to their tasks.
from typing import List

from crewai import Agent, Crew, Process, Task
from crewai.project import CrewBase, agent, crew, llm, task
from crewai_tools import DallETool
from langchain_openai import ChatOpenAI
from pydantic import BaseModel, Field

dalle_tool = DallETool(model="dall-e-3",
                       size="1024x1024",
                       quality="standard",
                       n=1)

## Define a class for a single scene
class StoryScene(BaseModel):
    scene_number: int
    narration: str

## Define a class for the list of story scenes
class StoryScenes(BaseModel):
    story_name: str
    summary: str
    background: str
    lesson: str
    scenes: List[StoryScene]

## Define a class for the image of a single scene
class SceneImage(BaseModel):
    prompt: str = Field(description="Prompt for generating images using the text-to-image model.", max_length=50)
    image_url: str = Field(description="URL of the image generated by the tool")

@CrewBase
class StoryCrew():
    """Story team"""
    agents_config = 'config/story/agents.yaml'
    tasks_config = 'config/story/tasks.yaml'

    @llm
    def llm_model(self):
        return ChatOpenAI(temperature=0.0,  # Set to 0 for deterministic output
                          model="gpt-4o-mini",  # Use the GPT-4o mini model
                          max_tokens=8000)

    @agent
    def scriptwriter(self) -> Agent:
        return Agent(
            config=self.agents_config['scriptwriter'],
            verbose=True
        )

    @task
    def scriptwriting(self) -> Task:
        return Task(
            config=self.tasks_config['scriptwriting'],
            output_pydantic=StoryScenes,
        )

    @crew
    def crew(self) -> Crew:
        """Create the story team"""
        script_crew = Crew(
            agents=self.agents,  # Automatically created by the @agent decorator
            tasks=self.tasks,  # Automatically created by the @task decorator
            process=Process.sequential,
            verbose=True,
            # process=Process.hierarchical, # If you want to switch to this, refer to https://docs.crewai.com/how-to/Hierarchical/
        )
        return script_crew

@CrewBase
class ArtistCrew():
    """Illustration team"""
    agents_config = 'config/visual/agents.yaml'
    tasks_config = 'config/visual/tasks.yaml'

    @llm
    def llm_model(self):
        return ChatOpenAI(temperature=0.0,  # Set to 0 for deterministic output
                          model="gpt-4o-2024-08-06",  # Use the GPT-4o model
                          max_tokens=8000)

    @agent
    def visualartist(self) -> Agent:
        return Agent(
            config=self.agents_config['visualartist'],
            tools=[dalle_tool],
            verbose=True
        )

    @task
    def illustration(self) -> Task:
        return Task(
            config=self.tasks_config['illustration'],
            output_pydantic=SceneImage,
            output_file='report.md'
        )

    @crew
    def crew(self) -> Crew:
        """Create the illustrated book team"""
        artist_crew = Crew(
            agents=self.agents,  # Automatically created by the @agent decorator
            tasks=self.tasks,  # Automatically created by the @task decorator
            process=Process.sequential,
            verbose=True,
            # process=Process.hierarchical, # If you want to switch to this, refer to https://docs.crewai.com/how-to/Hierarchical/
        )
        return artist_crew
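Before wiring the crews together, it helps to see the shape of the structured output the scriptwriter is expected to emit. The snippet below builds a StoryScenes-shaped payload from JSON using only the standard library; in a real run, CrewAI validates the agent's response against the Pydantic model for us, so this shape check is purely illustrative.

```python
import json

# A StoryScenes-shaped payload, as the scriptwriter agent is expected to emit it.
# The field values here are made up for illustration.
raw = json.dumps({
    "story_name": "The Lion and the Rabbit",
    "summary": "A clever rabbit tricks a tyrannical lion into jumping into a well.",
    "background": "The animals of the jungle deliver one prey to the lion each day.",
    "lesson": "Wit can overcome brute strength.",
    "scenes": [
        {"scene_number": 1, "narration": "The lion roams the jungle."},
        {"scene_number": 2, "narration": "The rabbit leads the lion to a deep well."},
    ],
})

data = json.loads(raw)
# Minimal shape check mirroring the StoryScenes model fields.
required = {"story_name", "summary", "background", "lesson", "scenes"}
assert required <= data.keys()
for scene in data["scenes"]:
    print(f"Scene {scene['scene_number']}: {scene['narration']}")
```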
Main Workflow: Ensure proper handoff between the two agents. For example, once the scriptwriter completes a scene, it is automatically passed to the visual artist, ensuring continuity in the workflow.
import agentops

agentops.start_session(tags=['story', 'scripts'])
## Run the StoryCrew to break the story into scenes
inputs = {
    'number_of_scenes': int(number_of_scenes),
    'story_text': story_text,
}
scenes_list = StoryCrew().crew().kickoff(inputs=inputs)
agentops.end_session("Success")

if scenes_list is not None:
    print(f"Raw result from script writing: {scenes_list.raw}")
    slist = scenes_list.pydantic
    story_summary = slist.summary
    for scene in slist.scenes:
        print(f"Scene: {scene.narration}")
    scene_input = [{'story_summary': story_summary,
                    'scene_description': scene.narration} for scene in slist.scenes]

    agentops.start_session(tags=['scene', 'illustration'])
    ## Run the ArtistCrew to illustrate each scene
    result_images = ArtistCrew().crew().kickoff_for_each(inputs=scene_input)
    print(f"result_images: {result_images}")
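The composer step is not shown in the code above. A minimal sketch of what it might do is stitching each scene's narration and generated image URL into a markdown storybook; the function name and page structure here are illustrative, not part of CrewAI.

```python
from typing import List, Tuple

def compose_storybook(title: str, pages: List[Tuple[str, str]]) -> str:
    """Assemble (narration, image_url) pairs into a simple markdown storybook."""
    lines = [f"# {title}", ""]
    for i, (narration, image_url) in enumerate(pages, start=1):
        lines.append(f"## Scene {i}")
        lines.append(f"![Scene {i}]({image_url})")
        lines.append("")
        lines.append(narration)
        lines.append("")
    return "\n".join(lines)

# Hypothetical output of the scriptwriter and visual artist agents.
pages = [
    ("The lion roams the jungle.", "https://example.com/scene1.png"),
    ("The rabbit leads the lion to the well.", "https://example.com/scene2.png"),
]
book_md = compose_storybook("The Lion and the Rabbit", pages)
print(book_md.splitlines()[0])  # "# The Lion and the Rabbit"
```

The resulting markdown string could then be written to a file or rendered directly, giving the "ready for publication" artifact the composer agent is responsible for.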
Conclusion
The power of generative artificial intelligence lies in its ability to enhance and support the creative process, providing content creators with new tools to bring their ideas to life. CrewAI LLM agents assist in transforming simple short stories into engaging comic picture books, helping storytellers at every stage of the journey. By automating repetitive tasks such as script breakdown and visual generation, AI allows artists and writers to focus more on core creative elements, preserving their unique artistic styles. This implementation demonstrates how generative AI can enhance the creative industry, offering a vision of a future where creativity and technology collaborate seamlessly.
Let me know if you want to learn more about CrewAI or if you have any questions about implementing similar creative solutions for your projects!
