


1. Basics of LlamaIndex Workflows
1.1 Definition of Workflow
A workflow is an event-driven, step-based way to control the execution of an application. It consists of multiple steps, each responsible for handling a specific type of event and emitting new events. This design lets workflows handle a wide range of scenarios, from simple single-step processes to complex multi-step processes with branches and loops.
In LlamaIndex, a workflow is implemented by subclassing the Workflow class and defining its steps. Each step is decorated with the @step decorator, which infers the input and output event types of the step and uses them to validate the workflow.
A simple workflow is as follows:
# single_step_workflow.py
from llama_index.core.workflow import (
    StartEvent,
    StopEvent,
    Workflow,
    step,
)
from llama_index.llms.openai import OpenAI


# Define the workflow class for generating a haiku
class SingleStepWorkflow(Workflow):
    llm = OpenAI()

    @step
    async def generate_haiku(self, ev: StartEvent) -> StopEvent:
        try:
            # Keyword arguments passed to run() are carried by the StartEvent
            theme = ev.get("theme", "nature")  # Default to "nature"
            prompt = f"Write a traditional haiku about {theme}."
            haiku = await self.llm.acomplete(prompt)
            return StopEvent(result=str(haiku))
        except Exception as e:
            return StopEvent(result=f"Error occurred: {str(e)}")


async def main():
    # Run the workflow
    w = SingleStepWorkflow(timeout=60, verbose=False)
    result = await w.run(theme="nature")
    print(f"Result: {str(result)}")


if __name__ == "__main__":
    import asyncio

    asyncio.run(main())
1.2 Events and Steps
Events are a key concept in workflows, serving as data carriers between steps. In LlamaIndex, events are user-defined Pydantic objects that can have custom attributes and methods. Each step can receive one or more events as input and output one or more new events.
StartEvent and StopEvent are two predefined special events in LlamaIndex workflows. StartEvent serves as the entry point of the workflow and carries its initial input, while StopEvent marks the end of the workflow and carries its final result.
A step is a single task within a workflow. Each step is an asynchronous function that handles specific events and emits new ones. Steps are connected through events: when a step outputs an event, every step subscribed to that event type is triggered.
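For illustration, a minimal custom event can be sketched as follows (the JokeEvent name and its field are illustrative, not part of the library):
from llama_index.core.workflow import Event

# Custom events are Pydantic models: declare the payload as typed fields
class JokeEvent(Event):
    joke: str

# A step that accepts JokeEvent subscribes to it and is triggered
# whenever another step emits one, e.g.:
#   @step
#   async def critique_joke(self, ev: JokeEvent) -> StopEvent:
#       ...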
2. Execution Modes of LlamaIndex Workflows
2.1 Sequential Execution Mode
- Definition
In sequential execution mode, the steps of the workflow are executed in order, with each step taking the output of the previous step as its input.
- Code Example
# sequential_step_workflow.py
from llama_index.core.workflow import (
    StartEvent,
    StopEvent,
    Workflow,
    step,
    Event,
)
from llama_index.llms.openai import OpenAI


# Define a custom event class; events are Pydantic models,
# so fields are declared as typed attributes
class HaikuEvent(Event):
    haiku: str


# Define the workflow class
class SequentialStepWorkflow(Workflow):
    llm = OpenAI()

    @step
    async def generate_haiku(self, ev: StartEvent) -> HaikuEvent:
        theme = ev.get("theme", "nature")
        prompt = f"Write a traditional haiku about {theme}."
        haiku = await self.llm.acomplete(prompt)
        return HaikuEvent(haiku=str(haiku))

    @step
    async def generate_limerick(self, ev: HaikuEvent) -> StopEvent:
        prompt = f"Write a limerick inspired by this haiku: {ev.haiku}"
        limerick = await self.llm.acomplete(prompt)
        return StopEvent(result=str(limerick))


async def main():
    # Run the workflow
    w = SequentialStepWorkflow(timeout=60, verbose=False)
    result = await w.run(theme="nature")
    print(f"Result:\n{str(result)}")


if __name__ == "__main__":
    import asyncio

    asyncio.run(main())
- Advantages
This mode suits scenarios with clear dependencies between steps; the logic is clear, easy to understand, and easy to maintain.
2.2 Concurrent Execution Mode
- Definition
Concurrent execution mode allows multiple independent steps to run at the same time, improving the execution efficiency of the workflow.
- Code Example
# concurrent_step_workflow.py
from llama_index.core.workflow import (
    StartEvent,
    StopEvent,
    Workflow,
    step,
    Event,
    Context,
)
from llama_index.llms.openai import OpenAI


# Define custom event classes; events are Pydantic models
class HaikuEvent(Event):
    haiku: str


class LimerickEvent(Event):
    limerick: str


class ConcurrentStepWorkflow(Workflow):
    llm = OpenAI()

    # Both of the following steps accept StartEvent, so they run concurrently
    @step
    async def generate_haiku(self, ev: StartEvent) -> HaikuEvent:
        theme = ev.get("theme", "nature")
        prompt = f"Write a traditional haiku about {theme}."
        haiku = await self.llm.acomplete(prompt)
        return HaikuEvent(haiku=str(haiku))

    @step
    async def generate_limerick(self, ev: StartEvent) -> LimerickEvent:
        theme = ev.get("theme", "nature")
        prompt = f"Write a limerick about {theme}."
        limerick = await self.llm.acomplete(prompt)
        return LimerickEvent(limerick=str(limerick))

    @step
    async def combine_results(
        self, ctx: Context, ev: HaikuEvent | LimerickEvent
    ) -> StopEvent | None:
        # Buffer incoming events in the workflow context until both have arrived
        results = ctx.collect_events(ev, [HaikuEvent, LimerickEvent])
        if results is None:
            # Still waiting for the other event
            return None
        haiku_ev, limerick_ev = results
        combined = f"Haiku:\n{haiku_ev.haiku}\n\nLimerick:\n{limerick_ev.limerick}"
        return StopEvent(result=combined)


async def main():
    # Run the workflow
    w = ConcurrentStepWorkflow(timeout=60, verbose=False)
    result = await w.run(theme="nature")
    print(f"Result:\n{str(result)}")


if __name__ == "__main__":
    import asyncio

    asyncio.run(main())
- Advantages
Concurrent execution can significantly reduce the total execution time of the workflow and improve overall system performance when steps do not depend on one another's outputs.
3. Advantages of LlamaIndex Workflows
LlamaIndex workflows offer various advantages, making them an ideal choice for building complex query pipelines and data pipelines.
3.1 Modularity
Each step in LlamaIndex workflows is independent, reusable, and can be tested individually. This modular design allows developers to easily build and assemble complex workflows without worrying about dependencies between components. Additionally, the modular design facilitates maintenance and updates of workflows since each step can be modified and tested separately.
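For example, because @step leaves the method an ordinary async function (an assumption worth verifying against your installed version), a step from the sequential example above can be exercised in isolation:
import asyncio

from sequential_step_workflow import HaikuEvent, SequentialStepWorkflow

async def test_generate_limerick():
    wf = SequentialStepWorkflow()
    # Call the step directly with a hand-built event, without running the workflow
    ev = HaikuEvent(haiku="An old silent pond\nA frog jumps in\nSplash! Silence again")
    result = await wf.generate_limerick(ev)
    print(result.result)

asyncio.run(test_generate_limerick())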
3.2 Customization
LlamaIndex workflows provide a high degree of customization. Developers can tailor the steps and events of the workflow according to specific requirements to meet the demands of different scenarios. For instance, when building a RAG (Retrieval-Augmented Generation) workflow, developers can customize the handling of steps and events based on specific business logic and algorithmic needs.
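A hedged sketch of what such a customized RAG workflow might look like (the RetrieveEvent class, my_retriever, and my_llm are hypothetical placeholders, not library objects):
from llama_index.core.workflow import Event, StartEvent, StopEvent, Workflow, step

class RetrieveEvent(Event):
    context: str  # retrieved context used to ground the answer

class CustomRAGWorkflow(Workflow):
    @step
    async def retrieve(self, ev: StartEvent) -> RetrieveEvent:
        query = ev.get("query", "")
        # Swap in any retrieval strategy here (vector search, keyword index, ...)
        context = await my_retriever.aretrieve(query)  # hypothetical retriever returning text
        return RetrieveEvent(context=str(context))

    @step
    async def synthesize(self, ev: RetrieveEvent) -> StopEvent:
        # Swap in any prompt or LLM here
        answer = await my_llm.acomplete(f"Answer using only this context:\n{ev.context}")  # hypothetical LLM
        return StopEvent(result=str(answer))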
3.3 Scalability
LlamaIndex workflows exhibit good scalability. As demands grow and change, developers can easily add new steps and events to extend the functionality of the workflow. Moreover, since the steps within the workflow are independent, new algorithms or components can be integrated into the workflow without needing to restructure the entire workflow.
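For instance, a workflow can be extended by subclassing it. The following sketch layers a translation step onto the single-step example above (the TranslatedHaikuWorkflow name and the French-translation step are illustrative, and it assumes step overriding behaves like ordinary Python method overriding):
from llama_index.core.workflow import StartEvent, StopEvent, step
from sequential_step_workflow import HaikuEvent
from single_step_workflow import SingleStepWorkflow

class TranslatedHaikuWorkflow(SingleStepWorkflow):
    # Override the inherited step so it emits an intermediate event instead of stopping
    @step
    async def generate_haiku(self, ev: StartEvent) -> HaikuEvent:
        theme = ev.get("theme", "nature")
        haiku = await self.llm.acomplete(f"Write a traditional haiku about {theme}.")
        return HaikuEvent(haiku=str(haiku))

    # New step plugged in purely by subscribing to HaikuEvent
    @step
    async def translate(self, ev: HaikuEvent) -> StopEvent:
        translated = await self.llm.acomplete(f"Translate this haiku into French:\n{ev.haiku}")
        return StopEvent(result=str(translated))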
3.4 Visualization and Debugging
LlamaIndex workflows provide visualization and debugging capabilities, allowing developers to intuitively understand the execution flow and status of the workflow. By generating flowcharts of the workflow, developers can clearly see the connections and dependencies between each step. Additionally, LlamaIndex workflows support step-by-step execution and manual event triggering, which facilitates debugging and testing.
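For example, with the optional llama-index-utils-workflow package installed, every possible path through a workflow can be rendered to an interactive HTML page (the output filename is arbitrary):
from llama_index.utils.workflow import draw_all_possible_flows

from concurrent_step_workflow import ConcurrentStepWorkflow

# Writes an HTML flowchart showing every step and the events connecting them
draw_all_possible_flows(ConcurrentStepWorkflow, filename="concurrent_flow.html")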
4. Conclusion
LlamaIndex workflows offer an efficient, flexible, and manageable way to handle complex data tasks. With their modular, customizable, and scalable design, and support for both sequential and concurrent execution, they can adapt to a wide range of business scenarios. Whether in content creation, data processing, or intelligent interaction, LlamaIndex workflows help developers and enterprises better leverage large language models, improve efficiency, and foster innovation, gaining a competitive edge in an ever-changing digital environment.