Understanding LangChain in Depth

Source: Distributed Laboratory

  • What is LangChain?
  • Practical Applications of LangChain
  • LangChain Tokens and Models
  • Summary

In our day-to-day work, we usually focus on building end-to-end applications. There are many automated machine learning platforms and continuous integration/continuous delivery (CI/CD) pipelines available to automate our machine learning workflows. We also have tools like Roboflow and Andrew Ng's Landing AI that can automate or create end-to-end computer vision applications.

If we wanted to create an application based on large language models from OpenAI or Hugging Face, we used to have to build it by hand. Now, to achieve the same goal, we have two well-known libraries, Haystack and LangChain, that help us create end-to-end applications and pipelines around large language models.

Let’s dive deep into LangChain.

What is LangChain?

LangChain is an innovative framework that is changing the way we develop applications driven by language models. By introducing advanced principles, LangChain pushes past the limits of what a traditional API call can achieve. In addition, LangChain applications can act as intelligent agents, allowing language models to interact with and adapt to their environment.

LangChain consists of multiple modules. As its name suggests, the main purpose of LangChain is to chain these modules together: we can link the modules in series and invoke them all through a single chain structure.

These modules consist of the following parts:

Model

As discussed in the introduction, models primarily cover large language models (LLMs). Large language models refer to neural network models with a large number of parameters trained on vast amounts of unlabeled text. Tech giants have launched a variety of large language models, such as:

  • Google’s BERT
  • OpenAI’s GPT-3
  • Google's LaMDA
  • Google's PaLM
  • Meta AI’s LLaMA
  • OpenAI’s GPT-4
  • ……

With LangChain, interacting with large language models becomes much more convenient. The interfaces and functionality LangChain provides help integrate the power of LLMs into your applications seamlessly. LangChain uses the asyncio library to provide asynchronous support for LLMs.

This asynchronous support is especially useful in network-bound scenarios that call several LLMs concurrently: by freeing up the thread that is handling a request, the server can assign it to other tasks until the response is ready, maximizing resource utilization.

Currently, LangChain offers asynchronous support for the OpenAI, PromptLayerOpenAI, ChatOpenAI, and Anthropic integrations, with support for other LLMs planned. You can use the agenerate method to call an OpenAI LLM asynchronously. You can also write custom wrappers for LLMs that LangChain does not yet support.
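As a minimal sketch of this asynchronous API (it assumes an OpenAI API key is set in the environment; the prompt, temperature, and call count are arbitrary), agenerate can be combined with asyncio.gather to fire off several generations concurrently:

import asyncio

from langchain.llms import OpenAI

async def generate_concurrently():
    llm = OpenAI(temperature=0.9)
    # Schedule ten generations at once instead of awaiting them one by one
    tasks = [llm.agenerate(["Tell me a joke"]) for _ in range(10)]
    results = await asyncio.gather(*tasks)
    for result in results:
        print(result.generations[0][0].text)

asyncio.run(generate_concurrently())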

I have used OpenAI in my applications, primarily using the Davinci, Babbage, Curie, and Ada models to solve my problems. Each model has its own advantages, token usage, and use cases.

For more information about these models, please read:

https://subscription.packtpub.com/book/data/9781800563193/2/ch02lvl1sec07/introducing-davinci-babbage-curie-and-ada

Example 1:

# Importing modules
from langchain.llms import OpenAI

# Here we are using text-ada-001, but you can change it
llm = OpenAI(model_name="text-ada-001", n=2, best_of=2)

# Ask anything
llm("Tell me a joke")

Output 1:

'\n\nWhy did the chicken cross the road?\n\nTo get to the other side.'

Example 2:

llm_result = llm.generate(["Tell me a poem"] * 15)
llm_result.generations[0]

Output 2:

[Generation(text="\n\nWhat if love neverspeech\n\nWhat if love never ended\n\nWhat if love was only a feeling\n\nI'll never know this love\n\nIt's not a feeling\n\nBut it's what we have for each other\n\nWe just know that love is something strong\n\nAnd we can't help but be happy\n\nWe just feel what love is for us\n\nAnd we love each other with all our heart\n\nWe just don't know how\n\nHow it will go\n\nBut we know that love is something strong\n\nAnd we'll always have each other\n\nIn our lives."),
 Generation(text='\n\nOnce upon a time\n\nThere was a love so pure and true\n\nIt lasted for centuries\n\nAnd never became stale or dry\n\nIt was moving and alive\n\nAnd the heart of the love-ick\n\nIs still beating strong and true.')]

Prompt

As we all know, a prompt is the input we provide to a system so that it tailors its answer to our use case precisely or specifically. Often, what we want is not just text but more structured information. Many recent object detection and classification algorithms based on contrastive pre-training and zero-shot learning use prompts as input for result prediction; for example, both OpenAI's CLIP and Grounding DINO take prompts as input for their predictions.

In LangChain, we can set prompt templates as needed and connect them with the main chain for output prediction. Additionally, LangChain provides output parsers to further refine results. The role of the output parser is (1) to guide the formatting of the model’s output, and (2) to parse the output into the desired format (including retries if necessary).
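As a small illustration of an output parser (the subject and the sample completion here are arbitrary), CommaSeparatedListOutputParser both injects formatting instructions into the prompt and parses the raw completion back into a Python list:

from langchain.output_parsers import CommaSeparatedListOutputParser
from langchain.prompts import PromptTemplate

output_parser = CommaSeparatedListOutputParser()
format_instructions = output_parser.get_format_instructions()

prompt = PromptTemplate(
    template="List five {subject}.\n{format_instructions}",
    input_variables=["subject"],
    partial_variables={"format_instructions": format_instructions},
)

# Parsing a raw completion into the desired structure
output_parser.parse("vanilla, chocolate, strawberry, mint, pistachio")
# -> ['vanilla', 'chocolate', 'strawberry', 'mint', 'pistachio']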

In LangChain, we can provide prompt templates as input. A template refers to the specific format or blueprint we want to receive answers in. LangChain provides pre-designed prompt templates that can be used to generate prompts for different types of tasks. However, in some cases, preset templates may not meet your needs. In such cases, we can use custom prompt templates.

Example:

from langchain import PromptTemplate

# This template will act as a blueprint for the prompt

template = """
I want you to act as a naming consultant for new companies.
What is a good name for a company that makes {product}?
"""

prompt = PromptTemplate(
    input_variables=["product"],
    template=template,
)
prompt.format(product="colorful socks")
# -> I want you to act as a naming consultant for new companies.
# -> What is a good name for a company that makes colorful socks?

Memory

In LangChain, chains and agents run in a stateless mode by default, meaning they handle each incoming query independently. However, in certain applications (like chatbots), retaining previous interaction records is important for both short-term and long-term contexts. This is where the concept of “memory” comes in.

LangChain provides two forms of memory components. First, it offers auxiliary tools for managing and manipulating previous chat messages, designed to be modular and useful regardless of the use case. Second, it offers a straightforward way to integrate these tools into a chain, making the setup flexible and adaptable to many situations (see the sketch after the example below).

Example:

from langchain.memory import ChatMessageHistory

history = ChatMessageHistory()
history.add_user_message("hi!")
history.add_ai_message("whats up?")

history.messages

Output:

[HumanMessage(content='hi!', additional_kwargs={}),  
 AIMessage(content='whats up?', additional_kwargs={})]
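To illustrate the second form, here is a minimal sketch (assuming an OpenAI API key in the environment; the inputs are arbitrary) of plugging a memory component, ConversationBufferMemory, directly into a chain so that each call sees the previous turns:

from langchain.llms import OpenAI
from langchain.chains import ConversationChain
from langchain.memory import ConversationBufferMemory

conversation = ConversationChain(
    llm=OpenAI(temperature=0),
    memory=ConversationBufferMemory(),  # stores every turn and replays it in the prompt
)

conversation.predict(input="Hi, my name is Sam.")
conversation.predict(input="What is my name?")  # the model can answer from memory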

Chain

Chains provide a way to combine various components into a unified application. For example, a chain can be created that receives user input, formats it using a PromptTemplate, and then passes the formatted reply to an LLM (large language model). By integrating multiple chains with other components, more complex chain structures can be generated.

LLMChain is one of the most commonly used ways to query an LLM object. It formats the provided input key-value pairs and memory key-value pairs (if present) using the prompt template, sends the formatted string to the LLM, and returns the LLM's output.

A series of steps can also be performed after calling the language model, allowing a sequence of multiple model calls. This pattern is particularly valuable when you want the output of one call to serve as the input of another. In such a sequential chain, each step has an input and an output, and the output of one step becomes the input of the next (see the sequential-chain sketch after the example below).

# Here we are chaining everything together
from langchain.chat_models import ChatOpenAI
from langchain.chains import LLMChain
from langchain.prompts import PromptTemplate
from langchain.prompts.chat import (
    ChatPromptTemplate,
    HumanMessagePromptTemplate,
)
human_message_prompt = HumanMessagePromptTemplate(
    prompt=PromptTemplate(
        template="What is a good name for a company that makes {product}?",
        input_variables=["product"],
    )
)
chat_prompt_template = ChatPromptTemplate.from_messages([human_message_prompt])
# Format the prompt, send it to the chat model, and print the output
chain = LLMChain(llm=ChatOpenAI(temperature=0.9), prompt=chat_prompt_template)
print(chain.run("colorful socks"))

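Building on that, here is a minimal sketch of the sequential pattern described above using SimpleSequentialChain (the prompts, variable names, and temperature are illustrative): the company name produced by the first chain becomes the input of the second.

from langchain.llms import OpenAI
from langchain.chains import LLMChain, SimpleSequentialChain
from langchain.prompts import PromptTemplate

llm = OpenAI(temperature=0.7)

# Step 1: propose a company name for a given product
name_chain = LLMChain(llm=llm, prompt=PromptTemplate(
    input_variables=["product"],
    template="What is a good name for a company that makes {product}?",
))

# Step 2: write a slogan for that company name
slogan_chain = LLMChain(llm=llm, prompt=PromptTemplate(
    input_variables=["company_name"],
    template="Write a catchy slogan for a company called {company_name}.",
))

# The output of name_chain is fed as the input of slogan_chain
overall_chain = SimpleSequentialChain(chains=[name_chain, slogan_chain])
print(overall_chain.run("colorful socks"))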