LangChain Refine Chain: Iteratively Summarize Documents (2026)

The Refine Chain in LangChain doesn’t polish a finished summary in isolation; it summarizes documents by repeatedly calling the model, passing the intermediate summary along with the next chunk of text.

Let’s see it in action. Imagine you have a long document, too long for a single LLM call. You split it into chunks, and the Refine Chain works through them one at a time.

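# LLMChain pairs an LLM with a prompt; the refine chain needs one per step
from langchain.chains import LLMChain
from langchain.chains.combine_documents.refine import RefineDocumentsChain
from langchain.chat_models import ChatOpenAI
from langchain.prompts import PromptTemplate
from langchain.docstore.document import Document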

# Sample documents (replace with your actual document loading)
docs = [
    Document(page_content="This is the first part of a very long document. It talks about the initial setup of a project and its early goals. The project aims to revolutionize the way we interact with data."),
    Document(page_content="The second part delves into the technical architecture. We discuss the use of microservices, the database choices, and the deployment strategy. Scalability and resilience are key considerations here."),
    Document(page_content="Finally, the third part covers the future roadmap. This includes planned features, potential partnerships, and the long-term vision for the project's impact on the industry. We anticipate significant growth."),
]

# Initialize the LLM
llm = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0)

# Define the prompt for the initial summary
initial_prompt = PromptTemplate.from_template(
    "This is the first document I'm giving you. Summarize it:\n\n{text}"
)

# Define the prompt for refining the summary.
# {existing_answer} receives the running summary, {text} receives the next document.
refine_prompt = PromptTemplate.from_template(
    "This is not the first document, so I'll give you an existing summary and the next document to process.\n"
    "Please refine the existing summary based on the text in the next document.\n\n"
    "Existing Summary:\n{existing_answer}\n\n"
    "Next Document:\n{text}\n\n"
    "Refined Summary:"
)

# Create the RefineDocumentsChain
refine_chain = RefineDocumentsChain(
    initial_llm_chain=LLMChain(llm=llm, prompt=initial_prompt),  # summarizes the first document
    refine_llm_chain=LLMChain(llm=llm, prompt=refine_prompt),  # folds each later document into the summary
    document_variable_name="text",  # prompt variable that receives the document text
    initial_response_name="existing_answer",  # prompt variable that receives the running summary
    return_intermediate_steps=False,  # set to True to also return the intermediate summaries
)

# Run the chain; the final summary is returned under the "output_text" key
result = refine_chain.invoke({"input_documents": docs})
print(result["output_text"])

The core problem this solves is input that is too long for a single LLM call. Instead of erroring out on the token limit, the Refine Chain processes the documents sequentially, one per call, building a summary incrementally. It starts with the first document, generates an initial summary, and then takes that summary plus the next document to produce a refined summary. This process repeats until all documents are consumed.
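The chain consumes a list of documents that have already been split; it does not cut a long text into pieces itself. As a minimal sketch of that preparation step, assuming naive fixed-size character windows with a small overlap (split_into_chunks is a hypothetical helper for illustration, not a LangChain function):

```python
def split_into_chunks(text: str, chunk_size: int = 200, overlap: int = 20) -> list[str]:
    """Cut text into windows of at most chunk_size characters, each sharing
    `overlap` characters with the previous window."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


long_text = "word " * 100  # 500 characters standing in for a long document
chunks = split_into_chunks(long_text, chunk_size=200, overlap=20)
print(len(chunks), [len(c) for c in chunks])
```

Each chunk would then be wrapped as Document(page_content=chunk) before being handed to the chain. A character window can cut a sentence in half, so a splitter that respects sentence or paragraph boundaries is likely a better fit for real text.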

Internally, it uses two main prompt templates: one for the very first document (the "initial prompt") and another for all subsequent documents (the "refine prompt"). The RefineDocumentsChain orchestrates this, passing the output of one LLM call as input to the next, effectively threading the summary through the document chunks. The initial_llm_chain parameter handles the initial summarization, and the refine_llm_chain parameter handles the refinement steps; document_variable_name and initial_response_name name the prompt variables that receive the document text and the running summary.

When return_intermediate_steps is False (the default), you only get the final, cumulative summary. If you set it to True, you’ll get a list of all the intermediate summaries generated after processing each document, which can be useful for debugging or understanding how the summary evolved.
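The loop described above, including the return_intermediate_steps switch, can be sketched in plain Python without LangChain. This is an illustration of the control flow under stated assumptions, not the library's implementation: refine_summarize and fake_llm are hypothetical names, the templates are simplified, and the result keys output_text and intermediate_steps are chosen to mirror the chain's output.

```python
from typing import Callable

INITIAL_TEMPLATE = "Summarize this text:\n\n{text}"
REFINE_TEMPLATE = (
    "Existing summary:\n{existing_answer}\n\n"
    "Next document:\n{text}\n\n"
    "Refined summary:"
)


def refine_summarize(
    chunks: list[str],
    llm: Callable[[str], str],
    return_intermediate_steps: bool = False,
) -> dict:
    """Thread a running summary through the chunks, one model call per chunk."""
    if not chunks:
        raise ValueError("need at least one chunk")
    # First chunk: the initial prompt only sees the document text.
    summary = llm(INITIAL_TEMPLATE.format(text=chunks[0]))
    steps = [summary]
    # Remaining chunks: the refine prompt sees the previous summary and the new text.
    for chunk in chunks[1:]:
        summary = llm(REFINE_TEMPLATE.format(existing_answer=summary, text=chunk))
        steps.append(summary)
    result = {"output_text": summary}
    if return_intermediate_steps:
        result["intermediate_steps"] = steps
    return result


# Stand-in for a real model (an assumption for illustration): it records each
# prompt it receives and returns a numbered placeholder summary.
prompts_seen: list[str] = []


def fake_llm(prompt: str) -> str:
    prompts_seen.append(prompt)
    return f"summary v{len(prompts_seen)}"


result = refine_summarize(
    ["part one", "part two", "part three"], fake_llm, return_intermediate_steps=True
)
print(result["output_text"])
print(result["intermediate_steps"])
```

Because every call depends on the summary produced by the previous one, the calls cannot run in parallel: three chunks mean three sequential model calls.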

The most surprising thing is how the "refine" step works: it’s not about making the existing summary better in isolation. The refine_prompt is designed to take the Existing Summary and the Next Document and produce a new summary that incorporates information from both. The LLM is essentially instructed to re-summarize the combined context. This means the "existing summary" is just one piece of context for the LLM, alongside the new document chunk.
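One practical consequence is that the existing summary and the next chunk share the same context window on every refine call. A rough budget check might look like the sketch below. It assumes about 4 characters per token as a stand-in for a real tokenizer, and the function names, the 50-token prompt overhead, and the 500-token output reserve are all illustrative assumptions rather than LangChain settings.

```python
def estimate_tokens(text: str) -> int:
    """Very rough token estimate: assumes about 4 characters per token."""
    return max(1, len(text) // 4)


def refine_call_fits(
    existing_summary: str,
    next_chunk: str,
    context_limit: int,
    prompt_overhead: int = 50,
    reserved_for_output: int = 500,
) -> bool:
    """Check whether summary + chunk + instructions + reply fit in one call."""
    prompt_tokens = (
        prompt_overhead
        + estimate_tokens(existing_summary)
        + estimate_tokens(next_chunk)
    )
    return prompt_tokens + reserved_for_output <= context_limit


summary_so_far = "s" * 400      # about 100 estimated tokens
small_chunk = "c" * 4_000       # about 1,000 estimated tokens
large_chunk = "c" * 16_000      # about 4,000 estimated tokens

print(refine_call_fits(summary_so_far, small_chunk, context_limit=4_096))
print(refine_call_fits(summary_so_far, large_chunk, context_limit=4_096))
```

If a check like this fails, the usual remedies are smaller chunks or a refine prompt that asks for a shorter summary, since the summary's length counts against every later call.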

The next concept you’ll likely explore is how to handle different types of processing beyond summarization, such as question answering over multiple documents using chains like MapReduceDocumentsChain or StuffDocumentsChain.