LangChain’s cost tracking is a surprisingly effective way to prevent your LLM experiments from becoming an unexpected bill at the end of the month.
Let’s see it in action. Imagine you have a simple chain that summarizes text.
```python
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough

# Initialize the LLM
llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0)

# Define the prompt
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant that summarizes text."),
    ("user", "Summarize the following text: {text}"),
])

# Define the chain
chain = (
    {"text": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)

# Sample text to summarize
text_to_summarize = """
LangChain is a framework designed to simplify the development of applications powered by large language models (LLMs).
It provides a modular approach, allowing developers to chain together different components like LLMs, prompts, memory, and document loaders.
This enables the creation of complex applications such as chatbots, question-answering systems over documents, and agents that can interact with their environment.
LangChain's key abstractions include Models, Prompts, Chains, Agents, and Memory. By composing these, developers can build sophisticated LLM-powered workflows.
"""

# Run the chain and print the summary
summary = chain.invoke(text_to_summarize)
print(summary)
```
Now, to track the costs associated with this chain, we can use the get_openai_callback context manager from the langchain_community package.

```python
from langchain_community.callbacks import get_openai_callback

# Every OpenAI call made inside the with block is recorded on cost_tracker
with get_openai_callback() as cost_tracker:
    # Run the same chain again; the chain itself does not need to change
    summary_tracked = chain.invoke(text_to_summarize)

print(summary_tracked)

# Print the totals accumulated inside the block
print(f"\nTotal cost: ${cost_tracker.total_cost:.4f}")
print(f"Total tokens: {cost_tracker.total_tokens}")
```

This will output the summary and then, crucially, something like:

```text
LangChain is a framework for building LLM applications by chaining together components like LLMs, prompts, and memory. It supports complex applications such as chatbots and agents, using key abstractions like Models, Prompts, Chains, Agents, and Memory to create sophisticated workflows.

Total cost: $0.0005
Total tokens: 435
```

The cost_tracker object is a callback handler. It is notified when each call to the ChatOpenAI model finishes, reads the prompt and completion token counts from the response, and calculates an estimated cost from a pricing table bundled with langchain_community. Because it hooks into LangChain's callback system, it covers any Runnable executed inside the with block without altering the original chain's definition.

The core problem cost tracking solves is the "black box" nature of LLM API calls. You send a prompt, you get a response, but the underlying token consumption, and therefore the cost, can be opaque. With the callback in place, you can see how many tokens a run consumes (cost_tracker.prompt_tokens and cost_tracker.completion_tokens give the breakdown) and, more importantly, how much it costs. This is critical for optimizing prompts, choosing cheaper models for less demanding tasks, and setting budgets.

The handler accumulates usage across every call made inside the with block. If a chain makes multiple LLM calls, or you invoke several chains in the same block, cost_tracker.total_cost and cost_tracker.total_tokens reflect the sum of all these interactions. No debug flag or LangSmith tracing is required for the counts to be reported. Keep in mind that the cost is an estimate based on the bundled pricing table, so it can lag behind the provider's current prices, and that get_openai_callback only understands OpenAI models.

A common misconception is that cost tracking only works for direct LLM calls. However, the callback picks up any OpenAI model call made by a Runnable executed inside the block. This means you can track costs for complex chains, agents, or even sequences of RAG operations. To attribute tokens and cost to one specific step rather than the whole pipeline, wrap just that step's invocation in its own with block and read the totals from that block's handler.
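The cost estimate itself reduces to simple arithmetic: prompt and completion tokens are each multiplied by a per-1,000-token price, and the results are summed and accumulated across calls. The sketch below illustrates that calculation in plain Python. The price table, the estimate_cost helper, and the UsageTotals class are hypothetical names used for illustration, and the prices are placeholder values, not current provider pricing.

```python
from dataclasses import dataclass

# Hypothetical USD prices per 1,000 tokens; placeholders, not real pricing.
PRICE_PER_1K = {
    "example-small": {"prompt": 0.0005, "completion": 0.0015},
    "example-large": {"prompt": 0.0100, "completion": 0.0300},
}


def estimate_cost(model: str, prompt_tokens: int, completion_tokens: int) -> float:
    """Estimate the USD cost of one call from its token counts."""
    price = PRICE_PER_1K[model]
    return (
        prompt_tokens / 1000 * price["prompt"]
        + completion_tokens / 1000 * price["completion"]
    )


@dataclass
class UsageTotals:
    """Running totals across several calls."""

    total_tokens: int = 0
    total_cost: float = 0.0

    def record(self, model: str, prompt_tokens: int, completion_tokens: int) -> None:
        # Add one call's tokens and estimated cost to the running totals
        self.total_tokens += prompt_tokens + completion_tokens
        self.total_cost += estimate_cost(model, prompt_tokens, completion_tokens)


totals = UsageTotals()
totals.record("example-small", prompt_tokens=400, completion_tokens=100)
totals.record("example-small", prompt_tokens=200, completion_tokens=100)
print(f"Total tokens: {totals.total_tokens}")
print(f"Total cost: ${totals.total_cost:.4f}")
```

Running the same token counts through a second table entry is a quick way to compare what a cheaper model would save on a given workload.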
The next step you’ll likely want to explore is how to integrate this cost data into more sophisticated monitoring dashboards or to trigger alerts when spending exceeds predefined thresholds.
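A threshold alert can be as small as a check against the running total after each run. The following is a minimal sketch, assuming you pass in each run's cost yourself. BudgetGuard, BudgetExceeded, and the dollar figures are hypothetical names and values, not part of LangChain.

```python
class BudgetExceeded(RuntimeError):
    """Raised when accumulated spend passes the configured limit."""


class BudgetGuard:
    """Accumulate per-run costs and flag spend against a fixed budget."""

    def __init__(self, limit_usd: float, warn_ratio: float = 0.8) -> None:
        self.limit_usd = limit_usd
        self.warn_ratio = warn_ratio
        self.spent_usd = 0.0

    def add(self, cost_usd: float) -> str:
        """Record one run's cost; return "ok" or "warn", or raise when over budget."""
        self.spent_usd += cost_usd
        if self.spent_usd > self.limit_usd:
            raise BudgetExceeded(
                f"spent ${self.spent_usd:.4f} of ${self.limit_usd:.4f} budget"
            )
        if self.spent_usd >= self.warn_ratio * self.limit_usd:
            return "warn"
        return "ok"


guard = BudgetGuard(limit_usd=1.00)
statuses = [guard.add(0.50), guard.add(0.35)]  # second run crosses the 80% mark
print(statuses)

try:
    guard.add(0.25)  # pushes the total past the 1.00 limit
except BudgetExceeded as exc:
    print(f"Alert: {exc}")
```

In practice, the value passed to add would be the cost reported for each tracked run, and the except branch is where a notification to a dashboard or on-call channel would go.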