LangSmith tracing lets you see exactly what’s happening inside your LangChain applications when they run, making it way easier to find and fix bugs.

Let’s see it in action. Imagine a simple LangChain app that takes a user’s question, uses an LLM call to brainstorm some related keywords, and then makes a second LLM call to answer the question based on those keywords.

Here’s a snippet of the Python code that sets up this tracing:

import os

from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_core.tracers import LangChainTracer
from langchain_openai import ChatOpenAI
from langsmith import Client

# Set up LangSmith environment variables.
# Replace the placeholder key with your own; OPENAI_API_KEY must also be set.
os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_API_KEY"] = "your-langsmith-api-key"
os.environ["LANGCHAIN_PROJECT"] = "my-langchain-project"

# Initialize the LangSmith client and tracer.
# With LANGCHAIN_TRACING_V2 enabled, LangChain attaches a tracer automatically;
# creating one explicitly here just makes the mechanism visible.
client = Client()
tracer = LangChainTracer(client=client, project_name="my-langchain-project")

# Define your LLM
llm = ChatOpenAI(model="gpt-3.5-turbo")

# Define the prompt templates
keyword_prompt = ChatPromptTemplate.from_messages([
    ("system", "Generate 5 keywords related to the user's question."),
    ("human", "{question}")
])

answer_prompt = ChatPromptTemplate.from_messages([
    ("system", "Answer the question using the provided keywords.\nKeywords: {keywords}"),
    ("human", "{question}")
])

# Define the chain: first generate keywords (parsed to a plain string),
# then pass both the question and the keywords to the answer prompt.
chain = (
    RunnablePassthrough.assign(keywords=keyword_prompt | llm | StrOutputParser())
    | answer_prompt
    | llm
)

# Run the chain; the tracer receives every step through the callback system
question = "What are the benefits of learning Python?"
response = chain.invoke({"question": question}, config={"callbacks": [tracer]})

print(f"Answer: {response.content}")

When you run this code, LangSmith captures every step: the initial prompt, the LLM call for keywords, the LLM call for the answer, and the final output. You can then go to your LangSmith dashboard and see a detailed trace of this execution.

The core problem LangSmith tracing solves is the "black box" nature of LLM applications. Before tracing, if your LangChain app produced a bad answer, you’d be staring at code, trying to guess where the logic went wrong. Was it the prompt? The LLM’s output? The way you processed the LLM’s output? LangSmith gives you a step-by-step log, showing you the inputs and outputs at each stage of your LangChain pipeline.

Internally, LangSmith integrates with LangChain’s callback system. When you run a LangChain component (like an LLM call, a prompt rendering, or a tool execution), it fires off events. The LangChainTracer listens to these events and sends them to the LangSmith backend, where they are organized into a trace. Each trace represents a single invocation of your application.
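To make the event flow described above more concrete, here is a toy model of callback-based tracing in plain Python. It is not LangChain's or LangSmith's actual implementation: `ToyTracer`, `Run`, `traced_step`, and `run_pipeline` are hypothetical names invented for this sketch, and the "LLM calls" are simple string functions. It only illustrates the idea that start and end events, linked by run and parent IDs, can be assembled into a tree that represents one invocation.

```python
# Toy model of callback-based tracing. All names here are hypothetical and
# exist only for illustration; this is not the LangChain/LangSmith code.
import uuid
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class Run:
    id: str
    name: str
    inputs: Any
    parent_id: Optional[str] = None
    outputs: Any = None
    children: list = field(default_factory=list)


class ToyTracer:
    """Listens for start/end events and assembles them into a run tree."""

    def __init__(self):
        self.runs = {}   # run id -> Run
        self.roots = []  # top-level runs, one per invocation

    def on_start(self, name, inputs, parent_id=None):
        run = Run(id=str(uuid.uuid4()), name=name, inputs=inputs, parent_id=parent_id)
        self.runs[run.id] = run
        if parent_id is None:
            self.roots.append(run)
        else:
            self.runs[parent_id].children.append(run)
        return run.id

    def on_end(self, run_id, outputs):
        self.runs[run_id].outputs = outputs


def traced_step(tracer, name, fn, value, parent_id):
    """Run one step, firing a start event before it and an end event after it."""
    run_id = tracer.on_start(name, value, parent_id)
    result = fn(value)
    tracer.on_end(run_id, result)
    return result


def run_pipeline(tracer, question):
    """Stand-in for a two-step chain: extract keywords, then build an answer."""
    root_id = tracer.on_start("pipeline", question)
    keywords = traced_step(
        tracer, "keywords", lambda q: q.lower().split()[:3], question, root_id
    )
    answer = traced_step(
        tracer, "answer", lambda kws: "Answer about: " + ", ".join(kws), keywords, root_id
    )
    tracer.on_end(root_id, answer)
    return answer


tracer = ToyTracer()
result = run_pipeline(tracer, "Why learn Python today")
trace = tracer.roots[0]
for child in trace.children:
    print(f"{child.name}: {child.inputs!r} -> {child.outputs!r}")
```

The real tracer does more (timing, errors, batching uploads to the backend), but the parent/child structure is what you end up browsing in the dashboard.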

The levers you control are primarily in how you structure your LangChain application and how you configure the tracing. You can add custom naming to your traces, tag them for easier filtering, and even log arbitrary data alongside the standard trace events. The environment variables set through os.environ do the basic wiring: LANGCHAIN_TRACING_V2 turns tracing on, LANGCHAIN_API_KEY authenticates you, and LANGCHAIN_PROJECT directs traces to the correct project.
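As a rough sketch of the naming and tagging levers, a LangChain runnable accepts a config dictionary at invocation time, and the `run_name`, `tags`, and `metadata` keys in it are carried onto the trace. The helper `build_trace_config` and the example values below are hypothetical, introduced only for illustration; the snippet builds the dictionary without calling LangChain, so it runs on its own.

```python
from typing import Optional


def build_trace_config(
    run_name: str,
    tags: Optional[list] = None,
    metadata: Optional[dict] = None,
) -> dict:
    """Build a config dict using the `run_name`, `tags`, and `metadata` keys.

    Hypothetical helper: LangChain does not ship this function; it just
    assembles the plain dictionary that `invoke` accepts as `config`.
    """
    config: dict = {"run_name": run_name}
    if tags:
        config["tags"] = list(tags)
    if metadata:
        config["metadata"] = dict(metadata)
    return config


# Example values are made up for this sketch.
config = build_trace_config(
    run_name="keyword-answer-chain",
    tags=["experiment", "gpt-3.5-turbo"],
    metadata={"user_tier": "free", "app_version": "0.1.0"},
)
print(config)

# Passed at invocation time, for example:
# response = chain.invoke(chain_input, config=config)
```

Tags are useful for filtering runs in the dashboard, while metadata is a natural place for per-request details such as a user tier or app version.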

You can also use LangSmith to evaluate your chain’s performance. By comparing the outputs of your chain against expected results for a set of test cases, you can identify regressions and measure improvements. This is done through LangSmith’s built-in evaluation tools, where you define datasets and run your chain against them, then score the results.
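The evaluation loop described above can be sketched locally in a few lines. This is a plain-Python stand-in for the concept, not LangSmith's evaluation API: `Example`, `contains_expected`, `evaluate_locally`, and `fake_chain` are hypothetical names, and `fake_chain` returns canned answers in place of a real chain so the snippet runs without API keys.

```python
from dataclasses import dataclass


@dataclass
class Example:
    question: str
    expected: str  # phrase we expect to find in a good answer


def contains_expected(output: str, expected: str) -> float:
    """Score 1.0 if the expected phrase appears in the output (case-insensitive)."""
    return 1.0 if expected.lower() in output.lower() else 0.0


def evaluate_locally(target, dataset, scorer):
    """Run `target` on every example, score each output, and average the scores."""
    rows = []
    for example in dataset:
        output = target(example.question)
        rows.append({
            "question": example.question,
            "output": output,
            "score": scorer(output, example.expected),
        })
    mean_score = sum(row["score"] for row in rows) / len(rows) if rows else 0.0
    return {"rows": rows, "mean_score": mean_score}


def fake_chain(question: str) -> str:
    """Stand-in for a real chain; returns canned answers for known questions."""
    canned = {
        "What is 2 + 2?": "The answer is 4.",
        "What is the capital of France?": "Paris is the capital of France.",
        "Who created Python?": "Python was created by Guido van Rossum.",
    }
    return canned.get(question, "I don't know.")


dataset = [
    Example("What is 2 + 2?", "4"),
    Example("What is the capital of France?", "Paris"),
    Example("Who created Python?", "Guido van Rossum"),
    Example("What is the boiling point of water in Celsius?", "100"),
]

report = evaluate_locally(fake_chain, dataset, contains_expected)
print(f"mean score: {report['mean_score']:.2f}")
for row in report["rows"]:
    print(f"{row['score']:.0f}  {row['question']}")
```

Rerunning the same dataset after each prompt or model change is what surfaces regressions; the hosted tooling adds stored datasets, richer evaluators, and side-by-side comparison on top of this basic loop.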

The most surprising thing about LangSmith tracing is how much detail it reveals about what the LLM actually did, even for seemingly simple tasks. You can see the exact text the LLM generated, along with token usage and latency, the intermediate steps it took (if you’re using agents), and how the data flows between different components of your chain. This allows you to debug not just code errors, but also "reasoning errors" where the LLM might be hallucinating or misinterpreting instructions in subtle ways.

If you’re using complex chains with many parallel branches or conditional logic, understanding how LangSmith visually represents these parallel executions and how to navigate between them in the UI will be your next challenge.
