Integrate the Gemini API with LangChain (2026)

The Gemini API can be integrated with LangChain by configuring the ChatGoogleGenerativeAI class with your API key and desired model.

import os
from langchain_google_genai import ChatGoogleGenerativeAI

# Set your Google API key as an environment variable
# export GOOGLE_API_KEY='YOUR_API_KEY'

llm = ChatGoogleGenerativeAI(model="gemini-pro", 
                             temperature=0.7, 
                             top_p=0.9, 
                             google_api_key=os.environ.get("GOOGLE_API_KEY"))

# Example usage:
response = llm.invoke("Tell me a short story about a robot learning to love.")
print(response.content)

This code snippet initializes the ChatGoogleGenerativeAI model, which is LangChain’s interface for interacting with Gemini. You provide your google_api_key (best practice is to keep it in an environment variable; if you omit the argument, the class reads GOOGLE_API_KEY from the environment itself), specify the model you want to use (e.g., "gemini-pro"), and can tune parameters like temperature for creativity and top_p for nucleus sampling. The invoke method then sends your prompt to the Gemini API and returns a message object whose content attribute holds the generated text.
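
To make the two sampling parameters concrete, here is a minimal pure-Python sketch of temperature scaling and nucleus (top_p) filtering. It illustrates the general technique only and is not how Gemini implements sampling internally; the logits are made-up numbers and the helper names apply_temperature and top_p_filter are hypothetical.

```python
import math

def apply_temperature(logits, temperature):
    """Softmax over logits divided by the temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def top_p_filter(probs, top_p):
    """Keep the most likely tokens until their cumulative probability
    reaches top_p, then renormalize the kept probabilities."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    mass = sum(probs[i] for i in kept)
    return {i: probs[i] / mass for i in kept}

logits = [2.0, 1.0, 0.5, -1.0]  # made-up scores for four candidate tokens

sharp = apply_temperature(logits, 0.2)  # low temperature: mass piles onto the top token
flat = apply_temperature(logits, 2.0)   # high temperature: mass spreads out
nucleus = top_p_filter(apply_temperature(logits, 0.7), 0.9)

print("top-token probability at T=0.2:", round(sharp[0], 3))
print("top-token probability at T=2.0:", round(flat[0], 3))
print("token indices kept by top_p=0.9:", sorted(nucleus))
```

Lower temperature concentrates probability on the most likely token, while top_p discards the unlikely tail before a token is sampled.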

The value of integrating Gemini with LangChain goes beyond text completion: Gemini can serve as the reasoning component of a larger application. LangChain supplies the surrounding structure, so the model’s ability to use context and follow instructions can drive multi-step workflows such as interpreting a user’s request, retrieving information, and generating a contextually relevant response or action.

Let’s break down how this works in practice. LangChain provides a framework for building applications with LLMs. It offers abstractions for:

Models: As seen above, ChatGoogleGenerativeAI is the LangChain wrapper for the Gemini API.
Prompts: Templates for constructing input to the LLM, allowing for dynamic insertion of variables.
Chains: Sequences of calls to LLMs or other utilities, enabling multi-step reasoning.
Agents: LLMs that use tools (like search engines, databases, or other APIs) to decide what actions to take and in what order.
Memory: Mechanisms for persisting state between calls to an LLM, allowing for conversational context.
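
Of these abstractions, memory is the easiest to picture in plain Python: the application keeps the message history and resends it on every call. The sketch below uses only the standard library; ConversationMemory and echo_model are hypothetical stand-ins and not LangChain classes.

```python
def echo_model(messages):
    """Stand-in for an LLM call: reports how many messages it received."""
    last_text = messages[-1][1]
    return f"(saw {len(messages)} messages) You said: {last_text}"

class ConversationMemory:
    """Keeps the running message list and resends all of it on each turn."""

    def __init__(self, system_prompt):
        self.messages = [("system", system_prompt)]

    def ask(self, model, user_text):
        self.messages.append(("user", user_text))
        reply = model(self.messages)  # the model sees the full history
        self.messages.append(("assistant", reply))
        return reply

memory = ConversationMemory("You are a helpful assistant.")
first = memory.ask(echo_model, "My name is Ada.")
second = memory.ask(echo_model, "What is my name?")
print(first)
print(second)
```

The second call receives four messages instead of two, which is what lets a real model refer back to earlier turns.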

Consider a scenario where you want to build a chatbot that can answer questions about your company’s documentation.

1. Setting up the Environment:

First, ensure you have the necessary libraries installed:

pip install langchain langchain-google-genai

And set your API key:

export GOOGLE_API_KEY='YOUR_API_KEY'
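
Before constructing the model, it can help to fail early with a clear message when the key is missing. The following is a small sketch using only the standard library; require_api_key is a hypothetical helper, not part of LangChain or the Gemini SDK.

```python
import os

def require_api_key(env=os.environ, name="GOOGLE_API_KEY"):
    """Return the key stored under `name`, or raise if it is missing or blank."""
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; export it before creating the model.")
    return key

# Plain dicts stand in for the real environment in this demonstration.
print(require_api_key({"GOOGLE_API_KEY": "demo-key"}))
try:
    require_api_key({})
except RuntimeError as err:
    print(err)
```

In an application you would call require_api_key() with no arguments so that it reads the real process environment.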

2. Simple Question Answering:

A basic integration involves directly calling the model.

from langchain_google_genai import ChatGoogleGenerativeAI
from langchain_core.messages import HumanMessage

llm = ChatGoogleGenerativeAI(model="gemini-pro")

response = llm.invoke([
    HumanMessage(content="What is the capital of France?")
])

print(response.content)
# Example output (wording may vary): Paris

3. Using Prompt Templates:

To make prompts more dynamic and reusable, LangChain offers prompt templates. For chat models such as Gemini, use ChatPromptTemplate.

from langchain_google_genai import ChatGoogleGenerativeAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser

llm = ChatGoogleGenerativeAI(model="gemini-pro")

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant that translates English to {language}."),
    ("user", "{text}")
])

output_parser = StrOutputParser()

chain = prompt | llm | output_parser

# Example: Translate English to Spanish
print(chain.invoke({"language": "Spanish", "text": "Hello, how are you?"}))
# Example output (wording may vary): Hola, ¿cómo estás?

# Example: Translate English to French
print(chain.invoke({"language": "French", "text": "What is the weather like today?"}))
# Example output (wording may vary): Quel temps fait-il aujourd'hui ?

Here, we define a prompt that expects a language and text variable. The | operator chains the prompt, the LLM, and an output parser (StrOutputParser to get just the string content) together.
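
To see what the | operator is doing, here is a toy re-creation of the pattern in plain Python. It is not LangChain’s actual implementation; the Step class and the three stand-in steps are hypothetical and exist only to show how piping builds a left-to-right pipeline.

```python
class Step:
    """Wraps a function so that steps can be composed with `|`."""

    def __init__(self, func):
        self.func = func

    def __or__(self, other):
        # (a | b) builds a new step that runs a, then feeds its result to b.
        return Step(lambda value: other.func(self.func(value)))

    def invoke(self, value):
        return self.func(value)

# Hypothetical stand-ins for the prompt, the model, and the output parser.
fill_prompt = Step(lambda v: f"Translate English to {v['language']}: {v['text']}")
fake_llm = Step(lambda prompt: {"content": prompt.upper()})
parse_text = Step(lambda message: message["content"])

toy_chain = fill_prompt | fake_llm | parse_text
print(toy_chain.invoke({"language": "Spanish", "text": "Hello"}))
# prints: TRANSLATE ENGLISH TO SPANISH: HELLO
```

Each step only needs to accept the previous step’s output, which is why the prompt, model, and parser can be swapped independently.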

4. Building a Retrieval-Augmented Generation (RAG) System:

This is where the integration truly shines. RAG combines an LLM with an external knowledge base. Gemini’s strong reasoning capabilities make it ideal for understanding retrieved documents and synthesizing answers.

Imagine you have a collection of documents. You’d typically:

a. Load and Chunk Documents: Split large documents into smaller, manageable pieces.
b. Create Embeddings: Convert these chunks into numerical vectors using an embedding model.
c. Store Embeddings: Save these vectors in a vector database (e.g., Chroma, FAISS).
d. Retrieve Relevant Chunks: When a user asks a question, embed the question and find the most similar document chunks in the vector database.
e. Augment Prompt: Feed the retrieved chunks along with the original question to the LLM.
f. Generate Answer: The LLM uses the provided context to answer the question.
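
Steps a through d can be sketched without any external service. The example below is a toy under stated assumptions: word counts stand in for a real embedding model, a Python list stands in for the vector database, and the sample document and the helper names (chunk_words, embed, cosine, retrieve) are invented for illustration.

```python
import math
import re
from collections import Counter

def chunk_words(text, chunk_size=8, overlap=2):
    """Step a: split text into word windows that overlap by `overlap` words."""
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last window already reached the end of the text
    return chunks

def embed(text):
    """Step b (toy version): lowercase word counts instead of a learned vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(question, store, k=1):
    """Step d: rank stored chunks by similarity to the question."""
    query = embed(question)
    ranked = sorted(store, key=lambda item: cosine(query, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]

document = (
    "Refunds are issued within 14 days of purchase. "
    "Support is available by email on weekdays. "
    "Shipping to Europe takes about five business days."
)

chunks = chunk_words(document, chunk_size=8, overlap=2)
store = [(chunk, embed(chunk)) for chunk in chunks]  # step c: the "vector store"
top = retrieve("How long do refunds take?", store, k=1)
print(top[0])
```

A production system replaces embed with an embedding model and the list with a vector database, but the retrieval logic keeps the same shape: embed the question, score it against stored vectors, and return the closest chunks.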

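# Conceptual example. Besides the packages installed earlier, it needs:
#   pip install langchain-chroma langchain-community langchain-text-splitters beautifulsoup4
from langchain_google_genai import ChatGoogleGenerativeAI, GoogleGenerativeAIEmbeddings
from langchain_community.document_loaders import WebBaseLoader
from langchain_core.prompts import ChatPromptTemplate
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_chroma import Chroma
# On LangChain 1.x these two helpers may live in the langchain_classic package instead.
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain.chains import create_retrieval_chain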

# 1. Load and chunk documents (example using a web page)
loader = WebBaseLoader("https://www.langchain.com/about")
docs = loader.load()

text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
splits = text_splitter.split_documents(docs)

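# 2. Create embeddings and store in a vector database
# (the embedding model name is an example; use one available to your account)
vectorstore = Chroma.from_documents(
    documents=splits,
    embedding=GoogleGenerativeAIEmbeddings(model="models/gemini-embedding-001"),
)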

# 3. Create retriever
retriever = vectorstore.as_retriever()

# 4. Initialize LLM
llm = ChatGoogleGenerativeAI(model="gemini-pro")

# 5. Create retrieval chain
# This chain takes retrieved documents and the question,
# formats them into a prompt, and passes them to the LLM.
prompt = ChatPromptTemplate.from_template("""Answer the following question based only on the provided context:
<context>
{context}
</context>
Question: {input}""")

document_chain = create_stuff_documents_chain(llm, prompt)

# 6. Create the full retrieval chain
retrieval_chain = create_retrieval_chain(retriever, document_chain)

# 7. Ask a question
response = retrieval_chain.invoke({"input": "What is LangChain focused on?"})
print(response["answer"])

What makes this pattern effective is that the LLM’s instruction following, combined with retrieved context, turns it from an open-ended text generator into a question answerer constrained by evidence. Rather than relying only on what it learned in training, the model performs a specific task (answering a question) using the supplied passages, which grounds its output in your documents. This depends on the prompt: it must include the retrieved context and instruct the LLM to use only that context.

When you use create_stuff_documents_chain, LangChain constructs a prompt where the context variable is populated with the text of the retrieved document chunks, and input is your original question. The LLM then processes this combined input. create_retrieval_chain orchestrates the whole flow: it fetches documents via the retriever, passes them to the document_chain for synthesis, and returns a dictionary whose answer key holds the generated text, which is why the example prints response["answer"].
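
The "stuffing" step itself is simple string assembly, which the following standard-library sketch imitates. It is an approximation for illustration, not LangChain’s code: stuff_documents is a hypothetical helper, the blank-line separator is an assumption, and the two retrieved strings are placeholder text.

```python
TEMPLATE = (
    "Answer the following question based only on the provided context:\n"
    "<context>\n{context}\n</context>\n"
    "Question: {input}"
)

def stuff_documents(template, documents, question, separator="\n\n"):
    """Join every retrieved chunk into one context string and fill the template."""
    context = separator.join(documents)
    return template.format(context=context, input=question)

# Placeholder chunks standing in for what a retriever would return.
retrieved = [
    "Placeholder chunk one about the product.",
    "Placeholder chunk two about pricing.",
]

filled = stuff_documents(TEMPLATE, retrieved, "What does the product cost?")
print(filled)
```

Because every chunk is placed into a single prompt, this approach works only while the combined text fits within the model’s context window.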

The next step in mastering Gemini with LangChain is exploring agents, where the LLM can dynamically choose which tools to use to answer complex, multi-step queries.