Gemini’s ability to generate structured JSON output is a powerful tool, but without explicit guidance, it can produce output that looks like JSON but fails strict validation.

Let’s see this in action. Imagine we want Gemini to act as a product catalog API, returning details for a specific product ID.

Here’s a prompt and a hypothetical (but common) Gemini response:

Prompt:

You are a product catalog API. Given a product ID, return a JSON object with the product's details.
Product ID: 12345

Gemini’s (Potentially Flawed) Response:

{
  "product_id": "12345",
  "name": "Wireless Bluetooth Headphones",
  "description": "High-fidelity sound, 20-hour battery life, comfortable earcups.",
  "price": 79.99,
  "in_stock": true,
  "tags": ["audio", "electronics", "headphones"],
  "variants": [
    {
      "color": "Black",
      "sku": "HP-BT-BLK-12345"
    },
    {
      "color": "White",
      "sku": "HP-BT-WHT-12345"
    }
  ],
  "rating": 4.5
}

This looks good, right? But what if the prompt were slightly different, or the model simply made different choices on another run and produced subtly inconsistent output?

Consider this scenario:

Prompt:

You are a product catalog API. Given a product ID, return a JSON object with the product's details.
Product ID: 67890

Gemini’s (Potentially Flawed) Response:

{
  "product_id": 67890,
  "name": "Smart LED Light Bulb",
  "description": "Controllable via app, millions of colors, energy efficient.",
  "price": "25.50",
  "in_stock": "yes",
  "tags": "lighting, smart home, bulbs",
  "variants": null,
  "rating": 4.8
}

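This output is syntactically valid JSON, but it would fail validation against the structure a consumer expects based on the first response. product_id is an integer, not a string. price is a string, not a number. in_stock is the string "yes" instead of a boolean. tags is a single comma-separated string instead of an array of strings. variants is null instead of an array of objects (an empty array would be the natural way to express "no variants").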

The problem is that Gemini, by default, is a language model focused on generating human-readable text. While it can mimic JSON structure, it doesn’t inherently understand or enforce the strict data types, formats, and constraints required by a JSON schema. It might infer types, but those inferences aren’t guaranteed to align with your application’s expectations.
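To make the distinction concrete, here is a minimal Python sketch that uses only the standard library. It shows that the second response parses without error, and that the mismatch only becomes visible once the parsed types are inspected. The `raw_response` string is copied from the example above; `type_names` is a hypothetical helper introduced for illustration.

```python
import json

# The second response from the example above, verbatim.
raw_response = """
{
  "product_id": 67890,
  "name": "Smart LED Light Bulb",
  "description": "Controllable via app, millions of colors, energy efficient.",
  "price": "25.50",
  "in_stock": "yes",
  "tags": "lighting, smart home, bulbs",
  "variants": null,
  "rating": 4.8
}
"""


def type_names(payload: dict) -> dict:
    """Map each top-level key to the Python type name of its parsed value."""
    return {key: type(value).__name__ for key, value in payload.items()}


# json.loads succeeds: the text is syntactically valid JSON.
product = json.loads(raw_response)
types = type_names(product)

# Printing the types exposes the drift from the first response.
for key, name in types.items():
    print(f"{key}: {name}")
```

A parser alone will therefore never catch this class of problem; something has to compare the parsed values against an explicit description of what is expected.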

The solution is to provide Gemini with a JSON schema and instruct it to generate output that conforms to that schema. This acts as a contract, defining the expected structure, data types, required fields, and even value constraints.

Let’s define a schema for our product catalog:

{
  "type": "object",
  "properties": {
    "product_id": {
      "type": "string",
      "description": "Unique identifier for the product."
    },
    "name": {
      "type": "string",
      "description": "The name of the product."
    },
    "description": {
      "type": "string",
      "description": "A detailed description of the product."
    },
    "price": {
      "type": "number",
      "format": "float",
      "description": "The price of the product."
    },
    "in_stock": {
      "type": "boolean",
      "description": "Indicates if the product is currently in stock."
    },
    "tags": {
      "type": "array",
      "items": {
        "type": "string"
      },
      "description": "Keywords associated with the product."
    },
    "variants": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "color": {
            "type": "string"
          },
          "sku": {
            "type": "string"
          }
        },
        "required": ["color", "sku"]
      },
      "description": "Available product variants."
    },
    "rating": {
      "type": "number",
      "format": "float",
      "minimum": 0,
      "maximum": 5,
      "description": "Customer rating of the product."
    }
  },
  "required": [
    "product_id",
    "name",
    "price",
    "in_stock"
  ]
}
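A schema like this is also something your own code can check responses against. The following is a minimal sketch, not a complete JSON Schema implementation: it handles only `type`, `properties`, `required`, and `items`, and ignores everything else (including `format`, `minimum`, and `maximum`). The `validate` function, `TYPE_CHECKS` table, and the abbreviated `SCHEMA` (descriptions and formats omitted) are assumptions made for illustration; a production system would more likely rely on a full validator library.

```python
from typing import Any, List

# Python checks for the JSON Schema type names used in the schema above.
# bool is excluded from "number" because Python treats True/False as ints.
TYPE_CHECKS = {
    "string": lambda v: isinstance(v, str),
    "number": lambda v: isinstance(v, (int, float)) and not isinstance(v, bool),
    "boolean": lambda v: isinstance(v, bool),
    "array": lambda v: isinstance(v, list),
    "object": lambda v: isinstance(v, dict),
}


def validate(value: Any, schema: dict, path: str = "$") -> List[str]:
    """Return a list of violations; an empty list means the value conforms."""
    expected = schema.get("type")
    if expected in TYPE_CHECKS and not TYPE_CHECKS[expected](value):
        # A wrong type makes deeper checks meaningless, so stop here.
        return [f"{path}: expected {expected}, got {type(value).__name__}"]

    errors: List[str] = []
    if expected == "object":
        for key in schema.get("required", []):
            if key not in value:
                errors.append(f"{path}: missing required key '{key}'")
        for key, subschema in schema.get("properties", {}).items():
            if key in value:
                errors.extend(validate(value[key], subschema, f"{path}.{key}"))
    elif expected == "array" and "items" in schema:
        for index, item in enumerate(value):
            errors.extend(validate(item, schema["items"], f"{path}[{index}]"))
    return errors


# Abbreviated copy of the product schema (types and required keys only).
SCHEMA = {
    "type": "object",
    "properties": {
        "product_id": {"type": "string"},
        "name": {"type": "string"},
        "description": {"type": "string"},
        "price": {"type": "number"},
        "in_stock": {"type": "boolean"},
        "tags": {"type": "array", "items": {"type": "string"}},
        "variants": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "color": {"type": "string"},
                    "sku": {"type": "string"},
                },
                "required": ["color", "sku"],
            },
        },
        "rating": {"type": "number"},
    },
    "required": ["product_id", "name", "price", "in_stock"],
}

# The problematic response for product 67890, as a parsed Python dict.
flawed = {
    "product_id": 67890,
    "name": "Smart LED Light Bulb",
    "price": "25.50",
    "in_stock": "yes",
    "tags": "lighting, smart home, bulbs",
    "variants": None,
    "rating": 4.8,
}

violations = validate(flawed, SCHEMA)
for violation in violations:
    print(violation)
```

Returning a list of violations rather than raising on the first one is a deliberate choice in this sketch: it lets a caller log or report every mismatch in a response at once.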

Now, we incorporate this schema into our prompt. The key is to explicitly tell Gemini to adhere to the schema and to provide the schema itself.
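If the schema already lives in your code as a data structure, the prompt can be assembled from it rather than pasted by hand, which keeps the prompt and any validation logic in sync. The sketch below assumes a hypothetical `build_prompt` helper and uses an abbreviated schema for brevity; it only builds the prompt text and does not call any Gemini API.

```python
import json


def build_prompt(schema: dict, product_id: str) -> str:
    """Combine the instruction, the serialized schema, and the product ID."""
    schema_text = json.dumps(schema, indent=2)
    return (
        "You are a product catalog API. Given a product ID, return a JSON "
        "object with the product's details that strictly conforms to the "
        "following JSON schema:\n\n"
        f"{schema_text}\n\n"
        f"Product ID: {product_id}"
    )


# Abbreviated schema for illustration; in practice pass the full schema.
schema = {
    "type": "object",
    "properties": {
        "product_id": {"type": "string"},
        "price": {"type": "number"},
    },
    "required": ["product_id", "price"],
}

prompt = build_prompt(schema, "12345")
print(prompt)
```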

Revised Prompt:

You are a product catalog API. Given a product ID, return a JSON object with the product's details that strictly conforms to the following JSON schema:

```json
{
  "type": "object",
  "properties": {
    "product_id": {
      "type": "string",
      "description": "Unique identifier for the product."
    },
    "name": {
      "type": "string",
      "description": "The name of the product."
    },
    "description": {
      "type": "string",
      "description": "A detailed description of the product."
    },
    "price": {
      "type": "number",
      "format": "float",
      "description": "The price of the product."
    },
    "in_stock": {
      "type": "boolean",
      "description": "Indicates if the product is currently in stock."
    },
    "tags": {
      "type": "array",
      "items": {
        "type": "string"
      },
      "description": "Keywords associated with the product."
    },
    "variants": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "color": {
            "type": "string"
          },
          "sku": {
            "type": "string"
          }
        },
        "required": ["color", "sku"]
      },
      "description": "Available product variants."
    },
    "rating": {
      "type": "number",
      "format": "float",
      "minimum": 0,
      "maximum": 5,
      "description": "Customer rating of the product."
    }
  },
  "required": [
    "product_id",
    "name",
    "price",
    "in_stock"
  ]
}
```

Product ID: 12345

Gemini’s (Schema-Compliant) Response:

```json
{
  "product_id": "12345",
  "name": "Wireless Bluetooth Headphones",
  "description": "High-fidelity sound, 20-hour battery life, comfortable earcups.",
  "price": 79.99,
  "in_stock": true,
  "tags": ["audio", "electronics", "headphones"],
  "variants": [
    {
      "color": "Black",
      "sku": "HP-BT-BLK-12345"
    },
    {
      "color": "White",
      "sku": "HP-BT-WHT-12345"
    }
  ],
  "rating": 4.5
}
```

This output is now strictly valid against the provided schema. product_id is a string, price is a number, in_stock is a boolean, tags is an array of strings, and variants is an array of objects (or could be an empty array [] if there were no variants, which would also be valid).

The most surprising aspect of this approach is how little explicit instruction Gemini needs beyond the schema itself. You don’t need to write lengthy prose explaining data types; the schema is the explanation. The model treats the schema as a set of rules for its output and shapes what it generates to fit the constraints it defines.

The levers you control are entirely within the JSON schema definition. You define what constitutes valid output. This includes:

  • type: string, number, integer, boolean, object, array, null.
  • properties: For objects, defines the expected keys and their corresponding schemas.
  • required: An array of keys that must be present in an object.
  • items: For arrays, defines the schema for each element within the array.
  • enum: Restricts a value to a specific set of allowed values (e.g., ["active", "inactive"]).
  • minLength, maxLength: For strings.
  • minimum, maximum: For numbers.
  • format: Predefined formats like date-time, email, uri, or custom ones.
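To show how the value-level keywords in this list behave, here is a small sketch that checks a single value against `enum`, `minLength`/`maxLength`, and `minimum`/`maximum`. The `check_constraints` function is hypothetical, and the `status_schema` and `name_schema` examples (including the length limits) are assumptions made for illustration; only the rating bounds come from the schema earlier in this article.

```python
from typing import Any, List


def check_constraints(value: Any, schema: dict) -> List[str]:
    """Check enum, minLength/maxLength, and minimum/maximum for one value."""
    problems: List[str] = []

    if "enum" in schema and value not in schema["enum"]:
        problems.append(f"{value!r} is not one of {schema['enum']}")

    if isinstance(value, str):
        if "minLength" in schema and len(value) < schema["minLength"]:
            problems.append(f"string shorter than {schema['minLength']}")
        if "maxLength" in schema and len(value) > schema["maxLength"]:
            problems.append(f"string longer than {schema['maxLength']}")

    # bool is a subclass of int in Python, so exclude it from numeric checks.
    if isinstance(value, (int, float)) and not isinstance(value, bool):
        if "minimum" in schema and value < schema["minimum"]:
            problems.append(f"{value} is below minimum {schema['minimum']}")
        if "maximum" in schema and value > schema["maximum"]:
            problems.append(f"{value} is above maximum {schema['maximum']}")

    return problems


# Hypothetical field schemas exercising each keyword.
status_schema = {"type": "string", "enum": ["active", "inactive"]}
rating_schema = {"type": "number", "minimum": 0, "maximum": 5}
name_schema = {"type": "string", "minLength": 1, "maxLength": 80}

print(check_constraints("archived", status_schema))  # value outside the enum
print(check_constraints(5.5, rating_schema))         # above the maximum
print(check_constraints("", name_schema))            # shorter than minLength
print(check_constraints(4.5, rating_schema))         # within bounds
```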

After enforcing structured JSON output with schema validation, the next challenge is handling cases where Gemini cannot fulfill the request according to the schema, or when the generated output is valid but semantically incorrect for the given input.
