LLMs are surprisingly bad at spitting out valid JSON, even when you explicitly tell them to.
Let’s see what happens when we ask a model for a simple JSON object.
import openai

# OpenAI() reads the API key from the OPENAI_API_KEY environment variable.
client = openai.OpenAI()

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {
            "role": "system",
            "content": "You are a helpful assistant that formats output as JSON. Respond with a JSON object containing a single key 'message' with the value 'Hello, world!'.",
        },
        {
            "role": "user",
            "content": "Give me the JSON.",
        },
    ],
    # JSON mode: ask the API for a syntactically valid JSON object.
    response_format={"type": "json_object"},
)
print(response.choices[0].message.content)
Output:
{
  "message": "Hello, world!"
}
That looks good, right? But what if the prompt is slightly more complex, or the model gets a bit "creative"?
Consider this scenario: we want to extract structured information about a user from a natural language description.
import openai

client = openai.OpenAI()

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {
            # The system message names the keys and the expected type of 'age'.
            "role": "system",
            "content": "Extract the user's name and age from the following text and return it as a JSON object with keys 'name' and 'age'. The age should be an integer.",
        },
        {
            "role": "user",
            "content": "The user's name is Alice, and she is 30 years old.",
        },
    ],
    response_format={"type": "json_object"},
)
print(response.choices[0].message.content)
Output:
{
  "name": "Alice",
  "age": 30
}
Still perfect. The response_format={"type": "json_object"} parameter (JSON mode) is a powerful tool provided by APIs like OpenAI’s. When it is enabled, the API constrains generation so that the message content parses as a single JSON object, with properly quoted keys and string values, correctly placed commas, and valid data types. It’s not just a suggestion in the prompt; it’s a constraint applied at the API level. Two caveats apply with OpenAI’s API: the word "JSON" must still appear somewhere in your messages, or the request is rejected, and the content can be cut off mid-object if the response hits the token limit.
However, the real challenge arises when you need to ensure semantic correctness and handle edge cases. The json_object format targets syntactic validity. It does not guarantee that the content matches your exact schema, and it does not guarantee that the output is complete.
For instance, imagine you have a predefined JSON schema and want the LLM to fill it. The json_object flag helps ensure the output is JSON, but doesn’t inherently validate against your specific schema.
Here’s how you’d typically integrate this:
- Define your target JSON structure: This could be a Python dictionary or a JSON schema.
- Craft your prompt: Clearly instruct the model to populate this structure.
- Use response_format={"type": "json_object"}: This is critical for syntactic correctness.
- Post-process (optional but recommended): Validate the generated JSON against your schema using libraries like jsonschema in Python.
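The steps above can be sketched end to end with the standard library alone. This is a minimal sketch, not a definitive implementation: USER_SCHEMA, build_request, and check_against_schema are hypothetical names, and the hand-rolled checker only covers required keys and primitive types. A real project would more likely pass a proper JSON Schema to the jsonschema library.

```python
import json

# Hypothetical target structure: key -> expected Python type.
USER_SCHEMA = {"name": str, "age": int}


def build_request(text: str) -> dict:
    """Assemble keyword arguments for client.chat.completions.create()."""
    system = (
        "Extract the user's name and age from the text and return a JSON "
        'object with keys "name" (string) and "age" (integer).'
    )
    return {
        "model": "gpt-4o",
        "messages": [
            {"role": "system", "content": system},
            {"role": "user", "content": text},
        ],
        "response_format": {"type": "json_object"},
    }


def check_against_schema(raw: str, schema: dict) -> dict:
    """Parse raw model output and verify keys and types; raise ValueError on mismatch."""
    data = json.loads(raw)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object at the top level")
    missing = schema.keys() - data.keys()
    extra = data.keys() - schema.keys()
    if missing or extra:
        raise ValueError(
            f"missing keys: {sorted(missing)}, unexpected keys: {sorted(extra)}"
        )
    for key, expected in schema.items():
        value = data[key]
        # bool is a subclass of int in Python, so reject it explicitly for int fields.
        if not isinstance(value, expected) or (expected is int and isinstance(value, bool)):
            raise ValueError(
                f"{key!r} should be {expected.__name__}, got {type(value).__name__}"
            )
    return data
```

With a client in hand, the two pieces would be combined as client.chat.completions.create(**build_request(text)), followed by check_against_schema(response.choices[0].message.content, USER_SCHEMA).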
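The prompt is key. Instead of just saying "return JSON," be explicit about the shape and include a worked example, written with the double quotes that real JSON requires. For instance: Return a JSON object with the following structure: {"name": <string>, "age": <integer>}. For the user "Bob, who is 25", the output should be {"name": "Bob", "age": 25}.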
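One way to keep such a prompt consistent is to generate it from data instead of typing the example by hand. The sketch below is illustrative only; build_system_prompt and its parameters are hypothetical names. Serializing the example with json.dumps means the example shown to the model is itself valid JSON.

```python
import json


def build_system_prompt(structure: dict, example_input: str, example_output: dict) -> str:
    """Compose an explicit instruction that shows the target shape and one example."""
    return (
        "Return a JSON object with the following structure: "
        f"{json.dumps(structure)}. "
        f"For the input {example_input!r}, the output should be "
        f"{json.dumps(example_output)}."
    )


prompt = build_system_prompt(
    {"name": "string", "age": "integer"},
    "Bob, who is 25",
    {"name": "Bob", "age": 25},
)
print(prompt)
```

The resulting string can be used directly as the content of the system message.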
The json_object flag doesn’t make the model understand your schema perfectly; it makes the model emit syntactically valid JSON while it attempts to follow your instructions. If the model hallucinates a field or uses the wrong data type, the json_object flag won’t catch it. You still need robust error handling and validation on your end.
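One common shape for that error handling is a small retry wrapper: parse, validate, and ask again if either step fails. The sketch below assumes two caller-supplied callables, both hypothetical: call_model returns the raw reply text, and validate returns a list of problems (empty when the data is acceptable).

```python
import json
from typing import Callable


def get_valid_json(
    call_model: Callable[[], str],
    validate: Callable[[object], list],
    max_attempts: int = 3,
) -> object:
    """Call the model until its reply parses and validates, or give up."""
    last_problems: list = []
    for _ in range(max_attempts):
        raw = call_model()
        try:
            data = json.loads(raw)
        except json.JSONDecodeError as exc:
            # Syntactic failure: remember why, then try again.
            last_problems = [f"invalid JSON: {exc.msg}"]
            continue
        # Semantic failure: the JSON parsed but broke the caller's rules.
        last_problems = validate(data)
        if not last_problems:
            return data
    raise ValueError(f"gave up after {max_attempts} attempts: {last_problems}")
```

Keeping the model call and the validator as parameters makes the wrapper easy to test with canned replies, and leaves the choice of validation library to the caller.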
The most surprising thing is that even with json_object enabled, models can sometimes produce output that looks like JSON but isn’t strictly valid, especially with complex nesting or very long outputs, though this is becoming rarer with newer models and API implementations. If parsing fails, check first for a preamble or postamble around the payload, such as "Here is the JSON you requested:" or a Markdown code fence labeled json. Wrappers like these are exactly what json_object mode is supposed to prevent.
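If a reply does arrive wrapped in extra text, a defensive parser can often still recover the payload. The following is a minimal sketch using the standard library's json.JSONDecoder.raw_decode, which parses one JSON value starting at a given index and ignores whatever follows it; extract_json_object is a hypothetical helper name.

```python
import json


def extract_json_object(text: str) -> dict:
    """Return the first JSON object embedded in text, ignoring surrounding chatter."""
    decoder = json.JSONDecoder()
    start = text.find("{")
    while start != -1:
        try:
            # raw_decode returns (value, end_index) and tolerates trailing text.
            value, _end = decoder.raw_decode(text, start)
            return value
        except json.JSONDecodeError:
            # This brace did not open a valid object; try the next one.
            start = text.find("{", start + 1)
    raise ValueError("no JSON object found in model output")


fence = "`" * 3  # avoids writing a literal code fence inside this example
reply = f'Here is the JSON you requested:\n{fence}json\n{{"name": "Alice", "age": 30}}\n{fence}'
print(extract_json_object(reply))  # {'name': 'Alice', 'age': 30}
```

This is a fallback rather than a fix: a reply that needs it is a signal that the prompt or the request parameters deserve a second look.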
When the model fails to produce valid JSON despite response_format={"type": "json_object"}, there are two usual suspects. The first is truncation: a long or deeply nested output hits the token limit and stops before the object is closed. The second is a competing instruction in the prompt, such as "print the JSON within a markdown code block," which may pull the model away from plain JSON.
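One failure that can be detected mechanically is truncation. Each choice in a Chat Completions response carries a finish_reason, and a value of "length" means generation stopped at the token limit, so the JSON is likely incomplete. The sketch below assumes the caller passes that field in; parse_complete_reply is a hypothetical helper name.

```python
import json
from typing import Optional


def parse_complete_reply(content: Optional[str], finish_reason: str) -> object:
    """Parse the reply only if the model finished on its own."""
    if finish_reason == "length":
        raise ValueError("output hit the token limit; the JSON is probably truncated")
    if finish_reason != "stop" or content is None:
        raise ValueError(f"no usable content (finish_reason={finish_reason!r})")
    return json.loads(content)
```

With a real response this would be called as parse_complete_reply(choice.message.content, choice.finish_reason), where choice is response.choices[0]. Raising the token limit or asking for a smaller object are the usual remedies when the truncation branch fires.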
The next problem you’ll likely encounter is handling semantic validation: ensuring the JSON data conforms to your application’s specific requirements, not just the general rules of JSON syntax.