The most surprising truth about fine-tuning LLMs for instruction following is that the model often doesn’t "understand" instructions in the way humans do; it learns to predict the pattern of instruction-response pairs it’s shown.
Let’s see this in action. Imagine we have a model we want to teach to summarize text. We’d feed it examples like this:
[
  {
    "instruction": "Summarize the following text:\n\n[Long article text here]",
    "input": "",
    "output": "[Concise summary here]"
  },
  {
    "instruction": "Provide a brief summary of the article below:\n\n[Another long article text]",
    "input": "",
    "output": "[Another concise summary]"
  }
]
The instruction field is the prompt we give the model, the input field can be used for additional context (though it’s empty in this summarization example), and output is what we expect the model to generate.
The system we’re interacting with, be it a fine-tuning script or an inference API, needs to know how to format these examples into a single, continuous string that the LLM can process. This formatting is crucial. Different models, and even different versions of the same model, are trained with specific template structures. If your formatting deviates from the template a model was trained on, output quality usually degrades, sometimes sharply.
Consider a common instruction-following template:
### Instruction:
{instruction}
### Response:
{output}
When we feed our summarization example into this template, it becomes:
### Instruction:
Summarize the following text:

[Long article text here]
### Response:
[Concise summary here]
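The ### Instruction: and ### Response: headers act as markers, signaling to the model where the user’s request ends and where the model’s generated output should begin. In this template they are ordinary text that the tokenizer splits into regular tokens, not dedicated special tokens, but because they appear in the same position in every example, the model learns to treat them as the boundary between the prompt and the desired completion.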
When using the input field, the template might look like this:
### Instruction:
{instruction}
### Input:
{input}
### Response:
{output}
And an example would format to:
### Instruction:
Translate the following sentence from English to French.
### Input:
Hello, how are you?
### Response:
Bonjour, comment ça va?
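One way to handle both cases is to keep two templates and pick one per example, depending on whether input is empty. The sketch below assumes the two templates shown above with single newlines between sections; the names PROMPT_NO_INPUT, PROMPT_WITH_INPUT, and build_text are hypothetical, and the exact spacing should match whatever your base model was trained on.

```python
# Hypothetical names; the spacing is an assumption and must match your model's template.
PROMPT_NO_INPUT = (
    "### Instruction:\n{instruction}\n"
    "### Response:\n{output}"
)
PROMPT_WITH_INPUT = (
    "### Instruction:\n{instruction}\n"
    "### Input:\n{input}\n"
    "### Response:\n{output}"
)


def build_text(example: dict) -> str:
    """Format one example, skipping the Input section when input is empty."""
    has_input = bool(example.get("input", "").strip())
    template = PROMPT_WITH_INPUT if has_input else PROMPT_NO_INPUT
    # str.format ignores keyword arguments the chosen template does not use.
    return template.format(
        instruction=example["instruction"],
        input=example.get("input", ""),
        output=example["output"],
    )


example = {
    "instruction": "Translate the following sentence from English to French.",
    "input": "Hello, how are you?",
    "output": "Bonjour, comment ça va?",
}
print(build_text(example))
```

Choosing the template per example avoids an empty ### Input: section in examples that have no extra context.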
The key is that the fine-tuning data must consistently adhere to the exact template the model expects. If the model was trained on a template that uses USER: and ASSISTANT:, prompting it with ### Instruction: and ### Response: puts it outside the pattern it learned. It might continue the prompt text instead of answering it, or misjudge where the instruction ends and the response begins.
The actual process of formatting often involves a simple string replacement. A Python script might iterate through your dataset (e.g., a list of dictionaries) and apply the template to each entry.
def format_example(example, template):
    """Fill the template's placeholders with the fields of one example."""
    formatted = template.replace("{instruction}", example["instruction"])
    formatted = formatted.replace("{input}", example["input"])
    formatted = formatted.replace("{output}", example["output"])
    return formatted

# Example usage
dataset = [
    {"instruction": "What is the capital of France?", "input": "", "output": "Paris"},
    {"instruction": "Convert Celsius to Fahrenheit.", "input": "100", "output": "212"},
]
template = "USER: {instruction}\n{input}\nASSISTANT: {output}"
for item in dataset:
    print(format_example(item, template))
This would produce the following. Because the first example’s input is empty, its {input} line is left blank:
USER: What is the capital of France?

ASSISTANT: Paris
USER: Convert Celsius to Fahrenheit.
100
ASSISTANT: 212
The \n within the template string ensures proper line breaks, which are also part of the learned pattern.
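One caveat with chained replace calls: each call re-scans text that an earlier call already substituted, so an instruction that happens to contain the literal text {output} would be rewritten as well. A minimal sketch of a single-pass alternative using Python's built-in str.format follows. The name format_example_safe is hypothetical, and the approach assumes the template contains no literal braces other than the three placeholders (any literal brace would need to be doubled).

```python
def format_example_safe(example: dict, template: str) -> str:
    """Fill all placeholders in one pass; substituted values are not re-scanned."""
    return template.format(
        instruction=example["instruction"],
        input=example["input"],
        output=example["output"],
    )


template = "USER: {instruction}\n{input}\nASSISTANT: {output}"
tricky = {
    "instruction": "Explain what the {output} placeholder does.",
    "input": "",
    "output": "It marks where the response goes.",
}

# With chained replace, "{output}" inside the instruction would be overwritten
# by the answer text; here it survives unchanged.
print(format_example_safe(tricky, template))
```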
The template does more than structure the prompt; it also shapes how the model learns to stop generating. The ### Response: marker tells the model where its output begins, and whatever follows {output} in the training text tells it where the output ends. If every training example ends with the same end-of-response marker, the model learns to emit that marker once its answer is complete.
One subtle but critical aspect of template design is the inclusion of a "stop token" for the model’s output. For example, if your template is USER: {instruction}\nASSISTANT: {output}, the model is trained to generate text following ASSISTANT:, but nothing in the template itself tells it when to stop. Many fine-tuning setups therefore append an end-of-sequence token such as </s>, or a custom token like [EOS], to each formatted training example, right after the output. Because that token is part of the training target, the model learns to emit it when its answer is complete. At inference, the prompt ends just after ASSISTANT:, and generation halts as soon as the model produces the stop token; many frameworks also let you specify additional stop strings. This is how you ensure the model doesn’t ramble on indefinitely.
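A minimal sketch of where a stop token typically fits: it is appended to each training example after the output, while the inference prompt ends right after the response marker. This is an illustration, not any particular framework's API. EOS is hard-coded to </s> as an assumption (in practice you would read it from your model's tokenizer), and build_training_text, build_inference_prompt, and truncate_at_stop are hypothetical helper names.

```python
EOS = "</s>"  # assumption: the real value comes from your model's tokenizer

PROMPT = "USER: {instruction}\nASSISTANT:"


def build_training_text(example: dict) -> str:
    """Prompt + expected output + EOS, so the model sees where answers end."""
    prompt = PROMPT.format(instruction=example["instruction"])
    return f"{prompt} {example['output']}{EOS}"


def build_inference_prompt(instruction: str) -> str:
    """Same prompt, but it ends right after the response marker."""
    return PROMPT.format(instruction=instruction)


def truncate_at_stop(generated: str, stops=(EOS, "\nUSER:")) -> str:
    """Cut raw generated text at the earliest stop string, if any appears."""
    cut = len(generated)
    for stop in stops:
        idx = generated.find(stop)
        if idx != -1:
            cut = min(cut, idx)
    return generated[:cut].strip()


example = {"instruction": "What is the capital of France?", "output": "Paris"}
print(build_training_text(example))
print(build_inference_prompt(example["instruction"]))
print(truncate_at_stop(" Paris</s>USER: next question"))
```

The truncation helper is a fallback for cases where raw text is returned past the stop token; when the generation loop itself halts on the end-of-sequence token, it is not needed.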
Once you’ve mastered instruction formatting, the next hurdle is understanding how to construct effective prompts that leverage the model’s instruction-following capabilities for more complex reasoning tasks.