A gRPC API’s versioning strategy is fundamentally about managing the evolution of its data structures and their field identifiers, not just the service methods themselves.
Let’s say you have a UserService with a User message:
syntax = "proto3";
package user_service.v1;
message User {
  int64 id = 1;
  string name = 2;
}
service UserService {
  rpc GetUser(GetUserRequest) returns (User);
}
message GetUserRequest {
  int64 user_id = 1;
}
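This defines User with id as field number 1 and name as field number 2. The package user_service.v1 matters too: it is part of every fully qualified type name (user_service.v1.User) and of the gRPC method path that clients call.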
Now, imagine you need to add a new field, email, to the User message.
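The "Wrong" Way (and where it breaks things):
If you edit the existing User message in place and deploy, the outcome depends entirely on what the edit is. Adding email under a fresh field number is wire-compatible, but the v1 contract now means different things to clients compiled at different times, and the same habit becomes a breaking change the moment an edit touches an existing field.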
syntax = "proto3";
package user_service.v1; // Still v1!
message User {
  int64 id = 1;
  string name = 2;
  string email = 3; // Added email
}
service UserService {
  rpc GetUser(GetUserRequest) returns (User);
}
message GetUserRequest {
  int64 user_id = 1;
}
An old client, compiled against the original User definition, will receive a User message from the new server. It expects id and name, and it sees email as an unknown field, which protobuf parsers skip. The reverse direction behaves the same way: if a newer client sends a message that includes email to an older server, the server skips the field it does not know. That skipping is silent, though: the older side ignores data the newer side may have expected it to honor, and nothing in the v1 contract signals the difference. More critically, if you change the type or meaning of an existing field, or reuse a field number, old and new binaries interpret the same bytes differently and you’re in immediate trouble.
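To make the unknown-field behavior concrete, here is a minimal sketch that hand-encodes the User message in the protobuf wire format and reads it back with a reader that only knows fields 1 and 2. It does not use the protobuf library; the helper names (encode_user, decode_user_v1_reader) are invented for illustration, and only the two wire types this message needs are handled.

```python
from typing import Optional, Tuple


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def decode_varint(buf: bytes, pos: int) -> Tuple[int, int]:
    """Decode a varint starting at pos; return (value, next_position)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def encode_user(user_id: int, name: str, email: Optional[str] = None) -> bytes:
    """Hypothetical encoder: id=1 (varint), name=2 and email=3 (length-delimited)."""
    out = bytearray()
    out += encode_varint((1 << 3) | 0) + encode_varint(user_id)
    raw = name.encode("utf-8")
    out += encode_varint((2 << 3) | 2) + encode_varint(len(raw)) + raw
    if email is not None:
        raw = email.encode("utf-8")
        out += encode_varint((3 << 3) | 2) + encode_varint(len(raw)) + raw
    return bytes(out)


def decode_user_v1_reader(buf: bytes) -> dict:
    """Hypothetical old reader: knows fields 1 and 2, skips anything else."""
    user = {"id": 0, "name": ""}
    pos = 0
    while pos < len(buf):
        key, pos = decode_varint(buf, pos)
        field_number, wire_type = key >> 3, key & 0x7
        if wire_type == 0:  # varint
            value, pos = decode_varint(buf, pos)
        elif wire_type == 2:  # length-delimited
            length, pos = decode_varint(buf, pos)
            value = buf[pos:pos + length]
            pos += length
        else:
            raise ValueError("wire type %d not handled in this sketch" % wire_type)
        if field_number == 1 and wire_type == 0:
            user["id"] = value
        elif field_number == 2 and wire_type == 2:
            user["name"] = value.decode("utf-8")
        # Any other field number is unknown to this reader and is skipped.
    return user


new_server_bytes = encode_user(42, "Ada", "ada@example.com")
old_client_view = decode_user_v1_reader(new_server_bytes)
print(old_client_view)  # {'id': 42, 'name': 'Ada'}
```

The old reader never fails on field 3; it simply never surfaces it, which is why an added field is safe on the wire but invisible to older code.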
The "Right" Way: Versioning with Package Names and Field Numbers
The standard gRPC/protobuf approach leverages two key mechanisms:
- Package Names (Namespacing): This is your primary tool for distinct API versions. You create a new package for each major version.
- Field Numbers (Serialization Stability): These numbers are the stable identifiers in the wire format. They should never be changed for existing fields, and a number that has been retired should not be reused; protobuf’s reserved statement lets you mark it so the compiler rejects reuse.
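The role of field numbers can be seen in a few lines. The sketch below, which assumes short strings and small field numbers so that the key and the length each fit in one byte, shows that a field’s name never reaches the wire and that reusing a number silently reassigns old data. The schema dictionaries and helper names are hypothetical stand-ins for generated code.

```python
WIRE_LEN = 2  # wire type for length-delimited values such as strings


def field_key(field_number: int, wire_type: int) -> int:
    """The key that precedes every field: (field_number << 3) | wire_type."""
    return (field_number << 3) | wire_type


def encode_string_field(field_number: int, text: str) -> bytes:
    """Encode one string field; the sketch assumes a payload under 128 bytes."""
    raw = text.encode("utf-8")
    assert field_number < 16 and len(raw) < 128
    return bytes([field_key(field_number, WIRE_LEN), len(raw)]) + raw


def read_string_field(buf: bytes, schema: dict) -> dict:
    """Look up the field number in a reader-side schema to recover a name."""
    number = buf[0] >> 3
    length = buf[1]
    return {schema[number]: buf[2:2 + length].decode("utf-8")}


payload = encode_string_field(2, "Ada")
print(payload)  # b'\x12\x03Ada'

schema_original = {2: "name"}
schema_reused = {2: "email"}  # hypothetical bad edit that reuses number 2

print(read_string_field(payload, schema_original))  # {'name': 'Ada'}
print(read_string_field(payload, schema_reused))    # {'email': 'Ada'}
```

The same three payload bytes become a name or an email depending only on what the reader’s schema says about number 2, which is the failure mode that reuse invites.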
Let’s version our UserService to v2:
user_service/v1/user.proto:
syntax = "proto3";
package user_service.v1; // Version 1 package
message User {
  int64 id = 1;
  string name = 2;
}
message GetUserRequest {
  int64 user_id = 1;
}
service UserService {
  rpc GetUser(GetUserRequest) returns (User);
}
user_service/v2/user.proto:
syntax = "proto3";
package user_service.v2; // Version 2 package
// import "user_service/v1/user.proto"; // Only needed if a v2 message embeds a v1 type; unused here, and protoc warns about unused imports
message User {
  int64 id = 1;
  string name = 2;
  string email = 3; // New field, new number
}
message GetUserRequest {
  int64 user_id = 1;
  // Maybe a new field here too, e.g., string filter = 2;
}
service UserService {
  // New service, or updated service signature
  // This example shows a new service definition for v2
  rpc GetUser(GetUserRequest) returns (User);
  rpc GetUserByEmail(GetUserByEmailRequest) returns (User); // New method
}
message GetUserByEmailRequest {
  string email = 1;
}
How this works in practice:
- Client Compilation: When a client compiles against user_service/v1/user.proto, it generates code that understands user_service.v1.User. When it compiles against user_service/v2/user.proto, it generates code for user_service.v2.User. These are distinct types.
- Server Deployment: A server can expose both v1 and v2 APIs simultaneously. The package is part of the gRPC method path (/user_service.v1.UserService/GetUser vs. /user_service.v2.UserService/GetUser), so each client reaches the API version it was compiled against.
- Serialization: The field numbers (1, 2, 3) and their values are what get serialized; message and package names never appear in the encoded message. The generated code for each version maps those numbers back to its own message definition.
  - A v1 client sending a User message will populate fields 1 and 2.
  - A v2 client sending a User message will populate fields 1, 2, and 3.
  - Code that reads v1-encoded bytes as user_service.v2.User will correctly deserialize fields 1 and 2. Field 3 is absent, so email takes its default value, the empty string.
  - Code that reads v2-encoded bytes as user_service.v1.User will correctly deserialize fields 1 and 2. If the sender populated field 3, the v1 reader treats it as an unknown field and ignores it.
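The routing described above can be sketched without any gRPC machinery. gRPC addresses a method as /package.Service/Method, so two packages give two distinct paths even when the service and method names match. The handler functions and the dictionary-based dispatcher below are hypothetical and stand in for what a real gRPC server does when both generated services are registered.

```python
def method_path(package: str, service: str, method: str) -> str:
    """Build the gRPC method path: /<package>.<Service>/<Method>."""
    return "/%s.%s/%s" % (package, service, method)


def get_user_v1(request: dict) -> dict:
    # Hypothetical v1 handler: returns only the v1 fields.
    return {"id": request["user_id"], "name": "Ada"}


def get_user_v2(request: dict) -> dict:
    # Hypothetical v2 handler: same lookup, plus the v2-only email field.
    return {"id": request["user_id"], "name": "Ada", "email": "ada@example.com"}


# One process serving both versions side by side.
handlers = {
    method_path("user_service.v1", "UserService", "GetUser"): get_user_v1,
    method_path("user_service.v2", "UserService", "GetUser"): get_user_v2,
}


def dispatch(path: str, request: dict) -> dict:
    """Route a call by its full method path; unknown paths are rejected."""
    handler = handlers.get(path)
    if handler is None:
        # A real gRPC server answers with status UNIMPLEMENTED here.
        raise LookupError("unimplemented method: " + path)
    return handler(request)


print(dispatch("/user_service.v1.UserService/GetUser", {"user_id": 42}))
print(dispatch("/user_service.v2.UserService/GetUser", {"user_id": 42}))
```

Because the version lives in the path, a v1 client keeps working unchanged for as long as the server keeps the v1 entry registered.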
Key Takeaways:
- Never change field numbers for existing fields. If you need to change a field’s meaning or type, it’s effectively a new field, and you should assign it a new, unused field number.
- Package names are your primary versioning mechanism. package api.v1; vs. package api.v2; creates distinct namespaces and types.
- Imports are optional, not a versioning mechanism. A v2 proto imports a v1 proto only when a v2 message embeds a v1 message as a field. Protobuf has no inheritance, so fields shared between versions are redeclared.
- Field numbers are for wire format stability. They are how protobuf identifies data fields irrespective of the programming language or the message name.
- Service definitions can also be versioned. The package already separates user_service.v1.UserService from user_service.v2.UserService; you might additionally have UserServiceV1 and UserServiceV2 if the service methods themselves change significantly.
The one thing that trips people up is thinking that any change to a message definition within the same package name is safe. It’s not. Adding a field under a new number is safe on the wire; changing a field’s type, renumbering it, or reusing a retired number is not. Protobuf’s forwards and backwards compatibility relies heavily on field numbers, but the meaning and type associated with those numbers are defined by the .proto file. Different .proto files (especially those in different packages) define different messages, even if they share some field numbers.
The next thing you’ll likely encounter is how to handle deprecation and gradual rollout of new API versions.