Protocol Buffers are more than a serialization format: they are a language-neutral, platform-neutral, extensible mechanism for describing and serializing structured data, and their binary encoding is typically more compact and faster to parse than JSON or XML.

Let’s see this in action by defining a simple service for managing user profiles.

Imagine we have a system that needs to store and retrieve user information. We want to define the structure of this information and the operations we can perform on it in a way that’s clear and efficient, no matter what programming language our clients or servers are written in.

This is where Protocol Buffers, often shortened to Protobuf, and gRPC come in. Protobuf defines our data structures (messages) and our service interfaces (services). gRPC is a high-performance RPC (Remote Procedure Call) framework that uses Protobuf definitions to automatically generate client and server code.

Here’s a .proto file, written in Protobuf’s interface definition language, that defines our UserProfile message and a UserService:

syntax = "proto3";

package user_management;

// Represents a user profile.
message UserProfile {
  int64 user_id = 1; // Unique identifier for the user.
  string first_name = 2;
  string last_name = 3;
  string email = 4;
  bool is_active = 5; // Whether the user account is currently active.
}

// Service definition for managing user profiles.
service UserService {
  // Creates a new user profile.
  rpc CreateUser(UserProfile) returns (UserProfile);

  // Retrieves a user profile by ID.
  rpc GetUser(GetUserRequest) returns (UserProfile);

  // Updates an existing user profile.
  rpc UpdateUser(UserProfile) returns (UserProfile);

  // Deactivates a user profile.
  rpc DeactivateUser(DeactivateUserRequest) returns (DeactivateUserResponse);
}

// Request message for GetUser.
message GetUserRequest {
  int64 user_id = 1;
}

// Request message for DeactivateUser.
message DeactivateUserRequest {
  int64 user_id = 1;
}

// Response message for DeactivateUser.
message DeactivateUserResponse {
  bool success = 1;
  string message = 2;
}

In this .proto file:

  • syntax = "proto3"; selects the proto3 revision of the Protobuf language (as opposed to the older proto2 syntax).
  • package user_management; provides a namespace to avoid naming conflicts.
  • message UserProfile { ... } defines the structure of our user data. Each field has a unique number (e.g., user_id = 1). These numbers are crucial for Protobuf’s binary encoding and must be unique within a message. They are used instead of field names in the serialized output for efficiency.
  • service UserService { ... } defines the API. It lists the methods (RPC calls) that can be made.
  • rpc CreateUser(UserProfile) returns (UserProfile); declares an RPC method named CreateUser. It takes a UserProfile message as input and returns a UserProfile message.
  • GetUserRequest and DeactivateUserResponse are separate messages used for specific RPC inputs or outputs, demonstrating that not every RPC needs to use the primary UserProfile message for both request and response.

When you compile this .proto file with the Protobuf compiler (protoc) and the gRPC plugin for your language (e.g., Python, Java, Go), it generates code: a class for each message (UserProfile, GetUserRequest, and so on), plus client stubs and server base classes for UserService. You implement the server by subclassing the generated base class and filling in the methods. On the client, you call the remote methods through the generated stub as if they were local functions.
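The shape of that server-side pattern can be sketched without any generated code. Everything below is a hypothetical stand-in written by hand: the dataclasses imitate the generated message classes, UserServiceBase imitates a generated base class, and InMemoryUserService is an invented implementation that keeps users in a dictionary. Real generated code would differ in detail, and only two of the four RPCs are shown.

```python
from dataclasses import dataclass


# Hypothetical stand-ins for the generated message classes.
@dataclass
class UserProfile:
    user_id: int = 0
    first_name: str = ""
    last_name: str = ""
    email: str = ""
    is_active: bool = False


@dataclass
class GetUserRequest:
    user_id: int = 0


class UserServiceBase:
    """Stand-in for a generated base class: methods are unimplemented by default."""

    def CreateUser(self, request, context):
        raise NotImplementedError("CreateUser is not implemented")

    def GetUser(self, request, context):
        raise NotImplementedError("GetUser is not implemented")


class InMemoryUserService(UserServiceBase):
    """Server logic: override the methods declared in the service."""

    def __init__(self):
        self._users = {}

    def CreateUser(self, request, context):
        self._users[request.user_id] = request
        return request

    def GetUser(self, request, context):
        # A real server would report a not-found status instead of raising KeyError.
        return self._users[request.user_id]


service = InMemoryUserService()
service.CreateUser(UserProfile(user_id=1, first_name="Ada", is_active=True), context=None)
print(service.GetUser(GetUserRequest(user_id=1), context=None))
```

Here the methods are called directly; with real gRPC the same calls would travel over the network through the generated client stub.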

The real power here is the efficiency. Protobuf’s binary encoding is typically smaller and faster to parse than text-based formats like JSON. For example, a UserProfile with all fields populated might serialize to around 30-50 bytes, whereas the JSON equivalent could be 100-150 bytes or more, because JSON repeats every field name as text; parsing that text also usually costs more CPU. The exact numbers depend on the field values.
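A rough way to check the size claim is to encode one profile by hand and compare it with compact JSON. This is a sketch only: encode_varint, varint_field, and string_field are hand-rolled helpers covering just the two wire types UserProfile needs, and the sample values are made up. Real code would use the generated classes instead.

```python
import json


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def varint_field(number: int, value: int) -> bytes:
    """Key (wire type 0) followed by the varint value."""
    return encode_varint((number << 3) | 0) + encode_varint(value)


def string_field(number: int, text: str) -> bytes:
    """Key (wire type 2), then the byte length, then the UTF-8 bytes."""
    data = text.encode("utf-8")
    return encode_varint((number << 3) | 2) + encode_varint(len(data)) + data


# Made-up sample values for illustration.
profile = {
    "user_id": 12345,
    "first_name": "Ada",
    "last_name": "Lovelace",
    "email": "ada@example.com",
    "is_active": True,
}

wire = (
    varint_field(1, profile["user_id"])
    + string_field(2, profile["first_name"])
    + string_field(3, profile["last_name"])
    + string_field(4, profile["email"])
    + varint_field(5, int(profile["is_active"]))
)
as_json = json.dumps(profile, separators=(",", ":")).encode("utf-8")

print(len(wire), "bytes as Protobuf")
print(len(as_json), "bytes as compact JSON")
```

For this sample the hand-encoded message comes to 37 bytes, and the JSON is well over twice that, mostly because of the quoted field names.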

A subtle but critical aspect of Protobuf is how it handles schema evolution. You can add new fields to a message without breaking existing clients or servers, as long as you never reuse or renumber existing field numbers. Old code skips fields it doesn’t recognize (forward compatibility), and new code reading old data sees the default value for any field that is missing (backward compatibility).
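The reason an old reader can step over a field it has never heard of is that the wire type in each key says how many bytes the value occupies. Below is a minimal sketch of that behavior, assuming a hand-written parser (read_varint and parse_known are illustrative names) that handles only varint and length-delimited fields; field 6 is a hypothetical later addition to UserProfile.

```python
def read_varint(buf: bytes, pos: int) -> tuple:
    """Decode a base-128 varint starting at pos; return (value, next_pos)."""
    result = shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def parse_known(buf: bytes, known: dict) -> dict:
    """Parse a message, keeping only the field numbers listed in `known`."""
    fields, pos = {}, 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        number, wire_type = key >> 3, key & 0x07
        if wire_type == 0:  # varint
            value, pos = read_varint(buf, pos)
        elif wire_type == 2:  # length-delimited
            length, pos = read_varint(buf, pos)
            value, pos = buf[pos:pos + length], pos + length
        else:
            raise ValueError(f"wire type {wire_type} not handled in this sketch")
        if number in known:
            fields[known[number]] = value
        # Otherwise the field is unknown; pos has already moved past it.
    return fields


# Bytes a newer writer might send: user_id = 7, first_name = "Ada",
# plus hypothetical field 6 (a string) that the old schema lacks.
message = b"\x08\x07" + b"\x12\x03Ada" + b"\x32\x05admin"

old_reader_schema = {1: "user_id", 2: "first_name"}
print(parse_known(message, old_reader_schema))
```

The old reader recovers user_id and first_name and silently passes over field 6. Going the other way, a newer reader given a message without field 6 simply finds no entry for it and would fall back to the default.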

The next step after defining your services and messages is to generate the code and start implementing the server logic.
