A Guide to Using Protocol Buffers

Jarvan_IV, 2023-08-09 19:38:53

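© Copyright belongs to the author: this is an original work by Jarvan_IV on the 51CTO blog. Please contact the author for permission to repost; unauthorized reposting may lead to legal action.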

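I. Introduction

Protocol Buffers is a language-neutral, platform-neutral, extensible way of serializing structured data. It can be used for communication protocols, data storage, and more.

Protocol Buffers is a flexible, efficient, automated mechanism for serializing structured data. It is comparable to XML, but smaller (3 to 10 times), faster (20 to 100 times), and simpler.

You define how your data is structured once, then use specially generated source code to write and read that structured data to and from a variety of data streams, in a variety of languages. You can even update the data structure without breaking deployed programs that were compiled against the old structure.

To summarize, it has the following characteristics:

- Language-neutral and platform-neutral: ProtoBuf supports Java, C++, Python, and many other languages, across multiple platforms.
- Efficient: smaller (3 to 10 times) and faster (20 to 100 times) than XML, and simpler.
- Extensible and compatible: you can update the data structure without affecting or breaking existing programs.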

II. ProtoBuf Concepts

1. Defining a message type

syntax = "proto3";

message SearchRequest {
  string query = 1;
  int32 page_number = 2;
  int32 result_per_page = 3;
}

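The first line of the file specifies that you are using proto3 syntax; if you omit it, the protocol buffer compiler assumes you are using proto2. It must be the first non-empty, non-comment line of the file.
The SearchRequest message definition specifies three fields (name/value pairs), one for each piece of data that you want to include in this type of message. Each field has a name and a type.

Field types:

The example above uses two integers and a string, all of them scalar types. You can also use composite types, such as enumerations and other message types.

Field numbers:

Each field in a message definition has a unique number. These numbers identify the fields in the message's binary format and should not be changed once the message type is in use.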
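
To see why field numbers must stay stable, it helps to look at how they end up in the binary format. The following is a minimal Python sketch, not the protobuf library itself: the helper names `encode_varint` and `encode_tag` are made up for illustration. It computes the tag that precedes each field on the wire, which combines the field number with a wire type as `(field_number << 3) | wire_type`.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative values")
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(low7)
            return bytes(out)


WIRE_VARINT = 0  # int32, int64, bool, enum, ...
WIRE_LEN = 2     # string, bytes, embedded messages, packed repeated fields


def encode_tag(field_number: int, wire_type: int) -> bytes:
    """Hypothetical helper: the tag is the field number plus a 3-bit wire type."""
    return encode_varint((field_number << 3) | wire_type)


# SearchRequest.query is field 1 (string); page_number is field 2 (int32).
query_tag = encode_tag(1, WIRE_LEN)
page_number_tag = encode_tag(2, WIRE_VARINT)
print(query_tag.hex(), page_number_tag.hex())  # 0a 10
```

Because only the number (never the field name) is written, renumbering a field makes old data decode into the wrong field.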

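Field rules:

A message field can be one of the following:

singular: a well-formed message can have zero or one of this field (but not more than one). This is the default field rule in proto3.

repeated: this field can be repeated any number of times (including zero) in a well-formed message. The order of the repeated values is preserved.

In proto3, repeated fields of scalar numeric types use packed encoding by default.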
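
As a rough illustration of what packed encoding means, the sketch below hand-encodes a repeated int32 field both ways in plain Python. The function names are hypothetical and only non-negative values are handled; field number 4 and the values are arbitrary choices for the demonstration.

```python
from typing import Iterable


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)
        else:
            out.append(low7)
            return bytes(out)


def encode_packed(field_number: int, values: Iterable[int]) -> bytes:
    """Packed: one tag, one byte length, then all varints back to back."""
    payload = b"".join(encode_varint(v) for v in values)
    if not payload:
        return b""  # an empty repeated field is simply absent
    tag = encode_varint((field_number << 3) | 2)  # wire type 2: length-delimited
    return tag + encode_varint(len(payload)) + payload


def encode_unpacked(field_number: int, values: Iterable[int]) -> bytes:
    """Unpacked: the tag is repeated in front of every element."""
    tag = encode_varint((field_number << 3) | 0)  # wire type 0: varint
    return b"".join(tag + encode_varint(v) for v in values)


packed = encode_packed(4, [3, 270, 86942])
unpacked = encode_unpacked(4, [3, 270, 86942])
print(packed.hex(" "))             # 22 06 03 8e 02 9e a7 05
print(len(packed), len(unpacked))  # 8 9
```

The saving grows with the number of elements, since the unpacked form pays for one tag per element.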

Adding comments:

To add comments to your .proto files, use C/C++-style // and /* ... */ syntax.

/* SearchRequest represents a search query, with pagination options to
 * indicate which results to include in the response. */

message SearchRequest {
  string query = 1;
  int32 page_number = 2;  // Which page number do we want?
  int32 result_per_page = 3;  // Number of results to return per page.
}

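2. Scalar value types

The following table shows the types you can use in a .proto file and the corresponding types in the automatically generated code.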

.proto Type	Notes	C++ Type	Java/Kotlin Type[1]	Python Type[3]	Go Type	Ruby Type	C# Type	PHP Type	Dart Type
double		double	double	float	float64	Float	double	float	double
float		float	float	float	float32	Float	float	float	double
int32	Uses variable-length encoding. Inefficient for encoding negative numbers – if your field is likely to have negative values, use sint32 instead.	int32	int	int	int32	Fixnum or Bignum (as required)	int	integer	int
int64	Uses variable-length encoding. Inefficient for encoding negative numbers – if your field is likely to have negative values, use sint64 instead.	int64	long	int/long[4]	int64	Bignum	long	integer/string[6]	Int64
uint32	Uses variable-length encoding.	uint32	int[2]	int/long[4]	uint32	Fixnum or Bignum (as required)	uint	integer	int
uint64	Uses variable-length encoding.	uint64	long[2]	int/long[4]	uint64	Bignum	ulong	integer/string[6]	Int64
sint32	Uses variable-length encoding. Signed int value. These more efficiently encode negative numbers than regular int32s.	int32	int	int	int32	Fixnum or Bignum (as required)	int	integer	int
sint64	Uses variable-length encoding. Signed int value. These more efficiently encode negative numbers than regular int64s.	int64	long	int/long[4]	int64	Bignum	long	integer/string[6]	Int64
fixed32	Always four bytes. More efficient than uint32 if values are often greater than 2^28.	uint32	int[2]	int/long[4]	uint32	Fixnum or Bignum (as required)	uint	integer	int
fixed64	Always eight bytes. More efficient than uint64 if values are often greater than 2^56.	uint64	long[2]	int/long[4]	uint64	Bignum	ulong	integer/string[6]	Int64
sfixed32	Always four bytes.	int32	int	int	int32	Fixnum or Bignum (as required)	int	integer	int
sfixed64	Always eight bytes.	int64	long	int/long[4]	int64	Bignum	long	integer/string[6]	Int64
bool		bool	boolean	bool	bool	TrueClass/FalseClass	bool	boolean	bool
string	A string must always contain UTF-8 encoded or 7-bit ASCII text, and cannot be longer than 2^32.	string	String	str/unicode[5]	string	String (UTF-8)	string	string	String
bytes	May contain any arbitrary sequence of bytes no longer than 2^32.	string	ByteString	str	[]byte	String (ASCII-8BIT)	ByteString	string	List
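
The notes in the table about negative numbers and about fixed32 can be checked with a few lines of arithmetic. The Python sketch below is illustrative only (the helper names are made up, and it does not use the protobuf library). It compares encoded sizes for int32 versus sint32, and shows where a varint grows from four to five bytes.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)
        else:
            out.append(low7)
            return bytes(out)


def encode_int32(value: int) -> bytes:
    """int32: a negative value is sign-extended to 64 bits before encoding."""
    return encode_varint(value & 0xFFFFFFFFFFFFFFFF)


def zigzag32(value: int) -> int:
    """sint32 mapping: 0, -1, 1, -2, ... become 0, 1, 2, 3, ..."""
    return ((value << 1) ^ (value >> 31)) & 0xFFFFFFFF


def encode_sint32(value: int) -> bytes:
    return encode_varint(zigzag32(value))


# A negative int32 always costs ten bytes; sint32 keeps small magnitudes small.
print(len(encode_int32(-1)), len(encode_sint32(-1)))  # 10 1

# A varint carries 7 bits per byte, so values from 2^28 upward need 5 bytes,
# while fixed32 always uses exactly 4.
print(len(encode_varint(2**28 - 1)), len(encode_varint(2**28)))  # 4 5
```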

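3. Default values

When a message is parsed, if the encoded message does not contain a particular singular element, the corresponding field in the parsed object is set to the default value for that field. These defaults are type-specific:

- For strings, the default value is the empty string.
- For bytes, the default value is empty bytes.
- For bools, the default value is false.
- For numeric types, the default value is zero.
- For enums, the default value is the first defined enum value, which must be 0.
- For message fields, the field is not set. Its exact value is language-dependent.

The default value for repeated fields is empty (generally an empty list in the appropriate language).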
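
The rules above can be pictured as a lookup that a parser applies to every field missing from the input. This is only a toy model in Python: `DEFAULTS`, `SEARCH_REQUEST_SCHEMA`, and `fill_defaults` are hypothetical names, and real generated code does this internally rather than through dictionaries.

```python
# Type-specific defaults, following the list above.
DEFAULTS = {
    "string": "",
    "bytes": b"",
    "bool": False,
    "int32": 0,
    "enum": 0,        # the first enum value, which must be 0
    "message": None,  # "not set"; the exact representation depends on the language
}

# Hypothetical schema table mirroring the SearchRequest message.
SEARCH_REQUEST_SCHEMA = {
    "query": "string",
    "page_number": "int32",
    "result_per_page": "int32",
}


def fill_defaults(decoded: dict, schema: dict) -> dict:
    """Return every field in the schema, using the default when it was absent."""
    return {name: decoded.get(name, DEFAULTS[kind]) for name, kind in schema.items()}


# Only `query` was present in the encoded message.
message = fill_defaults({"query": "protobuf"}, SEARCH_REQUEST_SCHEMA)
print(message)  # {'query': 'protobuf', 'page_number': 0, 'result_per_page': 0}
```

One consequence is that a reader cannot tell a field that was explicitly set to its default from one that was never set, unless the field tracks presence.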

4. Enumerations

A message definition can also use enum types:

message SearchRequest {
  string query = 1;
  int32 page_number = 2;
  int32 result_per_page = 3;
  enum Corpus {
    UNIVERSAL = 0;
    WEB = 1;
    IMAGES = 2;
    LOCAL = 3;
    NEWS = 4;
    PRODUCTS = 5;
    VIDEO = 6;
  }
  Corpus corpus = 4;
}

5. Nested types

You can also define and use message types inside other message types:

message SearchResponse {
  message Result {
    string url = 1;
    string title = 2;
    repeated string snippets = 3;
  }
  repeated Result results = 1;
}

To reuse this message type outside its parent message type, refer to it as Parent.Type:

message SomeOtherMessage {
  SearchResponse.Result result = 1;
}

6. Maps

If you want to create an associative map as part of your data definition, protocol buffers provides a handy shortcut syntax:

map<key_type, value_type> map_field = N;

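Here key_type can be any integral or string type (enums are not allowed), and value_type can be any type except another map (that is, maps cannot be nested directly).

Notes:

- Map fields cannot be repeated.
- Maps are unordered; do not rely on the order of map entries in your business logic.
- If duplicate keys are present, the last one is used.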
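
On the wire, a map field is represented as a sequence of key/value entries, which is why duplicates can occur at all. The following Python sketch models the "last key wins" rule on a plain list of pairs; `merge_map_entries` and the sample entries are illustrative assumptions, not part of any protobuf API.

```python
from typing import Dict, Iterable, Tuple


def merge_map_entries(entries: Iterable[Tuple[str, int]]) -> Dict[str, int]:
    """Collapse key/value entries into a map; a later entry overwrites an earlier one."""
    result: Dict[str, int] = {}
    for key, value in entries:
        result[key] = value
    return result


# Entries in the order a parser might read them; "a" appears twice.
wire_entries = [("a", 1), ("b", 2), ("a", 3)]
merged = merge_map_entries(wire_entries)
print(merged["a"], merged["b"])  # 3 2
```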

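7. Defining services

If you want to use your message types with an RPC (Remote Procedure Call) system, you can define an RPC service interface in a .proto file, and the protocol buffer compiler will generate service interface code and stubs in your chosen language. For example, to define an RPC service with a method that takes your SearchRequest and returns a SearchResponse, you can define it in your .proto file as follows: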

service SearchService {
  rpc Search(SearchRequest) returns (SearchResponse);
}

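The most straightforward RPC system to use with protocol buffers is gRPC, a language-neutral and platform-neutral open source RPC system developed at Google. gRPC works particularly well with protocol buffers and lets you generate the relevant RPC code directly from your .proto files using a special protocol buffer compiler plugin.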

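8. JSON mapping

Proto3 supports a canonical encoding in JSON, making it easier to share data between systems. The encoding is described on a type-by-type basis in the table below.

If a value is missing from the JSON-encoded data, or if its value is null, it is interpreted as the appropriate default value when parsed into a protocol buffer.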

proto3	JSON	JSON example	Notes
message	object	`{"fooBar": v, "g": null, …}`	Generates JSON objects. Message field names are mapped to lowerCamelCase and become JSON object keys. If the `json_name` field option is specified, the specified value will be used as the key instead. Parsers accept both the lowerCamelCase name (or the one specified by the `json_name` option) and the original proto field name. `null` is an accepted value for all field types and treated as the default value of the corresponding field type.
enum	string	`"FOO_BAR"`	The name of the enum value as specified in proto is used. Parsers accept both enum names and integer values.
map<K,V>	object	`{"k": v, …}`	All keys are converted to strings.
repeated V	array	`[v, …]`	`null` is accepted as the empty list `[]`.
bool	true, false	`true, false`
string	string	`"Hello World!"`
bytes	base64 string	`"YWJjMTIzIT8kKiYoKSctPUB+"`	JSON value will be the data encoded as a string using standard base64 encoding with paddings. Either standard or URL-safe base64 encoding with/without paddings are accepted.
int32, fixed32, uint32	number	`1, -10, 0`	JSON value will be a decimal number. Either numbers or strings are accepted.
int64, fixed64, uint64	string	`"1", "-10"`	JSON value will be a decimal string. Either numbers or strings are accepted.
float, double	number	`1.1, -10.0, 0, "NaN", "Infinity"`	JSON value will be a number or one of the special string values “NaN”, “Infinity”, and “-Infinity”. Either numbers or strings are accepted. Exponent notation is also accepted. -0 is considered equivalent to 0.
Any	`object`	`{"@type": "url", "f": v, … }`	If the Any contains a value that has a special JSON mapping, it will be converted as follows: `{"@type": xxx, "value": yyy}`. Otherwise, the value will be converted into a JSON object, and the `"@type"` field will be inserted to indicate the actual data type.
Timestamp	string	`"1972-01-01T10:00:20.021Z"`	Uses RFC 3339, where generated output will always be Z-normalized and uses 0, 3, 6 or 9 fractional digits. Offsets other than “Z” are also accepted.
Duration	string	`"1.000340012s", "1s"`	Generated output always contains 0, 3, 6, or 9 fractional digits, depending on required precision, followed by the suffix “s”. Accepted are any fractional digits (also none) as long as they fit into nano-seconds precision and the suffix “s” is required.
Struct	`object`	`{ … }`	Any JSON object. See `struct.proto`.
Wrapper types	various types	`2, "2", "foo", true, "true", null, 0, …`	Wrappers use the same representation in JSON as the wrapped primitive type, except that `null` is allowed and preserved during data conversion and transfer.
FieldMask	string	`"f.fooBar,h"`	See `field_mask.proto`.
ListValue	array	`[foo, bar, …]`
Value	value		Any JSON value. Check google.protobuf.Value for details.
NullValue	null		JSON null
Empty	object	`{}`	An empty JSON object
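
A few rows of the table can be reproduced with the Python standard library. The sketch below is a simplified illustration, not the real protobuf JSON printer: `to_lower_camel` and `to_proto3_json` are hypothetical helpers, the field names are invented, and the name conversion only covers ordinary snake_case names. It shows lowerCamelCase keys, 64-bit integers as decimal strings, and bytes as standard base64.

```python
import base64
import json


def to_lower_camel(name: str) -> str:
    """Simplified snake_case to lowerCamelCase conversion."""
    head, *rest = name.split("_")
    return head + "".join(part[:1].upper() + part[1:] for part in rest)


def to_proto3_json(fields: dict, int64_fields=frozenset()) -> str:
    """Render a dict of field values using a subset of the proto3 JSON rules."""
    out = {}
    for name, value in fields.items():
        key = to_lower_camel(name)
        if isinstance(value, bytes):
            out[key] = base64.b64encode(value).decode("ascii")  # bytes -> base64 string
        elif name in int64_fields:
            out[key] = str(value)  # int64/uint64/fixed64 -> decimal string
        else:
            out[key] = value
    return json.dumps(out)


text = to_proto3_json(
    {"foo_bar": 1, "big_id": 2**53 + 1, "blob": b"abc123!?$*&()'-=@~"},
    int64_fields={"big_id"},
)
print(text)
# {"fooBar": 1, "bigId": "9007199254740993", "blob": "YWJjMTIzIT8kKiYoKSctPUB+"}
```

Writing 64-bit integers as strings avoids precision loss in JSON consumers that store every number as a double.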

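III. Using ProtoBuf

With the basic concepts covered, let's look at how to use ProtoBuf in practice. The first step is to create a .proto file and define the data structures, as shown in Example 1: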

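// Example 1: define the Example1 message in xxx.proto
// Assumption: proto3 with protoc 3.15 or later, so that `optional` and
// unlabeled fields can appear in the same file.
syntax = "proto3";

message Example1 {
    optional string stringVal = 1;
    optional bytes bytesVal = 2;
    message EmbeddedMessage {
        int32 int32Val = 1;
        string stringVal = 2;
    }
    optional EmbeddedMessage embeddedExample1 = 3;
    repeated int32 repeatedInt32Val = 4;
    repeated string repeatedStringVal = 5;
}
message xxx {
  // Field rule: required -> the field must appear exactly once (proto2 only)
  // Field rule: optional -> the field may appear 0 or 1 times
  // Field rule: repeated -> the field may appear any number of times (including 0)
  // Type: int32, int64, sint32, sint64, string, 32-bit ....
  // Field number: 1 ~ 536870911 (excluding 19000 through 19999)
  // General form of a field declaration:
  //   <field rule> <type> <name> = <field number>;
}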

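In the example above, we defined:

- stringVal: an optional field of type string with field number 1; it may appear 0 or 1 times
- bytesVal: an optional field of type bytes with field number 2; it may appear 0 or 1 times
- embeddedExample1: an optional field of type EmbeddedMessage (a custom nested message type) with field number 3; it may appear 0 or 1 times
- repeatedInt32Val: a repeated field of type int32 with field number 4; it may appear any number of times (including 0)
- repeatedStringVal: a repeated field of type string with field number 5; it may appear any number of times (including 0)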
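
To connect these field numbers to actual bytes, here is a small Python sketch that scans a hand-written byte string and lists the fields it finds. Everything in it is an assumption made for illustration: the `read_varint` and `scan_fields` helpers are not a protobuf API, only wire types 0 and 2 are handled, and the sample bytes were built by hand to represent an Example1 with stringVal = "testing" and repeatedInt32Val = [3, 270] in packed form.

```python
from typing import List, Tuple, Union


def read_varint(buf: bytes, pos: int) -> Tuple[int, int]:
    """Read one base-128 varint starting at pos; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def scan_fields(buf: bytes) -> List[Tuple[int, int, Union[int, bytes]]]:
    """List (field number, wire type, raw value) for each field in buf."""
    fields = []
    pos = 0
    while pos < len(buf):
        tag, pos = read_varint(buf, pos)
        number, wire_type = tag >> 3, tag & 0x07
        if wire_type == 0:    # varint
            value, pos = read_varint(buf, pos)
        elif wire_type == 2:  # length-delimited
            length, pos = read_varint(buf, pos)
            value = buf[pos:pos + length]
            pos += length
        else:
            raise ValueError(f"wire type {wire_type} is not handled in this sketch")
        fields.append((number, wire_type, value))
    return fields


# Hand-built sample: field 1 = "testing", field 4 = packed [3, 270].
data = bytes.fromhex("0a0774657374696e67" "2203038e02")
for number, wire_type, value in scan_fields(data):
    print(number, wire_type, value)
```

Nothing in the byte string mentions stringVal or repeatedInt32Val by name; the reader needs the .proto definition to map field numbers 1 and 4 back to those fields.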

IV. Generating Code from .proto Files

The compiler is invoked as follows:

protoc --proto_path=IMPORT_PATH --java_out=DST_DIR  path/to/file.proto

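IMPORT_PATH specifies a directory in which to look for .proto files when resolving import directives. If omitted, the current directory is used. Multiple import directories can be specified by passing the --proto_path option multiple times; they are searched in order. -I=IMPORT_PATH can be used as a short form of --proto_path.
You can provide one or more output directives:
- --cpp_out generates C++ code in DST_DIR.
- --java_out generates Java code in DST_DIR.
- --kotlin_out generates additional Kotlin code in DST_DIR.
- --python_out generates Python code in DST_DIR.
- --go_out generates Go code in DST_DIR.
- --ruby_out generates Ruby code in DST_DIR.
- --objc_out generates Objective-C code in DST_DIR.
- --csharp_out generates C# code in DST_DIR.
- --php_out generates PHP code in DST_DIR.
As an extra convenience, if DST_DIR ends in .zip or .jar, the compiler writes the output to a single ZIP-format archive file with the given name. .jar outputs are also given a manifest file, as required by the Java JAR specification. Note that if the output archive already exists, it is overwritten; the compiler is not smart enough to add files to an existing archive.
You must provide one or more .proto files as input. Multiple .proto files can be specified at once. Although the files are named relative to the current directory, each file must reside in one of the IMPORT_PATHs so that the compiler can determine its canonical name.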

# $SRC_DIR: the source directory that contains the .proto files
# --java_out=$DST_DIR: the destination directory for the generated Java code
# xxx.proto: the proto file to generate code for; to generate code for all of them, change `xxx/proto/Member.proto` to `xxx/proto/*.proto`

protoc -I=$SRC_DIR --java_out=$DST_DIR $SRC_DIR/xxx.proto
# For example:
protoc -I=/Users/jarvan/IdeaProjects/workspace-jarvan/spring-boot-grpc/src/main/proto --java_out=/Users/jarvan/ /Users/jarvan/IdeaProjects/workspace-jarvan/spring-boot-grpc/src/main/proto/Member.proto
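
When the command is assembled from a build script rather than typed by hand, it can help to construct the argument list programmatically. The Python sketch below only builds and prints the command; `build_protoc_command` is a hypothetical helper and the directory names are placeholders, not the paths used above. It relies only on the --proto_path and --java_out options already described.

```python
import shlex
from typing import List, Sequence


def build_protoc_command(
    import_paths: Sequence[str],
    java_out: str,
    proto_files: Sequence[str],
) -> List[str]:
    """Assemble a protoc argument list for Java code generation."""
    if not proto_files:
        raise ValueError("protoc needs at least one .proto file as input")
    command = ["protoc"]
    # Import directories are searched in the order they are given.
    command += [f"--proto_path={path}" for path in import_paths]
    command.append(f"--java_out={java_out}")
    command += list(proto_files)
    return command


cmd = build_protoc_command(
    import_paths=["src/main/proto"],            # placeholder source directory
    java_out="build/generated",                 # placeholder output directory
    proto_files=["src/main/proto/Member.proto"],
)
print(shlex.join(cmd))
# protoc --proto_path=src/main/proto --java_out=build/generated src/main/proto/Member.proto
```

To actually run it, pass the list to `subprocess.run(cmd, check=True)`; this requires protoc to be installed and on the PATH.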