Protobuf Zero-Value Trap: When Defaults Become Silent Killers of Business Logic

US stocks have three trading periods: pre-market, the regular (intraday) session, and after-hours. To minimize bandwidth, the data push interface is incremental: it sends the full dataset once at the start, and every subsequent push carries only the fields that changed.

Why not use the optimal solution? The interface spans several project teams, and some of their systems have been live for many years. We are a newly connected consumer, so all we can do is try our best to stay compatible.

A series of problems

In the abstract, nothing looks wrong. Once the group's actual system architecture came into the picture, however, a chain of problems followed. As soon as one problem was resolved, a new one emerged, and each new problem grew out of the fix for the previous one.

Unable to identify trading period

The known issue: the trading stage field has a value that is defined as 0 in protobuf. Because data arrives via incremental push, the business side cannot tell whether a received 0 is the default value or a real business value.

In plain terms: each time we receive a 0, we cannot determine whether it is the stage of a newly pushed quote or simply protobuf's default value.
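The ambiguity can be sketched without protobuf at all. The structs below are hypothetical stand-ins for a decoded message whose scalar fields carry no presence information. The field names and the mapping of stage numbers are assumptions made for illustration, not the real protocol. Neither merge strategy is safe once 0 is a real business value.

```cpp
#include <cstdint>

// Hypothetical stage numbering (assumption): 0 = pre-market,
// 1 = intraday, 2 = after-hours.

// Stand-in for a decoded incremental push without field presence:
// a field that was never sent reads back as 0.
struct QuoteDelta {
    std::int32_t stage = 0;
    std::int64_t last_price = 0;
};

// The locally held full quote that the increments are merged into.
struct QuoteSnapshot {
    std::int32_t stage = 0;
    std::int64_t last_price = 0;
};

// Strategy 1: trust every incoming stage value.
// A price-only push silently resets the stage to 0.
void apply_overwrite(QuoteSnapshot& snap, const QuoteDelta& delta) {
    snap.stage = delta.stage;
    if (delta.last_price != 0) snap.last_price = delta.last_price;
}

// Strategy 2: treat 0 as "not sent".
// A genuine switch to stage 0 is silently dropped.
void apply_skip_zero(QuoteSnapshot& snap, const QuoteDelta& delta) {
    if (delta.stage != 0) snap.stage = delta.stage;
    if (delta.last_price != 0) snap.last_price = delta.last_price;
}
```

With the first strategy, a quote in stage 2 that receives a price-only push ends up in stage 0. With the second, a real transition to stage 0 never reaches the snapshot.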

Introduce optional

Since protobuf release 3.15, proto3 supports the optional keyword (just as in proto2) to give a scalar field presence information.
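For a field declared with optional, the generated C++ class exposes a has_ accessor next to the getter and setter. The hand-written class below is only a sketch that imitates that shape for a hypothetical stage field. It is not generated code, and the type names are assumptions. It shows how presence lets the receiver tell an explicit 0 from an absent field.

```cpp
#include <cstdint>

// Hand-written imitation of a message with one presence-tracked field.
class QuoteDelta {
public:
    void set_stage(std::int32_t value) {
        stage_ = value;
        has_stage_ = true;  // remember that the sender set the field
    }
    bool has_stage() const { return has_stage_; }
    std::int32_t stage() const { return stage_; }

private:
    std::int32_t stage_ = 0;
    bool has_stage_ = false;
};

struct QuoteSnapshot {
    std::int32_t stage = 0;
};

// Merge only the fields that were actually present in the push.
void apply_delta(QuoteSnapshot& snap, const QuoteDelta& delta) {
    if (delta.has_stage()) snap.stage = delta.stage();
}
```

A delta that never called set_stage leaves the snapshot untouched, while a delta that called set_stage(0) moves it to stage 0.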

Communication within the group is based on protobuf, but for historical reasons the version in use is an older one that does not support the optional keyword. Protobuf was introduced at the bottom of the stack and the project is released as a static library, so upgrading the entire compilation chain would be very costly.

GCC version issue

After much effort, we devised a plan to release two versions of the underlying library, so that the compile-time dependency on the new protobuf would not propagate to everything built on top of it. However, during compilation we discovered that the gcc version was too low and did not support the new features of protobuf.

The servers commonly used within the group run CentOS 7 or CentOS 8. The default gcc version on CentOS 7 is 4.8, and the default on CentOS 8 is 8.3. Because the new features of protobuf require a gcc version above 7.4, CentOS 7 cannot be supported.
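One way to surface this constraint early is a compile-time guard in a shared header, so that an unsupported toolchain fails with a clear message instead of an obscure error deep inside a dependency. The sketch below takes the 7.4 threshold from the paragraph above and treats it as "7.4 or newer", which is an assumption. The helper name version_at_least is hypothetical.

```cpp
// True when (major, minor) is at least (req_major, req_minor).
// Written as a single return statement so it is valid C++11 constexpr.
constexpr bool version_at_least(int major, int minor,
                                int req_major, int req_minor) {
    return major > req_major ||
           (major == req_major && minor >= req_minor);
}

// Clang also defines __GNUC__, so exclude it from the gcc-only check.
#if defined(__GNUC__) && !defined(__clang__)
static_assert(version_at_least(__GNUC__, __GNUC_MINOR__, 7, 4),
              "gcc is too old for the protobuf version in use");
#endif
```

On a default CentOS 7 toolchain (gcc 4.8) the static_assert stops the build immediately. On CentOS 8 (gcc 8.3) it passes.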

Related GCC bug report: Bug 82461 - [7 Regression] Temporary required for brace-initializing (non-literal-type) member variable

After some troubleshooting, I moved the deployment and compilation servers for the related services to CentOS 8, which resolved the issue.

Reasonable enumeration

Looking back at the whole issue, there’s actually a simpler and more efficient solution: adjust the enumeration definition to start numbering from 1 instead of 0. This can effectively distinguish between default values and business values, avoiding the aforementioned series of problems.

Why is it more reasonable to start from 1?

In protobuf, an enum field defaults to the value 0. If a meaningful business value is defined as 0 (for example, "intraday"), then during incremental push the receiver cannot tell whether a received 0 is a business value or an unset default. If business values start from 1 instead, 0 can be reserved for a meaningless default or an "unknown" state, and the problem is easily resolved.

Suggested practices:

When designing protobuf enums, always define 0 as a meaningless default value (such as UNKNOWN or RESERVED).
Assign actual business values starting from 1, ensuring they are distinct from the default value of 0.
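On the receiving side, these practices reduce the merge to a single comparison. The sketch below mirrors such an enum in C++. The enumerator names and numbers are hypothetical and stand in for whatever the real protocol defines.

```cpp
// Hypothetical mirror of a protobuf enum that follows the practice:
// 0 is reserved, business values start at 1.
enum class TradingStage : int {
    UNKNOWN = 0,      // what an unset field decodes to; never a business value
    PRE_MARKET = 1,
    INTRADAY = 2,
    AFTER_HOURS = 3,
};

struct QuoteSnapshot {
    TradingStage stage = TradingStage::UNKNOWN;
};

// UNKNOWN can only mean "field not sent", so skipping it is always safe.
void apply_stage(QuoteSnapshot& snap, TradingStage incoming) {
    if (incoming != TradingStage::UNKNOWN) snap.stage = incoming;
}
```

Because no real trading period maps to 0, an incremental push that omits the stage leaves the snapshot unchanged, and every real transition carries a non-zero value that cannot be mistaken for a default.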

This small adjustment not only resolved the issue of identifying trading periods, but also provided a valuable lesson for future protocol design.
