Upgrading GCC caused program crashes: a hidden bug from non-compliant code

The same business code compiled and ran normally on CentOS 7. After we switched to CentOS 8 and compiled with a newer version of GCC, the program crashed. The crash occurs only in Release builds; Debug builds run without problems. It was the first time we had encountered anything like this, and it took three days of investigation to identify the root cause.

Problem Identification

The root cause turned out to be a function that is declared to return a value but has no return statement. In Release mode, newer versions of GCC optimize more aggressively, and the code generated for that function no longer behaved predictably at run time, which ultimately triggered the crash. Our conclusion is that compiler warnings should not be ignored. In a legacy project it may be reasonable to dismiss some specific warnings, but suppressing all of them hides real bugs.

Environment Details

  • CentOS 7 GCC Version: 4.8.5
  • CentOS 8 GCC Version: 8.5.0

Crash Phenomena

When we loaded the core dump into gdb, it showed the following stack:

[New LWP 1385902]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `./pstack_main`.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00007ffe894b4420 in ?? ()
(gdb) bt
#0  0x00007ffe894b4420 in ?? ()
#1  0x00000000004008e9 in main ()

This stack is not very informative: the crashing frame is shown as ??, with no function name, which made troubleshooting considerably harder.

Code Example

To better understand the issue, here is a minimal code example that reproduces the crash:

#include <iostream>
#include <map>

int test() {
    std::cout << "1" << std::endl;
}

int main() {
    test();
    return 0;
}

The test() function in this code is declared to return int but contains no return statement. According to the C++ standard, flowing off the end of a function with a non-void return type (other than main) is undefined behavior, so the compiler is free to generate any code for that path.
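As a minimal sketch of the fix, the function can either return a value on every path or be declared void if the caller never uses the result. The names test_fixed and test_void are hypothetical and only used here for illustration; the returned value 0 is an arbitrary choice.

```cpp
#include <iostream>

// Hypothetical fixed variant of test(): every path returns an int.
int test_fixed() {
    std::cout << "1" << std::endl;
    return 0;  // explicit return value; 0 is an arbitrary choice for this sketch
}

// Alternative when no caller needs a result: declare the function void,
// so there is no return value to forget.
void test_void() {
    std::cout << "1" << std::endl;
}
```

Either variant removes the undefined behavior, so the result no longer depends on the optimization level.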

Compilation Warning

In our project, the CMake script suppresses many compile-time warnings, including the following:

/root/pstack/main.cpp: In function ‘int test()’:
/root/pstack/main.cpp:7:1: warning: no return statement in function returning non-void [-Wreturn-type]

This warning points directly at the root cause: test() never returns a value. When optimizing, newer versions of GCC (such as 8.5.0) may exploit this undefined behavior, for example by assuming that the end of the function is never reached, and the resulting code can crash.
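One way to keep this particular warning from being lost again is to promote it to an error. The sketch below does that inside a source file with GCC's diagnostic pragma (Clang accepts the same pragma); the function name print_and_report is hypothetical. On the command line, the corresponding GCC flag is -Werror=return-type.

```cpp
#include <iostream>

// From this point on in the translation unit, treat -Wreturn-type as an error.
#pragma GCC diagnostic error "-Wreturn-type"

// Hypothetical function for illustration. With the pragma above in effect,
// deleting the return statement below should fail the build instead of
// producing a warning that can be overlooked.
int print_and_report() {
    std::cout << "1" << std::endl;
    return 0;
}
```

Promoting only this one diagnostic leaves the project's other warning settings untouched, which can be easier to adopt in a legacy code base than enabling -Werror globally.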

Assembly Code Differences

To explain the differences in GCC compiler optimization behavior, we compared assembly code generated by different versions of GCC:

  • GCC 4.8.5 Generated Assembly Code:

  • The assembly is relatively verbose and includes the handling logic for the standard output stream (std::cout). This suggests that the older compiler optimized conservatively and did not exploit the missing return value in test(), which is likely why the program did not crash.

  • GCC 8.5.0 Generated Assembly Code:

  • The newer GCC optimizes more aggressively and emits less code. Because the missing return is undefined behavior, the optimized code can behave unpredictably when test() executes, and this is what led to the crash.
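To picture what the newer optimizer is allowed to assume, the missing return can be modeled with an explicit __builtin_unreachable(), a GCC and Clang builtin. This is only an illustrative sketch under that assumption, not the compiler's actual output, and the function name positive_or_unreachable is hypothetical.

```cpp
#include <iostream>

// Hypothetical helper: only the x > 0 path has a return statement.
// The explicit __builtin_unreachable() spells out the assumption an optimizer
// may make about a missing return: control never arrives at that point.
int positive_or_unreachable(int x) {
    if (x > 0) {
        return x;
    }
    // Reaching this line is undefined behavior; the compiler may drop the
    // x > 0 check entirely, or emit no return instruction for this path.
    __builtin_unreachable();
}
```

Calling the helper with a positive argument is well defined. Calling it with zero or a negative value has the same status as calling the original test(): the behavior is undefined, and a crash is one possible outcome.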

Conclusion

This troubleshooting process taught us that in C++ a function declared with a non-void return type, such as int, must return a value on every path. Upgrading from an older compiler to a newer GCC can bring both more aggressive optimization and stricter warnings. We therefore recommend not disabling all warnings during compilation, and instead addressing them selectively, paying particular attention to common issues such as missing return values and type mismatches. In our case, adding a return value to the test() function resolved the problem, and the program ran normally again.
